Haskell data structures: cheap immutable manipulation and nested equality?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Haskell data structures: cheap immutable manipulation and nested equality?

Anand Patil
Hi all,


I'm slowly shopping around for a functional language. The first one I
seriously considered was Clojure, because of its agents; but I'm now
realizing that most of my concurrency needs can be met using 'par' and
'seq', so I'm thinking about giving Haskell a closer look.

However, Clojure's collections (maps, lists, vectors, sets) have a
couple of features that I would really miss:


- Cheap 'manipulation'. In the following:

user=> (def m {:a 1 :b 2})
#'user/m1
user=> (assoc m :a 3)
{:a 3, :b 2}

the 'assoc' function produces a new map which actually shares
structure with m, meaning it doesn't need to make a full copy in
memory; but it also leaves m unchanged. This works efficiently even
for nested, mixed collections.


- Cheap equality by value:

user=> (= m {:a 1 :b 2 :c {:d 3 :f 4}})
false
user=> (= m {:a 1 :b 2})
true

If I understand correctly, equality is computed based on some kind of
hash rather than by comparing the two maps element by element, so it's
efficient even for large and/or nested collections.


Do Haskell's data structures have analogous properties? If not, are
there libraries providing such data structures?


Thanks,
Anand
Reply | Threaded
Open this post in threaded view
|

Re: Haskell data structures: cheap immutable manipulation and nested equality?

Heinrich Apfelmus
Anand Patil wrote:

> - Cheap 'manipulation'. In the following:
>
> user=> (def m {:a 1 :b 2})
> #'user/m1
> user=> (assoc m :a 3)
> {:a 3, :b 2}
>
> the 'assoc' function produces a new map which actually shares
> structure with m, meaning it doesn't need to make a full copy in
> memory; but it also leaves m unchanged. This works efficiently even
> for nested, mixed collections.

Since Haskell is pure, this is the default. Every data structure is
persistent and shares as much as possible with previous versions.
Example in point: an infinite list of ones

   ones = 1:ones


> - Cheap equality by value:
>
> user=> (= m {:a 1 :b 2 :c {:d 3 :f 4}})
> false
> user=> (= m {:a 1 :b 2})
> true
>
> If I understand correctly, equality is computed based on some kind of
> hash rather than by comparing the two maps element by element, so it's
> efficient even for large and/or nested collections.
>
> Do Haskell's data structures have analogous properties? If not, are
> there libraries providing such data structures?

Not present in Haskell, no existing library either. You can roll your
own, of course. (Also note that hash based comparison might yield false
positives).

Why would you need this feature? I can only think of common
subexpression elimination.


Generally, the above two points are minor compared to the other things
that Haskell has to offer, like pattern matching, Hindler Milner type
system, purity.


Regards,
apfelmus

--
http://apfelmus.nfshost.com

Reply | Threaded
Open this post in threaded view
|

Re: Haskell data structures: cheap immutable manipulation and nested equality?

Heinrich Apfelmus
Anand Patil wrote:
Heinrich Apfelmus wrote:

>> Anand Patil wrote:
>>> - Cheap equality by value:
>>>
>>> user=> (= m {:a 1 :b 2 :c {:d 3 :f 4}})
>>> false
>>> user=> (= m {:a 1 :b 2})
>>> true
>>>
>>> If I understand correctly, equality is computed based on some kind of
>>> hash rather than by comparing the two maps element by element, so it's
>>> efficient even for large and/or nested collections.
>>
>> Why would you need this feature? I can only think of common
>> subexpression elimination.
>
> I have an expensive function that takes complicated data structures as
> an argument, and I know that it will often be evaluated twice running
> with the same argument, but it will be evaluated with lots of
> different arguments over the course of the program. A cheap equality
> test would make it easy to cache the last return value. Is there a
> better way to optimize this in Haskell?

Sounds indeed like a case for memoization to me. (While complicated data
structures as keys sounds suspicious, in my opinion.)

Cheap equality won't necessarily help, though, you want at least the
hash value itself. But since calculating a hash will take O(length of
complicated data structure) anyway, you can also use a memoization
scheme based on "sums of products" which has the same complexity^1. In
particular, have a look at

   http://hackage.haskell.org/package/data-memocombinators

which is used like this:

   fib = Memo.integral fib'
      where
      fib' 0 = 0
      fib' 1 = 1
      fib' x = fib (x-1) + fib (x-2)

You'll have to combine existing memo combinators to make one for your
data structure; feel free to ask if you need more info on that.


^1: For common subexpression elimination, intermediate hashes count as
well, so this comparison doesn't apply.


Regards,
apfelmus

--
http://apfelmus.nfshost.com