Proposal: priority queues in containers

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Proposal: priority queues in containers

Louis Wasserman
PROPOSAL:  Add a priority queue implementation to the containers package.  Specific modules will include Data.PQueue.Min, Data.PQueue.Max, and Data.PQueue.

DEADLINE:  Next Friday, Mar 26.

The ticket, #3909, is here.  It had done some bouncing around glasgow-haskell-users (mostly because I forgot how to do a proper libraries proposal submission), and there have been several implementations benchmarked, compared, and argued about.

DETAILS:

The proposed implementation benchmarked competitively with every alternative implementation that we tested, and offers good asymptotics in nearly every operation.  Specifically, we use a binomial heap, which offers
  • O(log n) worst-case union
  • O(log n) worst-case extract (this in particular was a key objective of ours)
  • amortized O(1), worst-case O(log n) insertion.  (In a persistent context, the amortized performance bound does not necessarily hold.)
This implementation seems to offer the best balance between practical performance and asymptotic behavior.  Moreover, this implementation is extremely memory-efficient, resulting in better performance on large data sets.  (See the ticket for benchmarks.  The most accurate benchmarks are towards the end.)

A couple potentially contentious design decisions:
  • There is no distinction between keys and priority values.  A utility type Prio p a with the instance Ord p => Ord (Prio p a) is exported to allow usage of distinct keys and priority values.
  • Min-queues and max queues are separated the following way:
    • Data.PQueue.Min exports the type MinQueue.
    • Data.PQueue.Max exports the type MaxQueue.  (This is implemented as a wrapper around MinQueue.)  The method names are the same as Data.PQueue.Min, but I think it's acceptable to force qualified imports when using both a max-queue and a min-queue.
    • Data.PQueue adds the alias type PQueue = MinQueue, so that the "default" behavior is a min-queue.
These design decisions seem to be sufficient to handle most traditional uses for priority queues.  In particular, this approach offers a superset of the functionality offered by Java's built-in priority queue implementation, which makes the same design decisions, but of course, is all imperative and yucky!  (Also, it offers inferior asymptotics, heh.)

I'm really satisfied with the patch as-is, modulo maybe tinkering with the code style a little.  I'm also working on an article for TMR on priority queues in Haskell, some of the different structures we considered, and particularly the new more-type-safe implementation I came up with for binomial heaps in the writing of this implementation.

Anyway, discuss, complain, et cetera.



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Malcolm Wallace

On 16 Mar 2010, at 13:54, Louis Wasserman wrote:

> PROPOSAL:  Add a priority queue implementation to the containers  
> package.  Specific modules will include Data.PQueue.Min,  
> Data.PQueue.Max, and Data.PQueue.

I have not yet needed a priority queue for any application, so I have  
no specific technical opinion in this particular proposal.

I would suggest that if you continue to receive relative silence on  
the topic, then it may not be a good candidate for the standard  
"containers" library, simply because a demand for it has not yet been  
demonstrated.  (You can easily release it as a separate package anyway.)

This, I hope, will be sufficient of a cue for anyone who _does_ care  
about PQs to speak up now!

Regards,
     Malcolm
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Johan Tibell-2
In reply to this post by Louis Wasserman
On Tue, Mar 16, 2010 at 2:54 PM, Louis Wasserman
<[hidden email]> wrote:
> PROPOSAL:  Add a priority queue implementation to the containers package.
>  Specific modules will include Data.PQueue.Min, Data.PQueue.Max, and
> Data.PQueue.
> DEADLINE:  Next Friday, Mar 26.
> The ticket, #3909, is here.  It had done some bouncing around
> glasgow-haskell-users (mostly because I forgot how to do a proper libraries
> proposal submission), and there have been several implementations
> benchmarked, compared, and argued about.

Could please post the code and the generated Haddock documentation
somewhere for easier review?

Thanks!

-- Johan
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Ian Lynagh
In reply to this post by Louis Wasserman
On Tue, Mar 16, 2010 at 08:54:15AM -0500, Louis Wasserman wrote:
> PROPOSAL:  Add a priority queue implementation to the containers package.
>  Specific modules will include Data.PQueue.Min, Data.PQueue.Max, and
> Data.PQueue.

I don't have an opinion on whether or not a priority queue should be
added to containers, although I do wonder how this implementation
compares to packages on hackage, e.g.:

http://hackage.haskell.org/package/queuelike
http://hackage.haskell.org/package/priority-queue
http://hackage.haskell.org/package/pure-priority-queue
http://hackage.haskell.org/package/queue
http://hackage.haskell.org/package/fingertree-psqueue
http://hackage.haskell.org/package/PSQueue

Personally, I would also like to see us moving towards having a
testsuite for each library, so I would like to see new additions coming
with tests so that we aren't moving further from that goal.


Thanks
Ian

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Louis Wasserman
http://hackage.haskell.org/package/queuelike
I wrote this one.  Heh.  This is an improvement on it.

More importantly, what I wanted to achieve by adding priority queues to containers was having a canonical implementation for a ubiquitous data structure, which has a simple and user-friendly interface (something the queuelike package, above, attempts but isn't extraordinarily successful at), which is guaranteed to perform efficiently and correctly.

As a comparison example, there are many, many finite map implementation packages.  (I wrote TrieMap, which is a pretty involved automatic generalized trie map type inference system...yeah.)  There's a reason, however, that Data.Map exists, and that so many people use it when an algorithmically superior implementation might exist on Hackage -- because it's programmer-friendly and accessible, and because its performance is well-understood and has been thoroughly verified.  When I use Data.Map, I'm not trusting some random schlub to have written bug-free code -- I'm trusting the Haskell built-in libraries, and the people who write and maintain those libraries.

I think priority queues are ubiquitous enough to deserve the same treatment.

On another note, Ian, it's really not very clear what the proper procedure for adding test suites is.  Is it to just add a tester program separately from the patch?  Is it to put the test suites somewhere within the containers folder, and have them be an integral part of the patch?  There seem to be some sort-of-test-ish files in the containers package already, but I have no idea how to add new tests, and if the existing interface is extensible, it's not at all clear how to extend it.

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Jim Apple-4
In reply to this post by Ian Lynagh
> I do wonder how this implementation
> compares to packages on hackage, e.g.:

Also:

http://hackage.haskell.org/package/heap
http://hackage.haskell.org/package/TreeStructures
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Louis Wasserman
In reply to this post by Louis Wasserman
Also, looking through the packages you linked to: "queue" and "priority-queue" seem to have extremely icky interfaces, nothing as readable as a containers module.  "pure-priority-queue" is a wrapper around Data.Map, which is definitely slower than a specialized priority queue.  

Similarly, "fingertree-psqueue" and "PSQueue", while they expose nicely readable interfaces, are *priority search queues,* which provide considerably stronger functionality than a priority queue.  (Specifically, they provide things like lookup-priority and increase-priority.)  When being used as a vanilla priority queue, PSQs are considerably slower than specialized priority queues -- this was one of the implementations we tested.

The priority search queue is a useful structure for a number of applications, but much more commonly, a simple priority queue is all that is required.  In particular, both the C++ STL and java.util.* provide just a vanilla priority queue, reflecting those design choices in those languages.  

Since priority search queues are actually much more natural in imperative languages than in functional languages -- since we can maintain pointers to the node associated with each key -- I think it makes even less sense for Haskell to use PSQueues in containers.  I think it's fair to say that most algorithms which need it only really require priority queues, and that it makes sense to provide for that majority in containers, though we might consider recommending the PSQueue package to programmers who need the full power of priority search queues.

My own queuelike library...is decent, but not spectacular, and reflected some early experimentation.  The implementation attached to the ticket, however, is more reliable than the implementations I wrote in queuelike.  (In particular, Data.PQueue has worst-case O(n) delete-min, and in a persistent context, its amortized performance is still O(n).  Ouch.)

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Jim Apple-4
> Since priority search queues are actually much more natural in imperative
> languages than in functional languages -- since we can maintain pointers to
> the node associated with each key -- I think it makes even less sense for
> Haskell to use PSQueues in containers.  I think it's fair to say that most
> algorithms which need it only really require priority queues, and that it
> makes sense to provide for that majority in containers, though we might
> consider recommending the PSQueue package to programmers who need the full
> power of priority search queues.

I'm not sure about some of that. Imperative queues sometimes offer
O(1) decreaseKey and O(lg n) delete by storing pointers to elements in
a priority queue. The usual purely functional PQs can only offer O(n)
decreaseKey and O(n) delete. Purely functional PSQs can offer O(lg n)
decreaseKey and O(lg n) delete.

Minimum spanning tree is a common application for PQs that makes good
use of decreaseKey.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Yitzchak Gale
In reply to this post by Louis Wasserman
Louis Wasserman wrote:
> ...what I wanted to achieve by adding priority queues to
> containers was having a canonical implementation for a ubiquitous data
> structure, which has a simple and user-friendly interface...
> which is guaranteed to perform efficiently and correctly.

This is a good goal.

> ...There's a reason, however, that Data.Map exists...
> I think priority queues are ubiquitous enough to deserve the same treatment.

This I'm not sure about. I've considered using PQ's perhaps
two or three times during the past few years, and ended up not using
them at all. True, perhaps I would have used them if there had been a
simple and reliable PQ implementation at my fingertips. But it's
definitely not a ubiquitous meme like Map or Set.

How about recording this great work as yet one more PQ package
on hackage. But make it far more usable than all of the others
by two simple steps:

1. In the package description, say simply and clearly what purpose
this package is designed to serve, as you have described in this
thread.

2. Submit this package for canonicalization as part of the Haskell
Platform. I would for one would support its inclusion.

> both the C++ STL and java.util.* provide just a vanilla priority queue,
> reflecting those design choices in those languages

As does Python. In Python, though, the PQ implementation is not
a built-in class in the default namespace (as are dict and set).
Rather, it is one of the "batteries included" libraries that come with
Python. I think that's the right place for it in Haskell, too.

Please do consider this suggestion. However, if we do not quickly
reach a consensus on this, I will withdraw my suggestion and advocate
inclusion in containers as originally proposed. The difference between
those options is far less important than the risk of losing this great work
to haggling.

Thanks,
Yitz
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Ross Paterson
In reply to this post by Louis Wasserman
On Tue, Mar 16, 2010 at 08:54:15AM -0500, Louis Wasserman wrote:
> A couple potentially contentious design decisions:
>
>   * There is no distinction between keys and priority values.  A utility type
>     Prio p a with the instance Ord p => Ord (Prio p a) is exported to allow
>     usage of distinct keys and priority values.

I disagree with this one.  It requires an Ord instance that isn't really
an ordering, and makes a Functor instance impossible.  I would prefer
an interface separating keys and values like that of Data.Map (which
would also increase consistency within the package).

Other comments:

The Foldable instance breaks the abstraction.  I think it should process
elements in order.

How does this implementation compare with using Map/Set as a priority
queue?

The documentation for union should mention that it's not stable.  That's a
pity, but may be justified if there's a big performance difference with
the representations that support stable union.

I think the containers package should offer general extractor functions
with a parameter of type (a -> Maybe (b, a)), so we wouldn't need special
cases like these ones.
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Bryan O'Sullivan
In reply to this post by Yitzchak Gale
On Wed, Mar 17, 2010 at 4:52 AM, Yitzchak Gale <[hidden email]> wrote:
This I'm not sure about. I've considered using PQ's perhaps
two or three times during the past few years, and ended up not using
them at all.

I'd prefer for personal anecdote not to be the driver of discussions like this. Personally, I've never used Data.Tree or Data.Graph from the containers package, and doubt that I ever will, but it doesn't break my arm or steal my wristwatch if they're available.

The types in the containers package have subtly different APIs and degrees of control over things like strictness, and also have almost no test coverage along with few signs of usage-driven optimisation. The bar for adding more code to that package is thus pretty low, in my opinion. If I had time, I'd replace it entirely.

My advice would be not to stress over whether priority queues go into containers. It's not some pristine thing of beauty that deserves treatment with velvet gloves. If you want a good source of stress, create a replacement package that makes me happy :-)
 
True, perhaps I would have used them if there had been a
simple and reliable PQ implementation at my fingertips. But it's
definitely not a ubiquitous meme like Map or Set.

How about recording this great work as yet one more PQ package
on hackage. But make it far more usable than all of the others
by two simple steps:

1. In the package description, say simply and clearly what purpose
this package is designed to serve, as you have described in this
thread.

2. Submit this package for canonicalization as part of the Haskell
Platform. I would for one would support its inclusion.

> both the C++ STL and java.util.* provide just a vanilla priority queue,
> reflecting those design choices in those languages

As does Python. In Python, though, the PQ implementation is not
a built-in class in the default namespace (as are dict and set).
Rather, it is one of the "batteries included" libraries that come with
Python. I think that's the right place for it in Haskell, too.

Please do consider this suggestion. However, if we do not quickly
reach a consensus on this, I will withdraw my suggestion and advocate
inclusion in containers as originally proposed. The difference between
those options is far less important than the risk of losing this great work
to haggling.

Thanks,
Yitz
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Ian Lynagh
In reply to this post by Louis Wasserman
On Tue, Mar 16, 2010 at 11:54:35AM -0500, Louis Wasserman wrote:
>
> On another note, Ian, it's really not very clear what the proper procedure
> for adding test suites is.  Is it to just add a tester program separately
> from the patch?  Is it to put the test suites somewhere within the
> containers folder, and have them be an integral part of the patch?  There
> seem to be some sort-of-test-ish files in the containers package already,
> but I have no idea how to add new tests, and if the existing interface is
> extensible, it's not at all clear how to extend it.

The best way is probably to make tests using the GHC testsuite framework
in tests/priority-queue/. The framework would benefit from some tweaking
so that it is better able to test a single library in isolation, though.


Thanks
Ian

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Gwern Branwen
In reply to this post by Bryan O'Sullivan
On Wed, Mar 17, 2010 at 11:48 AM, Bryan O'Sullivan <[hidden email]> wrote:

> On Wed, Mar 17, 2010 at 4:52 AM, Yitzchak Gale <[hidden email]> wrote:
>>
>> This I'm not sure about. I've considered using PQ's perhaps
>> two or three times during the past few years, and ended up not using
>> them at all.
>
> I'd prefer for personal anecdote not to be the driver of discussions like
> this. Personally, I've never used Data.Tree or Data.Graph from the
> containers package, and doubt that I ever will, but it doesn't break my arm
> or steal my wristwatch if they're available.

Skipping anecdote, then, let's look at dependencies. Does anyone use
the existing priority-queues? The ensemble doesn't look so bad that
one would always roll one's own priority queue, yet I don't see *any*
non-queue libraries/apps depending on any of the queue libraries:

http://bifunctor.homelinux.net/~roel/hackage/packages/archive/revdeps-list.html

(I'm not sure how often that's updated but I wouldn't expect being up
to date to make much of a difference.)

If there really are no users of priority-queue, that's a pretty good
argument for keeping priority-queue out of containers, leaving it on
Hackage, and coming back in a year or two to see whether there's been
any uptake.

--
gwern
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Brandon S Allbery KF8NH
In reply to this post by Bryan O'Sullivan
On Mar 17, 2010, at 11:48 , Bryan O'Sullivan wrote:
My advice would be not to stress over whether priority queues go into containers. It's not some pristine thing of beauty that deserves treatment with velvet gloves. If you want a good source of stress, create a replacement package that makes me happy :-)

Especially in the absence of guidelines for "that makes you happy", I think the answer to that one is "build it and see if they come".  (There does seem to be a lot of interest in a "cleaner" containers library, but not a whole lot of consensus.)

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [hidden email]
system administrator [openafs,heimdal,too many hats] [hidden email]
electrical and computer engineering, carnegie mellon university    KF8NH



_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

PGP.sig (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Bugzilla from jed@59a2.org
In reply to this post by Yitzchak Gale
On Wed, 17 Mar 2010 13:52:36 +0200, Yitzchak Gale <[hidden email]> wrote:
> As does Python. In Python, though, the PQ implementation is not
> a built-in class in the default namespace (as are dict and set).
> Rather, it is one of the "batteries included" libraries that come with
> Python. I think that's the right place for it in Haskell, too.

FWIW, Haskell's containers are far less central than Python's dict, upon
which the type system itself is built.

Jed
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Simon Marlow-7
In reply to this post by Ian Lynagh
On 17/03/2010 15:56, Ian Lynagh wrote:

> On Tue, Mar 16, 2010 at 11:54:35AM -0500, Louis Wasserman wrote:
>>
>> On another note, Ian, it's really not very clear what the proper procedure
>> for adding test suites is.  Is it to just add a tester program separately
>> from the patch?  Is it to put the test suites somewhere within the
>> containers folder, and have them be an integral part of the patch?  There
>> seem to be some sort-of-test-ish files in the containers package already,
>> but I have no idea how to add new tests, and if the existing interface is
>> extensible, it's not at all clear how to extend it.
>
> The best way is probably to make tests using the GHC testsuite framework
> in tests/priority-queue/. The framework would benefit from some tweaking
> so that it is better able to test a single library in isolation, though.

Shouldn't the tests go in the library repo itself?

Also, if we want people to use the GHC testsuite framework for
individual library testing, we ought to separate the framework from the
huge test suite itself.

Then I bet we'd get complaints that it isn't written in the right
language :-)

Cheers,
        Simon
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Ian Lynagh
On Thu, Mar 18, 2010 at 10:50:05AM +0000, Simon Marlow wrote:

> On 17/03/2010 15:56, Ian Lynagh wrote:
>> On Tue, Mar 16, 2010 at 11:54:35AM -0500, Louis Wasserman wrote:
>>>
>>> On another note, Ian, it's really not very clear what the proper procedure
>>> for adding test suites is.  Is it to just add a tester program separately
>>> from the patch?  Is it to put the test suites somewhere within the
>>> containers folder, and have them be an integral part of the patch?  There
>>> seem to be some sort-of-test-ish files in the containers package already,
>>> but I have no idea how to add new tests, and if the existing interface is
>>> extensible, it's not at all clear how to extend it.
>>
>> The best way is probably to make tests using the GHC testsuite framework
>> in tests/priority-queue/. The framework would benefit from some tweaking
>> so that it is better able to test a single library in isolation, though.
>
> Shouldn't the tests go in the library repo itself?

Right, I mean tests/priority-queue/ in the containers repo (there are
already a couple of tests in tests/).


Thanks
Ian

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Louis Wasserman
In reply to this post by Bugzilla from jed@59a2.org
Oh, god, so much to respond to...heh.

 Submit this package for canonicalization as part of the Haskell Platform. I would for one would support its inclusion.

This is an option I seriously hadn't considered.  To be fair, that's because I've never used the Platform myself, preferring rather to have the most up-to-date version of GHC at all times, heh.  That said, while I'd be okay with this option, I'd prefer putting it into containers, because I feel like a canonical, reliable priority queue implementation is the sort of thing a self-respecting language ought to have built in.

As does Python. In Python, though, the PQ implementation is not a built-in class in the default namespace (as are dict and set).  Rather, it is one of the "batteries included" libraries that come with Python. I think that's the right place for it in Haskell, too.

I don't know Python, but according to Wikipedia, dict and set are built into the language.  I don't think it's a fair comparison: set and dict in Python seem to have a role almost as ubiquitous as [] in Haskell, much more ubiquitous than e.g. Data.Set or Data.Map.  I'm also not entirely sure that "batteries included" doesn't describe containers, given all the other packages that come with GHC.

 >   * There is no distinction between keys and priority values.  A utility type
 >     Prio p a with the instance Ord p => Ord (Prio p a) is exported to allow
 >     usage of distinct keys and priority values. 
I disagree with this one.  It requires an Ord instance that isn't really an ordering, and makes a Functor instance impossible.  I would prefer an interface separating keys and values like that of Data.Map (which would also increase consistency within the package).

I'd be okay with separating out a priority/value version.  However, I'm still clueless as to what you're talking about with respect to Functor -- what's wrong with the following?
data Prio p a = Prio p a
instance Ord p => Ord (Prio p a) where ...
instance Functor (Prio p) where fmap f (Prio p a) = Prio p (f a)

I can understand if you're uncomfortable with (==) and (\ x y -> compare x y == EQ) being inequivalent, but neither the H98 Report nor the Prelude make any such claim, as far as I can tell.

 The Foldable instance breaks the abstraction.  I think it should process elements in order.

I think that wanting to iterate over the priority queue in the middle of the computation, without caring about order, is a perfectly legitimate desire for a programmer!  Moreover, the only way to make a Foldable instance process elements in order would be something like
data Ord a => PQueue a = ....
which I think is an awfully dirty approach.  I'm not at all a fan of adding restrictions like that, not least because it adds lots of awkward overhead.
Would you be okay with not exporting a Foldable instance at all, but still exporting a toList method which doesn't guarantee any ordering on the return list?

My advice would be not to stress over whether priority queues go into containers. It's not some pristine thing of beauty that deserves treatment with velvet gloves.

I'm...not sure how to respond to this claim.  At least part of me wants to say, I genuinely do think the containers package is a piece of art...and then another part pipes up, "except for the inconsistencies between the various minView/maxView versions, and the little differences between IntMap and Map, and..."  That said, I wouldn't be a fan of scrapping the style which the containers package has at least tried to create.  I could be convinced that rewriting the rest of containers would be a good thing to do, though...and I might just want to do that myself.  Hah.

 How does this implementation compare with using Map/Set as a priority queue?

Continuing the discussion of the benchmarks: first, Jim, it appears that I'm the one who made a n00b benchmarking error.  TT_____TT  That said, as you found, my implementation is still slightly faster when the benchmark is corrected.  Some comments:
  • QuickBinom needs to have O(1) findMin for a reasonable comparison.  I added that in the benchmark below, and here.
  • I can't think of any more optimizations for the sparse binomial heap -- I genuinely think it's not going to get better.
  • There is a way to optimize my implementation still further, but it makes my code much less readable.  (Specifically, I start the BinomForest at Succ Zero, and unpack the first child of every node still in the forest.  Modifying the whole implementation that way, though, makes it unreadably ugly...and I think QuickBinom is possibly already at that point.  I started implementing it, and realized just how ugly it was, and I stopped, but I can finish it if I have to.)
Sorting 500,000 elements, compiled with -O2, run with +RTS -H128m -K64m, with another few implementations thrown in for good measure:
Times (ms)
               min      mean      +/-sd    median      max  
Pairing:    1440.090  1482.893    31.501  1482.093  1532.095
Binomial:   1356.084  1389.687    26.881  1388.087  1436.090
SBinomial:  1376.086  1422.489    48.453  1400.088  1520.095
Data.Set:   1712.107  1800.912    74.880  1782.111  1976.123
Skew:       1584.099  1644.503    85.702  1602.101  1848.116

Some other benchmarks were done by Milan Straka earlier, when we hadn't decided what heap implementation to use at all.  His comments:
I think the pairing heaps are out of the question now. I would choose between Binomial and Leftist. The Binomial have O(1) amortized inserts, but beware, that this does not work in persistent setting -- the insert is O(log N) when the heaps are used persistently (there are Skew Binomial Heaps with O(1) insert in the persistent setting too). 
 I vote for current (v5) Binomial implementation, even if the O(1) amortized inserts works only non-persistently (ie. under heavy persistent usage, leftist heaps might be a _little_ better).

Conclusions: There aren't any differences in asymptotics, and as it stands, the existing implementation is just as fast.  It's also a) done, and b) full of Haskellish type goodness.

After about five hours' work (!!!) I *finally* managed to install Criterion yesterday, so I'll send out those tests ASAP.

Louis Wasserman
[hidden email]
http://profiles.google.com/wasserman.louis


_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Evan Laforge
On Thu, Mar 18, 2010 at 7:43 AM, Louis Wasserman
<[hidden email]> wrote:
> Oh, god, so much to respond to...heh.

You did request feedback back there, didn't you :P

>> As does Python. In Python, though, the PQ implementation is not a built-in
>> class in the default namespace (as are dict and set).  Rather, it is one of
>> the "batteries included" libraries that come with Python. I think that's the
>> right place for it in Haskell, too.
>
> I don't know Python, but according to Wikipedia, dict and set are built into
> the language.  I don't think it's a fair comparison: set and dict in Python
> seem to have a role almost as ubiquitous as [] in Haskell, much more

It's not really the same.  pqueue is not in the built-in namespace in
python, that's like the Prelude in haskell.  pqueue *is* in the
default library, which every python installation will have since it
comes with the default download, this is what's meant by "batteries
included".  So that's like containers: you have to explicitly import
it, but you shouldn't worry about installations that don't have it
because it comes with the compiler.

The main difference is that python doesn't have cabal and doesn't have
anything like the haskell platform and installing new packages, while
easy, is not as automatic as cabal can be.

> After about five hours' work (!!!) I *finally* managed to install Criterion
> yesterday, so I'll send out those tests ASAP.

I wanted to use criterion too at one point, but it looked too hard to
install so I was scared away...
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
Reply | Threaded
Open this post in threaded view
|

Re: Proposal: priority queues in containers

Milan Straka
Hi,

> > After about five hours' work (!!!) I *finally* managed to install Criterion
> > yesterday, so I'll send out those tests ASAP.
>
> I wanted to use criterion too at one point, but it looked too hard to
> install so I was scared away...

Why is that? Because of the Chart depencency?

You can install progression -- a wrapper over criterion, which uses
gnuplot to draw graphs, and disable the Chart dependency.

I did just
cabal --user install -f "-Chart" criterion progression
and it installed without a problem, on ghc-6.12.1, ghc-6.13.something,
both Linux and Windows.

Cheers,
Milan
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries
12