Dear Cafe,
some years ago I formalized the mergesort implementation [1] from http://hackage.haskell.org/package/base-4.11.1.0/docs/src/Data.OldList.html#sort (as far as I can tell it didn't change since 2012) in the proof assistant Isabelle/HOL [2]. More specifically, I proved its functional correctness (the result is sorted and contains all elements of the input with exactly the same multiplicities) and that it is a stable sorting algorithm. Very recently I also formalized a complexity result in Isabelle/HOL, namely that the number of comparisons is bounded from above by n + n * ⌈log 2 n⌉ for lists of length n. For this proof I had to change the definition of "sequences", "ascending", and "descending" slightly. Now here is my question: Does anyone now of reasons why the current implementation of "sequences" is allowed to produce spurious empty lists in its result? The version I used in my formalization only differs in the following three spots: sequences [x] = [[x]] -- this is the only important change, since sequences [] = [] -- then the result does not contain empty lists instead of sequences xs = [xs] and ascending a as [] = let !x = as [a] in [x] instead of ascending a as bs = let !x = as [a] in x : sequences bs and descending a as [] = [a:as] instead of descending a as bs = (a:as) : sequences bs [1] https://www.isa-afp.org/entries/Efficient-Mergesort.html [2] http://isabelle.in.tum.de/ _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Hi,
Am Dienstag, den 18.09.2018, 10:21 +0200 schrieb Christian Sternagel: > some years ago I formalized the mergesort implementation [1] from > > > http://hackage.haskell.org/package/base-4.11.1.0/docs/src/Data.OldList.html#sort > > (as far as I can tell it didn't change since 2012) in the proof > assistant Isabelle/HOL [2]. > > More specifically, I proved its functional correctness (the result is > sorted and contains all elements of the input with exactly the same > multiplicities) and that it is a stable sorting algorithm. just a shameless plug: As part of our hs-to-coq project, we mechanically translated Data.List.sort to Coq, proved its termination (for finite inputs, of course) and functional correctness; https://github.com/antalsz/hs-to-coq/blob/master/examples/containers/theories/SortSorted.v > Very recently I also formalized a complexity result in Isabelle/HOL, > namely that the number of comparisons is bounded from above by > > n + n * ⌈log 2 n⌉ > > for lists of length n. Cool! That’s of course not easily possible with out shallow embedding into Coq. > For this proof I had to change the definition of "sequences", > "ascending", and "descending" slightly. > > Now here is my question: Does anyone now of reasons why the current > implementation of "sequences" is allowed to produce spurious empty lists > in its result? The version I used in my formalization only differs in > the following three spots: > … It _could_ have been benchmarked and measured and determined that it is actually faster to do less case analysis here. But I find that unlikely (both the investigation and this output), and it is probably simply an oversight, and I would expect that a patch to that effect would be appreciated ! Cheers, Joachim -- Joachim Breitner [hidden email] http://www.joachim-breitner.de/ _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. signature.asc (849 bytes) Download Attachment |
In reply to this post by Christian Sternagel
If you're guaranteeing that the result won't contain empty lists, it would be worth benchmarking the effect of using NonEmpty x instead of [x] in that spot. On Tue, Sep 18, 2018, 4:21 AM Christian Sternagel <[hidden email]> wrote: Dear Cafe, _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Dear David,
I am guaranteeing (since I proved it in Isabelle/HOL) that the following version of "sequences" does not contain empty lists in its result (I am copying from my Isabelle formalization, in order to avoid typos) ;) fun sequences :: "'a list ⇒ 'a list list" and asc :: "'a ⇒ ('a list ⇒ 'a list) ⇒ 'a list ⇒ 'a list list" and desc :: "'a ⇒ 'a list ⇒ 'a list ⇒ 'a list list" where "sequences (a # b # xs) = (if key a > key b then desc b [a] xs else asc b ((#) a) xs)" | "sequences [x] = [[x]]" | "sequences [] = []" | "asc a as (b # bs) = (if ¬ key a > key b then asc b (λys. as (a # ys)) bs else as [a] # sequences (b # bs))" | "asc a as [] = [as [a]]" | "desc a as (b # bs) = (if key a > key b then desc b (a # as) bs else (a # as) # sequences (b # bs))" | "desc a as [] = [a # as]" The "key" function is an implicit first parameter for each of "sequences", "asc", and "desc" above. The fact that I am using a "key" function instead of a comparator is due to Isabelle/HOL's standard library. Also, there are no pattern guards in Isabelle/HOL. Anyway, it should be relatively straight-forward to translate these functions into Haskell. Another thing: I just realized that "merge_pairs" in my formalization also differs from "mergePairs", since with the changed "sequences" it might actually get an empty list as input, in which case the current "mergePairs" wouldn't terminate at all. So for those who are interested, the full definition of mergesort can be found here https://devel.isa-afp.org/browser_info/current/AFP/Efficient-Mergesort/Efficient_Mergesort.html where mergesort is called "msort_key". cheers chris Btw: What is "NonEmpty x"? On 09/18/2018 11:55 AM, David Feuer wrote: > If you're guaranteeing that the result won't contain empty lists, it > would be worth benchmarking the effect of using NonEmpty x instead of > [x] in that spot. > > On Tue, Sep 18, 2018, 4:21 AM Christian Sternagel <[hidden email] > <mailto:[hidden email]>> wrote: > > Dear Cafe, > > some years ago I formalized the mergesort implementation [1] from > > > http://hackage.haskell.org/package/base-4.11.1.0/docs/src/Data.OldList.html#sort > > (as far as I can tell it didn't change since 2012) in the proof > assistant Isabelle/HOL [2]. > > More specifically, I proved its functional correctness (the result is > sorted and contains all elements of the input with exactly the same > multiplicities) and that it is a stable sorting algorithm. > > Very recently I also formalized a complexity result in Isabelle/HOL, > namely that the number of comparisons is bounded from above by > > n + n * ⌈log 2 n⌉ > > for lists of length n. > > For this proof I had to change the definition of "sequences", > "ascending", and "descending" slightly. > > Now here is my question: Does anyone now of reasons why the current > implementation of "sequences" is allowed to produce spurious empty lists > in its result? The version I used in my formalization only differs in > the following three spots: > > sequences [x] = [[x]] -- this is the only important change, since > sequences [] = [] -- then the result does not contain empty lists > > instead of > > sequences xs = [xs] > > and > > ascending a as [] = let !x = as [a] in [x] > > instead of > > ascending a as bs = let !x = as [a] in x : sequences bs > > and > > descending a as [] = [a:as] > > instead of > > descending a as bs = (a:as) : sequences bs > > [1] https://www.isa-afp.org/entries/Efficient-Mergesort.html > [2] http://isabelle.in.tum.de/ > _______________________________________________ > Haskell-Cafe mailing list > To (un)subscribe, modify options or view archives go to: > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe > Only members subscribed via the mailman list are allowed to post. > Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
data NonEmpty a = a :| [a] It's a nonempty list, defined in Data.List.NonEmpty. On Tue, Sep 18, 2018, 9:14 AM Christian Sternagel <[hidden email]> wrote: Dear David, _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Dear David,
thanks for the pointer. Btw: I was able to modify my complexity proof so that "sequences" is no longer required to only contain non-empty lists. Sorry for the noise. But maybe such a "sequences" is not entirely useless: Did I understand correctly that what you are saying is that knowing that all elements in "sequences" are non-empty lists might have some impact on performance due to being able to use "NonEmpty a"? cheers chris On 09/18/2018 03:18 PM, David Feuer wrote: > data NonEmpty a = a :| [a] > > It's a nonempty list, defined in Data.List.NonEmpty. > > On Tue, Sep 18, 2018, 9:14 AM Christian Sternagel <[hidden email] > <mailto:[hidden email]>> wrote: > > Dear David, > > I am guaranteeing (since I proved it in Isabelle/HOL) that the following > version of "sequences" does not contain empty lists in its result (I am > copying from my Isabelle formalization, in order to avoid typos) ;) > > fun sequences :: "'a list ⇒ 'a list list" > and asc :: "'a ⇒ ('a list ⇒ 'a list) ⇒ 'a list ⇒ 'a list list" > and desc :: "'a ⇒ 'a list ⇒ 'a list ⇒ 'a list list" > where > "sequences (a # b # xs) = > (if key a > key b then desc b [a] xs else asc b ((#) a) xs)" > | "sequences [x] = [[x]]" > | "sequences [] = []" > | "asc a as (b # bs) = > (if ¬ key a > key b then asc b (λys. as (a # ys)) bs > else as [a] # sequences (b # bs))" > | "asc a as [] = [as [a]]" > | "desc a as (b # bs) = > (if key a > key b then desc b (a # as) bs > else (a # as) # sequences (b # bs))" > | "desc a as [] = [a # as]" > > The "key" function is an implicit first parameter for each of > "sequences", "asc", and "desc" above. The fact that I am using a "key" > function instead of a comparator is due to Isabelle/HOL's standard > library. Also, there are no pattern guards in Isabelle/HOL. Anyway, it > should be relatively straight-forward to translate these functions into > Haskell. > > Another thing: I just realized that "merge_pairs" in my formalization > also differs from "mergePairs", since with the changed "sequences" it > might actually get an empty list as input, in which case the current > "mergePairs" wouldn't terminate at all. > > So for those who are interested, the full definition of mergesort can be > found here > > > https://devel.isa-afp.org/browser_info/current/AFP/Efficient-Mergesort/Efficient_Mergesort.html > > where mergesort is called "msort_key". > > cheers > > chris > > Btw: What is "NonEmpty x"? > > On 09/18/2018 11:55 AM, David Feuer wrote: > > If you're guaranteeing that the result won't contain empty lists, it > > would be worth benchmarking the effect of using NonEmpty x instead of > > [x] in that spot. > > > > On Tue, Sep 18, 2018, 4:21 AM Christian Sternagel > <[hidden email] <mailto:[hidden email]> > > <mailto:[hidden email] <mailto:[hidden email]>>> wrote: > > > > Dear Cafe, > > > > some years ago I formalized the mergesort implementation [1] from > > > > > > > http://hackage.haskell.org/package/base-4.11.1.0/docs/src/Data.OldList.html#sort > > > > (as far as I can tell it didn't change since 2012) in the proof > > assistant Isabelle/HOL [2]. > > > > More specifically, I proved its functional correctness (the > result is > > sorted and contains all elements of the input with exactly the > same > > multiplicities) and that it is a stable sorting algorithm. > > > > Very recently I also formalized a complexity result in > Isabelle/HOL, > > namely that the number of comparisons is bounded from above by > > > > n + n * ⌈log 2 n⌉ > > > > for lists of length n. > > > > For this proof I had to change the definition of "sequences", > > "ascending", and "descending" slightly. > > > > Now here is my question: Does anyone now of reasons why the > current > > implementation of "sequences" is allowed to produce spurious > empty lists > > in its result? The version I used in my formalization only > differs in > > the following three spots: > > > > sequences [x] = [[x]] -- this is the only important change, > since > > sequences [] = [] -- then the result does not contain > empty lists > > > > instead of > > > > sequences xs = [xs] > > > > and > > > > ascending a as [] = let !x = as [a] in [x] > > > > instead of > > > > ascending a as bs = let !x = as [a] in x : sequences bs > > > > and > > > > descending a as [] = [a:as] > > > > instead of > > > > descending a as bs = (a:as) : sequences bs > > > > [1] https://www.isa-afp.org/entries/Efficient-Mergesort.html > > [2] http://isabelle.in.tum.de/ > > _______________________________________________ > > Haskell-Cafe mailing list > > To (un)subscribe, modify options or view archives go to: > > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe > > Only members subscribed via the mailman list are allowed to post. > > > Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
It may. Or it may not. The performance impact of such small changes can be hard to predict. On Tue, Sep 18, 2018, 10:38 AM Christian Sternagel <[hidden email]> wrote: Dear David, _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Free forum by Nabble | Edit this page |