To [] Or Not To []

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
27 messages Options
12
Reply | Threaded
Open this post in threaded view
|

To [] Or Not To []

Johannes Waldmann-2
Dear Cafe,

I wrote a text (well, you might call it a rant)
on the tragic over-use of lists in Haskell,
based on my experience in programming and teaching.

http://www.imn.htwk-leipzig.de/~waldmann/etc/untutorial/list-or-not-list/

* If your program accesses a list by index (with the !! operator),
  then your program is wrong.
* If your program uses the length function, then your program is wrong.
* If your program sorts a list, then your program is wrong.
* If you wrote this sort function yourself, then it is doubly wrong.
* The ideal use of a list is such that will be removed by the compiler.
* The enlightened programmer writes list-free code with Foldable.

Have fun reading, and avoiding lists from here on!

- J.W.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Robin Palotai
> Have fun reading, and avoiding lists from here on!

Except maybe the haskell-cafe list.

2017-03-09 16:07 GMT+01:00 Johannes Waldmann <[hidden email]>:
Dear Cafe,

I wrote a text (well, you might call it a rant)
on the tragic over-use of lists in Haskell,
based on my experience in programming and teaching.

http://www.imn.htwk-leipzig.de/~waldmann/etc/untutorial/list-or-not-list/

* If your program accesses a list by index (with the !! operator),
  then your program is wrong.
* If your program uses the length function, then your program is wrong.
* If your program sorts a list, then your program is wrong.
* If you wrote this sort function yourself, then it is doubly wrong.
* The ideal use of a list is such that will be removed by the compiler.
* The enlightened programmer writes list-free code with Foldable.

Have fun reading, and avoiding lists from here on!

- J.W.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Geraldus
In reply to this post by Johannes Waldmann-2
That was quite reasonable and very interesting and insightful reading.

Thanks!

P.S.  Will you share something about Strings?

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Brandon Allbery
In reply to this post by Johannes Waldmann-2

On Thu, Mar 9, 2017 at 10:07 AM, Johannes Waldmann <[hidden email]> wrote:
* The enlightened programmer writes list-free code with Foldable.

I'd like to point out that FTP is still very recent and there are quite a few learning resources that know nothing of it.

--
brandon s allbery kf8nh                               sine nomine associates
[hidden email]                                  [hidden email]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Johannes Waldmann-2
In reply to this post by Geraldus
> P.S.  Will you share something about Strings?

there is a short section on string(like) data types at
http://www.imn.htwk-leipzig.de/~waldmann/etc/untutorial/data/
but I am really not an expert on the finer points.

- J.W.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Richard A. O'Keefe
In reply to this post by Johannes Waldmann-2

> On 10/03/2017, at 4:07 AM, Johannes Waldmann <[hidden email]> wrote:
>
> * If your program sorts a list, then your program is wrong.

This seems a very strange claim.

The whole thing is an abuse of the word "wrong".
A program can be all of ugly, inefficient, unidiomatic,
&c &c without being WRONG.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Doug McIlroy
In reply to this post by Johannes Waldmann-2
This stirred ancient memories in praise of wrong programs:

> > * If your program sorts a list, then your program is wrong.
>
> This seems a very strange claim.
>
> The whole thing is an abuse of the word "wrong".
> A program can be all of ugly, inefficient, unidiomatic,
> &c &c without being WRONG.

Fifty-plus years ago, when computing was 1000 times slower
and cost $600/hour, it was typical for professional programmers
to mediate between scientists and computers so that
those expensive machines would be used efficiently. At Bell
Labs, though, Dick Hamming insisted on open-shop computing
because "It is more important to have the right problem
done the wrong way, than to have the wrong problem done
the right way."

Doug
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Geraldus
"It is more important to have the right problem
done the wrong way, than to have the wrong problem done
the right way." 

Sounds like a nonsense.  Does  "problem done the wrong way" implies the problem indeed isn't solved at all, doesn't it?

пт, 10 мар. 2017 г. в 10:42, Doug McIlroy <[hidden email]>:
This stirred ancient memories in praise of wrong programs:

> > * If your program sorts a list, then your program is wrong.
>
> This seems a very strange claim.
>
> The whole thing is an abuse of the word "wrong".
> A program can be all of ugly, inefficient, unidiomatic,
> &c &c without being WRONG.

Fifty-plus years ago, when computing was 1000 times slower
and cost $600/hour, it was typical for professional programmers
to mediate between scientists and computers so that
those expensive machines would be used efficiently. At Bell
Labs, though, Dick Hamming insisted on open-shop computing
because "It is more important to have the right problem
done the wrong way, than to have the wrong problem done
the right way."

Doug
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Joachim Durchholz
Am 10.03.2017 um 08:19 schrieb Geraldus:
> "It is more important to have the right problem
> done the wrong way, than to have the wrong problem done
> the right way."
>
> Sounds like a nonsense.  Does  "problem done the wrong way" implies the
> problem indeed isn't solved at all, doesn't it?

No, in that context, "done the wrong way" meant that Spaghetti code and
similar abominations are okay if the program delivers.

For which even today a case can be made if the program is just a
one-shot explorative thing where nobody will ever spend a second look at
the results or at the program code.
Scientific computing often means fiddling with code to fiddle with data;
you do dozens of throwaway programs, and if something of interest comes
out, you can still write a reviewable one. Or not release the code at
all because anybody in the community will write his/her own data
validation code anyway.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Brandon Allbery
In reply to this post by Geraldus

On Fri, Mar 10, 2017 at 2:19 AM, Geraldus <[hidden email]> wrote:
Sounds like a nonsense.  Does  "problem done the wrong way" implies the problem indeed isn't solved at all, doesn't it?

Inefficiently. (Although yours is the extreme case of that.)

--
brandon s allbery kf8nh                               sine nomine associates
[hidden email]                                  [hidden email]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Olaf Klinke
In reply to this post by Johannes Waldmann-2
Johannes,

Thanks for writing this.

Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style. To think about what data must be held in memory (use Maps or whatever for this) and read the other data in as a list.

I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect?

Olaf
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Jerzy Karczmarczuk

Hi.

Le 11/03/2017 à 00:28, Olaf Klinke a écrit :
Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style.
Yes. I've been teaching not just "data processing" - after all almost everything we program is "data processing", no?... but such concrete stuff as physics simulation (diff. eqs.), some other numerics (asymptotic expansions, etc.) signal processing (including sound generation), and I liked to present several examples in a dataflow style with plenty of co-recursive contraptions. Haskell lazy lists were natural, concise, and easy to manipulate.  We enjoyed it, wrong or not.

Nothing is perfect, not only  mister Nobody; calling ANY approach to programming "wrong" is sectarian. A professional coder working on a concrete project may say bad words about anything he wishes, but for a teacher this is a pedagogical sin, and inefficient programs can be more inspiring than some "correct" doctrines.

Jerzy Karczmarczuk

Garanti sans virus. www.avast.com

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Deven Lahoti
I think you are missing the point of the article, as that is pretty much exactly the type of situation where it recommends lists be used. 

On Mar 10, 2017 7:45 PM, "JK" <[hidden email]> wrote:

Hi.

Le 11/03/2017 à 00:28, Olaf Klinke a écrit :
Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style.
Yes. I've been teaching not just "data processing" - after all almost everything we program is "data processing", no?... but such concrete stuff as physics simulation (diff. eqs.), some other numerics (asymptotic expansions, etc.) signal processing (including sound generation), and I liked to present several examples in a dataflow style with plenty of co-recursive contraptions. Haskell lazy lists were natural, concise, and easy to manipulate.  We enjoyed it, wrong or not.

Nothing is perfect, not only  mister Nobody; calling ANY approach to programming "wrong" is sectarian. A professional coder working on a concrete project may say bad words about anything he wishes, but for a teacher this is a pedagogical sin, and inefficient programs can be more inspiring than some "correct" doctrines.

Jerzy Karczmarczuk

Garanti sans virus. www.avast.com

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Olaf Klinke
In reply to this post by Johannes Waldmann-2
And I was merely pointing that out: That there are recommended use cases in the article, while the abstract given by the author on this list was quite negative. (Possibly it had to be, in order to gain the necessary attention.) I was not arguing against the syntactic preference Haskell gives to singly linked lists over other data structures. If Haskell was designed from scratch, what would take the place of []? Possibly Data.Sequence?

Mildly complex data processing programs are somewhat special: You read in and parse one or more input files, do some transformation on the parsed data and write some results back to disk. As the article explained, the Haskell compiler will be able to optimise away the intermediate list structure (at least in simple cases). If data dependency is local, you'll be fine with data input to String rather than Text or ByteString, with map rather than fmap or foldl', and you'll be fine to use a mapM_ on a list for output.

Olaf

> I think you are missing the point of the article, as that is pretty much
> exactly the type of situation where it recommends lists be used.
>
> On Mar 10, 2017 7:45 PM, "JK" <[hidden email]> wrote:
>
>> Hi.
>> Le 11/03/2017 à 00:28, Olaf Klinke a écrit :
>>
>> Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style.
>>
>> *Yes.* I've been teaching not just "data processing" - after all almost
>> everything we program is "data processing", no?... but such concrete stuff
>> as physics simulation (diff. eqs.), some other numerics (asymptotic
>> expansions, etc.) signal processing (including sound generation), and I
>> liked to present several examples in a *dataflow style* with plenty of
>> co-recursive contraptions. Haskell lazy lists were natural, concise, and
>> easy to manipulate.  We enjoyed it, wrong or not.
>>
>> Nothing is perfect, not only  mister Nobody; calling ANY approach to
>> programming "wrong" is sectarian. A professional coder working on a
>> concrete project may say bad words about anything he wishes, but for a
>> teacher this is a pedagogical sin, and inefficient programs can be more
>> inspiring than some "correct" doctrines.
>>
>> Jerzy Karczmarczuk
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Richard A. O'Keefe
In reply to this post by Geraldus

> On 10/03/2017, at 8:19 PM, Geraldus <[hidden email]> wrote:
>
> "It is more important to have the right problem
> done the wrong way, than to have the wrong problem done
> the right way."
>
> Sounds like a nonsense.  Does  "problem done the wrong way" implies the problem indeed isn't solved at all, doesn't it?

Absolutely not.

I remember back in about 1979 reading a procedure published
in an engineering journal and noticing at once that it took
O(n**5) time when simple manipulations familiar to any good
programmer would have made it O(n**2). To my way of thought
that was "done the wrong way", but the engineers who wrote
it were happy with what it did for them and the reviewers
are apparently happy that it gave the right answers.

Similarly, a program that needs 2MB of temporary data and
opens a connection to Oracle, stuffs the data into a
scratch table, and pulls it back later (without needing
ACID and without using anything more than SELECT * FROM TABLE)
is again to my mind doing it the wrong way, but if it gives
the right answers fast enough, the goal donor may not care.

The classic example from Java, bringing us back to the topic
of lists and strings, is

    String dumpMap(HashMap<String,Thing> map) {
        String s = "";
        for (Map.Entry<String,Thing> e : map.entrySet()) {
            s += "\t" + e.getKey() + "='" +
                e.getValue().toString() + "'\n";
        }
        return s;
    }

which takes quadratic time -- assuming Thing::toString() is
cheap -- when using a StringBuilder would make it linear.
This is taken from a real program I have been trying to
clean up to make Java 1.8 stop complaining about it.  Not
my code, I hasten to add.  The result is the result that
the author intended -- at least that's true of the original
rather messier code -- but it is done the wrong way.

I have known something not unlike this make the difference
between a program taking 0.5 seconds and the same program
taking 5 minutes, so "done the wrong way" but not "wrong
answers".  At 5 minutes, the program was still in fact useful.
This happened just last week and that one *was* my code.
(Not in Java, and not string concatenation.  Failing to purge
useless but logically harmless data from a collection.)






_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Johannes Waldmann-2
In reply to this post by Olaf Klinke
Hi Olaf -


> I'd be interested whether there is a way to check
> which of my lists in the source code the compiler managed to "deforest" away.
> Which intermediate files should I look at? What are the tools to inspect?

Good question, and I don't know an easy answer.

The general advice is running ghc with "-ddump-simpl"
but I find it quite challenging to scan the output.


Here is a simple case where it works:

$ cat Fuse.hs

main = print $ sum $ map (^2) [1 .. 1000 :: Int]


$ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs

no optimisation - shows that "main" calls "map" etc.


$ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs

fusion works - nice non-allocating inner loop
(lists gone, and Int replaced by Int#)

Rec {
-- RHS size: {terms: 18, types: 3, coercions: 0}
Main.$wgo [InlPrag=[0], Occ=LoopBreaker]
  :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType <S,1*U><S,U>]
Main.$wgo =
  \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) ->
    case w_s5kV of wild_Xn {
      __DEFAULT ->
        Main.$wgo
          (GHC.Prim.+# wild_Xn 1#)
          (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn));
      1000# -> GHC.Prim.+# ww_s5kZ 1000000#
    }
end Rec }

but the challenge is to find the path from "main" to that,
wading through several other functions that may or may not be related.


I can imagine a source annotation like "in the code compiled
from this function  f,  that constructor  C  should never be called"
but this is certainly not easy. Do we really mean "never", or do we mean
"only a bounded number of times" (that is, not in the inner loop).
Perhaps there is no code for  f  itself, because it gets inlined.

But yes, *some* automated analysis (and human-readable print-out)
of the code after simplification would be nice.

This could be done as a compiler plug-in?
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/extending_ghc.html#compiler-plugins


- J.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Tikhon Jelvis
Personally, I disagree with the whole premise. Lists are simple and elegant; you *should* use them most of them time. I'm not saying you should use lists as the data structure for an in-memory database or whatever, but that's the point—most applications only have a handful of data structures that "matter", and lists are great everywhere else.

I actually went through and replaced [] with Vector in all of the types we parse from JSON at work, some of which get relatively large. It made the code uglier and didn't meaningfully affect the performance. I undid that change, even though it's exactly the sort of thing this article recommends. In this day and age, simple things *scale*. That's enough most of the time; if you can get away with it you *should*.

The real advantage of lists comes at an intersection of two points: lists are effective in place of iterators in Haskell and, even misused as data structures, they're *not that bad* most of the time. This means that a good 80% of the time, the advantage of using a type that's compatible with the rest of my code and APIs that use lists "correctly" as iterators easily outweighs any small performance penalty. A list has to get pretty large—or my usage pattern pretty convoluted—before another type is worth the complexity.

On Mon, Mar 13, 2017 at 3:43 AM, Johannes Waldmann <[hidden email]> wrote:
Hi Olaf -


> I'd be interested whether there is a way to check
> which of my lists in the source code the compiler managed to "deforest" away.
> Which intermediate files should I look at? What are the tools to inspect?

Good question, and I don't know an easy answer.

The general advice is running ghc with "-ddump-simpl"
but I find it quite challenging to scan the output.


Here is a simple case where it works:

$ cat Fuse.hs

main = print $ sum $ map (^2) [1 .. 1000 :: Int]


$ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs

no optimisation - shows that "main" calls "map" etc.


$ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs

fusion works - nice non-allocating inner loop
(lists gone, and Int replaced by Int#)

Rec {
-- RHS size: {terms: 18, types: 3, coercions: 0}
Main.$wgo [InlPrag=[0], Occ=LoopBreaker]
  :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType <S,1*U><S,U>]
Main.$wgo =
  \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) ->
    case w_s5kV of wild_Xn {
      __DEFAULT ->
        Main.$wgo
          (GHC.Prim.+# wild_Xn 1#)
          (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn));
      1000# -> GHC.Prim.+# ww_s5kZ 1000000#
    }
end Rec }

but the challenge is to find the path from "main" to that,
wading through several other functions that may or may not be related.


I can imagine a source annotation like "in the code compiled
from this function  f,  that constructor  C  should never be called"
but this is certainly not easy. Do we really mean "never", or do we mean
"only a bounded number of times" (that is, not in the inner loop).
Perhaps there is no code for  f  itself, because it gets inlined.

But yes, *some* automated analysis (and human-readable print-out)
of the code after simplification would be nice.

This could be done as a compiler plug-in?
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/extending_ghc.html#compiler-plugins


- J.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Evan Laforge
I've also had this experience, though on a smaller scale.  The things
I thought were big were not that big, and the slow things were
different from what I predicted.  So far it hasn't turned out to be
lists (except via Strings, I'm happy to use Text).  Also I rely on
persistence and lists are about the simplest persistent data
structure.

On Mon, Mar 13, 2017 at 11:00 AM, Tikhon Jelvis <[hidden email]> wrote:

> Personally, I disagree with the whole premise. Lists are simple and elegant;
> you *should* use them most of them time. I'm not saying you should use lists
> as the data structure for an in-memory database or whatever, but that's the
> point—most applications only have a handful of data structures that
> "matter", and lists are great everywhere else.
>
> I actually went through and replaced [] with Vector in all of the types we
> parse from JSON at work, some of which get relatively large. It made the
> code uglier and didn't meaningfully affect the performance. I undid that
> change, even though it's exactly the sort of thing this article recommends.
> In this day and age, simple things *scale*. That's enough most of the time;
> if you can get away with it you *should*.
>
> The real advantage of lists comes at an intersection of two points: lists
> are effective in place of iterators in Haskell and, even misused as data
> structures, they're *not that bad* most of the time. This means that a good
> 80% of the time, the advantage of using a type that's compatible with the
> rest of my code and APIs that use lists "correctly" as iterators easily
> outweighs any small performance penalty. A list has to get pretty large—or
> my usage pattern pretty convoluted—before another type is worth the
> complexity.
>
> On Mon, Mar 13, 2017 at 3:43 AM, Johannes Waldmann
> <[hidden email]> wrote:
>>
>> Hi Olaf -
>>
>>
>> > I'd be interested whether there is a way to check
>> > which of my lists in the source code the compiler managed to "deforest"
>> > away.
>> > Which intermediate files should I look at? What are the tools to
>> > inspect?
>>
>> Good question, and I don't know an easy answer.
>>
>> The general advice is running ghc with "-ddump-simpl"
>> but I find it quite challenging to scan the output.
>>
>>
>> Here is a simple case where it works:
>>
>> $ cat Fuse.hs
>>
>> main = print $ sum $ map (^2) [1 .. 1000 :: Int]
>>
>>
>> $ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs
>>
>> no optimisation - shows that "main" calls "map" etc.
>>
>>
>> $ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs
>>
>> fusion works - nice non-allocating inner loop
>> (lists gone, and Int replaced by Int#)
>>
>> Rec {
>> -- RHS size: {terms: 18, types: 3, coercions: 0}
>> Main.$wgo [InlPrag=[0], Occ=LoopBreaker]
>>   :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
>> [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType <S,1*U><S,U>]
>> Main.$wgo =
>>   \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) ->
>>     case w_s5kV of wild_Xn {
>>       __DEFAULT ->
>>         Main.$wgo
>>           (GHC.Prim.+# wild_Xn 1#)
>>           (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn));
>>       1000# -> GHC.Prim.+# ww_s5kZ 1000000#
>>     }
>> end Rec }
>>
>> but the challenge is to find the path from "main" to that,
>> wading through several other functions that may or may not be related.
>>
>>
>> I can imagine a source annotation like "in the code compiled
>> from this function  f,  that constructor  C  should never be called"
>> but this is certainly not easy. Do we really mean "never", or do we mean
>> "only a bounded number of times" (that is, not in the inner loop).
>> Perhaps there is no code for  f  itself, because it gets inlined.
>>
>> But yes, *some* automated analysis (and human-readable print-out)
>> of the code after simplification would be nice.
>>
>> This could be done as a compiler plug-in?
>>
>> https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/extending_ghc.html#compiler-plugins
>>
>>
>> - J.
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> To (un)subscribe, modify options or view archives go to:
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>> Only members subscribed via the mailman list are allowed to post.
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Johannes Waldmann-2
On 14.03.2017 04:16, Evan Laforge wrote:

> .. The things
> I thought were big were not that big, and the slow things were
> different from what I predicted.

recent project (by Chris Done) to collect performance information
for data structures from standard libraries:

https://github.com/haskell-perf

https://reddit.com/r/haskell/comments/5ym276/haskell_performance_benchmarks/

- J.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: To [] Or Not To []

Joachim Breitner-2
In reply to this post by Olaf Klinke
Hi,

Am Samstag, den 11.03.2017, 00:28 +0100 schrieb Olaf Klinke:
> I'd be interested whether there is a way to check which of my lists
> in the source code the compiler managed to "deforest" away. Which
> intermediate files should I look at? What are the tools to inspect?

you might find my package
http://hackage.haskell.org/package/list-fusion-probe
useful.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
  [hidden email]https://www.joachim-breitner.de/
  XMPP: [hidden email] • OpenPGP-Key: 0xF0FBF51F
  Debian Developer: [hidden email]
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

signature.asc (849 bytes) Download Attachment
12