Can Haskell outperform C++?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
56 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Ertugrul Söylemez
Ryan Newton <[hidden email]> wrote:

> I think this is a bit of an unfair accusation ("simple-minded").
>  Performance is only relevant to certain domains, sure.  But program
> performance is an entire *industry*.  And I'd argue it's of massive
> importance to the world at large.  Intel has an army of "AE"s
> (application engineers) that go out to other companies to make
> applications perform better.  There are plenty of compute bound
> industries -- i.e. movie companies are limited by what kind of frame
> they can render in ~24 hrs; sequencing centers are limited by certain
> very slow bioinformatics algorithms; you can look almost anywhere you
> like for examples.
I'm sorry for the rough words.  I didn't mean to overgeneralize.  Of
course in certain domains a reasonable performance is desirable, but the
first question you should ask is whether you want high level or low
level performance.  Haskell performs amazingly at the high level and
suboptimal at the low level.

My point is:  If you need C-like performance at a certain spot there is
really no excuse for writing the entire application in C.  Haskell has a
working, powerful enough FFI.  Also idiomatic Haskell code nowadays
performs close to C.  If your code doesn't, chances are that it's not
even idiomatic.  Sorting a list is easy and beautiful in code.  But it's
far from idiomatic.  Using ST with destructive update is also not
idiomatic.  One of my performance masterpieces is the "instinct" AI
library (see Hackage).  It uses only immutable vectors and performs very
heavy Double calculations, yet performs better than the same code with
mutable arrays did.

With a few years of Haskell experience in my backpack I know how to
utilize laziness to get amazing performance for code that most people
would feel must be written with destructively updating loop.  And the
resulting code is just beautiful to read and watch at work, a great
demonstration of the wonders the GHC developers have created.


> As a community I think we have to face the fact that writing the hot
> inner loop of your application as idiomatic Haskell is not [yet] going
> to give you C/Fortran performance off the bat.  Though in some cases
> there's not really anything stopping us but more backend/codegen work
> (I'm thinking of arithmetically intensive loops with scalars only).
> For example, the following Mandel kernel is in many ways the *same* as
> the C version:
>
> https://github.com/simonmar/monad-par/blob/662fa05b2839c8a0a6473dc490ead8dd519ddd1b/examples/src/mandel.hs#L24H
>
> We have the types; we've got strictness (for this loop); but the C
> version was 6X faster when I tested it.
Well, if it's "in many ways the same as C", then again it's probably not
idiomatic Haskell.  I don't know what a Mandel kernel is, so I can't
really tell you how I would write the code without further study.
However, I'm doing a lot of number crunching and my code usually gets
really close to C, and especially for Integer-heavy calculations
actually outperforms C.  I have done the comparison.  Using GMP with all
the features it provides in the mpz_* family of functions is in fact
slower than the same in Haskell (GHC 7.4.1 on Intel i5 64 bits here),
and from my C times I have experience using GMP properly including smart
allocation, etc.

If you want high performance Haskell, writing C in Haskell is /not/ the
solution.  It /will/ result in suboptimal code, likely considerably
slower than the same code in C.


Greets,
Ertugrul

--
Key-ID: E5DD8D11 "Ertugrul Soeylemez <[hidden email]>"
FPrint: BD28 3E3F BE63 BADD 4157  9134 D56A 37FA E5DD 8D11
Keysrv: hkp://subkeys.pgp.net/

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (853 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

glebovitz
This is an outstanding discussion. I am learning a lot from it.

I would find it useful to pull all this information together into a single document that discusses all the performance issues in one place and shares the real life experience is dealing with each issue. I see this as a best practice paper rather than a research document.

Does such a document exist? If not, I am willing try and start one.

Ryan writes:

With a few years of Haskell experience in my backpack I know how to
utilize laziness to get amazing performance for code that most people
would feel must be written with destructively updating loop.


I have heard similar comments  from other experienced Haskell users about how get better performance from their code by understanding the details of applying laziness. I would like to try and capture this.

Are there blogs, wiki postings, and list postings that provide information about particular issues?

Gregg

On 5/10/2012 11:53 PM, Ertugrul Söylemez wrote:
Ryan Newton [hidden email] wrote:

I think this is a bit of an unfair accusation ("simple-minded").
 Performance is only relevant to certain domains, sure.  But program
performance is an entire *industry*.  And I'd argue it's of massive
importance to the world at large.  Intel has an army of "AE"s
(application engineers) that go out to other companies to make
applications perform better.  There are plenty of compute bound
industries -- i.e. movie companies are limited by what kind of frame
they can render in ~24 hrs; sequencing centers are limited by certain
very slow bioinformatics algorithms; you can look almost anywhere you
like for examples.
I'm sorry for the rough words.  I didn't mean to overgeneralize.  Of
course in certain domains a reasonable performance is desirable, but the
first question you should ask is whether you want high level or low
level performance.  Haskell performs amazingly at the high level and
suboptimal at the low level.

My point is:  If you need C-like performance at a certain spot there is
really no excuse for writing the entire application in C.  Haskell has a
working, powerful enough FFI.  Also idiomatic Haskell code nowadays
performs close to C.  If your code doesn't, chances are that it's not
even idiomatic.  Sorting a list is easy and beautiful in code.  But it's
far from idiomatic.  Using ST with destructive update is also not
idiomatic.  One of my performance masterpieces is the "instinct" AI
library (see Hackage).  It uses only immutable vectors and performs very
heavy Double calculations, yet performs better than the same code with
mutable arrays did.

With a few years of Haskell experience in my backpack I know how to
utilize laziness to get amazing performance for code that most people
would feel must be written with destructively updating loop.  And the
resulting code is just beautiful to read and watch at work, a great
demonstration of the wonders the GHC developers have created.


As a community I think we have to face the fact that writing the hot
inner loop of your application as idiomatic Haskell is not [yet] going
to give you C/Fortran performance off the bat.  Though in some cases
there's not really anything stopping us but more backend/codegen work
(I'm thinking of arithmetically intensive loops with scalars only).
For example, the following Mandel kernel is in many ways the *same* as
the C version:

https://github.com/simonmar/monad-par/blob/662fa05b2839c8a0a6473dc490ead8dd519ddd1b/examples/src/mandel.hs#L24H

We have the types; we've got strictness (for this loop); but the C
version was 6X faster when I tested it.
Well, if it's "in many ways the same as C", then again it's probably not
idiomatic Haskell.  I don't know what a Mandel kernel is, so I can't
really tell you how I would write the code without further study.
However, I'm doing a lot of number crunching and my code usually gets
really close to C, and especially for Integer-heavy calculations
actually outperforms C.  I have done the comparison.  Using GMP with all
the features it provides in the mpz_* family of functions is in fact
slower than the same in Haskell (GHC 7.4.1 on Intel i5 64 bits here),
and from my C times I have experience using GMP properly including smart
allocation, etc.

If you want high performance Haskell, writing C in Haskell is /not/ the
solution.  It /will/ result in suboptimal code, likely considerably
slower than the same code in C.


Greets,
Ertugrul



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Chris Wong
On Sat, May 12, 2012 at 12:41 AM, Gregg Lebovitz <[hidden email]> wrote:
> I would find it useful to pull all this information together into a single
> document that discusses all the performance issues in one place and shares
> the real life experience is dealing with each issue. I see this as a best
> practice paper rather than a research document.
>
> Does such a document exist? If not, I am willing try and start one.

http://www.haskell.org/haskellwiki/Performance

;)

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

glebovitz
Chris,

Thanks you for indulging me. I will eventually get use to the idea that
there is a wiki page for almost every topic :-)

Gregg

On 5/12/2012 1:02 AM, Chris Wong wrote:

> On Sat, May 12, 2012 at 12:41 AM, Gregg Lebovitz<[hidden email]>  wrote:
>> I would find it useful to pull all this information together into a single
>> document that discusses all the performance issues in one place and shares
>> the real life experience is dealing with each issue. I see this as a best
>> practice paper rather than a research document.
>>
>> Does such a document exist? If not, I am willing try and start one.
> http://www.haskell.org/haskellwiki/Performance
>
> ;)

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Ertugrul Söylemez
In reply to this post by glebovitz
Gregg Lebovitz <[hidden email]> wrote:

> Ryan writes:
>
> With a few years of Haskell experience in my backpack I know how to
> utilize laziness to get amazing performance for code that most people
> would feel must be written with destructively updating loop.

That was me actually.  Just a minor side note that some people may
consider a nitpick, but I'd like to keep being the author of my own
statements.  Thanks.


Greets,
Ertugrul

--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (853 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Dmitry V'yal
In reply to this post by Ertugrul Söylemez
On 05/11/2012 07:53 AM, Ertugrul Söylemez wrote:

> My point is: If you need C-like performance at a certain spot there is
> really no excuse for writing the entire application in C. Haskell has
> a working, powerful enough FFI. Also idiomatic Haskell code nowadays
> performs close to C. If your code doesn't, chances are that it's not
> even idiomatic. Sorting a list is easy and beautiful in code. But it's
> far from idiomatic. Using ST with destructive update is also not
> idiomatic. One of my performance masterpieces is the "instinct" AI
> library (see Hackage). It uses only immutable vectors and performs
> very heavy Double calculations, yet performs better than the same code
> with mutable arrays did. With a few years of Haskell experience in my
> backpack I know how to utilize laziness to get amazing performance for
> code that most people would feel must be written with destructively
> updating loop. And the resulting code is just beautiful to read and
> watch at work, a great demonstration of the wonders the GHC developers
> have created.
Hello Ertugrul,

Out of the curios, did you compare the performance of Instinct with
implementations in languages, associated with numerical computations,
like Octave?

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Ertugrul Söylemez
Dmitry Vyal <[hidden email]> wrote:

> > My point is: If you need C-like performance at a certain spot there
> > is really no excuse for writing the entire application in C.
> > Haskell has a working, powerful enough FFI. Also idiomatic Haskell
> > code nowadays performs close to C. If your code doesn't, chances are
> > that it's not even idiomatic. Sorting a list is easy and beautiful
> > in code. But it's far from idiomatic. Using ST with destructive
> > update is also not idiomatic. One of my performance masterpieces is
> > the "instinct" AI library (see Hackage). It uses only immutable
> > vectors and performs very heavy Double calculations, yet performs
> > better than the same code with mutable arrays did.  With a few years
> > of Haskell experience in my backpack I know how to utilize laziness
> > to get amazing performance for code that most people would feel must
> > be written with destructively updating loop.  And the resulting code
> > is just beautiful to read and watch at work, a great demonstration
> > of the wonders the GHC developers have created.
>
> Out of the curios, did you compare the performance of Instinct with
> implementations in languages, associated with numerical computations,
> like Octave?
No, sorry.


Greets,
Ertugrul

--
Key-ID: E5DD8D11 "Ertugrul Soeylemez <[hidden email]>"
FPrint: BD28 3E3F BE63 BADD 4157  9134 D56A 37FA E5DD 8D11
Keysrv: hkp://subkeys.pgp.net/

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (853 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Ryan Newton
In reply to this post by Ertugrul Söylemez
Well, if it's "in many ways the same as C", then again it's probably not
idiomatic Haskell.  

It's just a recursive equation for mandelbrot fractals.  I should have been precise, I didn't mean that the source is literally the same as the C source (i.e. there's no for loop, no mutable variables), rather that it presents the compiler backend with the same advantages as the C backend would have with the equivalent loop.  That is, there's no control flow obfuscation (dictionaries, calls to unknown targets), no problems with laziness, and the data representations are completely known.  


mandel :: Int -> Complex Double -> Int
mandel max_depth c = loop 0 0
  where
   loop i !z
    | i == max_depth = i
    | magnitude z >= 2.0 = i
    | otherwise = loop (i+1) (z*z + c)

It's also a loop that is readily recognized as a loop.  Now, to my knowledge, GHC doesn't explicitly recognize loops in any stage of the compiler (so as to perform loop optimizations), something which other functional compilers sometimes do.  

But, anyway, it turns out that my example above is easily transformed from a bad GHC performance story into a good one.  If you'll bear with me, I'll show how below.
   First, Manuel makes a good point about the LLVM backend.  My "6X" anecdote was from a while ago and I didn't use llvm [1].  I redid it just now with 7.4.1+LLVM, results below.  (The below table should read correctly in fixed width font, but you can also see the data in the spreadsheet here.)

                   Time (ms)   Compiled File size   Comple+Runtime (ms)
GHC 7.4.1 O0    2444        1241K
GHC 7.4.1 O2    925        1132K             1561
GHC 7.4.1 O2 llvm  931         1133K
GHC 7.0.4 O2 via-C 684         974K

So LLVM didn't help [1].  And in fact the deprecated via-C backend did the best!  Compare with GCC: 

G++ O0             300         9K                   531
G++ O3             110         7K                   347
G++ O3 recursive   116         9K

Uh oh, the "6X" gap I mentioned is now closer to 9X.  And, in fact, performing a mini "language shootout" on the above code, reveals that GHC is doing worse than not only OCaml, but Chez Scheme, in spite of dynamic type checks, and a necessarily boxed representation of complex numbers:

Chez Scheme 8.4    284         2.7K notStandalone   372
OCaml              166         180K                 301

At least Python does worse!

Python 2.6         1973        NA                   1973

So here's the catch:  If you check the Core and STG GHC 7.4 is actually compiling the above loop very well.  This microbenchmark turns into just a "magnitude" microbenchmark.  The implementation of Data.Complex uses an EXTREMELY expensive method to avoid overflow [2].

Since OCaml did well above, I took a look at their standard library's implementation, on line 51 here.  They use a nice little math trick (the extra division) that is also mentioned on Wikipedia.  By implementing the same trick in Haskell, replacing the standard "magnitude" function, we get the following results.

GHC 7.4.1 No
Overflow Avoidance   
39         1127K                674
GHC 741, OCaml-style
Overflow avoidance   
74                  1127K

Wow, now not only is the Haskell version faster than OCaml/Scheme, but it is 48% faster than C++, which is appropriate to the title of this email!  Of course, this probably means that C++'s abs is also doing something overly expensive for overflow avoidance (or that their representation of complex numbers is not fully unpacked and stored in registers)  I haven't tracked it down yet.

But in any case, I'll be submitting a library patch.  The moral, I think, is that community members could do a great deal to help "Haskell" performance by simply microbenchmarking and optimizing library routines in base!

Cheers,
  -Ryan

P.S. You can check out the above benchmarks from here: https://github.com/rrnewton/MandelMicrobench

[1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David Peixotto's findings that LLVM optimizations are not very effective on Haskell-generated LLVM compared with typical clang-generated LLVM.

[2]  P.P.P.S. It turns out there was already a ticket (http://hackage.haskell.org/trac/ghc/ticket/2450) regarding magnitude's performance.  But it still has bad performance even after a small refactoring was performed.


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Manuel M T Chakravarty
Ryan Newton:
But, anyway, it turns out that my example above is easily transformed from a bad GHC performance story into a good one.  If you'll bear with me, I'll show how below.
   First, Manuel makes a good point about the LLVM backend.  My "6X" anecdote was from a while ago and I didn't use llvm [1].  I redid it just now with 7.4.1+LLVM, results below.  (The below table should read correctly in fixed width font, but you can also see the data in the spreadsheet here.)

                   Time (ms)   Compiled File size   Comple+Runtime (ms)
GHC 7.4.1 O0    2444        1241K
GHC 7.4.1 O2    925        1132K             1561
GHC 7.4.1 O2 llvm  931         1133K
GHC 7.0.4 O2 via-C 684         974K

So LLVM didn't help [1].  And in fact the deprecated via-C backend did the best!  

I would classify that as a bug.

[1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David Peixotto's findings that LLVM optimizations are not very effective on Haskell-generated LLVM compared with typical clang-generated LLVM.

There is some work underway to improve the situation, but I am sure more could be done.

Manuel


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Yves Parès-3
In reply to this post by Chris Wong
Yet this resource seems a little outdated:
http://www.haskell.org/haskellwiki/Performance/Strings does not speak about Text
http://www.haskell.org/haskellwiki/Performance/Arrays about Vector
And what's the status of the Streams IO approach detailed in http://www.haskell.org/haskellwiki/Performance/IO ? Has it evolved into enumerators/conduits, or was that an unrelated approach?

2012/5/12 Chris Wong <[hidden email]>
On Sat, May 12, 2012 at 12:41 AM, Gregg Lebovitz <[hidden email]> wrote:
> I would find it useful to pull all this information together into a single
> document that discusses all the performance issues in one place and shares
> the real life experience is dealing with each issue. I see this as a best
> practice paper rather than a research document.
>
> Does such a document exist? If not, I am willing try and start one.

http://www.haskell.org/haskellwiki/Performance

;)

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

wren ng thornton
In reply to this post by Ryan Newton
On 5/10/12 8:44 PM, Ryan Newton wrote:
>> through the trouble of writing my algorithms in C/C++, but simple-minded
>> people often have a desire to get the best performance possible, in
>> which case you really want to use C, C++, Fortran or whatever high level
>> assembler language you like.
>
> I think this is a bit of an unfair accusation ("simple-minded").

Well, yes and no.

On the one hand, characterizing those who desire the best performance
possible as "simple-minded" is, at best, a gross over-generalization.
Like you, I work in a field where optimization is king (e.g., in machine
translation, program runtimes are measured in days).

But on the other hand, there are quite a lot of people who focus
excessively on "optimization" when the actual differences for the code
they're writing are either negligible (e.g., because of I/O bottlenecks)
or uninteresting (e.g., a 2x slowdown is a nuisance rather than a
crisis). For those who focus on optimization when the benefits are
negligible or uninteresting, I think it's fair to characterize that
focus as "simple-minded". This focus seems especially common when people
start talking about "which language is faster"--- as opposed to, say,
which library is faster, or which implementation of a given algorithm is
faster. In some cases the question of language speed is legitimate, but
IME it's far more often used to prop up prejudices about which language
is "better"--- i.e., which group of human beings (defined by their
choice of programming language) is "better".


I agree that, as a community, we need to come to a realistic
understanding of the performance characteristics of Haskell compared to
C etc. I don't like prejudice and propaganda for Haskell any more than I
like prejudice and propaganda for C etc. In some cases it's true that
Haskell out-performs C (or C++, or Java); but in many cases it does not,
and we need to acknowledge that. The real problem, I feel, is that it's
not always clear which category any given program falls into. In some
cases the "idiomatic" Haskell is the fast one, in some cases its the
slow one; but in all cases, what is meant by "idiomatic Haskell" is a
moving target with poorly specified meaning.

The advent of ByteStrings was a major coup in figuring out exactly where
and why Haskell is slow and how to fix it. Iteratees are another one,
though the details are still being worked out. However, things like
strictness annotations on (non-accumulator) function arguments, the
choice whether to use ST or not, the choice whether to CPS-transform or
not, etc, are still very much a black art. Even with a thorough
understanding of the STG-machine and the GHC compilation pipeline, it's
not always clear whether they will help or hurt. ST in particular is
especially finicky, to the point where I wonder if it's ever a win
(outside of in-place updates on arrays).

So while I think most language performance comparisons are dubious to
begin with, I don't think the comparison of Haskell to C++ (or C, or
Java,...) can even be construed as something with a meaningful answer.
Haskell is just too different, and its in-the-large cost model is too
poorly understood. I'm not even aware of any helpful generalizations
over comparing two specific programs. Even when restricting to things
like parsing or compute-intensive numeric code, the comparison comes
back with mixed answers.

--
Live well,
~wren

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Yves Parès-3
> On the one hand, characterizing those who desire the best performance possible as "simple-minded" is, at best, a gross over-generalization. Like you, I work in a field where optimization is king (e.g., in machine translation, program runtimes are measured in days).

You misread the logical implication, Ertugrul said:

> simple-minded people often have a desire to get the best performance possible

Which means:
A is simple-minded ==> A will (most probably) want to get the best perf.

You interpreted it as:
A wants to get the best perf. ==> A is simple-minded

I agree with you, the second implication is wrong, but that's not what was meant.

(Moral: Should people use more logical operators and less words we would undertand better).

2012/5/16 wren ng thornton <[hidden email]>
On 5/10/12 8:44 PM, Ryan Newton wrote:
through the trouble of writing my algorithms in C/C++, but simple-minded
people often have a desire to get the best performance possible, in
which case you really want to use C, C++, Fortran or whatever high level
assembler language you like.

I think this is a bit of an unfair accusation ("simple-minded").

Well, yes and no.

On the one hand, characterizing those who desire the best performance possible as "simple-minded" is, at best, a gross over-generalization. Like you, I work in a field where optimization is king (e.g., in machine translation, program runtimes are measured in days).

But on the other hand, there are quite a lot of people who focus excessively on "optimization" when the actual differences for the code they're writing are either negligible (e.g., because of I/O bottlenecks) or uninteresting (e.g., a 2x slowdown is a nuisance rather than a crisis). For those who focus on optimization when the benefits are negligible or uninteresting, I think it's fair to characterize that focus as "simple-minded". This focus seems especially common when people start talking about "which language is faster"--- as opposed to, say, which library is faster, or which implementation of a given algorithm is faster. In some cases the question of language speed is legitimate, but IME it's far more often used to prop up prejudices about which language is "better"--- i.e., which group of human beings (defined by their choice of programming language) is "better".


I agree that, as a community, we need to come to a realistic understanding of the performance characteristics of Haskell compared to C etc. I don't like prejudice and propaganda for Haskell any more than I like prejudice and propaganda for C etc. In some cases it's true that Haskell out-performs C (or C++, or Java); but in many cases it does not, and we need to acknowledge that. The real problem, I feel, is that it's not always clear which category any given program falls into. In some cases the "idiomatic" Haskell is the fast one, in some cases its the slow one; but in all cases, what is meant by "idiomatic Haskell" is a moving target with poorly specified meaning.

The advent of ByteStrings was a major coup in figuring out exactly where and why Haskell is slow and how to fix it. Iteratees are another one, though the details are still being worked out. However, things like strictness annotations on (non-accumulator) function arguments, the choice whether to use ST or not, the choice whether to CPS-transform or not, etc, are still very much a black art. Even with a thorough understanding of the STG-machine and the GHC compilation pipeline, it's not always clear whether they will help or hurt. ST in particular is especially finicky, to the point where I wonder if it's ever a win (outside of in-place updates on arrays).

So while I think most language performance comparisons are dubious to begin with, I don't think the comparison of Haskell to C++ (or C, or Java,...) can even be construed as something with a meaningful answer. Haskell is just too different, and its in-the-large cost model is too poorly understood. I'm not even aware of any helpful generalizations over comparing two specific programs. Even when restricting to things like parsing or compute-intensive numeric code, the comparison comes back with mixed answers.

--
Live well,
~wren


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

glebovitz
In reply to this post by wren ng thornton
Wren,

I see at least three different issues being discussed here. I think it
is important to delineate them:

1) Does Haskell and its libraries need performance improvements?  
Probably yes. Some of the performance issues seem to be related to the
way the language is implemented and others by how it is defined.
Developers really do run into performance issues with Haskell and either
learn to work around the issue or try to fix the offending
implementation. The wiki performance page gives insight into some of the
performance issues and how address them.

2) Are language performance comparisons useful? They can be, if the
comparison is focusing on what each language does best. They aren't
useful if they focus on benchmarks that aren't relevant to the target
domain of each language. I think the problem with current comparisons,
is that they are designed to favor imperative languages.

3) Are the negative perceptions generated by popular performance
comparisons important? Only if you care about the broader adoption of
the language. If you don't care, then the point is moot. If you do care,
then yes the comparisons are unfair and simple minded, but they do have
an affect on the percepts of the language. It is not uncommon for
management to raise roadblocks to the introduction of new technologies
due to flawed reports that were published in popular magazines.

If Haskell were a product and I  was responsible for its success, then I
would be using common marketing techniques to combat the negative
perceptions. One is a technique called "changing the playing field"
which means producing documents that reset the criteria for the
evaluations and comparisons. This is not "deception", but merely making
sure that potential users understand where and how to make the proper
comparisons, or more importantly that my "champion" was well armed with
documentation to counter argue popular but flawed comparisons.

On 5/16/2012 6:54 AM, wren ng thornton wrote:
> Haskell is just too different, and its in-the-large cost model is too
> poorly understood. I'm not even aware of any helpful generalizations
> over comparing two specific programs. Even when restricting to things
> like parsing or compute-intensive numeric code, the comparison comes
> back with mixed answers.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Kevin Charter
On Wed, May 16, 2012 at 8:40 AM, Gregg Lebovitz <[hidden email]> wrote:
1) Does Haskell and its libraries need performance improvements?  Probably yes. Some of the performance issues seem to be related to the way the language is implemented and others by how it is defined. Developers really do run into performance issues with Haskell and either learn to work around the issue or try to fix the offending implementation. The wiki performance page gives insight into some of the performance issues and how address them.

I think there is a closely-related issue: that you'll need to learn how to debug subtle performance problems early in your Haskell programming career. In particular

  • you will have space leaks and related performance problems in naively-written programs
  • you will consequently need to learn how to use strictness annotations and other related language features, and how to use the profiler to determine where they're needed
For example, imagine you're new to the language, and as an exercise decide to write a program that counts the characters on standard input and writes the count to standard output. A naive program in, say, Python will probably use constant space and be fairly fast. A naive program in Haskell stands a good chance of having a space leak, building a long chain of thunks that isn't forced until it needs to write the final answer.  On small inputs, you won't notice. The nasty surprise comes when your co-worker says "cool, let's run it on this 100 MB log file!" and your program dies a horrible death. If your friend is a sceptic, she'll arch an eyebrow and secretly think your program -- and Haskell -- are a bit lame.

The example is contrived, but it's a very common task to consume some big stream of input and produce a much smaller summary of it. I think it's fair to say that you have to be a slightly sophisticated Haskell programmer to do those kinds of things correctly, at least compared to more mainstream languages.

My experience is that this is perhaps the most important  'problem' with Haskell performance. Not so much that it's typically two or three or ten times slower than language X, but that it's easy to have a bad experience early on, one that seems inconsistent with the language shoot-out and other performance comparisons.

Kevin
-- 
Kevin Charter
[hidden email]

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

glebovitz
Kevin,

Interesting point.

Over the past few weeks as part of my work, I have interviewed a large numbers of Haskell developers from many different industries  and have been hearing the same points you are making. Space leaks that were address by learning how to use laziness and performance issues cleared up by using strictness.

Also interesting is that in all my interviews, GHC performance was never raised. No one said "I have to drop into C to solve that performance problem."

Gregg

On 5/16/2012 1:44 PM, Kevin Charter wrote:
On Wed, May 16, 2012 at 8:40 AM, Gregg Lebovitz <[hidden email]> wrote:
1) Does Haskell and its libraries need performance improvements?  Probably yes. Some of the performance issues seem to be related to the way the language is implemented and others by how it is defined. Developers really do run into performance issues with Haskell and either learn to work around the issue or try to fix the offending implementation. The wiki performance page gives insight into some of the performance issues and how address them.

I think there is a closely-related issue: that you'll need to learn how to debug subtle performance problems early in your Haskell programming career. In particular

  • you will have space leaks and related performance problems in naively-written programs
  • you will consequently need to learn how to use strictness annotations and other related language features, and how to use the profiler to determine where they're needed
For example, imagine you're new to the language, and as an exercise decide to write a program that counts the characters on standard input and writes the count to standard output. A naive program in, say, Python will probably use constant space and be fairly fast. A naive program in Haskell stands a good chance of having a space leak, building a long chain of thunks that isn't forced until it needs to write the final answer.  On small inputs, you won't notice. The nasty surprise comes when your co-worker says "cool, let's run it on this 100 MB log file!" and your program dies a horrible death. If your friend is a sceptic, she'll arch an eyebrow and secretly think your program -- and Haskell -- are a bit lame.

The example is contrived, but it's a very common task to consume some big stream of input and produce a much smaller summary of it. I think it's fair to say that you have to be a slightly sophisticated Haskell programmer to do those kinds of things correctly, at least compared to more mainstream languages.

My experience is that this is perhaps the most important  'problem' with Haskell performance. Not so much that it's typically two or three or ten times slower than language X, but that it's easy to have a bad experience early on, one that seems inconsistent with the language shoot-out and other performance comparisons.

Kevin
-- 
Kevin Charter
[hidden email]

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Ben Gamari
In reply to this post by Kevin Charter
Kevin Charter <[hidden email]> writes:

snip

> For example, imagine you're new to the language, and as an exercise decide
> to write a program that counts the characters on standard input and writes
> the count to standard output. A naive program in, say, Python will probably
> use constant space and be fairly fast. A naive program in Haskell stands a
> good chance of having a space leak, building a long chain of thunks that
> isn't forced until it needs to write the final answer.  On small inputs,
> you won't notice. The nasty surprise comes when your co-worker says "cool,
> let's run it on this 100 MB log file!" and your program dies a horrible
> death. If your friend is a sceptic, she'll arch an eyebrow and secretly
> think your program -- and Haskell -- are a bit lame.
>
I, for one, can say that my initial impressions of Haskell were quite
scarred by exactly this issue. Being in experimental physics, I often
find myself writing data analysis code doing relatively simple
manipulations on large(ish) data sets. My first attempt at tackling a
problem in Haskell took nearly three days to get running with reasonable
performance. I can only thank the wonderful folks in #haskell profusely
for all of their help getting through that period. That being said, it
was quite frustrating at the time and I often wondered how I could
tackle a reasonably large project if I couldn't solve even a simple
problem with halfway decent performance. If it weren't for #haskell, I
probably would have given up on Haskell right then and there, much to my
deficit.

While things have gotten easier, even now, nearly a year after writing
my first line of Haskell, I still occassionally find a performance
issue^H^H^H^H surprise. I'm honestly not really sure what technical
measures could be taken to ease this learning curve. The amazing
community helps quite a bit but I feel that this may not be a
sustainable approach if Haskell gains greater acceptance. The profiler
is certainly useful (and much better with GHC 7.4), but there are still
cases where the performance hit incurred by the profiling
instrumentation precludes this route of investigation without time
consuming guessing at how to pare down my test case. It's certainly not
an easy problem.

Cheers,

- Ben


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Kevin Charter
In reply to this post by glebovitz
On Wed, May 16, 2012 at 1:17 PM, Gregg Lebovitz <[hidden email]> wrote:
Also interesting is that in all my interviews, GHC performance was never raised. No one said "I have to drop into C to solve that performance problem."

That's been my experience too. I've so far been able to get acceptable performance with GHC when I've made good use of the tools for finding space and time leaks. And it hasn't been hard, just something I had to learn to do.
 

Gregg

Kevin

--
Kevin Charter
[hidden email]

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

glebovitz
In reply to this post by Ben Gamari
Ben,

This is precisely the kind of problem I am currently thinking about and
why I asked for pointers to documents on best practices. The current
model for gaining the necessary experience to use Haskell effectively
does not scale well.

Here are some ideas on how to address the knowledge issue:

1) Outstanding best practices documents that capture this knowledge and
provides useful answers. Organizing this information in an online
document that can be searched by keyword or index would be really
helpful. The hard part will be maintaining it. As someone already
pointed out the wiki entry on performance is already dated.

2) Training presentations derived from #1 that provides lots of examples.

3) Online Training Videos based on #2 above

4) Profiling tools that uses the captured knowledge above, highlights
problem areas, and provides guidance on correcting them.

As I mentioned in a previous email. I am willing to take on organization
#1, but I will need help from people on this list. I can start with the
performance wiki, but we will still need examples of where people get
into trouble and the ways around them.

Gregg


On 5/16/2012 4:00 PM, Ben Gamari wrote:

> Kevin Charter<[hidden email]>  writes:
>
> snip
>> For example, imagine you're new to the language, and as an exercise decide
>> to write a program that counts the characters on standard input and writes
>> the count to standard output. A naive program in, say, Python will probably
>> use constant space and be fairly fast. A naive program in Haskell stands a
>> good chance of having a space leak, building a long chain of thunks that
>> isn't forced until it needs to write the final answer.  On small inputs,
>> you won't notice. The nasty surprise comes when your co-worker says "cool,
>> let's run it on this 100 MB log file!" and your program dies a horrible
>> death. If your friend is a sceptic, she'll arch an eyebrow and secretly
>> think your program -- and Haskell -- are a bit lame.
>>
> I, for one, can say that my initial impressions of Haskell were quite
> scarred by exactly this issue. Being in experimental physics, I often
> find myself writing data analysis code doing relatively simple
> manipulations on large(ish) data sets. My first attempt at tackling a
> problem in Haskell took nearly three days to get running with reasonable
> performance. I can only thank the wonderful folks in #haskell profusely
> for all of their help getting through that period. That being said, it
> was quite frustrating at the time and I often wondered how I could
> tackle a reasonably large project if I couldn't solve even a simple
> problem with halfway decent performance. If it weren't for #haskell, I
> probably would have given up on Haskell right then and there, much to my
> deficit.
>
> While things have gotten easier, even now, nearly a year after writing
> my first line of Haskell, I still occassionally find a performance
> issue^H^H^H^H surprise. I'm honestly not really sure what technical
> measures could be taken to ease this learning curve. The amazing
> community helps quite a bit but I feel that this may not be a
> sustainable approach if Haskell gains greater acceptance. The profiler
> is certainly useful (and much better with GHC 7.4), but there are still
> cases where the performance hit incurred by the profiling
> instrumentation precludes this route of investigation without time
> consuming guessing at how to pare down my test case. It's certainly not
> an easy problem.
>
> Cheers,
>
> - Ben
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Yves Parès-3
In reply to this post by Ben Gamari
> The profiler is certainly useful (and much better with GHC 7.4)

What are the improvements in that matter? (I just noticed that some GHC flags wrt profiling have been renamed)

2012/5/16 Ben Gamari <[hidden email]>
Kevin Charter <[hidden email]> writes:

snip
> For example, imagine you're new to the language, and as an exercise decide
> to write a program that counts the characters on standard input and writes
> the count to standard output. A naive program in, say, Python will probably
> use constant space and be fairly fast. A naive program in Haskell stands a
> good chance of having a space leak, building a long chain of thunks that
> isn't forced until it needs to write the final answer.  On small inputs,
> you won't notice. The nasty surprise comes when your co-worker says "cool,
> let's run it on this 100 MB log file!" and your program dies a horrible
> death. If your friend is a sceptic, she'll arch an eyebrow and secretly
> think your program -- and Haskell -- are a bit lame.
>
I, for one, can say that my initial impressions of Haskell were quite
scarred by exactly this issue. Being in experimental physics, I often
find myself writing data analysis code doing relatively simple
manipulations on large(ish) data sets. My first attempt at tackling a
problem in Haskell took nearly three days to get running with reasonable
performance. I can only thank the wonderful folks in #haskell profusely
for all of their help getting through that period. That being said, it
was quite frustrating at the time and I often wondered how I could
tackle a reasonably large project if I couldn't solve even a simple
problem with halfway decent performance. If it weren't for #haskell, I
probably would have given up on Haskell right then and there, much to my
deficit.

While things have gotten easier, even now, nearly a year after writing
my first line of Haskell, I still occassionally find a performance
issue^H^H^H^H surprise. I'm honestly not really sure what technical
measures could be taken to ease this learning curve. The amazing
community helps quite a bit but I feel that this may not be a
sustainable approach if Haskell gains greater acceptance. The profiler
is certainly useful (and much better with GHC 7.4), but there are still
cases where the performance hit incurred by the profiling
instrumentation precludes this route of investigation without time
consuming guessing at how to pare down my test case. It's certainly not
an easy problem.

Cheers,

- Ben


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Can Haskell outperform C++?

Ben Gamari
Yves Parès <[hidden email]> writes:

>> The profiler is certainly useful (and much better with GHC 7.4)
>
> What are the improvements in that matter? (I just noticed that some GHC
> flags wrt profiling have been renamed)
>
The executive summary can be found in the release notes[1]. There was
also a talk I remember watching a while ago which gave a pretty nice
overview. I can't recall, but I might have been this[2]. Lastly,
profiling now works with multiple capabilities[3].

Cheers,

- Ben


[1] http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/release-7-4-1.html
[2] http://www.youtube.com/watch?v=QBFtnkb2Erg
[3] https://plus.google.com/107890464054636586545/posts/hdJAVufhKrD


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
123