Why is Haskell so slow (comparing to Java/Scala)?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Why is Haskell so slow (comparing to Java/Scala)?

Станислав Черничкин
I've wrote simple Haskell benchmark program, which populated primitive vector from vector package:

import           Data.Vector.Primitive.Mutable as P

vectorBench :: Benchmark
vectorBench = bgroup "vector" [ primitive ]
  where
    primitive = bgroup "primitive" [ write1M ]
      where
        write1M =  bench "write1M" $ nfIO $ do
          v <- P.unsafeNew 1000000
          void $ for [0..1000000 - 1] $ flip (P.unsafeWrite v) (1 :: Int)
          return ()

I use `unsafeNew` to skip memory initialization and `unsafeWrite` to skip boundary checks, I guess it's fastest possible way to write something to vector. My result was about 40 ms. 

I wrote similar program in Scala:

for (_ <- 1 to 5) {
  val start = System.currentTimeMillis()
  val a = new Array[Long](1000000)
  for (i <- 0 until 1000000) {
    a(i) = 1L
  }
  val end = System.currentTimeMillis()
  println(s"${end - start} ms")
}

I skip neither boundary checks nor memory initialization, I also used generic array here (suitable for any types of objects, not just for primitive types), so I expected longer run time. But what I got was shocking:

7 ms
3 ms
2 ms
1 ms
2 ms

This program runs 20-40 times faster than Haskell after warm-up. 20-40 times, Carl! Why is Haskell soo slooooow? How can it be?
  

--
Sincerely, Stanislav Chernichkin.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Nick Smallbone-3

Hi Stanislav,

Станислав Черничкин <[hidden email]> writes:
> I've wrote simple Haskell benchmark program, which populated primitive vector
> from vector package:
>
> ...
> void $ for [0..1000000 - 1] $ flip (P.unsafeWrite v) (1 :: Int)

'for' is a variant of 'map' - it returns a list of results (in this case
a list of a million () values). Instead you should use 'forM_' (note the
underscore) from Control.Monad, which discards the result - that cuts
the runtime by a huge amount when I try it. (And make sure to compile
with -O too.)

Nick

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Saurabh Nanda
Intresting. Wont using "void" or "_ <- forM blah" have the same effect? Why not? 

On 20-Sep-2017 9:37 PM, "Nick Smallbone" <[hidden email]> wrote:

Hi Stanislav,

Станислав Черничкин <[hidden email]> writes:
> I've wrote simple Haskell benchmark program, which populated primitive vector
> from vector package:
>
> ...
> void $ for [0..1000000 - 1] $ flip (P.unsafeWrite v) (1 :: Int)

'for' is a variant of 'map' - it returns a list of results (in this case
a list of a million () values). Instead you should use 'forM_' (note the
underscore) from Control.Monad, which discards the result - that cuts
the runtime by a huge amount when I try it. (And make sure to compile
with -O too.)

Nick

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Nick Smallbone-3
Saurabh Nanda <[hidden email]> writes:
> Intresting. Wont using "void" or "_ <- forM blah" have the same effect? Why not?

It has the same behaviour as using forM_, but the performance is worse,
as the program still builds the result list before eventually discarding it.

Whether this is something that could be optimised away in theory, I'm
not sure - but it doesn't seem to be at the moment.

Nick

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Joachim Durchholz
In reply to this post by Станислав Черничкин
Am 20.09.2017 um 16:27 schrieb Станислав Черничкин:
> I've wrote simple Haskell benchmark program,
... which means that the result is not really meaningful.

Large programs have their inefficiencies in needless recomputations,
which happen as soon as no single programmer knows everything anymore
and people start reinventing the wheel.
Small programs do not need to be particularly efficient. Well, usually -
sometimes they need to be, but these cases are very rare.

In your particular case, the base cause was that your mental model of
what Haskell was doing did not match what was really happening. This
misdirected you to the wrong function and gave you surprising results.

This is not your fault. Haskell's execution model is fundamentally
different from that of most other languages, and it takes time and
experience to explore all the consequences. It is not Haskell's fault
either, either - Haskell was built to explore specifically this
execution model. My impression is that it makes Haskell's performance
somewhat less predictable, but allows you to build larger systems that
"just work" - millions of line of code, instead of the few thousands
that you get with typical imperative code.
Of course, these things become noticeable only if you (a) have worked on
software that is substantially larger than a few thousand lines of code,
and (b) at least one of these projects was in Haskell, or any other
language with good support for referentially transparent functions.

I wouldn't concentrate on getting that benchmark fast. It's too early to
get any meaningful data out of that. Essentially, while you can speed up
Haskell that way, you are going towards a more imperative style, and
you'll lose the real advantages of Haskell.
For this reason, I wouldn't pursue those monad library functions. Not
right now, anyway; they are pretty specialized, and you should have a
better reason than performance to get back to them.
If you wish to explore performance, try to make your code faster without
too much library function use. You will get more surprising results, but
you will get a better understanding of what happens.

HTH
Jo
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Thomas DuBuisson
In reply to this post by Станислав Черничкин
To recap

You claimed a 40ms measurement.  If we use `for_` instead of `void $
for` (for reasons mentioned earlier in the thread) and `-O2` when
compiling we get just under 1 ms:

```
time                 876.2 μs   (852.8 μs .. 896.9 μs)
                     0.994 R²   (0.989 R² .. 0.998 R²)
mean                 838.4 μs   (825.0 μs .. 855.5 μs)
std dev              48.21 μs   (36.91 μs .. 74.18 μs)
variance introduced by outliers: 48% (moderately inflated)
```

Cheers,
Thomas




On Wed, Sep 20, 2017 at 7:27 AM, Станислав Черничкин
<[hidden email]> wrote:

> I've wrote simple Haskell benchmark program, which populated primitive
> vector from vector package:
>
> import           Data.Vector.Primitive.Mutable as P
>
> vectorBench :: Benchmark
> vectorBench = bgroup "vector" [ primitive ]
>   where
>     primitive = bgroup "primitive" [ write1M ]
>       where
>         write1M =  bench "write1M" $ nfIO $ do
>           v <- P.unsafeNew 1000000
>           void $ for [0..1000000 - 1] $ flip (P.unsafeWrite v) (1 :: Int)
>           return ()
>
> I use `unsafeNew` to skip memory initialization and `unsafeWrite` to skip
> boundary checks, I guess it's fastest possible way to write something to
> vector. My result was about 40 ms.
>
> I wrote similar program in Scala:
>
> for (_ <- 1 to 5) {
>   val start = System.currentTimeMillis()
>   val a = new Array[Long](1000000)
>   for (i <- 0 until 1000000) {
>     a(i) = 1L
>   }
>   val end = System.currentTimeMillis()
>   println(s"${end - start} ms")
> }
>
> I skip neither boundary checks nor memory initialization, I also used
> generic array here (suitable for any types of objects, not just for
> primitive types), so I expected longer run time. But what I got was
> shocking:
>
> 7 ms
> 3 ms
> 2 ms
> 1 ms
> 2 ms
>
> This program runs 20-40 times faster than Haskell after warm-up. 20-40
> times, Carl! Why is Haskell soo slooooow? How can it be?
>
>
> --
> Sincerely, Stanislav Chernichkin.
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Brandon Allbery
In reply to this post by Saurabh Nanda
Briefly, because the 'spine' of IO (and indeed most monads) imposes a boundary on laziness, so when the IO action happens, the list *will* be built. It can only be bypassed by laziness if the IO action itself is.

Hypothetically one could have RULES that rewrite to the non-list-building variants (trailing _) if the result is seen to be discarded --- but it would probably be unreliable, much as stream fusion can only happen if things go just right.

On Wed, Sep 20, 2017 at 12:12 PM, Saurabh Nanda <[hidden email]> wrote:
Intresting. Wont using "void" or "_ <- forM blah" have the same effect? Why not? 

On 20-Sep-2017 9:37 PM, "Nick Smallbone" <[hidden email]> wrote:

Hi Stanislav,

Станислав Черничкин <[hidden email]> writes:
> I've wrote simple Haskell benchmark program, which populated primitive vector
> from vector package:
>
> ...
> void $ for [0..1000000 - 1] $ flip (P.unsafeWrite v) (1 :: Int)

'for' is a variant of 'map' - it returns a list of results (in this case
a list of a million () values). Instead you should use 'forM_' (note the
underscore) from Control.Monad, which discards the result - that cuts
the runtime by a huge amount when I try it. (And make sure to compile
with -O too.)

Nick

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.



--
brandon s allbery kf8nh                               sine nomine associates
[hidden email]                                  [hidden email]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

meburke
In reply to this post by Thomas DuBuisson
Thomas- You and Nick have been very helpful.

I tried to post this in reply to another post, but I got an error, so  
I don't know if it went through. If this ends up as a double post,  
please forgive me.

Now, the challenge is, given a set of objectives, how do you design a  
system or program to get the same results as c/c++ in roughly the same  
execution time.

If there was a modeling process (like UML for imperative OO languages)  
for Haskell, we could compare models and see the differences.

NOTE: I typically use LISP or Racket, so Haskell is not my strong suite.

A good resource is "Everything That Linguists Have Always Wanted To  
Know About Logic *But Were Ashamed to Ask" by McCawley. This has such  
as good explanation of Lambda Calculus that it is worth wading through  
the whole book.

Another NOTE: When using Symbolic Logic I use the Lukasiewicz notation  
instead of the Principia notation. This made LISP, Racket and Scheme  
really easy for me to think about. Haskell is still a little  
different, but it helps there, too.

Mike Burke
Quoting Thomas DuBuisson <[hidden email]>:

> To recap
>
> You claimed a 40ms measurement.  If we use `for_` instead of `void $
> for` (for reasons mentioned earlier in the thread) and `-O2` when
> compiling we get just under 1 ms:
>
> ```
> time                 876.2 μs   (852.8 μs .. 896.9 μs)
>                      0.994 R²   (0.989 R² .. 0.998 R²)
> mean                 838.4 μs   (825.0 μs .. 855.5 μs)
> std dev              48.21 μs   (36.91 μs .. 74.18 μs)
> variance introduced by outliers: 48% (moderately inflated)
> ```
>
> Cheers,
> Thomas
>
>
>
>
> On Wed, Sep 20, 2017 at 7:27 AM, Станислав Черничкин
> <[hidden email]> wrote:
>> I've wrote simple Haskell benchmark program, which populated primitive
>> vector from vector package:
>>
>> import           Data.Vector.Primitive.Mutable as P
>>
>> vectorBench :: Benchmark
>> vectorBench = bgroup "vector" [ primitive ]
>>   where
>>     primitive = bgroup "primitive" [ write1M ]
>>       where
>>         write1M =  bench "write1M" $ nfIO $ do
>>           v <- P.unsafeNew 1000000
>>           void $ for [0..1000000 - 1] $ flip (P.unsafeWrite v) (1 :: Int)
>>           return ()
>>
>> I use `unsafeNew` to skip memory initialization and `unsafeWrite` to skip
>> boundary checks, I guess it's fastest possible way to write something to
>> vector. My result was about 40 ms.
>>
>> I wrote similar program in Scala:
>>
>> for (_ <- 1 to 5) {
>>   val start = System.currentTimeMillis()
>>   val a = new Array[Long](1000000)
>>   for (i <- 0 until 1000000) {
>>     a(i) = 1L
>>   }
>>   val end = System.currentTimeMillis()
>>   println(s"${end - start} ms")
>> }
>>
>> I skip neither boundary checks nor memory initialization, I also used
>> generic array here (suitable for any types of objects, not just for
>> primitive types), so I expected longer run time. But what I got was
>> shocking:
>>
>> 7 ms
>> 3 ms
>> 2 ms
>> 1 ms
>> 2 ms
>>
>> This program runs 20-40 times faster than Haskell after warm-up. 20-40
>> times, Carl! Why is Haskell soo slooooow? How can it be?
>>
>>
>> --
>> Sincerely, Stanislav Chernichkin.
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> To (un)subscribe, modify options or view archives go to:
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>> Only members subscribed via the mailman list are allowed to post.
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.



_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Florian Weimer
In reply to this post by Станислав Черничкин
* Станислав Черничкин:

> I wrote similar program in Scala:
>
> for (_ <- 1 to 5) {
>   val start = System.currentTimeMillis()
>   val a = new Array[Long](1000000)
>   for (i <- 0 until 1000000) {
>     a(i) = 1L
>   }
>   val end = System.currentTimeMillis()
>   println(s"${end - start} ms")
> }
>
> I skip neither boundary checks nor memory initialization, I also used
> generic array here (suitable for any types of objects, not just for
> primitive types), so I expected longer run time. But what I got was
> shocking:
>
> 7 ms
> 3 ms
> 2 ms
> 1 ms
> 2 ms
>
> This program runs 20-40 times faster than Haskell after warm-up. 20-40
> times, Carl! Why is Haskell soo slooooow? How can it be?

It is possible that Hotspot optimizes away most of the code, perhaps
even the entire array allocation.  Meaningful benchmarking is quite
difficult due to such effects.

Does the execution time even change if you increase the number of
iterations (inner or outer)?
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Why is Haskell so slow (comparing to Java/Scala)?

Joachim Durchholz
Am 23.10.2017 um 23:49 schrieb Florian Weimer:
> It is possible that Hotspot optimizes away most of the code, perhaps
> even the entire array allocation.  Meaningful benchmarking is quite
> difficult due to such effects.
>
> Does the execution time even change if you increase the number of
> iterations (inner or outer)?

I tried that in Java, with this code:

     private static final int SIZE = 400_0000;

     private static final int ITERATIONS = 20;

     public static void main(String[] args) {
         for (int i = 1; i < ITERATIONS; i++) {
             long start = System.currentTimeMillis();
             Long [] a = new Long[SIZE];
             for (int j = 0; j < SIZE; j++) {
                 a[j] = 1L;
             }
             long end = System.currentTimeMillis();
             System.out.println((end - start) + " ms");
         }
     }

Fiddling with SIZE and ITERATIONS led me to the following conclusions:

1) The first two iterations are warm-up, afterwards the times are pretty
constant (about 7-8 ms for SIZE=100_000).
2) Cranking ITERATIONS up gave semi-regular slow iterations. I blame GC.
3) Per-loop run time seems to be roughly linear with array size, that's
a strong indicator that the assignments are not optimized away. (In fact
having a running time of 1-2 ms is a strong indicator that the software
is in fact blowing some real CPU cycles.)

It is actually possible that the standard HotSpot JVM is faster than
Haskell, for this type of task - i.e. just array manipulation with
primitive values, and working set size small enough to make GC
irrelevant. It's the kind of workload where HotSpot has all the
information needed to fully optimize everything short of eliminating the
loop altogether (which is usually not worth it for a mutable-language
just-in-time compiler).
Make things a bit more complicated and real-world, and this advantage
will shrink considerably:  Throw in a few classes, subclasses that
override functions (Haskell equivalent would be unevaluated expressions
in the list), iterate over a list instead of an array (which will
actually make the Haskell program faster if the list spine can be
optimized away), that kind of stuff.

That said, HotSpot is a pretty tough opponent to beat. It has
person-decates if not person-centuries of optimization under its belt
after all.
So does GHC, so maybe the comparison isn't so unfair after all, but
saying "Haskell is slow" from a single microbenchmark simply does not
hold any value.
Still, it is possible that the JVM is indeed faster for some kinds of
workload. If that workload is actually relevant, it might be a valid
optimization potential.

Just my 2 cents :-)

Regards,
Jo
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.