Haskell performance in heavy numerical computations


Haskell performance in heavy numerical computations

Joel Reymont
Is anyone using Haskell for heavy numerical computations? Could you  
share your experience?

My app will be mostly about running computations over huge amounts of
stock data (time series), so I'm thinking I might be better off with
OCaml.

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Haskell performance in heavy numerical computations

Greg Fitzgerald
I have tried.  Using just normal lists, I found it difficult to avoid huge space leaks, but once things are working, the end result is easy to reason about.
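For what it's worth, the space-leak trouble with plain lists usually comes down to lazy accumulation, and the usual fix is a strict left fold. A minimal sketch (the `mean` example and its names are illustrative, not from Greg's program):

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Mean of a large list in constant space: foldl' forces the
-- accumulator pair at each step, and the bang patterns force its
-- components, so no chain of (+) thunks builds up.
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (!acc, !len) x = (acc + x, len + 1)

main :: IO ()
main = print (mean [1 .. 1000000])   -- prints 500000.5
```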

I'm considering giving it another try using Stream Processors or some form of Arrows instead of lists.  I think this strategy might allow me to do several transformations of the same input list in pseudo-parallel, which would allow the GC to collect old data as soon as it's not needed by the computation that decides when to buy and sell.  The Stream Processor library I was looking at uses Continuation Passing Style, which I don't know too much about, and I'm not sure Arrows will work out.

Generalizing the problem, how does one express: "given multiple transformations of an infinite numeric list, perform some IO at the first instance some predicate is met"?

I feel that Arrows might somehow trivially allow me to compute on huge lists without leaking, but don't understand them well enough to apply them.

If one could write an arrow that applies the transformations to the input list in parallel, then I think the above example could be solved something like this:
(f &&& g) >>> dropWhile (\(x,y) -> x > 0 && y > 0) >>> performSomeIO

Maybe <+> is the arrow operator I'm looking for, not &&&, but to be honest, the documentation for Arrows blows my mind.  I think a few examples would go a long way.
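The generalized question can also be expressed without arrows, by zipping the transformed streams so both are consumed in lockstep and the GC can reclaim the prefix as it goes. A minimal sketch (f, g, and firstHit are made-up illustrative names, not library functions):

```haskell
-- Find the first element of a lazy (possibly infinite) list
-- satisfying a predicate.
firstHit :: (a -> Bool) -> [a] -> a
firstHit p = head . filter p

f, g :: Int -> Int
f x = x * 2      -- first "transformation" of the input stream
g x = x - 10     -- second "transformation"

main :: IO ()
main = do
  let xs    = [1 ..]                      -- infinite input
      pairs = zip (map f xs) (map g xs)   -- both views, consumed in lockstep
  -- perform some IO at the first instance the predicate is met
  print (firstHit (\(a, b) -> a > 10 && b > 0) pairs)   -- prints (22,1)
```

Because `zip` pulls on both mapped lists together, neither computation retains the whole input, which is the pseudo-parallel behaviour described above.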

Thanks,
Greg


On 7/6/06, Joel Reymont <[hidden email]> wrote:
Is anyone using Haskell for heavy numerical computations? Could you
share your experience?

My app will be mostly about running computations over huge amounts of
stock data (time series)  so I'm thinking I might be better of with
OCaml.

        Thanks, Joel

--
http://wagerlabs.com/








Re: Haskell performance in heavy numerical computations

Jeff Polakow

> honest, the documentation for Arrows blows my mind.  I think a few
> examples would go a long way.
>

John Hughes' original paper on arrows is full of examples. Additionally, Hughes wrote a tutorial on programming with arrows, for the 2004 AFP summer school in Tartu, which is very accessible. Both papers are available from Hughes' publication page. Also Ross Paterson wrote a piece called Arrows and Computation, available on the arrows homepage, which contains many illuminating examples.

-Jeff


Re: Haskell performance in heavy numerical computations

Brandon Moore
In reply to this post by Joel Reymont
Joel Reymont wrote:
> Is anyone using Haskell for heavy numerical computations? Could you  
> share your experience?
>
> My app will be mostly about running computations over huge amounts of
> stock data (time series), so I'm thinking I might be better off with OCaml.

If you are really serious about numerical throughput, you probably
should be looking at vector instructions, CPU cache sizes, and
parallelising your algorithms, not choosing between high-level languages.

Hopefully it's enough to do that sort of thing for a few important
primitives, and then you can write the rest of the program in whatever
language you like.

* MonetDB

It sounds like MonetDB might be related to your problem. This database
is designed for ad-hoc analytical queries which touch most of the rows
(but hopefully only a few of the columns) of a database. The design of
the system has a lot of interesting analysis about the performance
characteristics of modern hardware.

Publications:
http://monetdb.cwi.nl/Research/Articles/index.html

The first sections of the Boncz thesis provide an overview which will
hopefully tell you whether this stuff applies at all. The only paper with
"X100" in the title is an overview of what's been happening more recently.

* Optimizing Combinator Libraries

If you are trying to build a nice, efficient combinator library in
Haskell over a few fast primitives, the ByteString library is a good
example. In particular, its use of rewrite rules to implement some kinds
of loop fusion might help in your case as well.
http://www.cse.unsw.edu.au/~dons/fps.html
I liked the slides
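The rewrite-rules technique works roughly like this: a RULES pragma tells GHC that two traversals can be fused into one, so the intermediate structure is never built. The rule below is the textbook map/map example for plain lists, not ByteString's actual rules:

```haskell
-- Tell GHC that composing two maps can be done in one pass.
-- (GHC's base library ships similar fusion rules of its own;
-- this is just the idea in miniature.)
{-# RULES
"map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}

main :: IO ()
main = print (map (+ 1) (map (* 2) [1, 2, 3 :: Int]))   -- prints [3,5,7]
```

The rule changes only how the program is compiled, never what it computes, so the printed result is the same with or without optimization.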

Perhaps ideas from this paper could help if you are doing more
complicated things:
Dynamic Optimization for Functional Reactive Programming using
Generalized Algebraic Data Types
http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf

Brandon


Re: Haskell performance in heavy numerical computations

Bulat Ziganshin-2
In reply to this post by Joel Reymont
Hello Joel,

Friday, July 7, 2006, 2:03:11 AM, you wrote:

> Is anyone using Haskell for heavy numerical computations? Could you  
> share your experience?

> My app will be mostly about running computations over huge amounts of
> stock data (time series), so I'm thinking I might be better off with
> OCaml.

Are you sure that numerical speed will be critical for your app? It may
be, for example, that the speed of thread switching and passing data, or
the ability to easily spread tasks between several CPU cores, will be
more important.

Numerical speed is poor in GHC 6.4, according to my tests: it's 10-20
times worse than GCC's. AFAIR, the mandelbrot benchmark of the Great
Language Shootout shows this - despite all optimization attempts, the
GHC entry is still 5-10 times slower than the GCC/OCaml/Clean ones.

I heard rumors that Roman Leshchinsky is working on improving this in
6.6, though. Try asking the GHC team about his work.

It's also possible that your computations can be run via existing
libraries, or written by you in C, with the rest of the program written
in Haskell. My own program uses an architecture like this: it consists
of a time-critical part, a compression library written in C, and the
remainder written in Haskell.


--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Haskell performance in heavy numerical computations

Simon Marlow-5
Bulat Ziganshin wrote:

> Numerical speed is poor in GHC 6.4, according to my tests: it's 10-20
> times worse than GCC's. AFAIR, the mandelbrot benchmark of the Great
> Language Shootout shows this - despite all optimization attempts, the
> GHC entry is still 5-10 times slower than the GCC/OCaml/Clean ones.

We average 1.3x slower than C in the shootout. Thanks to Don Stewart for
the following stats...

The numerical floating-point-intensive benchmarks:

     mandelbrot  2.7x C    (Clean 1.7x, OCaml 2.4x, C++ 2.6x)
     n-body      2.1x C    (Clean 0.9x, OCaml 1.3x, C++ 1.4x)

http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nbody&lang=all

So we're still only 2-3 times slower than C on these benchmarks, and we
know how to improve them.

In other cases, for example some integer/floating math, we beat GCC:
partial-sums:

http://shootout.alioth.debian.org/gp4/benchmark.php?test=partialsums&lang=all

The full list, Haskell vs. C:

     0.1x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
     0.1x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=message&lang=all
     0.1x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=regexdna&lang=all
     0.5x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=binarytrees&lang=all
     0.7x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsieve&lang=all
     0.9x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebits&lang=all
     0.9x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=partialsums&lang=all
     1.3x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=recursive&lang=all
     1.6x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=pidigits&lang=all
     1.8x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=fasta&lang=all
     1.8x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nbody&lang=all
     2.3x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=spectralnorm&lang=all
  *  2.5x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all
     2.7x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all
  *  3.2x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=fannkuch&lang=all
  * 11.0x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=revcomp&lang=all
  * 22.0x C
http://shootout.alioth.debian.org/gp4/benchmark.php?test=knucleotide&lang=all

  * Known, or expected, to improve under Data.ByteString
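For readers unfamiliar with it, Data.ByteString provides packed, strict byte arrays with a list-like API, which is why those starred entries are expected to improve. A minimal sketch of the kind of inner loop it speeds up (B.count and B.pack are real functions from the bytestring package):

```haskell
import qualified Data.ByteString.Char8 as B

-- Count newlines in a packed buffer: one tight loop over raw bytes,
-- rather than walking a linked list of boxed Chars.
countLines :: B.ByteString -> Int
countLines = B.count '\n'

main :: IO ()
main = print (countLines (B.pack "one\ntwo\nthree\n"))   -- prints 3
```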

Cheers,
        Simon

Re[2]: Haskell performance in heavy numerical computations

Bulat Ziganshin-2
Hello Simon,

Monday, July 10, 2006, 1:12:13 PM, you wrote:

>> Numerical speed is poor in GHC 6.4, according to my tests: it's 10-20
>> times worse than GCC's. AFAIR, the mandelbrot benchmark of the Great
>> Language Shootout shows this - despite all optimization attempts, the
>> GHC entry is still 5-10 times slower than the GCC/OCaml/Clean ones.

> We average 1.3x slower than C in the shootout. Thanks to Don Stewart for
> the following stats...

As I once wrote, most of the shootout benchmarks depend on the
libraries. For example, the multi-threading benchmark is much faster on
GHC than on GCC because the former has user-level threads support built
in, while the latter must use external libraries (the testing policy
prohibits using additional libraries).

> The numerical floating-point-intensive benchmarks:

>      mandelbrot  2.7x C    (Clean 1.7x, OCaml 2.4x, C++ 2.6x)
>      n-body      2.1x C    (Clean 0.9x, OCaml 1.3x, C++ 1.4x)

Those are the benchmarks I had in mind.

> http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all

The same benchmark with N=600 on a Sempron processor:

http://shootout.alioth.debian.org/debian/benchmark.php?test=mandelbrot&lang=all

Here GHC is 10x slower than GCC and more than 5x slower than Clean and
OCaml. I think this is the real computing-speed difference, while with
N=2000 the C/OCaml/Clean programs were really limited by memory speed.

At least the same situation held for the trivial "a[i]=b[i]+c[i]"
benchmark discussed half a year ago: when the arrays were small enough
to fit in the CPU cache, the speed difference was 20x. On large arrays
the C program was limited by memory speed and the difference was only 3x.
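For reference, that trivial benchmark looks like this in Haskell with unboxed mutable arrays, where elements are stored as raw doubles rather than boxed thunks. The sizes and values below are illustrative, not the original benchmark's:

```haskell
import Data.Array.IO (IOUArray, getElems, newArray, newListArray,
                      readArray, writeArray)

-- a[i] = b[i] + c[i] over n elements, in the IO monad.
addArrays :: Int -> IOUArray Int Double -> IOUArray Int Double
          -> IOUArray Int Double -> IO ()
addArrays n a b c =
  mapM_ (\i -> do x <- readArray b i
                  y <- readArray c i
                  writeArray a i (x + y))
        [0 .. n - 1]

main :: IO ()
main = do
  let n = 4
  b <- newListArray (0, n - 1) [1, 2, 3, 4]
  c <- newListArray (0, n - 1) [10, 20, 30, 40]
  a <- newArray (0, n - 1) 0
  addArrays n a b c
  getElems a >>= print   -- prints [11.0,22.0,33.0,44.0]
```

With unboxed arrays the inner loop is close to what C does per element; the remaining gap is loop overhead and bounds checking, which is what the cache-size discussion above is measuring.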


>   *  2.5x C
>   *  3.2x C
>   * 11.0x C
>   * 22.0x C

>   * Known, or expected, to improve under Data.ByteString

One more sign that this is more a test of libraries than of compilers.
For example, the revcomp test performs only about 1000 operations in the
main language but calls regex functions on 100KB strings. It's not
strange that in this test Tcl is now the fastest entry.

About numerical apps: AFAIR, Manuel Chakravarty is working on a new
version of his parallel arrays engine.


--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Re[2]: Haskell performance in heavy numerical computations

Donald Bruce Stewart
bulat.ziganshin:

> [...]
> > http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all
>
> the same benchmark with N=600 and Sempron processor:
>
> http://shootout.alioth.debian.org/debian/benchmark.php?test=mandelbrot&lang=all
>
> here GHC is 10x slower than GCC and more than 5 times compared to
> Clean and Ocaml. i think that this is the real computing speed
> difference while with N=2000 C/Ocaml/Clean programs was really limited
> by memory speed.

Ah! In this case, on the Debian machine, the benchmark was compiled
_without_ -fexcess-precision; that's what's causing the big slowdown.
We had to turn it on on the gp4, but it seems the flag wasn't
propagated to the debian/sempron builds for some reason.

Looks like the ghc/mandelbrot benchmark just needs to be rerun with
-fexcess-precision in this case.

-- Don

Re: Re[2]: Haskell performance in heavy numerical computations

Brent Fulgham

On Jul 10, 2006, at 4:23 AM, Donald Bruce Stewart wrote:
> Ah! In this case, on the Debian machine, the benchmark was compiled
> _without_ -fexcess-precision; that's what's causing the big slowdown.
> We had to turn it on on the gp4, but it seems the flag wasn't
> propagated to the debian/sempron builds for some reason.
>
> Looks like the ghc/mandelbrot benchmark just needs to be rerun with
> -fexcess-precision in this case.

Ah!  Thanks for noticing this -- I did not catch it.

I've rerun, and the new value is more like 0.428 secs for 600.

Thanks,

-Brent

Re: Haskell performance in heavy numerical computations

Frederik Eaton-3
In reply to this post by Joel Reymont
Alberto Ruiz has developed a linear algebra library which could be
seen as an alternative to Matlab/Octave, using the GSL, ATLAS, LAPACK,
etc. IIRC.

http://dis.um.es/~alberto/GSLHaskell/

I've optimized it in some places, and added an interface which
guarantees operand conformability through the type system at compile
time.

http://ofb.net/~frederik/stla/

It is almost twice as fast as Octave on an example program, and
probably comparable to Matlab. However, it is currently very difficult
to debug some of the type errors which are emitted by GHC when using
it, and there are some outstanding heap-related bugs. Also, we don't
have as wide a variety of built-in functions as Matlab and Octave do.
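The compile-time conformability idea can be illustrated with a type-level length tag, so that combining vectors of different sizes is rejected by the type checker. This sketch uses modern GHC type-level naturals, which is not how Frederik's library was implemented; it only shows the flavor of the technique:

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}
import GHC.TypeLits (Nat)

-- A vector carrying its length in its type. The runtime
-- representation is still a plain list; only the type changes.
newtype Vec (n :: Nat) = Vec [Double]

-- Addition is only defined for vectors of the same type-level length,
-- so a length mismatch is a compile-time error, not a runtime one.
vadd :: Vec n -> Vec n -> Vec n
vadd (Vec xs) (Vec ys) = Vec (zipWith (+) xs ys)

v3 :: Vec 3
v3 = Vec [1, 2, 3]

main :: IO ()
main = let Vec zs = vadd v3 v3 in print zs   -- prints [2.0,4.0,6.0]
-- vadd v3 (Vec [1, 2] :: Vec 2) would be rejected by the type checker.
```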

Frederik

On Thu, Jul 06, 2006 at 11:03:11PM +0100, Joel Reymont wrote:
> Is anyone using Haskell for heavy numerical computations? Could you share your experience?
>
> My app will be mostly about running computations over huge amounts of stock data (time series), so I'm thinking
> I might be better off with OCaml.
>
> Thanks, Joel

--
http://ofb.net/~frederik/