Haskell version of ray tracer code is much slower than the original ML


Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
On Thu, Jun 21, 2007 at 01:45:04PM +0100, Philip Armstrong wrote:
>As I said, I've tried the obvious things & they didn't make any
>difference. Now I could go sprinkling $!, ! and seq around like
>confetti but that seems like giving up really.

OK. Looks like I was mistaken. Strictness annotations *do* make a
difference! Humph. Wonder what I was doing wrong yesterday?

Anyway timings follow, with all strict datatypes in the Haskell
version:

Language File     Time in seconds
Haskell  ray.hs   38.2
OCaml    ray.ml   23.8
g++-4.1  ray.cpp  12.6

(ML & C++ Code from
http://www.ffconsultancy.com/languages/ray_tracer/comparison.html)

Gcc seems to have got quite a bit better since Jon last benchmarked
this code.

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re[2]: Haskell version of ray tracer code is much slower than the original ML

Bulat Ziganshin-2
Hello Philip,

Friday, June 22, 2007, 7:36:51 PM, you wrote:
> Language File     Time in seconds
> Haskell  ray.hs   38.2
> OCaml    ray.ml   23.8
> g++-4.1  ray.cpp  12.6

can you share sourcecode of this variant? i'm interested to see how
much it is obfuscated

btw, *their* measurement said that ocaml is 7% faster :)

--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
On Fri, Jun 22, 2007 at 10:11:27PM +0400, Bulat Ziganshin wrote:
>Friday, June 22, 2007, 7:36:51 PM, you wrote:
>> Language File     Time in seconds
>> Haskell  ray.hs   38.2
>> OCaml    ray.ml   23.8
>> g++-4.1  ray.cpp  12.6
>
>can you share sourcecode of this variant? i'm interested to see how
>much it is obfuscated

http://www.kantaka.co.uk/darcs/ray

The cpp & ml versions are precisely those available from the download
links on http://www.ffconsultancy.com/languages/ray_tracer/comparison.html

The optimisation options I used can be seen in the makefile.

>btw, *their* measurement said that ocaml is 7% faster :)

Indeed. The gcc-4.0 compiled binary runs at about 15s IIRC, but it's
still much better than 7% faster than the ocaml binary.

cheers, Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Claus Reinke
> http://www.kantaka.co.uk/darcs/ray

try making ray_sphere and intersect' local to intersect,
then drop their constant ray parameter. saves me 25%.
claus
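
A sketch of the transformation suggested above, on a toy intersection routine. All types and names here are assumptions for illustration and are not the code in the darcs repository. The "before" version passes the ray to a top-level helper on every call; the "after" version makes the helper local so that it closes over the ray's origin and direction.

```haskell
import Data.List (foldl')

-- Illustrative types only; names and layout are assumptions.
data Vec = Vec !Double !Double !Double
data Ray = Ray !Vec !Vec          -- origin, unit direction
data Sphere = Sphere !Vec !Double -- center, radius

(-|) :: Vec -> Vec -> Vec
Vec a b c -| Vec x y z = Vec (a - x) (b - y) (c - z)

dot :: Vec -> Vec -> Double
dot (Vec a b c) (Vec x y z) = a * x + b * y + c * z

infinity :: Double
infinity = 1 / 0

-- Before: a top-level helper that receives the (constant) ray each time.
raySphere :: Ray -> Sphere -> Double
raySphere (Ray o d) (Sphere c r)
  | disc < 0 || t2 <= 0 = infinity   -- miss, or sphere behind the origin
  | t1 > 0              = t1
  | otherwise           = t2
  where
    v    = c -| o
    b    = dot v d
    disc = b * b - dot v v + r * r
    t1   = b - sqrt disc
    t2   = b + sqrt disc

nearestBefore :: Ray -> [Sphere] -> Double
nearestBefore ray = foldl' (\best s -> min best (raySphere ray s)) infinity

-- After: the helper is local and closes over o and d, so the ray is no
-- longer an argument of the function applied to every list element.
nearestAfter :: Ray -> [Sphere] -> Double
nearestAfter (Ray o d) = foldl' step infinity
  where
    step best s = min best (hit s)
    hit (Sphere c r)
      | disc < 0 || t2 <= 0 = infinity
      | t1 > 0              = t1
      | otherwise           = t2
      where
        v    = c -| o
        b    = dot v d
        disc = b * b - dot v v + r * r
        t1   = b - sqrt disc
        t2   = b + sqrt disc

main :: IO ()
main = do
  let ray   = Ray (Vec 0 0 0) (Vec 0 0 1)
      scene = [Sphere (Vec 0 0 10) 1, Sphere (Vec 0 0 4) 1]
  print (nearestBefore ray scene, nearestAfter ray scene)
```

Both versions compute the same distances; any speed difference comes from what the optimiser can do with the local form, so it should be measured rather than assumed.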

Re: Haskell version of ray tracer code is much slower than the original ML

Jon Harrop
In reply to this post by Phil Armstrong-2
On Friday 22 June 2007 19:54:16 Philip Armstrong wrote:
> On Fri, Jun 22, 2007 at 10:11:27PM +0400, Bulat Ziganshin wrote:
> >btw, *their* measurement said that ocaml is 7% faster :)
>
> Indeed. The gcc-4.0 compiled binary runs at about 15s IIRC, but it's
> still much better than 7% faster than the ocaml binary.

What architecture, platform, compiler versions and compile lines are you
using?

On my 2x 2.2GHz Athlon64 running x64 Debian I now get:

GHC 6.6.1:    26.5s    ghc -funbox-strict-fields -O3 ray.hs -o ray
OCaml 3.10.0: 14.158s  ocamlopt -inline 1000 ray.ml -o ray
g++ 4.1.3:     8.056s  g++ -O3 -ffast-math ray.cpp -o ray

Also, the benchmarks and results that I cited before are more up to date than
the ones you're using. In particular, you might be interested in these faster
versions:

  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.ml
  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.cpp

For "./ray 6 512", I get:

OCaml: 3.140s  ocamlopt -inline 1000 ray.ml -o ray
C++:   2.970s  g++ -O3 -ffast-math ray.cpp -o ray

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

Re: Haskell version of ray tracer code is much slower than the original ML

Donald Bruce Stewart
jon:

> On Friday 22 June 2007 19:54:16 Philip Armstrong wrote:
> > On Fri, Jun 22, 2007 at 10:11:27PM +0400, Bulat Ziganshin wrote:
> > >btw, *their* measurement said that ocaml is 7% faster :)
> >
> > Indeed. The gcc-4.0 compiled binary runs at about 15s IIRC, but it's
> > still much better than 7% faster than the ocaml binary.
>
> What architecture, platform, compiler versions and compile lines are you
> using?
>
> On my 2x 2.2GHz Athlon64 running x64 Debian I now get:
>
> GHC 6.6.1:    26.5s    ghc -funbox-strict-fields -O3 ray.hs -o ray

Don't use -O3: it's *worse* than -O2, and somewhere between -Onot and -O iirc.

    ghc -O2 -funbox-strict-fields -fvia-C -optc-O2 -optc-ffast-math -fexcess-precision

is usually fairly good.



> OCaml 3.10.0: 14.158s  ocamlopt -inline 1000 ray.ml -o ray
> g++ 4.1.3:     8.056s  g++ -O3 -ffast-math ray.cpp -o ray
>
> Also, the benchmarks and results that I cited before are more up to date than
> the ones you're using. In particular, you might be interested in these faster
> versions:
>
>   http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.ml
>   http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.cpp
>
> For "./ray 6 512", I get:
>
> OCaml: 3.140s  ocamlopt -inline 1000 ray.ml -o ray
> C++:   2.970s  g++ -O3 -ffast-math ray.cpp -o ray
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy Ltd.
> The OCaml Journal
> http://www.ffconsultancy.com/products/ocaml_journal/?e

Re: Haskell version of ray tracer code is much slower than the original ML

Andrew Coppin
Donald Bruce Stewart wrote:
> Don't use -O3, it's *worse* than -O2, and somewhere between -Onot and -O iirc,
>
>     ghc -O2 -funbox-strict-fields -fvia-C -optc-O2 -optc-ffast-math -fexcess-precision
>
> Are usually fairly good.
>  

Is this likely to be fixed ever?


Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
In reply to this post by Claus Reinke
On Sat, Jun 23, 2007 at 12:42:33AM +0100, Claus Reinke wrote:
>>http://www.kantaka.co.uk/darcs/ray
>
>try making ray_sphere and intersect' local to intersect,
>then drop their constant ray parameter. saves me 25%.
>claus

I see: I guess I'm paying for laziness in the first parameter to
intersect' which I don't need. 25% brings it within spitting distance
of the OCaml binary too.

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
In reply to this post by Andrew Coppin
On Sat, Jun 23, 2007 at 08:49:15AM +0100, Andrew Coppin wrote:
>Donald Bruce Stewart wrote:
>>Don't use -O3, it's *worse* than -O2, and somewhere between -Onot and -O
>>iirc,
>
>Is this likely to be fixed ever?

There is at least a bug report for it IIRC.

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
In reply to this post by Jon Harrop
On Sat, Jun 23, 2007 at 03:28:53AM +0100, Jon Harrop wrote:
>What architecture, platform, compiler versions and compile lines are you
>using?

32-bit x86, Debian unstable, gcc version 4.1.2, OCaml version
3.09.2-9, GHC version 6.6.1, compile line in the Makefile at

  http://www.kantaka.co.uk/darcs/ray/Makefile

>  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.ml
>  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.cpp

I gather these use algorithmic optimisations.

cheers, Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Jon Harrop
On Saturday 23 June 2007 08:58:10 Philip Armstrong wrote:
> On Sat, Jun 23, 2007 at 03:28:53AM +0100, Jon Harrop wrote:
> >What architecture, platform, compiler versions and compile lines are you
> >using?
>
> 32-bit x86...

Intel or AMD?

> >  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.ml
> >  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.cpp
>
> I gather these use algorithmic optimisations.

Both. Versions 1-4 are progressively algorithmic, version 5 includes low-level
optimizations (mostly manual inlining and unrolling). Implementing version 1
is probably also interesting: it is the most concise version.

BTW, the ray tracer leverages the semantics of the float infinity (as the
parameter when there is no intersection). Shouldn't be a problem but some
compiler options might break it.
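
To illustrate the point about infinity, here is a small sketch in Haskell (the helper names are hypothetical). A miss is represented by IEEE positive infinity, so "nearest hit" is just a minimum and "did anything hit?" is a comparison against infinity. One kind of option that can plausibly interfere is GCC's -ffast-math, which implies -ffinite-math-only and so lets the compiler assume infinities never occur.

```haskell
-- The "no intersection" sentinel: IEEE 754 positive infinity.
infinity :: Double
infinity = 1 / 0

-- Hypothetical helper: nearest distance among candidates, where a miss
-- is encoded as infinity. An empty scene therefore yields infinity.
nearest :: [Double] -> Double
nearest = foldr min infinity

-- Hypothetical helper: a ray is in shadow if anything was hit at all.
inShadow :: [Double] -> Bool
inShadow distances = nearest distances < infinity

main :: IO ()
main = do
  print (isInfinite infinity)
  print (nearest [7.5, infinity, 2.0])
  print (inShadow [infinity, infinity])
```

The sentinel only works while comparisons against infinity keep their IEEE meaning, which is why the output images are worth checking after changing floating-point flags.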

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

Re: Haskell version of ray tracer code is much slower than the original ML

Jon Harrop
In reply to this post by Phil Armstrong-2
On Saturday 23 June 2007 08:54:11 Philip Armstrong wrote:

> On Sat, Jun 23, 2007 at 12:42:33AM +0100, Claus Reinke wrote:
> >>http://www.kantaka.co.uk/darcs/ray
> >
> >try making ray_sphere and intersect' local to intersect,
> >then drop their constant ray parameter. saves me 25%.
> >claus
>
> I see: I guess I'm paying for laziness in the first parameter to
> intersect' which I don't need. 25% brings it within spitting distance
> of the OCaml binary too.

Can you post the code for this?

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

Re: Haskell version of ray tracer code is much slower than the original ML

Claus Reinke
In reply to this post by Phil Armstrong-2
>>>http://www.kantaka.co.uk/darcs/ray
>>
>>try making ray_sphere and intersect' local to intersect,
>>then drop their constant ray parameter. saves me 25%.
>>claus

also try replacing that (foldl' intersect') with (foldr (flip intersect'))!

using a recent ghc head instead of ghc-6.6.1 also seems to
make a drastic difference (wild guess, seeing the unroll 1000
for ocaml: has there been a change to default unrolling in ghc?).

claus

Attachment: Ray1.hs (3K)

Re: Haskell version of ray tracer code is much slower than the original ML

Jon Harrop
On Saturday 23 June 2007 12:05:01 Claus Reinke wrote:

> >>>http://www.kantaka.co.uk/darcs/ray
> >>
> >>try making ray_sphere and intersect' local to intersect,
> >>then drop their constant ray parameter. saves me 25%.
> >>claus
>
> also try replacing that (foldl' intersect') with (foldr (flip intersect'))!
>
> using a recent ghc head instead of ghc-6.6.1 also seems to
> make a drastic difference (wild guess, seeing the unroll 1000
> for ocaml: has there been a change to default unrolling in ghc?).

Wow! Now I get:

GHC:   15.8s
OCaml: 14s
g++:   10s

That's very impressive, and more than I was expecting. I think it is worth
noting that you are now comparing optimized Haskell to unoptimized OCaml
though, so the OCaml has more "low-hanging optimization fruit". :-)

At this point, the single most productive optimization for the OCaml is to
evade the polymorphism and closure in the "intersect" function by inlining
the call to "List.fold_left". This makes the OCaml basically as fast as the
C++ on my machine:

GHC:   15.8s
OCaml: 10.6s
g++:   10s

let delta = sqrt epsilon_float

type vec = {x:float; y:float; z:float}
let zero = {x=0.; y=0.; z=0.}
let ( *| ) s r = {x = s *. r.x; y = s *. r.y; z = s *. r.z}
let ( +| ) a b = {x = a.x +. b.x; y = a.y +. b.y; z = a.z +. b.z}
let ( -| ) a b = {x = a.x -. b.x; y = a.y -. b.y; z = a.z -. b.z}
let dot a b = a.x *. b.x +. a.y *. b.y +. a.z *. b.z
let length r = sqrt(dot r r)
let unitise r = 1. /. length r *| r

let rec intersect orig dir (l, _ as hit) (center, radius, scene) =
  let l' =
    let v = center -| orig in
    let b = dot v dir in
    let disc = sqrt(b *. b -. dot v v +. radius *. radius) in
    let t1 = b -. disc and t2 = b +. disc in
    if t2>0. then if t1>0. then t1 else t2 else infinity in
  if l' >= l then hit else match scene with
  | [] -> l', unitise (orig +| l' *| dir -| center)
  | scenes -> intersects orig dir hit scenes
and intersects orig dir hit = function
  | [] -> hit
  | scene::scenes -> intersects orig dir (intersect orig dir hit scene) scenes

let light = unitise {x=1.; y=3.; z= -2.} and ss = 4

let rec ray_trace dir scene =
  let l, n = intersect zero dir (infinity, zero) scene in
  let g = dot n light in
  if g <= 0. then 0. else
    let p = l *| dir +| sqrt epsilon_float *| n in
    if fst (intersect p light (infinity, zero) scene) < infinity then 0. else g

let rec create level c r =
  let obj = c, r, [] in
  if level = 1 then obj else
    let a = 3. *. r /. sqrt 12. in
    let aux x' z' = create (level - 1) (c +| {x=x'; y=a; z=z'}) (0.5 *. r) in
    c, 3. *. r, [obj; aux (-.a) (-.a); aux a (-.a); aux (-.a) a; aux a a]

let level, n =
  try int_of_string Sys.argv.(1), int_of_string Sys.argv.(2) with _ -> 6, 512

let scene = create level {x=0.; y= -1.; z=4.} 1.;;

Printf.printf "P5\n%d %d\n255\n" n n;;
for y = n - 1 downto 0 do
  for x = 0 to n - 1 do
    let g = ref 0. in
    for dx = 0 to ss - 1 do
      for dy = 0 to ss - 1 do
        let aux x d = float x -. float n /. 2. +. float d /. float ss in
        let dir = unitise {x=aux x dx; y=aux y dy; z=float n} in
        g := !g +. ray_trace dir scene
      done;
    done;
    let g = 0.5 +. 255. *. !g /. float (ss*ss) in
    Printf.printf "%c" (char_of_int (int_of_float g))
  done;
done

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
In reply to this post by Claus Reinke
On Sat, Jun 23, 2007 at 12:05:01PM +0100, Claus Reinke wrote:
>>>>http://www.kantaka.co.uk/darcs/ray
>>>
>>>try making ray_sphere and intersect' local to intersect,
>>>then drop their constant ray parameter. saves me 25%.
>>>claus
>
>also try replacing that (foldl' intersect') with (foldr (flip intersect'))!

Thanks guys, this is exactly the kind of advice I was seeking.

OK, next question: Given that I'm using all the results from
intersect', why is the lazy version better than the strict one? Is ghc
managing to do some loop fusion?

>using a recent ghc head instead of ghc-6.6.1 also seems to
>make a drastic difference (wild guess, seeing the unroll 1000
>for ocaml: has there been a change to default unrolling in ghc?).

Um. I tried ghc head on the current version and it was about 15%
*slower* than 6.6.1

Perhaps it does better on the (slightly) optimised version?

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
In reply to this post by Jon Harrop
On Sat, Jun 23, 2007 at 10:32:31AM +0100, Jon Harrop wrote:
>On Saturday 23 June 2007 08:58:10 Philip Armstrong wrote:
>> On Sat, Jun 23, 2007 at 03:28:53AM +0100, Jon Harrop wrote:
>> >What architecture, platform, compiler versions and compile lines are you
>> >using?
>>
>> 32-bit x86...
>
>Intel or AMD?

AMD. Athlon 64 3000+ to be precise.

>> >  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.ml
>> >  http://www.ffconsultancy.com/languages/ray_tracer/code/5/ray.cpp
>>
>> I gather these use algorithmic optimisations.
>
>Both. Versions 1-4 are progressively algorithmic, version 5 includes low-level
>optimizations (mostly manual inlining and unrolling). Implementing version 1
>is probably also interesting: it is the most concise version.
>
>BTW, the ray tracer leverages the semantics of the float infinity (as the
>parameter when there is no intersection). Shouldn't be a problem but some
>compiler options might break it.

Thus far the Haskell version generates identical images to the OCaml
one.

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Phil Armstrong-2
In reply to this post by Phil Armstrong-2
On Sat, Jun 23, 2007 at 07:07:49PM +0100, Philip Armstrong wrote:

>On Sat, Jun 23, 2007 at 12:05:01PM +0100, Claus Reinke wrote:
>>>>>http://www.kantaka.co.uk/darcs/ray
>>>>
>>>>try making ray_sphere and intersect' local to intersect,
>>>>then drop their constant ray parameter. saves me 25%.
>>>>claus
>>
>>also try replacing that (foldl' intersect') with (foldr (flip intersect'))!
>
>Thanks guys, this is exactly the kind of advice I was seeking.
>
>OK, next question: Given that I'm using all the results from
>intersect', why is the lazy version better than the strict one? Is ghc
>managing to do some loop fusion?

Incidentally, replacing foldl' (intersect ray) hit scene with foldr
(flip (intersect ray)) hit scene makes the current version (without
the lifting of ray out of intersect & ray_scene) almost exactly as
fast as the OCaml version on my hardware. That's almost a 40% speedup!
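
The swap described above can be sketched on a toy stand-in for intersect' (the `Hit` type, the labels and the simplified comparison are assumptions, not the real ray tracer code). Both folds visit every element and pick the nearest hit; the only difference in results is the traversal order, which would matter only for exact ties in distance.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for a hit: (distance, label).
type Hit = (Double, String)

-- Toy version of intersect': keep the current best unless the candidate
-- is strictly nearer.
intersect' :: Hit -> Hit -> Hit
intersect' best@(l, _) cand@(l', _)
  | l' >= l   = best
  | otherwise = cand

-- The "nothing hit yet" starting value.
noHit :: Hit
noHit = (1 / 0, "none")

-- Strict left fold: walks the list front to back.
nearestL :: [Hit] -> Hit
nearestL = foldl' intersect' noHit

-- Right fold with the arguments flipped: combines from the back of the
-- list towards the front.
nearestR :: [Hit] -> Hit
nearestR = foldr (flip intersect') noHit

main :: IO ()
main = do
  let scene = [(5, "a"), (2, "b"), (9, "c")]
  print (nearestL scene)
  print (nearestR scene)
```

This shows only that the two forms agree on the answer; it says nothing about why GHC compiles one to faster code than the other.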

>>using a recent ghc head instead of ghc-6.6.1 also seems to
>>make a drastic difference (wild guess, seeing the unroll 1000
>>for ocaml: has there been a change to default unrolling in ghc?).
>
>Um. I tried ghc head on the current version and it was about 15%
>*slower* than 6.6.1
>
>Perhaps it does better on the (slightly) optimised version?

Nope, just tried it on the foldr version. It's still slower than 6.6.1
(although not by as much: 26s vs 24s for the 6.6.1 binary). This is
ghc head from this Friday.

Jon is probably correct that hoisting ray out is a code level
transformation that makes the Haskell version different to the OCaml
one, but personally I'd suggest that replacing foldl with foldr is
not: the end result is the same & both have to walk the entire list in
order to get a result.

So, on my hardware at least, the Haskell and OCaml version have
equivalent performance. I think that's pretty impressive. Getting
close to the C++ is probably going to require rather more effort!

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Re: Haskell version of ray tracer code is much slower than the original ML

Claus Reinke
In reply to this post by Phil Armstrong-2
>>also try replacing that (foldl' intersect') with (foldr (flip intersect'))!
> OK, next question: Given that I'm using all the results from
> intersect', why is the lazy version better than the strict one? Is ghc
> managing to do some loop fusion?

haskell tends to prefer foldr where mls prefer foldl, be it for
laziness and short-circuiting operators, or because a tail-recursive
function with a lazy accumulator is only an efficient way to construct
inefficient expressions.

so, the very first thing i tend to look for when someone ports a
program from ml to haskell are tail recursions with non-strict
accumulators. even using foldl', when constructing pairs in the
accumulator, there's no guarantee that the pair components will
be evaluated early even if the pairs themselves are forced. so
replacing foldl with foldr when porting from ml to haskell tends
to be a useful habit, unless there are good reasons for foldl.
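
The remark about pairs in the accumulator can be sketched as follows (function and type names are hypothetical). foldl' evaluates the accumulator only to weak head normal form, which for a tuple means the (,) constructor; the components can still pile up as thunks unless the optimiser happens to remove them. A small strict data type forces them at each step.

```haskell
import Data.List (foldl')

-- Lazy pair accumulator: foldl' forces only the (,) constructor, so the
-- running sum and count may be built up as chains of unevaluated thunks.
meanLazy :: [Double] -> Double
meanLazy xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (a, c) x = (a + x, c + 1)

-- Strict accumulator: the '!' fields are evaluated whenever an Acc is
-- constructed, so forcing the accumulator forces both components.
data Acc = Acc !Double !Int

meanStrict :: [Double] -> Double
meanStrict xs = s / fromIntegral n
  where
    Acc s n = foldl' step (Acc 0 0) xs
    step (Acc a c) x = Acc (a + x) (c + 1)

main :: IO ()
main = do
  print (meanLazy [1, 2, 3, 4])
  print (meanStrict [1, 2, 3, 4])
```

Both functions return the same value; the difference is in how much memory the fold may hold on to for long lists.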

however, things seem to be a little bit more involved in this
example: intersect' forces the first component, and ignores
the second, never building nasty delayed combinations of old
accumulator and list head in the new accumulator. but if you
compare the outputs of -ddump-simpl, or if you export all
definitions from main and compare the outputs of --show-iface,
you'll find differences related to the result of intersect':
whether or not that result can be passed unboxed.

    Constructed Product Result Analysis for Haskell (2000)
    http://citeseer.ist.psu.edu/baker-finch00constructed.html

i don't know the details, but in the foldr case, the result of
intersect' seems to be passed unboxed, in the foldl' case, it
isn't. i'll leave it to the experts to explain whether that has to
be the case or whether it is an omission in the optimizer.

claus
 
>>using a recent ghc head instead of ghc-6.6.1 also seems to
>>make a drastic difference

$ uname -a
CYGWIN_NT-5.1 cr3-lt 1.5.19(0.150/4/2) 2006-01-20 13:28 i686 Cygwin

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.6.1

$ gcc --version
gcc.exe (GCC) 3.4.2 (mingw-special)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ghc --make Ray1.hs
[1 of 1] Compiling Main             ( Ray1.hs, Ray1.o )
Linking Ray1.exe ...

$ time ./Ray1.exe >out

real    0m55.705s
user    0m0.015s
sys     0m0.031s

$ /cygdrive/c/fptools/ghc/ghc-6.7.20070613/bin/ghc --make Ray1.hs -o Ray1head.exe
[1 of 1] Compiling Main             ( Ray1.hs, Ray1.o )
Linking Ray1head.exe ...

$ time ./Ray1head.exe >out.head

real    0m24.989s
user    0m0.031s
sys     0m0.015s

$ diff -q --binary out out.head

$



Re: Haskell version of ray tracer code is much slower than the original ML

Spencer Janssen-2
In reply to this post by Phil Armstrong-2
On Thu, 21 Jun 2007 11:55:04 +0100
Philip Armstrong <[hidden email]> wrote:

> In odd spare moments, I took Jon Harrop's simple ray tracer[1] & made
> a Haskell version:
>
>   http://www.kantaka.co.uk/cgi-bin/darcsweb.cgi?r=ray
>
>   darcs get http://www.kantaka.co.uk/darcs/ray
>
> It's pretty much a straight translation into idiomatic Haskell (as far
> as my Haskell is idiomatic anyway).
>
> Unfortunately, it's a lot slower than the ML version, despite turning
> all the optimisation options up as far as they'll go. Profiling
> suggests that much of the time is spent in the intersection' function,
> and that the code is creating (and garbage collecting) an awful lot of
> (-|) vector subtraction thunks. Trying to make intersection' or
> ray_sphere stricter (with seq) appears to have no effect whatsoever:
> the output of -ddump-simpl is unchanged (with the arguments all
> staying lazy).
>
> Am I missing anything obvious? I don't want to carry out herculean
> code rewriting efforts: that wouldn't really be in the spirit of the
> thing.
>
> cheers, Phil
>
> [1] http://www.ffconsultancy.com/languages/ray_tracer/comparison.html
>
With a very minor change (attached), your Haskell ray tracer runs faster
than the OCaml version on my machine.  There's a bug in GHC where it does
not recognize -fexcess-precision at the command line, but an
OPTIONS_GHC pragma does work correctly.  This flag brings runtime from
about 60s to 20s on my machine (Core Duo 1.83GHz) -- compared to 25s
for the OCaml version.
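
For reference, a per-file pragma of the kind described looks roughly like this. The attached patch itself is not reproduced in this archive, so the module below is only a placeholder showing where the pragma goes, not the actual change.

```haskell
{-# OPTIONS_GHC -fexcess-precision #-}
-- The pragma must appear at the top of the file, before the module header.
-- It applies the flag to this module only, regardless of what is (or is
-- not) passed on the ghc command line.
module Main (main) where

-- Placeholder floating-point work; a stand-in for the ray tracer body.
sumOfRoots :: Double
sumOfRoots = sum (map sqrt [1 .. 10])

main :: IO ()
main = print (sumOfRoots > 0)
```

Putting the flag in the source file also records it alongside the code, so the build does not depend on remembering the right command line.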

Results (each run twice to avoid OS buffering of the executable):

% uname -a
Linux localhost 2.6.22-rc4 #5 SMP Tue Jun 19 17:29:36 CDT 2007 i686
Genuine Intel(R) CPU T2400 @ 1.83GHz GenuineIntel GNU/Linux
% ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.6
% ocamlopt -version
3.09.3
% (time ./hsray) | md5sum
./hsray  20.23s user 0.03s system 98% cpu 20.536 total
63a359e5c388f2004726d83d4337f56b  -
% (time ./hsray) | md5sum
./hsray  19.74s user 0.07s system 99% cpu 19.907 total
63a359e5c388f2004726d83d4337f56b  -
% (time ./mlray) | md5sum  
./mlray  25.55s user 0.00s system 98% cpu 25.831 total
63a359e5c388f2004726d83d4337f56b  -
% (time ./mlray) | md5sum
./mlray  25.63s user 0.04s system 98% cpu 25.981 total
63a359e5c388f2004726d83d4337f56b  -


Cheers,
Spencer Janssen

Attachment: ray_pragma.dpatch (4K)

Re: Haskell version of ray tracer code is much slower than the original ML

Simon Marlow-5
In reply to this post by Phil Armstrong-2
Philip Armstrong wrote:
> On Sat, Jun 23, 2007 at 08:49:15AM +0100, Andrew Coppin wrote:
>> Donald Bruce Stewart wrote:
>>> Don't use -O3, it's *worse* than -O2, and somewhere between -Onot and
>>> -O iirc,
>>
>> Is this likely to be fixed ever?
>
> There is at least a bug report for it IIRC.

It was fixed yesterday.

Cheers,
        Simon