Shootout favoring imperative code


Shootout favoring imperative code

Chad Scherrer-2
Several people on this list have said that the shootout favors imperative code. Is this really the case? Why is it that Clean seems to have no trouble (for the incomplete set of benchmarks that are written in it)?

http://shootout.alioth.debian.org/clean.php

How difficult would it be to translate the Clean algorithms into Haskell?

-Chad

Re: Shootout favoring imperative code

Sebastian Sylvan
On 1/4/06, Scherrer, Chad <[hidden email]> wrote:
> Several people on this list have said that the shootout favors imperative code. Is this really the case? Why is it that Clean seems to have no trouble (for the incomplete set of benchmarks that are written in it)?
>
> http://shootout.alioth.debian.org/clean.php
>
> How difficult would it be to translate the Clean algorithms into Haskell?
>

IMO, the problem isn't that it's not *possible* to make the Haskell
versions as fast as the C versions. You just write them in an
imperative style using pointers, peek, poke, etc. However, most people
don't use Haskell for its facilities for writing C-style code.

Some of the problems seem to be heavily geared towards an imperative
*implementation*, meaning that a Haskell version is hardly idiomatic
Haskell (and as such I, and I suspect others, really have no
inclination to work on it).
Take the fannkuch benchmark, for instance. Part of it is to generate
the input data (all permutations of a sequence). It would be better not
to require the program to print out the first 30 lists from the input
data, since that places additional (completely unneeded) restrictions
on how to implement the program (and leads to an unnecessarily ugly
implementation if you generate the input in a non-imperative fashion).
I assume it's no coincidence that this sequence exactly matches the
"straightforward" way to generate permutations in C.

Now, there is some value in restricting the implementation in *some*
cases. For instance, if you want to test the speed of function calls
with a recursive algorithm, you need to enforce that the algorithm is
written with recursion.

So in conclusion: by placing too many requirements on the
implementations of the algorithms, you make the benchmarks completely
useless for languages that aren't C-ish in nature. It all degenerates
into "are there enough programmers willing to spend time writing C in
Haskell?" rather than how easy it is to solve the problems in Haskell,
and how fast the results are.

I should note that there are a few good "unbiased" benchmarks in there
as well (such as chameneos and others) which place very little
restriction on the implementation and just say what needs to be done...

I should also note that I don't think these benchmarks mean anything
at all. It would be better to test, say, the best possible solutions
for some of the ICFP programming contests; they say a lot more about
real-world speed.


/S
--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

Re: Shootout favoring imperative code

Brent Fulgham
--- Sebastian Sylvan <[hidden email]> wrote:
> Some of the problems seem to be heavily geared towards an imperative
> *implementation*, meaning that a Haskell version is hardly idiomatic
> Haskell (and as such I, and I suspect others, really have no
> inclination to work on it).

I agree that several benchmarks suffer from this problem, but we have
been trying to change this where possible.

> Take the fannkuch benchmark, for instance. Part of it is to generate
> the input data (all permutations of a sequence). It would be better not
> to require the program to print out the first 30 lists from the input
> data, since that places additional (completely unneeded) restrictions
> on how to implement the program (and leads to an unnecessarily ugly
> implementation if you generate the input in a non-imperative fashion).

I must disagree.  We based this benchmark on a very standard benchmark
studied by Lisp implementors (and others, see e.g.
http://citeseer.ist.psu.edu/315141.html) in an effort to address the
problems of the original array-access benchmark (which was extremely
imperative in nature).  I don't think asking Haskell to handle this
simple program is unfair; Ken Anderson and others dealt with this for
Lisp many years ago.

> I assume it's no coincidence that this sequence exactly matches the
> "straightforward" way to generate permutations in C.

But I think Haskell may face real-world cases where data must be
produced in some known order.  For Haskell to be a contender in "real
world" use, it sometimes has to confront ugly requirements.

> I should also note that I don't think these benchmarks mean anything
> at all. It would be better to test, say, the best possible solutions
> for some of the ICFP programming contests; they say a lot more about
> real-world speed.

Agreed.  However, it's a lot easier to get volunteers to implement
small benchmarks (thereby allowing many languages to be compared) than
large ICFP entries.

-Brent

Re: Shootout favoring imperative code

Sebastian Sylvan
On 1/4/06, Brent Fulgham <[hidden email]> wrote:

> But I think Haskell may face real-world cases where data must be
> produced in some known order.  For Haskell to be a contender in "real
> world" use, it sometimes has to confront ugly requirements.

I must respectfully note that you contradict yourself somewhat. First
you state that there's no problem introducing unnecessary requirements
on the order of generating input data because that's to be expected in
the "real world", and then you state that the reason for not using more
real-world benchmarks is to attract more volunteers. Wouldn't less
strict requirements, where they are possible, also attract more
contributions?

My point here was that even though you _can_ generate this data in
Haskell, there's no point in requiring it (because the order doesn't
matter for the benchmark itself). This needless requirement (for the
data to follow the order you get from an imperative solution) will
only put off contributors of functional solutions.

If you wanted to be fair here, the order would be much more intricate
and require considerable obfuscation in all languages.

/S

--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

Re: Shootout favoring imperative code

Brent Fulgham
--- Sebastian Sylvan <[hidden email]> wrote:
> My point here was that even though you _can_ generate this data in
> Haskell, there's no point in requiring it (because the order doesn't
> matter for the benchmark itself).

We do need to agree on which 30 permutations should be used in the
validation of the benchmark (just to make sure that the algorithms are
producing correct output).

Perhaps we could specify the 30 (or perhaps 'N') permutations as an
input file, or perhaps require that they be hard-coded into the
program?

The problem with using an input file is that now we are involving file
I/O in the benchmark, which introduces questions about where time is
being spent (i.e., file access instead of pancake-flipping).

-Brent

Re: Shootout favoring imperative code

Sebastian Sylvan
On 1/4/06, Brent Fulgham <[hidden email]> wrote:

> --- Sebastian Sylvan <[hidden email]> wrote:
> > My point here was that even though you _can_ generate this data in
> > Haskell, there's no point in requiring it (because the order doesn't
> > matter for the benchmark itself).
>
> We do need to agree on which 30 permutations should be used in the
> validation of the benchmark (just to make sure that the algorithms are
> producing correct output).

Wouldn't the "maximum number of flips" output be enough for validation?
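
(The checksum itself is trivial in any style; a rough sketch of the flip
count in Haskell, assuming the usual fannkuch rule of reversing the
first k elements whenever the head is k:)

-- Count reversals until the head of the permutation becomes 1.
flips :: [Int] -> Int
flips (1:_)    = 0
flips xs@(k:_) = 1 + flips (reverse front ++ back)
  where (front, back) = splitAt k xs
flips []       = 0

-- The benchmark's real answer is just the maximum over all permutations.
maxFlips :: [[Int]] -> Int
maxFlips = maximum . map flips

main :: IO ()
main = print (flips [3,1,2])  -- 2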


/S

--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

RE: Shootout favoring imperative code

Chad Scherrer-2
In reply to this post by Chad Scherrer-2
> --- Sebastian Sylvan <[hidden email]> wrote:
> Some of the problems seem to be heavily geared towards an
> imperative *implementation*, meaning that a Haskell version
> is hardly idiomatic Haskell (and as such I, and I suspect
> others, really have no inclination to work on it).

This may be correct, but it still doesn't address the question.
Why does Clean fare so much better? Clean is purely functional, right?

Chad Scherrer
Computational Mathematics Group
Pacific Northwest National Laboratory

"Time flies like an arrow; fruit flies like a banana." -- Groucho Marx


Re: Shootout favoring imperative code

Sebastian Sylvan
On 1/4/06, Scherrer, Chad <[hidden email]> wrote:
> > --- Sebastian Sylvan <[hidden email]> wrote:
> > Some of the problems seem to be heavily geared towards an
> > imperative *implementation*, meaning that a Haskell version
> > is hardly idiomatic Haskell (and as such I, and I suspect
> > others, really have no inclination to work on it).
>
> This may be correct, but it still doesn't address the question.
> Why does Clean fare so much better? Clean is purely functional, right?

Ah, but is that really the interesting question? :-)
IMO, no. I don't think most of these benchmarks are unfair
performance-wise (mostly because there are very few problems which are
*inherently faster* in imperative languages), but some of them are
unfair in "implementation-effort time". So the fact that Clean fares
better than Haskell doesn't have much to do with the "unfairness" of
the shootout (though it certainly is interesting!).

There are many programs which can be quite easily programmed in
Haskell but are cumbersome in C, yet very few programs of that type are
in the shootout (see basically any Haskell tutorial for many
examples :-)).

For instance, working with big lazy matrices (doing multiplication,
inversions etc.) and then extracting only a single element somewhere
for the output. That's certainly useful, and difficult to do in C or
C++, but comes "for free" (implementation-time-wise) in Haskell.

OTOH, now that I think about it, most of the cases where Haskell is
clearly more elegant than the competition are in more "real-world"
scenarios, and not in "benchmarky" code (small programs performing a
small set of operations many times), so it may be that benchmarks are
inherently biased towards "von-Neumann-abstraction"-type languages.

/S
--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

Re: Shootout favoring imperative code

Sebastian Sylvan
In reply to this post by Brent Fulgham
On 1/4/06, Brent Fulgham <[hidden email]> wrote:

> --- Sebastian Sylvan <[hidden email]> wrote:
> > Some of the problems seem to be heavily geared towards an imperative
> > *implementation*, meaning that a Haskell version is hardly idiomatic
> > Haskell (and as such I, and I suspect others, really have no
> > inclination to work on it).
>
> I agree that several benchmarks suffer from this problem, but we have
> been trying to change this where possible.
>

A good example of an unfair benchmark, which Udo Stenzel noted over at
the shootout Haskell wiki, is "sum-file". Here you specify that no line
is ever more than 128 characters long. What's the purpose of doing
that? Clearly it's to make life easier for C programmers, is it not?
(Do C programs never have to deal with the "real world", where such
assumptions can't be made?)
Furthermore, a 128-digit number in base 10 would occupy 53 bytes in
binary, but the C (and most other) implementations assume that it will
all fit within one machine word, which is obviously against the spec
(the only thing the spec says about the size of the numbers is that a
line is at most 127 digits plus a newline, so a compliant
implementation must assume the worst about the size of the numbers).

This is a one-liner in idiomatic Haskell (getContents >>= print . sum
. map read . lines), but since there are restrictions specifically
tailored to make life easier for lower-level languages, the Haskell
submission must resort to all sorts of "hacks" to compete (to
circumvent the high-level general tools available). Unfair!
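
(Spelled out as a complete program, that's all there is to it; a sketch,
where the explicit Integer annotation is my addition, and it's what
makes the 127-digit case a non-issue:)

-- Sum the integers on stdin, one per line.  Using Integer rather than
-- Int means even 100+ digit values are handled correctly.
main :: IO ()
main = do
  s <- getContents
  print (sum (map read (lines s) :: [Integer]))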

/S

--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

Re: Shootout favoring imperative code

haskell-2
Also about sum-file: They do not reveal what the actual 8k test file
contains.  So there is no way to reproduce the benchmark locally for
testing.  (One can learn it totals 400000, but since negative numbers
are allowed, this does not help much).

The problem can even be solved in one line with (g)awk.

Apparently it is bottlenecked by parsing strings into integers, but they
specify "Programs should use built-in line-oriented I/O functions rather
than custom-code." which means the programmer's hands are completely
tied.  So it is just a benchmark of the built-in library function, not
of any algorithm the programmer provides.

There is no need to beat a dead horse, though.  This benchmark sets out
to test fgets / atoi, and that is all.  There are better benchmarks to
spend time on.

--
Chris



Re: Shootout favoring imperative code

Sebastian Sylvan
On 1/5/06, Chris Kuklewicz <[hidden email]> wrote:

> Also about sum-file: They do not reveal what the actual 8k test file
> contains.  So there is no way to reproduce the benchmark locally for
> testing.  (One can learn it totals 400000, but since negative numbers
> are allowed, this does not help much).
>
> The problem can even be solved in one line with (g)awk.
>
> Apparently it is bottlenecked by parsing strings into integers, but they
> specify "Programs should use built-in line-oriented I/O functions rather
> than custom-code." which means the programmer's hands are completely
> tied.  So it is just a benchmark of the built-in library function, not
> of any algorithm the programmer provides.
>
> There is no need to beat a dead horse, though.  This benchmark sets out
> to test fgets / atoi, and that is all.  There are better benchmarks to
> spend time on.
>

I agree. The benchmark really is about how fast you can call low-level
IO system calls. But since Haskell is a high-level language it's
natural that it's a bit difficult to get access to these unsafe (but
fast) low-level functions.
In fact, if I really wanted to do this, I would use the FFI...

Do you think they'll accept this contribution for sum-file?

--------

import Foreign.C
import Foreign.Ptr
import Foreign.Marshal.Array

foreign import ccall "stdio.h" fgets :: CString -> Int -> Ptr CFile ->IO CString
foreign import ccall safe "stdlib.h" atoi :: CString -> Int
foreign import ccall safe "stdio.h &(*stdin)" c_stdin :: Ptr CFile

bufferSize = 128

loop :: CString -> Int -> IO Int
loop buf v =
    do ret <- fgets buf bufferSize c_stdin
       case (ret == nullPtr) of
         True  -> return v -- eof, or some other error
         False -> do let i = atoi buf
                     i `seq` loop buf (v + i) -- force eval of 'i'!

main = do v <- allocaArray bufferSize (\buf -> loop buf 0)
          print v

--------------

--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

Re: Shootout favoring imperative code

haskell-2
I did manage to tweak SumFile to use unboxed Int# and go 10% faster.

http://haskell.org/hawiki/SumFile




Re: Shootout favoring imperative code

haskell-2
This uses getLine instead of getContents and is 3.8 times slower.


{-# OPTIONS -fglasgow-exts -O2 #-}
--
-- The Computer Language Shootout
-- http://shootout.alioth.debian.org/
--
-- compile with : ghc -O2 -o SumF SumF.hs
-- To get better performance set default heap size to 10MB
-- i.e. invoke as : ./SumF +RTS -H10M <input_file.txt
-- contributed by Greg Buchholz
-- modified by Mirko Rahn
-- modified by Chris Kuklewicz, 5 Jan 2006
--
import Data.Char(ord)
import GHC.Base

{-# INLINE d2i #-}
d2i :: Char -> Int#
d2i c = let (I# x) = ord c
            (I# z) = ord '0'
        in x -# z

main :: IO ()
main = do
  let next rt = do line <- catch getLine (\_ -> return [])
                   if (null line) then return (I# rt)
                                  else next (rt +# aLine line)
      aLine ('-':rest) = nLine rest 0#
          where nLine [] soFar = (-1#) *# soFar
                nLine (x:rest) soFar = nLine rest (d2i x +# (10# *# soFar))
      aLine (x:rest) = pLine rest (d2i x)
          where pLine [] soFar = soFar
                pLine (x:rest) soFar = pLine rest (d2i x +# (10# *# soFar))
  total <- next 0#
  print total


Sebastian Sylvan wrote:

> On 1/5/06, Chris Kuklewicz <[hidden email]> wrote:
>
>>I did manage to tweak SumFile to use unboxed Int# and go 10% faster.
>>
>>http://haskell.org/hawiki/SumFile
>
>
> Interestingly, it's actually faster to do it this way than to call the
> C functions fgets and atoi...
>
> However, I'm not sure it's "legal" since it's not using built-in
> line-based IO functions (which the spec states, for no apparent reason
> other than to favor languages whose IO convenience functions are
> low-level :-))
>
> /S
>
> --
> Sebastian Sylvan
> +46(0)736-818655
> UIN: 44640862
>


Re: Shootout favoring imperative code

Udo Stenzel
In reply to this post by Sebastian Sylvan
Sebastian Sylvan wrote:
> On 1/5/06, Chris Kuklewicz <[hidden email]> wrote:
> > There is no need to beat a dead horse, though.  This benchmark sets out
> > to test fgets / atoi, and that is all.  There are better benchmarks to
> > spend time on.
>
> I agree. The benchmark really is about how fast you can call low-level
> IO system calls. But since Haskell is a high-level language it's
> natural that it's a bit difficult to get access to these unsafe (but
> fast) low-level functions.

There's probably a bit more to it.  First off, one could legitimately
argue that (liftM lines getContents) is the Haskell way to do line
oriented IO.  The real question is, why does the fast solution have to
be ugly

> foreign import ccall "stdio.h" fgets :: CString -> Int -> Ptr CFile ->IO CString
> foreign import ccall safe "stdlib.h" atoi :: CString -> Int
> foreign import ccall safe "stdio.h &(*stdin)" c_stdin :: Ptr CFile

and why does the idiomatic solution have to be slow?

| main = print . sum . map read . lines =<< getContents

The biggest hit is probably the construction of a huge String as linked
list, which cannot be deforested away (not with the foldr/build
mechanism anyway).  Assuming we find a better representation for
Strings, we could make some headway here and in many other benchmarks.

So I think, just by replacing String and along with it getContents,
lines and read, we will get competitive speed and retain the ability to
handle arbitrarily long lines.  Of course, the shootout wouldn't care
for a feature that is otherwise quite important in practice...  Anyway,
I'll try to come up with something and then follow up on this.
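
(To sketch the shape such a replacement could take: assuming a
packed-string API providing getContents, lines and readInt, with names
borrowed from Data.ByteString.Char8, so treat them as an assumption,
the pipeline stays idiomatic.  Like the C entries, this sketch assumes
each value fits in an Int.)

import qualified Data.ByteString.Char8 as B

-- Same pipeline as the String version, but over packed strings.
main :: IO ()
main = do
  s <- B.getContents
  print (sum (map parse (B.lines s)))
  where
    parse l = case B.readInt l of
                Just (n, _) -> n
                Nothing     -> 0  -- skip blank or malformed lines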


Udo.
--
A man always needs to remember one thing about a beautiful woman:
Somewhere, somebody's tired of her.


Re: Shootout favoring imperative code

Brent Fulgham
In reply to this post by haskell-2


--- Chris Kuklewicz <[hidden email]> wrote:

> Also about sum-file: They do not reveal what the actual 8k test file
> contains.  So there is no way to reproduce the benchmark locally for
> testing.  (One can learn it totals 400000, but since negative numbers
> are allowed, this does not help much).

It's created by catting the example file together multiple times.  I'll
update the page to be clearer about this, and I can send you the actual
file used on the test machine if you want.

> Apparently it is bottlenecked by parsing strings into integers, but
> they specify "Programs should use built-in line-oriented I/O functions
> rather than custom-code." which means the programmer's hands are
> completely tied.  So it is just a benchmark of the built-in library
> function, not of any algorithm the programmer provides.

Yes -- it was designed as a test of the standard I/O system.

-Brent

Re: Shootout favoring imperative code

haskell-2
In reply to this post by Udo Stenzel
Relative speed comparison below



The solution on the wiki (Char by Char) is the fastest I could make:
>   print . total =<< getContents
Time was 1.00 seconds

I tried using getLine and it was slower:
>   let next rt = do line <- catch getLine (\_ -> return [])
>                    if (null line) then return (I# rt)
>                                   else next (rt +# aLine line)
Time was 3.79 seconds

I tried using getContents and lines and it was slowest:
>   total <- return.(next 0#).lines =<< getContents
Time was 4.76 seconds

From this, I assume the buffering that getContents does is very
optimized.  The cost of getLine was very nontrivial.  So for non-binary
input, I would recommend using getContents and processing it Char by Char.
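
(For reference, the Char-by-Char total is essentially the sketch below
with a plain Int accumulator; the wiki version layers unboxed Int# and
strict accumulation on top of the same shape, which is where the
remaining speed comes from.)

-- Walk the characters once, building each line's value and a running sum.
-- (A production version would force the accumulators, e.g. with seq.)
total :: String -> Int
total = go 0 0 1
  where
    go acc cur sign []          = acc + sign * cur
    go acc cur sign ('\n':rest) = go (acc + sign * cur) 0 1 rest
    go acc cur _    ('-':rest)  = go acc cur (-1) rest
    go acc cur sign (c:rest)    = go acc (10 * cur + fromEnum c - fromEnum '0') sign rest

main :: IO ()
main = print . total =<< getContents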

--
Chris

Re: Shootout favoring imperative code

haskell-2
In reply to this post by Brent Fulgham
Brent Fulgham wrote:

> --- Chris Kuklewicz <[hidden email]> wrote:
>
>>Also about sum-file: They do not reveal what the actual 8k test file
>>contains.  So there is no way to reproduce the benchmark locally for
>>testing.  (One can learn it totals 400000, but since negative numbers
>>are allowed, this does not help much).
>
> It's created by catting the example file together multiple times.
> I'll update the page to be clearer about this, and I can send you the
> actual file used on the test machine if you want.

That is what I did as a hack.  Nice to see I guessed right.  Right now I
use a 1,680,000 line (concatenated) version to get the processing times
large enough to discern small improvements.

>>Apparently it is bottlenecked by parsing strings into integers, but they
>>specify "Programs should use built-in line-oriented I/O functions rather
>>than custom-code." which means the programmer's hands are completely
>>tied.  So it is just a benchmark of the built-in library function, not
>>of any algorithm the programmer provides.
>
> Yes -- it was designed as a test of the standard I/O system.
>
> -Brent
>

I assumed that I could use getContents, like the previously accepted
Haskell entry.  It returns the entire stdin as a single (lazy) string,
so it is technically not "line oriented".  But it is certainly a
"standard I/O system" for Haskell.  I'll submit my improved version soon.

--
Chris

Re: Shootout favoring imperative code

haskell-2
In reply to this post by Sebastian Sylvan
[hidden email] wrote:
>>There is no need to beat a dead horse, though.  This benchmark sets out
>>to test fgets / atoi, and that is all.  There are better benchmarks to
>>spend time on.
>
>
> You can say that again!
>
Ah..sarcasm, I know that one.

Actually, I submitted a slightly faster sum-file entry for Haskell
tonight.  So I did kick the horse around, and I learned how to use
-funbox-strict-fields.

> Is a persecution complex required for Haskell programming :-)

The sum-file benchmark is not about my cleverness.  It is designed to
test the language's library routines.  I quote Brent:

> Yes -- it was designed as a test of the standard I/O
> system.
>
> -Brent

See...It is like the "startup" benchmark -- which just tests how long it
takes to start the program (and print "Hello World.").

>
> ackermann, sum-file, random, startup (aka hello world) are all
> left-over from the old Doug Bagley Great Computer Language Shootout -
> they are just little snippets of nothing that provide an easy starting
> point - takfp and harmonic are much the same.

takfp, ackermann and harmonic can be good tests of how well the
language handles recursion.
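
(For instance, the whole of ackermann in Haskell is just the recursive
definition, so there is little for the benchmark to measure besides
call overhead; a minimal sketch:)

-- Two-argument Ackermann function: nothing but deep recursion.
ack :: Int -> Int -> Int
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

main :: IO ()
main = print (ack 3 4)  -- 125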

>
> I'm waiting for the complaints that binary-trees was designed to favour
> functional programming languages ;-)

;-) == sarcasm

>
> best wishes, Isaac
>
Cheers,
  Chris



Re: Shootout favoring imperative code

Isaac Gouy-2
In reply to this post by Chad Scherrer-2
I sent a private email and the response to it has
appeared on this mailing-list, so let me just correct
some of the interpretations that have been made.

> > You can say that again!

> Ah..sarcasm, I know that one.

No, it was emphatic agreement (the ordinary usage of
that phrase).


> > Is a persecution complex required for Haskell
> > programming :-)

This one is sarcasm. Hi Sebastian :-)


> > I'm waiting for the complaints that binary-trees was designed
> > to favour functional programming languages ;-)

> ;-) == sarcasm

Actually no - binary-trees was written with functional
programming languages very much in mind.

best wishes, Isaac


               


Re: Shootout favoring imperative code

Sebastian Sylvan
In reply to this post by Udo Stenzel
On 1/6/06, Udo Stenzel <[hidden email]> wrote:

> There's probably a bit more to it.  First off, one could legitimately
> argue that (liftM lines getContents) is the Haskell way to do line
> oriented IO.  The real question is, why does the fast solution have to
> be ugly
>
> > foreign import ccall "stdio.h" fgets :: CString -> Int -> Ptr CFile ->IO CString
> > foreign import ccall safe "stdlib.h" atoi :: CString -> Int
> > foreign import ccall safe "stdio.h &(*stdin)" c_stdin :: Ptr CFile
>
> and why does the idiomatic solution have to be slow?
>
> | main = print . sum . map read . lines =<< getContents
>
> The biggest hit is probably the construction of a huge String as linked
> list, which cannot be deforested away (not with the foldr/build
> mechanism anyway).  Assuming we find a better representation for
> Strings, we could make some headway here and in many other benchmarks.
>
> So I think, just by replacing String and along with it getContents,
> lines and read, we will get competitive speed and retain the ability to
> handle arbitrarily long lines.  Of course, the shootout wouldn't care
> for a feature that is otherwise quite important in practice...  Anyway,
> I'll try try to come up with something and then follow up on this.

It would be neat if the PackedString library contained functions such
as hGetLine etc. It does have a function for reading from a buffer,
but it won't stop at a newline...
But yeah, fast string manipulation is difficult when using a
linked-list representation...

/S
--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862