Is Haskell IO inherently slower than C?

Jake
I'm trying to write a program in Haskell that writes a large file at the end, and it seems like that output alone is taking way longer than it should. I don't understand why Haskell shouldn't be able to write data as quickly as C, so I wrote two test files:

-- Test.hs
import Control.Loop
import Data.ByteString.Builder
import System.IO

main :: IO ()
main =
  numLoop 0 1000000 $ \_ ->
    hPutBuilder stdout $ char7 ' '

// test.c
#include <stdio.h>

int main() {
  int i;
  for (i = 0; i < 1000000; i++) {
    fprintf(stdout, " ");
  }
  return 0;
}

I compiled them both with -O2 and ran them with their output redirected to /dev/null. The Haskell version took around 0.3 seconds, while the C version took around 0.03. Is there any reason why the Haskell version would be slower by an order of magnitude for such simple IO?

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Re: Is Haskell IO inherently slower than C?

Jake
I also tried the more standard forM_ [0..100000] idiom and got similar times. I hoped loop might be faster, but apparently not.

Re: Is Haskell IO inherently slower than C?

Mike Izbicki
You need to be doing tests that take much longer to accurately compare
runtimes of executables.  Here's why:

Haskell programs have a more complicated run time system than C
programs, and the binaries output by GHC are much larger.  Therefore,
there will be a small additional overhead that you have to pay once
when the program starts.  For some machines, this could possibly be on
the order of 0.3 seconds.

I'd recommend you scale your test so that it takes at least 30 seconds
for the slowest program.  This will give you more meaningful results.

Re: Is Haskell IO inherently slower than C?

Jake
Thanks for the tip! I tried it again, this time using 100000000, and the Haskell one runs at around 25 seconds. The C one is only 1.5 seconds. Can that still be the run time system?

Re: Is Haskell IO inherently slower than C?

Mike Izbicki
That convinces me that your Haskell program is actually slower.  My
guess is this is due to a buffering issue, and not due to anything
Haskell related.  Buffering is how the operating system makes a
tradeoff between performance and safety for IO operations, and I'm
guessing your two code fragments are (behind the scenes) making a
different decision about how to do buffering.

For a Haskell-related discussion about buffering, see:
http://book.realworldhaskell.org/read/io.html#io.buffering

BTW, a good rule of thumb for io performance in any language is to
never write anything character by character, and instead write large
chunks of characters at once.  The above link should make clear why
this is the case.
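
A minimal sketch of that rule of thumb in Haskell, assuming the strict Data.ByteString API from the bytestring package. The helper name spaces is made up for illustration; it builds the whole payload in memory and hands it to the handle in one call instead of one call per character:

```haskell
import qualified Data.ByteString as B
import System.IO (stdout)

-- One strict ByteString holding n space bytes (32 is ASCII space).
-- "spaces" is a hypothetical helper, not something from the thread.
spaces :: Int -> B.ByteString
spaces n = B.replicate n 32

main :: IO ()
main = B.hPut stdout (spaces 1000000)  -- a single hPut for the whole chunk
```

This trades memory for fewer calls: the full megabyte exists as one ByteString before anything is written, which is fine at this size but may not be for much larger outputs.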

Re: Is Haskell IO inherently slower than C?

Jake
It seems to me that BlockBuffering is the fastest. I checked and the default buffering mode for stdout is BlockBuffering Nothing. So I went to the C program and checked the value of BUFSIZ which was 8192. Then I set the buffering mode for stdout to BlockBuffering (Just 8192) so they should be buffering the same, right?
It didn't really change the performance at all. Is there something else going on with the buffering I'm not changing?
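
For reference, the check and the change described above can be sketched as follows, using hGetBuffering and hSetBuffering from System.IO. This is an illustration rather than the exact code that was run; the diagnostics go to stderr so they do not mix with the benchmark output on stdout:

```haskell
import System.IO

main :: IO ()
main = do
  -- Report the mode the runtime picked for stdout. GHC typically uses
  -- LineBuffering on a terminal and BlockBuffering Nothing otherwise.
  before <- hGetBuffering stdout
  hPutStrLn stderr ("default: " ++ show before)
  -- Request an explicit 8192-byte block buffer, matching the BUFSIZ value.
  hSetBuffering stdout (BlockBuffering (Just 8192))
  after <- hGetBuffering stdout
  hPutStrLn stderr ("now:     " ++ show after)
```

Running it with stdout redirected to /dev/null shows the mode that applies in the benchmark setup.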

Re: Is Haskell IO inherently slower than C?

Duncan Coutts-4
In reply to this post by Jake
On Thu, 2016-05-05 at 01:05 +0000, Jake wrote:

> I'm trying to write a program in Haskell that writes a large file at
> the
> end, and it seems like that output alone is taking way longer than it
> should. I don't understand why Haskell shouldn't be able to write
> data as
> quickly as C, so I wrote two test files:
>
> -- Test.hs
> import Control.Loop
> import Data.ByteString.Builder
> import System.IO
>
> main :: IO ()
> main =
>   numLoop 0 1000000 $ \_ ->
>     hPutBuilder stdout $ char7 ' '

This is a highly pessimal use of bytestring builder. You're setting up
a new output buffer for every char. You should either write chars to
the handle buffer, or write one big builder to the handle.

That is either:

numLoop 0 1000000 $ \_ ->
  hPutChar stdout ' '

or

hPutBuilder stdout (mconcat (replicate 1000000 (char7 ' ')))

This isn't the fastest way to use the  bytestring builder, but it's
pretty convenient. The first way isn't great as it has to take the
Handle lock for every char.

On my system those three versions run in time ranges of:
  Hs putChar: 0.154s -- 0.167s
  Hs builder: 0.008s -- 0.014s
  C:          0.012s -- 0.023s

So the answer is no. So long as you can blat bytes into a buffer
quickly enough then there's no reason Haskell IO need be slower than
C.

In terms of benchmarking fairness, neither of these examples (the
builder version or the C version) is the fastest possible way of
blatting bytes into buffers, though both are reasonably convenient
methods in their respective languages.

Duncan
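
As a rough illustration of the "one big builder" suggestion above, a complete program might look like the sketch below. It is not the code used for the timings in this message. The helper name spacesBuilder and the optional command-line count are assumptions added so the test can be scaled up as suggested earlier in the thread:

```haskell
import Data.ByteString.Builder (Builder, char7, hPutBuilder)
import System.Environment (getArgs)
import System.IO (stdout)

-- A single Builder for n spaces, assembled before any output happens.
-- "spacesBuilder" is a hypothetical helper name.
spacesBuilder :: Int -> Builder
spacesBuilder n = mconcat (replicate n (char7 ' '))

main :: IO ()
main = do
  args <- getArgs
  -- Optional iteration count; defaults to the 1000000 used in the thread.
  -- "read" fails on non-numeric input, which is acceptable for a benchmark.
  let n = case args of
        (a : _) -> read a
        []      -> 1000000
  -- One hPutBuilder call, so the output buffer is set up once, not per char.
  hPutBuilder stdout (spacesBuilder n)
```

Compiled with -O2 and run as `./Test 100000000 > /dev/null`, this should allow a like-for-like comparison with the C loop at the larger size.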

Re: Is Haskell IO inherently slower than C?

Hong Yang-2
In reply to this post by Mike Izbicki
I never used Data.ByteString.Builder before. But I guess "char7" is encoding a space to something. Will this slow things down?
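
One way to see what char7 produces is to run a builder to a lazy ByteString and inspect the bytes. The sketch below assumes only the bytestring package; char7Bytes is a made-up helper name. For an ASCII space the result is the single byte 32, so the encoding step itself does very little work:

```haskell
import Data.ByteString.Builder (char7, toLazyByteString)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- The bytes that char7 emits for one character.
-- "char7Bytes" is a hypothetical helper for inspection only.
char7Bytes :: Char -> [Word8]
char7Bytes = BL.unpack . toLazyByteString . char7

main :: IO ()
main = print (char7Bytes ' ')  -- prints [32]
```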

Re: Is Haskell IO inherently slower than C?

Baojun Wang
ByteString Builder is meant for building a bytestring incrementally (as a monoid). The example prints the builder immediately after building a single "char7", so there is no need to use a builder at all, and doing so only introduces overhead.
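
A small sketch of the incremental, monoidal style described above. The names line and report and the numbered-lines output are hypothetical, chosen only to show many small builders being combined and written once:

```haskell
import Data.ByteString.Builder (Builder, char7, hPutBuilder, intDec)
import System.IO (stdout)

-- Render one number followed by a newline (hypothetical output format).
line :: Int -> Builder
line i = mconcat [intDec i, char7 '\n']

-- Combine the per-line builders monoidally; nothing is written yet.
report :: Int -> Builder
report n = foldMap line [1 .. n]

main :: IO ()
main = hPutBuilder stdout (report 5)  -- one write for the combined builder
```

The point is that hPutBuilder appears once, outside the loop; the loop itself only combines builders.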
