Is Haskell for me?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

Is Haskell for me?

Luis P. Mendes
Hi,

I'd like to have some points of view on this subject.

I need to start a project on neural analysis and genetic algorithms
that will process many millions of records in a Linux X86_64 box (some
development also in a X32 computer).  Maybe I'll use fuzzy logic, or
case-based reasoning, or decision trees, too.

Recently, I finished a small project on genetic algorithms with
Python, but since run-time is very important, my next project will
need a fast language.  Either I'll go to C++ or similar language or
I'll try a functiional language - Haskell.
Apart from some features in Python, I've never programmed thinking in
a functional way.
I'd like to avoid having to learn C++.  I'd like to concentrate in
getting the work done and Haskell seems like a solution.  But...

My questions are:
- Is Haskell able to read (also write to a point) data from databases
in a fast and reliable way? (MySql or PostgreSQL)

- how could I program something like this in Haskell:
    .. generate random population
    .. for each one of the population:
      .. for time period 1 to ten million:
        .. evaluate method 1, 2, 3, 4, 5, 6, ....
      ..evaluate fitness of each one
    .. generate new population based on results of previous generation

It seems relatively intuitive for me to program this in an imperative
language.  But what about in Haskell?

- Is Haskell suitable to process data like this in a fast way
(aproximate to C++?)

- In order for Haskell to be fast, coding is done in a 'natural' way
or with use of special hidden details of the language?

- Although I always liked math, I no longer have the knowledge I used
to have several years ago.  Is this important to help program in this
funcional language?

- Are there graphical packages available to plot results or is it easy
to connect it to a Python (or C) library?

- Is code easily reusable in different future projects?  Since it has
no objects... how can it be done?

Sorry for all these questions, but I really need to know about this
and that's why I want to read answers from knowledgeable people.


Luis
Reply | Threaded
Open this post in threaded view
|

Is Haskell for me?

Luca Ciciriello

Hi.

Why not?

A friend of mine is a resercher in AI and is an expert in Neural network (solution spaces, matrices, etc). His programming language is Haskell. He is very happy to use Haskell (on MacOS X) in its reserach.

Personally I use Haskell with Postgres using HDBC or calling the postgress function in pqlib using the Haskell FFI (may favourite way).

 

I'm also an experienced C++ programmer (12 years) and I can say that C++ is a very complex language to learn from scratch and master it. Now I'm finding a little bit hard to switch  from an imperative language to a functional language.

Anyway Haskell is a very beautiful and elegant and powerful language that you can use for everything.

 

About using a GUI, I've never used one on Mac. Sorry. I know that exist a Haskell extension called GTK2HS (see: http://www.haskell.org/gtk2hs/).

If you want to use GHC (the de-facto compiler for Haskell), the code produced is very fast and you have a very powerful support for concurrency and parallelism.

 

For further information go to: http://www.haskell.org/ghc/ where you can find a lot of useful documentation and answers to ur questions.

 

Luca.

> Date: Fri, 6 Nov 2009 13:58:16 +0000
> From: [hidden email]
> To: [hidden email]
> Subject: [Haskell-beginners] Is Haskell for me?
>
> Hi,
>
> I'd like to have some points of view on this subject.
>
> I need to start a project on neural analysis and genetic algorithms
> that will process many millions of records in a Linux X86_64 box (some
> development also in a X32 computer). Maybe I'll use fuzzy logic, or
> case-based reasoning, or decision trees, too.
>
> Recently, I finished a small project on genetic algorithms with
> Python, but since run-time is very important, my next project will
> need a fast language. Either I'll go to C++ or similar language or
> I'll try a functiional language - Haskell.
> Apart from some features in Python, I've never programmed thinking in
> a functional way.
> I'd like to avoid having to learn C++. I'd like to concentrate in
> getting the work done and Haskell seems like a solution. But...
>
> My questions are:
> - Is Haskell able to read (also write to a point) data from databases
> in a fast and reliable way? (MySql or PostgreSQL)
>
> - how could I program something like this in Haskell:
> .. generate random population
> .. for each one of the population:
> .. for time period 1 to ten million:
> .. evaluate method 1, 2, 3, 4, 5, 6, ....
> ..evaluate fitness of each one
> .. generate new population based on results of previous generation
>
> It seems relatively intuitive for me to program this in an imperative
> language. But what about in Haskell?
>
> - Is Haskell suitable to process data like this in a fast way
> (aproximate to C++?)
>
> - In order for Haskell to be fast, coding is done in a 'natural' way
> or with use of special hidden details of the language?
>
> - Although I always liked math, I no longer have the knowledge I used
> to have several years ago. Is this important to help program in this
> funcional language?
>
> - Are there graphical packages available to plot results or is it easy
> to connect it to a Python (or C) library?
>
> - Is code easily reusable in different future projects? Since it has
> no objects... how can it be done?
>
> Sorry for all these questions, but I really need to know about this
> and that's why I want to read answers from knowledgeable people.
>
>
> Luis
> _______________________________________________
> Beginners mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/beginners
     
_________________________________________________________________
New Windows 7: Simplify what you do everyday. Find the right PC for you.
http://www.microsoft.com/uk/windows/buy/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20091106/851a2e23/attachment.html
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Maurí­cio CA
In reply to this post by Luis P. Mendes
 > - how could I program something like this in Haskell: ..
 > generate random population

Since Haskell is a "pure" language, after you attribute a value
to a variable they are glued together forever. So, the language
itself can't have such a thing as a random number. (Here, someone
with technical knowledge will probably correct me, but you get
the poing.) Dealing with this (and other "outside world" stuff)
requires learning how to deal with a special type construct
called "Monad". It's the major step you will have to as a Haskell
begginer. It's, though, extremely interesting, and extremely
powerfull after you understand it.

 > - Although I always liked math, I no longer have the knowledge
 > I used to have several years ago. Is this important to help
 > program in this funcional language?

No. But if you like math, you're probably going to find links to
really interesting math while you learn.

 > (...) is it easy to connect it to a Python (or C) library?

To C libraries the answer is yes. Actually, if I want to do C, I
prefer to do it in Haskell :) A friend of mine, who uses neural
networks in his MSc work, was impressed that in half an hour I
could get better results using a low-level binding from libfann to
Haskell (link below) than his own hand written code.

http://hackage.haskell.org/package/bindings-fann

 > - Is code easily reusable in different future projects? Since it
 > has no objects... how can it be done?

Yes, to a point you may find too radical at first. Even the most
basic constructs are usually combinations of other pieces. After
some time, you will start any code you write by searching pieces
to join together instead of writing everything yourself. It is,
though, usually easier to write code from scratch in Haskell than
reuse code in imperative languages :)


Biased opinions, of course, but I hope they help. Best,
Maur?cio

Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Cory Knapp
I'm certainly not an expert on Haskell, and my studies over the last
year have prevented me from really working in Haskell (or any language)
much, but...

Maur??cio CA wrote:

> > - how could I program something like this in Haskell: ..
> > generate random population
>
> Since Haskell is a "pure" language, after you attribute a value
> to a variable they are glued together forever. So, the language
> itself can't have such a thing as a random number. (Here, someone
> with technical knowledge will probably correct me, but you get
> the poing.) Dealing with this (and other "outside world" stuff)
> requires learning how to deal with a special type construct
> called "Monad". It's the major step you will have to as a Haskell
> begginer. It's, though, extremely interesting, and extremely
> powerfull after you understand it.
I don't think you need that deep of an understanding of monads in order
to write random code. You need a basic understanding of "monads as
computations", and you need to know how to use do notation, which should
feel very much like programming in an imperative language.
On the other hand, the rest of the algorithm Luis mentioned is *very*
natural in Haskell once you learn how to deal with lists.
>
> > - Although I always liked math, I no longer have the knowledge
> > I used to have several years ago. Is this important to help
> > program in this funcional language?
>
> No. But if you like math, you're probably going to find links to
> really interesting math while you learn.
I Definitely agree. Being a math student, one of the things I like about
Haskell is how it's a "down to earth" example of some very high level
concepts showing up-- without necessarily needing to learn all the math
that's there. (I spend more time learning the math than the Haskell, but
that's a result of my biases, not need.)

Now, your other (related) questions:

- Is Haskell suitable to process data like this in a fast way
(aproximate to C++?)

- In order for Haskell to be fast, coding is done in a 'natural' way
or with use of special hidden details of the language?

Idiomatic Haskell won't be as fast as idiomatic C++, but it will blow Python away. Since I assume you'll be working with large data sets, lazy evaluation may actually make your life much easier. If you really are pushing for lightening fast code, you can do some Don Stewart style hacking, or you can link to C using the foreign function interface. I have no experience with either, but I've heard nothing but the best about Haskell/C interaction.


Cheers,
Cory
Reply | Threaded
Open this post in threaded view
|

Is Haskell for me?

Bernhard Lehnert
In reply to this post by Luis P. Mendes
Hi!

Am Freitag, den 06.11.2009, 13:58 +0000 schrieb Luis P. Mendes:
> - Is Haskell able to read (also write to a point) data from databases
> in a fast and reliable way? (MySql or PostgreSQL)

You might want to check out these links:
http://software.complete.org/software/projects/show/hdbc
http://software.complete.org/static/hdbc-odbc/doc//HDBC-odbc/Database-HDBC-ODBC.html
http://sites.google.com/site/haskell/notes/connecting-to-mysql-with-haskell
http://www.volker-wysk.de/mysql-hs/

> - how could I program something like this in Haskell:
>     .. generate random population
>     .. for each one of the population:
>       .. for time period 1 to ten million:
>         .. evaluate method 1, 2, 3, 4, 5, 6, ....
>       ..evaluate fitness of each one
>     .. generate new population based on results of previous generation

You will have to change your thinking a lot. Forget about "for each one
of the population". Think "Map a function to the list which contains the
population". When I started I thought I knew map from Python - and in
fact I did underestimate "map" all the time in Python.
For time 1 to ten million evaluate a method? No - map "evaluate fitness"
to the list [1..10000000] (Don't worry about big numbers - it's all
lazily evaluated.
"Generate new population based on previous generation"? It's just
recursion and where will you find better recursion than in a functional
language?


> - Are there graphical packages available to plot results or is it easy
> to connect it to a Python (or C) library?

There are some graphical packages, but if everything else fails: Do fast
computing in Haskell and plot the results using Python.


> - Is code easily reusable in different future projects?  Since it has
> no objects... how can it be done?

You do not reuse objects, you reuse functions - that's why they call it
functional and yes

> Sorry for all these questions, but I really need to know about this
> and that's why I want to read answers from knowledgeable people.

Sorry, I'm a beginner myself and would not actually call myself
"knowledgeable". However, the learning curve is steep and lots of things
that seem to be very complicated are soon very logic!

You'll need some time to get the new ideas. If you got that, it's really
worthwhile!

Greets,
Bernhard

Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Shawn Willden
In reply to this post by Cory Knapp
On Friday 06 November 2009 09:52:48 am Cory Knapp wrote:
> Idiomatic Haskell won't be as fast as idiomatic C++, but it will blow
> Python away.

Based on the little bit of stuff I've done, I think I'd characterize it this
way:  C++ will be maybe twice as fast as Haskell.  Maybe a little more, maybe
a little less, depending on a lot of details.  For heavy computation, Python
will be a couple orders of magnitude slower than both.

IOW, Haskell is slower than C++ but it's in the same ballpark.

Would anyone disagree?

        Shawn.
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Rafael Gustavo da Cunha Pereira Pinto-2
WARNING: THIS IS A VERY LONG REPLY

Luis,

If you are in a hurry skip to the point where I give advices ;-)

----------------------------------------SKIP IF YOU LIKE
TO----------------------------------------------------


I'll ask some more questions!! :-P


- Nothing beats Assembly in speed.

Why isn't everyone programming in Assembly??

- Python gives me the flexibility for fast prototyping, because it easily
connects to C and Java libraries

Why isn't every enterprise service bus (ESB) implementation written in
Python??

- Ruby has Rails!! One of most successful MVC frameworks ever.

Why isn't every system written in Ruby on Rails?




Basically, what I am trying to say is that even though Haskell is not
blazing fast, it can be made fast enough for you, given you use the right
algorithms and optimizations.

For this to happen, you must first understand, the functional paradigm, the
language and how the compiler optimizes your code.

For the rest of your questions:



- Is Haskell able to read (also write to a point) data from databases in a
fast and reliable way? (MySql or PostgreSQL)

Yes, there is a lot of ways. Look at Hackage
(HackageDB)<http://hackage.haskell.org/packages/hackage.html>for
database related packages

- how could I program something like this in Haskell:
   .. generate random population
   .. for each one of the population:
     .. for time period 1 to ten million:
       .. evaluate method 1, 2, 3, 4, 5, 6, ....
     ..evaluate fitness of each one
   .. generate new population based on results of previous generation

It seems relatively intuitive for me to program this in an imperative
language.  But what about in Haskell?


It looks rather imperative to me, either. You can implement imperative
things on the IO and State monads, but I would really suggest you rethink
the algorithm.


- Is Haskell suitable to process data like this in a fast way
(aproximate to C++?)

How fast, again, depends on the algorithm.
Haskell programs look like a collection of mathematical functions.

One could write a program like a composition of functions:

main= count . words . readfile

and then define functions individually...

- In order for Haskell to be fast, coding is done in a 'natural' way
or with use of special hidden details of the language?

The natural way makes you code right, but you have to think every step you
do, since a badly organized recursion, or the excess of lazyness can make
your program hog on memory and/or go dead slow.


- Although I always liked math, I no longer have the knowledge I used
to have several years ago.  Is this important to help program in this
funcional language?

The math needed in your first steps is not hard.

After you started learning Haskell you will most certainly step on Monads.
These might require some abstract algebra, if you want to understand what's
behind the curtains.

Refrain yourself from trying to understand them using math and take a look
on Philip Wadler's "Monads for functional
programming"<http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf>

- Are there graphical packages available to plot results or is it easy
to connect it to a Python (or C) library?

Hackage is the way to go.

- Is code easily reusable in different future projects?  Since it has
no objects... how can it be done?

Like all other languages, it all depends on YOU.

Haskell has no objects, but it has polymorphic functions, which are just as
powerful.

It also has type classes, which are almost like C++ object classes, but
completely different. Type classes can restrict or increase polymorphism,
depending on how you use them.

There are also MANY extensions implemented on GHC that expand the type
system to its limit.

---------------------------------------- END OF SKIP
BLOCK----------------------------------------------------


Is Haskell for you? I can't answer, but I can give some advice:

1) If you are in a hurry, go for what you know (C++, Java, Python,
Assembly...)

2) If you really want to dig in, dig in. Learning Haskell is a wonderful
experience and can dramatically change your way of programming. Think of a
mind altering experience!!!

3) Even though it is hard to write great programs at first, in the end it is
very rewarding. As your programming style improves, you'll see how elegant
algorithms are implemented in functional programming.


I still consider myself a beginner, but I can assure you Haskell is a great
language for functional programming.


Best regards,

Rafael Gustavo da Cunha Pereira Pinto
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20091106/82db49e7/attachment-0001.html
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Daniel Fischer-4
In reply to this post by Shawn Willden
Am Freitag 06 November 2009 18:19:50 schrieb Shawn Willden:
> On Friday 06 November 2009 09:52:48 am Cory Knapp wrote:
> > Idiomatic Haskell won't be as fast as idiomatic C++, but it will blow
> > Python away.
>
> Based on the little bit of stuff I've done, I think I'd characterize it
> this way:  C++ will be maybe twice as fast as Haskell.  Maybe a little
> more, maybe a little less, depending on a lot of details.

Difficult waters. Depending on a lot of things, (rather idiomatic) Haskell is between a
little faster (that's rare, however) and *much* slower than C (or C++, but I can only
strongly advise against using that; pick C, C# or Java if you want to use a fast, halfway
sensible imperative language. C++ consistently chose what I find the worst parts of both
worlds - YMMV).

Two things should be noted:
- it's easy, especially if one isn't yet experienced, to write Haskell code that is much
slower than C because of the wrong choice of data structures or not making things strict
in the right places.
- but if that happens, you have a good chance of quickly getting help to get up to speed
here or on #haskell.

That said, a factor of 2-3 relative to C is usually achievable without low-level tweaking,
sometimes better, sometimes worse.

> For heavy
> computation, Python will be a couple orders of magnitude slower than both.

For not very large values of 'couple': don't expect a factor of more than 100 - that's
extremely rare.

>
> IOW, Haskell is slower than C++ but it's in the same ballpark.

Yes, as a general rule, that's it.
>
> Would anyone disagree?

Bulat?
>
> Shawn.


Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Gaius Hammond
In reply to this post by Shawn Willden

On 6 Nov 2009, at 17:19, Shawn Willden wrote:

> Based on the little bit of stuff I've done, I think I'd characterize  
> it this
> way:  C++ will be maybe twice as fast as Haskell.  Maybe a little  
> more, maybe
> a little less, depending on a lot of details.  For heavy  
> computation, Python
> will be a couple orders of magnitude slower than both.



To be fair, Python offloads its heavy lifting to C libraries - NumPy  
and SciPy run at very close to full C speed on large datasets. This is  
also how Matlab works. Unladen Swallow is an upcoming JIT compiler for  
Python.



Where Haskell shines for computation is when you can leverage lazy  
evaluation.



Cheers,



G


Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Felipe Lessa
On 11/6/09, Gaius Hammond <[hidden email]> wrote:
> To be fair, Python offloads its heavy lifting to C libraries - NumPy
> and SciPy run at very close to full C speed on large datasets. This is
> also how Matlab works. Unladen Swallow is an upcoming JIT compiler for
> Python.
>
> Where Haskell shines for computation is when you can leverage lazy
> evaluation.

*If* you can offload most of your work to SciPy. Depending on what you
do this is at least difficult. I don't know how much of a neural
network can be represented as big fat matrix :).

--
Felipe.
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

geremy condra-2
On Fri, Nov 6, 2009 at 4:58 PM, Felipe Lessa <[hidden email]> wrote:

> On 11/6/09, Gaius Hammond <[hidden email]> wrote:
>> To be fair, Python offloads its heavy lifting to C libraries - NumPy
>> and SciPy run at very close to full C speed on large datasets. This is
>> also how Matlab works. Unladen Swallow is an upcoming JIT compiler for
>> Python.
>>
>> Where Haskell shines for computation is when you can leverage lazy
>> evaluation.
>
> *If* you can offload most of your work to SciPy. Depending on what you
> do this is at least difficult. I don't know how much of a neural
> network can be represented as big fat matrix :).
>
> --
> Felipe.

Neural networking can be easily handled in both Python and C.
I've used my own Graphine graph theory package for it and
found that it was quite easy and reasonably fast, and certainly
LEDA is a very high performance, pretty easy-to-use library.
There's no need to coerce ANNs into a matrix form if you don't
want to.

Geremy Condra
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Jon Harrop
In reply to this post by Shawn Willden
On Friday 06 November 2009 17:19:50 Shawn Willden wrote:

> On Friday 06 November 2009 09:52:48 am Cory Knapp wrote:
> > Idiomatic Haskell won't be as fast as idiomatic C++, but it will blow
> > Python away.
>
> Based on the little bit of stuff I've done, I think I'd characterize it
> this way:  C++ will be maybe twice as fast as Haskell.  Maybe a little
> more, maybe a little less, depending on a lot of details.  For heavy
> computation, Python will be a couple orders of magnitude slower than both.
>
> IOW, Haskell is slower than C++ but it's in the same ballpark.
>
> Would anyone disagree?

The long-standing bug in GHC's garbage collector that makes writing a single
boxed value into an array O(n) instead of O(1), because the GC dirties and
retraverses the entire array, makes it impossible to write efficient Haskell
code to solve many basic problems.

Here is a simple dictionary benchmark where Python is 4x faster than Haskell
because this bug means it is impossible to implement a competitively-
performant dictionary:

http://flyingfrogblog.blogspot.com/2009/04/f-vs-ocaml-vs-haskell-hash-table.html

Haskell's celebrated idiomatic quicksort is actually 6,000x slower than a
traditional implementation on this machine and consumes asymptotically more
memory (making it useless for quite mundane problems):

http://www.haskell.org/haskellwiki/Introduction#Quicksort_in_Haskell

Although people have created array-based quicksorts in Haskell the same perf
bug in the GC means that a generic quicksort in Haskell would be
asymptotically slower if it were given an array of boxed values.

As an aside, purity complicates the creation of a parallel generic quicksort
because it is necessary to mutate different parts of the same array in
parallel. AFAICT, writing an efficient parallel generic quicksort is an
unsolved problem in Haskell.

Suffice to say, I cannot agree that Haskell is in the same ballpark as C++ in
terms of performance. :-)

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Ben Lippmeier

On 21/11/2009, at 15:36 , Jon Harrop wrote:

> The long-standing bug in GHC's garbage collector that makes writing a single
> boxed value into an array O(n) instead of O(1), because the GC dirties and
> retraverses the entire array, makes it impossible to write efficient Haskell
> code to solve many basic problems.

Are you talking about http://hackage.haskell.org/trac/ghc/ticket/650 ?

The way I read that ticket is that writing a boxed value to a mutable array is still O(1), but the garbage collector traverses the whole array during scanning. That could certainly slow down GC cycles, but how does it make array update O(n)?

(update of the standard Haskell "pure" arrays being a separate issue, of course).

Ben.


Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Nicolas Pouillard-2
Excerpts from Ben Lippmeier's message of Sat Nov 21 07:38:09 +0100 2009:

>
> On 21/11/2009, at 15:36 , Jon Harrop wrote:
>
> > The long-standing bug in GHC's garbage collector that makes writing a single
> > boxed value into an array O(n) instead of O(1), because the GC dirties and
> > retraverses the entire array, makes it impossible to write efficient Haskell
> > code to solve many basic problems.
>
> Are you talking about http://hackage.haskell.org/trac/ghc/ticket/650 ?
>
> The way I read that ticket is that writing a boxed value to a mutable array is still O(1), but the garbage collector traverses the whole array during
> scanning. That could certainly slow down GC cycles, but how does it make array update O(n)?

He means in worst case. Writing once, will cause O(n) during the next GC. Of
course if you do a lot of updates between two GCs their is no extra penalty.
So if you make 'k' updates between 2 GCs it costs you k*O(1)+O(n) which is
still O(n) but practically nicer than k*O(n).

--
Nicolas Pouillard
http://nicolaspouillard.fr
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Ben Lippmeier

On 21/11/2009, at 21:13 , Nicolas Pouillard wrote:

> Excerpts from Ben Lippmeier's message of Sat Nov 21 07:38:09 +0100 2009:
>>
>> On 21/11/2009, at 15:36 , Jon Harrop wrote:
>>
>>> The long-standing bug in GHC's garbage collector that makes writing a single
>>> boxed value into an array O(n) instead of O(1), because the GC dirties and
>>> retraverses the entire array, makes it impossible to write efficient Haskell
>>> code to solve many basic problems.
>>
>> Are you talking about http://hackage.haskell.org/trac/ghc/ticket/650 ?
>>
>> The way I read that ticket is that writing a boxed value to a mutable array is still O(1), but the garbage collector traverses the whole array during
>> scanning. That could certainly slow down GC cycles, but how does it make array update O(n)?
>
> He means in worst case. Writing once, will cause O(n) during the next GC. Of
> course if you do a lot of updates between two GCs their is no extra penalty.
> So if you make 'k' updates between 2 GCs it costs you k*O(1)+O(n) which is
> still O(n) but practically nicer than k*O(n).
>

Hmm. I'd be careful about conflating algorithmic complexity with memory management issues. By the above reasoning, if I were to run any program using arrays on a system with a two space garbage collector (which copies all live objects during each GC) I could say the worst case algorithmic complexity was O(n). That doesn't sound right.

I could take this further and say that in a virtual memory system, there is a chance that the whole heap gets copied to the disk and back between each array update. I might again say this has O(n) complexity, but it wouldn't be very helpful...

Ben.

Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Nicolas Pouillard-2
Excerpts from Ben Lippmeier's message of Sat Nov 21 12:56:09 +0100 2009:

>
> On 21/11/2009, at 21:13 , Nicolas Pouillard wrote:
>
> > Excerpts from Ben Lippmeier's message of Sat Nov 21 07:38:09 +0100 2009:
> >>
> >> On 21/11/2009, at 15:36 , Jon Harrop wrote:
> >>
> >>> The long-standing bug in GHC's garbage collector that makes writing a single
> >>> boxed value into an array O(n) instead of O(1), because the GC dirties and
> >>> retraverses the entire array, makes it impossible to write efficient Haskell
> >>> code to solve many basic problems.
> >>
> >> Are you talking about http://hackage.haskell.org/trac/ghc/ticket/650 ?
> >>
> >> The way I read that ticket is that writing a boxed value to a mutable array is still O(1), but the garbage collector traverses the whole array during
> >> scanning. That could certainly slow down GC cycles, but how does it make array update O(n)?
> >
> > He means in worst case. Writing once, will cause O(n) during the next GC. Of
> > course if you do a lot of updates between two GCs their is no extra penalty.
> > So if you make 'k' updates between 2 GCs it costs you k*O(1)+O(n) which is
> > still O(n) but practically nicer than k*O(n).
> >

> Hmm. I'd be careful about conflating algorithmic complexity with memory management issues. By the above reasoning, if I were to run any program using
> arrays on a system with a two space garbage collector (which copies all live objects during each GC) I could say the worst case algorithmic complexity was
> O(n). That doesn't sound right.

> I could take this further and say that in a virtual memory system, there is a chance that the whole heap gets copied to the disk and back between each
> array update. I might again say this has O(n) complexity, but it wouldn't be very helpful...

Your algorithm is O(1) but the current run-time system can take a time closer to O(n). This could be the case with a two spaces GC or with swapping.

--
Nicolas Pouillard
http://nicolaspouillard.fr
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Jon Harrop
In reply to this post by Ben Lippmeier
On Saturday 21 November 2009 11:56:09 Ben Lippmeier wrote:
> Hmm. I'd be careful about conflating algorithmic complexity with memory
> management issues.

No need. Just look at how badly the performance scales for Haskell vs other
languages. For example, inserting 1-16 million floating point key/values into
a hash table:

          Haskell         OCaml         F#
 1M:   3.198s   1.0x   1.129s  1.0x  0.080s  1.0x
 2M:   8.498s   2.7x   2.313s  2.0x  0.138s  1.7x
 4M:  25.697s   8.0x   4.567s  4.0x  0.281s  3.5x
 8M:  97.994s  30.6x  10.450s  9.3x  0.637s  8.0x
16M: 388.080s 121.4x  23.261s 20.6x  1.338s 16.7x

Note that Haskell is 290x slower than F# on that last test.

In practice, you would turn to a purely functional dictionary in Haskell based
upon balanced binary trees in order to work around this long-standing bug in
the GC but those trees incur O(log n) indirections and typically run orders
of magnitude slower than a decent hash table.

Suffice to say, Haskell is nowhere near being in the ballpark of C++'s
performance for basic functionality like dictionaries and sorting.

> By the above reasoning, if I were to run any program
> using arrays on a system with a two space garbage collector (which copies
> all live objects during each GC) I could say the worst case algorithmic
> complexity was O(n). That doesn't sound right.

Can you write a program that demonstrates this effect as I did?

> I could take this further and say that in a virtual memory system, there is
> a chance that the whole heap gets copied to the disk and back between each
> array update.

Can you write a program that demonstrates this effect as I did?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Nathan M. Holden
In reply to this post by Luis P. Mendes
I'm not a Haskell expert-- in fact, I'm a beginner, and that's why I'm here,
but I read your blog post on the subject of this hash table program, and it
seems to me from reading the comments in reply that you're just here trolling,
because you're using an algorithm that is fundamentally not purely functional,
and of course that's going to be slower, just like asking Joel Zumaya to pitch
left handed.

If you were being honest about your complaint, you'd make an apples-to-apples
comparison, and a number of commenters on your blog have proposed
implementations that perform much better than your example.

Sorry if I'm just being a jerk,
Nathan M. Holden,
Haskell Beginner

On Saturday 21 November 2009 12:55:03 pm [hidden email] wrote:

> Message: 4
> Date: Sat, 21 Nov 2009 18:02:33 +0000
> From: Jon Harrop <[hidden email]>
> Subject: Re: [Haskell-beginners] Re: Is Haskell for me?
> To: Ben Lippmeier <[hidden email]>
> Cc: beginners <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain;  charset="iso-8859-1"
>
> On Saturday 21 November 2009 11:56:09 Ben Lippmeier wrote:
> > Hmm. I'd be careful about conflating algorithmic complexity with memory
> > management issues.
>
> No need. Just look at how badly the performance scales for Haskell vs other
> languages. For example, inserting 1-16 million floating point key/values
>  into a hash table:
>
>           Haskell         OCaml         F#
>  1M:   3.198s   1.0x   1.129s  1.0x  0.080s  1.0x
>  2M:   8.498s   2.7x   2.313s  2.0x  0.138s  1.7x
>  4M:  25.697s   8.0x   4.567s  4.0x  0.281s  3.5x
>  8M:  97.994s  30.6x  10.450s  9.3x  0.637s  8.0x
> 16M: 388.080s 121.4x  23.261s 20.6x  1.338s 16.7x
>
> Note that Haskell is 290x slower than F# on that last test.
>
> In practice, you would turn to a purely functional dictionary in Haskell
>  based upon balanced binary trees in order to work around this
>  long-standing bug in the GC but those trees incur O(log n) indirections
>  and typically run orders of magnitude slower than a decent hash table.
>
> Suffice to say, Haskell is nowhere near being in the ballpark of C++'s
> performance for basic functionality like dictionaries and sorting.
>
> > By the above reasoning, if I were to run any program
> > using arrays on a system with a two space garbage collector (which copies
> > all live objects during each GC) I could say the worst case algorithmic
> > complexity was O(n). That doesn't sound right.
>
> Can you write a program that demonstrates this effect as I did?
>
> > I could take this further and say that in a virtual memory system, there
> > is a chance that the whole heap gets copied to the disk and back between
> > each array update.
>
> Can you write a program that demonstrates this effect as I did?
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Jon Harrop
On Saturday 21 November 2009 18:57:21 you wrote:

> I'm not a Haskell expert-- in fact, I'm a beginner, and that's why I'm
> here, but I read your blog post on the subject of this hash table program,
> and it seems to me from reading the comments in reply that you're just here
> trolling, because you're using an algorithm that is fundamentally not
> purely functional, and of course that's going to be slower, just like
> asking Joel Zumaya to pitch left handed.
>
> If you were being honest about your complaint, you'd make an
> apples-to-apples comparison, and a number of commenters on your blog have
> proposed implementations that perform much better than your example.
>
> Sorry if I'm just being a jerk,

Not at all, that is a perfectly reasonable concern but I did already try to
address it in my previous post:

> > In practice, you would turn to a purely functional dictionary in Haskell
> > based upon balanced binary trees in order to work around this
> > long-standing bug in the GC but those trees incur O(log n) indirections
> > and typically run orders of magnitude slower than a decent hash table.

For example, the following Haskell program builds a purely functional
Data.Map:

module Main where

import Prelude hiding (lookup)
import Data.Map (empty, insert, lookup, size)

n = 1000000

build m 0 = m
build m n = build (insert x (1.0 / x) m) (n-1)
  where x = fromIntegral n :: Double

main = do
  let
      m = build empty n
      (Just v) = lookup 100 m

  print $ size m
  print v

Running this program with different "n" gives:

        Data.Map
 1M:  2.797s  1.0x
 2M:  6.090s  2.2x
 4M: 14.226s  5.1x
 8M: 28.449s 10.2x
16M: 83.171s 29.7x

This is several times faster than Haskell's Data.Hashtable (because of the
long-standing bug in their GC that I described) and is scaling better.
However, if you compare with the timings I gave before:

      Data.Hashtable       OCaml          F#
 1M:   3.198s   1.0x   1.129s  1.0x  0.080s  1.0x
 2M:   8.498s   2.7x   2.313s  2.0x  0.138s  1.7x
 4M:  25.697s   8.0x   4.567s  4.0x  0.281s  3.5x
 8M:  97.994s  30.6x  10.450s  9.3x  0.637s  8.0x
16M: 388.080s 121.4x  23.261s 20.6x  1.338s 16.7x

you'll see that the absolute performance of this idiomatic Haskell solution is
still absolutely awful: consistently about 50x slower than the F#.

This is also true in the context of sorting: Haskell's standard library
routines for sorting are orders of magnitude slower than those found in most
other compiled languages.

Suffice to say, idiomatic Haskell is also nowhere near being in the same
ballpark as C++ with respect to performance. Realistically, with enough
expertise you should be able to optimize most of your Haskell programs to
beat Python's performance but there are some important cases where you will
not even be able to do that.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Reply | Threaded
Open this post in threaded view
|

Re: Is Haskell for me?

Ben Lippmeier
In reply to this post by Jon Harrop

On 22/11/2009, at 5:02 , Jon Harrop wrote:
>
>> By the above reasoning, if I were to run any program
>> using arrays on a system with a two space garbage collector (which copies
>> all live objects during each GC) I could say the worst case algorithmic
>> complexity was O(n). That doesn't sound right.
>
> Can you write a program that demonstrates this effect as I did?

I believe your numbers, but I don't have a F# install to test against ATM.


> Suffice to say, Haskell is nowhere near being in the ballpark of C++'s
> performance for basic functionality like dictionaries and sorting.


Sorry, I wasn't trying to start a pissing match. I just wanted to know what the issue was so I could avoid it in my own programs.

Thanks!

Ben.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20091121/76b2abd0/attachment-0001.html