documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs


Frederik Eaton-3
Hello,

I am interested in implementing some multi-threaded algorithms in
Haskell.

I have run into some documentation dead-ends. The documentation in
GHC.Conc is what I get when I search for "ghc pseq" on google, but it
doesn't document pseq and some other functions:

pseq
par
forkOnIO
childHandler
ensureIOManagerIsRunning

In particular, I wonder: How is pseq different from seq? Under what
circumstances is it used? I have looked at the source code so I see
that it is implemented in terms of 'seq' and 'lazy':

> -- "pseq" is defined a bit weirdly (see below)
> --
> -- The reason for the strange "lazy" call is that
> -- it fools the compiler into thinking that pseq  and par are non-strict in
> -- their second argument (even if it inlines pseq at the call site).
> -- If it thinks pseq is strict in "y", then it often evaluates
> -- "y" before "x", which is totally wrong.  
>
> pseq :: a -> b -> b
> pseq  x y = x `seq` lazy y

- does this mean pseq should be used instead of 'seq' when I want the
first argument to be evaluated first? And I am also curious about the
others, although par seems to be documented in Control.Parallel.

Also, the following functions in Control.Parallel.Strategies are not
documented, at least in Haddock:

(>|) :: Done -> Done -> Done
(>||) :: Done -> Done -> Done
using :: a -> Strategy a -> a
demanding :: a -> Done -> a
sparking :: a -> Done -> a
sPar :: a -> Strategy b
sSeq :: a -> Strategy b
r0 :: Strategy a
rwhnf :: Strategy a
rnf :: Strategy a
($|) :: (a -> b) -> Strategy a -> a -> b
($||) :: (a -> b) -> Strategy a -> a -> b
(.|) :: (b -> c) -> Strategy b -> (a -> b) -> a -> c
(.||) :: (b -> c) -> Strategy b -> (a -> b) -> a -> c
(-|) :: (a -> b) -> Strategy b -> (b -> c) -> a -> c
(-||) :: (a -> b) -> Strategy b -> (b -> c) -> a -> c
seqPair :: Strategy a -> Strategy b -> Strategy (a, b)
parPair :: Strategy a -> Strategy b -> Strategy (a, b)
seqTriple :: Strategy a -> Strategy b -> Strategy c -> Strategy (a, b, c)
parTriple :: Strategy a -> Strategy b -> Strategy c -> Strategy (a, b, c)
fstPairFstList :: NFData a => Strategy [(a, b)]
force :: NFData a => a -> a
sforce :: NFData a => a -> b -> b

The types Done and Strategy or the class NFData and related classes in
this module are also not documented in Haddock. If there is a paper
which defines all of these then it would be nice to have a link to the
paper in the module's documentation, for people to use until the
module's documentation itself can be updated.
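
The signatures listed above are consistent with a design in which a strategy is a function that evaluates its argument to some degree and returns unit, which is how "Algorithm + Strategy = Parallelism" (Trinder, Hammond, Loidl and Peyton Jones, 1998) describes them. The following is a minimal sketch under that assumption. The definitions are reconstructed from the listed types for illustration and are not copied from the library source; `par` is imported from GHC.Conc so that the sketch depends only on base.

```haskell
import GHC.Conc (par)

-- Assumed representation: a strategy returns unit once it has
-- evaluated its argument as far as it intends to.
type Done = ()
type Strategy a = a -> Done

-- Perform no evaluation at all.
r0 :: Strategy a
r0 _ = ()

-- Evaluate the argument to weak head normal form.
rwhnf :: Strategy a
rwhnf x = x `seq` ()

-- Apply a strategy to a value, then return the value itself.
using :: a -> Strategy a -> a
using x strat = strat x `seq` x

-- Evaluate both components of a pair, one after the other.
seqPair :: Strategy a -> Strategy b -> Strategy (a, b)
seqPair sa sb (a, b) = sa a `seq` sb b

-- Spark evaluation of the first component while evaluating the second.
parPair :: Strategy a -> Strategy b -> Strategy (a, b)
parPair sa sb (a, b) = sa a `par` sb b

main :: IO ()
main = print (pair `using` parPair rwhnf rwhnf)
  where
    pair = (sum [1 .. 1000 :: Int], product [1 .. 10 :: Int])
```

The point of the design is that `using` separates what is computed (the pair) from how it is evaluated (the strategy), so the evaluation scheme can be changed without touching the algorithm.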

As an aside, if you've read this far then you may know the answer to a
related question: is there a way to query how many processors the
current machine has? I am implementing a parallel sort, and in cases
such as sorting where one can decompose an algorithm into an
arbitrarily large number of threads, I am wondering how to tell what
the maximum useful number of threads is (usually this will be some
increasing function of the number of CPUs) to avoid the overhead of
spawning a thread when it is not needed. (I'm about to read
"Lightweight concurrency primitives for GHC" by Li et al, if that's
the right place to look)
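
One common way to bound the amount of parallelism in a divide-and-conquer sort is a depth limit: spark only the top few levels of the recursion and run sequentially below that. The sketch below is an illustration of that idea, not a tuned implementation. The names `parSort` and `forceList` are hypothetical, and the choice of depth (roughly the base-2 logarithm of the worker count, plus a little slack) is an assumption that would need measuring.

```haskell
import GHC.Conc (par, pseq)

-- Standard merge of two sorted lists.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Walk the whole list, forcing each element to WHNF, so that a
-- spark does the sorting work instead of stopping at the first cons.
forceList :: [a] -> ()
forceList = foldr seq ()

-- Merge sort that sparks the left half only while depth is positive.
parSort :: Ord a => Int -> [a] -> [a]
parSort _ []  = []
parSort _ [x] = [x]
parSort depth xs
  | depth <= 0 = merge (parSort 0 l) (parSort 0 r)
  | otherwise  = forceList l' `par` (forceList r' `pseq` merge l' r')
  where
    (l, r) = splitAt (length xs `div` 2) xs
    l' = parSort (depth - 1) l
    r' = parSort (depth - 1) r

main :: IO ()
main = print (parSort 2 [5, 3, 8, 1, 9, 2 :: Int])
```

With a depth of d the sort creates at most 2^d - 1 sparks, so the depth is the knob that ties the number of parallel tasks to the number of CPUs. Sparks only run in parallel when the program is built with -threaded and started with more than one capability (for example +RTS -N2).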

Thanks,

Frederik

--
http://ofb.net/~frederik/
_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Donald Bruce Stewart
frederik:

> Hello,
>
> I am interested in implementing some multi-threaded algorithms in
> Haskell.
>
> I have run into some documentation dead-ends. The documentation in
> GHC.Conc is what I get when I search for "ghc pseq" on google, but it
> doesn't document pseq and some other functions:
>
> pseq
> par

These guys are documented in the parallel Haskell stuff (and should be
in the haddocks!!)

   http://www.haskell.org/ghc/dist/current/docs/users_guide/lang-parallel.html 

> forkOnIO
> childHandler
> ensureIOManagerIsRunning

Not sure about these last ones.
 

> In particular, I wonder: How is pseq different from seq? Under what
> circumstances is it used? I have looked at the source code so I see
> that it is implemented in terms of 'seq' and 'lazy':
>
> > -- "pseq" is defined a bit weirdly (see below)
> > --
> > -- The reason for the strange "lazy" call is that
> > -- it fools the compiler into thinking that pseq  and par are non-strict in
> > -- their second argument (even if it inlines pseq at the call site).
> > -- If it thinks pseq is strict in "y", then it often evaluates
> > -- "y" before "x", which is totally wrong.  
> >
> > pseq :: a -> b -> b
> > pseq  x y = x `seq` lazy y
>
> - does this mean pseq should be used instead of 'seq' when I want the
> first argument to be evaluated first? And I am also curious about the
> others, although par seems to be documented in Control.Parallel.
>
> Also, the following functions in Control.Parallel.Strategies are not
> documented, at least in Haddock:
 
Yes, I don't understand either why Control.Parallel.Strategies isn't
documented. Please file a bug report!
 
> As an aside, if you've read this far then you may know the answer to a
> related question: is there a way to query how many processors the
> current machine has? I am implementing a parallel sort, and in cases

There's no built-in way, but there's an open Trac ticket for this.

> such as sorting where one can decompose an algorithm into an
> arbitrarily large number of threads, I am wondering how to tell what
> the maximum useful number of threads is (usually this will be some
> increasing function of the number of CPUs) to avoid the overhead of
> spawning a thread when it is not needed. (I'm about to read
> "Lightweight concurrency primitives for GHC" by Li et al, if that's
> the right place to look)

I usually just ensure the 'n' value is available as a command line flag
to my program.

-- Don
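
The command-line approach can be sketched as follows. This is a minimal illustration in which the helper name `chooseWorkers` is hypothetical. It also uses `getNumProcessors` and `getNumCapabilities` from GHC.Conc as the fallback; as far as I know these were added to base well after this thread (base 4.5 onwards), so they are an assumption about the reader's GHC version and were not available at the time of writing.

```haskell
import GHC.Conc (getNumCapabilities, getNumProcessors)
import System.Environment (getArgs)
import Text.Read (readMaybe)

-- Pick a worker count: a positive integer given as the first
-- argument wins; anything else falls back to the supplied default.
chooseWorkers :: Int -> [String] -> Int
chooseWorkers fallback args =
  case args of
    (a : _) | Just n <- readMaybe a, n > 0 -> n
    _ -> fallback

main :: IO ()
main = do
  args <- getArgs
  cpus <- getNumProcessors    -- processors reported by the OS
  caps <- getNumCapabilities  -- Haskell threads that can run at once
  let workers = chooseWorkers cpus args
  putStrLn ("processors: " ++ show cpus
            ++ ", capabilities: " ++ show caps
            ++ ", workers: " ++ show workers)
```

Keeping the explicit flag as the first choice means the value can still be overridden when the detected processor count is not the right number, for instance on a shared machine.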

Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Esa Ilari Vuokko
In reply to this post by Frederik Eaton-3
Hi,

On 7/15/07, Frederik Eaton <[hidden email]> wrote:
> As an aside, if you've read this far then you may know the answer to a
> related question: is there a way to query how many processors the
> current machine has? I am implementing a parallel sort, and in cases

On Windows, the Win32 package provides System.Win32.getSystemInfo
(field siNumberOfProcessors).

HTH,
Esa

Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Malcolm Wallace
In reply to this post by Frederik Eaton-3
Frederik Eaton <[hidden email]> wrote:

> In particular, I wonder: How is pseq different from seq? Under what
> circumstances is it used?

`pseq` is a "genuine" operational sequence operator.  Haskell'98's
    x `seq` y
does not guarantee that x is evaluated to WHNF before y, whereas `pseq`
does guarantee exactly this.

Regards,
    Malcolm
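
The usual place this ordering guarantee matters is next to `par`. A minimal sketch, assuming the common spark-one-branch idiom: `par` creates a spark for one subcomputation, and `pseq` makes sure the current thread works on the other subcomputation before it tries to combine the two. The function names `fib` and `parFib` and the cutoff of 20 are illustrative choices. The imports come from GHC.Conc so that the sketch depends only on base; Control.Parallel exports the same two functions.

```haskell
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a stand-in for expensive work.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark the left branch, evaluate the right branch on this thread,
-- and only then add the two results together.
parFib :: Int -> Integer
parFib n
  | n < 20    = fib n   -- below the cutoff, sparking costs more than it saves
  | otherwise = left `par` (right `pseq` (left + right))
  where
    left  = parFib (n - 1)
    right = parFib (n - 2)

main :: IO ()
main = print (parFib 25)
```

If `seq` were used in place of `pseq` here, the compiler would be free to evaluate `left + right` in an order that demands `left` on the current thread first, which would leave the spark with nothing to do. The program still computes the same value either way; only the parallel behaviour differs, and it only shows up when built with -threaded and run with more than one capability.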

Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Donald Bruce Stewart
Malcolm.Wallace:
> Frederik Eaton <[hidden email]> wrote:
>
> > In particular, I wonder: How is pseq different from seq? Under what
> > circumstances is it used?
>
> `pseq` is a "genuine" operational sequence operator.  Haskell'98's
>     x `seq` y
> does not guarantee that x is evaluated to WHNF before y, whereas `pseq`
> does guarantee exactly this.

Good to see the documentation for parallel strategies is improved!

    http://www.haskell.org/ghc/dist/current/docs/parallel/Control-Parallel-Strategies.html

What can we do to make it easier to get into concurrent and parallel
Haskell programming? It's really one of the killer features, but it is
under-documented and under-blogged-about.

Meanwhile the Erlang guys use parMap in every blog I see, despite their
slow compiler and awful string support ;)

-- Don (pondering how to steal parallel programming mindshare)

Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Simon Marlow-5
Donald Bruce Stewart wrote:

> What can we do to make it easier to get into concurrent and parallel
> Haskell programming? It's really one of the killer features, but it is
> under-documented and under-blogged-about.

We have some new tools coming, perhaps for GHC 6.8.  Michael Adams has ported
the GranSim parallel performance-analysis tools to GHC, so we can generate
graphs of runnable/running/blocked threads over time, amongst other things.
This should make it much easier to get a handle on parallel performance, which I
think is the area that we're most lacking at the moment.  I'll try to get the
patches in as soon as possible.

Cheers,
        Simon

Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Simon Marlow-5
In reply to this post by Frederik Eaton-3
Frederik Eaton wrote:

> Hello,
>
> I am interested in implementing some multi-threaded algorithms in
> Haskell.
>
> I have run into some documentation dead-ends. The documentation in
> GHC.Conc is what I get when I search for "ghc pseq" on google, but it
> doesn't document pseq and some other functions:
>
> pseq
> par
> forkOnIO
> childHandler
> ensureIOManagerIsRunning
>
> In particular, I wonder: How is pseq different from seq?

See the current library docs:

http://www.haskell.org/ghc/dist/current/docs/libraries/base/Control-Parallel.html

This is what will appear in GHC 6.8, except that Control.Parallel has now
moved into the parallel package (and the online docs don't seem to reflect
this, which is strange - Ian, any ideas?).

In GHC 6.6.x you need to get pseq from GHC.Conc.  We noticed this mistake
only after the 6.6 release.

> Also, the following functions in Control.Parallel.Strategies are not
> documented, at least in Haddock:

That module has better documentation now, follow the link above.

Cheers,
        Simon

Re[2]: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Bulat Ziganshin-2
Hello Simon,

Tuesday, July 24, 2007, 12:19:18 PM, you wrote:

> See the current library docs:
> http://www.haskell.org/ghc/dist/current/docs/libraries/base/Control-Parallel.html
> This is what will appear in GHC 6.8

Shouldn't the */current/* branch provide documentation for the current
release version, i.e. 6.6.*?

--
Best regards,
 Bulat                            mailto:[hidden email]


Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Simon Marlow-5
Bulat Ziganshin wrote:

> Hello Simon,
>
> Tuesday, July 24, 2007, 12:19:18 PM, you wrote:
>
>> See the current library docs:
>> http://www.haskell.org/ghc/dist/current/docs/libraries/base/Control-Parallel.html
>> This is what will appear in GHC 6.8
>
> Shouldn't the */current/* branch provide documentation for the current
> release version, i.e. 6.6.*?

No, "stable" is the stable branch (i.e. 6.6.x); "current" means HEAD.
Perhaps I should have called it "head". I was using terminology from
FreeBSD at the time, I think.

Cheers,
        Simon


Re: documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Ian Lynagh
In reply to this post by Simon Marlow-5
On Tue, Jul 24, 2007 at 09:19:18AM +0100, Simon Marlow wrote:
>
> See the current library docs:
>
> http://www.haskell.org/ghc/dist/current/docs/libraries/base/Control-Parallel.html
>
> This is what will appear in GHC 6.8, except that Control.Parallel has now
> moved into the parallel package (and the online docs don't seem to reflect
> this, which is strange - Ian, any ideas?).

The up-to-date doc is here:
http://www.haskell.org/ghc/dist/current/docs/parallel/Control-Parallel.html

I'll investigate why it's not ending up in the right place.

I think I'll move everything to head/ rather than current/ at the same
time, leaving a symlink for existing links, as that will hopefully cause
less confusion.


Thanks
Ian
