Replace Random

Replace Random

Dominic Steinitz-2
Hello libraries,

Following a great blog [post] by @lehins, a group of us (@curiousleo,
@lehins and me) are trying to improve the situation with the `random'
library.

@curiousleo and I have created a [resource] that tests the quality of
Haskell random number generators via well-known (and not so well-known)
test suites: dieharder, TestU01, PractRand and gjrand. The current
`random' does not fare well, especially when the `split' function is
used; see [this] for example. This is well known and is the [reason]
why `QuickCheck' moved from using `random' in [2.8] to [tf-random] in
[2.9] and latterly to [splitmix] in [2.13]. On the other hand,
`splitmix' [passes] BigCrush[1].

The putative proposal is to replace the current algorithm in `random'
with that of `splitmix'[2] and to remove the performance bottleneck by
changing the interface. The current interface causes the performance
infelicity by making "all of the integral numbers go through the
arbitrary precision Integer in order to produce the value in a desired
range"; see @lehins's blog [post] for more details.
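
To make the quoted bottleneck concrete, here is a minimal sketch. It is
not the code of `random' or of the proposed interface. It assumes a toy
SplitMix64-style step function (`nextWord64', a name chosen only for
this illustration) and compares a range reduction that detours through
Integer with one that stays in Word64. Both use a plain modulo, which is
slightly biased; that is good enough to show the types involved but not
something to ship.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)

-- | Toy SplitMix64-style step with a fixed increment: returns one
-- output word and the next state. Illustration only.
nextWord64 :: Word64 -> (Word64, Word64)
nextWord64 s = (mix s', s')
  where
    s' = s + 0x9e3779b97f4a7c15
    mix z0 =
      let z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
          z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
      in z2 `xor` (z2 `shiftR` 31)

-- | Range reduction that converts everything to Integer and back,
-- the pattern the quoted sentence describes. Assumes lo <= hi.
viaInteger :: (Word64, Word64) -> Word64 -> (Word64, Word64)
viaInteger (lo, hi) s = (fromInteger (toInteger lo + toInteger w `mod` n), s')
  where
    (w, s') = nextWord64 s
    n = toInteger hi - toInteger lo + 1

-- | The same reduction carried out entirely in Word64 arithmetic.
-- Assumes lo <= hi.
viaWord64 :: (Word64, Word64) -> Word64 -> (Word64, Word64)
viaWord64 (lo, hi) s
  | width == maxBound = (w, s')                 -- full range: width + 1 would overflow
  | otherwise         = (lo + w `mod` (width + 1), s')
  where
    (w, s') = nextWord64 s
    width = hi - lo
```

Both functions return the same value for the same state and range; the
second simply never allocates an arbitrary-precision number, which is
the kind of saving an interface built around fixed-width words makes
possible.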

Could anyone interested please:

* Create a separate issue for each concern they have (e.g. range for
  floats (0, 1] vs [0, 1], etc.) [here].
* Submit PRs targeting the [interface-to-performance] branch (or
  master if it is a vastly different approach) with your suggested
  alternatives.
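
The float-range concern in the first bullet can be illustrated with a
small sketch. The function names are hypothetical and nothing here is
taken from `random' or from the proposal; it only shows how the same 64
random bits land in [0, 1), (0, 1] or [0, 1] depending on the convention
chosen.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64)

-- | Exactly 2^-53, the spacing of the grid used below.
scale :: Double
scale = encodeFloat 1 (-53)

-- | Keep the top 53 bits, giving an integer k with 0 <= k < 2^53.
top53 :: Word64 -> Double
top53 w = fromIntegral (w `shiftR` 11)

-- | [0, 1): k * 2^-53. Zero is possible, one is not.
unitExclOne :: Word64 -> Double
unitExclOne w = top53 w * scale

-- | (0, 1]: (k + 1) * 2^-53. One is possible, zero is not.
unitExclZero :: Word64 -> Double
unitExclZero w = (top53 w + 1) * scale

-- | [0, 1]: k / (2^53 - 1). Both endpoints are possible.
unitClosed :: Word64 -> Double
unitClosed w = top53 w / (2 ^ (53 :: Int) - 1)
```

Which endpoints are reachable matters to callers that take a logarithm
or divide by the result, which is one reason the choice deserves its
own issue.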

If you are going to raise a concern then it might be worth reading some
of the [discussions] that have already taken place.

We think that once we have the API fleshed out, switching to splitmix
will be a piece of cake: it will require adding just a few lines of code
and removing the current StdGen-related functionality. For historical
reasons, instead of removing StdGen we could move it into a separate
module with a disclaimer not to use it, but that isn't terribly
important.

The Random Team (@lehins, @curiousleo, @idontgetoutmuch)


[post] https://alexey.kuleshevi.ch/blog/2019/12/21/random-benchmarks/

[resource] https://github.com/tweag/random-quality

[reason]
http://publications.lib.chalmers.se/records/fulltext/183348/local_183348.pdf

[2.8] https://hackage.haskell.org/package/QuickCheck-2.8

[tf-random] https://hackage.haskell.org/package/tf-random

[2.9] https://hackage.haskell.org/package/QuickCheck-2.9

[splitmix] https://hackage.haskell.org/package/splitmix

[2.13] https://hackage.haskell.org/package/QuickCheck-2.13

[this]
https://github.com/tweag/random-quality/blob/master/results/random-word32-split-practrand-1gb

[passes]
https://github.com/tweag/random-quality/blob/master/results/splitmix-word32-testu01-bigcrush

[here] https://github.com/idontgetoutmuch/random/issues

[interface-to-performance]
https://github.com/idontgetoutmuch/random/tree/interface-to-performance

[discussions] https://github.com/idontgetoutmuch/random/pull/1



Footnotes
_________

[1] Just to clarify: both random and splitmix pass BigCrush. random
fails any statistical test immediately (e.g. [SmallCrush]
(https://github.com/tweag/random-quality/blob/master/results/random-word32-split-testu01-smallcrush#L337-L349)
and other even smaller ones) when a sequence based on split is
used. splitmix passes Crush when split is part of the sequence, but
I've seen it fail one test in BigCrush ("LinearComp"). So we should
just be careful here: splitmix itself passes BigCrush and split-based
sequences all pass Crush, but not all pass BigCrush.
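
For readers unsure what "a sequence based on split" means, here is one
plausible shape, sketched with a toy splittable generator. Everything in
it (`Gen', `mix', `next', `split', `plain', `splitBased') is defined
only for this illustration; it is neither the `split' of `random' or
`splitmix' nor necessarily the construction used in the [resource]
repository.

```haskell
import Data.Bits (shiftR, xor, (.|.))
import Data.List (unfoldr)
import Data.Word (Word64)

-- | Toy splittable generator: a state and an odd increment ("gamma").
data Gen = Gen !Word64 !Word64

-- | 64-bit finaliser used to scramble the state.
mix :: Word64 -> Word64
mix z0 = z2 `xor` (z2 `shiftR` 31)
  where
    z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb

-- | Draw one word and advance the state.
next :: Gen -> (Word64, Gen)
next (Gen s g) = (mix s', Gen s' g)
  where s' = s + g

-- | Toy split: the left generator continues with the old gamma, the
-- right one gets a fresh state and a fresh odd gamma.
split :: Gen -> (Gen, Gen)
split (Gen s g) = (Gen s2 g, Gen (mix s1) (mix s2 .|. 1))
  where
    s1 = s + g
    s2 = s1 + g

-- | Sequence that never splits: the generator "itself".
plain :: Gen -> [Word64]
plain = unfoldr (Just . next)

-- | Split-based sequence: split at every step, take one word from the
-- left generator, carry on with the right one.
splitBased :: Gen -> [Word64]
splitBased = unfoldr step
  where
    step g = let (l, r) = split g in Just (fst (next l), r)
```

A test suite fed with `plain' exercises only the stepping function,
while one fed with `splitBased' also exercises how well `split'
decorrelates the generators it hands out, which is where the results
above differ.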

[2] `split' is already available as an instance method: `instance
RandomGen SMGen where'.

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Fix Random

Carter Schonwald
Yup. I have had some of these changes in a branch for a while. I was just uncertain whether those who share the common interface are OK with the current typeclass interface breaking.

Sounds like everyone’s willing to deal with the needed breakage. 

I’ll have a release candidate to share publicly in a few weeks.  


Meta point: Dominic, if you want to communicate with me, try email. You seem to prefer anything but direct, concrete communication with me. OTOH I guess we are just terrible at communicating with each other. And I guess that’s fine. Just unfortunate.


Re: Fix Random

Carter Schonwald
Thanks for the kick start. I was taking some downtime, with personal and work projects as my focus, as a break after releasing the new vector last month. Random is at the top of my OSS queue at the moment.

Re: Fix Random

Dominic Steinitz-2
In reply to this post by Carter Schonwald
Hi Carter,

That’s great :) May I suggest you make a PR of your proposal against
the latest release on Hackage?

That way everyone will be able to see what the changes are against the
latest release.

The proposal @lehins, @curiousleo and I have been working on is here:


An alternative would be to create a branch from the v1.1 tag and then we
could all submit PRs against that branch.

Dominic Steinitz
Twitter: @idontgetoutmuch

On 17 Feb 2020, at 16:05, Carter Schonwald <[hidden email]> wrote:

Yup. I have had some of these changes in a branch for a while. I was just uncertain whether those who share the common interface are OK with the current typeclass interface breaking.

Sounds like everyone’s willing to deal with the needed breakage. 

I’ll have a release candidate to share publicly in a few weeks.  

