Announce: Probablistic programming in Haskell with bali-phy

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Announce: Probablistic programming in Haskell with bali-phy

Benjamin Redelings
Hi,

I'm working on a project to implement support for probabilistic
programming with models written as Haskell programs.  This is similar to
Bayesian graphical models (e.g. BUGS, JAGS) and to full probabilistic
programming systems like Church / Venture.

     http://www.bali-phy.org/models.php

This project is not very mature, but it is used as the core of the MCMC
software mentioned above, which is relatively mature. Probabilistic
programs are written using the probability monad.

I've implemented a Haskell interpreter in C++ that records execution
traces of the Haskell programs.  These execution traces depend on some
number of (random) variables, and when one of the variables is changed
(during MCMC), the parts of the execution trace that depend on the
changed variables are invalidated and recomputed.  This saves work by
not recomputing the probability of the new state from scratch.  I can
provide DOT graphs of the execution traces if anyone is interested.

The problem domain I've been focusing on is evolution of DNA sequences
on binary trees.  You can see some of the code for that problem domain here:

     https://github.com/bredelings/BAli-Phy/blob/master/modules/

The code for the interpreter is here:

https://github.com/bredelings/BAli-Phy/blob/master/src/computation/


-BenRI

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Announce: Probablistic programming in Haskell with bali-phy

Sergiu Ivanov-2
Hello Benjamin,

Thus quoth  Benjamin Redelings  at 15:47 on Tue, Jun 13 2017:
>
> I'm working on a project to implement support for probabilistic
> programming with models written as Haskell programs.  This is similar to
> Bayesian graphical models (e.g. BUGS, JAGS) and to full probabilistic
> programming systems like Church / Venture.
>
>      http://www.bali-phy.org/models.php

This project looks very interesting to me and seems to be a result of an
impressive amount of work!  Thank you for the announcement.

> Probabilistic programs are written using the probability monad.
>
> I've implemented a Haskell interpreter in C++ that records execution
> traces of the Haskell programs.

Is your probability monad an actual monad that you implemented in a
Haskell package, or is it a consequence of the fact that you are running
Haskell-like programs in your interpreter?

More generally: how deep can the interoperability between your project
and some other Haskell code go?

> These execution traces depend on some number of (random) variables,
> and when one of the variables is changed (during MCMC), the parts of
> the execution trace that depend on the changed variables are
> invalidated and recomputed.  This saves work by not recomputing the
> probability of the new state from scratch.  I can provide DOT graphs
> of the execution traces if anyone is interested.

That sounds really cool!

> The problem domain I've been focusing on is evolution of DNA sequences
> on binary trees.

I'm very much interested in your project because I often look into
non-deterministic evolution of some abstract computing devices, and you
actually seem to be focusing on a similar use case.

--
Sergiu

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

signature.asc (497 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Announce: Probablistic programming in Haskell with bali-phy

Benjamin Redelings
Hi Sergiu,

Thanks for your interest!

On 06/13/2017 12:30 PM, Sergiu Ivanov wrote:
> Probabilistic programs are written using the probability monad.
>> I've implemented a Haskell interpreter in C++ that records execution
>> traces of the Haskell programs.
> Is your probability monad an actual monad that you implemented in a
> Haskell package, or is it a consequence of the fact that you are running
> Haskell-like programs in your interpreter?
1. Its an actual monad.  It allows probability distributions to be
"performed" by a Haskell function that serves as an interpreter. But I
haven't implemented it in GHC.  There are actually two interpreters,
which are at the top of this file.  "IOAndPass" is <<=, and "IOAnd" is
<<. Please forgive the extreme hackiness :-P
https://github.com/bredelings/BAli-Phy/blob/master/modules/Distributions.hs
The (sample) function performs a distribution d by generating an IO
action to randomly sample from the distribution.  The (sample') function
performs a distribution by setting up a node in a (dynamic) graphical
model, which is also an IO action.

2. Actions in this monad can either
* sample from a distribution.
* specify that we should record samples of an expression.
* alter properties of variables sampled in a given scope.
* observe random outcomes.

For example in the code below, I specify that we should sample a
dirichlet containing n modifiable variables, but that MCMC should
perform resampling of the variables at a rate of 1/sqrt(n).

frequencies_model a = do {
   let {n_letters = alphabetSize a;
        letters = alphabet_letters a};
   SamplingRate (1.0/sqrt(intToDouble n_letters)) $ do {
      pi <- dirichlet' n_letters 1.0;
      sequence_ $ zipWith (\p l -> Log l p) pi letters;
      return pi
   }
};

So, the monad here is mainly used to tag variables sampled in a given
scope.  It might also be used to specify mcmc transition kernels for
these variables.

I'm not quite sure how to handle scopes outside monads yet, but Venture
does something like that, and it doesn't use monads.

3. The above code is "mochastic" because we perform the distribution in
a monadic interpreter in order to sample from it.  But we could also do
something non-monadic like

let x=sample $ normal 0 1
      y=sample $ gamma 1 1
in x*y

But Haskell itself isn't a monad.  For example, x and y aren't
sequenced.  Are there Haskell libraries to "run" Haskell programs in
different interpreters?

4. The low-level C++ interpreter for haskell in bali-phy is not
monadic.  Its basically a C++ implementation of something like the STG
machine, but modified to do incremental computation for generic
Haskell.  I actually used Sestoft (1997) "Deriving a Lazy Abstract
Machine" instead of the STG.  I guess I could have tried to do this by
modifying GHCi to track execution traces.  I'm not really sure if that
would be be a good idea or not.  Any thoughts?

> More generally: how deep can the interoperability between your project
> and some other Haskell code go?
Hmm.. well I guess we could implement *sampling* from the above monad in
GHC just by making the interpreter call unsafePerformIO when it sees a
probability distribution.  But implementing MCMC inference would require
reimplementing incremental computation for Haskell in GHC.


> These execution traces depend on some number of (random) variables,
>> and when one of the variables is changed (during MCMC), the parts of
>> the execution trace that depend on the changed variables are
>> invalidated and recomputed.  This saves work by not recomputing the
>> probability of the new state from scratch.  I can provide DOT graphs
>> of the execution traces if anyone is interested.
> That sounds really cool!
>
>> The problem domain I've been focusing on is evolution of DNA sequences
>> on binary trees.
> I'm very much interested in your project because I often look into
> non-deterministic evolution of some abstract computing devices, and you
> actually seem to be focusing on a similar use case.
Hmm... I guess the idea is to represent a distribution on the possible
execution histories of non-deterministic haskell programs. So, yeah,
maybe there is some overlap.

Huh... that actually might work!  Do you handle non-determinism
probabilistically?  Do you want to condition on observations, or just
run the abstract machine many times and record the distribution of what
happens?

-BenRI
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Announce: Probablistic programming in Haskell with bali-phy

Sergiu Ivanov-2
Hi Benjamin,

Thus quoth  Benjamin Redelings  at 20:46 on Thu, Jun 15 2017:

> On 06/13/2017 12:30 PM, Sergiu Ivanov wrote:
>>
>> Probabilistic programs are written using the probability monad.
>>> I've implemented a Haskell interpreter in C++ that records execution
>>> traces of the Haskell programs.
>> Is your probability monad an actual monad that you implemented in a
>> Haskell package, or is it a consequence of the fact that you are running
>> Haskell-like programs in your interpreter?
>
> 1. Its an actual monad.  It allows probability distributions to be
> "performed" by a Haskell function that serves as an interpreter. But I
> haven't implemented it in GHC.  There are actually two interpreters,
> which are at the top of this file.  "IOAndPass" is <<=, and "IOAnd" is
> <<. Please forgive the extreme hackiness :-P
> https://github.com/bredelings/BAli-Phy/blob/master/modules/Distributions.hs
OK, I see. (I suppose you meant to say that IOAndPass is =<<).

> The (sample) function performs a distribution d by generating an IO
> action to randomly sample from the distribution.  The (sample') function
> performs a distribution by setting up a node in a (dynamic) graphical
> model, which is also an IO action.

OK, so the difference is in the presence or absence of the graphical
representation of the model your are working with?

> 2. Actions in this monad can either
> * sample from a distribution.
> * specify that we should record samples of an expression.

Do you mean something like

  x <- doSomething `fmap` someDistribution

where x <- someDistribution would sample someDistribution?

> * alter properties of variables sampled in a given scope.

Oh, so you have a kind of state baked into your monad as well?

> For example in the code below, I specify that we should sample a
> dirichlet containing n modifiable variables, but that MCMC should
> perform resampling of the variables at a rate of 1/sqrt(n).
[...]

OK, I see, thanks for the example!

What do you mean by "modifiable variables"?  Are those just random
variables?

> So, the monad here is mainly used to tag variables sampled in a given
> scope.  It might also be used to specify mcmc transition kernels for
> these variables.

OK.

> I'm not quite sure how to handle scopes outside monads yet, but Venture
> does something like that, and it doesn't use monads.

Not sure what you mean by "handle scopes outside monads".

Some kind of scopes are handled without monads in a lot of programming
languages/frameworks :-)

> 3. The above code is "mochastic" because we perform the distribution in
> a monadic interpreter in order to sample from it.  But we could also do
> something non-monadic like
>
> let x=sample $ normal 0 1
>       y=sample $ gamma 1 1
> in x*y

Yeah, sure, because every monad is a functor.

> But Haskell itself isn't a monad.  For example, x and y aren't
> sequenced.

Not sure what you mean here again.

Something can be a monad or not a monad depending on how you look at it.
If you see Haskell code as a sequence of strings fed into the compiler,
it's quite monadic to my taste.  You probably mean something else.

> Are there Haskell libraries to "run" Haskell programs in different
> interpreters?

There is a number of different Haskell implementations [0] as well as
haskell-src-exts [1] which is a parser of Haskell implemented as a
Haskell library.  But you probably mean something else by "different
interpreters".

> 4. The low-level C++ interpreter for haskell in bali-phy is not
> monadic.  Its basically a C++ implementation of something like the STG
> machine, but modified to do incremental computation for generic
> Haskell.  I actually used Sestoft (1997) "Deriving a Lazy Abstract
> Machine" instead of the STG.  I guess I could have tried to do this by
> modifying GHCi to track execution traces.  I'm not really sure if that
> would be be a good idea or not.  Any thoughts?

My question here would be: why do you chose to write a Haskell
interpreter (or modifying an existing one) instead of writing everything
in Haskell?  (I believe that's also the question Olaf asked.)

(I don't really have any ideas on what would be a good way of
implementing a Haskell interpreter.)

>> More generally: how deep can the interoperability between your project
>> and some other Haskell code go?
> Hmm.. well I guess we could implement *sampling* from the above monad in
> GHC just by making the interpreter call unsafePerformIO when it sees a
> probability distribution.  But implementing MCMC inference would require
> reimplementing incremental computation for Haskell in GHC.

OK, so sampling can be natively implemented in Haskell.

What do you mean by "incremental computation"?

>>> The problem domain I've been focusing on is evolution of DNA sequences
>>> on binary trees.
>> I'm very much interested in your project because I often look into
>> non-deterministic evolution of some abstract computing devices, and you
>> actually seem to be focusing on a similar use case.
> Hmm... I guess the idea is to represent a distribution on the possible
> execution histories of non-deterministic haskell programs. So, yeah,
> maybe there is some overlap.

Yes, that's what I kind of had in my brain.

> Huh... that actually might work!  Do you handle non-determinism
> probabilistically?  Do you want to condition on observations, or just
> run the abstract machine many times and record the distribution of what
> happens?

When I did similar things for my research, I just explored all the
non-deterministic branches of execution for a couple of steps.  I used a
clot of Perl code which could filter out some branches.  I haven't
considered using a probabilistic approach, but it may actually be quite
interesting, I'll give it a think, thank you :-)

--
Sergiu

[0] https://wiki.haskell.org/Implementations

[1] http://hackage.haskell.org/package/haskell-src-exts

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

signature.asc (497 bytes) Download Attachment