representations of probability distributions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

representations of probability distributions

Olaf Klinke
Dear cafe,

following up the announcement of bali-phy and the discussion that ensued, I'd like ask the cafe for an overview of representations of probability distributions as data types. It seems that all packages mentioned to far use some flavour of Markov chain. Why? What else is there? Can someone recommend a survey publication/book/talk?

Explicitly I am interested in:
- the data type, as in Haskell's type system
- how an element of the data type relates to the distribution it describes
- how the monad instance works (if there is one)
- how to compute conditionals (if computable)
- how to compute integrals (if computable)
- what base types are supported (finite types, reals, products ...)

Below is an incomplete, and possibly oversimplified (or even wrong), list of my own.
Additions and corrections welcome.
-- Olaf

1. Markov chains
- uses a random number generator
- type is something like
  Markov a = a -> IO a
- repeated execution produces an infinite trace xs :: [a]
  where for each predicate p :: a -> Bool, the limit of
  (length (filter p (take n xs))) % n
  as n tends to infinity approaches the probability of p
  under the distribution modeled.
- monad instance?
- conditionals?
- integrals?
- supported base types a in Markov a?

2. Riesz functionals: possibly the most functional representation
- representation as linear functionals
  Riesz a = (a -> R) -> R
  for a suitable real number type R
- embed predicates in the type (a -> R) via characteristic functions
  characteristic p = \a -> if p a then 1 else 0
  Then for a functional phi :: Riesz a and p :: a -> Bool,
  phi (characteristic p) is the probability of p.
- Monad instance is the same as for the continuation monad
- conditionals?
- integration is trivial: evaluation
- arbitrary base types
- but how would you even declare a uniform distribution in this type?

3. Density functions
- represent a distribution on the reals as density function w.r.t. the Lebesgue measure
- limited to reals (and products thereof)
- data type possibly holds only parameters of parametric families (gaussian, beta, gamma, ...)
- computing conditionals only for matching pairs of prior and sampling function
  (conjugate priors), analytically
- integration analytically
- no monad instance because no arbitrary base type

4. Finite distributions
- type Dist a = [(a,R)] for suitable real number type R
- optimizations use search trees instead of flat lists
- basically composition of list and writer monads,
  where R is regarded as monoid under multiplication
- represents a distribution of finite support.
  If phi :: Dist a, then
  probability p = sum $ map snd $ filter (p.fst) phi
- conditioning possible if base type is of Eq class and R supports division
- gets unwieldy without Ord instance of base type to keep the list small

5. Valuations
- type Val a = (a -> Bool) -> R
  for suitable real number type R
- models mathematical definition closely
- monad bind is integration: requires exact reals R due to limit process
- conditioning possible if R supports division
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: representations of probability distributions

Benjamin Redelings
Hi,

On 07/03/2017 05:14 PM, Olaf Klinke wrote:
> Dear cafe,
>
> following up the announcement of bali-phy and the discussion that ensued, I'd like ask the cafe for an overview of representations of probability distributions as data types. It seems that all packages mentioned to far use some flavour of Markov chain. Why? What else is there? Can someone recommend a survey publication/book/talk?

One other class of inference strategies is Sequential Monte Carlo (aka
Particle filters). Unlike MCMC, these methods run multiple points
("particles") in parallel, and the points can have unequal weights.  The
point cloud is incrementally updated as information is added, so these
methods work well for online algorithms where you sequentially add data
points.  One idea in the probabilistic programming community is to
separate the inference strategy from the inference problem, so that one
could apply "pluggable" inference strategies. (See
https://arxiv.org/abs/1404.0099).

There are also "variational" methods, which I think restrict the
posterior distribution to a particular analytic form, and use an
iterative algorithm to find the parameter values of the restricted
distribution that minimize the KL distance to the true posterior
distribution.  So, this is not a Monte Carlo method.

> Explicitly I am interested in:
> - the data type, as in Haskell's type system
> - how an element of the data type relates to the distribution it describes
> - how the monad instance works (if there is one)
> - how to compute conditionals (if computable)
> - how to compute integrals (if computable)
> - what base types are supported (finite types, reals, products ...)
>
> Below is an incomplete, and possibly oversimplified (or even wrong), list of my own.
> Additions and corrections welcome.
> -- Olaf
>
> 1. Markov chains
> - uses a random number generator
> - type is something like
>    Markov a = a -> IO a
> - repeated execution produces an infinite trace xs :: [a]
>    where for each predicate p :: a -> Bool, the limit of
>    (length (filter p (take n xs))) % n
>    as n tends to infinity approaches the probability of p
>    under the distribution modeled.
> - monad instance?
> - conditionals?
> - integrals?
> - supported base types a in Markov a?
>
> 2. Riesz functionals: possibly the most functional representation
> - representation as linear functionals
>    Riesz a = (a -> R) -> R
>    for a suitable real number type R
> - embed predicates in the type (a -> R) via characteristic functions
>    characteristic p = \a -> if p a then 1 else 0
>    Then for a functional phi :: Riesz a and p :: a -> Bool,
>    phi (characteristic p) is the probability of p.
> - Monad instance is the same as for the continuation monad
> - conditionals?
> - integration is trivial: evaluation
> - arbitrary base types
> - but how would you even declare a uniform distribution in this type?
>
> 3. Density functions
> - represent a distribution on the reals as density function w.r.t. the Lebesgue measure
> - limited to reals (and products thereof)
> - data type possibly holds only parameters of parametric families (gaussian, beta, gamma, ...)
> - computing conditionals only for matching pairs of prior and sampling function
>    (conjugate priors), analytically
> - integration analytically
> - no monad instance because no arbitrary base type
>
> 4. Finite distributions
> - type Dist a = [(a,R)] for suitable real number type R
> - optimizations use search trees instead of flat lists
> - basically composition of list and writer monads,
>    where R is regarded as monoid under multiplication
> - represents a distribution of finite support.
>    If phi :: Dist a, then
>    probability p = sum $ map snd $ filter (p.fst) phi
> - conditioning possible if base type is of Eq class and R supports division
> - gets unwieldy without Ord instance of base type to keep the list small
>
> 5. Valuations
> - type Val a = (a -> Bool) -> R
>    for suitable real number type R
> - models mathematical definition closely
> - monad bind is integration: requires exact reals R due to limit process
> - conditioning possible if R supports division
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: representations of probability distributions

Olaf Klinke
In reply to this post by Olaf Klinke
Thanks for the link, Benjamin.

Can one give a Hasekll type to a particle filter?
In the arxiv paper you linked, I found this quite unnerving sentence:

"The divergence between the true conditioned distribution on execution traces and the distribution encoded by the program may depend strongly on what inference instructions are chosen and how they are interleaved with the incorporation of data."

Does this mean: Here is a programming language that may or may not compute what you meant it to do?
--Olaf
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: representations of probability distributions

David Banas-2
In reply to this post by Olaf Klinke
Hi Olaf,

I can’t quite tell whether this is apropos to your request, but thought I’d mention it:

-db


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Loading...