Hi,
0. Does anyone know of any simple extensions of the HM type system to non-deterministic functions? The reason that I'm asking is that for probabilistic programming in the lambda calculus, there are two ways of writing expressions: (a) stochastic: let x = sample $ normal 0 1 in x*x or simply (sample $ normal 0 1)^2 (b) "mochastic": do {x <- normal 0 1; return (x*x)} The "mo" in the second one refers to the use of monads. That is the approach taken in the paper "Practical Probabilistic Programming with monads" (http://mlg.eng.cam.ac.uk/pub/pdf/SciGhaGor15.pdf) which I really enjoyed. However, I am interested in the stochastic form here. There are a number of reasons, such as the fact that the monadic representation forces serialization on things that need not be serial. In fact, though, I'm not trying to prove which one is best, I am just interested in exploring the non-monadic approach as well. 1. So, is it possible to do a simple extension to the type system to express non-determinism? I found this paper (Implicit Self-Adjusting Computation for Purely Functional Programs) that uses "level" tags on types to express either (i) security or (ii) changeability. The first idea (for example) is that each type is tagged with one of two "levels", say Public and Secure, so that we actually have Int[Public] or Int[Secure]. Any function that consumes a Secure value must (i) must return a Secure type and (ii) has the arrow in its type labelled with [Secure]. I won't explain the "changeable" idea because its kind of complicated, but I am very interested in it. 2. This is kind of tangential to the point of my question, but to explain the examples below, it might be important to distinguish sampling from a distribution from the distribution itself. So, normal 0 1 won't generate a random sample. Instead, normal 0 1 () will generate a random sample. This allows us to pass (normal 0 1) to another function which applies it multiple times to generate multiple samples from the same distribution. -- sample from a distribution dist sample dist = dist () --- take n samples from a distribution dist iid n dist = take n (map sample $ repeat dist) Here we see some of the value of using the stochastic approach, versus the "mochastic" approach: we can use normal Haskell syntax to handle lists of random values! 3. So, I'm wondering if its possible to extend the HM type system to handle non-determinism in a similar fashion by either (i) having some function types be non-deterministic and/or (ii) having term types be non-deterministic. Taking the second approach, I suggest tagging each type with level [D] (for deterministic) or [N] (for non-deterministic). Notation-wise, if a determinism level is unspecified, then this means (I think) quantifying over determinism levels. A function that samples from the normal distribution we would get a type like: normal:: double -> double -> () -> double[N] Our goal would be that an expression that consumes a non-deterministic expression must itself be non-deterministic, and any function that takes a non-deterministic input must have a non-deterministic output. We could implement that using rules something like this, where {a,b} are type variables and {l1,l2} are level variables. x:a[l1] :: a[l1] \x:a[l1] -> E:b[l2] :: a[l1] -> b[max l1 l2] E[a[l1]->b[l2]] E[a[l]] :: b[l2] The idea is that max l1 l2 would yield N (non-deterministic) if either l1=N or l2=N, because N > D. 4. Putting non-determinism into the type system would affect GHC in a few ways: (a) we shouldn't pull non-deterministic expressions out of lambdas: We should NOT change \x -> let y=sample $ normal 0 1 in y+x into let y = sample $ normal 0 1 in \x -> y+x (b) we should merge variables with identical values if the types are non-deterministic. For example it is OK to change let {x=normal 0 1; y = normal 0 1 in (sample x * sample y)} into let {x=normal 0 1} in sample x However it is NOT OK to change let {x=sample $ normal 0 1; y = sample $ normal 0 1} in x*y into let {x=sample $ normal 0 1} in x*x Perhaps this would be useful in other contexts? 5. If what I've written makes sense, then the types of the functions 'sample' and 'iid' would be: sample:: (()->a[N]) -> a[N] iid:: Int -> (() -> a[N]) -> [a[N]] 6. This is quite a long e-mail, so to summarize, I am interested in whether or not there are any simple systems for putting non-determinism into HM. Is the use of tagged types known NOT to work? Is there are work on this that I should be aware of? Any help much appreciated! :-) take care, -BenRI _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Hi Benjamin
Let's see what you ask for, you have *new* syntax for types: a[N] and a[D] what are a[N][N] or a[N][D] or a[N][D] or a[D][D]? Aren't they a[N], a[N], a[N] and a[D] respectively? That's what monads are about! So a[N] ~ Distr a a[D] ~ Identity a ~ a No need to complicate type-system! You just to not be afraid of monads! Monads aren't sequencing, they are computational context. I guess, you just want more natural term-level syntax. You can use ApplicativeDo [1] (in GHC-8.0+), so e.g. do x <- normal 0 1 y <- normal 0 1 return (f x y) will be transformed into liftA2 f (normal 0 1) (normal 0 1) That's almost like f (normal 0 1) (normal 0 1) if you have proper syntax highlighting ;) Note: various term syntax extensions been proposed. E.g. idiom brackets in the "Applicative programming with effects" [2]: (| f (normal 0 1) (normal 0 1) |) to mean pure f <*> normal 0 1 <*> normal 0 1 which is equivalent to above liftA2 expression. If you like that, you can check "the Strathclyde Haskell Enhancement", it supports idiom brackets. [1] https://www.microsoft.com/en-us/research/publication/desugaring-haskells-do-notation-into-applicative-operations/ [2] http://strictlypositive.org/IdiomLite.pdf [3] https://personal.cis.strath.ac.uk/conor.mcbride/pub/she/ On 11.01.2018 18:27, Benjamin Redelings wrote: > Hi, > > 0. Does anyone know of any simple extensions of the HM type system to > non-deterministic functions? The reason that I'm asking is that for > probabilistic programming in the lambda calculus, there are two ways > of writing expressions: > > (a) stochastic: let x = sample $ normal 0 1 in x*x > > or simply (sample $ normal 0 1)^2 > > (b) "mochastic": do {x <- normal 0 1; return (x*x)} > > The "mo" in the second one refers to the use of monads. That is the > approach taken in the paper "Practical Probabilistic Programming with > monads" (http://mlg.eng.cam.ac.uk/pub/pdf/SciGhaGor15.pdf) which I > really enjoyed. > > However, I am interested in the stochastic form here. There are a > number of reasons, such as the fact that the monadic representation > forces serialization on things that need not be serial. In fact, > though, I'm not trying to prove which one is best, I am just > interested in exploring the non-monadic approach as well. > > 1. So, is it possible to do a simple extension to the type system to > express non-determinism? I found this paper (Implicit Self-Adjusting > Computation for Purely Functional Programs) that uses "level" tags on > types to express either (i) security or (ii) changeability. The first > idea (for example) is that each type is tagged with one of two > "levels", say Public and Secure, so that we actually have Int[Public] > or Int[Secure]. Any function that consumes a Secure value must (i) > must return a Secure type and (ii) has the arrow in its type labelled > with [Secure]. I won't explain the "changeable" idea because its kind > of complicated, but I am very interested in it. > > 2. This is kind of tangential to the point of my question, but to > explain the examples below, it might be important to distinguish > sampling from a distribution from the distribution itself. So, normal > 0 1 won't generate a random sample. Instead, normal 0 1 () will > generate a random sample. This allows us to pass (normal 0 1) to > another function which applies it multiple times to generate multiple > samples from the same distribution. > > -- sample from a distribution dist > sample dist = dist () > > --- take n samples from a distribution dist > iid n dist = take n (map sample $ repeat dist) > > Here we see some of the value of using the stochastic approach, versus > the "mochastic" approach: we can use normal Haskell syntax to handle > lists of random values! > > 3. So, I'm wondering if its possible to extend the HM type system to > handle non-determinism in a similar fashion by either (i) having some > function types be non-deterministic and/or (ii) having term types be > non-deterministic. Taking the second approach, I suggest tagging > each type with level [D] (for deterministic) or [N] (for > non-deterministic). Notation-wise, if a determinism level is > unspecified, then this means (I think) quantifying over determinism > levels. A function that samples from the normal distribution we would > get a type like: > > normal:: double -> double -> () -> double[N] > > Our goal would be that an expression that consumes a non-deterministic > expression must itself be non-deterministic, and any function that > takes a non-deterministic input must have a non-deterministic > output. We could implement that using rules something like this, > where {a,b} are type variables and {l1,l2} are level variables. > > x:a[l1] :: a[l1] > \x:a[l1] -> E:b[l2] :: a[l1] -> b[max l1 l2] > E[a[l1]->b[l2]] E[a[l]] :: b[l2] > > The idea is that max l1 l2 would yield N (non-deterministic) if either > l1=N or l2=N, because N > D. > > 4. Putting non-determinism into the type system would affect GHC in a > few ways: > > (a) we shouldn't pull non-deterministic expressions out of lambdas: > > We should NOT change > \x -> let y=sample $ normal 0 1 in y+x > into > let y = sample $ normal 0 1 in \x -> y+x > > (b) we should merge variables with identical values if the types are > non-deterministic. > > For example it is OK to change > let {x=normal 0 1; y = normal 0 1 in (sample x * sample y)} > into > let {x=normal 0 1} in sample x > > However it is NOT OK to change > let {x=sample $ normal 0 1; y = sample $ normal 0 1} in x*y > into > let {x=sample $ normal 0 1} in x*x > > Perhaps this would be useful in other contexts? > > 5. If what I've written makes sense, then the types of the functions > 'sample' and 'iid' would be: > > sample:: (()->a[N]) -> a[N] > > iid:: Int -> (() -> a[N]) -> [a[N]] > > 6. This is quite a long e-mail, so to summarize, I am interested in > whether or not there are any simple systems for putting > non-determinism into HM. Is the use of tagged types known NOT to > work? Is there are work on this that I should be aware of? > > Any help much appreciated! :-) > > take care, > > -BenRI > > _______________________________________________ > Haskell-Cafe mailing list > To (un)subscribe, modify options or view archives go to: > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe > Only members subscribed via the mailman list are allowed to post. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. signature.asc (836 bytes) Download Attachment |
Hi Oleg,
I like monads, really I do. I am just not asking a question about how I can (or should) use monads. I guess we can split my question into two branches: (a) why are monads a perfect solution to my problem? (b) can we extend the HM type system to support non-determinism directly? What I am actually interested in is (b), so I don't want to get sidetracked with (a) if it means ignoring (b). I will think if I can respond to (a) without completely getting side-tracked and ignoring (b). Does that make sense? take care, -BenRI P.S. Thanks for the links! _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
(a) Non-determism is an effect, e.g. simple list is non-determinism
monad, for small discrete distributions! (b) Yes. We can write effectful code "implicitly" - You might look into *Automatically Escaping Monads* - https://www.youtube.com/watch?v=wG8AErq6Bbo, slides: http://benl.ouroborus.net/talks/2016-HIW-Escape.pdf - http://disciple.ouroborus.net/ or https://github.com/DDCSF/ddc Interstingly, while searching for the paper, I stumbled upon Oleg Kiselyov's (not me) paper from *Effects Without Monads: Non-determinism*, which is a different approach. Maybe that's what you are looking after http://okmij.org/ftp/tagless-final/nondet-paper.pdf - Oleg On 11.01.2018 21:41, Benjamin Redelings wrote: > Hi Oleg, > > I like monads, really I do. I am just not asking a question about how > I can (or should) use monads. > > I guess we can split my question into two branches: > (a) why are monads a perfect solution to my problem? > (b) can we extend the HM type system to support non-determinism directly? > > What I am actually interested in is (b), so I don't want to get > sidetracked with (a) if it means ignoring (b). I will think if I can > respond to (a) without completely getting side-tracked and ignoring > (b). Does that make sense? > > take care, > > -BenRI > > P.S. Thanks for the links! > _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. 0xB8BB0BA4.asc (28K) Download Attachment |
In reply to this post by Benjamin Redelings
> 3. So, I'm wondering if its possible to extend the HM type system to > handle non-determinism in a similar fashion by either (i) having some > function types be non-deterministic and/or (ii) having term types be > non-deterministic. Taking the second approach, I suggest tagging each > type with level [D] (for deterministic) or [N] (for > non-deterministic). Notation-wise, if a determinism level is > unspecified, then this means (I think) quantifying over determinism > levels. A function that samples from the normal distribution we would > get a type like: > > normal:: double -> double -> () -> double[N] Keep in mind that I don't know the first thing about type systems. I just babble about what feels right. With this said, this sounds like a completely new kind of term to me. Any by "kind", I mean as in "k" or "*". So some of the original examples could look like {-# LANGUAGE KindSignatures, NonDeterminism #-} normal :: Double -> Double -> () -> Double :: n sample :: (() -> a :: n) -> a :: n iid :: Int -> (() -> a :: n) -> [a :: n] constN :: (a :: *) -> (b :: n) -> (a :: *) -- also this: non-deterministic argument, but deterministic result where "n" is a new kind annotation replacing the [N] and the implicit "*" replaces the [D]. So the rules would have to be on the kind-level I suppose? Cheers. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Benjamin Redelings
On Thu, Jan 11, 2018 at 1:41 PM, Benjamin Redelings
<[hidden email]> wrote: > (a) why are monads a perfect solution to my problem? > (b) can we extend the HM type system to support non-determinism directly? Yes, there are type systems that express effects differently. No, they don't really do anything different than monads would do. The classic paper on the subject is probably Wadler's "The Marriage of Effects and Monads". http://homepages.inf.ed.ac.uk/wadler/papers/effectstocl/effectstocl.pdf That said, there are a variety of other language features building on the idea of effects. Look for any of the literature on algebraic effects or effect handlers. Many of them use non-determinism as an example. /g -- Prosperum ac felix scelus virtus vocatur -- Seneca _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Oleg Grenrus
Hi Oleg,
Thanks for the links! These are quite interesting. 1. Here is one situation that occurs in evolutionary biology where I would want to have the full range of Haskell syntax available to me. Consider a binary tree, where each tree node has an integer name in [1..num_nodes tree]. The function (parent tree n) gives the parent of node n and (root tree) gives the root node. -- the expected value for a node is the value at its parent mean node tree x | node == root tree = 0 | otherwise = x!!parent tree node -- given a tree, simulate down the tree, simulate_on_tree tree = let x = [sample $ normal (mean node tree x) 1 | node <- [1..num_nodes tree]] My understanding is that you cannot refer to the result of a computation while performing a computation, as in: do {x <- simulate_on_tree tree x} Am I missing something? On 01/11/2018 12:02 PM, Oleg Grenrus wrote: > (a) Non-determism is an effect, e.g. simple list is non-determinism monad, for small discrete distributions! 2. Why would we want to consider non-determinism (in the sense of returning an unpredictable value) as an effect? Certainly running a non-deterministic function does not change global state like modifying an IORef would. I'm also thinking of functions that are (somehow) TRULY random, so they are not keeping a hidden state around somewhere. I'm calling them "non-deterministic" instead of "random" because I want to ignore (for the moment) the probability distribution, and just say that the result is arbitrary. 3. Sampling from a normal distribution gives ONE value, and the list of possible values is .... large :-) [i.e. it would include all Double values.] > (b) Yes. We can write effectful code "implicitly" > - You might look into *Automatically Escaping Monads* > - https://www.youtube.com/watch?v=wG8AErq6Bbo, slides: > http://benl.ouroborus.net/talks/2016-HIW-Escape.pdf > - http://disciple.ouroborus.net/ or https://github.com/DDCSF/ddc 4. Interesting - I like his approach to making the box / run instructions implicit. > Interstingly, while searching for the paper, I stumbled upon Oleg > Kiselyov's (not me) paper from > *Effects Without Monads: Non-determinism*, which is a different approach. > Maybe that's what you are looking after > http://okmij.org/ftp/tagless-final/nondet-paper.pdf 5. In this paper, it seems that non-determinism means returning ALL possible outcomes. However, what I meant is arbitrarily choosing ONE possible outcome. My terminology is probably being imported from statistics - is there a different word I should use here? -BenRI _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Oleg Grenrus
Hi Oleg,
On 01/11/2018 09:17 AM, Oleg Grenrus wrote: > Hi Benjamin > > Let's see what you ask for, you have *new* syntax for types: > > a[N] and a[D] > > what are a[N][N] or a[N][D] or a[N][D] or a[D][D]? > > Aren't they a[N], a[N], a[N] and a[D] respectively? > That's what monads are about! of the type. So a type could be something like the pair (Int,D) or (Int,N). In that context a[N][N] is not part of the system. > > So > > a[N] ~ Distr a > a[D] ~ Identity a ~ a > > No need to complicate type-system! You just to not be afraid of monads! > > Monads aren't sequencing, they are computational context. > > I guess, you just want more natural term-level syntax. > > You can use ApplicativeDo [1] (in GHC-8.0+), so e.g. > > do x <- normal 0 1 > y <- normal 0 1 > return (f x y) > > will be transformed into > > liftA2 f (normal 0 1) (normal 0 1) > > That's almost like > > f (normal 0 1) (normal 0 1) > > if you have proper syntax highlighting ;) > > Note: various term syntax extensions been proposed. > E.g. idiom brackets in the "Applicative programming with effects" [2]: > > (| f (normal 0 1) (normal 0 1) |) > > to mean > > pure f <*> normal 0 1 <*> normal 0 1 > > which is equivalent to above liftA2 expression. If you like that, you > can check > "the Strathclyde Haskell Enhancement", it supports idiom brackets. do { x <- f x } does not work, where as let x = f x does work. Basically I'm trying to avoid monads because I want to use the full features of the Haskell language, instead of programming in an embedded language. In that context "more natural" term-level syntax is not sufficient. Also, it seems possible that everything in Haskell COULD be written in a monad. We could eliminate recursive let bindings, and tell people to create a giant state machine which they use by reading and writing IORefs. But then you also eliminate some of the point of using Haskell and may as well go write in C or something. So it seems to me that just because you CAN use a monad doesn't mean you SHOULD use a monad, and the question is "when is a monad better than something else?" Does that make sense? Am I missing something? -BenRI _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
You seem rather confused as to what a monad is. It is not about "everything is in an IORef", for one. (IO is not the only Monad, nor are the others pretending to be IO.) On Fri, Jan 12, 2018 at 11:38 AM, Benjamin Redelings <[hidden email]> wrote: Hi Oleg, brandon s allbery kf8nh sine nomine associates unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Benjamin Redelings
The `random-fu` package has a monad called `Random` to encapsulate/construct random variables (i.e., for simulation). Whether monads are a "perfect" solution to your problem depends on what your random variables represent. In most "finite" cases of jointly distributed random variables, you can get away with just applicative functors. But this wouldn't do if you were modeling something like a normal random walk. The sequencing property isn't something you "need" to care about when you're not using it (unless it causes performance issues). After all, even hand-written math is written down "in order", even mathematical expressions have parse trees with implicit partial orders, etc. On Thu, Jan 11, 2018 at 8:27 AM, Benjamin Redelings <[hidden email]> wrote: Hi, _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Benjamin Redelings
Long story short: You can stay inside Haskell's type system or extend it, but you end up with monads either way. I'll explain why.
> (a) why are monads a perfect solution to my problem? Two reasons. First: Probability distributions do form a monad [Giry,CS,Kock]. It's a mathematical fact. So why not exploit it? It gives you powerful combinators and powerful abstractions. Second: Because the requirements on the type modificators [N] and [D] are describing many features monads have. First observe that you don't really need [D] because any deterministic computation can be embedded into the non-deterministic computations by making the set of possibilities a singleton. That is precisely what 'return' does in the context of monads. (Think of return for the list monad.) > any function that takes a non-deterministic input must have a non-deterministic output. That is common with monads, too: There is no generic function that can extract values out of monads. > we should merge variables with identical values if the types are non-deterministic. That is problematic on several levels. First, is "identical values" decidable by the compiler? Second, equality might depend on the implementation, which might change. For example, (normal 0 1) could contain a random number generator with an intrinsic state. > sample:: (()->a[N]) -> a[N] Observe that, disregarding non-terminating computations, any type t is isomorphic to () -> t. Knowing this, your statement seems to imply that a non-deterministic computation is identified with what you can sample from it. > iid:: Int -> (() -> a[N]) -> [a[N]] The function Control.Monad.sequence plays this role for monads, with a slightly different type signature. Control.Monad.sequence . Data.List.repeat :: Monad m => m a -> m [a] Let the name not deceive you: It is not about sequential computations, rather about ordering the random variables sequentially. > (b) can we extend the HM type system to support non-determinism directly? Yes, you can. But then you have a different language. There many publications describing that [Draheim,Lago,Borgström,RP]. In a nutshell, start with the simply typed lambda calculus and add an operation choose :: I -> a -> a -> a where I is the the type of real numbers 0 <= x <= 1. 'choose' makes a weighted probabilistic choice between its second and third argument. However, in order to describe what programs in (lambda calculus + choose) actually _mean_, you need two things: (1) Define what the compiled program should do at runtime (operational semantics). (2) Define what the program means, mathematically (denotational semantics). For (1), the prevailing approach seems to be to equate a probabilistic term with its behaviour under sampling. Markov Chains are popular [MCMC]. I read it's non-trivial, however, to find a Markov chain that behaves according to a mathematically defined probability distribution. The probability that you know more about that than I do is 1. For (2), the mathematical meaning of ordinary Haskell programs are given via the following mapping. Every Haskell type t is associated with a domain D [DOM] and each Haskell function of type t -> t' is associated with a mathematical function f: D -> D' between the associated domains. Whenever non-determinism is involved, e.g. a probabilistic computation on type t, then instead of D one uses P(D), a suitable "powerdomain". There are various P for various sorts of non-determinism (see the work of Gordon Plotkin), and each of them is a monad on the category of domains. A major problem is whether the P for probabilisitc choice works well with e.g. function types. That is why many papers restrict themselves to first-order functions. Another notoriously hard problem is to combine different sorts of non-determinism. It is like combining monads in Haskell: Some monads have monad transformers associated with them, but some don't. Finally, you might want to play with non-determinism other than the probabilistic one. For example, there is the infinite-search package on hackage, which provides a monad of plain non-deterministic choice beyond finite lists. It is even possible to define a Haskell probability monad in the same spirit. I can provide some code if you wish. Regards, Olaf References [Giry] http://dx.doi.org/10.1007/BFb0092872 [CS] http://dx.doi.org/10.1007/s10485-013-9324-9 [Kock] http://www.tac.mta.ca/tac/volumes/26/4/26-04.pdf [Draheim] http://www.springer.com/de/book/9783642551970 [Lago] https://arxiv.org/abs/1104.0195 [Borgström] https://arxiv.org/abs/1512.08990 [RP] http://www.cs.tufts.edu/~nr/pubs/pmonad.pdf [MCMC] http://okmij.org/ftp/kakuritu/Hakaru10/index.html [DOM] http://www.worldscientific.com/worldscibooks/10.1142/6284 _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Benjamin Redelings
Hello, Benjamin. Nature of Haskell is to treat random values under monad. Because generator returns different values on calls with the same arguments. So, no way to "declare" such function as "nondeterministic" function. Haskell is not good language for such job. But there are more suitable languages for such kind of tasks. If you prefer Haskell-like syntax, better will be to use ML family language: modern F#, Ocaml or SML, where IO will not be involved - all is under IO explicitly. But there is another good choice: Mercury. It supports (sure, because it's declarative language and is based on Prolog) special notations for computations:
(see
more about these declarations here:
https://www.mercurylang.org/information/doc-release/mercury_ref/Determinism-categories.html#Determinism-categories) But, I'm sure, no problem to develop some DSL for Haskell which will hide some details ;) === Best regards, Paul 12.01.2018 18:26, Benjamin Redelings
пишет:
Hi Oleg, _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Hello,
Thus quoth PY on Mon Jan 15 2018 at 09:08 (+0100): > > But there is another good choice: *Mercury*. It supports (sure, > because it's declarative language and is based on Prolog) special > notations for computations: If you prefer a more Haskellish syntax, you may also want to look at Curry : http://www-ps.informatik.uni-kiel.de/currywiki/ Curry has a built-in function called "choice" which allows "non-deterministic, set-valued" functions. Now, whichever tool you use, you are probably going to wind up with monads or with monads in disguise, as Olaf points out. (And sometimes disguise may be quite important.) -- Sergiu _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. signature.asc (497 bytes) Download Attachment |
Hi Ben,
Not sure I understand exactly if this what you want, but if the problem is recursion within monads you might want to take a look at MonadFix or recursive do notation. If I understood more about how they worked I'd give an example, but I don't; I just know they're related to recursion in monads. https://downloads.haskell.org/~ghc/7.8.1/docs/html/users_guide/syntax-extns.html#recursive-do-notation
בתאריך יום ב׳, 15 בינו׳ 2018, 5:53, מאת Sergiu Ivanov <[hidden email]>: Hello, _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
In reply to this post by Olaf Klinke
Hi Olaf,
Thanks! This was helpful. I'll engage below: On 01/13/2018 04:45 PM, Olaf Klinke wrote: > Long story short: You can stay inside Haskell's type system or extend it, but you end up with monads either way. I'll explain why. I might see what you mean about ending up with monads either way. Specifically, I think its important to separate random distributions from their random samples, which might correspond (in a Random-like monad) to the difference between an action and performing the action. Is this something like what you are saying? I think there may still be some extra value to the type system I mentioned in performing CSE (common sub-expression elimination) safely. > >> (a) why are monads a perfect solution to my problem? > Two reasons. First: Probability distributions do form a monad [Giry,CS,Kock]. It's a mathematical fact. So why not exploit it? It gives you powerful combinators and powerful abstractions. They certainly do. I've written a haskell interpreter, and my definitions for a (very hacky) Random monad are here, with two different interpreters for it (sample and sample') below: https://github.com/bredelings/BAli-Phy/blob/9c96374013453fe382e609cca357a2c0f3f154b2/modules/Distributions.hs#L14 I'm currently exploiting the monadic structure. For example, given a distribution dist, I basically use 'sequence (replicate n dist)' to sample n i.i.d. values from it. So, to some extent I could see the fact that Monads provide a 'sequence' function as a benefit. In the framework I proposed we would have to write 'map sample (replicate n dist)', which does not seem very burdensome though. The fact that you have to run these distributions inside of an interpreter also makes some things easy that would be difficult to do otherwise, since the interpreter (a) can carry around modifiable state and (b) creates a call stack, like in call-by-value languages. In contrast if you say something like "let {x = normal 0 1 ()} in x*x", then it doesn't really have a call chain, since the thunk for x can get forced from multiple different contexts that were not responsible for the allocation of the thunk on the heap. Instead it seems that each heap location has a let-allocation chain, but without any intepreter state threaded through the chain. Does that make sense? Is there any literature on the let-allocation chain? It seems like this would come up during debugging. Despite all this, it still seems to me that there might be reasons not to "exploit" the monadic structure of probability distributions, at least not in the traditional fashion. See below. > Second: Because the requirements on the type modificators [N] and [D] are describing many features monads have. First observe that you don't really need [D] because any deterministic computation can be embedded into the non-deterministic computations by making the set of possibilities a singleton. That is precisely what 'return' does in the context of monads. (Think of return for the list monad.) >> any function that takes a non-deterministic input must have a non-deterministic output. > That is common with monads, too: There is no generic function that can extract values out of monads. Hmm.... one reason I'm hesitant to "exploit" the monadic structure of Random is that I don't want to have the [D] values outside the monad and [N] values inside it. The need to use a function like "return" to lift (if that is the right terminology) non-monadic values into the monad, and the need to run [N] terms in an interpreter to un-lift values of the monad seems to be more of an obstruction than a benefit. It means that you can't do things that you could do with normal haskell functions. For example, with a normal haskell function you could write `let {x = f 0 x} in x`. However, with a monadic object you can't write `do {x <- f 0 x; return x}`. Hmm.... I guess you could maybe do `let {x = interpret $ do {y <- f 0 x; return y}} in x` though, where 'interpret' is an interpreter for the monad. Hmm.... I don't know. This seems weird. It certainly seems more verbose than `let {x = sample $ f 0 x} in x`. OK! So, let's say that 'unsafePerformIO' is an interpreter for the IO monad. In my non-monadic framework, I am suggesting that we define: sample:: (() -> a[N]) -> a[N] sample dist = dist () In the monadic framework, we define something like random_iterpreter = Random a -> IO a sample :: Random a -> a sample dist = (unsafePerformIO . random_iterpreter) dist Then, I think we get equivalent *expressiveness* without modifying the HM type system. Furthermore, I think that (unsafePerformIO . random_interpreter) can be completely safe if we imagine that we can generate truly random variables somehow. However, I think that the extension to the HM type system still is useful in solving some problems that are created by using unsafePerformIO with CSE (see below). > >> we should merge variables with identical values if the types are non-deterministic. > That is problematic on several levels. First, is "identical values" decidable by the compiler? Second, equality might depend on the implementation, which might change. I'm assuming we only consider expressions equivalent if they have the same syntax tree. (So that should avoid problems with overloading ==). This leaves the second problem, you mention: > For example, (normal 0 1) could contain a random number generator with an intrinsic state. If we have "let {x=normal 0 1;y=normal 0 1} in E" then the two normal 0 1 actions could be executed in either order. However, for random samples I think that is not a problem in some situations, since the distributions would be the same. However, regardless of whether we care about ordering, it is definitely wrong to merge x and y to get "let {x=normal 0 1} in E[y := x]". I think this is the problem you are talking about. But this problem is exactly what I am proposing a solution for! The idea is that 'normal' would have type 'double -> double -> double[N]', and therefore merging the expressions would be prohibited by the rule that says we cannot merge two identical expressions of type a[N] (see my original e-mail). Interestingly, this could maybe used to handle cases like "let {x = unsafePerformIO $ readchar file; y = unsafePerformIO $ readchar file} in E". We could define unsafePerformIO as having type unsafePerformIO: IO a -> a[N]. This would NOT solve the problem that the code could perform the readchar's in either order, but it WOULD avoid merging x & y. I guess one question is: does GHC avoid this merger already? And, if so, does it avoid this merger by refusing to merge variables? If GHC refuses to merge variables with identical ASTs that call unsafePerformIO then I would assert that it is ALREADY using the type system I am proposing. >> sample:: (()->a[N]) -> a[N] > Observe that, disregarding non-terminating computations, any type t is isomorphic to () -> t. Knowing this, your statement seems to imply that a non-deterministic computation is identified with what you can sample from it. Hmm... I'm not sure this is right. I agree that type t[D] is isomorphic with () -> t[D], because it is easy enough to come up with an isomorphism f where (f (f_inverse value)) = value, and also (f_inverse (f (f_inverse value))) = value. f::forall level.(()->t[level]) -> t[level] f dist = dist () f::forall level.t[level] -> () -> t[level] f_inverse value = \() -> value But the whole point of having t[N] is that ()->t[N] should not be isomorphic to t[N]. Thus, if we evaluate let {x = f (f_inverse E); y = f (f_inverse E) in F} where E is non-deterministic, then I think x and y can be different. I think this means that f is not an isomorphism when applied to non-deterministic expression, so that you should say "disregarding non-terminating or non-deterministic" computations. There are some complications here, in that I am quantifying over [N,D] levels in the definitions of f and f_inverse, so that they take their [N,D] levels from their arguments. I am treating variables as non-values, since they stand for entire expressions that might be substituted for them. My hope is that this allows placing the [N,D] levels on the input and result types instead of on the arrow, but there could be problems with this. It is counter-intuitive, since normal 0 1 () could yield the value 2, and 2 itself is not random, but was obtained randomly. But I think that the system works if implemented, though it might not match the standard interpretation of types? > >> iid:: Int -> (() -> a[N]) -> [a[N]] > The function Control.Monad.sequence plays this role for monads, with a slightly different type signature. > Control.Monad.sequence . Data.List.repeat :: Monad m => m a -> m [a] > Let the name not deceive you: It is not about sequential computations, rather about ordering the random variables sequentially. Hmm... I don't completely understand this. Are you saying that E1 >>= (\x -> E2) does not require that E1 is performed before E2? That seems possible only if E2 does not use x. But maybe I'm missing something. > >> (b) can we extend the HM type system to support non-determinism directly? > Yes, you can. But then you have a different language. There many publications describing that [Draheim,Lago,Borgström,RP]. In a nutshell, start with the simply typed lambda calculus and add an operation Hmm.... I intentionally attempt to define "non-deterministic" expressions instead of "probabilistic" expressions precisely to avoid this problem :-) Thus, my version of 'choose' simply states that 'choose 1 2' can return either 1 or 2. It does not say anything about the probability of returning either choice, only that both are "valid" reductions. Therefore, my version of choose would have type: choose: a -> a -> a[N]. > choose :: I -> a -> a -> a > > where I is the the type of real numbers 0 <= x <= 1. 'choose' makes a weighted probabilistic choice between its second and third argument. I'm not completely sure of the role of the real number here. Are we saying that 'choose 0.6 0 1' would (for example) return 0 with probability 0.6 and 1 with probability 0.4? It seems that you could maybe define a function that takes 2 random numbers: choose :: I -> I -> a -> a -> a where (a) choose u p 0 1 would return 0 if u<p and 1 otherwise. (b) the number u is *implicitly* supplied to the function. So, the user writes "choose 0.6 0 1". (c) the type of main would be Main: I -> IO (). This makes the program deterministic if we know the value of the real number, but probabilistic if we supply a uniform 0 1 random number. > However, in order to describe what programs in (lambda calculus + choose) actually _mean_, you need two things: > (1) Define what the compiled program should do at runtime (operational semantics). > (2) Define what the program means, mathematically (denotational semantics). Hmm.... yeah I think the denotational semantics could be quite hard. I think that Chung Chieh Shan is working on some aspect of this this, especially measures on R^n. > For (1), the prevailing approach seems to be to equate a probabilistic term with its behaviour under sampling. Markov Chains are popular [MCMC]. I read it's non-trivial, however, to find a Markov chain that behaves according to a mathematically defined probability distribution. The probability that you know more about that than I do is 1. Ha ha :-) Yes, designing MCMC approaches to sample from a distribution is hard. Obviously for some expressions like sampling from a normal you don't need MCMC. I would imagine that one approach to this is simply only allow primitives that you can get by transforming a Uniform[0,1] random variable, or perhaps fairly simple rejection sampling. Then maybe MCMC-type things would be implemented as probabilistic programs (e.g. not primitive operations). > For (2), the mathematical meaning of ordinary Haskell programs are given via the following mapping. Every Haskell type t is associated with a domain D [DOM] and each Haskell function of type t -> t' is associated with a mathematical function f: D -> D' between the associated domains. Whenever non-determinism is involved, e.g. a probabilistic computation on type t, then instead of D one uses P(D), a suitable "powerdomain". Hmm... I'm guessing (having not looked at the papers below yet except kind of the Haraku paper) that the powerdomain for Double is the measurable sets (aka Borel sets) for the Reals... > There are various P for various sorts of non-determinism (see the work of Gordon Plotkin), and each of them is a monad on the category of domains. A major problem is whether the P for probabilisitc choice works well with e.g. function types. That is why many papers restrict themselves to first-order functions. Another notoriously hard problem is to combine different sorts of non-determinism. It is like combining monads in Haskell: Some monads have monad transformers associated with them, but some don't. Interesting. I'll take a look at the papers you mention. I have wondered where some of the restrictions come from. > Finally, you might want to play with non-determinism other than the probabilistic one. For example, there is the infinite-search package on hackage, which provides a monad of plain non-deterministic choice beyond finite lists. It is even possible to define a Haskell probability monad in the same spirit. I can provide some code if you wish. Thanks so much for explaining things to me in detail, and for the helpful links! -BenRI > Regards, > Olaf > > References > [Giry] http://dx.doi.org/10.1007/BFb0092872 > [CS] http://dx.doi.org/10.1007/s10485-013-9324-9 > [Kock] http://www.tac.mta.ca/tac/volumes/26/4/26-04.pdf > [Draheim] http://www.springer.com/de/book/9783642551970 > [Lago] https://arxiv.org/abs/1104.0195 > [Borgström] https://arxiv.org/abs/1512.08990 > [RP] http://www.cs.tufts.edu/~nr/pubs/pmonad.pdf > [MCMC] http://okmij.org/ftp/kakuritu/Hakaru10/index.html > [DOM] http://www.worldscientific.com/worldscibooks/10.1142/6284 _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Benjamin,
I am a domain theorist and not an expert on writing compilers/interpreters. Hence I can not comment on your proposals of call-by-value and interpreter state. What I wanted to say with my catchphrase is that you have to do the same amount of work. But in the first case, where you have a DSL, the work (= the random monad) is visible to the programmer, whereas in the second case only the author of the compiler gets to see the monad. But the denotational semantics can be the same. If you go for a DSL (which is specifically what you want ot avoid), then a Kleisli category is how it is done. Look at Kleisli in Control.Arrow. For non-probabilistic non-determinism, take the category (Kleisli []). Every morphism a -> b then really is a Haskell function a -> [b], but the Arrow notation hides this. Then choose has the rather trivial implementation choose :: Kleisli [] (a,a) a choose = Kleisli (\(x,y) -> [x,y]) You could abstract this into a type class: class (Arrow a) => Nondet a where choose :: a (x,x) x Let's try this: import Prelude hiding ((.),id) import Control.Category import Control.Monad data KleisliDN m x y = D (x -> y) | N (x -> m y) instance Monad m => Category (KleisliDN m) where id = D id (D g) . (D f) = D (g . f) (N g) . (N f) = N (g <=< f) (D g) . (N f) = N (liftM g . f) (N g) . (D f) = N (g . f) data DN m x = Det x | Nondet (m x) Observe that Kleisli m () x is isomorophic to DN m x, so here you have explicit tagging with D and N on the value-level. If you want the compiler to reject certain combinations of D and N, then write a typeclass class (Category c, Category d) => Subcategory c d where liftMorphism :: c a b -> d a b instance Monad m => Subcategory (->) (Kleisli m) where liftMorphism f = Kleisli (return.f) and use the category (->) for D and Kleisli m for N. > Am 15.01.2018 um 19:17 schrieb Benjamin Redelings <[hidden email]>: > > But this problem is exactly what I am proposing a solution for! The idea is that 'normal' would have type 'double -> double -> double[N]', and therefore merging the expressions would be prohibited by the rule that says we cannot merge two identical expressions of type a[N] (see my original e-mail). > > Interestingly, this could maybe used to handle cases like "let {x = unsafePerformIO $ readchar file; y = unsafePerformIO $ readchar file} in E". We could define unsafePerformIO as having type > > unsafePerformIO: IO a -> a[N]. > > This would NOT solve the problem that the code could perform the readchar's in either order, but it WOULD avoid merging x & y. I guess one question is: does GHC avoid this merger already? And, if so, does it avoid this merger by refusing to merge variables? If GHC refuses to merge variables with identical ASTs that call unsafePerformIO then I would assert that it is ALREADY using the type system I am proposing. > >>> sample:: (()->a[N]) -> a[N] >> Observe that, disregarding non-terminating computations, any type t is isomorphic to () -> t. Knowing this, your statement seems to imply that a non-deterministic computation is identified with what you can sample from it. > Hmm... I'm not sure this is right. I agree that type t[D] is isomorphic with () -> t[D], because it is easy enough to come up with an isomorphism f where (f (f_inverse value)) = value, and also (f_inverse (f (f_inverse value))) = value. > > f::forall level.(()->t[level]) -> t[level] > f dist = dist () > > f::forall level.t[level] -> () -> t[level] > f_inverse value = \() -> value > > But the whole point of having t[N] is that ()->t[N] should not be isomorphic to t[N]. Thus, if we evaluate let {x = f (f_inverse E); y = f (f_inverse E) in F} where E is non-deterministic, then I think x and y can be different. toKleisli :: DN m x -> KleisliDN m () x toKleisli (Det x) = D (const x) toKleisli (Nondet mx) = N (const mx) fromKleisli :: KleisliDN m () x -> DN m x fromKleisli (D f) = Det (f ()) fromKleisli (N f) = Nondet (f ()) () -> t[N] takes a unit and returns a process randomly generating a t. Since there is only one unit, there is only one process described by this function. I've seen functions of type () -> t in OCaml where they are used to bring lazy evaluation into an otherwise strict language. > I think this means that f is not an isomorphism when applied to non-deterministic expression, so that you should say "disregarding non-terminating or non-deterministic" computations. > > There are some complications here, in that I am quantifying over [N,D] levels in the definitions of f and f_inverse, so that they take their [N,D] levels from their arguments. I am treating variables as non-values, since they stand for entire expressions that might be substituted for them. My hope is that this allows placing the [N,D] levels on the input and result types instead of on the arrow, but there could be problems with this. It is counter-intuitive, since normal 0 1 () could yield the value 2, and 2 itself is not random, but was obtained randomly. But I think that the system works if implemented, though it might not match the standard interpretation of types? I don't get all of what you are saying here. But it reminds me of monadic bind. When you say: do x <- normal 0 1 if x == 2 then return True else return False then the resulting Boolean is still in the monad, and the type of the overall expression tells you that the 2 is a random one. > >> choose :: I -> a -> a -> a >> >> where I is the the type of real numbers 0 <= x <= 1. 'choose' makes a weighted probabilistic choice between its second and third argument. > I'm not completely sure of the role of the real number here. Are we saying that 'choose 0.6 0 1' would (for example) return 0 with probability 0.6 and 1 with probability 0.4? Exactly. And that is all you need for probabilistic choice, in the following categorical sense. The probabilistic powerdomain of a domain D is the free convex cone over D. A convex cone is a structure that supports the operation choose and multiplication by real numbers from the unit interval [0,1]. (Here we allow total probabilities less than 1, otherwise we don't have a domain. The true probability distributions sit at the top of this domain. The free convex cone over the two-element set {x,y} is a triangle, with "certainly x" at the top left, "certainly y" at the top right, and the constant zero measure at the bottom corner.) > It seems that you could maybe define a function that takes 2 random numbers: > > choose :: I -> I -> a -> a -> a > > where > (a) choose u p 0 1 would return 0 if u<p and 1 otherwise. But how is choose probabilistic, then? It's just an if-statement in disguise! > (b) the number u is *implicitly* supplied to the function. So, the user writes "choose 0.6 0 1". > (c) the type of main would be Main: I -> IO (). This makes the program deterministic if we know the value of the real number, but probabilistic if we supply a uniform 0 1 random number. That sounds like PRNG to me. If you know the seed, you know the outcome. > >> However, in order to describe what programs in (lambda calculus + choose) actually _mean_, you need two things: >> (1) Define what the compiled program should do at runtime (operational semantics). >> (2) Define what the program means, mathematically (denotational semantics). > Hmm.... yeah I think the denotational semantics could be quite hard. I think that Chung Chieh Shan is working on some aspect of this this, especially measures on R^n. That was my point: With monads, the denotational semantics is easy. The operational semantics is hard. > >> For (2), the mathematical meaning of ordinary Haskell programs are given via the following mapping. Every Haskell type t is associated with a domain D [DOM] and each Haskell function of type t -> t' is associated with a mathematical function f: D -> D' between the associated domains. Whenever non-determinism is involved, e.g. a probabilistic computation on type t, then instead of D one uses P(D), a suitable "powerdomain". > Hmm... I'm guessing (having not looked at the papers below yet except kind of the Haraku paper) that the powerdomain for Double is the measurable sets (aka Borel sets) for the Reals... No, the probabilistic powerspace of D is the set of all measures on the Borel sigma-algebra of D, ordered pointwise. That is, f <= g if f(A) <= g(A) for all measurable sets A. The beautiful thing about this is that the monadic bind is precisely integration against a measure. >> There are various P for various sorts of non-determinism (see the work of Gordon Plotkin), and each of them is a monad on the category of domains. A major problem is whether the P for probabilisitc choice works well with e.g. function types. That is why many papers restrict themselves to first-order functions. Another notoriously hard problem is to combine different sorts of non-determinism. It is like combining monads in Haskell: Some monads have monad transformers associated with them, but some don't. > Interesting. I'll take a look at the papers you mention. I have wondered where some of the restrictions come from. It's whack-a-mole of the categorical kind. There are various subcategories of domains. For example, all Haskell types map to domains that have the additional property "bifinite". One can show that the probabilistic powerdomain of any domain is a domain, but the probabilistic powerdomain of a bifinite domain is never bifinite. Hence there can be no Haskell type that faithfully represents all the random computations of a given type. It is known that the probabilistic powerdomain of a "continuous" domain is again continuous. Sadly, the continuous domains don't have function spaces that are continuous. Thus you get a semantics for a language with choice, but where functions are not first-class citizens. Olaf _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. |
Free forum by Nabble | Edit this page |