Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]


Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Mark P. Jones
[Sorry, I guess this should have been in the cafe ...]

Simon Peyton-Jones wrote:

> The trouble is that
> a) the coverage condition ensures that everything is well behaved
> b) but it's too restrictive for some uses of FDs, notably the MTL library
> c) there are many possibilities for more generous conditions, but
>         the useful ones all seem complicated
>
> Concerning the last point I've dumped the current brand leader
> for (c) into http://hackage.haskell.org/trac/ghc/ticket/1241#comment:15.
>
> Better ideas for (c) would be welcome.

Let's take the declaration:  "instance P => C t where ..."
The version of the "coverage condition" in my paper [1] requires
that TV(t_Y) \subseteq TV(t_X), for each dependency (X->Y) \in F_C.
(I'm using the notation from the paper; let me know if you need more
help to parse it.)  This formulation is simple and sound, but it
doesn't use any dependency information that could be extracted from P.
To remedy this, calculate L = F_P, the set of functional dependencies
induced by P, and then expand the right hand side of the set inequality
above by taking the closure of TV(t_X) with respect to L.  In symbols,
we have to check that:

   TV(t_Y) \subseteq TV(t_X)^+_L, for each (X->Y) \in F_C.

I believe (but haven't formally proved) that this is sound; I don't
know how to make a more general "coverage condition" without losing
that.  I don't know if it's sufficient for examples like MTL (because
I'm not sure where to look for details of what that requires), but
if it isn't then I'd be very suspicious ...
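For instance (my own illustration, not taken from the paper), the following hypothetical class and instances fail the plain coverage condition but pass the relaxed check, because the dependency induced by the context puts b into the closure of TV(t_X):

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}

class C a b | a -> b where
  conv :: a -> b

instance C Int Bool where
  conv = (> 0)

-- Here t_X = [a] and t_Y = [b], so TV(t_Y) = {b} is not a subset of
-- TV(t_X) = {a} and the plain coverage condition fails.  But the context
-- P = C a b induces the dependency a -> b, so the closure TV(t_X)^+_L
-- = {a, b} does cover b, and the relaxed condition accepts the instance.
instance C a b => C [a] [b] where
  conv = map conv
```

(In current GHC, the UndecidableInstances flag enables a liberalised coverage check of roughly this kind; the second instance is rejected without it.)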

All the best,
Mark

[1] http://www.cs.pdx.edu/~mpj/pubs/fundeps-esop2000.pdf
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Martin Sulzmann
Mark P Jones writes:
 > [Sorry, I guess this should have been in the cafe ...]
 >
 > Simon Peyton-Jones wrote:
 > > The trouble is that
 > > a) the coverage condition ensures that everything is well behaved
 > > b) but it's too restrictive for some uses of FDs, notably the MTL library
 > > c) there are many possibilities for more generous conditions, but
 > >         the useful ones all seem complicated
 > >
 > > Concerning the last point I've dumped the current brand leader
 > > for (c) into http://hackage.haskell.org/trac/ghc/ticket/1241#comment:15.
 > >
 > > Better ideas for (c) would be welcome.
 >
 > Let's take the declaration:  "instance P => C t where ..."
 > The version of the "coverage condition" in my paper [1] requires
 > that TV(t_Y) \subseteq TV(t_X), for each dependency (X->Y) \in F_C.
 > (I'm using the notation from the paper; let me know if you need more
 > help to parse it.)  This formulation is simple and sound, but it
 > doesn't use any dependency information that could be extracted from P.
 > To remedy this, calculate L = F_P, the set of functional dependencies
 > induced by P, and then expand the right hand side of the set inequality
 > above by taking the closure of TV(t_X) with respect to L.  In symbols,
 > we have to check that:
 >
 >    TV(t_Y) \subseteq TV(t_X)^+_L, for each (X->Y) \in F_C.
 >
 > I believe (but haven't formally proved) that this is sound; I don't
 > know how to make a more general "coverage condition" without losing
 > that.  I don't know if it's sufficient for examples like MTL (because
 > I'm not sure where to look for details of what that requires), but
 > if it isn't then I'd be very suspicious ...
 >
 > All the best,
 > Mark
 >
 > [1] http://www.cs.pdx.edu/~mpj/pubs/fundeps-esop2000.pdf


I think the above is equivalent to the (refined) weak coverage
condition in [2] (see Section 6, p26). The weak coverage condition
gives us soundness. More precisely, we obtain confluence, from which we
can derive consistency (which in turn guarantees that the type class
program is sound).

*BUT* this only works if we have termination which is often very
tricky to establish.

For the example,

> class Concrete a b | a -> b where
>         bar :: a -> String
>
> instance (Show a) => Concrete a b

termination holds, but the weak coverage condition does *not*
hold.  This program should therefore be rejected.

Martin

Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Martin Sulzmann

Sorry, forgot to add

[2]
http://www.comp.nus.edu.sg/~sulzmann/publications/jfp-fds-revised.pdf

Martin



Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Iavor Diatchki
In reply to this post by Martin Sulzmann
Hello,

I believe that this "weak coverage condition" (which is also called
"the dependency condition" somewhere on the wiki) is exactly what GHC
6.4 used to implement, but then in 6.6 this changed.  According to
Simon's comments on the trac ticket, this rule requires FDs to be
"full" to preserve the confluence of the system that is described in
the "Understanding FDs via CHRs" paper.  I have looked at the paper
many times, and I am still unclear on this point; any enlightenment
would be most appreciated.  (By the way, there was a post about this
on the haskell-prime list a couple of days ago which contains a
concrete example as well.)

To answer Mark's question, this rule provides enough power for mtl and
monadLib.  They use classes like "class StateM m s | m -> s" and
instances like "instance StateM m s => StateM (ExceptionT m) s".  (The
full source code for monadLib is at
http://www.galois.com/~diatchki/monadLib/monadLib-3.3.0/src/MonadLib.hs)
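A cut-down sketch (my reconstruction, not the actual monadLib source; StateT, ExceptionT, and the method get are simplified stand-ins) of the instance shape in question; the second StateM instance determines s only through its context:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}

newtype StateT s m a   = StateT { runStateT :: s -> m (a, s) }
newtype ExceptionT m a = ExceptionT { runExceptionT :: m (Either String a) }

instance Functor m => Functor (StateT s m) where
  fmap f (StateT g) = StateT (\s -> fmap (\(a, s') -> (f a, s')) (g s))

class StateM m s | m -> s where
  get :: m s

instance Monad m => StateM (StateT s m) s where
  get = StateT (\s -> return (s, s))

-- s occurs in the head only via the context StateM m s, so the plain
-- coverage condition fails here, while the weak/refined versions use
-- the dependency m -> s from the context to cover s.
instance (Functor m, StateM m s) => StateM (ExceptionT m) s where
  get = ExceptionT (fmap Right get)
```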

Martin, could you elaborate on the problem with non-termination?  I
have seen examples where the type-checker could loop while trying to
prove things, but I was not aware that there were implications related
to soundness as well.

-Iavor


On 10/18/07, Martin Sulzmann <[hidden email]> wrote:

> Mark P Jones writes:
>  > ...
>
>
> I think the above is equivalent to the (refined) weak coverage
> condition in [2] (see Section 6, p26). The weak coverage condition
> give us soundness. More precisely, we obtain confluence from which we
> can derive consistency (which in turn guarantees that the type class
> program is sound).
>
> *BUT* this only works if we have termination which is often very
> tricky to establish.
>
> For the example,
>
> > class Concrete a b | a -> b where
> >         bar :: a -> String
> >
> > instance (Show a) => Concrete a b
>
> termination holds, but the weak coverage condition does *not*
> hold. Indeed, this program should be therefore rejected.
>
> Martin
>

RE: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Simon Peyton Jones
| I believe that this "weak coverage condition" (which is also called
| "the dependency condition" somewhere on the wiki) is exactly what GHC
| 6.4 used to implement but than in 6.6 this changed.  According to
| Simon's comments on the trac ticket, this rule requires FDs to be
| "full" to preserve the confluence of the system that is described in
| the "Understanding Fds via CHRs" paper.  I have looked at the paper
| many times, and I am very unclear on this point, any enlightenment
| would be most appreciated.

Right.  In the language of http://hackage.haskell.org/trac/ghc/ticket/1241, last comment (date 10/17/07), it would be *great* to

        replace the Coverage Condition (CC)
        with the Weak Coverage Condition (WCC)

as Mark suggests.  BUT to retain soundness we can only replace CC with
        WCC + B
WCC alone without (B) threatens soundness.  To retain termination as well (surely desirable) we must have
        WCC + B + C

(You'll have to look at the ticket to see what B,C are.)

And since A+B+C are somewhat onerous, we probably want to have
        CC  or  (WCC + B + C)


At least we know of nothing else weaker that will do (apart from CC of course).


Like you, Iavor, I find it very hard to internalise just why (B) and (C) are important.  But I believe the paper gives examples of why they are, and Martin is getting good at explaining it. Martin: can you give an example, once more, of the importance of (B) (=fullness)?

Anyway that's why I have been moving slowly in "fixing" GHC: the rule
        CC or (WCC + B + C)
just seems too baroque.

Simon


RE: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Martin Sulzmann
Simon Peyton-Jones writes:
 > | I believe that this "weak coverage condition" (which is also called
 > | "the dependency condition" somewhere on the wiki) is exactly what GHC
 > | 6.4 used to implement but than in 6.6 this changed.  According to
 > | Simon's comments on the trac ticket, this rule requires FDs to be
 > | "full" to preserve the confluence of the system that is described in
 > | the "Understanding Fds via CHRs" paper.  I have looked at the paper
 > | many times, and I am very unclear on this point, any enlightenment
 > | would be most appreciated.
 >
 > Right.  In the language of http://hackage.haskell.org/trac/ghc/ticket/1241, last comment (date 10/17/07), it would be *great* to
 >
 >         replace the Coverage Condition (CC)
 >         with the Weak Coverage Condition (WCC)
 >
 > as Mark suggests.  BUT to retain soundness we can only replace CC with
 >         WCC + B
 > WCC alone without (B) threatens soundness.  To retain termination as well (surely desirable) we must have
 >         WCC + B + C
 >
 > (You'll have to look at the ticket to see what B,C are.)
 >
 > And since A+B+C are somewhat onerous, we probably want to have
 >         CC  or  (WCC + B + C)
 >
 >
 > At least we know of nothing else weaker that will do (apart from CC of course).
 >
 >
 > Like you, Iavor, I find it very hard to internalise just why (B) and (C) are important.  But I believe the paper gives examples of why they are, and Martin is getting good at explaining it. Martin: can you give an example, once more, of the importance of (B) (=fullness)?
 >

Fullness (B) is a necessary condition to guarantee that the constraint
solver (aka CHR solver) derived from the type class program is confluent.

Here's an example (taken from the paper).

  class F a b c | a->b
  instance F Int Bool Char
  instance F a b Bool => F [a] [b] Bool

The FD is not full because the class parameter c is not involved in
the FD. We will show now that the CHR solver is not confluent.

Here is the translation to CHRs (see the paper for details)

  rule F a b1 c, F a b2 d  ==> b1=b2      -- (FD)
  rule F Int Bool Char    <==> True       -- (Inst1)
  rule F Int a b           ==> a=Bool     -- (Imp1)
  rule F [a] [b] Bool     <==> F a b Bool -- (Inst2)
  rule F [a] c d           ==> c=[b]      -- (Imp2)  


The above CHRs are not confluent. For example,
there are two possible CHR derivations for the initial
constraint store F [a] [b] Bool, F [a] b2 d

    F [a] [b] Bool, F [a] b2 d
-->_FD (means apply the FD rule)
    F [a] [b] Bool, F [a] [b] d , b2=[b]
-->_Inst2
    F a b Bool, F [a] [b] d , b2=[b]          (*)


Here's the second CHR derivation

    F [a] [b] Bool, F [a] b2 d
-->_Inst2
    F a b Bool, F [a] b2 d
-->_Imp2
    F a b Bool, F [a] [c] d, b2=[c]           (**)


(*) and (**) are final stores (i.e., no further CHR rules are
applicable).  Unfortunately, they are not logically equivalent (one
store says b2=[b] whereas the other store says b2=[c]).

To conclude, fullness is a necessary condition to establish confluence
of the CHR solver. Confluence is vital to guarantee completeness of
type inference.


I don't think that fullness is an onerous condition.

Martin


Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Iavor Diatchki
Hello,

On 10/19/07, Martin Sulzmann <[hidden email]> wrote:

> Simon Peyton-Jones writes:
>  > ...
>  > Like you, Iavor, I find it very hard to internalise just why (B) and (C) are important.  But I believe the paper gives examples of why they are, and Martin is getting good at explaining it. Martin: can you give an example, once more, of the importance of (B) (=fullness)?
>  >
>
> Fullness (B) is a necessary condition to guarantee that the constraint
> solver (aka CHR solver) derived from the type class program is confluent.
>
> Here's an example (taken from the paper).
>
>   class F a b c | a->b
>   instance F Int Bool Char
>   instance F a b Bool => F [a] [b] Bool
>
> The FD is not full because the class parameter c is not involved in
> the FD. We will show now that the CHR solver is not confluent.
>
> Here is the translation to CHRs (see the paper for details)
>
>   rule F a b1 c, F a b2 d  ==> b1=b2      -- (FD)
>   rule F Int Bool Char    <==> True       -- (Inst1)
>   rule F Int a b           ==> a=Bool     -- (Imp1)
>   rule F [a] [b] Bool     <==> F a b Bool -- (Inst2)
>   rule F [a] c d           ==> c=[b]      -- (Imp2)
>
>
> The above CHRs are not confluent. For example,
> there are two possible CHR derivations for the initial
> constraint store F [a] [b] Bool, F [a] b2 d
>
>     F [a] [b] Bool, F [a] b2 d
> -->_FD (means apply the FD rule)
>     F [a] [b] Bool, F [a] [b] d , b2=[b]
> -->_Inst2
>     F a b Bool, F [a] [b] d , b2=[b]         (*)
>
>
> Here's the second CHR derivation
>
>     F [a] [b] Bool, F [a] b2 d
> -->_Inst2
>     F a b Bool, F [a] b2 d
> -->_Imp2
>     F a b Bool, F [a] [c] d, b2=[c]           (**)
>
>
> (*) and (**) are final stores (ie no further CHR are applicable).
> Unfortunately, they are not logically equivalent (one store says
> b2=[b] whereas the other store says b2=[c]).

But what is wrong with applying the following logical reasoning:

Starting with (**):
F a b Bool, F [a] [c] d, b2=[c]
(by Inst2)
F a b Bool, F [a] [c] d, b2=[c], F [a] [b] Bool
(by FD)
F a b Bool, F [a] [c] d, b2=[c], F [a] [b] Bool, [c] = [b]
(applying equalities and omitting unnecessary predicates)
F [a] [b] Bool, F [a] b2 d
(...and then follow your argument to reach (*)...)

Alternatively, if we start at (*):
F a b Bool, F [a] [b] d , b2=[b]
(by Inst2)
F a b Bool, F [a] [b] d , b2=[b], F [a] [b] Bool
(applying equalities, rearranging, and omitting unnecessary predicates)
F [a] [b] Bool, F [a] b2 d
(... and then follow your reasoning to reach (**) ...)

So it would appear that the two sets of predicates are logically equivalent.

> To conclude, fullness is a necessary condition to establish confluence
> of the CHR solver. Confluence is vital to guarantee completeness of
> type inference.
>
>
> I don't think that fullness is an onerous condition.

I agree with you that, in practice, many classes probably use "full"
FDs.  However, these extra conditions make the system more
complicated, and we should make clear what exactly we are getting in
return for the added complexity.

-Iavor

Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Mark P. Jones
In reply to this post by Martin Sulzmann
Hi All,

Here are my responses to the recent messages, starting with some
summary comments:

- I agree with Martin that the condition I posted a few days ago is
   equivalent to the "*refined* weak coverage condition" in your
   paper.  The "refined" tag here is important---I missed it the
   first time and thought you were referring to the WCC at the start of
   Section 6.1, and I would strongly recommend against using that
   (unrefined) version.  I confess that I haven't read the paper
   carefully enough to understand why you included that first WCC at
   all; I don't see any reason why you would want to use that
   particular condition in a practical implementation, and I don't
   see any advantages to it from a theoretical perspective either.
   (Maybe you just used it as a stepping stone to help motivate the
   refined WCC?)

- I believe that the refined WCC (or the CC, or even the original WCC
   if you really want) is sufficient to ensure soundness.  I don't
   agree that the (B) restriction to "full" FDs is necessary.

- I think it is dangerous to tie up soundness with questions about
   termination.  The former is a semantic property while the latter
   has to do with computability.  Or, from a different perspective,
   soundness is essential, while termination/decidability is "only"
   desirable.  I know that others have thought more carefully than
   I have about conditions on the form of class and instance
   declarations that will guarantee decidability and I don't have
   much to say on that topic in what follows.  However, I will
   suggest that one should start by ensuring soundness and then,
   *separately*, consider termination.  (Of course, it's perfectly
   ok---just not essential---if conditions used to ensure soundness
   can also be used to guarantee termination.)

Stop reading now unless you're really interested in the details!
I'm going to start by explaining some key (but hopefully
unsurprising) ideas before I respond in detail to the example
that Martin posted.

Interpreting Class, Instance, and Dependency Decls:
---------------------------------------------------
I'll start by observing that Haskell's class, instance, and
functional dependency declarations all map directly to formulas in
standard predicate logic.  The correspondence is straightforward, so
I'll just illustrate it with some examples:

   Declaration               Formula
   -----------               -------
   class Eq a => Ord a       forall a. Ord a => Eq a
   instance Eq a => Eq [a]   forall a. Eq a => Eq [a]
   class C a b | a -> b      forall a, b, c. C a b /\ C a c => b = c

This correspondence between declarations and formulas seems to be
very similar to what you did in the paper using CHRs.  I haven't
read the paper carefully enough to know what extra mileage and/or
problems you get by using CHRs.  To me, predicate logic seems more
natural, but I don't think that matters here---I just wanted to
acknowledge that essentially the same idea is in your paper.

In the following, assuming some given program P, I'll write I(P)
for the conjunction of all the instance declaration formulas in
P; C(P) for the conjunction of all the class declaration formulas;
and F(P) for the conjunction of all the dependency declaration
formulas.

Semantics and Soundness:
------------------------
We'll adopt a simple and standard semantics in which each
n-parameter type class is described by an n-place relation, that is,
a set of n-tuples of types.  Of course, in the most common, one
parameter case, this is equivalent to viewing the class as a set of
types.  Each tuple in this relation is referred to as an "instance"
of the class, and the predicate syntax C t1 ... tn is just a
notation for asserting that the tuple (t1, ..., tn) is an instance
of C.

We can assign a meaning to each of the classes in a given program P
by using the smallest model of the instance declaration formulas,
I(P).  (Existence of such a model is guaranteed; it is the union of
an increasing chain of approximations.)

What about the superclass properties in C(P)?  We expect to use
information from C(P) to simplify contexts during type inference.
For example, if we're using the standard Haskell prelude, then we
know that we can simplify (Eq a, Ord a) to (Ord a).  But how do we
know this is sound?  Haskell answers this by imposing restrictions
(third bullet in Section 4.3.2 of the report) on the use of
instance declarations to ensure that our (least) model of I(P) is
also a model of C(P).  This guarantees, for example, that every
type in the semantics of "Ord" will, as required, also be included
in the semantics of "Eq".
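The Eq/Ord simplification can be seen in a tiny (hypothetical) example: a definition that uses both (==) and (<) gets the inferred context (Eq a, Ord a), and the superclass formula justifies simplifying it to just Ord a:

```haskell
-- Both (<) from Ord and (==) from Eq are used, yet Ord a alone
-- suffices, because every instance of Ord is also an instance of Eq.
lessOrEqual :: Ord a => a -> a -> Bool
lessOrEqual x y = x < y || x == y
```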

That should all seem pretty standard, but I've included it here
to make the point that we need to do something very similar when we
introduce functional dependencies.  Specifically, we want to be
able to use properties from F(P) to simplify/improve contexts
during type inference, so we have to ensure that this is sound.
If we follow the strategy that was used to ensure soundness of
C(P) in Haskell, then we have to come up with a restriction on
instance declarations to ensure that our model of I(P) is also
a model of F(P).  I claim that any of the three conditions we've
talked about previously---using your terminology, the CC, WCC or
refined WCC---are sufficient to ensure this soundness property.
I haven't completed a formal proof, but I think it should be a
pretty straightforward exercise in predicate logic.

If a program fails to meet the criteria that ensure soundness of a
superclass declaration, the Haskell compiler should reject that
code.  Exactly the same thing should happen if the input program
doesn't satisfy the functional dependencies.  Further restrictions
are needed to ensure that there is a well-defined semantics (i.e.,
dictionary/evidence) for each type class instance; this is where
overlapping instances complicate the picture, for example.
However, for what we're doing here, it's sufficient to focus on
the semantics of classes (i.e., on which types are instances of
which classes) and to defer the semantics of overloaded operators
to a separate discussion.

Responding to Martin's Example:
-------------------------------

 > Here's an example (taken from the paper).
 >
 >   class F a b c | a->b
 >   instance F Int Bool Char
 >   instance F a b Bool => F [a] [b] Bool

This is an unfortunate example because, in fact, the second instance
declaration doesn't contribute anything to the class F!  Using the
semantics I outlined above, the class F is just {(Int, Bool, Char)},
with only the one instance that comes from the first instance
declaration.  This is unfortunate because, in the presence of
functional dependencies---or, in general, any form of
"improvement"---you have to take account of satisfiability.  In the
example here, any predicate of the form F a b Bool is
unsatisfiable, and so any time we attempt to transform an inferred
context containing a predicate of this form into some "simplified"
version, then there is a sense in which all we are doing is proving
that False is equivalent to False (or that anything follows from a
contradiction).

In what follows, let's refer to the program above as Q1.  However,
to explore a case where the predicates of interest are satisfiable,
let's also consider an extended version, Q2, that adds one extra
instance declaration:

     instance F Char Int Bool

Now the semantics of Q2 looks something like:

  { (Int, Bool, Char),
    (Char, Int, Bool),
    ([Char], [Int], Bool),
    ([[Char]], [[Int]], Bool),
    ([[[Char]]], [[[Int]]], Bool),
    ([[[[Char]]]], [[[[Int]]]], Bool), ... }
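Q2 can be written down as a compilable sketch; witness is a hypothetical helper of my own whose only purpose is to force the type checker to discharge an F constraint drawn from the set above:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}

class F a b c | a -> b

instance F Int Bool Char
instance F Char Int Bool
instance F a b Bool => F [a] [b] Bool

-- witness computes nothing; its context just obliges the solver to
-- establish membership of the triple's type in the class F.
witness :: F a b c => (a, b, c) -> (a, b, c)
witness = id

-- For example, witness (["x"], [[1 :: Int]], True) needs
-- F [[Char]] [[Int]] Bool, which follows from F Char Int Bool by two
-- applications of the third instance declaration.
```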

 > The FD is not full because the class parameter c is not involved in
 > the FD. We will show now that the CHR solver is not confluent.

Although it's just one example, it is worth pointing out that both
Q1 and Q2 lead to a semantics for F that is consistent/sound with
respect to the declared functional dependency.  So far as our
semantics is concerned, the fact that the dependency is not full
is not significant.

I don't dispute your claim that fullness is necessary to establish
confluence with CHRs.  But that is a consequence of the formal
proof/rewrite system that you're using, and not of the underlying
semantics.

 > Here is the translation to CHRs (see the paper for details)
 >
 >   rule F a b1 c, F a b2 d  ==> b1=b2      -- (FD)
 >   rule F Int Bool Char    <==> True       -- (Inst1)
 >   rule F Int a b           ==> a=Bool     -- (Imp1)
 >   rule F [a] [b] Bool     <==> F a b Bool -- (Inst2)
 >   rule F [a] c d           ==> c=[b]      -- (Imp2)

I'm a little surprised that you need Imp1 and Imp2 here.  I didn't
include analogs of these when I gave a predicate logic rendering
for dependency declarations above because they can be obtained as
derived properties.  Imp1, for example:

     Suppose F Int a b.
     From Inst1, we know that F Int Bool Char.
     Hence by FD, we know that a = Bool.

My best guess is that you included the Imp1 and Imp2 rules because,
although the "<==>" symbol suggests the possibility of bidirectional
rewriting, you actually want to view the above as oriented rewrite
rules?  From my perspective, however, the FD rule is the definition
of the functional dependency on F, and everything that you want to
know about this dependency can be derived from it ...

 > The above CHRs are not confluent. For example,
 > there are two possible CHR derivations for the initial
 > constraint store F [a] [b] Bool, F [a] b2 d
 >
 >     F [a] [b] Bool, F [a] b2 d
 > -->_FD (means apply the FD rule)
 >     F [a] [b] Bool, F [a] [b] d , b2=[b]
 > -->_Inst2
 >     F a b Bool, F [a] [b] d , b2=[b]         (*)
 >
 > Here's the second CHR derivation
 >
 >     F [a] [b] Bool, F [a] b2 d
 > -->_Inst2
 >     F a b Bool, F [a] b2 d
 > -->_Imp2
 >     F a b Bool, F [a] [c] d, b2=[c]           (**)
 >
 > (*) and (**) are final stores (ie no further CHR are applicable).
 > Unfortunately, they are not logically equivalent (one store says
 > b2=[b] whereas the other store says b2=[c]).

I think it's a mistake to expect these formulas to be logically
equivalent, at least in the strong syntactic sense that you seem
to be assuming here.  As I've mentioned previously, if we're going
to use improvement, then we have to take account of satisfiability.

In the context of Q1, this is an example in which each of the two
derivations above is essentially equivalent to False --> False.
But, in the general case, if we compare (*) with (**), then the
main difference seems to be that the latter involves the use of
a "new" variable, c, in place of the variable b that appears in
the former.  However, (*) includes F [a] [b] d and (**) includes
F [a] [c] d, and so we can deduce from the functional dependency
(i.e., your FD rule) that [b]=[c], and hence b=c.  If we're
allowed to inline that equality (i.e., use it to replace all
occurrences of c with b), then (**) becomes identical to (*).

 > To conclude, fullness is a necessary condition to establish confluence
 > of the CHR solver.

I think that the standard notion of confluence is the wrong
property to be looking at here.  Or, to put it another way, I don't
think it is reasonable to expect confluence in the presence of
improvement.  That said, it might still be appropriate and useful
to formulate a notion of "confluence modulo satisfiability"
in which the equalities involved in the definition of confluence
are weakened to "equalities modulo satisfiability" ...

 > Confluence is vital to guarantee completeness of type inference.

I disagree.  I established the completeness of type inference in
the presence of improvement (without requiring any kind of
confluence-like property) as part of my work in 1994 on
"Simplifying and Improving Qualified Types."  (See Section 4 of
the technical report version, for example, which includes
details---like the proofs---that I had to omit from the later
FPCA '95 version for lack of space.)  Functional dependencies
are a special case of improvement, so I'd dispute the claim in
your paper that you are the first to formalize and prove
soundness, completeness and decidability.  (However, I'm
sympathetic to the fact that you missed this, because my results
on improvement were published six years before my formulation of
functional dependencies; if you were only looking at later
publications, you wouldn't have found anything.)

It's worth pointing out that, to achieve my completeness result,
I had to do exactly what I am suggesting that you should do now:
modify your formalism to account for satisfiability.  In my work,
for example, that meant introducing a "satisfiability ordering"
on type schemes and replacing the conventional notion of principal
types with a concept of "principal satisfiable types".

Let me take this opportunity to reflect on my understanding of
your main contributions and the relationship to my work.  In my
general, formal system, simplification and improvement rules can
be used at any point during type inference, but there are no
language features---such as explicit type signatures or instance
declarations---that *require* the use of such rules.  As a result,
decidability, in the general case, is easy to establish, but it
is arguably not very useful.

In my paper, I suggested that, in practice, for any specific
application of improvement, a language designer would likely want
to adopt a more algorithmic approach that invokes suitably defined
"improvement" and "simplification" functions at specific points
during type inference (coupled, for example, with
generalization).  With this approach, soundness, completeness, and
decidability of type inference become dependent on the soundness
and termination properties of the improvement and simplification
functions.  This, I think, is the most significant technical issue
that you are addressing in your paper.  In concrete terms, I think
this comes down to a question of whether the entailment relation
C1 ||- C2 between contexts C1 and C2 is decidable.  I hadn't heard
of the "Paterson Conditions" before I saw the reference on the
Trac page (sorry Ross!), but it sounds like they are sufficient to
guarantee good theoretical properties but also clear and flexible
enough to be useful in practice.

 > I don't think that fullness is an onerous condition.

Well that's probably true, but requiring fullness means that
there's one extra concept that programmers have to grapple with,
and one extra hurdle that they have to overcome in those
situations where a non-full FD seems to provide a more natural
solution.

There are at least a couple of ways to avoid the non-full FD in
your example program.  For example, we could replace every use
of (F a b c) with a pair of constraints (F1 a b, F2 a c) where
the classes F1 and F2 are as follows:

   class F1 a b | a -> b
   class F2 a c

This is an example of what's called normalization in relational
databases.  Because the type class relations that we use in
practice are fairly simple (just a few columns) and because
normalized relations have some structural benefits, I'd suspect
that the use of normalized type class relations, like the F1+F2
pair above, is usually a hallmark of better design than the
non-normalized version (like the original version with just F).
Nevertheless, unless there are strong technical reasons to require
one version over the other, I'd prefer to leave choices like this
open so that individual programmers can make the most appropriate
call for a given application.
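For concreteness, here is a hedged sketch of the normalized design; the
thread gives only the two class headers, so the methods and the
instances (mirroring "instance F Int Bool Char") are my own
illustrative guesses:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- Normalized replacement for "class F a b c | a -> b":
-- F1 carries the dependency a -> b in full; F2 carries the
-- remaining column c, with no dependency at all.
class F1 a b | a -> b where
  f1 :: a -> b

class F2 a c where
  f2 :: a -> c

-- Illustrative instances, mirroring "instance F Int Bool Char":
instance F1 Int Bool where
  f1 = even

instance F2 Int Char where
  f2 _ = 'z'
```

A use site (F a b c) then becomes the pair (F1 a b, F2 a c), and the
dependency on F1 is trivially full.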

....

I'm sorry I haven't given you this kind of feedback previously.
Simon had invited me to take a look at the paper on at least one
previous occasion, but it had not found its way to the front of my
reading queue.  Even now, I've only skimmed the paper (e.g., I
started with the Trac web page, figured out that I needed to know
what a "full" FD might be, and then used a text search of the pdf
to locate the definition).  But I can see that there is a lot more
in the paper beyond the aspects we're discussing here, particularly
with respect to criteria for establishing termination, so I
apologize again if I've overlooked other relevant sections.

I hope my comments are still useful!

All the best,
Mark

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Martin Sulzmann
In reply to this post by Iavor Diatchki
Iavor Diatchki writes:
 > Hello,
 >
 > On 10/19/07, Martin Sulzmann <[hidden email]> wrote:
 > > Simon Peyton-Jones writes:
 > >  > ...
 > >  > Like you, Iavor, I find it very hard to internalise just why (B) and (C) are important.  But I believe the paper gives examples of why they are, and Martin is getting good at explaining it. Martin: can you give an example, once more, of the importance of (B) (=fullness)?
 > >  >
 > >
 > > Fullness (B) is a necessary condition to guarantee that the constraint
 > > solver (aka CHR solver) derived from the type class program is confluent.
 > >
 > > Here's an example (taken from the paper).
 > >
 > >   class F a b c | a->b
 > >   instance F Int Bool Char
 > >   instance F a b Bool => F [a] [b] Bool
 > >
 > > The FD is not full because the class parameter c is not involved in
 > > the FD. We will show now that the CHR solver is not confluent.
 > >
 > > Here is the translation to CHRs (see the paper for details)
 > >
 > >   rule F a b1 c, F a b2 d  ==> b1=b2      -- (FD)
 > >   rule F Int Bool Char    <==> True       -- (Inst1)
 > >   rule F Int a b           ==> a=Bool     -- (Imp1)
 > >   rule F [a] [b] Bool     <==> F a b Bool -- (Inst2)
 > >   rule F [a] c d           ==> c=[b]      -- (Imp2)
 > >
 > >
 > > The above CHRs are not confluent. For example,
 > > there are two possible CHR derivations for the initial
 > > constraint store F [a] [b] Bool, F [a] b2 d
 > >
 > >     F [a] [b] Bool, F [a] b2 d
 > > -->_FD (means apply the FD rule)
 > >     F [a] [b] Bool, F [a] [b] d , b2=[b]
 > > --> Inst2
 > >     F a b Bool, F [a] [b] d , b2=[b]         (*)
 > >
 > >
 > > Here's the second CHR derivation
 > >
 > >     F [a] [b] Bool, F [a] b2 d
 > > -->_Inst2
 > >     F a b Bool, F [a] b2 d
 > > -->_Imp2
 > >     F a b Bool, F [a] [c] d, b2=[c]           (**)
 > >
 > >
 > > (*) and (**) are final stores (ie no further CHR are applicable).
 > > Unfortunately, they are not logically equivalent (one store says
 > > b2=[b] whereas the other store says b2=[c]).
 >
 > But what is wrong with applying the following logical reasoning:
 >

Well, you apply the above CHRs from left to right *and* right to
left. In contrast, I apply the CHRs only from left to right.

 > Starting with (**):
 > F a b Bool, F [a] [c] d, b2=[c]
 > (by inst2)
 > F a b Bool, F [a] [c] d, b2=[c], F [a] [b] Bool
 > (by FD)
 > F a b Bool, F [a] [c] d, b2=[c], F [a] [b] Bool, [c] = [b]
 > (applying equalities and omitting unnecessary predicates)
 > F [a] [b] Bool, F [a] b2 d
 > (...and then follow your argument to reach (*)...)
 >
 > Alternatively, if we start at (*):
 > F a b Bool, F [a] [b] d , b2=[b]
 > (by inst2)
 > F a b Bool, F [a] [b] d , b2=[b], F [a] [b] Bool
 > (applying equalities, rearranging, and omitting unnecessary predicates)
 > F [a] [b] Bool, F [a] b2 d
 > (... and then follow your reasoning to reach (**) ...)
 >
 > So it would appear that the two sets of predicates are logically equivalent.
 >

I agree that the two sets

F a b Bool, F [a] [b] d , b2=[b]         (*)

and

F a b Bool, F [a] [c] d, b2=[c]           (**)

are logically equivalent wrt the above CHRs (see your proof).

The problem/challenge we are facing is to come up with a confluent and
terminating CHR solver. Why is this useful?
(1) We get a deterministic solver
(2) We can decide whether two constraints C1 and C2 are equal by
    solving (a) C1 -->* D1 and
            (b) C2 -->* D2
    and then checking that D1 and D2 are equivalent.
    The equivalence is a simple syntactic check here.
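Point (2) can be illustrated with a tiny confluent and terminating
rewrite system standing in for the CHR solver; the Peano-style rules
below are my own toy example, not from the paper:

```haskell
-- Terms of a toy "constraint" language.
data T = Z | S T | Add T T
  deriving (Eq, Show)

-- One left-to-right rewriting pass, using the rules
--   Add Z y     --> y
--   Add (S x) y --> S (Add x y)
-- which form a confluent, terminating system.
step :: T -> T
step Z             = Z
step (S t)         = S (step t)
step (Add Z y)     = step y
step (Add (S x) y) = S (step (Add x y))
step (Add x y)     = Add (step x) (step y)

-- Rewrite to a normal form (the final store).
normalize :: T -> T
normalize t = let t' = step t in if t' == t then t else normalize t'

-- Decide equivalence exactly as in (2): normalize both sides,
-- then compare with a simple syntactic check.
equiv :: T -> T -> Bool
equiv c1 c2 = normalize c1 == normalize c2
```

Non-confluence would break `equiv`: two equivalent inputs could reach
syntactically different final stores, which is precisely what happens
with (*) and (**).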

The CHR solving strategy applies rules in a fixed order (from left to
right). My example shows that under such a strategy the CHR solver
becomes non-confluent for type class programs with non-full FDs.

We can show non-confluence by having two derivations starting with the
same initial store leading to different final stores.

Recall:

     F [a] [b] Bool, F [a] b2 d
-->* F a b Bool, F [a] [b] d , b2=[b]         (*)


     F [a] [b] Bool, F [a] b2 d
-->* F a b Bool, F [a] [c] d, b2=[c]           (**)

I said

 > > (*) and (**) are final stores (ie no further CHR are applicable).
 > > Unfortunately, they are not logically equivalent (one store says
 > > b2=[b] whereas the other store says b2=[c]).

More precisely, I meant here that (*) and (**) are not logically
equivalent when we do *not* take the above CHRs into account.
This means that (*) and (**) are syntactically different.

 > > To conclude, fullness is a necessary condition to establish confluence
 > > of the CHR solver. Confluence is vital to guarantee completeness of
 > > type inference.
 > >
 > >
 > > I don't think that fullness is an onerous condition.
 >
 > I agree with you that, in practice, many classes probably use "full"
 > FDs.  However, these extra conditions make the system more
 > complicated, and we should make clear what exactly are we getting in
 > return for the added complexity.
 >

You can get rid of non-full FDs (see the paper). BTW, type functions
are full by construction.
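Martin's last remark can be made concrete with GHC's type families;
recasting the running example this way is my own sketch:

```haskell
{-# LANGUAGE TypeFamilies #-}

-- "class F a b c | a -> b" recast so that the dependency a -> b
-- becomes a type function.  The parameter c no longer appears in
-- the function at all, so no "non-full" dependency can arise.
type family FB a
type instance FB Int = Bool
type instance FB [a] = [FB a]

-- Term-level witness that FB [Int] reduces to [Bool]:
witness :: FB [Int]
witness = [True]
```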

Martin



Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Martin Sulzmann
In reply to this post by Mark P. Jones
Mark P Jones writes:
 > Hi All,
 >
 > Here are my responses to the recent messages, starting with some
 > summary comments:
 >
 > - I agree with Martin that the condition I posted a few days ago is
 >    equivalent to the "*refined* weak coverage condition" in your
 >    paper.  The "refined" tag here is important---I missed it the
 >    first time and thought he was referring to the WCC at the start of
 >    Section 6.1, and I would strongly recommend against using that
 >    (unrefined) version.  I confess that I haven't read the paper
 >    carefully enough to understand why you included that first WCC at
 >    all; I don't see any reason why you would want to use that
 >    particular condition in a practical implementation, and I don't
 >    see any advantages to it from a theoretical perspective either.
 >    (Maybe you just used it as a stepping stone to help motivate the
 >    refined WCC?)
 >

It's a stepping stone, but WCC on its own is useful to recover
termination (see the "zip" example in the paper).

 > - I believe that the refined WCC (or the CC or even the original WCC
 >    if you really want) are sufficient to ensure soundness.  I don't
 >    agree that the (B) restriction to "full" FDs is necessary.
 >
 > - I think it is dangerous to tie up soundness with questions about
 >    termination.  The former is a semantic property while the latter
 >    has to do with computability.  Or, from a different perspective,
 >    soundness is essential, while termination/decidability is "only"
 >    desirable.  I know that others have thought more carefully than
 >    I have about conditions on the form of class and instance
 >    declarations that will guarantee decidability and I don't have
 >    much to say on that topic in what follows.  However, I will
 >    suggest that one should start by ensuring soundness and then,
 >    *separately*, consider termination.  (Of course, it's perfectly
 >    ok---just not essential---if conditions used to ensure soundness
 >    can also be used to guarantee termination.)
 >


Well, decidability and completeness matter if we care about automatic
type inference.

Given some input program, we'd like to have a *decidable* algorithm
which computes a type for every well-typed program (*soundness*) and
yields failure if the program is ill-typed (*completeness*).

In "Simplifying and Improving Qualified Types" you indeed give
a sound, decidable and complete type inference algorithm
*if* the proof theory among constraints in some theory of qualified
types is sound, decidable and complete.

My main concern is to establish sufficient conditions such that the
proof theory is sound, decidable and complete. That is, there's a
decidable algorithm which says yes if QT |= C1 -> C2 and otherwise no,
where QT |= C1 -> C2 means that, in the qualified theory QT, constraint
C1 entails constraint C2.

I think that our works are fairly complementary. Or are you saying
you've already given such an algorithm (which I must indeed have
missed)?

I believe that you are offended by my statement that

 > Confluence is vital to guarantee completeness of type inference.

Let me clarify. Confluence is one of the sufficient conditions to
guarantee completeness of CHR-based type inference. Apologies for any
confusion caused by my previous statement.

There are clearly other approaches possible to obtain a sound,
decidable and complete proof theory (and therefore to obtain sound,
decidable and complete type inference). But I have yet to see concrete
results which match the CHR-based approach/results described in:

Understanding Functional Dependencies via Constraint Handling Rules
Martin Sulzmann, Gregory J. Duck, Simon Peyton-Jones and Peter
J. Stuckey
In Journal of Functional Programming, 2007

Martin


 > Stop reading now unless you're really interested in the details!
 > I'm going to start by explaining some key (but hopefully
 > unsurprising) ideas before I respond in detail to the example
 > that Martin posted.
 >
 > Interpreting Class, Instance, and Dependency Decls:
 > ---------------------------------------------------
 > I'll start by observing that Haskell's class, instance, and
 > functional dependency declarations all map directly to formulas in
 > standard predicate logic.  The correspondence is straightforward, so
 > I'll just illustrate it with some examples:
 >
 >    Declaration               Formula
 >    -----------               -------
 >    class Eq a => Ord a       forall a. Ord a => Eq a
 >    instance Eq a => Eq [a]   forall a. Eq a => Eq [a]
 >    class C a b | a -> b      forall a, b. C a b /\ C a c => b = c
 >
 > This correspondence between declarations and formulas seems to be
 > very similar to what you did in the paper using CHRs.  I haven't
 > read the paper carefully enough to know what extra mileage and/or
 > problems you get by using CHRs.  To me, predicate logic seems more
 > natural, but I don't think that matters here---I just wanted to
 > acknowledge that essentially the same idea is in your paper.
 >
 > In the following, assuming some given program P, I'll write I(P)
 > for the conjunction of all the instance declaration formulas in
 > P; C(P) for the conjunction of all the class declaration formulas;
 > and F(P) for the conjunction of the dependency declaration
 > formulas.
 >
 > Semantics and Soundness:
 > ------------------------
 > We'll adopt a simple and standard semantics in which each
 > n-parameter type class is described by an n-place relation, that is,
 > a set of n-tuples of types.  Of course, in the most common, one
 > parameter case, this is equivalent to viewing the class as a set of
 > types.  Each tuple in this relation is referred to as an "instance"
 > of the class, and the predicate syntax C t1 ... tn is just a
 > notation for asserting that the tuple (t1, ..., tn) is an instance
 > of C.
 >
 > We can assign a meaning to each of the classes in a given program P
 > by using the smallest model of the instance declaration formulas,
 > I(P).  (Existence of such a model is guaranteed; it is the union of
 > an increasing chain of approximations.)
 >
 > What about the superclass properties in C(P)?  We expect to use
 > information from C(P) to simplify contexts during type inference.
 > For example, if we're using the standard Haskell prelude, then we
 > know that we can simplify (Eq a, Ord a) to (Ord a).  But how do we
 > know this is sound?  Haskell answers this by imposing restrictions
 > (third bullet in Section 4.3.2 of the report) on the use of
 > instance declarations to ensure that our (least) model of I(P) is
 > also a model of C(P).  This guarantees, for example, that every
 > type in the semantics of "Ord" will, as required, also be included
 > in the semantics of "Eq".
 >
 > That should all seem pretty standard, but I've included it here
 > to make the point that we need to do something very similar when we
 > introduce functional dependencies.  Specifically, we want to be
 > able to use properties from F(P) to simplify/improve contexts
 > during type inference, so we have to ensure that this is sound.
 > If we follow the strategy that was used to ensure soundness of
 > C(P) in Haskell, then we have to come up with a restriction on
 > instance declarations to ensure that our model of I(P) is also
 > a model of F(P).  I claim that any of the three conditions we've
 > talked about previously---using your terminology, the CC, WCC or
 > refined WCC---are sufficient to ensure this soundness property.
 > I haven't completed a formal proof, but I think it should be a
 > pretty straightforward exercise in predicate logic.
 >
 > If a program fails to meet the criteria that ensure soundness of a
 > superclass declaration, the Haskell compiler should reject that
 > code.  Exactly the same thing should happen if the input program
 > doesn't satisfy the functional dependencies.  Further restrictions
 > are needed to ensure that there is a well-defined semantics (i.e.,
 > dictionary/evidence) for each type class instance; this is where
 > overlapping instances complicate the picture, for example.
 > However, for what we're doing here, it's sufficient to focus on
 > the semantics of classes (i.e., on which types are instances of
 > which classes) and to defer the semantics of overloaded operators
 > to a separate discussion.
 >
 > Responding to Martin's Example:
 > -------------------------------
 >
 >  > Here's an example (taken from the paper).
 >  >
 >  >   class F a b c | a->b
 >  >   instance F Int Bool Char
 >  >   instance F a b Bool => F [a] [b] Bool
 >
 > This is an unfortunate example because, in fact, the second instance
 > declaration doesn't contribute anything to the class F!  Using the
 > semantics I outlined above, the class F is just {(Int, Bool, Char)},
 > with only the one instance that comes from the first instance
 > declaration.  This is unfortunate because, in the presence of
 > functional dependencies---or, in general, any form of
 > "improvement"---you have to take account of satisfiability.  In the
 > example here, any predicate of the form F a b Bool is
 > unsatisfiable, and so any time we attempt to transform an inferred
 > context containing a predicate of this form into some "simplified"
 > version, then there is a sense in which all we are doing is proving
 > that False is equivalent to False (or that anything follows from a
 > contradiction).
 >
 > In what follows, let's refer to the program above as Q1.  However,
 > to explore a case where the predicates of interest are satisfiable,
 > let's also consider an extended version, Q2, that adds one extra
 > instance declaration:
 >
 >      instance F Char Int Bool
 >
 > Now the semantics of Q2 looks something like:
 >
 >   { (Int, Bool, Char),
 >     (Char, Int, Bool),
 >     ([Char], [Int], Bool),
 >     ([[Char]], [[Int]], Bool),
 >     ([[[Char]]], [[[Int]]], Bool),
 >     ([[[[Char]]]], [[[[Int]]]], Bool), ... }
 >
 >  > The FD is not full because the class parameter c is not involved in
 >  > the FD. We will show now that the CHR solver is not confluent.
 >
 > Although it's just one example, it is worth pointing out that both
 > Q1 and Q2 lead to a semantics for F that is consistent/sound with
 > respect to the declared functional dependency.  So far as our
 > semantics is concerned, the fact that the dependency is not full
 > is not significant.
 >
 > I don't dispute your claim that fullness is necessary to establish
 > confluence with CHRs.  But that is a consequence of the formal
 > proof/rewrite system that you're using, and not of the underlying
 > semantics.
 >
 >  > Here is the translation to CHRs (see the paper for details)
 >  >
 >  >   rule F a b1 c, F a b2 d  ==> b1=b2      -- (FD)
 >  >   rule F Int Bool Char    <==> True       -- (Inst1)
 >  >   rule F Int a b           ==> a=Bool     -- (Imp1)
 >  >   rule F [a] [b] Bool     <==> F a b Bool -- (Inst2)
 >  >   rule F [a] c d           ==> c=[b]      -- (Imp2)
 >
 > I'm a little surprised that you need Imp1 and Imp2 here.  I didn't
 > include analogs of these when I gave a predicate logic rendering
 > for dependency declarations above because they can be obtained as
 > derived properties.  Imp1, for example:
 >
 >      Suppose F Int a b.
 >      From Inst1, we know that F Int Bool Char.
 >      Hence by FD, we know that a = Bool.
 >
 > My best guess is that you included the Imp1 and Imp2 rules because,
 > although the "<==>" symbol suggests the possibility of bidirectional
 > rewriting, you actually want to view the above as oriented rewrite
 > rules?  From my perspective, however, the FD rule is the definition
 > of the functional dependency on F, and everything that you want to
 > know about this dependency can be derived from it ...
 >

Re: Re: [Haskell] [Fwd: undecidable & overlapping instances: a bug?]

Claus Reinke
In reply to this post by Mark P. Jones
Hi Mark,

>   Declaration               Formula
>   -----------               -------
>   class Eq a => Ord a       forall a. Ord a => Eq a
>   instance Eq a => Eq [a]   forall a. Eq a => Eq [a]
>   class C a b | a -> b      forall a, b. C a b /\ C a c => b = c
>
> This correspondence between declarations and formulas seems to be
> very similar to what you did in the paper using CHRs.  I haven't
> read the paper carefully enough to know what extra mileage and/or
> problems you get by using CHRs.  To me, predicate logic seems more
> natural, but I don't think that matters here---I just wanted to
> acknowledge that essentially the same idea is in your paper.

i wondered about that "extra mileage" question when i first
encountered chrs and the fd-to-chr translations - here's my
take on it:

while chr started out with a predicate logic semantics, it helps
to think of them as representing a subset of linear logic instead
(which is where recent chr semantics are headed). other
relevant aspects are: committed choice language (no reliance
on backtracking), no implicit unification (left-hand sides of
rules are matched, unification has to be called explicitly in
right-hand sides, *after* committing to a rule).

what is helpful about using chr, then, is that they are closer
to implementations of functional dependencies, making it
easier to expose/discuss some general issues relevant to
implementations but not to abstract semantics (such as
incomplete and changing knowledge due to separate
compilation, or some implementations risking the loss of
information), without having to wade through all other
implementation-level details.

in other words, chr expose a few more details, so more things
can go wrong; which is useful, because one can discuss how
to avoid these wrongs, providing more concrete guidelines
for implementations. they also avoid exposing details not
(currently) relevant to the application, as a mapping to prolog
would.

> I don't dispute your claim that fullness is necessary to establish
> confluence with CHRs.  But that is a consequence of the formal
> proof/rewrite system that you're using, and not of the underlying
> semantics.

i tried to argue that case (the confluence issues are due to the
translation, not inherent in functional dependencies) last year,
even suggested an alternate translation of fd to chr. something
like this (the 'TC a b' should be 'TC a1..an', i think, and the
notation is meant to correspond to figure 2 in the jfp paper):

http://www.haskell.org//pipermail/haskell-prime/2006-March/000893.html

simplifying slightly, the alternate translation uses an embedding
of classical into linear logic, so knowledge about fds never
decreases as long as it remains consistent, whereas the jfp
paper translation works directly in a linear logic (<==> is like
linear implication -o, consuming its pre-conditions), so applying
inference steps in the "wrong" order can lose knowledge
about fds, leading to loss of confluence (unless additional
restrictions are introduced).

if i recall correctly, the paper also relied on confluence as
one possible means to ensure termination, meaning that
results about termination or results relying on termination
were to some extent tied up with the extra conditions used
to ensure confluence. i believe that fixing the translation to
chr would get rid of the confluence issues discussed in the
paper altogether, so it could have focused on the
remaining issues, like termination, without distraction.

frustratingly, while the paper was still a draft at the time,
the discussion came too late to have an impact on the
version published in jfp.

hope this helps,
claus
