Can cardinality analysis subsume strictness analysis

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Can cardinality analysis subsume strictness analysis

Joachim Breitner-2
Hi,

after playing a bit with the Demand analyser and trying debug tricky
bugs (#8569), I begin to wonder: Can cardinality analysis subsume
strictness analysis?

Not directly, obviously, but if we change

data Count = One | Many
to
data Count = AtMostOne | ExactlyOne | AtLeastOne | Many
(or some other form that would allow us to express all possible
intervals in {0,1,?}), would the demand and cardinality analysis then
not subsume strictness, given that:
        A value is used strictly if is definitely used at least once?

That might simplify the Demand.lhs code, and also the demand signatures,
quite a bit, because currently it carries strictness and demand data
around separately, duplicating lots of stuff.

Greetings,
Joachim


--
Joachim ?nomeata? Breitner
  mail at joachim-breitner.de ? http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  ? GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20131129/4dde7d1f/attachment.sig>

Reply | Threaded
Open this post in threaded view
|

Can cardinality analysis subsume strictness analysis

Joachim Breitner-2
Hi,

Am Freitag, den 29.11.2013, 11:54 +0000 schrieb Joachim Breitner:
> after playing a bit with the Demand analyser and trying debug tricky
> bugs (#8569), I begin to wonder: Can cardinality analysis subsume
> strictness analysis?

this is a more structured attempt to answer that question.

The motivation for this proposal is that I find it easier to think about
what the lubDemand, bothDemand etc. should be if I mostly have to think
about how often something is called (type Count below), instead of
thinking about Demand&Cardinality and Strictness (which feel very
similar to me) separately.

Currently we have these data structures (Demand.lhs, slightly simplified
by removing field names and type aliases):

data DmdType = DmdType (VarEnv JointDmd) [Demand] DmdResult
data CPRResult = NoCPR | RetProd | RetSum ConTag | BotCPR            

data Demand = JD MaybeStr MaybeUsed
data MaybeStr = Lazy | Str StrDmd
data StrDmd = HyperStr | SCall | SProd [MaybeStr] | HeadStr

data MaybeUsed = Abs | Use Count UseDmd
data UseDmd = UCall Count UseDmd | UProd [MaybeUsed] | UHead | Used                
data Count = One | Many

I suggest the following date type to replace the latter six above:

data CDemand = BotDemand | Use Count Demand -- C for cardinality
data Demand =  Call Count Demand | Prod [CDemand] | PolyUse | TopDemand
data Count = NoIdea | Zero | AtMostOne | AtLeastOne | ExactlyOne
  -- interesting intervals of {0, 1, ?}, corresponding to
             {0,1,?} {0}     {0,1}       {1,?}        {1}

Invariants and equations:
  * "Use Zero x" implies x = TopDemand. Therefore in the following:
     useZero = useZero TopDemand
  * TopDemand = Prod [Use NoIdea TopDemand,...] and
    TopDemand = Call (Use NoIdea TopDemand), but as ever we can use the
    latter form for piggy-backed boxity analysis.
  * BotDemand is the demand put on something that diverges for sure. So
    it is not only the ?bottom of the demand lattice? but also ?the
    demand of something bottoming? ? either way of thinking works.

And here is a transformation from old to new:

combine1 :: MaybeStr -> MaybeUsed      -> CDemand
 Lazy Abs                              -> useZero
 Lazy (Use c d)                        -> Use c' (convertLazy d)
   where c' = case c of One ->  AtMostOne
                        Many -> NoIdea
 (Str HyperStr) Abs                    -> BotDemand  
 (Str HyperStr) _                      -> ? not representable ? (Note 1)
 (Str _) Abs                           -> ? not representable ? (Note 2)
 (Str s) (Use c d)                     -> Use c' (combine2 s d)
   where c' = case c of One ->  ExactlyOne
                        Many -> AtLeastOne

convertLazy :: UseDMD                  -> Demand
 UCall c d                             -> Call c' (convertLazy d)
   where c' = case c of One -> AtMostOne -- see Note 4
                        Many -> NoIdea
 UProd ds                              -> UProd (map convertLazy ds)
 UHead                                 -> PolyUse useZero
 Used                                  -> TopDemand

combine2 :: StrDmd -> UseDMD           -> Demand
 SCall      (UCall Many d)             -> Call NoIdea (convertLazy d)
 SCall      (UCall One d)              -> Call (AtMostOne (convertLazy d))
 SCall      (UProd _)                  -> ? not representable ? (Note 3)
 SCall      (UHead _)                  -> PolyUse
 SCall      Used                       -> Call TopDemand
 (SProd ss) (UCall _ _)                -> ? not representable ? (Note 3)
 (SProd ss) (UProd ds)                 -> Prod (zipwith combine1 ss ds)
 (SProd ss) (UHead _)                  -> Prod (zipwith combine1 ss (repeat Abs))
 (SProd ss) Used                       -> Prod (zipwith combine1 ss (repeat Used))
 HeadStr    (UCall Many d)             -> Call NoIdea (convertLazy d)
 HeadStr    (UCall One d)              -> Call (AtMostOne (convertLazy d)
 HeadStr    (UProd ds)                 -> Prod (zipwith combine1 (repeat Lazy) ds)
 HeadStr    (UHead _)                  -> PolyUse
 HeadStr    Used                       -> PolyUse
 HyperStr   _                          -> ? dead code ? (handled by combine1)
 

Note 1: About HyperStr
        As far as I can see, this strictness demand is the one caused by
        diverging calls (error or non-termination), and is only relevant
        when it comes with a Use demand of Abs
Note 2: Strict but absent
        Besides diverging things (who currently put a hyper-strict &
        absent demand on free variables), nothing can be strict and
        absent, right?
Note 3: Disagreeing signatures
        When the strictness side sees a Call, the the demand side better
        not see a Product, and vica versa
Note 4: The Count in Call
        is currently either AtMostOne or NoIdea, i.e. there is notion of
        ?function is not called at all?. Does that correctly reflect
        what we have right now?


And here is the lattice. The nice thing is that there is a lattice for
Counts that is mostly independent from the other lattice, at least in
terms of thinking about it.
   
           NoIdea
            /  \
           /    \
   AtMostOnce   AtLeastOnce
     /   \        /
   Zero  ExactlyOne


          TopDemand
           /      \
    Use c Call  Use c Prod
           \      /
         Use c PolyUse
              |
          BotDemand


Some thoughts:

The lattices (both existing and proposed) are mostly complicated by two
special cases:
 * bottoming stuff with hyperstrict demand and
 * polymorphic seq.

As mentioned above, (HyperStr, Abs), i.e. the demand of error, becomes
BotDemand. We have no way of representing (HyperStr, Used), for example.
Would that occur?

For seq, I propose to use PolyUse (for ?unknown polymorphic use?). This
goes well the current use of (HeadStr, UHead) for seq. The strange is,
though, that HeadStr lives _above_ SCall und SProd, while UHead lives
_below_ UCall and UProd. This means that currently,
  case b of True -> x `seq` ()
            False -> case x of (a,b) -> ()
has a joint demand of d=(HeadStrict, UProd [Abs, Abs]) on x ? why not
d'=(SProd [Lazy, Lazy], UProd [Abs, Abs])? Anyways, in my lattice both
of these demands correspond to (Prod [absZero, absZero]), i.e. my
proposal hides the current differences. Are these wanted differences or
unwanted ones?


Some advantages of the code are:
 * Less duplication between StrDemand and UseDemand.
 * Non-redundant information about what is a function and what is a
   product.
 * Functions like cleanDemand simply take out Demand from CDemand, and
   the Count is the information that one needs to pass to deferAndUse,
   so less logic and less cases there.


So my proposal is not an isomorphism of the existing lattice (as
expected), and can neither express all existing combinations, nor
differentiate between all of them. Which is a good thing, iff those
values and differences are not desired.

Surely more interesting corner cases will turn up when and if one tries
to employ this scheme. Also, of course, there is still improvement for
further polishing. The question is: Is this worth pursuing, or is there
a reason why this cannot work, or maybe it would but it would not be an
improvement?


Thanks,
Joachim




--
Joachim ?nomeata? Breitner
  mail at joachim-breitner.de ? http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  ? GPG-Key: 0x4743206C
  Debian Developer: nomeata at debian.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20131202/d29e5fba/attachment-0001.sig>