HughesPJ vs. Wadler-Leijen

HughesPJ vs. Wadler-Leijen

Evan Laforge
I've been trying to get my head around the Wadler-Leijen pretty
printing combinators for a while now and having some trouble.

Specifically, I have trouble getting them to pick optimal line breaks.
 The existing combinators like 'sep' (and everything built from it)
merge all elements with <$> and then 'group' the whole thing, with the
result that they either all go on one line, or get one line each.
This is quite ugly for large lists of small elements.  The other
alternative is 'fillSep', which does a separate 'group' on each
element.  Unfortunately, it then tends to make very bad line wrapping
decisions, e.g. you get:

Rec { hi = "there" }
Rec
  { hi = "there", hi = "there"
  , hi = "there"
  }
Rec
  { lab = "short", label =
                     [ 0, 1, 2
                     , 3, 4, 5
                     , 6, 7, 8
                     , 9, 10
                     , 11, 12
                     ]
  }
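
To make the difference between the two concrete, here is a small sketch. It assumes the wl-pprint package's Text.PrettyPrint.Leijen module (other WL variants may use different module names), and `render` and `items` are names made up for the example.

```haskell
import qualified Text.PrettyPrint.Leijen as PP

-- Hypothetical helper: render at a given page width with the ribbon
-- fraction set to 1.0, so that only the width limits each line.
render :: Int -> PP.Doc -> String
render width doc = PP.displayS (PP.renderPretty 1.0 width doc) ""

-- 0, 1, ..., 20 with a comma attached to every element but the last.
items :: [PP.Doc]
items = PP.punctuate PP.comma (map PP.int [0 .. 20])

main :: IO ()
main = do
    -- sep = group . vsep: a single group around every break, so the list
    -- is either entirely on one line or has one element per line.
    putStrLn (render 30 (PP.sep items))
    putStrLn ""
    -- fillSep puts each break in its own group, so every break is decided
    -- separately and each line is filled as far as it will go.
    putStrLn (render 30 (PP.fillSep items))
```

At a width of 30 the 'sep' version cannot fit on one line, so it falls back to one element per line, while the 'fillSep' version packs several elements onto each line.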

No matter how many fancy 'group's and 'nest's and whatnot I threw in,
it was always a choice between forcing wrapping on every element and
looking ugly for many small elements, or trying to fit more into one
line and having it wrap in the wrong place and wind up scrunched up on
the right margin.  Here's my latest attempt:

list = commas PP.lbracket PP.rbracket . map format

commas :: Doc -> Doc -> [Doc] -> Doc
commas left right xs = PP.group $
    left <+> punctuate (\x -> PP.group (x <$$> PP.comma <> PP.space)) xs
        <$> right

punctuate :: (Doc -> Doc) -> [Doc] -> Doc
punctuate f [] = mempty
punctuate f [x] = x
punctuate f (x:xs) = f x <> punctuate f xs

record :: Doc -> [(String, Doc)] -> Doc
record title fields = PP.group $
    PP.hang 2 $ title <$> (commas PP.lbrace PP.rbrace (map f fields))
    where
    f (label, field) = PP.hang 2 $ PP.group $
        PP.text label <+> PP.equals <$> field

But the thing is, the HughesPJ-using Language.Haskell.Pretty in
haskell-src gets the line wrapping just right.  So I investigated how
it works, and it's very simple, here's the reduced version:

class Pretty a where format :: a -> Doc

list :: (Pretty a) => [a] -> Doc
list = bracket_list . PP.punctuate PP.comma . map format

fsep' :: [Doc] -> Doc
fsep' [] = PP.empty
fsep' (d:ds) = PP.nest 2 (PP.fsep (PP.nest (-2) d:ds))

bracket_list :: [Doc] -> Doc
bracket_list = PP.brackets . PP.fsep

brace_list :: [Doc] -> Doc
brace_list = PP.braces . PP.fsep

record :: Doc -> [(String, Doc)] -> Doc
record title fields = title <> (brace_list (map field fields))
    where
    field (name, val) = fsep' [PP.text name, PP.equals, val]

----

This formats records like so:

Rec{hi = "there"}
Rec{hi = "there" hi = "there"
    hi = "there"}
Rec{label =
      [0, 1, 2, 3, 4, 5, 6, 7,
       8, 9, 10, 11, 12, 13,
       14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24, 25,
       26, 27, 28, 29, 30]
    label =
      [0, 1, 2, 3, 4, 5, 6, 7,
       8, 9, 10, 11, 12, 13,
       14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24, 25,
       26, 27, 28, 29, 30]}
Rec{lab = "short"
    label =
      [0, 1, 2, 3, 4, 5, 6, 7,
       8, 9, 10, 11, 12, 13,
       14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24, 25,
       26, 27, 28, 29, 30]}
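
The hanging indent under "label =" in the output above comes from the negative nest in fsep'. The sketch below isolates that trick using Text.PrettyPrint.HughesPJ from the pretty package; `narrow` and `ws` are hypothetical names, and the 12-column page is only there to force wrapping.

```haskell
import qualified Text.PrettyPrint.HughesPJ as PP

-- fsep' as in the post: the outer nest indents every line by 2, and the
-- nest (-2) on the first element cancels that for the first line only.
fsep' :: [PP.Doc] -> PP.Doc
fsep' [] = PP.empty
fsep' (d : ds) = PP.nest 2 (PP.fsep (PP.nest (-2) d : ds))

-- Hypothetical helper: render on a narrow page so that wrapping is forced.
narrow :: PP.Doc -> String
narrow = PP.renderStyle PP.style { PP.lineLength = 12, PP.ribbonsPerLine = 1 }

ws :: [PP.Doc]
ws = map PP.text ["alpha", "beta", "gamma", "delta", "epsilon"]

main :: IO ()
main = do
    putStrLn (narrow (PP.fsep ws))  -- continuation lines start at column 0
    putStrLn ""
    putStrLn (narrow (fsep' ws))    -- continuation lines are indented by 2
```

In HughesPJ, 'nest' only matters for text that starts a new line, so the net effect is "first element at the current column, everything that wraps indented by 2".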

Much better, even if it's not my preferred style!  Of course WL
doesn't have fsep or the negative nest craziness (I don't even know
what it's doing there), but it has the more general 'group'.  However,
no matter how complicated I got with WL it just never came out right.
In contrast, a very simple HughesPJ implementation gets it right.  The
thing is, when trying to figure out which pretty print library to use,
the consensus is that WL is just all around better, even though
HughesPJ is somewhat standard (but there are 6 WL variants on hackage,
and no (?) HughesPJ ones).  So am I just using it wrong?  If I
translate the HughesPJ one over directly into WL, here's what I get:

Rec{hi = "there"}
Rec{hi = "there", hi =
  "there", hi = "there"}
Rec{label = [0, 1, 2, 3, 4, 5,
  6, 7, 8, 9, 10, 11, 12, 13,
  14, 15, 16, 17, 18, 19, 20,
  21, 22, 23, 24, 25, 26, 27,
  28, 29, 30], label = [0, 1,
  2, 3, 4, 5, 6, 7, 8, 9, 10,
  11, 12, 13, 14, 15, 16, 17,
  18, 19, 20, 21, 22, 23, 24,
  25, 26, 27, 28, 29, 30]}
Rec{lab = "short", label = [0,
  1, 2, 3, 4, 5, 6, 7, 8, 9,
  10, 11, 12, 13, 14, 15, 16,
  17, 18, 19, 20, 21, 22, 23,
  24, 25, 26, 27, 28, 29, 30]}

So is WL really all it's cracked up to be?  Am I using it wrong?  I
was going to suggest some consolidation in the pretty printing library
packages, but now I'm not even sure which style should "win"...

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: HughesPJ vs. Wadler-Leijen

Stephen Tetley-2
A quick suggestion - does setting the ribbon_frac to something like
0.8 improve things?

The Show instance for wl-pprint's Doc uses 0.4 which I've found too low.

This means you'll have to write your own display function using
`renderPretty`...
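
A minimal sketch of such a display function, assuming wl-pprint's Text.PrettyPrint.Leijen; `renderWith` and `numbers` are hypothetical names used only for the example.

```haskell
import qualified Text.PrettyPrint.Leijen as PP

-- Hypothetical replacement for 'show': the caller picks the ribbon
-- fraction and the page width instead of relying on the Show instance.
renderWith :: Float -> Int -> PP.Doc -> String
renderWith ribbonFrac width doc =
    PP.displayS (PP.renderPretty ribbonFrac width doc) ""

numbers :: PP.Doc
numbers = PP.brackets (PP.fillSep (PP.punctuate PP.comma (map PP.int [0 .. 30])))

main :: IO ()
main = do
    putStrLn (renderWith 0.4 40 numbers)  -- ribbon of 16 characters
    putStrLn ""
    putStrLn (renderWith 0.8 40 numbers)  -- ribbon of 32 characters
```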

Re: HughesPJ vs. Wadler-Leijen

Ivan Lazar Miljenovic
On 20 March 2012 20:24, Stephen Tetley <[hidden email]> wrote:
> A quick suggestion - does setting the ribbon_frac to something like
> 0.8 improve things?
>
> The Show instance for wl-pprint's Doc uses 0.4 which I've found too low.
>
> This means you'll have to write your own display function using
> `renderPretty`...

I also found a few spacing/indentation related bugs in WL when I was
writing wl-pprint-text; does it work better for you?


--
Ivan Lazar Miljenovic
[hidden email]
http://IvanMiljenovic.wordpress.com

Re: HughesPJ vs. Wadler-Leijen

Stephen Tetley-2
Hi Ivan

I haven't found any bugs in WL, however I do find the API somewhat
confusing regarding line breaking (I would need to consult the manual
to tell you the difference between linebreak, softline etc.). This is
likely my failing rather than WL as usually I want formatting - "I
know the layout" - rather than pretty printing - "the fit function
finds the best layout".

I think there is room in the design space for a library whose API
"favors" formatting rather than pretty-printing. I.e it has line
printing that cannot be undone by `group`, or the combinators that use
group are given more long-winded makes to make them secondary. I've
bits and bobs on the go to do this, but nothing near a concrete
library.

Re: HughesPJ vs. Wadler-Leijen

Stephen Tetley-2
Ahem - there was a severe typo in my last message. Usually I wouldn't
spam the list to repair my failings but edit distance on the error in
that message was so large it made no sense at all.


> printing that cannot be undone by `group`, or the combinators that use
> group are given more long-winded **names** to make them secondary. I've
> bits and bobs on the go to do this, but nothing near a concrete
> library.

Apologies to all.

(Funny how I can spot typos after the fact...)

Re: HughesPJ vs. Wadler-Leijen

Evan Laforge
In reply to this post by Stephen Tetley-2
On Tue, Mar 20, 2012 at 2:24 AM, Stephen Tetley
<[hidden email]> wrote:
> A quick suggestion - does setting the ribbon_frac to something like
> 0.8 improve things?

Nope.  The ribbon (IMO both an undescriptive name and underdocumented)
only constrains the number of non-indent characters per line.  So it
makes the line breaks in different places, but the underlying problem
of it not knowing where lines should be broken remains.

> The Show instance for wl-pprint's Doc uses 0.4 which I've found too low.

It's off the subject, but I always thought 'ribbon' was odd as the
single knob available.  I never really saw a rationale for why it's so
important.  The old Hughes-PJ paper says it looks nice to have a
"ribbon" of text snaking across the page, but I think it looks nice to
preserve vertical space by filling lines as much as possible.
Difference of opinion I guess.
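
For comparison, HughesPJ exposes the same knob as 'ribbonsPerLine' in its Style record, where the ribbon length is the line length divided by that value. The sketch below uses Text.PrettyPrint.HughesPJ from the pretty package; `renderRibbons`, `numbers` and `inkWidth` are hypothetical names.

```haskell
import qualified Text.PrettyPrint.HughesPJ as PP

-- Hypothetical helper: 60-column page, ribbon length = 60 / ribbons.
renderRibbons :: Float -> PP.Doc -> String
renderRibbons ribbons =
    PP.renderStyle PP.style { PP.lineLength = 60, PP.ribbonsPerLine = ribbons }

-- A filled list of numbers, indented by 10 columns.
numbers :: PP.Doc
numbers = PP.nest 10 (PP.fsep (PP.punctuate PP.comma (map PP.int [0 .. 30])))

-- Length of a line once its leading indentation is dropped.
inkWidth :: String -> Int
inkWidth = length . dropWhile (== ' ')

main :: IO ()
main = do
    putStrLn (renderRibbons 3 numbers)  -- at most 20 non-indent characters per line
    putStrLn ""
    putStrLn (renderRibbons 1 numbers)  -- lines may run to the full 60 columns
```

With three ribbons per line the text stays in a 20-character band however far it is indented; with one ribbon per line only the page width limits it, which is the "fill the line" behaviour.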

Re: HughesPJ vs. Wadler-Leijen

Evan Laforge
In reply to this post by Stephen Tetley-2
On Tue, Mar 20, 2012 at 6:52 AM, Stephen Tetley
<[hidden email]> wrote:
> Hi Ivan
>
> I haven't found any bugs in WL, however I do find the API somewhat
> confusing regarding line breaking (I would need to consult the manual
> to tell you the difference between linebreak, softline etc.). This is
> likely my failing rather than WL as usually I want formatting - "I
> know the layout" - rather than pretty printing - "the fit function
> finds the best layout".

Yeah, the 'group' combinator is at the center of it, but it took me
some fiddling around to get a feel for how it worked... and I still
don't have a feel for how it works when nested (as it is pervasively
if you use fillSep and the like).  Maybe it's elegantly minimal, but
it doesn't seem to be that intuitive, unless someday I come to a
realization that clears it all up.  The thing is, I don't think there
is a fit function that finds the best layout, I think it simply does
what the composition of groups tells it to do.
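
A cut-down Wadler-style printer makes that mechanism visible. This is a sketch, not wl-pprint's source: Doc, Mode, pretty, fits and list are names chosen for the example, and the handling of a group that follows the one being tested is simplified (it is assumed to break). Each 'group' is resolved greedily, left to right: lay it out flat if everything up to the next real line break fits in the remaining width, otherwise break all of its own lines.

```haskell
-- A cut-down Wadler-style document type (hypothetical, not a library API).
data Doc
    = Nil
    | Text String
    | Line          -- a newline, or a single space when flattened
    | Doc :<> Doc
    | Nest Int Doc
    | Group Doc     -- try the contents flat first, otherwise break

infixr 6 :<>

data Mode = Flat | Break

-- | Lay out a document for the given page width.
pretty :: Int -> Doc -> String
pretty width doc = go 0 [(0, Break, doc)]
  where
    go _ [] = ""
    go col ((i, mode, d) : rest) = case d of
        Nil -> go col rest
        Text s -> s ++ go (col + length s) rest
        x :<> y -> go col ((i, mode, x) : (i, mode, y) : rest)
        Nest j x -> go col ((i + j, mode, x) : rest)
        Line -> case mode of
            Flat -> ' ' : go (col + 1) rest
            Break -> '\n' : replicate i ' ' ++ go i rest
        Group x
            | fits (width - col) ((i, Flat, x) : rest) -> go col ((i, Flat, x) : rest)
            | otherwise -> go col ((i, Break, x) : rest)

-- | Does everything up to the next real line break fit in the space left?
fits :: Int -> [(Int, Mode, Doc)] -> Bool
fits remaining _ | remaining < 0 = False
fits _ [] = True
fits remaining ((i, mode, d) : rest) = case d of
    Nil -> fits remaining rest
    Text s -> fits (remaining - length s) rest
    x :<> y -> fits remaining ((i, mode, x) : (i, mode, y) : rest)
    Nest j x -> fits remaining ((i + j, mode, x) : rest)
    Line -> case mode of
        Flat -> fits (remaining - 1) rest
        Break -> True
    Group x -> fits remaining ((i, mode, x) : rest)

-- | A bracketed list under ONE group: all flat, or one item per line.
list :: [Doc] -> Doc
list items = Group (Text "[" :<> Nest 1 (commaSep items) :<> Text "]")
  where
    commaSep [] = Nil
    commaSep [x] = x
    commaSep (x : xs) = x :<> Text "," :<> Line :<> commaSep xs

nested :: Doc
nested = list [list (map (Text . show) [n .. n + 3]) | n <- [0, 4, 8 :: Int]]

main :: IO ()
main = do
    putStrLn (pretty 80 nested)
    putStrLn (pretty 30 nested)
```

At width 80 everything fits on one line. At width 30 the outer group fails its fit test and breaks, after which each inner list is tested on its own and stays flat, giving one inner list per line. There is no global search for a best layout here: the result follows directly from where the groups were placed.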

> I think there is room in the design space for a library whose API
> "favors" formatting rather than pretty-printing. I.e it has line
> printing that cannot be undone by `group`, or the combinators that use
> group are given more long-winded makes to make them secondary. I've
> bits and bobs on the go to do this, but nothing near a concrete
> library.

What I think I would like is some way to express a hierarchy of line
breaks.  So if I'm formatting a list, there's a break before/after
each comma and they are all equally good breaks.  But then if I nest
and format a list of lists, the outer breaks are considered better
breaks than the inner ones.  This would preserve the hierarchical
structure of the data by trying to break on the largest chunks first,
and control indentation too.  In fact, HughesPJ's fsep (or maybe it's
the 'best' in renderStyle) seems to get that right all on its own.

I also have a personal style that's hard to reconcile with the pprint
combinators, namely that lists that fit on one line don't have spaces
around the brackets: [1, 2, 3], but ones that must be wrapped do, and
the close bracket lines up with the open one:

[ 1, 2
, 3, 4
]
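
One way this style might be expressed is with a combinator that picks different text depending on whether the enclosing group was flattened. The sketch below assumes the prettyprinter package (a later library than the ones discussed in this thread) and its 'flatAlt'; `styledList` and `render` are hypothetical names, and the layout for the wrapped case is what the sketch aims for rather than a checked result.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Prettyprinter
import Prettyprinter.Render.String (renderString)

-- Hypothetical combinator: "[1, 2, 3]" when the whole list fits on one
-- line; otherwise "[ " after the opening bracket, commas leading each
-- continuation line, and the closing bracket on a line of its own.
styledList :: [Doc ann] -> Doc ann
styledList [] = "[]"
styledList (x : xs) = align . group $
    flatAlt "[ " "[" <> x <> mconcat (map item xs) <> line' <> "]"
  where
    -- Each separator is its own group: ", " if the next element still
    -- fits on the current line, otherwise a line break followed by ", ".
    item d = group (flatAlt (hardline <> ", ") ", ") <> d

-- Hypothetical helper: render to a String at the given page width.
render :: Int -> Doc ann -> String
render width =
    renderString . layoutPretty (LayoutOptions (AvailablePerLine width 1.0))

main :: IO ()
main = do
    putStrLn (render 80 (styledList (map pretty [1 .. 3 :: Int])))
    putStrLn (render 8 (styledList (map pretty [1 .. 4 :: Int])))
```

The 'align' keeps the leading commas and the closing bracket in the column of the opening bracket when the list is itself nested inside something else.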


One other thing I've thought about is to have a pretty printer have
the option of returning a list of Docs, in increasing detail.  Then a
smart viewer (perhaps HTML + JS) could let you expand things by
clicking on them.  Or maybe it would be more practical to just teach
vim or emacs the output syntax and let folding take care of it.
