# PrefixMap: code review request

39 messages
12
Open this post in threaded view
|

 Hi, I'm a newish Haskell hacker with way too much experience hacking   Lisp.    At first, I found my Haskell code looking very lisp-y.   I   think my code is becoming more idiomatic haskell.  I would be very   grateful to anyone who would take a glance at the code below and   point out any un-idiomatic usages that jump out.  It's a small module   from a program which looks for palindromes in a list of words.   Thanks very much. Cheers, David \begin{code} {-# OPTIONS -O2 -optc-O3 #-} module PrefixMap (PrefixMap,fromDistinctAscPairList,searchMap) where import Data.List import qualified Data.Map as Map import Test.HUnit \end{code} The PrefixMap datastructure implements a Prefix Tree which allows a key/value relationship. \begin{code} data PrefixMap k v = Node (Maybe v) (Map.Map k (PrefixMap k v))                       deriving (Show) \end{code} A PrefixMap is built from an alphabet enumerating the possible constituents of keys and a list of pairs of keys and objects. A key is a string of elements of the alphabet.  The list must be distinct and in ascending order.  The constraint is not checked. \begin{code} fromDistinctAscPairList :: Ord k => [k]->[([k],v)]->PrefixMap k v fromDistinctAscPairList alphabet pairList =      build alphabet Nothing (partList pairList alphabet) partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] partList pairs alphabet = reverse . fst $foldl' f ([],pairs) alphabet where f (result,pairs) l = (result',rest) where (part,rest) = span ((==l) . head . fst) pairs result' = if null part then result else (l,part):result build :: Ord k => [k]->(Maybe v)->[(k,[([k],v)])]->(PrefixMap k v) build alphabet val pairs = Node val$ Map.fromDistinctAscList treePairs      where treePairs = [(c,mkITree l)|(c,l)<-pairs]           mkITree l = build alphabet x (partList l' alphabet)               where (x,l') = findNode $snipKeys l snipKeys :: Ord k => [([k],v)]->[([k],v)] snipKeys l = [(k,v) | (_:k,v) <- l] findNode :: Ord k => [([k], v)] -> (Maybe v, [([k], v)]) findNode l = if null suffix then (Nothing,l) else ((Just$ snd.head $suffix),prefix++(tail suffix)) where (prefix,suffix) = span (not.null.fst) l \end{code} searchMap applies a function to each object in the PrefixTree that is on the path specified by the key and the subtree below it and returns a list of the results. \begin{code} searchMap :: Ord k => (v -> vv) -> [k] -> PrefixMap k v -> [vv] searchMap f [] t = walk f t [] searchMap f (k:ks) (Node v al) = maybe rest ((:rest) . f) v where rest = maybe [] (searchMap f ks) (Map.lookup k al) walk :: (a -> b) -> PrefixMap k a -> [b] -> [b] walk f (Node Nothing al) z = Map.fold (walk f) z al walk f (Node (Just x) al) z = Map.fold (walk f) (f x:z) al test1 = TestCase (do input <- readFile "words.txt" let dict = words input pairs = zip dict dict alpha = ['a'..'z'] ftree = fromDistinctAscPairList alpha pairs fAnswer = searchMap id "assert" ftree rtree = fromDistinctAscPairList alpha$ sort $zip (map reverse dict) dict rAnswer = searchMap id "tressa" rtree assertEqual "forward search" ["as","ass","assertedly","asserted", "asserters","asserter","asserting", "assertions","assertion","assertively", "assertivenesses","assertiveness", "assertive","assertors","assertor", "asserts","assert"] fAnswer assertEqual "reverse search" ["reassert","overassert","assert"] rAnswer ) tests = TestList [TestLabel "Tree Test" test1] \end{code} \end{document} -------------------------------- David F. Place mailto:[hidden email] _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe Reply | Threaded Open this post in threaded view | ## Re: PrefixMap: code review request  On Feb 27, 2006, at 2:30 PM, David F.Place wrote: > Hi, > > I'm a newish Haskell hacker with way too much experience hacking > Lisp. At first, I found my Haskell code looking very lisp-y. I > think my code is becoming more idiomatic haskell. I would be very > grateful to anyone who would take a glance at the code below and > point out any un-idiomatic usages that jump out. It's a small > module from a program which looks for palindromes in a list of > words. Thanks very much. [snip] > partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] > partList pairs alphabet = reverse . fst$ foldl' f ([],pairs) alphabet >     where f (result,pairs) l = (result',rest) >               where (part,rest) = span ((==l) . head . fst) pairs >    result' = if null part >         then result >         else (l,part):result I don't think I've ever seen nested "where"s before ;-)  I'd probably   avoid that; it's really hard to read.  If your function is   sufficiently complicated that it needs its own "where" clause, you   should probably just promote it to the top level.  If it is truly   internal, you can avoid exporting it with the module export list. [snip]  > searchMap :: Ord k => (v -> vv) -> [k] -> PrefixMap k v -> [vv] Humm... double "v" seems like a pretty poor choice for a type   variable name. [snip] Just a couple of general comments: -- you don't seem to like horizontal whitespace much.  I know, I   know, whitespace placement can be a highly personal thing, but I find   most haskellers usually use a bit more horizontal whitespace,   particularly in function signatures.  The arrow is almost always   written with surrounding spaces.  I personally like space after   commas in tuples and lists.  Several of your list comprehensions   would also be easier to read with a bit of whitespace.  I also tend   to like '=' signs lined up in a column for lets, pattern function   definitions and wheres. -- Nested tuple and lists types are pretty hard to read.  In your   code [([k],v)] appears a lot.  Consider defining a type alias for it. Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank.            -- TMBG _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: PrefixMap: code review request

 On Feb 27, 2006, at 3:11 PM, Robert Dockins wrote: > -- Nested tuple and lists types are pretty hard to read.  In your   > code [([k],v)] appears a lot.  Consider defining a type alias for it. Funny, of course as an inveterate lisp hacker, I am completely   insensitive to nesting depth.  :-)  Your suggestion cleans up the   code quite nicely. -------------------------------- David F. Place mailto:[hidden email] _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: PrefixMap: code review request

 In reply to this post by David Place David F.Place wrote: [snip] > partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] > partList pairs alphabet = reverse . fst $foldl' f ([],pairs) alphabet > where f (result,pairs) l = (result',rest) > where (part,rest) = span ((==l) . head . fst) pairs > result' = if null part > then result > else (l,part):result If the above code is put into a non-literate file as: partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] partList pairs alphabet = reverse . fst$ foldl' f ([],pairs) alphabet      where f (result,pairs) l = (result',rest)                where (part,rest) = span ((==l) . head . fst) pairs      result' = if null part           then result           else (l,part):result there is a parse error (using ghc) at the line beginning with result'. This binding doesn't line up with anything. Also the second 'where' is dangerously close to the column started by the 'f' after the first 'where' (may not be noticeable in this email due to whatever font it is being displayed in but it's only indented by one space) which makes it a bit tricky to read. I suggest: partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] partList pairs alphabet = reverse . fst $foldl' f ([],pairs) alphabet where f (result,pairs) l = (result',rest) where (part,rest) = span ((==l) . head . fst) pairs result' = if null part then result else (l,part):result or: partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] partList pairs alphabet = reverse . fst$ foldl' f ([],pairs) alphabet     where        f (result,pairs) l = (result',rest)           where              (part,rest) = span ((==l) . head . fst) pairs              result' =                 if null part                    then result                    else (l,part):result because always starting an 'if' construct on a new line ensures that you will never ever have any problems with it's layout (especially helpful for people used to C) when you use it in a 'do' block. Also, both the above variants ensure that your code can be edited using variable width fonts, any tabbing regime you like (as long as leading whitespace only has tabs and never spaces), and will be immune to identifier renamings. The golden rule for 'safe' layout is always to start a layout block on the next line after 'do' 'where' 'of' 'let' and to try to resist the tempation to save a line by squashing the first binding onto the same line as the 'let' etc. The second variant has the added advantage that the horizontal indentation is kept under control (eg in the above all indenting is 3 spaces further in) whereas when people write things like if p == q then let a = 4                           b = if a ==4 then let q = 2                                                          s = 56 after only 3 indents the code is already half way across the screen (not to mention the fact that the above layout is completely brittle and can be destroyed by a simple search-and-replace op to change 'q' to 'hello') Of course all the above is just easy-to-talk-about technicalities of layout and you were really interested in getting feedback about idiomatic usage - hopefully someone else will give some feedback about that (I'm too lazy...) :-) Regards, Brian. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: PrefixMap: code review request

 On Feb 27, 2006, at 5:54 PM, Brian Hulley wrote: > there is a parse error (using ghc) at the line beginning with   > result'. This binding doesn't line up with anything. Also the   > second 'where' is dangerously close to the column started by the   > 'f' after the first 'where' (may not be noticeable in this email   > due to whatever font it is being displayed in but it's only   > indented by one space) which makes it a bit tricky to read. Whoops, that's noise in the transmission.  In my original file and my   original email, it is indented correctly.  As for other indentation   issues, I always use whatever emacs suggests.  Is that not a good   strategy? -------------------------------- David F. Place mailto:[hidden email] _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: PrefixMap: code review request

Open this post in threaded view
|

 In reply to this post by David Place David F.Place wrote: > partList :: Ord k => [([k],v)]->[k]->[(k,[([k],v)])] > partList pairs alphabet = reverse . fst foldl' f ([],pairs) alphabet > where f (result,pairs) l = (result',rest) > where (part,rest) = span ((==l) . head . fst) pairs > result' = if null part > then result > else (l,part):result > I would write something like: ... where f (result, pairs) l = case span ((==l) . head . fst) pairs of ([], rest) -> ( result, rest) (part, rest) -> ((l, part):result, rest) -- Lennart _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe Reply | Threaded Open this post in threaded view | ## Re: PrefixMap: code review request  In reply to this post by David Place On 27/02/06, David F. Place <[hidden email]> wrote: > > On Feb 27, 2006, at 5:54 PM, Brian Hulley wrote: > > > there is a parse error (using ghc) at the line beginning with > > result'. This binding doesn't line up with anything. Also the > > second 'where' is dangerously close to the column started by the > > 'f' after the first 'where' (may not be noticeable in this email > > due to whatever font it is being displayed in but it's only > > indented by one space) which makes it a bit tricky to read. > > Whoops, that's noise in the transmission. In my original file and my > original email, it is indented correctly. As for other indentation > issues, I always use whatever emacs suggests. Is that not a good > strategy? > I find that I often really dislike emacs' choices for indenting -- so much so that I had to turn off smart indenting altogether to avoid holding down the spacebar all the time (of course it remaps tab). The simple indenter gives all the possible sane selections for lining things up and lets you indent further if you want. I'd normally go with the first of Brian's suggestions for laying out an if expression (condition on one line, then and else aligned at a deeper depth right after it), unless the condition was really long, in which case I might switch to the second to get a little more space on that first line. I don't personally worry about tab width, as long as things line up in any given function definition. I usually use 4 spaces when not thinking about it, but sometimes 3, or even 2, or far more, if it makes things look better. Most editors are quite good at putting the cursor at the same indent level on the following line, and this is all the help you tend to need in this regard when editing a file. (Random musing: It would be neat to have an editor which dropped subtle vertical bars under the start of blocks, to make it easier to visually apply the layout rule, and see when you haven't indented far enough, and the block is broken.) Always tell your editor to leave tabs out of your files. There are very few Haskell programmers who actually leave tabs in their source, and mixing spaces and tabs can be deadly. You can set this in emacs by adding (setq-default indent-tabs-mode nil) to your .emacs if you haven't already done so. If you also use vim, set expandtab is the line you want. :) - Cale _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe Reply | Threaded Open this post in threaded view | ## Re: PrefixMap: code review request  In reply to this post by Brian Hulley Brian Hulley wrote: > Whoever thought up the original Haskell layout rule assumed that people > would be happy using a single fixed width font, tabs set to 8 spaces, > and didn't care about the brittleness of the code (in the face of > identifier renamings) it allowed one to write. Are you complaining that Haskell permits you to write code with these problems, or that it requires you to? The latter is not true. Instead of keyword clause1 clause2 you can always write keyword clause1 clause2 or keyword { clause1 ; clause2 } Both styles are insensitive to tab width and renaming of identifiers. The off-side rule only comes into play when you don't include an explicit {, so you can suppress it entirely if you prefer. If you have a different layout rule in mind I'd be interested in hearing it, but I think Haskell's is quite good overall. Lisp has similar indentation problems despite the lack of a layout rule, since people commonly write (foo (...) (...)) Renaming foo can't confuse the compiler, but I've never had a Haskell layout error change the meaning of my program. At worst it causes the program to be rejected. I do edit my source code in a fixed-width font, but it's a very pretty font (Bitstream Vera Sans Mono). It's a small price to pay for the convenience of being able to use 2D layout, even in languages where it's not significant, and in comments. -- Ben _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe Reply | Threaded Open this post in threaded view | ## Re: PrefixMap is a Trie  In reply to this post by David Place David F.Place wrote: > The PrefixMap datastructure implements a Prefix Tree which allows a > key/value relationship. > > \begin{code} > > data PrefixMap k v = Node (Maybe v) (Map.Map k (PrefixMap k v)) > deriving (Show) > \end{code} You may compare your code with Keith's implemenation of a Trie. http://article.gmane.org/gmane.comp.lang.haskell.libraries/2571Find attached a modified version (that I never tested, though). Christian {- | Module : Trie Copyright : (c) Keith Wansbrough 2005, C. Maeder 2006 License : BSD-style Maintainer : none Stability : experimental Portability : portable This module provides a very basic implementation of the Trie data type, with no great concern for efficiency, or for completeness of API. original version modified using Data.Map and added functions insert and null -} module Trie ( -- * Data type Trie, -- * Constructors empty, insert, unit, plus, plus_C, -- * Primitive discriminators, accessors and mutators null, value, children, value_u, children_u, -- * Basic operations preOrder, upwards, downwards, -- * Derived operations takeWhile, takeWhile_V, fringe, ) where import Prelude hiding (takeWhile, null) import qualified Data.Map as Map import Data.Maybe import Control.Monad -- |A Trie with key elements of type k -- (keys of type [k] ) and values of type v . data Trie k v = Trie { value :: Maybe v, children :: Map.Map k (Trie k v) } -- |Modify the 'children' field of a trie. value_u :: (Maybe v -> Maybe v) -> Trie k v -> Trie k v value_u f p = p { value = f (value p) } -- |Modify the 'children' field of a trie. children_u :: (Map.Map k (Trie k v) -> Map.Map k (Trie k v)) -> Trie k v -> Trie k v children_u f p = p { children = f (children p) } -- |The empty trie. empty :: Trie k v empty = Trie { value = Nothing, children = Map.empty } -- |Test for the empty trie null :: Trie k v -> Bool null t = isNothing (value t) && Map.null (children t) -- |The singleton trie. unit :: Ord k => [k] -> v -> Trie k v unit [] x = Trie { value = Just x, children = Map.empty } unit (k:ks) x = Trie { value = Nothing , children = Map.singleton k (unit ks x) } insert :: Ord k => [k] -> (Maybe v -> Maybe v) -> Trie k v -> Trie k v insert l f t = case l of [] -> t { value = f (value t) } k : ks -> let cs = children t nt = insert ks f Map.findWithDefault empty k cs               in if null nt then t { children = Map.delete k cs }                  else t { children = Map.insert k nt cs } -- |Combining two tries.  The first shadows the second. plus :: Ord k => Trie k v -> Trie k v -> Trie k v plus p1 p2 =     Trie {           value = mplus (value p1) (value p2),           children = Map.unionWith plus (children p1) (children p2)          } -- |Combining two tries.  If the two define the same key, the -- specified combining function is used. plus_C :: Ord k => (v -> v -> v) -> Trie k v -> Trie k v -> Trie k v plus_C f p1 p2 =     Trie {           value =  lift f (value p1) (value p2),           children = Map.unionWith (plus_C f) (children p1) (children p2)          }     where lift _ Nothing y = y           lift _ x Nothing = x           lift _ (Just x) (Just y) = Just (f x y) -- |Enumerate all (key,value) pairs, in preorder. preOrder :: Ord k => [k] -> Trie k v -> [([k],v)] preOrder ks p = getNode p                 ++ concatMap (\(k,p') -> preOrder (ks++[k]) p')                              (Map.toList (children p))     where getNode q = maybe [] (\ v -> [(ks,v)]) (value q) -- |An upwards accumulation on the trie. upwards :: Ord k => (Trie k v -> Trie k v) -> Trie k v -> Trie k v upwards f = f . children_u (Map.map (upwards f)) -- |A downwards accumulation on the trie. downwards :: Ord k => (Trie k v -> Trie k v) -> Trie k v -> Trie k v downwards f = children_u (Map.map (downwards f)) . f -- |Return the prefix of the trie satisfying   f . takeWhile :: Ord k => (Trie k v -> Bool) -> Trie k v -> Trie k v takeWhile f = downwards (children_u (Map.filter f)) -- |Return the prefix of the trie satisfying   f -- on all values present. takeWhile_V :: Ord k => (v -> Bool) -> Trie k v -> Trie k v takeWhile_V f = takeWhile (maybe True f . value) -- |Return the fringe of the trie (the trie composed of only the leaf nodes). fringe :: Ord k => Trie k v -> Trie k v fringe = upwards (\ p -> if Map.null (children p)                          then p else value_u (const Nothing) p) _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: Re: PrefixMap: code review request

 In reply to this post by Ben Rudiak-Gould Ben Rudiak-Gould wrote: > Brian Hulley wrote: >> Whoever thought up the original Haskell layout rule assumed that >> people would be happy using a single fixed width font, tabs set to 8 >> spaces, and didn't care about the brittleness of the code (in the >> face of identifier renamings) it allowed one to write. > > Are you complaining that Haskell permits you to write code with these > problems, or that it requires you to? The latter is not true. Instead [snip] Just that it allows you to, because this means other people's code (which you may be editing) can be brittle. > If you have a different layout rule in mind I'd be interested in > hearing it, but I think Haskell's is quite good overall. Here is my proposed layout rule: 1) All layout keywords (where, of, let, do) must either be followed by a single element of the corresponding block type, and explicit block introduced by '{', or a layout block whose first line starts on the *next* line and whose indentation is accomplished *only* by tabs In particular, this allows:           let a = 56 in a*a and           let                 a = 56                 b = 78           in a*b but not           let a = 56                b = 78 or           let a = 56; b = 78                c = 90 I would also make it that explicit braces are not allowed to switch off the layout rule (ie they can be used within a layout), multiline strings would not be permitted, and multiline comments would not be permitted (pragmas could easily be used just by using --#) (I'd have a special keyword eg '{}module' instead of 'module' at the top of a file to switch off layout for the whole file if required, but making the presence of the layout rule depend on whether or not there are surrounding braces makes life *way* too complicated imho) This would give the following advantages: 1) When you see a ';' you could immediately tell which block it belongs to by looking backwards till the next '{' 2) Variable width fonts can be used, or different font faces to represent different sorts of identifier eg class names, tycons, value constructors, operators like seq as opposed to seq etc 3) Using only tabs ensures that vertical alignment goes to the same position on the screen regardless of the font and tabs could even have different widths just like in a wordprocessor 4) Any keypress has a localised effect on the parse tree of the buffer as a whole ( { " no longer kill everything which follows and there would be no {- ) 5) It paves the way for a much more immersive editing environment, but I can't say more about this at the moment because I haven't finished writing it yet and it will be a commercial product :-))) Using my self-imposed layout rule I'm currently editing all my Haskell code in a standard text editor using tabs set to 4 spaces and a variable width font and have no problems. Regards, Brian. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: PrefixMap is a Trie

 In reply to this post by Christian Maeder On Feb 28, 2006, at 8:33 AM, Christian Maeder wrote:You may compare your code with Keith's implemenation of a Trie. http://article.gmane.org/gmane.comp.lang.haskell.libraries/2571 Thanks for the pointer,  I searched for "Prefix Tree" which is an alternative name for trie so I didn't find that implementation.  Perhaps, as you suggest in your code, it's time for Data.Trie. -------------------------------- David F. Place [hidden email] _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Layout rule (was Re: PrefixMap: code review request)

 In reply to this post by Brian Hulley Brian Hulley wrote: > Here is my proposed layout rule: > > 1) All layout keywords (where, of, let, do) must either be followed by a > single element of the corresponding block type, and explicit block > introduced by '{', or a layout block whose first line starts on the > *next* line I wouldn't have much trouble adapting to that. > and whose indentation is accomplished *only* by tabs You can't be serious. This would cause far more problems than the current rule. > I would also make it that explicit braces are not allowed to switch off > the layout rule (ie they can be used within a layout), I don't understand. What does "used within a layout" mean? > multiline strings would not be permitted, They aren't now, except with \ escapes. A stray " will be caught on the same line unless the line happens to end with \ and the next line happens to begin with \, which is exceedingly unusual. > and multiline comments would not be permitted > (pragmas could easily be used just by using --#) But --# doesn't introduce a comment. And this would make UNPACK pragmas rather inconvenient to use. > 1) When you see a ';' you could immediately tell which block it belongs > to by looking backwards till the next '{' I guess that might be helpful, but it doesn't seem easier than looking left to the beginning of the current line and then up to the first less-indented line. > 2) Variable width fonts can be used, They can be used now, if you adhere to a certain style, but not everyone likes that style. I wrote in C++ with a variable width font and tabs at one time, but eventually went back to fixed width. One reason was that I couldn't use comment layout conventions that tend (in my experience) to improve readability more than monospacing hurts it. Another reason was that glyph widths appropriate to natural languages didn't work all that well for source code. Spaces are much more important in source code than in natural language, for example. A proportional font designed for source code would be nice, but I haven't found one yet. Stroustrup used a mixture of proportional and monospaced glyphs in _The C++ Programming Language_ and it worked well. > or different font faces to > represent different sorts of identifier eg class names, tycons, value > constructors, operators like seq as opposed to seq etc Lots of editors do this with monospaced fonts; I think it's orthogonal to the layout issue. > 3) Using only tabs ensures that vertical alignment goes to the same > position on the screen regardless of the font and tabs could even have > different widths just like in a wordprocessor Requiring tabs is a really bad idea. Just forget it. Seriously. > 4) Any keypress has a localised effect on the parse tree of the buffer > as a whole ( { " no longer kill everything which follows and there would > be no {- ) I don't understand why this is an advantage. If you have an editor that highlights comments in green, then large sections of the program will flash green while you type a {- -} comment, which might be annoying, but it also means you'll never forget to close the comment, so the practical benefit of forbidding {- -}, as opposed to simply not typing it yourself, seems nil. > 5) It paves the way for a much more immersive editing environment, but I > can't say more about this at the moment because I haven't finished > writing it yet and it will be a commercial product :-))) I guess everything has been leading up to this, but my reaction is that it renders the whole debate irrelevant. The only reason layout exists in the first place is to make source code look good in ordinary text editors. If you have a high-level source code editor that manipulates the AST, then you don't need layout, or tabs, or any of that silly ASCII stuff. The only time you need to worry about layout is when interoperating with implementations that use the concrete syntax, and then there's nothing to stop you from exporting in any style you like. And when importing, there's no reason to place restrictions on Haskell's layout rule, because the visual layout you display in the editor need have no connection to the layout of the imported file. > Using my self-imposed layout rule I'm currently editing all my Haskell > code in a standard text editor using tabs set to 4 spaces and a variable > width font and have no problems. Which is the best argument for keeping the current rule! If it were changed as you propose, then someday Hugh Briley would come along and complain that Haskell's layout syntax squandered screen space---but he *wouldn't* be able to program in his preferred style, because it would no longer be allowed. Religious freedom is a good thing. {- Ben -} _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: Layout rule (was Re: PrefixMap: code review request)

 Ben Rudiak-Gould wrote: > Brian Hulley wrote: >> Here is my proposed layout rule: >> >> 1) All layout keywords (where, of, let, do) must either be followed >> by a single element of the corresponding block type, and explicit >> block introduced by '{', or a layout block whose first line starts >> on the *next* line > > I wouldn't have much trouble adapting to that. > >> and whose indentation is accomplished *only* by tabs > > You can't be serious. This would cause far more problems than the > current rule. Why? Surely typing one tab is better than having to hit the spacebar 4 (or 8) times? > >> I would also make it that explicit braces are not allowed to switch >> off the layout rule (ie they can be used within a layout), > > I don't understand. What does "used within a layout" mean? I meant that {;} would be used just like any other construct that has to respect the layout rule so you could write        let             a = let { b = 6; z = 77;                 h = 99;                                p = 100} in b+z+h + p etc but not:        let             a = let { b = 6; z = 77;             h = 99;          -- this binding would be part of the outermost 'let'                                p = 100} in b+z+h + p > >> multiline strings would not be permitted, > > They aren't now, except with \ escapes. A stray " will be caught on > the same line unless the line happens to end with \ and the next line > happens to begin with \, which is exceedingly unusual. > >> and multiline comments would not be permitted >> (pragmas could easily be used just by using --#) > > But --# doesn't introduce a comment. And this would make UNPACK > pragmas rather inconvenient to use. -- # but I hadn't thought about UNPACK... The motivation in both points is to make it easy for an editor to determine which lines need to be re-parsed based on the number of leading tabs alone. > >> 1) When you see a ';' you could immediately tell which block it >> belongs to by looking backwards till the next '{' > > I guess that might be helpful, but it doesn't seem easier than > looking left to the beginning of the current line and then up to the first > less-indented line. There was an example posted on another thread where someone had got into confusion by using ; after a let binding in a do construct with an explicit brace after the 'do' but not after the 'let' (sorry I can't find it again). Also the current layout rule uses the notion of an implicit opening brace which is a to be regarded as a real opening brace as far as ';' in concerned but an unreal non-existent opening brace as far as '}' is concerned. Thus I think it is a real mix-up. > >> 2) Variable width fonts can be used, > > They can be used now, if you adhere to a certain style, but not > everyone likes that style. I wrote in C++ with a variable width font and > tabs > at one time, but eventually went back to fixed width. One reason was > that I couldn't use comment layout conventions that tend (in my > experience) > to improve readability more than monospacing hurts it. Another reason > was that glyph widths appropriate to natural languages didn't work > all that well for source code. Spaces are much more important in > source code than in natural language, for example. A proportional > font designed for source code would be nice, but I haven't found one > yet. Stroustrup used a mixture of proportional and monospaced glyphs > in _The C++ Programming Language_ and it worked well. >> or different font faces to >> represent different sorts of identifier eg class names, tycons, value >> constructors, operators like seq as opposed to seq etc > > Lots of editors do this with monospaced fonts; I think it's > orthogonal to the layout issue. For example on Windows Trebuchet MS is a very nice font, also Verdana, both of which are not monospaced. But yes I agree it's not a major issue and I just see the option of being able to use them as a nice side-effect. > >> 3) Using only tabs ensures that vertical alignment goes to the same >> position on the screen regardless of the font and tabs could even >> have different widths just like in a wordprocessor > > Requiring tabs is a really bad idea. Just forget it. Seriously. I'm really puzled here. I've been using tabs to indent my C++ code for at least 10 years and don't see the problem. The only problem would be if someone mixed tabs with spaces. Since it has to be either tabs only or spaces only I'd choose tabs only to save keystrokes. I suppose though it is always going to be a matter of personal taste... > >> 4) Any keypress has a localised effect on the parse tree of the >> buffer as a whole ( { " no longer kill everything which follows and >> there would be no {- ) > > I don't understand why this is an advantage. If you have an editor > that highlights comments in green, then large sections of the program > will flash green while you type a {- -} comment, which might be > annoying, but it also means you'll never forget to close the comment, > so the practical benefit of forbidding {- -}, as opposed to simply > not typing it yourself, seems nil. But it makes it much easier for the editor to determine where to start re-parsing from (see below). If you allow {- everything becomes a lot more complicated and who needs them anyway? In Visual C++ for example, you just select a block of text and type Control-K Control-C to put single line comments at the beginning of each line in the block. I think (apart from UNPACK) multi-line comments are just a left-over from very old days before single line comments were invented and editors had these simple macros built-in. > >> 5) It paves the way for a much more immersive editing environment, >> but I can't say more about this at the moment because I haven't >> finished writing it yet and it will be a commercial product :-))) > > I guess everything has been leading up to this, but my reaction is > that it renders the whole debate irrelevant. The only reason layout > exists in the first place is to make source code look good in > ordinary text editors. If you have a high-level source code editor that > manipulates the AST, > then you don't need layout, or tabs, or any of that silly ASCII > stuff. The only time you need to worry about layout is when > interoperating with implementations that use the concrete syntax, and > then there's nothing to stop you from exporting in any style you > like. And when importing, there's no reason to place restrictions on > Haskell's layout rule, because the visual layout you display in the > editor need have no connection to the layout of the imported file. Both 4) and 5) are because it is very much faster to type raw ASCII into an editor than to have to click on all kinds of boxes with the mouse. It is also surprisingly difficult to find an intuitive way of navigating a tree of boxes using only an ASCII keyboard, because of the conflict between the logical parent/child or sibling/sibling relationship and the way these could be laid out on the screen in 2d. Eg the right arrow could mean "go to the next sibling" but the next sibling may have to be laid out underneath the current node, so it all becomes very confusing. I don't think it is possible to lay out code in such a way that all parent/child relationships correspond to a vertical relationship on the screen, but possibly my thinking is too influenced by how programs are usually laid out. Thus I don't think an editor that forced people to work directly with the AST would ever catch on. Years ago I read about a Microsoft Project on "intentional programming" which was about manipulating an arbitrary AST directly but afaik nothing came of it since it was just too painful to use. I also read about some research to do with deriving programs interactively from a proof, where clicking on boxes etc comes into its own, but I don't think there is yet any interactive proof system that comes close to being able to derive "real world" software. Certainly I'd be very interested to know if there is. Currently all the ASCII editors I know of only do keyword highlighting, or occasional ("wait a second while I'm updating the buffer") identifier highlighting. What I'm trying to get however is complete grammatical highlighting and type checking that is instantaneous as the user types code, so this means that the highlighter/type checker needs a complete AST (with 'gap' nodes to represent spans of incomplete/bad syntax) to work from. However it is way too expensive to re-parse the whole of a big buffer after every keypress (I tried it with a parser written in C++ to parse a variant of ML and even with the optimized build and as many algorithmic optimizations as I could think of it was just far too slow, and I wasn't even trying to highlight duplicate identifiers or do type inference) Thus to get a fast responsive editing environment which needs to maintain a parse of the whole buffer to provide effective grammatical highlighting and not just trivial keyword highlighting it seems (to me) to be essential to be able to limit the effect of any keystroke especially when the user is just typing text from left to right but where there may be more stuff after the cursor eg if the user has gone back to editing a function at the top of a file. Things like {- would mean that all the parse trees for everything after it would have to be discarded. Also, flashing of highlighting on this scale could be very annoying for a user, so I'd rather just delete this particular possibility of the user getting annoyed when using my software :-) thus my hopeless attempts to convince everyone that {- is bad news all round :-))) Of course you're right that for loading and saving files I could do Haskell -> My representation -> Haskell but then someone reading a tutorial on Haskell would find that my editor (or by the above arguments, any similar rich-feedback editor) didn't accept all the examples... > >> Using my self-imposed layout rule I'm currently editing all my >> Haskell code in a standard text editor using tabs set to 4 spaces >> and a variable width font and have no problems. > > Which is the best argument for keeping the current rule! If it were > changed as you propose, then someday Hugh Briley would come along and > complain that Haskell's layout syntax squandered screen space---but > he *wouldn't* be able to program in his preferred style, because it would > no longer be > allowed. Religious freedom is a good thing. Freedom has many dimensions, some interdependent: Simplifying the syntax by using a simpler layout rule would make it possible for people to create very efficient incremental parsers and in turn develop more responsive development environments for consensus Haskell, which in turn might lead to more people understanding and therefore using the language, more libraries, more possibilities for individual people to earn a living themselves by programming, more people thinking things out for themselves instead of absorbing corporate propaganda, thus fairer laws, justice, liberty, and *true human freedom* for all!!! :-) Regards, Brian. -------------------------------------------------------- "... flee from the Hall of Learning. This Hall is dangerous in its perfidious beauty, is needed but for thy probation. Beware, Lanoo, lest dazzled by illusive radiance thy soul should linger and be caught in its deceptive light."                                                    -- Voice of the Silence stanza 33 _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: Layout rule (was Re: PrefixMap: code review request)

 On Wed, 1 Mar 2006, Brian Hulley wrote: > Ben Rudiak-Gould wrote: > > Brian Hulley wrote: > > > Here is my proposed layout rule: > > > > > > and whose indentation is accomplished *only* by tabs > > > > You can't be serious. This would cause far more problems than the > > current rule. > > Why? Surely typing one tab is better than having to hit the spacebar 4 (or 8) > times? > Not when it prevents me from ever exhibiting the slightest shred of style in my code. I use that control for readability purposes in my code. -- [hidden email] "My religion says so" explains your beliefs. But it doesn't explain why I should hold them as well, let alone be restricted by them. _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe
Open this post in threaded view
|

## Re: Layout rule (was Re: PrefixMap: code review request)

Open this post in threaded view
|

## Re: Layout rule

Open this post in threaded view
|

## Re: Layout rule (was Re: PrefixMap: code review request)

Open this post in threaded view
|

## Re: Layout rule (was Re: PrefixMap: code review request)

 In reply to this post by Brian Hulley On Wednesday 01 March 2006 02:36, Brian Hulley wrote: > Ben Rudiak-Gould wrote: > > Brian Hulley wrote: > >> Here is my proposed layout rule: > >> > >> 1) All layout keywords (where, of, let, do) must either be > >> followed by a single element of the corresponding block type, and > >> explicit block introduced by '{', or a layout block whose first > >> line starts on the *next* line > > > > I wouldn't have much trouble adapting to that. > > > >> and whose indentation is accomplished *only* by tabs > > > > You can't be serious. This would cause far more problems than the > > current rule. > > Why? Surely typing one tab is better than having to hit the spacebar > 4 (or 8) times? What kind of editor are you using? Notepad? I am used to hitting TAB key and get the correct number of spaces, according to how I configured my editor (NEdit) for the current language mode. TAB characters in program text should be forbidden by law. As well as editors that by default insert a tab char instead of spaces. Ben _______________________________________________ Haskell-Cafe mailing list [hidden email] http://www.haskell.org/mailman/listinfo/haskell-cafe