# Unicode Haskell source -- Yippie!


## Re: Unicode Haskell source -- Yippie!

On Thu, Apr 24, 2014 at 10:27 AM, Kyle Murphy wrote:

> It's an interesting feature, and nice if you want that sort of thing, but not something I'd personally want to see as the default. Deviating from the standard ASCII set of characters is just too much of a hurdle to the usability of the language.

On the other hand, maybe if it's been good enough for the entire field of mathematics since forever, there might be some benefit in it for us.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

## Re: Unicode Haskell source -- Yippie!

On 2014-04-26 17:58, David Fox wrote:

> On Thu, Apr 24, 2014 at 10:27 AM, Kyle Murphy <[hidden email]> wrote:
>
>> It's an interesting feature, and nice if you want that sort of thing, but not something I'd personally want to see as the default. Deviating from the standard ASCII set of characters is just too much of a hurdle to the usability of the language.
>
> On the other hand, maybe if it's good enough for the entire field of mathematics since forever there might be some benefit in it for us.

Typing into a computer != handwriting (in various significant ways). Most mathematical notation predates computers and typewriters. Just compare writing a formula by hand with typing the same formula in (La)TeX.

Regards,

## Re: Unicode Haskell source -- Yippie!

In reply to this post by David Fox

The vast majority of math is written using LaTeX, which, while it supports Unicode, is mostly ASCII :)

On Sat, Apr 26, 2014 at 11:58 AM, David Fox wrote:

> On Thu, Apr 24, 2014 at 10:27 AM, Kyle Murphy wrote:
>
>> It's an interesting feature, and nice if you want that sort of thing, but not something I'd personally want to see as the default. Deviating from the standard ASCII set of characters is just too much of a hurdle to the usability of the language.
>
> On the other hand, maybe if it's good enough for the entire field of mathematics since forever there might be some benefit in it for us.

## Re: Unicode Haskell source -- Yippie!

In reply to this post by Nickolay Kudasov

Nickolay Kudasov wrote:

>> e.g. I would like to see \ spelled as λ
>
> I have symbol substitution enabled in Vim. E.g. when I write \ (and it is syntactically a lambda) I get λ. The same way, composition (.) is replaced with ∘. The same trick can be enabled for other operators as well. So I have normal text and a nice presentation in *my* text editor: it does not bother anyone but me.

I think this is the right approach. See also https://github.com/i-tu/Hasklig/

The main problem with special Unicode characters, as I see it, is that it is no longer possible to distinguish characters unambiguously just by looking at them. Apart from questions of maintainability, this is also a potential security problem: it enables an attacker to slip in malicious code simply by importing a module whose name looks like a well-known safe module. In a big and complex piece of software, such an attack might not be spotted for some time.

Cheers
Ben

--
"Make it so they have to reboot after every typo." -- Scott Adams

## Re: Unicode Haskell source -- Yippie!

In reply to this post by David Fox

On Sat, Apr 26, 2014 at 9:28 PM, David Fox wrote:

> On Thu, Apr 24, 2014 at 10:27 AM, Kyle Murphy wrote:
>
>> It's an interesting feature, and nice if you want that sort of thing, but not something I'd personally want to see as the default. Deviating from the standard ASCII set of characters is just too much of a hurdle to the usability of the language.
>
> On the other hand, maybe if it's good enough for the entire field of mathematics since forever there might be some benefit in it for us.

Chris spoke of choosing Idris over Agda as related to not going overboard with Unicode. The FAQ he linked to has this to say:

> And I'm sure that in a few years time things will be different and software will cope better and it will make sense to revisit this. For now, however, I would prefer not to allow arbitrary unicode symbols in operators.

1. I'd like to underscore the 'arbitrary'. Why is ASCII any less arbitrary -- apart from an increasingly irrelevant historical accident -- than Arabic, Bengali, Cyrillic, Deseret? [Hint: What's the A in ASCII?] By contrast, math may at least have some pretensions to universality.

2. Maybe it's a good time now to 'revisit'? Otherwise, like klunky QWERTY, it may happen that when the technological justifications for an inefficient choice are long gone, social inertia will prevent any useful change.

On Sun, Apr 27, 2014 at 3:00 PM, Ben Franksen wrote:

> The main problem with special Unicode characters, as I see it, is that it is no longer possible to distinguish characters unambiguously just by looking at them. Apart from questions of maintainability, this is also a potential security problem: it enables an attacker to slip in malicious code simply by importing a module whose name looks like a well known safe module. In a big and complex piece of software, such an attack might not be spotted for some time.

Bang on! However, the Pandora's box is already open and the creepy-crawlies are all over us. Witness:

    GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
    Loading package ghc-prim ... linking ... done.
    Loading package integer-gmp ... linking ... done.
    Loading package base ... linking ... done.
    Prelude> let а = 1
    Prelude> a

    <interactive>:11:1: Not in scope: `a'

In case you can't see it, the two a's are different Unicode characters: CYRILLIC SMALL LETTER A vs LATIN SMALL LETTER A.

Regards
Rusi
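The invisible difference above can be made visible mechanically. A minimal sketch (the helper name `codePoints` is mine, not from the thread) that prints the Unicode code point of each character, so the two a's can be told apart:

```haskell
-- Print the Unicode code point of every character in a string, so
-- visually identical identifiers can be distinguished mechanically.
import Data.Char (ord)
import Numeric (showHex)

codePoints :: String -> String
codePoints = unwords . map (\c -> "U+" ++ pad (showHex (ord c) ""))
  where pad s = replicate (4 - length s) '0' ++ s

main :: IO ()
main = do
  putStrLn (codePoints "a")      -- LATIN SMALL LETTER A: U+0061
  putStrLn (codePoints "\x430")  -- CYRILLIC SMALL LETTER A: U+0430
```

Running the same check over every identifier in a module would catch the GHCi confusion shown above.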

## Re: Unicode Haskell source -- Yippie!

On 2014-04-27 13:45, Rustom Mody wrote:

> 1. I'd like to underscore the 'arbitrary'. Why is ASCII any less arbitrary -- apart from an increasingly irrelevant historical accident -- than Arabic, Bengali, Cyrillic, Deseret? [Hint: What's the A in ASCII?] By contrast math may at least have some pretensions to universality?

The symbols in math are also mostly arbitrary. In effect they should be considered "parallel" to the Cyrillic, Latin or Greek alphabets. (Of course math borrows quite a few symbols from the latter, but I digress.)

> 2. Maybe it's a good time now to 'revisit'? Otherwise, like klunky QWERTY, it may happen that when the technological justifications for an inefficient choice are long gone, social inertia will prevent any useful change.

Billions of people have QWERTY keyboards. Unless you come up with something *radically* better, they're not going to change. Inertia has made anything but incremental change impossible. (I note that Microsoft actually managed to change the QWERTY keyboard incrementally a decade or two ago by adding the Windows and Context Menu keys. Of course that didn't remove or change any of the existing functionality of basic QWERTY, so it was a relatively small change.)

Using "macros" like "\" (for lambda) or "\sum_{i=0}^{n} i" and having the editor/IDE display them differently is at least semi-practical for typing stuff into your computer using QWERTY.

Regards,

## Re: Unicode Haskell source -- Yippie!

In reply to this post by Rustom Mody

On Sun, Apr 27, 2014 at 7:45 AM, Rustom Mody wrote:

> 1. I'd like to underscore the 'arbitrary'. Why is ASCII any less arbitrary -- apart from an increasingly irrelevant historical accident -- than Arabic, Bengali, Cyrillic, Deseret? [Hint: What's the A in ASCII?] By contrast math may at least have some pretensions to universality?

Math notations are not as universal as many would like to think, sadly. And I am not sure the historical accident is really irrelevant; as the same "accident" was involved in most of the computer languages and protocols we use daily, I would not be at all surprised to find that there are subtle dependencies buried in the whole mess --- similar to how (most... sigh) humans pick up language and culture signals as children too young to apply any kind of critical analysis to them, and can have real problems trying to eradicate or modify them later.

(Yes, languages can be fixed. But how many tools do you use when working with them? It's almost certainly more than the ones that immediately come to mind or are listed on e.g. Hackage. In particular, that ligature may be great in your editor and unfortunate when you pop a terminal and grep for it --- especially if you start extending this to other languages, so you need a different set of ligatures [a different font!] for each language....)

--
brandon s allbery kf8nh                               sine nomine associates
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net
## Re: Unicode Haskell source -- Yippie!

On 27 Apr 2014, at 19:58, Rustom Mody wrote:

> If you had the choice, would you allow that f-i ligature to be thus confusable with the more normal fi? I probably wouldn't, but nobody is asking us, and the water that's flowed under the bridge cannot be 'flowed' backwards (to the best of my knowledge!)
>
> In case that seems far-fetched, consider the scenario:
> 1. Somebody loads (maybe innocently) code involving variables like 'fine' into a ligature-happy IDE/editor
> 2. The editor quietly changes all the fine to ﬁne.
> 3. Since all those variables are in local scope, nothing untoward is noticed
> 4. Until someone loads it into an 'old-fashioned' editor... and then...

I develop Hasklig, and have enjoyed the discussion about the pros and cons of ligatures in coding fonts. However, I really must protest this line of reasoning, since it is based on false premises. As an OpenType feature, ligatures have nothing to do with the 'fi' and 'fl' Unicode code points (which are legacy only, and heavily discouraged by the Unicode consortium), or with Unicode at all. The encoding of the file could be pure ASCII for all the ligatures care. The font used changes how the text looks, and nothing else.

When speaking of special Unicode symbols in code, I agree with most objections raised against them :)

br,
Ian

P.S. Sorry for a potential repost - I'm getting automatic rejects

## Re: Unicode Haskell source -- Yippie!

In reply to this post by Rustom Mody

On 25/04/2014, at 5:15 AM, Rustom Mody wrote:

> x ÷ y   = divMod x y

This one looks wrong to me. In common usage, ÷ indicates plain old division, e.g., 3÷2 = 1½. See for example http://en.wikipedia.org/wiki/Table_of_mathematical_symbols

One possibility would be

> x ÷ y = x / y :: Rational
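Richard's alternative is directly definable in GHC, which accepts ÷ (U+00F7) as an operator symbol. A minimal sketch of that reading:

```haskell
-- A sketch of the suggested reading: (÷) as exact division, so that
-- 3 ÷ 2 really is 1½ (as a Rational) rather than divMod's (1,1).
(÷) :: Integer -> Integer -> Rational
x ÷ y = fromInteger x / fromInteger y

main :: IO ()
main = print (3 ÷ 2)  -- 3 % 2
```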
## Re: Unicode Haskell source -- Yippie!

In reply to this post by Rustom Mody

On 26/04/2014, at 1:30 AM, Rustom Mody wrote:

> On Fri, Apr 25, 2014 at 6:32 PM, Chris Warburton <[hidden email]> wrote:
>> Rustom Mody <[hidden email]> writes:
>>
>>> As for APL, it failed for various reasons e.g.
>>> - mixing up assembly language (straight-line code with gotos) with functional idioms
>>> - the character set was a major hurdle in the 60s. That's not an issue today when most OSes/editors are Unicode compliant

I strongly suspect that the failure of APL had very little to do with the character set. When APL was introduced, the character set was just a matter of dropping in a different golf-ball. Later, it was just bits on a screen. Heck, in 1984 I was using C and LaTeX on an IBM mainframe where the terminals displayed curly braces as spaces, and oddly enough that didn't kill C... In any case, it was possible to enter any arbitrary APL text using straight ASCII, so that was no great problem. There were a number of much more serious issues with APL.

(1) In "classic" APL everything is an n-dimensional array, either an array of characters or an array of (complex) numbers. An absolutely regular array. Want to process a collection of records where some of the fields are strings? No can do. Want to process a collection of strings of different length? No can do: you must use a 2-dimensional array, padding all the strings to the same length. Want type checking? Hysterical laughter. APL2 "fixed" this by introducing nested arrays. This is powerful, but occasionally clumsy. And it is positional, not named. You *can* represent trees, you can represent records with mixed fields, you can do all sorts of stuff. But it's positional, not named.

(2) There aren't _that_ many APL symbols, and it didn't take too long to learn them, and once you did, they weren't that hard to remember. (Although the use of the horseshoe symbols in APL2 strikes me as *ab*use.) The problem is, a whole lot of other things were done with numbers. Here are the trig functions:

     0 ◦ x   sqrt(1-x**2)
     1 ◦ x   sin x          ¯1 ◦ x   arcsin x
     2 ◦ x   cos x          ¯2 ◦ x   arccos x
     3 ◦ x   tan x          ¯3 ◦ x   arctan x
     4 ◦ x   sqrt(x**2+1)   ¯4 ◦ x   sqrt(x**2-1)
     5 ◦ x   sinh x         ¯5 ◦ x   arcsinh x
     6 ◦ x   cosh x         ¯6 ◦ x   arccosh x
     7 ◦ x   tanh x         ¯7 ◦ x   arctanh x

Who thought _that_ was a good idea? Well, presumably it was the same person who introduced the "I-beam functions". A range of system functions (time of day, CPU time used, space available, ...) were distinguished by *numbers*.

(3) Which brings me to the dialect problem. No two systems had the *same* set of I-beam functions. You couldn't even rely on two systems having the same *kind* of approach to files. There were several commercial APL systems, and they weren't priced for the hobbyist or student.

## Re: Unicode Haskell source -- Yippie!

In reply to this post by Ben Franksen

On 27/04/2014, at 9:30 PM, Ben Franksen wrote:

> The main problem with special Unicode characters, as I see it, is that it is no longer possible to distinguish characters unambiguously just by looking at them.

"No longer"? Hands up all the people old enough to have used "coding forms". Yes, children, there was a time when programmers wrote their programs on printed paper forms (sort of like A4 tipped sideways) so that the keypunch girls (not my sexism, historical accuracy) knew exactly which column each character went in. And at the top of each sheet was a row of boxes for you to show how you wrote 2 Z 7 1 I ! 0 O and the like. For that matter, I recall a PhD thesis from the 80s in which the author spent a page grumbling about the difficulty of telling commas and semicolons apart...

> Apart from questions of maintainability, this is also a potential security problem: it enables an attacker to slip in malicious code simply by importing a module whose name looks like a well known safe module. In a big and complex piece of software, such an attack might not be spotted for some time.

Again, considering the possibilities of "1" "i" "l", I don't think we actually have a new problem here. Presumably this can be addressed by tools: "here are some modules, tell me what exactly they depend on", not entirely unlike ldd(1). Of course, the gotofail bug shows that it's not enough to _have_ tools like that; you have to use them and review the results periodically.
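A toy sketch of such a tool (the function names are mine, and the import parsing is deliberately naive): list the modules a source file imports and flag any whose names contain non-ASCII characters, which is where the homoglyph-spoofing risk lives:

```haskell
-- A toy version of the auditing tool suggested above: list the modules
-- a Haskell source file imports, flagging any module name containing
-- non-ASCII characters (a possible homoglyph-spoofed import).
import Data.Char (isAscii)

importedModules :: String -> [String]
importedModules src =
  [ m | l <- lines src
      , ("import" : rest) <- [words l]       -- skip non-import lines
      , m <- take 1 [ w | w <- rest, w /= "qualified" ] ]

auditImports :: String -> [(String, Bool)]   -- (module, all ASCII?)
auditImports src = [ (m, all isAscii m) | m <- importedModules src ]

main :: IO ()
main = mapM_ print
  (auditImports "import Data.List\nimport qualified Dаta.Map as M\n")
  -- the 'а' in the second import is CYRILLIC SMALL LETTER A,
  -- so it is flagged False
```

A real tool would of course use a proper parser (e.g. the GHC API) rather than `words`, but the principle is the same.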

## Re: Unicode Haskell source -- Yippie!

In reply to this post by Richard A. O'Keefe

On Mon, Apr 28, 2014 at 6:46 AM, Richard A. O'Keefe wrote:

> On 25/04/2014, at 5:15 AM, Rustom Mody wrote:
>
>> x ÷ y   = divMod x y
>
> This one looks wrong to me. In common usage, ÷ indicates plain old division, e.g., 3÷2 = 1½. See for example http://en.wikipedia.org/wiki/Table_of_mathematical_symbols
>
> One possibility would be
>
>> x ÷ y = x / y :: Rational

Thanks Richard for (as usual!) looking at that list with a fine-toothed comb.

I started with writing a corresponding list for Python: http://blog.languager.org/2014/04/unicoded-python.html

As you will see, I mention there that ÷ mapped to divMod is one but hardly the only possibility. That list is mostly about math, not imperative features, and so carries over from Python to Haskell mostly unchanged. Please (if you have 5 minutes) glance at it and give me your comments. I may then finish a similar one for Haskell.

Thanks
Rusi

--
http://www.the-magus.in
http://blog.languager.org
## Re: Unicode Haskell source -- Yippie!

Hi Richard

Thanks for a vigorous and rigorous appraisal of my blog post: http://blog.languager.org/2014/04/unicoded-python.html

However, this is a Haskell list, and my post being not just a discussion about Python but some brainstorming about how Python could change, a detailed discussion of it is probably too off-topic here, don't you think? So for now let me address just one of your points, which is appropriate for this forum. I'd be pleased to discuss the other points you raise off-list.

Also, while I've learnt a lot from this thread, I also see some confusions and fallacies. So before drilling down into details and losing the forest for the trees, I'd prefer to start with a broad perspective rather than a narrow technological focus -- more at the end.

On Tue, Apr 29, 2014 at 11:04 AM, Richard A. O'Keefe wrote:

> Before speaking of "APL's mistakes", one should be clear about what exactly those mistakes *were*. I should point out that the symbols of APL, as such, were not a problem. But the *number* of such symbols was. In order to avoid questions about operator precedence, APL *hasn't* any. In the same way, Smalltalk has an extensible set of 'binary selectors'. If you see an expression like
>
>         a ÷> b ~@ c
>
> which operator dominates which? Smalltalk adopted the same solution as APL: no operator precedence. Before Pascal, there was something approaching a consensus in programming languages that
>
>         **                      tightest
>         *, /, div, mod
>         unary and binary +, -
>         relational operators
>         not
>         and
>         or
>
> In order to make life easier with user-defined operators, Algol 68 broke this by making unary operators (including not and others you haven't heard of like 'down' and 'upb') bind tightest. As it turned out, this may have made life easier for the compiler, but not for people. In order, allegedly, to make life easier for students, Pascal broke this by making 'or' and 'and' at the same level as '+' and '*'. To this day, many years after Pascal vanished (Think Pascal is dead, MrP is dead, MPW Pascal is dead, IBM mainframe Pascal died so long ago it doesn't smell any more, Sun Pascal is dead, ...) a couple of generations of programmers believe that you have to write
>
>         (x > 0) && (x < n)
>
> in C, because of what their Pascal-trained predecessors taught them. If we turn to Unicode, how should we read
>
>         a ⊞ b ⟐ c
>
> Maybe someone has a principled way to tell. I don't.

Without claiming to cover all cases, this is a 'principle'. If we have:

    (⊞) :: a -> a -> b
    (⟐) :: b -> b -> c

then ⊞'s precedence should be higher than ⟐'s. This is what makes it natural to have the precedences of (+) (<) (&&) in decreasing order.

This is also why the bitwise operators in C have the wrong precedence:

    x & 0xF == 0xF

has only one meaningful interpretation; C chooses the other! The error comes (probably) from treating & as close to the logical operators like && whereas in fact it is more kin to the arithmetic operators like +.

There are of course other principles: Dijkstra argued vigorously that boolean algebra being completely symmetric in (∨, True) and (∧, False), ∧ and ∨ should have the same precedence. Evidently not too many people agree with him!

----------------------

To come back to the broader questions. I started looking at Niklas' link (thanks Niklas!) http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#unicode-syntax and I find that the new Unicode characters for -<< and >>- are missing. OK, a minor doc bug perhaps? Poking further into that web page, I find that it has charset=ISO-8859-1. Running W3's validator http://validator.w3.org/ on it, one gets: No DOCTYPE found!

What has this got to do with Unicode in program source? That depends on how one sees it. When I studied C (nearly 30 years ago now!) we used gets as a matter of course. Today we don't. Are Kernighan and Ritchie wrong in teaching it? Are today's teachers wrong in proscribing it? I believe the only reasonable outlook is that truth changes with time: it was OK then; it's not today. Likewise DOCTYPE-missing and charset-other-than-UTF-8. Random example showing how right yesterday becomes wrong today: http://www.sitepoint.com/forums/showthread.php?660779-Content-type-iso-8859-1-or-utf-8

Unicode vs ASCII in program source is similar (I believe). My thoughts on this (of a philosophical nature) are here: http://blog.languager.org/2014/04/unicode-and-unix-assumption.html

If we can get the broader agreements (disagreements!) out of the way to start with, we may then look at the details.

Thanks and regards,
Rusi
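The type-based principle in this message translates directly into Haskell fixity declarations. A sketch with made-up operators (the Int/Bool instantiations are mine, standing in for the schematic a/b/c types):

```haskell
-- Made-up operators illustrating the principle: (⊞) produces the
-- values that (⟐) consumes, so (⊞) gets the higher precedence, and
-- a ⊞ b ⟐ c parses as (a ⊞ b) ⟐ c with no parentheses needed.
infixl 7 ⊞
infixl 6 ⟐

(⊞) :: Int -> Int -> Int    -- stand-in for (⊞) :: a -> a -> b
(⊞) = (+)

(⟐) :: Int -> Int -> Bool   -- stand-in for (⟐) :: b -> b -> c
(⟐) = (<)

main :: IO ()
main = print (1 ⊞ 2 ⟐ 4)    -- (1 ⊞ 2) ⟐ 4, i.e. 3 < 4
```

This is exactly the relationship (+) at infixl 6, (<) at infix 4 and (&&) at infixr 3 already have in the Prelude.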

## Re: Unicode Haskell source -- Yippie!

On Wednesday, 30 April 2014, 13:51:38, Rustom Mody wrote:

> Without claiming to cover all cases, this is a 'principle'. If we have:
>
> (⊞) :: a -> a -> b
> (⟐) :: b -> b -> c
>
> then ⊞'s precedence should be higher than ⟐'s.

But what if (⟐) :: b -> b -> a?

> This is what makes it natural to have the precedences of (+) (<) (&&) in decreasing order.
>
> This is also why the bitwise operators in C have the wrong precedence:
>
> x & 0xF == 0xF
>
> has only one meaningful interpretation; C chooses the other! The error comes (probably) from treating & as close to the logical operators like && whereas in fact it is more kin to the arithmetic operators like +.

That comes from & and | being logical operators in B. Quoth Dennis Ritchie (http://cm.bell-labs.com/who/dmr/chist.html in the section "Neonatal C"):

> to make the conversion less painful, we decided to keep the precedence of the & operator the same relative to ==, and merely split the precedence of && slightly from &. Today, it seems that it would have been preferable to move the relative precedences of & and ==, and thereby simplify a common C idiom
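For what it's worth, Haskell did "move the relative precedences": Data.Bits gives (.&.) the multiplication-like fixity infixl 7, well above (==) at infix 4, so the common masking idiom parses the meaningful way without parentheses. A small sketch (the function name is mine):

```haskell
-- Haskell gives bitwise AND an arithmetic-style precedence:
-- (.&.) is infixl 7 and (==) is infix 4, so the masking idiom below
-- parses as (x .&. 0xF) == 0xF -- the reading C's precedence denies.
import Data.Bits ((.&.))

lowNibbleAllSet :: Int -> Bool
lowNibbleAllSet x = x .&. 0xF == 0xF

main :: IO ()
main = print (map lowNibbleAllSet [0x1F, 0x10])  -- [True,False]
```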

## Re: Unicode Haskell source -- Yippie!

On Wed, Apr 30, 2014 at 2:33 PM, Daniel Fischer wrote:

> That comes from & and | being logical operators in B. Quoth Dennis Ritchie (http://cm.bell-labs.com/who/dmr/chist.html in the section "Neonatal C"):
>
>> to make the conversion less painful, we decided to keep the precedence of the & operator the same relative to ==, and merely split the precedence of && slightly from &. Today, it seems that it would have been preferable to move the relative precedences of & and ==, and thereby simplify a common C idiom

Nice! I learn a bit of history. Hope we learn from it! viz. Some things which are easy in a state of transition become painful in a (more) steady state.

## Re: Unicode Haskell source -- Yippie!

In reply to this post by Daniel Fischer

On Wed, Apr 30, 2014 at 2:33 PM, Daniel Fischer wrote:

> On Wednesday, 30 April 2014, 13:51:38, Rustom Mody wrote:
>
>> Without claiming to cover all cases, this is a 'principle'. If we have:
>> (⊞) :: a -> a -> b
>> (⟐) :: b -> b -> c
>> then ⊞'s precedence should be higher than ⟐'s.
>
> But what if (⟐) :: b -> b -> a?

Sorry, missed that question tucked away :-) I did say *a* (not *the*) principle, not claiming to cover all cases! I guess such a pair should be non-associative (i.e. infix without l/r) at the same precedence?
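That suggestion is exactly what a plain `infix` declaration (no l/r) gives in Haskell. A sketch with a made-up operator:

```haskell
-- A plain `infix` declaration makes an operator non-associative:
-- chains at the same precedence are rejected at compile time, so the
-- programmer must parenthesise and the ambiguity never arises.
infix 6 ⟐

(⟐) :: Int -> Int -> Int
(⟐) = (-)

main :: IO ()
main = do
  print ((10 ⟐ 3) ⟐ 2)   -- 5
  print (10 ⟐ (3 ⟐ 2))   -- 9
  -- print (10 ⟐ 3 ⟐ 2)  -- rejected: ⟐ is non-associative
```

(==) at infix 4 works this way in the Prelude, which is why `a == b == c` is a compile-time error rather than a surprise.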
## Re: Unicode Haskell source -- Yippie!

In reply to this post by Rustom Mody

I wrote:

>> If we turn to Unicode, how should we read
>>
>>         a ⊞ b ⟐ c
>>
>> Maybe someone has a principled way to tell. I don't.

Rustom Mody wrote:

> Without claiming to cover all cases, this is a 'principle'. If we have:
>
> (⊞) :: a -> a -> b
> (⟐) :: b -> b -> c
>
> then ⊞'s precedence should be higher than ⟐'s.

I always have trouble with "higher" and "lower" precedence, because I've used languages where the operator with the bigger number binds tighter and languages where the operator with the bigger number gets to dominate the other. Both are natural enough, but with opposite meanings for "higher".

This principle does not explain why * binds tighter than +, which means we need more than one principle. It also means that if OP1 :: a -> a -> b and OP2 :: b -> b -> a, then OP1 should be higher than OP2 and OP2 should be higher than OP1, which is a bit of a puzzler, unless perhaps you are advocating a vaguely CGOL-ish asymmetric precedence scheme where the precedence on the left and the precedence on the right can be different.

For the record, let me stipulate that I had in mind a situation where OP1, OP2 :: a -> a -> a. For example, APL uses the floor and ceiling operators infix to stand for max and min. This principle offers us no help in ordering max and min.

Or consider APL again, whence I'll borrow (using ASCII because this is webmail tonight)

    take, rotate :: Int -> Vector t -> Vector t

Haskell applies operator precedence before it does type checking, so how would it know to parse

    n take m rotate v

as (n take (m rotate v))? I don't believe there was anything in my original example to suggest that either operator had two operands of the same type, so I must conclude that this principle fails to provide any guidance in cases like this one.

> This is what makes it natural to have the precedences of (+) (<) (&&) in decreasing order.
> This is also why the bitwise operators in C have the wrong precedence:

Oh, I agree with that!

> The error comes (probably) from treating & as close to the logical operators like && whereas in fact it is more kin to the arithmetic operators like +.

The error comes from BCPL, where & and && were the same operator (similarly | and ||). At some point in the evolution of C from BCPL the operators were split apart, but the bitwise ones were left in the wrong place.

> There are of course other principles: Dijkstra argued vigorously that boolean algebra being completely symmetric in (∨, True) and (∧, False), ∧ and ∨ should have the same precedence.
>
> Evidently not too many people agree with him!

Sadly, I am reading this in a web browser where the Unicode symbols are completely garbled. (More precisely, I think it's WebMail doing it.) Maybe Unicode isn't ready for prime time yet?

You might be interested to hear that in the Ada programming language, you are not allowed to mix 'and' with 'or' (or 'and then' with 'or else') without using parentheses. The rationale is that the designers did not believe that enough programmers understood the precedence of and/or. The GNU C compiler kvetches when you have p && q || r without otiose parentheses. Seems that there are plenty of designers out there who agree with Dijkstra, not out of a taste for well-engineered notation, but out of contempt for the Average Programmer.

> When I studied C (nearly 30 years now!) we used gets as a matter of course. Today we don't.

Hmm. I started with C in late 1979. Ouch. That's 34 and a half years ago. This was under Unix version 6+, with a slightly "pre-classic" C. A little later we got EUC Unix version 7, and a 'classic' C compiler that, oh joy, supported /\ (min) and \/ (max) operators. [With a bug in the code generator that I patched.]

> Are Kernighan and Ritchie wrong in teaching it? Are today's teachers wrong in proscribing it?
> I believe the only reasonable outlook is that truth changes with time: it was OK then; it's not today.

In this case, bull-dust! gets() is rejected today because a botch in its design makes it bug-prone. Nothing has changed. It was bug-prone 34 years ago. It has ALWAYS been a bad idea to use gets(). Amongst other things, the Unix manuals have always presented the difference between gets() -- discards the terminator -- and fgets() -- annoyingly retains the terminator -- as a bug which they thought it was too late to fix; after all, C had hundreds of users! No, it was obvious way back then: you want to read a line? Fine, WRITE YOUR OWN FUNCTION, because there is NO C library function that does quite what you want. The great thing about C was that you *could* write your own line-reading function without suffering. Not only would your function do the right thing (whatever you conceived that to be), it would be as fast, or nearly as fast, as the built-in one. Try doing *that* in PL/I! No, in this case, *opinions* may have changed, people's *estimation* of and *tolerance for* the risks may have changed, but the truth has not changed.

> Likewise DOCTYPE-missing and charset-other-than-UTF-8. Random example showing how right yesterday becomes wrong today: http://www.sitepoint.com/forums/showthread.php?660779-Content-type-iso-8859-1-or-utf-8

Well, "missing" DOCTYPE is where it starts to get a bit technical. An SGML document is basically made up of three parts:

  - an SGML declaration (meta-meta-data) that tells the parser, amongst other things, what characters to use for delimiters, whether various things are case sensitive, what the numeric limits are, and whether various features are enabled.
  - a Document Type Declaration (meta-data) that conforms to the lexical rules set up by the SGML declaration and defines (a) the grammar rules and (b) a bunch of macros.
  - a document (data).
The SGML declaration can be supplied to a parser as data (and yes,
I've done that), or it can be stipulated by convention (as the HTML
standards do).  In the same way, the DTD can be

  - completely declared in-line
  - defined by reference with local amendments
  - defined solely by reference
  - known by convention.

If there is a convention that a document without a DTD uses a
particular DTD, SGML is fine with that.  (It's all part of "entity
management", one of the minor arcana of SGML.)

As for the link in question, it doesn't show right turning into wrong.
A quick summary of the sensible part of that thread:

  - If you use a tag to specify the encoding of your file, it had
    better be *right*.  This has been true ever since tags first
    existed.

  - If you have a document in Latin 1 and any characters outside that
    range are written as character entity references or numeric
    character references, there is no need to change.  No change of
    right to wrong here!

  - If you want to use English punctuation marks like dashes and
    curly quotes, using UTF-8 will let you write these characters
    without character entities or NCRs.  This is only half true.  It
    will let you do this conveniently IF your local environment has
    fonts that include the characters.  (Annoyingly, in Mac OS 10.6,
    which I'm typing on, Edit|Special characters is not only
    geographically confused, listing Coptic as a *European* script --
    last time I checked, Egypt was still in Africa -- but it doesn't
    display any Coptic characters.  In the Mac OS 10.7 system I
    normally use, Edit|Special characters got dramatically worse as
    an interface, but no more competent with Coptic characters.  Just
    because a character is in Unicode doesn't mean it can be *used*,
    practically speaking.)
    Instead of saying that what is wrong has become or is becoming
    right, I'd prefer to say that what was impossible is becoming
    possible and what was broken (Unicode font support) is gradually
    getting fixed.

  - Some Unicode characters, indeed, some Latin 1 characters, are so
    easy to confuse with other characters that it is advisable to use
    character entities.  Again, nothing about wrong turning into
    right.  This was good advice as soon as Latin 1 came out.

> > Unicode vs ASCII in program source is similar (I believe).

Well, not really.  People using specification languages like Z
routinely used characters way outside the ASCII range; one way was to
use LaTeX.  Another way was to have GUI systems that let you key in
using LaTeX character names or menus but see the intended characters.
Back in about 1984 I was able to use a 16-bit character set on the
Xerox Lisp Machines.  I've still got a manual for the XNS character
set somewhere.  In one of the founding documents for the ISO Prolog
standard, I recommended, in 1984, that the Prolog standard should
allow a wider range of characters than ASCII.  That's THREE YEARS
before Unicode was a gleam in its founders' eyes.

This is NOT new.  As soon as there were bit-mapped displays and laser
printers, there was pressure to allow a wider range of characters in
programs.  Let me repeat that: 30 years ago I was able to use
non-ASCII characters in computer programs.  *Easily*, via virtual
keyboards.

In 1987, the company I was working at in California revamped their
system to handle 16-bit characters and we bought a terminal that
could handle Japanese characters.  Of course this was because we
wanted to sell our system in Japan.  But this was shortly before X11
came out; the MIT window system of the day was X10, and the operating
system we were using the 16-bit characters on was VMS.  That's 27
years ago.  This is not new.

So what _is_ new?

* A single standard.  Wait, we DON'T have a single standard.
  We have a single standard *provider* issuing a rapid series of
  revisions of an increasingly complex standard, where entire
  features are first rejected outright, then introduced, and then
  deprecated again.  Unicode 6.3 came out last year with five new
  characters (bringing the total to 110,122), over a thousand new
  character *variants*, two new normative properties, and a new BIDI
  algorithm which I don't yet understand.  And Unicode 7.0 is due out
  in 3 months.  Because of this

  - different people WILL have tools that understand different
    versions of Unicode.  In fact, different tools in the same
    environment may do this.

  - your beautiful character WILL show up as garbage or even blank on
    someone's screen UNLESS it is an old or extremely popular (can
    you say Emoji?  I knew you could.  Can you teach me how to say
    it?) one.

  - when proposing to exploit Unicode characters, it is VITAL to
    understand what the Unicode "stability" rules are and which
    characters have what stable properties.

* With large cheap discs, large fonts are looking like a lot less of
  a problem.  (I failed to learn to read the Armenian letters, but do
  have those.  I succeeded in learning to read the Coptic letters --
  but not the language(s)! -- but don't have those.  Life is not
  fair.)

* We now have (a series of versions of) a standard character set
  containing a vast number of characters.  I very much doubt whether
  there is any one person who knows all the Unicode characters.

* Many of these characters are very similar.  I counted 64 "right
  arrow" characters before I gave up; this didn't include harpoons.
  Some of these are _very_ similar.  Some characters are visibly
  distinct, but normally regarded as mere stylistic differences.  For
  example, <= has at least three variations (one bar, slanted; one
  bar, flat; two bars, flat) which people familiar with less than or
  equal have learned *not* to tell apart.
  But they are three different Unicode characters, from which we
  could make three different operators with different precedence or
  associativity, and of course type.

> My thoughts on this (of a philosophical nature) are:
> http://blog.languager.org/2014/04/unicode-and-unix-assumption.html
>
> If we can get the broader agreements (disagreements!) out of the
> way to start with, we may then look at the details.

I think Haskell can tolerate an experimental phase where people try
out a lot of things, as long as everyone understands that it *IS* an
experimental phase, and as long as experimental operators are kept
out of Hackage, certainly out of the Platform, or at least segregated
into areas with big flashing "danger" signs.

I think a *small* number of "pretty" operators can be added to
Haskell without the sky falling, and I'll probably quite like the
result.  (Does anyone know how to get a copy of the collected The
Squiggolist?)  Let's face it, if a program is full of Armenian
identifiers or Ogham ones, I'm not going to have a clue what it's
about anyway.  But keeping the "standard" -- as in used in core
modules -- letter and operator sets smallish is probably a good idea.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
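P.S.  To make the three-lookalikes point concrete: Haskell accepts any
Unicode symbol character in an operator name, so the trap described
above is real.  The definitions below are my own sketch, not anything
from a library; here the two look-alikes happen to agree, but nothing
stops a less kind library from giving them different fixities or types.

```haskell
-- U+2264 (≤) and U+2A7D (⩽) are visually near-identical "less than
-- or equal" characters, yet they are distinct code points and hence
-- distinct Haskell operators with independent fixity declarations.
infix 4 ≤
(≤) :: Ord a => a -> a -> Bool
(≤) = (<=)

infix 4 ⩽
(⩽) :: Ord a => a -> a -> Bool
(⩽) = (<=)

main :: IO ()
main = print (3 ≤ 4, 5 ⩽ 4)  -- prints (True,False)
```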