Thank you all for the responses. Here's an example:
As I already said, I am trying to parse the MMIXAL assembly language. Each instruction has up to three operands, looking like this:

    @+4         (jump four bytes forward)
    "foo"       (the string foo)
    '0'>>(1+2)

etc. A string literal may contain anything but a newline (there are no escape codes or similar). But when I add a check for a newline, the parser just fails and the next one is tried. This is undesired, as I want to return an error like "unexpected newline" instead. How is this handled in other parsers?

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Hi,
Robert Clausecker wrote:
> Each instruction has up to three operands, looking like this:
>
>     @+4         (jump four bytes forward)
>     "foo"       (the string foo)
>     '0'>>(1+2)
>
> etc. A string literal may contain anything but a newline (there are
> no escape codes or similar). But when I add a check for a newline,
> the parser just fails and the next one is tried. This is undesired, as
> I want to return an error like "unexpected newline" instead. How is
> this handled in other parsers?

I would expect that the other parsers are tried, but fail, because they do not accept an initial quotation mark. You then get two error messages:

1. Unexpected newline after quotation mark
2. Unexpected quotation mark

These two error messages reflect the two ways to solve the problem: either delete the first quotation mark, or add a second one.

Tillmann

PS. Please use "Reply" to answer posts, so that they can be put into the same thread.
In reply to this post by Robert Clausecker
On Wed, 2 Mar 2011 14:14:02 +0100, you wrote:
>Thank you all for the responses. Here's an example:
>
>As I already said, I am trying to parse the MMIXAL assembly language.
>Each instruction has up to three operands, looking like this:
>
>    @+4         (jump four bytes forward)
>    "foo"       (the string foo)
>    '0'>>(1+2)
>
>etc. A string literal may contain anything but a newline (there are
>no escape codes or similar). But when I add a check for a newline,
>the parser just fails and the next one is tried. This is undesired, as
>I want to return an error like "unexpected newline" instead. How is
>this handled in other parsers?

Tillmann's reply is absolutely correct. If a particular sequence of characters is invalid according to your grammar, then _all_ of the alternatives in scope at that point should fail to parse that sequence. If that's not happening, then there's something wrong with the way you've expressed your grammar.

I don't know how much experience you have with language grammars, but it might be helpful to try to write down MMIXAL's grammar using EBNF notation, as a starting point.

-Steve Schafer
In reply to this post by Robert Clausecker
Apologies if this has been answered already (I've got a bit lost with this thread), but the *try* here seems to be giving you precisely the behaviour you don't want.

*try* means backtrack on failure, and try the next parser. So if you want ill-formed strings to throw an error when they aren't properly enclosed in double quotes, don't use try.

    <|> try $ (char '"' *> (StringLit . B.pack <$> manyTill (notChar '\n') (char '"')))
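The effect of *try* on the choice operator can be seen in a small Parsec sketch. It is only an illustration under assumptions: it uses Parsec on plain `String` input rather than the ByteString parser in the quoted line, and the `Operand` type, the `Raw` constructor and the catch-all `fallback` parser are hypothetical stand-ins for whatever alternatives the real grammar has.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical operand type, for illustration only.
data Operand
  = StringLit String  -- a double-quoted literal
  | Raw String        -- whatever the catch-all alternative accepted
  deriving (Eq, Show)

-- A string literal: anything but a newline between double quotes.
-- Once the opening quote has been consumed, a later failure counts as
-- "failed after consuming input".
stringLit :: Parser Operand
stringLit = StringLit <$> (char '"' *> manyTill (noneOf "\n") (char '"'))

-- Deliberately permissive stand-in for "the next parser" (an assumption).
fallback :: Parser Operand
fallback = Raw <$> many anyChar

-- Without try: a broken literal is reported as a parse error, because
-- (<|>) does not try the alternative after input has been consumed.
committed :: Parser Operand
committed = stringLit <|> fallback

-- With try: the broken literal is silently handed to the fallback.
backtracking :: Parser Operand
backtracking = try stringLit <|> fallback
```

In GHCi, `parse committed "" "\"foo\nbar\""` yields a `Left` with a parse error, while `parse backtracking ""` on the same input yields `Right (Raw ...)`, which is the "next one is tried" behaviour described in the original question. Other libraries may treat `try` differently, so this applies to Parsec as sketched here.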
Actually this is stranger than I thought - from testing it seems like Attoparsec's (<|>) is different to Parsec's. From what I'm seeing, Attoparsec appears to do a full backtrack for (<|>) regardless of whether the string lexer is wrapped in try, whereas Parsec needs try to backtrack.

On 2 March 2011 16:24, Stephen Tetley <[hidden email]> wrote:
>
> *try* means backtrack on failure, and try the next parser. So if you
> want ill formed strings to throw an error if they aren't properly
> enclosed in double quotes don't use try.
Actually, it's not <|> that's different, it's the string combinator.

In Parsec, string matches each character one at a time. If the match fails, any partial input it matched is consumed.

In attoparsec, string matches either the entire thing or not, as a single step. If it fails to match, no input is consumed.

Carl

On Wed, Mar 2, 2011 at 9:51 AM, Stephen Tetley <[hidden email]> wrote:
> Actually this is stranger than I thought - from testing it seems like
> Attoparsec's (<|>) is different to Parsec's. From what I'm seeing
> Attoparsec appears to do a full back track for (<|>) regardless of
> whether the string lexer is wrapped in try, whereas Parsec needs try
> to backtrack.
>
> On 2 March 2011 16:24, Stephen Tetley <[hidden email]> wrote:
>
>>
>> *try* means backtrack on failure, and try the next parser. So if you
>> want ill formed strings to throw an error if they aren't properly
>> enclosed in double quotes don't use try.
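The Parsec half of that description can be checked with a short sketch. The keywords "foo" and "for" are arbitrary examples chosen here because they share a prefix; they are not taken from MMIXAL.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- string "foo" consumes the matching prefix "fo" of the input "for"
-- before failing, so (<|>) never gets to try string "for".
keyword :: Parser String
keyword = string "foo" <|> string "for"

-- Wrapping the first alternative in try rewinds the input on failure,
-- so the second alternative sees the whole input again.
keywordTry :: Parser String
keywordTry = try (string "foo") <|> string "for"
```

With Parsec, `parse keyword "" "for"` returns a `Left`, whereas `parse keywordTry "" "for"` returns `Right "for"`. According to the description above, attoparsec's `string` would not need the `try` here; that part is not exercised by this sketch.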