Parsing of bytestrings with non-String errors?


Parsing of bytestrings with non-String errors?

Magnus Therning
I've looked at polyparse and attoparsec and they seem to have in common that
the error always is a String.  My current ideas for a project would be a lot
easier if I could just return some other type, something that I can pattern
match on.

Is there a parser combinator library out there that works on bytestrings and
allows using a custom error type?

Or maybe there's some very basic reason why String is so commonly used?

/M

--
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus@therning.org          Jabber: magnus@therning.org
http://therning.org/magnus         identi.ca|twitter: magthe


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Parsing of bytestrings with non-String errors?

Malcolm Wallace
> Is there a parser combinator library out there that works on  
> bytestrings and
> allows using a custom error type?

The HuttonMeijerWallace combinators (distributed with polyparse) have  
the custom error type, but not the bytestrings.

> Or maybe there's some very basic reason why String is so commonly  
> used?

I don't think there is any deep reason.  Strings are convenient, that  
is all.

My guesstimate would be that if you take (e.g.) the polyparse  
combinators, and manually rewrite String everywhere to a parameter e,  
(only when it represents an error of course), it would take you maybe  
an hour in total, including fixing up any site you missed that the  
typechecker catches for you.
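
A minimal sketch of what that rewrite could look like on a hand-rolled bytestring parser. This is not polyparse's actual code; `Parser`, `satisfy`, `MyError` and `pair` are names made up for illustration. The only point is that the error slot is a type parameter `e` instead of String, so the caller can pattern match on the result.

```haskell
import Data.Char (isDigit)
import qualified Data.ByteString.Char8 as B

-- Hypothetical minimal parser: the error type 'e' is a parameter, not String.
newtype Parser e a = Parser
  { runParser :: B.ByteString -> Either e (a, B.ByteString) }

instance Functor (Parser e) where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Left err        -> Left err
    Right (a, rest) -> Right (f a, rest)

instance Applicative (Parser e) where
  pure a = Parser $ \s -> Right (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Left err        -> Left err
    Right (f, rest) -> case pa rest of
      Left err         -> Left err
      Right (a, rest') -> Right (f a, rest')

instance Monad (Parser e) where
  Parser p >>= k = Parser $ \s -> case p s of
    Left err        -> Left err
    Right (a, rest) -> runParser (k a) rest

-- The caller supplies the error value to report when the predicate fails
-- or the input is empty.
satisfy :: e -> (Char -> Bool) -> Parser e Char
satisfy err ok = Parser $ \s -> case B.uncons s of
  Just (c, rest) | ok c -> Right (c, rest)
  _                     -> Left err

-- Hypothetical error type that client code can pattern match on.
data MyError = ExpectedDigit | ExpectedColon
  deriving (Eq, Show)

-- Parses: digit ':' digit
pair :: Parser MyError (Char, Char)
pair = do
  a <- satisfy ExpectedDigit isDigit
  _ <- satisfy ExpectedColon (== ':')
  b <- satisfy ExpectedDigit isDigit
  return (a, b)
```

With these definitions, `runParser pair (B.pack "1-2")` gives `Left ExpectedColon`, which can be matched directly instead of inspecting a message.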

Regards,
     Malcolm


Re: Parsing of bytestrings with non-String errors?

Permjacov Evgeniy
In reply to this post by Magnus Therning
On 02/21/2010 11:57 PM, [hidden email] wrote:

> Message: 2
> Date: Sun, 21 Feb 2010 12:36:21 +0000
> From: Magnus Therning <[hidden email]>
> Subject: [Haskell-cafe] Parsing of bytestrings with non-String errors?
> To: haskell-cafe <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset="utf-8"
>
> I've looked at polyparse and attoparsec and they seem to have in common that
> the error always is a String.  My current ideas for a project would be a lot
> easier if I could just return some other type, something that I can pattern
> match on.
>
> Is there a parser combinator library out there that works on bytestrings and
> allows using a custom error type?
>
> Or maybe there's some very basic reason why String is so commonly used?
>
>  

You can try to play with ParsecT / ErrorT
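
One way this suggestion might be played out, as a sketch only. It assumes the parsec package (version 3, which has a Stream instance for Char8 bytestrings) and uses plain `Either MyError` as the underlying monad in place of ErrorT, which later transformers releases superseded with ExceptT. `MyError`, `P`, `version` and `runP` are hypothetical names.

```haskell
import Control.Monad.Trans.Class (lift)
import qualified Data.ByteString.Char8 as B
import Text.Parsec

-- Hypothetical structured error.
data MyError = BadVersion Int
  deriving (Eq, Show)

-- Parsec over a strict ByteString, layered on top of 'Either MyError'.
type P = ParsecT B.ByteString () (Either MyError)

-- Parses "v<digits>"; versions above 2 are rejected with a structured error
-- raised in the underlying monad rather than as a Parsec message.
version :: P Int
version = do
  _ <- char 'v'
  digits <- many1 digit
  let n = read digits :: Int
  if n > 2 then lift (Left (BadVersion n)) else return n

-- Outer Either: the custom error. Inner Either: Parsec's own ParseError.
runP :: P a -> B.ByteString -> Either MyError (Either ParseError a)
runP p = runParserT p () "<input>"
```

A caveat of this layout: an error raised with `lift` aborts the whole run, so `<|>` cannot backtrack over it, and ordinary syntax errors still arrive as Parsec's `ParseError`.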


Re: Parsing of bytestrings with non-String errors?

Bryan O'Sullivan
In reply to this post by Magnus Therning
On Sun, Feb 21, 2010 at 4:36 AM, Magnus Therning <[hidden email]> wrote:
> I've looked at polyparse and attoparsec and they seem to have in common that
> the error always is a String.  My current ideas for a project would be a lot
> easier if I could just return some other type, something that I can pattern
> match on.

It would be easy enough to add this, but you'd end up with a slightly convoluted API. Because of the presence of fail in all monadic APIs, you'd have to support only-a-string as a failure result in some form, so your failure type would have to be something like Either String a.

There's no support for this in attoparsec simply because I haven't needed it. I suspect the same is true of other libraries, nothing deeper.
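
A sketch of the kind of failure type described above, not attoparsec's API. It assumes GHC 8.8 or later, where `fail` lives in the `MonadFail` class exported from the Prelude rather than in `Monad` as it did when this thread was written. `Failure`, `failWith`, `takeN`, `MyError` and `lengthPrefixed` are hypothetical names.

```haskell
import Control.Monad (ap, liftM)
import qualified Data.ByteString.Char8 as B

-- Left: a bare message coming from 'fail'. Right: a structured error.
type Failure e = Either String e

newtype Parser e a = Parser
  { runParser :: B.ByteString -> Either (Failure e) (a, B.ByteString) }

instance Functor (Parser e) where
  fmap = liftM

instance Applicative (Parser e) where
  pure a = Parser $ \s -> Right (a, s)
  (<*>)  = ap

instance Monad (Parser e) where
  Parser p >>= k = Parser $ \s -> case p s of
    Left err        -> Left err
    Right (a, rest) -> runParser (k a) rest

-- 'fail' only has a String to work with, so it lands in the Left branch.
instance MonadFail (Parser e) where
  fail msg = Parser $ \_ -> Left (Left msg)

-- Fail with a structured error instead.
failWith :: e -> Parser e a
failWith err = Parser $ \_ -> Left (Right err)

-- Take exactly n bytes, or fail with a plain message.
takeN :: Int -> Parser e B.ByteString
takeN n = Parser $ \s ->
  if B.length s < n
    then Left (Left "not enough input")
    else Right (B.splitAt n s)

data MyError = TooLong Int
  deriving (Eq, Show)

-- A one-digit length followed by that many bytes.
lengthPrefixed :: Parser MyError B.ByteString
lengthPrefixed = do
  -- Refutable pattern: a non-digit makes the do block call 'fail'.
  Just (n, _) <- fmap B.readInt (takeN 1)
  if n > 4 then failWith (TooLong n) else takeN n
```

Callers then have to handle both branches: `Left (Right (TooLong n))` can be matched on, while `Left (Left msg)` is still only a string.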


Re: Parsing of bytestrings with non-String errors?

Magnus Therning
On 22/02/10 18:44, Bryan O'Sullivan wrote:

> On Sun, Feb 21, 2010 at 4:36 AM, Magnus Therning <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     I've looked at polyparse and attoparsec and they seem to have in common
>     that the error always is a String.  My current ideas for a project would
>     be a lot easier if I could just return some other type, something that I
>     can pattern match on.
>
>
> It would be easy enough to add this, but you'd end up with a slightly
> convoluted API. Because of the presence of fail in all monadic APIs, you'd
> have to support only-a-string as a failure result in some form, so your
> failure type would have to be something like Either String a.

My thoughts went more like a parser type like

    data Parser e a = ...

Possibly with the addition that 'e' implements a class that goes something
like

    class ParserError e where
        baseError :: e
        addError :: e -> e -> e

(At first I thought that maybe Monoid would do, but requiring both an identity
and associativity feels awkward in this case. :-)

With 'String' implemented something like

    instance ParserError String where
        baseError = "Parser error, expected:\n"
        addError = (++)

Then I believe 'Parser String' would be equivalent to the existing attoparsec
parser (as found in the 0.7 series).
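
To make the idea concrete, here is a small self-contained sketch built around the class above. It is not attoparsec code: the `Parser` representation, `literal`, `orElse`, `parse`, `Expected` and the two `method` parsers are all assumptions made for illustration, and the String instance needs FlexibleInstances on compilers that do not enable it by default.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.ByteString.Char8 as B

class ParserError e where
  baseError :: e
  addError  :: e -> e -> e

instance ParserError String where
  baseError = "Parser error, expected:\n"
  addError  = (++)

-- Hypothetical structured alternative: the list of things that were expected.
newtype Expected = Expected [String]
  deriving (Eq, Show)

instance ParserError Expected where
  baseError = Expected []
  addError (Expected xs) (Expected ys) = Expected (xs ++ ys)

newtype Parser e a = Parser
  { runParser :: B.ByteString -> Either e (a, B.ByteString) }

-- Match a literal prefix, or fail with the error supplied by the caller.
literal :: e -> String -> Parser e ()
literal err lit = Parser $ \s ->
  let prefix = B.pack lit
  in if prefix `B.isPrefixOf` s
       then Right ((), B.drop (B.length prefix) s)
       else Left err

-- Try the first parser, then the second; if both fail, combine their errors.
orElse :: ParserError e => Parser e a -> Parser e a -> Parser e a
orElse (Parser p) (Parser q) = Parser $ \s -> case p s of
  Right ok -> Right ok
  Left e1  -> case q s of
    Right ok -> Right ok
    Left e2  -> Left (e1 `addError` e2)

-- Run a parser, starting the final error from 'baseError'.
parse :: ParserError e => Parser e a -> B.ByteString -> Either e a
parse p s = case runParser p s of
  Left err     -> Left (baseError `addError` err)
  Right (a, _) -> Right a

method :: Parser Expected ()
method = literal (Expected ["GET"]) "GET" `orElse` literal (Expected ["PUT"]) "PUT"

methodS :: Parser String ()
methodS = literal "GET\n" "GET" `orElse` literal "PUT\n" "PUT"
```

Here `parse method (B.pack "POST /")` yields `Left (Expected ["GET","PUT"])`, while `parse methodS` on the same input yields the familiar message string.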

I still haven't convinced myself that this will work though.  Also, I had a
look at attoparsec on bitbucket, and there are some major changes between 0.7
and 0.8.  I realised I'll have to spend a lot more time understanding the code
than I initially hoped.  Right now that is unlikely to happen any time soon :(

> There's no support for this in attoparsec simply because I haven't
> needed it. I suspect the same is true of other libraries, nothing deeper.

Yeah, that's what I thought.  In a current project I just need to
differentiate between errors in different parts of the parser.  Handling
those errors would be much simpler if I could use pattern matching rather
than inspecting strings.

/M

--
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus@therning.org          Jabber: magnus@therning.org
http://therning.org/magnus         identi.ca|twitter: magthe



Re: Parsing of bytestrings with non-String errors?

Bryan O'Sullivan
On Mon, Feb 22, 2010 at 2:38 PM, Magnus Therning <[hidden email]> wrote:
> My thoughts went more like a parser type like
>
>    data Parser e a = ...

Yes, I knew that's where you were going :-)

The trouble is, you'd still have to deal with

    fail :: Monad m => String -> m a

which would require your failure type to somehow accept a string. Plumbing that in would be a little more awkward than your initial explorations suggest :-\

You have two problems. The first is how to construct a value of your failure type that accepts a String parameter so that you can support users of "fail". The second is that you might need to pass extra information to construct your failure value when naïve code uses fail or mzero, otherwise you will only get useful error values out quite infrequently.
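
Both problems can be seen in a small sketch. It assumes GHC 8.8 or later (where `fail` belongs to `MonadFail`) and a hypothetical class `FailureType` that says how to build an error from a bare message and what to return when there is no information at all; none of these names come from attoparsec.

```haskell
import Control.Applicative (Alternative (..))
import Control.Monad (ap, liftM)
import qualified Data.ByteString.Char8 as B

-- Hypothetical class covering the two cases generic code can trigger.
class FailureType e where
  fromMessage :: String -> e  -- problem 1: all 'fail' provides is a String
  noInfo      :: e            -- 'empty' / 'mzero' provide nothing at all

newtype Parser e a = Parser
  { runParser :: B.ByteString -> Either e (a, B.ByteString) }

instance Functor (Parser e) where
  fmap = liftM

instance Applicative (Parser e) where
  pure a = Parser $ \s -> Right (a, s)
  (<*>)  = ap

instance Monad (Parser e) where
  Parser p >>= k = Parser $ \s -> case p s of
    Left err        -> Left err
    Right (a, rest) -> runParser (k a) rest

instance FailureType e => MonadFail (Parser e) where
  fail msg = Parser $ \_ -> Left (fromMessage msg)

instance FailureType e => Alternative (Parser e) where
  empty = Parser $ \_ -> Left noInfo
  -- On failure of the left parser, rerun the right one on the same input.
  Parser p <|> Parser q = Parser $ \s -> either (const (q s)) Right (p s)

data MyError = HeaderError | Generic String | Unknown
  deriving (Eq, Show)

instance FailureType MyError where
  fromMessage = Generic
  noInfo      = Unknown

-- Problem 2: naive code that calls 'fail' never produces HeaderError,
-- only the catch-all Generic constructor.
naive :: Parser MyError ()
naive = fail "bad header"
```

So the custom type works, but unless parsers are written to raise `HeaderError` explicitly, most failures end up as `Generic` or `Unknown`, which is little better than a String.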
 
> I still haven't convinced myself that this will work though.  Also, I had a
> look at attoparsec on bitbucket, and there are some major changes between 0.7
> and 0.8.

Even though those changes represent a major modification to the internals of attoparsec, they shouldn't really affect what you want to do, or anything interesting about how to do it.


Re: Parsing of bytestrings with non-String errors?

Magnus Therning
On Tue, Feb 23, 2010 at 00:39, Bryan O'Sullivan <[hidden email]> wrote:

> On Mon, Feb 22, 2010 at 2:38 PM, Magnus Therning <[hidden email]>
> wrote:
>>
>> My thoughts went more like a parser type like
>>
>>    data Parser e a = ...
>
> Yes, I knew that's where you were going :-)
> The trouble is, you'd still have to deal with
> fail :: Monad m => String -> m a
> which would require your failure type to somehow accept a string. Plumbing
> that in would be a little more awkward than your initial explorations suggest
> :-\
> You have two problems. The first is how to construct a value of your failure
> type that accepts a String parameter so that you can support users of
> "fail". The second is that you might need to pass extra information to
> construct your failure value when naïve code uses fail or mzero, otherwise
> you will only get useful error values out quite infrequently.

Yes, I suspected there'd be something I had missed.

I guess it would require 'Parser e a' to have a 'fail' that's
similar to the one in 'Maybe'.  Users would then be forced to use
'<?>' to get useful error info out.  Would that be an unworkable
situation?
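
A rough sketch of that arrangement, under assumptions: GHC 8.8 or later for `MonadFail`, a hand-rolled `Parser`, and a `<?>` that takes a value of the error type rather than a String as attoparsec's does. In this sketch the innermost label wins, which is one possible design choice among several.

```haskell
import Control.Monad (ap, liftM)
import qualified Data.ByteString.Char8 as B

-- Nothing: a failure with no information (as in Maybe). Just e: a labelled one.
newtype Parser e a = Parser
  { runParser :: B.ByteString -> Either (Maybe e) (a, B.ByteString) }

instance Functor (Parser e) where
  fmap = liftM

instance Applicative (Parser e) where
  pure a = Parser $ \s -> Right (a, s)
  (<*>)  = ap

instance Monad (Parser e) where
  Parser p >>= k = Parser $ \s -> case p s of
    Left err        -> Left err
    Right (a, rest) -> runParser (k a) rest

-- Like Maybe's 'fail': the message is simply dropped.
instance MonadFail (Parser e) where
  fail _ = Parser $ \_ -> Left Nothing

-- Attach a structured error to an otherwise unlabelled failure.
infix 0 <?>
(<?>) :: Parser e a -> e -> Parser e a
Parser p <?> err = Parser $ \s -> case p s of
  Left Nothing -> Left (Just err)
  other        -> other

-- Match one specific character; fails without a label.
char :: Char -> Parser e ()
char c = Parser $ \s -> case B.uncons s of
  Just (x, rest) | x == c -> Right ((), rest)
  _                       -> Left Nothing

data MyError = BadHeader | BadBody
  deriving (Eq, Show)

message :: Parser MyError ()
message = do
  char 'H' <?> BadHeader
  char 'B' <?> BadBody
```

With this, `runParser message (B.pack "HX")` returns `Left (Just BadBody)`, whereas any parser left unlabelled only ever reports `Left Nothing`.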

>> I still haven't convinced myself that this will work though.  Also, I had
>> a
>> look at attoparsec on bitbucket, and there are some major changes between
>> 0.7
>> and 0.8.
>
> Even though those changes represent a major modification to the internals of
> attoparsec, they shouldn't really affect what you want to do, or anything
> interesting about how to do it.

Ah, that's good.  I think I'll have to postpone any work on this for
now though, and instead implement a 'String -> MyErrorType' function
for, hopefully, temporary use.  I've already been sidetracked twice
before ;-)
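
Such a stopgap could look roughly like the following. The constructors and the substrings being searched for are hypothetical; the sketch assumes the parser's own labels (for example ones attached with '<?>') contain the words "header" and "body", and it does not rely on any particular message format from attoparsec.

```haskell
import Data.List (isInfixOf)

-- Hypothetical error type for the surrounding project.
data MyErrorType
  = HeaderError
  | BodyError
  | OtherError String  -- anything unrecognised keeps its original message
  deriving (Eq, Show)

-- Classify a parser's String error by looking for labels assumed to have
-- been attached in the parser itself.
toMyErrorType :: String -> MyErrorType
toMyErrorType msg
  | "header" `isInfixOf` msg = HeaderError
  | "body"   `isInfixOf` msg = BodyError
  | otherwise                = OtherError msg
```

The rest of the program can then pattern match on `MyErrorType`, and only this one function needs replacing if a parser with typed errors becomes available.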

/M

--
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus@therning.org          Jabber: magnus@therning.org
http://therning.org/magnus         identi.ca|twitter: magthe