looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
Does anyone have any example alex+happy programs that uses location
location annotated tokens and location information of token while
reporting parse errors?

I already wrote lexer and my function type is something like:

> runLexer :: String -> [(Token, AlexPosn)]

where AlexPosn type contains location information. But I have no idea
how can I use that in happy to parse and use location information in
parse error messages. There are lots of examples in interwebs but as
far as I can see none of them works on location-annotated tokens.

Thanks..

---
Ömer Sinan Ağacan
http://osa1.net
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Mateusz Kowalczyk
On 08/03/14 21:47, Ömer Sinan Ağacan wrote:

> Does anyone have any example alex+happy programs that uses location
> location annotated tokens and location information of token while
> reporting parse errors?
>
> I already wrote lexer and my function type is something like:
>
>> runLexer :: String -> [(Token, AlexPosn)]
>
> where AlexPosn type contains location information. But I have no idea
> how can I use that in happy to parse and use location information in
> parse error messages. There are lots of examples in interwebs but as
> far as I can see none of them works on location-annotated tokens.
>
> Thanks..
>
> ---
> Ömer Sinan Ağacan
> http://osa1.net
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>

I believe GHC does this. See the compiler/parser directory of the GHC
repo. Perhaps it's not the most lightweight example but it's the only
readily available one that I can think of.

--
Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Erik de Castro Lopo-34
In reply to this post by Ömer Sinan Ağacan
Ömer Sinan Ağacan wrote:

> Does anyone have any example alex+happy programs that uses location
> location annotated tokens and location information of token while
> reporting parse errors?
>
> I already wrote lexer and my function type is something like:
>
> > runLexer :: String -> [(Token, AlexPosn)]
>
> where AlexPosn type contains location information. But I have no idea
> how can I use that in happy to parse and use location information in
> parse error messages. There are lots of examples in interwebs but as
> far as I can see none of them works on location-annotated tokens.

Also look at the DDC codebase:

   https://github.com/DDCSF/ddc

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
In reply to this post by Mateusz Kowalczyk
I think GHC uses alex+happy in very different way than explained in
the docs.. For example it doesn't use "posn", "monadUserState" etc.
wrappers as explained in alex documentations. I also can't figure out
how does it keep track of token locations.. I guess It's just too
complicated for me to understand without spending several hours.

---
Ömer Sinan Ağacan
http://osa1.net


2014-03-09 9:49 GMT+02:00 Mateusz Kowalczyk <[hidden email]>:

> On 08/03/14 21:47, Ömer Sinan Ağacan wrote:
>> Does anyone have any example alex+happy programs that uses location
>> location annotated tokens and location information of token while
>> reporting parse errors?
>>
>> I already wrote lexer and my function type is something like:
>>
>>> runLexer :: String -> [(Token, AlexPosn)]
>>
>> where AlexPosn type contains location information. But I have no idea
>> how can I use that in happy to parse and use location information in
>> parse error messages. There are lots of examples in interwebs but as
>> far as I can see none of them works on location-annotated tokens.
>>
>> Thanks..
>>
>> ---
>> Ömer Sinan Ağacan
>> http://osa1.net
>> _______________________________________________
>> Haskell-Cafe mailing list
>> [hidden email]
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
> I believe GHC does this. See the compiler/parser directory of the GHC
> repo. Perhaps it's not the most lightweight example but it's the only
> readily available one that I can think of.
>
> --
> Mateusz K.
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
In reply to this post by Erik de Castro Lopo-34
I don't think DDC uses alex/happy for parsing:

```
➜  ddc git:(master) find . -iname "*.y"
➜  ddc git:(master) find . -iname "*.x"
```

and there are some Parsec parsers in `packages/ddc-core/DDC/Core/Parser`.

---
Ömer Sinan Ağacan
http://osa1.net


2014-03-09 10:47 GMT+02:00 Erik de Castro Lopo <[hidden email]>:

> Ömer Sinan Ağacan wrote:
>
>> Does anyone have any example alex+happy programs that uses location
>> location annotated tokens and location information of token while
>> reporting parse errors?
>>
>> I already wrote lexer and my function type is something like:
>>
>> > runLexer :: String -> [(Token, AlexPosn)]
>>
>> where AlexPosn type contains location information. But I have no idea
>> how can I use that in happy to parse and use location information in
>> parse error messages. There are lots of examples in interwebs but as
>> far as I can see none of them works on location-annotated tokens.
>
> Also look at the DDC codebase:
>
>    https://github.com/DDCSF/ddc
>
> Erik
> --
> ----------------------------------------------------------------------
> Erik de Castro Lopo
> http://www.mega-nerd.com/
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Mateusz Kowalczyk
In reply to this post by Ömer Sinan Ağacan
On 09/03/14 11:15, Ömer Sinan Ağacan wrote:
> I think GHC uses alex+happy in very different way than explained in
> the docs.. For example it doesn't use "posn", "monadUserState" etc.
> wrappers as explained in alex documentations. I also can't figure out
> how does it keep track of token locations.. I guess It's just too
> complicated for me to understand without spending several hours.
>
> ---
> Ömer Sinan Ağacan
> http://osa1.net

It doesn't use the wrappers provided, it instead uses its own data
structures to do what the wrappers let you do and more. You can look for
the data types defined in those modules to try and shed some light on
how it's all done. I did mention it's not the most lightweight example ;)

--
Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
Ha! I finally managed to use any kinds of tokens in Happy parsers ...
Will post the code soon.

---
Ömer Sinan Ağacan
http://osa1.net


2014-03-09 13:44 GMT+02:00 Mateusz Kowalczyk <[hidden email]>:

> On 09/03/14 11:15, Ömer Sinan Ağacan wrote:
>> I think GHC uses alex+happy in very different way than explained in
>> the docs.. For example it doesn't use "posn", "monadUserState" etc.
>> wrappers as explained in alex documentations. I also can't figure out
>> how does it keep track of token locations.. I guess It's just too
>> complicated for me to understand without spending several hours.
>>
>> ---
>> Ömer Sinan Ağacan
>> http://osa1.net
>
> It doesn't use the wrappers provided, it instead uses its own data
> structures to do what the wrappers let you do and more. You can look for
> the data types defined in those modules to try and shed some light on
> how it's all done. I did mention it's not the most lightweight example ;)
>
> --
> Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
Okay guys, I finally did it and it turned out that Happy is actually
very easy to use .. only problem is that IMO documentation is not
great(or maybe I didn't pay enough attention reading it .. reading
docs is boring). Anyway, here's a short explanation(link for code is
also below):

Let's say you already wrote the lexer that returns `[TokPos]` where

  data Token = ... your tokens ...
  type TokPos = (Token, AlexPosn)

(AlexPosn contains location information)

and after telling Happy your token types, all you need to do is to
declare how tokens used in left-hand side of productions are matched
against tokens. Let's say I have this tokens:

  data Token = A String | B Int | C

I'll need to add this token declarations in Happy file:

  %token
    a   { (A $$, _) }
    b   { (B $$, _) }
    c   { (C, _) }

notice how I'm using ordinary pattern syntax. I didn't know this part
before. After that I can use a, b and c in left-hand side parts of
productions, like:

  Exp : a b { SomeConstructor $1 $2 }
      | c   { SomeOtherConstructor }

and that's it. I hope that helps somebody.

Working code:
* Lexer(Alex): https://github.com/osa1/minCaml.hs/blob/master/src/Lexer.x
* Parser(Happy): https://github.com/osa1/minCaml.hs/blob/master/src/Parser.y

---
Ömer Sinan Ağacan
http://osa1.net


2014-03-09 18:45 GMT+02:00 Ömer Sinan Ağacan <[hidden email]>:

> Ha! I finally managed to use any kinds of tokens in Happy parsers ...
> Will post the code soon.
>
> ---
> Ömer Sinan Ağacan
> http://osa1.net
>
>
> 2014-03-09 13:44 GMT+02:00 Mateusz Kowalczyk <[hidden email]>:
>> On 09/03/14 11:15, Ömer Sinan Ağacan wrote:
>>> I think GHC uses alex+happy in very different way than explained in
>>> the docs.. For example it doesn't use "posn", "monadUserState" etc.
>>> wrappers as explained in alex documentations. I also can't figure out
>>> how does it keep track of token locations.. I guess It's just too
>>> complicated for me to understand without spending several hours.
>>>
>>> ---
>>> Ömer Sinan Ağacan
>>> http://osa1.net
>>
>> It doesn't use the wrappers provided, it instead uses its own data
>> structures to do what the wrappers let you do and more. You can look for
>> the data types defined in those modules to try and shed some light on
>> how it's all done. I did mention it's not the most lightweight example ;)
>>
>> --
>> Mateusz K.
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Stephen Tetley-2
Hi Ömer

Are you rewriting Eijiro Sumii's MinCaml in Haskell?

I did the same a couple of years ago, though I used Parsec for parsing
and only implemented the higher transformations (beta reduction,
inlining, constant folding, useless variable elimination). As I have
no access to a Sparc machine, I didn't go as far as closure conversion
and code generation.

The code was only for my personal learning as MinCaml was the nicest /
smallest real compiler I could find. Code is in my Google repository -
it might be useful for reference, obviously for a learning exercise
you'd want to do-it-yourself:

https://code.google.com/p/copperbox/source/browse/  >> trunk >>
compiler >> HMinCaml

Best wishes

Stephen

On 9 March 2014 19:18, Ömer Sinan Ağacan <[hidden email]> wrote:

[Snip]

>
> Working code:
> * Lexer(Alex): https://github.com/osa1/minCaml.hs/blob/master/src/Lexer.x
> * Parser(Happy): https://github.com/osa1/minCaml.hs/blob/master/src/Parser.y
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
Hi Stephen,

> Are you rewriting Eijiro Sumii's MinCaml in Haskell?

Yes!

> I did the same a couple of years ago, though I used Parsec for parsing
> and only implemented the higher transformations (beta reduction,
> inlining, constant folding, useless variable elimination). As I have
> no access to a Sparc machine, I didn't go as far as closure conversion
> and code generation.

At first I also used Parsec, but I wasn't happy with the parser. The
original grammar has 4 left-recursive productions and I had to
manually eliminate them(I guess you also did that). But after that the
grammar was still not LL(1) so I had to place some `try`s very
carefully. It worked fine, but resulting parser was parsing nested
lets in exponential time. For example, this program took about a
minute to parse:
https://github.com/esumii/min-caml/blob/master/test/spill.ml . So I
rewrote it in Happy and now it works great. (I still need to work on
error messages though. Also, I did not resolve all ambiguities so
Happy reports lots of shift-reduce conficts) I guess that problem with
Parsec parser could also be fixed but I just prefer implementation to
be similar to original definitions. (also, I always wanted to learn
Happy since GHC is using it and I want to start contributing to GHC in
the future)

I love starting with some tutorials and then improving final product
by adding more features etc. I started learning Haskell 3 years ago
with "write yourself a Scheme" tutorial and then I added
continuations/call-cc, exceptions and several other features. It
helped me a lot and it was very fun way to learn. I'm currently
planning to do something similar, I want to implement more features
once I have compiler for the original language working.

I also don't have a Sparc machine, so I'm planning to compile to
x86-64. I only know basics of x86-64, so I want to first compile to C.
After I have that working correctly, by looking to generated ASM from
the generated C sources, I want to implement x86-64 code generator. So
I also aim to learn x86-64 assembly.

At some point I also want to implement a garbage collector.. AFAIK no
heap allocation is done in the original language, so I may need some
more data types.. Maybe sum-types or records like in Haskell.

> The code was only for my personal learning as MinCaml was the nicest /
> smallest real compiler I could find. Code is in my Google repository -
> it might be useful for reference, obviously for a learning exercise
> you'd want to do-it-yourself:
>
> https://code.google.com/p/copperbox/source/browse/  >> trunk >>
> compiler >> HMinCaml

Thanks! I may give it a look if I get stuck at some point.

---
Ömer Sinan Ağacan
http://osa1.net
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Stephen Tetley-2
Hi Ömer

Sounds good, good luck with the project.

One thing I struggled with was re-implementing the type checker as the
Caml implementation uses ref cells. Fortunately there are a lot of
tutorials on the web covering type checking with a good few using
Haskell and greater or lesser degree of assignment.


On 9 March 2014 22:07, Ömer Sinan Ağacan <[hidden email]> wrote:
> Hi Stephen,
>
>> Are you rewriting Eijiro Sumii's MinCaml in Haskell?
>
> Yes!
>
[Snip]
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
Hi Stephen,

> Sounds good, good luck with the project.

Thanks!

> One thing I struggled with was re-implementing the type checker as the
> Caml implementation uses ref cells. Fortunately there are a lot of
> tutorials on the web covering type checking with a good few using
> Haskell and greater or lesser degree of assignment.

I think ref cells used in OCaml code can be emulated in Haskell using
a map from ints to types. This more-or-less corresponds to managing
your own heap, where pointers point to types. I have something like
this in mind:

  type Unifications = M.Map TyVar Ty -- TyVar is basically an int

  data UnificationError
      = OccursCheck Ty Ty -- circular definition
      | UnificationError Ty Ty -- can't unify types
      | StrErr String -- required for Error instance
      deriving (Show)

  instance Error UnificationError where
      strMsg = StrErr

  -- | Unification monad that keeps track of unifications.
  newtype Unify a = Unify { unwrapUnify :: StateT Unifications (ErrorT
UnificationError Identity) a }
      deriving (Functor, Applicative, Monad, MonadState Unifications,
MonadError UnificationError)

  type TyEnv = M.Map Id Ty -- map from identifiers to types

  -- | Follow chains of type variables in the heap and remove type variables
  --   by connecting type variables in the type to final types in the chain.
  --   e.g. 1 |-> TyVar 2
  --        2 |-> TyVar 3
  --        3 |-> TyBool
  --   after `prune (TyVar 1)`, heap should be like:
  --        1 |-> TyBool
  --        2 |-> TyVar 3
  --        3 |-> TyBool
  prune :: Ty -> Unify Ty
  prune = ...

  -- | Unify two types.
  unify :: TyEnv -> Ty -> Ty -> Unify ()
  unify = ...

  -- | Infer type of an expression.
  typeCheck :: TyEnv -> Exp -> Unify Ty
  typeCheck = ...

I didn't implement it yet but it seems to me that this shouuld work.

Another approach might be using STRefs.

---
Ömer Sinan Ağacan
http://osa1.net
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Ömer Sinan Ağacan
Hi Stephen,

I just finished implementing the type checker. I used the first method
I mentioned in my previous mail, and it worked great. It may still
contain some bugs but it works fine on current test suite. (test
programs in original min-caml implementation).

If you're still interested, source code is here
https://github.com/osa1/minCaml.hs/blob/master/src/MinCaml/Typing.hs

---
Ömer Sinan Ağacan
http://osa1.net
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe