C++ Parser?

C++ Parser?

Christopher Brown
Hi,

I have stumbled across language-c on Hackage, and I was wondering whether anyone is aware of a full C++ parser written in Haskell?

Many thanks,
Chris.



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: C++ Parser?

Antoine Latter-2
On Tue, Jan 24, 2012 at 4:06 AM, Christopher Brown
<[hidden email]> wrote:
> Hi,
>
> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?
>

I'm not aware of one.

When it comes to parsing C++, I've always been a fan of this essay:
http://www.nobugs.org/developer/parsingcpp/

It's a hobbyist's tale of looking into parsing C++, followed by an
explanation of why he gave up. It's older, so perhaps the
state of the art has advanced since then.

Antoine


Re: C++ Parser?

Hans Aberg-2
In reply to this post by Christopher Brown
On 24 Jan 2012, at 11:06, Christopher Brown wrote:

> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?

There is a yaccable grammar:
  http://www.parashift.com/c++-faq-lite/compiler-dependencies.html#faq-38.11

You could run it through a parser generator that outputs Haskell code:
  http://www.haskell.org/haskellwiki/Applications_and_libraries/Compiler_tools

Hans




Re: C++ Parser?

Jason Dagit-3
In reply to this post by Christopher Brown
On Tue, Jan 24, 2012 at 2:06 AM, Christopher Brown
<[hidden email]> wrote:
> Hi,
>
> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?

I don't think one exists.  I've heard it's quite difficult to get
template parsing working in an efficient manner.

My understanding is that "real" C++ compilers use the Edison Design
Group's parser: http://www.edg.com/index.php?location=c_frontend

For example, the Intel C++ compiler uses the EDG front end:
http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler

I thought even Microsoft's compiler (which is surprisingly C++
compliant) uses it, but I can't find details on Google about that.

There is at least one open source project using it, ROSE, so it's not
unthinkable to use it from Haskell: http://rosecompiler.org/

ROSE has had working Haskell bindings in the past, but they have
bit-rotted somewhat.  With ROSE you get support for much more than parsing
C++.  You also get C and Fortran parsers as well as a fair bit of
static analysis.  The downside is that ROSE is a big pile of C++
itself and is hard to compile on some platforms.

If you made a BSD3-licensed, fully functional, efficient C++ parser,
that would be great.  If you made it so that it preserves comments and
the input well enough to do source-to-source transformations
(unparsing), that would be very useful.  I often wish I had ROSE
implemented in Haskell instead of C++.

Jason


Re: C++ Parser?

Christopher Brown
Hi Everyone,

Thanks for everyone's kind responses: very helpful so far!

I fully appreciate and understand how difficult writing a C++ parser is. However, I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool. Obviously I don't want to start writing one myself, hence I was asking if anyone knew about an existing implementation.

ROSE looks interesting; I'll check it out, thanks!

Chris.





Re: C++ Parser?

Jason Dagit-3
On Tue, Jan 24, 2012 at 6:54 AM, Christopher Brown
<[hidden email]> wrote:
> Hi Everyone,
>
> Thanks for everyone's kind responses: very helpful so far!
>
> I fully appreciate and understand how difficult writing a C++ parser is. However I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool. Obviously I don't want to start writing one myself, hence I was asking if anyone knew about an already existing implementation.
>
> Rose looks interesting, I'll check that out, thanks!

I did some more digging after sending my email.  I didn't learn about
GLR parsers when I was in school, but that seems to be what the cool
compilers use these days.  Then I discovered that Happy supports GLR;
that is happy!

Next I found that GLR supposedly makes C++ parsing much easier than
LALR.  In the words of Elkhound's author: "The reason I wrote Elkhound
is to be able to write a C++ parser. The parser is called Elsa, and is
included in the distribution below."  The Elsa documentation should
give you a flavor for what needs to be done when making sense of C++:
http://scottmcpeak.com/elkhound/sources/elsa/index.html

NB: I don't think it's been seriously worked on since 2005 so I assume
it doesn't match the latest C++ spec.

The grammar that Elsa parses is here.  One warning: it doesn't
reject all invalid programs (i.e., it errs on the side of accepting too
much): http://scottmcpeak.com/elkhound/sources/elsa/cc.gr
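
To give a flavor of why a parser that carries several candidate parses
forward suits C++, here is a minimal sketch.  The statement `a * b;` is a
pointer declaration if `a` names a type and a multiplication otherwise, so
the choice cannot be made from syntax alone.  The sketch uses ReadP from
base, which is not a GLR parser but likewise returns every successful
parse.  The names Stmt, parseAll and resolve are made up for illustration,
and the list of type names stands in for a real symbol table.

```haskell
import Data.Char (isAlpha, isAlphaNum)
import Text.ParserCombinators.ReadP

-- Two readings of the same token sequence.
data Stmt
  = PtrDecl String String   -- type name, variable name:    T * x;
  | MulExpr String String   -- left operand, right operand: a * b;
  deriving (Eq, Show)

-- Run a parser, then skip trailing whitespace.
token :: ReadP a -> ReadP a
token p = p <* skipSpaces

ident :: ReadP String
ident = token ((:) <$> satisfy isAlpha <*> munch isAlphaNum)

symbol :: Char -> ReadP Char
symbol = token . char

ptrDecl, mulExpr, stmt :: ReadP Stmt
ptrDecl = PtrDecl <$> ident <* symbol '*' <*> ident <* symbol ';'
mulExpr = MulExpr <$> ident <* symbol '*' <*> ident <* symbol ';'
-- (+++) is symmetric choice: both alternatives are explored.
stmt    = ptrDecl +++ mulExpr

-- All complete parses of the input (order not guaranteed).
parseAll :: String -> [Stmt]
parseAll s = [r | (r, "") <- readP_to_S (skipSpaces *> stmt <* eof) s]

-- Later pass (hypothetical): keep only the readings that agree with
-- the set of names known to be types.
resolve :: [String] -> [Stmt] -> [Stmt]
resolve typeNames = filter ok
  where
    ok (PtrDecl t _) = t `elem` typeNames
    ok (MulExpr l _) = l `notElem` typeNames
```

Here parseAll "a * b;" yields both readings, and resolve ["a"] keeps only
the declaration.  This mirrors the "accept too much, decide later" design
described above, under the stated simplifications.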

I think the path of least resistance is pure ROSE without the Haskell
support.  Having said that, I think the most fun direction would be
converting the Elsa grammar to Happy.  It's just that you'll have a
lot of work (read: testing, debugging, performance tuning, and then
adding vendor features) to do.  One side benefit is that you'll know
much more about the intricacies of C++ when you're done than if you
use someone else's parser.

Jason


Re: C++ Parser?

Christopher Brown
Hi Jason,

Thanks very much for your thoughtful response.

I am intrigued by the Happy route. As I have never really used Happy before: am I right in thinking I could take the .gr grammar, feed it into Happy to generate a parser (or a template for a parser), and then go from there?

Chris.




Re: C++ Parser?

Jason Dagit-3
On Tue, Jan 24, 2012 at 8:40 AM, Christopher Brown
<[hidden email]> wrote:
> Hi Jason,
>
> Thanks very much for your thoughtful response.
>
> I am intrigued about the Happy route: as I have never really used Happy before, am I right in thinking I could take the .gr grammar, feed it into Happy to generate a parser, or a template for a parser, and then go from there?

That's the basic idea, although the details will be harder than that.
Happy is a parser generator (like Bison, Yacc, and ANTLR).  Happy and
Elkhound (the generator Elsa's .gr grammar is written for) use very
different syntax for their grammar definitions.  You could explore
taking the Elkhound source and, instead of generating C++, generating
the input for Happy, if that makes sense.  A translation by hand would
probably be easiest.

I would highly recommend making a few toy parsers with Happy + Alex
(Alex is like lex or flex) to get a feel for it before trying to use
the grammar from Elsa.

A quick Google search pointed me at these examples:
http://darcs.haskell.org/happy/examples/
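
As a rough idea of the lexer half of such a toy, here is a hand-written
sketch of what Alex would otherwise generate.  By default a Happy parser
consumes a list of tokens whose type you declare with %tokentype, so a
plain function from String to a token list is enough to get started.  The
Token type and the lexer function are made up for illustration and are
not taken from the linked examples.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Token type a toy Happy grammar could name in its %tokentype directive.
data Token
  = TokIdent String   -- identifiers and keywords
  | TokInt Int        -- decimal integer literals
  | TokSym Char       -- any other single character, e.g. '=' or ';'
  deriving (Eq, Show)

-- Turn source text into tokens, dropping whitespace.
lexer :: String -> [Token]
lexer [] = []
lexer (c:cs)
  | isSpace c = lexer cs
  | isAlpha c = let (rest, more) = span isAlphaNum cs
                in TokIdent (c : rest) : lexer more
  | isDigit c = let (ds, more) = span isDigit (c:cs)
                in TokInt (read ds) : lexer more
  | otherwise = TokSym c : lexer cs
```

For example, lexer "x = 42;" gives
[TokIdent "x", TokSym '=', TokInt 42, TokSym ';'].  A real C++ lexer needs
far more (multi-character operators, comments, string literals, the
preprocessor), which is where Alex starts to pay off.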

Jason


Re: C++ Parser?

Nathan Howell-2
In reply to this post by Christopher Brown
On Tue, Jan 24, 2012 at 2:06 AM, Christopher Brown <[hidden email]> wrote:
> I have stumbled across language-c on hackage and I was wondering if anyone is aware if there exists a full C++ parser written in Haskell?


The clang API is in C++ and will do just about everything you'd ever want to do with C/ObjC/C++ source.

-n


Re: C++ Parser?

Stephen Tetley-2
In reply to this post by Christopher Brown
There is also DMS from Ira Baxter's company, Semantic Designs.
This is an industry-proven refactoring framework that handles C++ as
well as other languages.

I think the ANTLR C++ parser may have advanced since the article
Antoine Latter linked to, but personally I'd run a mile before trying to
do any source transformation of C++, even if someone were waving a very
large cheque at me.

On 24 January 2012 14:54, Christopher Brown <[hidden email]> wrote:
> Hi Everyone,
>
> Thanks for everyone's kind responses: very helpful so far!
>
> I fully appreciate and understand how difficult writing a C++ parser is. However I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool.


Re: C++ Parser?

David Laing
Hi all,

Just to add to the list - Qt Creator contains a pretty nice (and incremental) C++ parser.

Cheers,

Dave


Re: C++ Parser?

Yin Wang
In reply to this post by Christopher Brown
I have written a C++ parser in Scheme, with a Parsec-style parser
combinator library. It can parse a large portion of C++ and I use it
to do structural comparison between ASTs. I made some macros so that
the parser combinators look like the grammar itself.

Its code is at:

http://github.com/yinwang0/ydiff/blob/master/parse-cpp.ss

A demo of the parse tree based comparison tool is at:

http://www.cs.indiana.edu/~yw21/demos/d8-3404-d8-8424.html
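
For readers who have not seen the style, here is a minimal sketch of
Parsec-style combinators in Haskell, written so that each rule reads like
the grammar production it implements.  It is not a translation of
parse-cpp.ss.  The Parser type, the helper names and the tiny class
grammar are all made up for illustration.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isAlpha, isAlphaNum, isSpace)

-- A parser consumes a prefix of the input and returns a value plus the rest.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing        -> Nothing
    Just (f, rest) -> case pa rest of
      Nothing         -> Nothing
      Just (a, rest') -> Just (f a, rest')

instance Alternative Parser where
  empty = Parser (const Nothing)
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s          -- first alternative failed: try the second
    ok      -> ok

-- Primitive: one character satisfying a predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c:cs) | ok c -> Just (c, cs)
  _             -> Nothing

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* many (satisfy isSpace)

-- Match a fixed string (no check that an identifier does not follow).
keyword :: String -> Parser String
keyword = lexeme . traverse (satisfy . (==))

ident :: Parser String
ident = lexeme ((:) <$> satisfy isAlpha <*> many (satisfy isAlphaNum))

-- Grammar, written to read like BNF:
--   member    ::= ident ident ';'
--   classDecl ::= 'class' ident '{' member* '}' ';'
data Member    = Member String String    deriving (Eq, Show)
data ClassDecl = ClassDecl String [Member] deriving (Eq, Show)

member :: Parser Member
member = Member <$> ident <*> ident <* keyword ";"

classDecl :: Parser ClassDecl
classDecl = ClassDecl <$> (keyword "class" *> ident)
                      <*> (keyword "{" *> many member <* keyword "}")
                      <*  keyword ";"
```

Running runParser classDecl "class Point { int x; int y; };" returns the
ClassDecl for Point with its two members and an empty remainder.  Scheme
macros can push the resemblance to the grammar further; in Haskell the
applicative operators do most of that work.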


Here is what I can tell you about parsing C++:

- C++'s grammar is not that bad if you see the consistency in it.
Parsing a major portion of C++ is not hard. I made the parser in two
days. It can parse most of Google's V8 JavaScript compiler code. I
just need to fix some corner cases later.

- It is better to delay semantic checks to a later stage. Don't put
those into the parser. Parse a larger language first, and then walk
the parse tree to eliminate semantically wrong programs.

- Don't try translating from the formal grammar or parser generator
files for C++. They contain years of bugs and patches and you will
probably be confused looking at them. I wrote the parser just by
looking at some example C++ programs.
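
The second point above can be sketched as follows, in Haskell rather than
Scheme.  The syntax tree is deliberately permissive, so that a parser
producing it may accept `break` anywhere, and a separate pass walks the
tree afterwards and reports the uses that are not inside a loop.  The Stmt
type and checkBreaks are made up for illustration, with conditions and
expressions kept as plain strings for brevity.

```haskell
-- A deliberately permissive syntax tree: `break` may appear anywhere.
data Stmt
  = Break
  | ExprStmt String        -- expression text, left unparsed here
  | While String [Stmt]    -- condition text and loop body
  | Block [Stmt]
  deriving (Eq, Show)

-- Later pass: walk the tree and report every `break` outside a loop.
checkBreaks :: [Stmt] -> [String]
checkBreaks = go False
  where
    -- The flag records whether we are currently inside a loop body.
    go inLoop = concatMap (stmt inLoop)

    stmt inLoop Break
      | inLoop    = []
      | otherwise = ["break outside of a loop"]
    stmt _      (ExprStmt _)   = []
    stmt _      (While _ body) = go True body
    stmt inLoop (Block body)   = go inLoop body
```

For example, checkBreaks [While "c" [Break], Break] reports one error,
for the second `break` only.  Keeping this check out of the parser means
the grammar stays small and the error message can be specific.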



Cheers,
    Yin




Re: C++ Parser?

Jason Dagit-3
On Wed, Feb 1, 2012 at 12:42 PM, Yin Wang <[hidden email]> wrote:

> I have written a C++ parser in Scheme, with a Parsec-style parser
> combinator library. It can parse a large portion of C++ and I use it
> to do structural comparison between ASTs. I made some macros so that
> the parser combinators look like the grammar itself.
>
> Its code is at:
>
> http://github.com/yinwang0/ydiff/blob/master/parse-cpp.ss
>
> A demo of the parse tree based comparison tool is at:
>
> http://www.cs.indiana.edu/~yw21/demos/d8-3404-d8-8424.html
>
>
> The bit of information I can tell you about parsing C++:

Thank you for the interesting response and example code (that I
haven't had a chance to look at yet).  How much support do you have
for templates?

Jason


Re: C++ Parser?

Yin Wang
I haven't dealt explicitly with templates. I treat them as type
parameters (element $type-parameter). I don't check whether they have
been declared at all. As explained, these are semantic checks and
should be deferred until the type-checking stage ;-)


Cheers,
    Yin



