Libraries to compare trees?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Libraries to compare trees?

dokondr
Please advise on Haskell libraries to compare trees in textual representation.
I need to compare both structure and node contents of two trees, find similar sub-trees, and need some metric to measure distance between two trees.
Also need advice on simple parser to convert textual tree representation into a data type convenient for tree manipulation (comparison, matching, etc.) What data type to use for trees with arbitrary structure?

Example trees:

*** Tree 1:
(ROOT
  (S
    (NP (DT The) (NN voice) (NN quality))
    (VP (VBD was)
      (ADJP (JJ clear) (RB too)))
    (. .)))


*** Tree 2: 
(ROOT
  (S
    (SBAR (IN Although)
      (S
        (NP (DT the) (NN battery) (NN life))
        (VP (VBD was) (RB not)
          (ADJP (JJ long)))))
    (, ,)
    (NP (DT that))
    (VP (VBZ is)
      (VP (VBN ok)
        (PP (IN for)
          (NP (PRP me)))))
    (. .)))

Thanks!
Dmitri


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Libraries to compare trees?

dokondr
My mistake: need advice on libraries and data types not for trees but for directed graphs.

On Thu, Oct 27, 2011 at 4:49 PM, dokondr <[hidden email]> wrote:
Please advise on Haskell libraries to compare trees in textual representation.
I need to compare both structure and node contents of two trees, find similar sub-trees, and need some metric to measure distance between two trees.
Also need advice on simple parser to convert textual tree representation into a data type convenient for tree manipulation (comparison, matching, etc.) What data type to use for trees with arbitrary structure?

Example trees:

*** Tree 1:
(ROOT
  (S
    (NP (DT The) (NN voice) (NN quality))
    (VP (VBD was)
      (ADJP (JJ clear) (RB too)))
    (. .)))


*** Tree 2: 
(ROOT
  (S
    (SBAR (IN Although)
      (S
        (NP (DT the) (NN battery) (NN life))
        (VP (VBD was) (RB not)
          (ADJP (JJ long)))))
    (, ,)
    (NP (DT that))
    (VP (VBZ is)
      (VP (VBN ok)
        (PP (IN for)
          (NP (PRP me)))))
    (. .)))

Thanks!
Dmitri






_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Libraries to compare trees?

Ivan Lazar Miljenovic
Well, for arbitrary directed graphs, FGL is probably your best bet, or
roll-your-own.

_But_ you'll need to write the parser yourself using something like
trifecta, uu-parsinglib, polyparse, parsec, etc.

It would help if you described the structure of these graphs and what
kind of support you'd want in a data structure.

On 28 October 2011 00:27, dokondr <[hidden email]> wrote:

> My mistake: need advice on libraries and data types not for trees but for
> directed graphs.
>
> On Thu, Oct 27, 2011 at 4:49 PM, dokondr <[hidden email]> wrote:
>>
>> Please advise on Haskell libraries to compare trees in textual
>> representation.
>> I need to compare both structure and node contents of two trees, find
>> similar sub-trees, and need some metric to measure distance between two
>> trees.
>> Also need advice on simple parser to convert textual tree representation
>> into a data type convenient for tree manipulation (comparison, matching,
>> etc.) What data type to use for trees with arbitrary structure?
>>
>> Example trees:
>>
>> *** Tree 1:
>> (ROOT
>>   (S
>>     (NP (DT The) (NN voice) (NN quality))
>>     (VP (VBD was)
>>       (ADJP (JJ clear) (RB too)))
>>     (. .)))
>>
>>
>> *** Tree 2:
>> (ROOT
>>   (S
>>     (SBAR (IN Although)
>>       (S
>>         (NP (DT the) (NN battery) (NN life))
>>         (VP (VBD was) (RB not)
>>           (ADJP (JJ long)))))
>>     (, ,)
>>     (NP (DT that))
>>     (VP (VBZ is)
>>       (VP (VBN ok)
>>         (PP (IN for)
>>           (NP (PRP me)))))
>>     (. .)))
>>
>> Thanks!
>> Dmitri
>>
>
>
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>



--
Ivan Lazar Miljenovic
[hidden email]
IvanMiljenovic.wordpress.com

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Libraries to compare trees?

dokondr

On Fri, Oct 28, 2011 at 2:31 AM, Ivan Lazar Miljenovic <[hidden email]> wrote:
Well, for arbitrary directed graphs, FGL is probably your best bet, or
roll-your-own.

_But_ you'll need to write the parser yourself using something like
trifecta, uu-parsinglib, polyparse, parsec, etc.

It would help if you described the structure of these graphs and what
kind of support you'd want in a data structure.


Thanks for the feedback!
As I wrote in first message I need to compare both structure and node contents of two graphs, find similar sub-graphs, and need some metric to measure distance between two graphs. These graphs are produced by POS (part of speech) sentence tagging. POS tagging is done by Standford statistical parser: http://nlp.stanford.edu/software/lex-parser.shtml

For example in the graph G1:
(ROOT
  (S
    (NP (DT The) (NN voice) (NN quality))
    (VP (VBD was)
      (ADJP (JJ clear) (RB too)))
    (. .)))

G1 has an NP node (lets call this sub-graph SG1)  which has three child leaf nodes: (DT The) (NN voice) (NN quality), where
DT, NN - nodes names, and
"The", "voice", "quality" - corresponding leaf values of these nodes.

I need to compare this graph with graph G2:
(ROOT
  (S
    (SBAR (IN Although)
      (S
        (NP (DT the) (NN battery) (NN life))
        (VP (VBD was) (RB not)
          (ADJP (JJ long)))))
    (, ,)
    (NP (DT that))
    (VP (VBZ is)
      (VP (VBN ok)
        (PP (IN for)
          (NP (PRP me)))))
    (. .)))

G2 also has the NP node (let's call it sub-graph SG2)  which also has three child leaf nodes: (DT the) (NN battery) (NN life))
I need to find that G1 and G2 has sub-graphs SG1 and SG2 with the similar structure, but with different values of the leaf nodes.
I also need to devise some general metric that will allow me to estimate distance between any two graphs. This distance should account both for structural and leaf-node values similarity.

It would be easier to measure distance between vectors then graphs. So I am thinking how to convert directed graph (that results from POS tagging) into vector. Any ideas, links here?

Thanks!

On 28 October 2011 00:27, dokondr <[hidden email]> wrote:
> My mistake: need advice on libraries and data types not for trees but for
> directed graphs.
>
> On Thu, Oct 27, 2011 at 4:49 PM, dokondr <[hidden email]> wrote:
>>
>> Please advise on Haskell libraries to compare trees in textual
>> representation.
>> I need to compare both structure and node contents of two trees, find
>> similar sub-trees, and need some metric to measure distance between two
>> trees.
>> Also need advice on simple parser to convert textual tree representation
>> into a data type convenient for tree manipulation (comparison, matching,
>> etc.) What data type to use for trees with arbitrary structure?
>>
>> Example trees:
>>
>> *** Tree 1:
>> (ROOT
>>   (S
>>     (NP (DT The) (NN voice) (NN quality))
>>     (VP (VBD was)
>>       (ADJP (JJ clear) (RB too)))
>>     (. .)))
>>
>>
>> *** Tree 2:
>> (ROOT
>>   (S
>>     (SBAR (IN Although)
>>       (S
>>         (NP (DT the) (NN battery) (NN life))
>>         (VP (VBD was) (RB not)
>>           (ADJP (JJ long)))))
>>     (, ,)
>>     (NP (DT that))
>>     (VP (VBZ is)
>>       (VP (VBN ok)
>>         (PP (IN for)
>>           (NP (PRP me)))))
>>     (. .)))
>>
>> Thanks!
>> Dmitri
>>
>
>
>
>
>

> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>



--
Ivan Lazar Miljenovic
[hidden email]
IvanMiljenovic.wordpress.com



--
All the best,
Dmitri O. Kondratiev

"This is what keeps me going: discovery"
[hidden email]
http://sites.google.com/site/dokondr/welcome



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Libraries to compare trees?

Ozgur Akgun
In reply to this post by dokondr
Hi.

On 27 October 2011 13:49, dokondr <[hidden email]> wrote:
Please advise on Haskell libraries to compare trees in textual representation.
I need to compare both structure and node contents of two trees, find similar sub-trees, and need some metric to measure distance between two trees.

This might help: http://hackage.haskell.org/package/gdiff-1.0

Best,
Ozgur

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Libraries to compare trees?

Rustom Mody
On Fri, Oct 28, 2011 at 8:54 PM, Ozgur Akgun <[hidden email]> wrote:
Hi.


On 27 October 2011 13:49, dokondr <[hidden email]> wrote:
Please advise on Haskell libraries to compare trees in textual representation.
I need to compare both structure and node contents of two trees, find similar sub-trees, and need some metric to measure distance between two trees.

This might help: http://hackage.haskell.org/package/gdiff-1.0

Best,
Ozgur


This is interesting.  Just putting some thoughts here. Please comment.

I am a user of emacs org-mode http://orgmode.org/.
Basically org imposes a tree structure onto plain text and when that is appropriate its quite a nifty tool.  Recently there was some discussion on the org list that diffs of org files were less than useful because while org understands hierarchical structure, diff doesn't.

I wonder what would be involved in setting up a bi-directional pipe between emacs and haskell so that orgmode could use gdiff's findings?

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe