Quantcast

darcs patch dependencies in dot format

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

darcs patch dependencies in dot format

Soenke Hahn
Hi all!

Yesterday I wrote a little tool to output the dependencies of darcs
patches in dot format. The hardest part was to wrap my head around the
darcs API and find a way to let it compute the patch dependencies. I
don't know, if I got it right, but it looks correct at first glance.

Here is the code:

https://patch-tag.com/r/shahn/darcs2dot

To use it just execute the program in a darcs repo and it will output a
dot graph to stdout. The graph does not contain all dependencies, but
the transitive reduction. The patch names are truncated at 15 characters.

And here is an example graph:

http://open-projects.net/~shahn/patchDeps.pdf

These are the patch dependencies of one of my darcs repos
(https://patch-tag.com/r/shahn/hate). Some observations I made:

* There are two completely separate subgraphs. One subgraph seems to be
for the testsuite, the other for the actual code. This means, the two
don't depend on each other and could easily be put in two distinct repos.
* There is a "re-implementation" patch with a lot of incoming and
outgoing edges. (Which makes sense.)
* There are some sequences of patches (e.g. these four "allow ..."
patches in the upper left corner) that seem to contain related patches.
* darcs's patch system is awesome!

Any comments or suggestions?

Cheers,
Sönke

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Felipe Almeida Lessa

Truly amazing! I wonder it would fare with larger repositories. =)

Cheers,

--
Felipe – enviado do meu Galaxy Tab.

Em 12/05/2012 09:52, "Sönke Hahn" <[hidden email]> escreveu:
Hi all!

Yesterday I wrote a little tool to output the dependencies of darcs
patches in dot format. The hardest part was to wrap my head around the
darcs API and find a way to let it compute the patch dependencies. I
don't know, if I got it right, but it looks correct at first glance.

Here is the code:

https://patch-tag.com/r/shahn/darcs2dot

To use it just execute the program in a darcs repo and it will output a
dot graph to stdout. The graph does not contain all dependencies, but
the transitive reduction. The patch names are truncated at 15 characters.

And here is an example graph:

http://open-projects.net/~shahn/patchDeps.pdf

These are the patch dependencies of one of my darcs repos
(https://patch-tag.com/r/shahn/hate). Some observations I made:

* There are two completely separate subgraphs. One subgraph seems to be
for the testsuite, the other for the actual code. This means, the two
don't depend on each other and could easily be put in two distinct repos.
* There is a "re-implementation" patch with a lot of incoming and
outgoing edges. (Which makes sense.)
* There are some sequences of patches (e.g. these four "allow ..."
patches in the upper left corner) that seem to contain related patches.
* darcs's patch system is awesome!

Any comments or suggestions?

Cheers,
Sönke

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Soenke Hahn
On 05/13/2012 03:13 AM, Felipe Almeida Lessa wrote:
> Truly amazing! I wonder it would fare with larger repositories. =)

It does not scale well. I have not looked into optimization at all. For
example the algorithm to compute the transitive reduction is very naive
and brute force.

Somehow related questions are: What am I going to do with a dot-graph,
that has more than 500 vertices? Is there an intelligent way to reduce
the graph?


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Francesco Mazzoli
On 13/05/12 14:55, Sönke Hahn wrote:
> Somehow related questions are: What am I going to do with a dot-graph,
> that has more than 500 vertices? Is there an intelligent way to reduce
> the graph?

Setting the `concentrate' parameter to true helps, but dot does not work
well with massive graphs.

Francesco.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Francesco Mazzoli
On 13/05/12 15:13, Francesco Mazzoli wrote:
> On 13/05/12 14:55, Sönke Hahn wrote:
>> Somehow related questions are: What am I going to do with a dot-graph,
>> that has more than 500 vertices? Is there an intelligent way to reduce
>> the graph?
>
> Setting the `concentrate' parameter to true helps, but dot does not work
> well with massive graphs.

Actually, "neato" doesn't work well with massive graph - and neither do
the other algorithms provided by GraphViz. dot is just the file format.

I found Gephi (https://gephi.org/) quite good when I had to visualize
big graphs, and it supports dot files so you can try it out easily.

Francesco.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Ketil Malde-5
In reply to this post by Soenke Hahn
Sönke Hahn <[hidden email]> writes:

> On 05/13/2012 03:13 AM, Felipe Almeida Lessa wrote:
>> Truly amazing!

Yes, nice work!

>> I wonder it would fare with larger repositories. =)

> It does not scale well. [...]
> Somehow related questions are: What am I going to do with a dot-graph,
> that has more than 500 vertices? Is there an intelligent way to reduce
> the graph?

Lacking a solution for the problem of drawing large graphs, making the
graph smaller might be the second choice. :-) One option might be to
only track dependencies back to a specified tag? Or between tags?

-k
--
If I haven't seen further, it is by standing in the footprints of giants

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Simon Michael
In reply to this post by Soenke Hahn
On 5/12/12 5:52 AM, Sönke Hahn wrote:

> Hi all!
>
> Yesterday I wrote a little tool to output the dependencies of darcs
> patches in dot format. The hardest part was to wrap my head around the
> darcs API and find a way to let it compute the patch dependencies. I
> don't know, if I got it right, but it looks correct at first glance.
>
> Here is the code:
>
> https://patch-tag.com/r/shahn/darcs2dot
>
> To use it just execute the program in a darcs repo and it will output a
> dot graph to stdout. The graph does not contain all dependencies, but
> the transitive reduction. The patch names are truncated at 15 characters.
>
> And here is an example graph:
>
> http://open-projects.net/~shahn/patchDeps.pdf
>
> These are the patch dependencies of one of my darcs repos
> (https://patch-tag.com/r/shahn/hate). Some observations I made:
>
> * There are two completely separate subgraphs. One subgraph seems to be
> for the testsuite, the other for the actual code. This means, the two
> don't depend on each other and could easily be put in two distinct repos.
> * There is a "re-implementation" patch with a lot of incoming and
> outgoing edges. (Which makes sense.)
> * There are some sequences of patches (e.g. these four "allow ..."
> patches in the upper left corner) that seem to contain related patches.
> * darcs's patch system is awesome!
>
> Any comments or suggestions?
>
> Cheers,
> Sönke

That's nifty, thanks for sharing it. Cc'ing darcs-user.

I tried it in a few repos, like so:

$ darcs2dot > patchdeps.dot && dot patchdeps.dot -Tpdf -o patchdeps.pdf

In a 200-patch repo it ran quickly, giving: http://joyful.com/darcsden/simon/darcsum/raw/patchdeps.pdf

In a 2000-patch repo it took 22 hours: http://joyful.com/darcsden/simon/hledger/raw/patchdeps.pdf

In the 10000-patch Darcs repo I killed it after a few hours, but here are the early dependencies (up to tag 1.0.0rc2):
http://joyful.com/darcsden/simon/darcs-sm/raw/patchdeps.pdf

It should escape double-quotes in patch names, I did that manually.

I wonder how to simplify the graph further. Perhaps just the dependencies of tags would be interesting in some repos ?

Best,
-Simon


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Joachim Breitner-2
Hi,

Am Montag, den 14.05.2012, 07:21 -0700 schrieb Simon Michael:
> I wonder how to simplify the graph further. Perhaps just the dependencies of tags would be interesting in some repos ?

I think you’d want to collapse [a]→[b] to [a,b], if no other edges leave
a or reach b. This way you only get arrows at branching and merging
points.

Greetings,
Joachim

--
Joachim "nomeata" Breitner
  [hidden email]  |  [hidden email]  |  GPG: 0x4743206C
  xmpp: [hidden email] | http://www.joachim-breitner.de/


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (205 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

wren ng thornton
In reply to this post by Soenke Hahn
On 5/12/12 8:52 AM, Sönke Hahn wrote:
> Any comments or suggestions?

Cabalize it and release it on Hackage. But especially the cabalization
part :)

You should probably farm out the toDot rendering to one of the libraries
that focuses on that[1], since they'll have focused on the efficiency
issues--- or if they haven't, then you can contribute improvements
there, helping everyone win. In particular, you're using Strings which
is a notorious performance sink. Using Text or ByteStrings would be far
better.

Also, have you compared your transitive reduction to just outputting the
whole graph and then using `tred`? The latter approach has the distinct
downside of needing to serialize the whole graph; but it could still be
a win unless you intend to implement similar algorithms yourself. The
smart algorithms do *much* better than brute force.

And of course it'd be nice to be able to pass arguments to the program
in order to filter and otherwise manipulate the resulting graph. A lot
of that can be done by special programs which only know about the Dot
language (e.g., tred), so you should only focus on things which aren't
captured by the Dot language or are otherwise only knowable by querying
Darcs.


[1] Like graphviz or language-dot:

     http://hackage.haskell.org/package/graphviz
     http://hackage.haskell.org/package/language-dot

Though it doesn't look like those are used by the various other foo2dot
programs on Hackage:

     http://hackage.haskell.org/package/hs2dot
     http://hackage.haskell.org/package/prof2dot
     http://hackage.haskell.org/package/scons2dot
     http://hackage.haskell.org/package/vacuum-cairo
     http://hackage.haskell.org/package/vacuum-opengl

So perhaps there's some issue with those libraries...

--
Live well,
~wren

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Ivan Lazar Miljenovic
On 16 May 2012 19:43, wren ng thornton <[hidden email]> wrote:

> On 5/12/12 8:52 AM, Sönke Hahn wrote:
>>
>> Any comments or suggestions?
>
>
> Cabalize it and release it on Hackage. But especially the cabalization part
> :)
>
> You should probably farm out the toDot rendering to one of the libraries
> that focuses on that[1], since they'll have focused on the efficiency
> issues--- or if they haven't, then you can contribute improvements there,
> helping everyone win. In particular, you're using Strings which is a
> notorious performance sink. Using Text or ByteStrings would be far better.
>
> Also, have you compared your transitive reduction to just outputting the
> whole graph and then using `tred`? The latter approach has the distinct
> downside of needing to serialize the whole graph; but it could still be a
> win unless you intend to implement similar algorithms yourself. The smart
> algorithms do *much* better than brute force.

I would like to point out that graphviz has a native implementation of
tred (well, analogous rather than exact re-implementation).

I also haven't joined this discussion before now, but some of the
reduction algorithms in Graphalyze [1] (as used in SourceGraph [2])
might be applicable, though I admit they're not the best possible
implementations

[1]: http://hackage.haskell.org/package/Graphalyze
[2]: http://hackage.haskell.org/package/SourceGraph

>
> And of course it'd be nice to be able to pass arguments to the program in
> order to filter and otherwise manipulate the resulting graph. A lot of that
> can be done by special programs which only know about the Dot language
> (e.g., tred), so you should only focus on things which aren't captured by
> the Dot language or are otherwise only knowable by querying Darcs.
>
>
> [1] Like graphviz or language-dot:
>
>    http://hackage.haskell.org/package/graphviz
>    http://hackage.haskell.org/package/language-dot
>
> Though it doesn't look like those are used by the various other foo2dot
> programs on Hackage:
>
>    http://hackage.haskell.org/package/hs2dot
>    http://hackage.haskell.org/package/prof2dot
>    http://hackage.haskell.org/package/scons2dot
>    http://hackage.haskell.org/package/vacuum-cairo
>    http://hackage.haskell.org/package/vacuum-opengl
>
> So perhaps there's some issue with those libraries...
>
> --
> Live well,
> ~wren
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe



--
Ivan Lazar Miljenovic
[hidden email]
http://IvanMiljenovic.wordpress.com

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Stephen Tetley-2
> On 16 May 2012 19:43, wren ng thornton <[hidden email]> wrote:

>> You should probably farm out the toDot rendering to one of the libraries
>> that focuses on that[1], since they'll have focused on the efficiency
>> issues--- or if they haven't, then you can contribute improvements there,
>> helping everyone win. In particular, you're using Strings which is a
>> notorious performance sink. Using Text or ByteStrings would be far better.
>>

I'm not sure swapping to Text or ByteStrings make be much great shakes
for this. If you are generating huge files, where it would count -
then the files are going to be a real problem for Graphviz to render
(unless Graphviz has seen some optimization recently...).

That said, I would recommend Sönke uses a pretty print library rather
than Printf as using the former makes for much more idiomatic for
Haskell and generally performs well enough for "generational"
activities even if it uses Strings internally.

Best wishes

Stephen

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Ivan Lazar Miljenovic
On 17 May 2012 03:31, Stephen Tetley <[hidden email]> wrote:

>> On 16 May 2012 19:43, wren ng thornton <[hidden email]> wrote:
>
>>> You should probably farm out the toDot rendering to one of the libraries
>>> that focuses on that[1], since they'll have focused on the efficiency
>>> issues--- or if they haven't, then you can contribute improvements there,
>>> helping everyone win. In particular, you're using Strings which is a
>>> notorious performance sink. Using Text or ByteStrings would be far better.
>>>
>
> I'm not sure swapping to Text or ByteStrings make be much great shakes
> for this. If you are generating huge files, where it would count -
> then the files are going to be a real problem for Graphviz to render
> (unless Graphviz has seen some optimization recently...).

I found with graphviz that switching to Text gave two advantages:

1) Easier to require UTF-8 usage

2) Printing and parsing was faster

In part, the speed-up came from switching to wl-pprint-text rather
than pretty (the wl-pprint method of pretty-printing seems to have
some efficiency improvements in how concatenation is done over pretty)
and the Text backend for polyparse lets you use manySatisfy rather
than (many . satisfy) which is more efficient.

But even in my initial pseudo-port (going via String a lot, etc.)
there was a slight speed-up.

So I think there is still a valid reason to consider using Text (or
possibly ByteString, but I think that has potential issues).

>
> That said, I would recommend Sönke uses a pretty print library rather
> than Printf as using the former makes for much more idiomatic for
> Haskell and generally performs well enough for "generational"
> activities even if it uses Strings internally.
>
> Best wishes
>
> Stephen
>
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe



--
Ivan Lazar Miljenovic
[hidden email]
http://IvanMiljenovic.wordpress.com

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Soenke Hahn
In reply to this post by wren ng thornton
On 05/16/2012 11:43 AM, wren ng thornton wrote:
> On 5/12/12 8:52 AM, Sönke Hahn wrote:
>> Any comments or suggestions?
>
> Cabalize it and release it on Hackage. But especially the cabalization
> part :)

Both done:

http://hackage.haskell.org/package/darcs2dot



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Soenke Hahn
In reply to this post by Francesco Mazzoli
On 05/13/2012 04:16 PM, Francesco Mazzoli wrote:
> I found Gephi (https://gephi.org/) quite good when I had to visualize
> big graphs, and it supports dot files so you can try it out easily.

gephi looks very interesting, thanks.


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Soenke Hahn
In reply to this post by Simon Michael
On 05/14/2012 04:21 PM, Simon Michael wrote:
> In a 2000-patch repo it took 22 hours:
> http://joyful.com/darcsden/simon/hledger/raw/patchdeps.pdf

:)

> It should escape double-quotes in patch names, I did that manually.

That should be fixed (in the repo).


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: darcs patch dependencies in dot format

Soenke Hahn
In reply to this post by wren ng thornton
On 05/16/2012 11:43 AM, wren ng thornton wrote:
> Also, have you compared your transitive reduction to just outputting the
> whole graph and then using `tred`? The latter approach has the distinct
> downside of needing to serialize the whole graph; but it could still be
> a win unless you intend to implement similar algorithms yourself. The
> smart algorithms do *much* better than brute force.

I've done some profiling and found that the executable is spending about
half of its time executing my brute force graph algorithm. So doing
something smarter here (like using "tred") seems like a good idea.

The bad news is that without running my inefficient algorithm the
executable still doesn't scale well. Maybe there is a better way to let
the darcs library compute the patch dependencies.

> And of course it'd be nice to be able to pass arguments to the program
> in order to filter and otherwise manipulate the resulting graph. A lot
> of that can be done by special programs which only know about the Dot
> language (e.g., tred), so you should only focus on things which aren't
> captured by the Dot language or are otherwise only knowable by querying
> Darcs.

Sounds reasonable. Command line options would be nice.


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Loading...