
ANNOUNCE: fast-tags-0.0.1

Evan Laforge
A while back I was complaining about the profusion of poorly
documented tags generators.  Well, there is still a profusion of
poorly documented tags generators... I was able to find 5 of them.

So, that said, here's my contribution to the problem: fast-tags,
haskell tag generator #6.

Why not use one of the other 5?

Two of them use haskell-src which means they can't parse my code.  Two
more use haskell-src-exts, which is slow and fragile, breaks on
partially edited source, and doesn't understand hsc.  Then there's the
venerable hasktags, but it's buggy and the source is a mess.  I fixed
a bug where it doesn't actually strip comments so it makes tags to
things inside comments, but then decided it would be easier to just
write my own.

fast-tags is fast because it has a parser that's just smart enough to
pick out the tags.  It can tagify my entire 300 module program in
about a second.  But it's also incremental, so it only needs to do
that the first time.  I have vim's BufWrite autocommand bound to
updating the tags every time a file is written, and it's fast enough
that I've never noticed the delay.  It understands hsc directly
(that's trivial, just ignore the # lines) so there's no need to run
hsc2hs before tagifying.  The result is tags which are automatically
up to date all the time, which is nice.
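
A minimal sketch of what an incremental update could look like, written in Python purely for illustration. The `(name, file, line)` tuple layout and the `update_tags` name are assumptions for this sketch, not fast-tags internals: when one file is saved, its old entries are dropped and the freshly generated ones are merged back in.

```python
def update_tags(old_tags, changed_file, new_tags):
    """Replace every tag that came from changed_file and keep the rest.

    Tags are (name, file, line) tuples. This layout is an assumption made
    for the sketch, not the representation fast-tags uses.
    """
    kept = [tag for tag in old_tags if tag[1] != changed_file]
    # A tags file is normally kept sorted by tag name.
    return sorted(kept + list(new_tags))


old = [("foo", "A.hs", 3), ("bar", "B.hs", 7), ("baz", "A.hs", 9)]
# A.hs was just saved: foo moved, baz was deleted, qux is new.
fresh = [("foo", "A.hs", 4), ("qux", "A.hs", 12)]
updated = update_tags(old, "A.hs", fresh)
```

Only the saved file is re-scanned, so the cost per save depends on that one file plus a pass over the existing tags.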

If people care about lhs and emacs tags then it wouldn't be hard to
support those too, and at that point I could replace hasktags and we'd
be back down to 5 again.  But I'm not even sure anyone uses hasktags,
since surely someone would have noticed that comment bug.

Anyway, it's been working great for me for a couple weeks so I
uploaded it to hackage: http://hackage.haskell.org/package/fast-tags

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: ANNOUNCE: fast-tags-0.0.1

Dag Odenhall
On 1 April 2012 00:23, Evan Laforge <[hidden email]> wrote:
So, that said, here's my contribution to the problem: fast-tags,
haskell tag generator #6.

I like that it doesn't give duplicate entries for type signatures and bindings. I'd like an option to recurse a directory, but I guess find+xargs will do. Even better: perhaps it could read a .cabal file and figure out source files from that. Maybe overkill, just a thought.

Thanks!


Re: ANNOUNCE: fast-tags-0.0.1

Evan Laforge
On Sat, Mar 31, 2012 at 5:11 PM, [hidden email]
<[hidden email]> wrote:
> On 1 April 2012 00:23, Evan Laforge <[hidden email]> wrote:
>>
>> So, that said, here's my contribution to the problem: fast-tags,
>> haskell tag generator #6.
>
> I like that it doesn't give duplicate entries for type signatures and
> bindings. I'd like an option to recurse a directory, but I guess find+xargs
> will do. Even better: perhaps it could read a .cabal file and figure out
> source files from that. Maybe overkill, just a thought.

I dunno, I use zsh and just do **/*.hs*.  I do have a dependency
chaser that looks at imports to figure out all the modules, but since
the code I'm editing is not always attached to the rest of the
program, especially if it's work in progress, fast-tags **/*.hs* is
still the simplest and best.  And in practice I hook it up to the
editor's write command so a complete initialization is only needed the
first time.
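
For readers without zsh, the `**/*.hs*` glob can be approximated as follows. This is a sketch in Python for illustration; `haskell_sources` is a hypothetical helper name, and like the glob it matches anything with `.hs` in its extension, so it also picks up `.hsc` files (and would pick up editor backups such as `Foo.hs~`).

```python
from pathlib import Path


def haskell_sources(root):
    """Return every file under root whose name matches *.hs*, sorted.

    Paths are returned relative to root. The pattern mirrors the zsh
    glob **/*.hs* from the message above.
    """
    root = Path(root)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*.hs*")
        if path.is_file()
    )
```

The resulting list could then be passed to the tag generator in one invocation, which is what `find ... | xargs` does as well.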


Re: ANNOUNCE: fast-tags-0.0.1

Roman Cheplyaka-2
In reply to this post by Evan Laforge
* Evan Laforge <[hidden email]> [2012-03-31 15:23:48-0700]
> A while back I was complaining about the profusion of poorly
> documented tags generators.  Well, there is still a profusion of
> poorly documented tags generators... I was able to find 5 of them.
>
> So, that said, here's my contribution to the problem: fast-tags,
> haskell tag generator #6.

It's useful to mention the limitations of this package, so that people
know what to expect and don't spend their time testing it to understand
that it doesn't suit their needs.

For example:
  doesn't generate tags for definitions without type signatures
  doesn't understand common extensions, such as type families

--
Roman I. Cheplyaka :: http://ro-che.info/


Re: ANNOUNCE: fast-tags-0.0.1

Christopher Done
In reply to this post by Evan Laforge
On 1 April 2012 00:23, Evan Laforge <[hidden email]> wrote:
> Two of them use haskell-src which means they can't parse my code.  Two
> more use haskell-src-exts, which is slow and fragile, breaks on
> partially edited source, and doesn't understand hsc.

For what it's worth:

* As you say below, HSC is easily dealt with by ignoring # lines.

* haskell-src-exts is not slow. It can parse a 769 module codebase racking up
  to 100k lines of code in just over a second on my machine. That's
  good. Also, I don't think speed of the individual file matters, for
  reasons I state below.

* Broken source is not a big issue to me. Code is written with a GHCi session
  on-hand; syntactic issues are the least of my worries. I realise it
  will be for others.
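
The "ignore the # lines" idea from the first bullet can be sketched like this (Python, for illustration only). The assumption here is that a directive is any line whose first non-space character is `#`; inline hsc constructs inside an expression are not handled, and `blank_hsc_lines` is a hypothetical name.

```python
def blank_hsc_lines(source):
    """Blank out lines whose first non-space character is '#'.

    Lines are emptied rather than removed so that line numbers in the
    generated tags still match the original .hsc file.
    """
    return "\n".join(
        "" if line.lstrip().startswith("#") else line
        for line in source.split("\n")
    )


hsc = "\n".join([
    "module Ffi where",
    "#include <stdio.h>",
    "",
    "foo :: Int",
    "foo = 1",
])
cleaned = blank_hsc_lines(hsc)
```

Keeping the line count unchanged matters because tags usually record a line number or search pattern for each definition.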

The problem with haskell-src-exts is that it refuses to parse expressions for
which it cannot reduce the operator precedence, meaning it can't parse any
module that uses a freshly defined operator.

The reason I don't think individual file performance matters is that
the output can be cached. There's also the fact that if I modify a
file, and generate tags, I'm likely editing that file presently, and
I'm not likely to need the jumping around that tags provide.

> Then there's the venerable hasktags, but it's buggy and the source
> is a mess. I fixed a bug where it doesn't actually strip comments
> so it makes tags to things inside comments, but then decided it
> would be easier to just write my own.

Hasktags is hardly buggy in my experience. The comments bug is minor. But I
agree that the codebase is messy and would be better handled as
Text. But again, speed on the individual basis isn't a massive issue here.

> fast-tags is fast because it has a parser that's just smart enough to
> pick out the tags.  It can tagify my entire 300 module program in
> about a second.

Unfortunately there appears to be a horrific problem with it, as the
log below shows:

$ time (find . -name '*.hs' | xargs hasktags -e)

real 0m1.573s
user 0m1.536s
sys 0m0.032s
$ cabal install fast-tags --reinstall --ghc-options=-O2
Resolving dependencies...
Configuring fast-tags-0.0.2...
Preprocessing executables for fast-tags-0.0.2...
Building fast-tags-0.0.2...
[1 of 1] Compiling Main             ( src/Main.hs, dist/build/fast-tags/fast-tags-tmp/Main.o )
Linking dist/build/fast-tags/fast-tags ...
Installing executable(s) in /home/chris/.cabal/bin
$ time (find . -name '*.hs' | xargs fast-tags)
^C
real 10m39.184s
user 0m0.016s
sys 0m0.016s
$

I cancelled the program after ten minutes. The CPU was at 100% and
memory usage was slowly climbing, but only slowly. It's not an
infinite loop, however. If I delete the "tags" file and restrict the
search to only the src directory, it completes quickly, but the second
run is slower.

$ time (find src -name '*.hs' | xargs hasktags -e)

real 0m0.113s
user 0m0.112s
sys 0m0.008s
$ time (find src -name '*.hs' | xargs fast-tags)

real 0m0.136s
user 0m0.120s
sys 0m0.020s
$ time (find src -name '*.hs' | xargs fast-tags)

real 0m0.250s
user 0m0.244s
sys 0m0.012s

So there appears to be a superlinear component to the program. E.g.

$ time (find . -name '*.hs' | xargs fast-tags)
./lib/text-0.11.1.5/tests/benchmarks/src/Data/Text/Benchmarks/Pure.hs:435: unexpected end of block after data * =
./lib/split-0.1.2.3/Data/List/Split/Internals.hs:68: unexpected end of block after data * =
./lib/QuickCheck-2.4.1.1/Test/QuickCheck/Function.hs:51: unexpected end of block after data * =

real 0m26.993s
user 0m26.590s
sys 0m0.324s

If I try to run again it hangs again. I expect it's somewhere around
sort/merge/removeDups. This is on GHC 7.2.1.
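
To illustrate why a merge/dedup step is a plausible place for such a slowdown (this is not a diagnosis of the actual bug): removing duplicates by comparing every element against all earlier ones is quadratic, whereas two already-sorted tag lists can be merged and deduplicated in one linear pass. A sketch in Python, with `merge_unique` as a hypothetical name:

```python
import heapq


def merge_unique(xs, ys):
    """Merge two sorted lists and drop duplicates in a single pass.

    Because the merged stream is sorted, equal items are adjacent, so
    comparing against the last item kept is enough.
    """
    merged = []
    for item in heapq.merge(xs, ys):
        if not merged or merged[-1] != item:
            merged.append(item)
    return merged


existing = ["bar", "foo"]
incoming = ["baz", "foo", "qux"]
combined = merge_unique(existing, incoming)
```

The work here is proportional to the total number of tags, regardless of how large the existing tags file has grown.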

> But it's also incremental, so it only needs to do that the first
> time.

For what it's worth to anybody using hasktags, I've added this to
hasktags: https://github.com/chrisdone/hasktags/commits/master

I save the file data as JSON. I tried using aeson but that's buggy:
https://github.com/bos/aeson/issues/75 At any rate, it should cache
the generated tags rather than the file data, but I'd have to
restructure the hasktags program a bit and I didn't feel like that
yet.
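
A rough sketch of caching the generated tags rather than the file data, in Python for illustration. Everything here is an assumption for the sketch and not the hasktags implementation: the cache layout, keying on modification time, and the `extract` callback that stands in for the real tag generator.

```python
import os


def tags_with_cache(files, cache, extract):
    """Return {file: tags}, calling extract(file) only for changed files.

    cache maps file -> {"mtime": float, "tags": list} and is updated in
    place. It holds only JSON-friendly values, so it can be written out
    with json.dump and read back on the next run.
    """
    result = {}
    for path in files:
        mtime = os.path.getmtime(path)
        entry = cache.get(path)
        if entry is None or entry["mtime"] != mtime:
            # File is new or was modified since it was cached: regenerate.
            entry = {"mtime": mtime, "tags": extract(path)}
            cache[path] = entry
        result[path] = entry["tags"]
    return result
```

With this layout an unchanged file costs one `stat` call instead of a parse.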

hasktags has no problem with this codebase:

$ time (find . -name '*.hs' | xargs hasktags --cache)

real 0m1.512s
user 0m1.420s
sys 0m0.088s

and with the cache generated, it's half the time:

$ time (find . -name '*.hs' | xargs hasktags --cache)

real 0m0.780s
user 0m0.712s
sys 0m0.072s

> I have vim's BufWrite autocommand bound to
> updating the tags every time a file is written, and it's fast enough
> that I've never noticed the delay.  It understands hsc directly
> (that's trivial, just ignore the # lines) so there's no need to run
> hsc2hs before tagifying.  The result is tags which are automatically
> up to date all the time, which is nice.

This is the use-case I (and the users who have notified me of it) have
with Emacs in haskell-mode.

> If people care about lhs and emacs tags then it wouldn't be hard to
> support those too, and at that point I could replace hasktags and we'd
> be back down to 5 again.  But I'm not even sure anyone uses hasktags,
> since surely someone would have noticed that comment bug.

I like the fast-tags codebase so it would be nice to start using it,
but I hope you can test it on either a more substantial codebase or
just a different codebase. Or just grab some packages from Hackage and
test. Emacs support would be nice, I might add it myself if you can
fix the performance explosion. Right now hasktags is OK for me. I won't be
hacking on it in the future for more features because…

While we're on the topic I think haskell-src-exts is worth investing
time in, as it has semantic knowledge about our code. I am trying to
work on it so that it can preserve comments and output them, so that
we can start using it to pretty print our code, refactor our code,
etc. It could also be patched to handle operators as Operators [Exp]
rather than OpApp x (OpApp y), etc. I think.
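
The flat-operator idea can be sketched as follows (Python, for illustration; this is not haskell-src-exts code). The parser would emit an unresolved `[operand, op, operand, ...]` list, and a later pass nests it once fixities are known. Operators with no fixity declaration get Haskell's default of `infixl 9`. Non-associative operators and mixed-associativity errors are ignored here, and `resolve` is a hypothetical name.

```python
DEFAULT_FIXITY = (9, "l")  # Haskell's default fixity: infixl 9


def resolve(tokens, fixities):
    """Nest a flat [operand, op, operand, ...] list using precedence climbing.

    fixities maps an operator to (precedence, "l" or "r"). The result is
    a tree of (op, left, right) tuples with operands as leaves.
    """
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens):
            op = tokens[pos]
            prec, assoc = fixities.get(op, DEFAULT_FIXITY)
            if prec < min_prec:
                break
            pos += 1
            # A left-associative operator must not absorb another operator
            # of the same precedence on its right-hand side.
            right = parse(prec + 1 if assoc == "l" else prec)
            left = (op, left, right)
        return left

    return parse(0)


known = {"+": (6, "l"), "*": (7, "l"), "$": (0, "r")}
tree = resolve(["a", "+", "b", "*", "c"], known)
fresh = resolve(["a", "<+>", "b", "<+>", "c"], known)
```

Because the flat list is kept until fixities are available, a freshly defined operator such as `<+>` above does not have to block parsing.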


Re: ANNOUNCE: fast-tags-0.0.1

Christopher Done
By the way, I'm assuming that this library isn't an April Fools joke
by making a library called “fast” with explosive O(n²) time problems.
:-P


Re: ANNOUNCE: fast-tags-0.0.1

Levent Erkok
Chris: You might be experiencing this issue:
http://hackage.haskell.org/trac/ghc/ticket/5783

Upgrading text and recompiling fast-tags should take care of this problem.

-Levent.

On Sun, Apr 1, 2012 at 10:12 AM, Christopher Done
<[hidden email]> wrote:
> By the way, I'm assuming that this library isn't an April Fools joke
> by making a library called “fast” with explosive O(n²) time problems.
> :-P


Re: ANNOUNCE: fast-tags-0.0.1

Evan Laforge
In reply to this post by Christopher Done
> * haskell-src-exts is not slow. It can parse a 769 module codebase racking up
>  to 100k lines of code in just over a second on my machine. That's
>  good. Also, I don't think speed of the individual file matters, for
>  reasons I state below.

Wow, that's faster than my machine.

> * Broken source is not a big issue to me. Code is written with a GHCi session
>  on-hand; syntactic issues are the least of my worries. I realise it
>  will be for others.

I do too, but my usual practice is to have ghci in another window, save
the file, and hit :r over there.  So it's distracting when the tags
program spits out a bunch of syntax errors, I'm used to seeing those
in ghci.  And I save somewhat compulsively :)

> The problem with haskell-src-exts is that it refuses to parse expressions for
> which it cannot reduce the operator precedence, meaning it can't parse any
> module that uses a freshly defined operator.

Oh right, I remember having that problem too.

> The reason I don't think individual file performance matters is that
> the output can be cached. There's also the fact that if I modify a
> file, and generate tags, I'm likely editing that file presently, and
> I'm not likely to need the jumping around that tags provide.

It's true for me too, though I like the convenience of retagging on
every single save.  But it's true that given incremental tags
regeneration, haskell-src-exts is plenty fast too.  I didn't put a lot
of thought into the name, but I mostly just wanted tags I could run on
every save.

>> Then there's the venerable hasktags, but it's buggy and the source
>> is a mess. I fixed a bug where it doesn't actually strip comments
>> so it makes tags to things inside comments, but then decided it
>> would be easier to just write my own.
>
> Hasktags is hardly buggy in my experience. The comments bug is minor. But I
> agree that the codebase is messy and would be better handled as
> Text. But again, speed on the individual basis isn't a massive issue here.

The comments thing was really big for me.  It made it miss a lot of tags.

> Unfortunately there appears to be a horrific problem with it, as the
> log below shows:

Ouch.  I probably have some kind of laziness problem in there.

I'll try downloading some stuff from cabal and try it on some
different codebases.

> I like the fast-tags codebase so it would be nice to start using it,
> but I hope you can test it on either a more substantial codebase or
> just a different codebase. Or just grab some packages from Hackage and
> test. Emacs support would be nice, I might add it myself if you can
> fix the performance explosion. Right now hasktags is OK for me. I won't be
> hacking on it in the future for more features because…

Will do.  I also realized 'x, y :: ' type stuff doesn't work.  And it
might be nice to support internal definitions and use vim's "static
tag" feature.
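
A sketch of picking tag names out of top-level signatures, including the `x, y ::` form (Python, for illustration). The regular expression and the `signature_tags` name are assumptions for this sketch rather than how fast-tags tokenizes; operator signatures such as `(<+>) ::` are not covered, and indented signatures are deliberately skipped.

```python
import re

# A top-level signature starts in column 0: one or more comma-separated
# lowercase identifiers followed by "::".
SIGNATURE = re.compile(r"^([a-z_][\w']*(?:\s*,\s*[a-z_][\w']*)*)\s*::")


def signature_tags(filename, source):
    """Return (name, filename, line_number) for each top-level signature."""
    tags = []
    for lineno, line in enumerate(source.split("\n"), start=1):
        match = SIGNATURE.match(line)
        if match:
            for name in match.group(1).split(","):
                tags.append((name.strip(), filename, lineno))
    return tags


module = "\n".join([
    "x, y :: Int",
    "x = 1",
    "y = 2",
    "  helper :: Int",
    "main :: IO ()",
])
found = signature_tags("M.hs", module)
```

Each name on a shared signature line gets its own tag pointing at that line.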

> While we're on the topic I think haskell-src-exts is worth investing
> time in, as it has semantic knowledge about our code. I am trying to
> work on it so that it can preserve comments and output them, so that
> we can start using it to pretty print our code, refactor our code,
> etc. It could also be patched to handle operators as Operators [Exp]
> rather than OpApp x (OpApp y), etc. I think.

Oh I agree haskell-src-exts is great and I love it.  I used it for
fix-imports.  It does support comments, but dealing with them was a
pain because they just have line numbers and you have to do some work
to figure out which bit of source they are "attached" to.  Part of the
problem is, of course, that "attached" is a fuzzy concept, but there
could definitely be some tools to make it easier.

Actually, if haskell-src-exts had a lenient parsing mode then it would
be easier to use and less buggy than a hand-written thing.

On Sun, Apr 1, 2012 at 1:44 PM, Levent Erkok <[hidden email]> wrote:
> Chris: You might be experiencing this issue:
> http://hackage.haskell.org/trac/ghc/ticket/5783

I'm guessing not, since he was using 0.0.2, which has the version
constraint.  And his symptoms are different.


Re: ANNOUNCE: fast-tags-0.0.1

Evan Laforge
In reply to this post by Roman Cheplyaka-2
On Sun, Apr 1, 2012 at 3:27 AM, Roman Cheplyaka <[hidden email]> wrote:
> It's useful to mention the limitations of this package, so that people
> know what to expect and don't spend their time testing it to understand
> that it doesn't suit their needs.

Good point, I'll put the limitations and TODO stuff into the package
description.

> For example:
>  doesn't generate tags for definitions without type signatures

That was a conscious decision, though now that I think about it I
could assume they're unexported and use vim's static tags for those
definitions.  I don't know about the most common case, but I almost
always have signatures on top level definitions, and I don't really
feel like I need tags for where or let-bound definitions.

>  doesn't understand common extensions, such as type families

Oops, that was an oversight.  Should be easy enough to fix.


Re: ANNOUNCE: fast-tags-0.0.1

Evan Laforge
In reply to this post by Evan Laforge
I recently uploaded fast-tags-0.0.3.  The main thing is that all the
performance problems I was able to find have been fixed, so hopefully
it will no longer be mistaken for an April Fools joke!  Here's a copy
and paste from the hackage description:

Changes since 0.0.2:

Lots of speed ups, especially when given lots of files at once.

Support for type families and GADTs.

Support infix operators, multiple declarations per line, and fix
various other bugs that missed or gave bad tags.

Limitations:

No emacs tags, but they would be easy to add.

Not using a real haskell parser means there are more likely to be dark
corners that don't parse right.

Only top-level functions with type declarations are tagged. Top-level
functions without type declarations are skipped, as are ones inside
let or where.

Code has to be indented "properly", so brace and semicolon style with
strange dedents will probably confuse it.
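
To make the indentation limitation concrete, here is a sketch of layout-based block splitting (Python, for illustration; the approach is an assumption about how a lightweight tagger might work, not a description of fast-tags). A line starting in column 0 opens a new top-level block and indented lines continue the current one, so a continuation line dedented to column 0 would be mistaken for a new declaration.

```python
def top_level_blocks(source):
    """Group lines into blocks, starting a new block at each column-0 line."""
    blocks = []
    for line in source.split("\n"):
        if not line.strip():
            continue  # blank lines neither start nor extend a block
        if line[0].isspace() and blocks:
            blocks[-1].append(line)
        else:
            blocks.append([line])
    return blocks


module = "\n".join([
    "data T",
    "  = A",
    "  | B",
    "",
    "f :: Int",
    "f =",
    "  1",
])
blocks = top_level_blocks(module)
```

Here the `data` declaration stays together as one block only because its constructors are indented.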
