haskell zlip read position

rahul
Hi Guys,
    I am trying to parse a binary stream made up of multiple entries, where
each entry has the format [headers, zlib-compressed content].
I can use the zlib library to get the content of the first entry after
the headers, but I cannot find a way to get the offset at which to start
parsing the second entry. Is there a way to get this information out of
zlib, or is there a better approach? Any pointers would be
very much appreciated.

Regards,
Rahul


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: haskell zlip read position

Henning Thielemann

On Wed, 20 Jul 2011, rahul wrote:

> Hi Guys,
>    I am trying to parse a binary stream with the format for one entry
> [headers, zlib compressed content] , with multiple entries.
> I can use the Zlib library to get the content for the first entry after
> the headers, but I cannot find a way to get the offset to start parsing
> for the second entry.  Is there a way I can get this information out of
> ZLib? or is there a better approach to doing this? Any pointers would be
> very much appreciated.

As far as I know, these are compressors for single files. Multiple files
can be bundled with tar before compression, and tar archives can be
manipulated from Haskell using the 'tar' package.


Re: haskell zlip read position

rahul
Hi,
| >   I am trying to parse a binary stream with the format for one entry
| >[headers, zlib compressed content] , with multiple entries.
| >I can use the Zlib library to get the content for the first entry after
| >the headers, but I cannot find a way to get the offset to start parsing
| >for the second entry.  Is there a way I can get this information out of
| >ZLib? or is there a better approach to doing this? Any pointers would be
| >very much appreciated.
|
| As far as I know, these are compressors for single files. Multiple
| files can be compressed in connection with TAR, that can be
| manipulated from Haskell using the 'tar' package.

Unfortunately the binary protocol itself is external, so I can't use a
different type of compression.


                                    rahul
--
http://people.oregonstate.edu/~gopinatr/


Re: haskell zlip read position

Nathan Howell-2
On Wed, Jul 20, 2011 at 11:50 AM, rahul <[hidden email]> wrote:
> Unfortunately the binary protocol itself is external, so can't use a different
> type of compression

Perhaps something like this would work: https://gist.github.com/1096039

I didn't test to make sure it works, but you could probably hack together a working solution using Data.Enumerator.Binary.isolate and the zlib-enum package.
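
The idea of isolating a fixed number of bytes for an inner parser can be sketched without any streaming library. The following is only an illustration over plain lists, not the enumerator or zlib-enum API: `isolateBytes` and `entries` are made-up names, and the one-byte length prefix is an assumed framing that only works when the stream actually carries entry lengths.

```haskell
import Data.Word (Word8)

-- | Hand exactly n bytes to an inner parser and return its result together
-- with whatever follows those n bytes. (Hypothetical helper, not a library call.)
isolateBytes :: Int -> ([Word8] -> a) -> [Word8] -> (a, [Word8])
isolateBytes n inner input =
  let (chunk, rest) = splitAt n input
  in (inner chunk, rest)

-- | Split a stream of length-prefixed entries into their payloads.
-- Assumed framing: one length byte, then that many payload bytes.
entries :: [Word8] -> [[Word8]]
entries [] = []
entries (len : body) =
  let (payload, rest) = isolateBytes (fromIntegral len) id body
  in payload : entries rest
```

In a real parser the inner function would be the decompressor rather than `id`; the point is only that the outer loop, not the inner parser, decides where the next entry starts.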

-n


Re: haskell zlip read position

rahul
Hi Nathan,
    Thank you very much for the solution. Since I am somewhat new to Haskell, I
am taking some time to digest it :). But it seems that you are using
header -> streamLength to find the length of a single entry. However, this
info is not present in the protocol I am parsing (git server pack files).

Have I understood your code correctly?



Re: haskell zlip read position

Nathan Howell-2
It was purely for demonstration. I did update the code with a few more comments, but the enumerator package may not be the easiest thing to grok. You might try putting up your current code, and someone might be able to recommend a better or easier approach.

If the git pack headers have lengths in them, you could do something as simple as calling hSeek to move a file handle to the next header and start your decoding over again.
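
A minimal sketch of the hSeek approach, assuming a made-up framing in which every entry starts with a single length byte followed by that many payload bytes. The function name `entryOffsets` and the framing are illustrative assumptions, not the git pack format; only `hSeek`, `hTell`, `hIsEOF` and `hGetChar` come from System.IO.

```haskell
import System.IO

-- | Walk a binary file of length-prefixed entries and return the
-- (payload offset, payload length) of each one.
-- Assumed framing: one length byte, then that many payload bytes.
-- The handle should be opened in binary mode so each Char is one byte.
entryOffsets :: Handle -> IO [(Integer, Integer)]
entryOffsets h = do
  eof <- hIsEOF h
  if eof
    then pure []
    else do
      -- Read the one-byte length header.
      len <- fromIntegral . fromEnum <$> hGetChar h
      -- Remember where the payload starts.
      start <- hTell h
      -- Skip the payload without reading it.
      hSeek h AbsoluteSeek (start + len)
      rest <- entryOffsets h
      pure ((start, len) : rest)
```

It could be used as `withBinaryFile path ReadMode entryOffsets`, where `path` is whatever file holds the entries. This only helps when the header states the stored (compressed) length of the entry.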


Re: haskell zlip read position

rahul
Hi Nathan,

| It was purely just for demonstration. I did update the code with a few more
| comments, but the enumerator package may not be the easiest thing to grok.
| You might try putting up your current code and someone might be able to
| recommend a better or easier approach.

Thank you very much again. I am working on extracting a leaner version of my
parser that I can post to demonstrate the problem.

| If the git pack headers have lengths in them, you could do something as
| simple as calling hSeek to move a file handle to the next header and start
| your decoding over again.

That is the unfortunate part: git pack headers carry the inflated
(uncompressed) length rather than the compressed entry length, so that
length cannot be used to find the start of the next entry.
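
To make the point about the header concrete, here is a rough sketch of decoding a pack entry header as described in git's pack-format documentation: the first byte holds a continuation bit, a 3-bit object type and the low 4 bits of the size, and each following byte adds 7 more size bits while its top bit is set. The function name `packEntryHeader` is made up for illustration, and delta entries carry extra fields after this header that the sketch does not handle.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word8)

-- | Decode a pack entry header into
-- (object type, inflated size, number of header bytes consumed).
-- Returns Nothing if the input ends in the middle of the header.
packEntryHeader :: [Word8] -> Maybe (Int, Integer, Int)
packEntryHeader [] = Nothing
packEntryHeader (b : bs) =
  go (fromIntegral (b .&. 0x0f)) 4 1 (testBit b 7) bs
  where
    -- Bits 6..4 of the first byte are the object type.
    objType = fromIntegral ((b `shiftR` 4) .&. 0x07)
    -- Arguments: size so far, next shift amount, bytes used, "more bytes follow".
    go size _ used False _ = Just (objType, size, used)
    go _ _ _ True [] = Nothing
    go size shift used True (c : cs) =
      go (size .|. (fromIntegral (c .&. 0x7f) `shiftL` shift))
         (shift + 7)
         (used + 1)
         (testBit c 7)
         cs
```

For example, `packEntryHeader [0x95, 0x0a]` gives `Just (1, 165, 2)`: type 1, an inflated size of 165 bytes, and a 2-byte header. Nothing in the result says how many compressed bytes follow, which is exactly the problem described above.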
