Binary Data Access via PIC…??

Nick Rudnick-2
At the NL FP day it struck me again, when I saw an almost 1 MB *.hs file whose apparent sole purpose was to get a quantity of raw data incorporated into the binary, using some funny text-encoding constructs. As far as I know, this still appears to be the best available solution, with the major downside that it all happens at compile time…
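
For reference, the tidiest variant of this compile-time approach that I know of looks something like the following (a sketch only, using the file-embed package; the file name is made up):

    {-# LANGUAGE TemplateHaskell #-}

    import Data.ByteString (ByteString)
    import Data.FileEmbed (embedFile)

    -- The file's bytes are baked into the executable at compile time;
    -- a change to table.bin forces a recompile of this module.
    rawTable :: ByteString
    rawTable = $(embedFile "data/table.bin")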

Another approach I have noticed several times is the use of very fast parsing to read the binary data in at run time.
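
In its simplest form that is just one bulk read at start-up (a trivial sketch; the file name is made up):

    import qualified Data.ByteString as BS

    main :: IO ()
    main = do
        -- One bulk read at program start; any structure still has to
        -- be parsed out of the bytes afterwards.
        bytes <- BS.readFile "data/table.bin"
        print (BS.length bytes)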

Did I miss something?

Or, more specifically: I am speaking about the kind of binary data which is

(1) huge! (the 1 MB mentioned above being rather at the lower limit),
(2) completely independent of the version of the Haskell compiler,
(3) guaranteed (externally!) to match the structural requirements of the application that refers to it,
(4) well managed in some way, concerning ABI issues too (e.g. versioning, metadata headers, etc.),

and the question is to what extent we can exploit PIC (position-independent code), as I believe other languages do, to read in really large quantities of binary data at run time or immediately before run time, without any need for parsing.

E.g., a Haskell file containing a textual representation of the data already compiles to an object file, and the linker should only make a limited number of assumptions about that file's inner structure. Now imagine I have a huge but simple DB table, and a converter which, by mimicking a small part of what the Haskell compiler does, generates an object file satisfying those same (limited, I believe) assumptions; in effect it builds a 'fake' that the linker accepts in place of a dummy skeleton file. Couldn't that be a way towards getting vast amounts of binary data in directly, in one piece?
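
To make that concrete, here is roughly what I have in mind, done with existing tools (a sketch only: the symbol and file names are made up, and the assembly assumes GNU as on a typical ELF platform):

    {-# LANGUAGE ForeignFunctionInterface #-}

    import Data.ByteString (ByteString)
    import qualified Data.ByteString.Unsafe as BU
    import Data.Word (Word8)
    import Foreign.Ptr (Ptr, castPtr, minusPtr)

    -- Companion assembly file table_blob.s, linked in with the program:
    --
    --         .section .rodata
    --         .global table_data
    --     table_data:
    --         .incbin "table.bin"
    --         .global table_data_end
    --     table_data_end:

    -- The '&' form imports the address of a static data symbol.
    foreign import ccall "&table_data"     tableStart :: Ptr Word8
    foreign import ccall "&table_data_end" tableEnd   :: Ptr Word8

    -- Zero-copy view of the embedded blob; safe here because .rodata
    -- is immutable and never freed.
    tableBytes :: IO ByteString
    tableBytes =
        BU.unsafePackCStringLen
            (castPtr tableStart, tableEnd `minusPtr` tableStart)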

In case there are stronger integrity needs, extra metadata (a checksum, signature, or version header, say) should be usable to verify that the blob originates from a valid code generator.
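
E.g. even a plain digest check would go a long way (a sketch assuming the cryptonite package; the expected value is a placeholder):

    import Crypto.Hash (Digest, SHA256 (..), hashWith)
    import qualified Data.ByteString as BS

    -- Digest shipped alongside the blob by the generator (placeholder).
    expectedDigest :: String
    expectedDigest = "…"

    -- Accept the blob only if it matches what the generator produced.
    verifyBlob :: BS.ByteString -> Bool
    verifyBlob blob =
        show (hashWith SHA256 blob :: Digest SHA256) == expectedDigest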

Of course, while not strictly necessary, true run-time loading would be even better… though directly interfacing to foreign (albeit simple) memory spaces seems much more intricate to me.
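
For the pure run-time case, memory mapping already gets quite close to "loading without parsing" (a sketch assuming the mmap package; the file name is made up):

    import qualified Data.ByteString as BS
    import System.IO.MMap (mmapFileByteString)

    main :: IO ()
    main = do
        -- Map the whole file read-only; pages are faulted in on
        -- demand, so even a huge blob "loads" essentially instantly.
        bytes <- mmapFileByteString "data/table.bin" Nothing
        print (BS.length bytes)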

I regularly stumble over such cases, so I do believe this would be useful.

I would be happy to learn more about this – any thoughts…??

Cheers, and all the best, Nick

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Re: Binary Data Access via PIC…??

Ian Denhardt
Shameless plug for one of my own libraries, which seems at least
relevant to the problem space:

    https://hackage.haskell.org/package/capnp

Though as a disclaimer I haven't done any benchmarking myself; my
personal interest is more in RPC than in super-fast serialization.
There will be a release with RPC support sometime later this month.

That said, I have heard from one user who is using it to communicate with
a part of their application written in C++; they switched over from
protobufs for performance, and because they needed to handle very large
(> 2 GiB) data.

-Ian
