Storable laws


David Feuer
The documentation for pokeElemOff indicates that the following equality holds:

  pokeElemOff addr idx x =
    poke (addr `plusPtr` (idx * sizeOf x)) x

Notably, this ignores alignment. It thus seems to imply that sizeOf must always be a multiple of alignment; otherwise, non-zero indices could access non-aligned addresses.

Was this intentional? If so, I believe sizeOf and alignment should document the law. If not, then I believe the {poke,peek}ElemOff laws need to change to something like

  pokeElemOff addr idx x =
     poke (addr `plusPtr` (idx * lcm (sizeOf x) (alignment x))) x
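
To make the difference between the two candidate laws concrete, here is a minimal sketch that only does the byte-offset arithmetic; no memory is touched. The names lawOffset, lcmOffset and alignedAt are made up for this illustration and are not part of base.

```haskell
-- Pure offset arithmetic for the two candidate laws (illustrative names).

-- | Byte offset of element idx under the documented law: idx * sizeOf.
lawOffset :: Int -> Int -> Int
lawOffset size idx = idx * size

-- | Byte offset under the proposed alternative: idx * lcm sizeOf alignment.
lcmOffset :: Int -> Int -> Int -> Int
lcmOffset size align idx = idx * lcm size align

-- | True when an offset from an aligned base address is itself aligned.
alignedAt :: Int -> Int -> Bool
alignedAt align off = off `mod` align == 0
```

When the size is a multiple of the alignment (say 8 and 8), both formulas give the same offsets. For a hypothetical type of size 12 with alignment 8, lawOffset 12 1 is 12, which is not 8-byte aligned, while lcmOffset 12 8 1 is 24.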

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Re: Storable laws

Michael Sloan
I realize that these libraries are used on many architectures.
However, on modern x86 machines less than ~7 years old, it
doesn't seem to matter all that much:
https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/
In the comments, it looks like somewhat older processors take a 40%
performance hit, which isn't good, but it isn't awful.

IIRC, of the processors that are actually used, there are two where
access alignment really matters: ARM and PowerPC.  If you're running
Linux, I believe the kernel will handle the exception and do the
unaligned access.  However, of course it's really slow to do this.
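
For readers who want to detect such accesses before they happen, here is a small sketch of an alignment check built on alignPtr from Foreign.Ptr. The helper isAligned and the two example addresses are assumptions made for this illustration; the addresses are derived from nullPtr purely for arithmetic and are never dereferenced.

```haskell
import Data.Int (Int32)
import Foreign.Ptr (Ptr, alignPtr, nullPtr, plusPtr)
import Foreign.Storable (Storable, alignment)

-- | True when the pointer already satisfies the alignment reported by the
-- Storable instance of its element type.
isAligned :: Storable a => Ptr a -> Bool
isAligned p = alignPtr p (alignment (elemOf p)) == p
  where
    -- Only used to pick the instance; the value is never evaluated.
    elemOf :: Ptr a -> a
    elemOf _ = undefined

-- | Example addresses for arithmetic only; they are never read or written.
alignedExample, unalignedExample :: Ptr Int32
alignedExample   = nullPtr `plusPtr` 16
unalignedExample = nullPtr `plusPtr` 18
```

With GHC's 4-byte alignment for Int32, isAligned alignedExample holds and isAligned unalignedExample does not.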

I'm not sure if it makes sense to change the law.  Someone might be
relying on the behavior of instances that do not fulfill this law.
Perhaps in those cases it makes sense to just remove the instance.

-Michael

On Fri, Dec 15, 2017 at 9:33 PM, David Feuer <[hidden email]> wrote:

> The documentation for pokeElemOff indicates that the following equality
> holds:
>
>   pokeElemOff addr idx x =
>     poke (addr `plusPtr` (idx * sizeOf x)) x
>
> Notably, this ignores alignment. It thus seems to imply that sizeOf must
> always be a multiple of alignment; otherwise, non-zero indices could access
> non-aligned addresses.
>
> Was this intentional? If so, I believe sizeOf and alignment should document
> the law. If not, then I believe the {poke,peek}ElemOff laws need to change
> to something like
>
>   pokeElemOff addr idx x =
>      poke (addr `plusPtr` (idx * lcm (sizeOf x) (alignment x))) x
>

Re: Storable laws

Henning Thielemann

On Fri, 15 Dec 2017, Michael Sloan wrote:

> I'm not sure if it makes sense to change the law.  Someone might be
> relying on the behavior of instances that do not fulfill this law.
> Perhaps in those cases it makes sense to just remove the instance.

Storable is intended for data exchange with code written in other
languages. Thus we must use the same alignment rules as the system ABI.

Re: Storable laws

Sven Panne-2
In reply to this post by David Feuer
2017-12-16 6:33 GMT+01:00 David Feuer <[hidden email]>:
> The documentation for pokeElemOff indicates that the following equality holds:
>
>   pokeElemOff addr idx x =
>     poke (addr `plusPtr` (idx * sizeOf x)) x
>
> [...] Was this intentional?

Yep, that was intentional, think of pokeElemOff as indexing into an array. Note that the FFI intentionally does not specify how to (un)marshal structs, only basic types and arrays of them. Doing anything else would be a) language-specific and b) platform-ABI-specific. Storable is just meant as a basic building block to do such more involved things.
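
The array-indexing reading can be sketched as follows, assuming an array of three Int32 values. The function name writeBothWays is made up for this example; everything else comes from Foreign.Marshal.Array and Foreign.Storable in base.

```haskell
import Data.Int (Int32)
import Foreign.Marshal.Array (allocaArray, peekArray)
import Foreign.Ptr (plusPtr)
import Foreign.Storable (poke, pokeElemOff, sizeOf)

-- | Fill a three-element array, writing the first two elements with
-- pokeElemOff and the third with the right-hand side of the law, then
-- read the whole array back.
writeBothWays :: IO [Int32]
writeBothWays =
  allocaArray 3 $ \arr -> do
    pokeElemOff arr 0 10
    pokeElemOff arr 1 20
    -- Equivalent to: pokeElemOff arr 2 30
    poke (arr `plusPtr` (2 * sizeOf (0 :: Int32))) (30 :: Int32)
    peekArray 3 arr
```

Reading the array back yields the three values in order, so both ways of writing address the same gap-less slots.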
 
> If so, I believe sizeOf and alignment should document the law. [...]

Perhaps, but what exactly do you feel is missing there?


Re: Storable laws

Sven Panne-2
In reply to this post by Henning Thielemann
2017-12-16 8:53 GMT+01:00 Henning Thielemann <[hidden email]>:
> Storable is intended for data exchange with code written in other languages. Thus we must use the same alignment rules as the system ABI.

As already mentioned above, Storable is meant as a *basic* building block for doing that, without a reference to any ABI whatsoever. Another use case for Storable is e.g. marshaling things into some form suitable for serialization, which might have nothing to do with any system ABI. So the conclusion that Storable must somehow obey some alignment rules is not correct: *You* as the author of a Storable instance for a given purpose have to care about alignment/packing/etc.

Taking a step back: An ABI like https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf specifies a lot of things:

   * OS interface
   * File formats
   * Handling of exceptions
   * Linking
   * The calling convention for C (data types, data layout, registers, stack frames, ...)
   * ...

Even the calling convention for C++ is left out of most such ABI documents (perhaps available as a separate addendum), other languages are most of the time not even mentioned. *Some* calling convention has to be specified, otherwise you would have a hard time specifying the OS interface, and C is still the most natural choice for that. The FFI spec just gives you a small fraction of such an ABI:

   * A way to call C land, because you can't construct the right stack frame/register contents from Haskell land
   * A way to call back from C to Haskell
   * A very basic tooling to (un)marshal primitive data types and arrays of them (Storable)

Note that Storable itself is intentionally not tied to the ABI, it could as well be used to e.g. marshal packed graphics data to OpenGL etc. The right approach IMHO is to have separate classes for the various ABIs:

   * StorableAMD64
   * Storablex86
   * StorableARM32
   * StorableARM64
   * ...

Those classes would handle alignment/ordering issues for their respective platform, and we can have more instances for them, including sum types. Perhaps even a packed variant of these classes might be needed. In addition, perhaps an alias (subclass?) "StorableNative" of one of those classes, representing the native ABI of the platform the program is running on, is needed, but then cross-compilation will be tricky.
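
A very rough sketch of what one such per-ABI class could look like is given below. Nothing like StorableAMD64 exists in base; the class, its method names, and the helpers padTo and strideAMD64 are all hypothetical and only report layout facts, without any peek/poke machinery.

```haskell
import Data.Word (Word32, Word64)

-- | Hypothetical class, not part of base. It reports the size and
-- alignment a particular ABI assigns to a type.
class StorableAMD64 a where
  sizeOfAMD64    :: a -> Int
  alignmentAMD64 :: a -> Int

instance StorableAMD64 Word32 where
  sizeOfAMD64 _    = 4
  alignmentAMD64 _ = 4

instance StorableAMD64 Word64 where
  sizeOfAMD64 _    = 8
  alignmentAMD64 _ = 8

-- | Round a size up to the next multiple of an alignment.
padTo :: Int -> Int -> Int
padTo align n = ((n + align - 1) `div` align) * align

-- | Array stride for an element under this hypothetical class:
-- the size padded up to the alignment.
strideAMD64 :: StorableAMD64 a => a -> Int
strideAMD64 x = padTo (alignmentAMD64 x) (sizeOfAMD64 x)
```

A packed variant could simply use the unpadded size as the stride, which is why a separate class (or a separate method) per layout policy seems necessary in such a design.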

I'm very open to extending the FFI toolset, but let's not retroactively interpret stuff into already existing things, at the danger of breaking an unknown amount of code.

Cheers,
    S.


Re: Storable laws

Henning Thielemann
In reply to this post by Sven Panne-2

On Sat, 16 Dec 2017, Sven Panne wrote:

> 2017-12-16 6:33 GMT+01:00 David Feuer <[hidden email]>:
> The documentation for pokeElemOff indicates that the following equality holds:
>
>   pokeElemOff addr idx x =
>     poke (addr `plusPtr` (idx * sizeOf x)) x
>
> [...] Was this intentional?
>
>
> Yep, that was intentional, think of pokeElemOff as indexing into an
> array. Note that the FFI intentionally does not specify how to
> (un)marshal structs, only basic types and arrays of them.

I thought that arrays require alignment of their elements.

Re: Storable laws

Sven Panne-2
2017-12-16 15:16 GMT+01:00 Henning Thielemann <[hidden email]>:
> I thought that arrays require alignment of their elements.

Yes, and if you start aligned, the pokeElemOff law keeps you aligned. If you don't start aligned, Storable never magically aligns the first element, anyway, so this must have been intentional (e.g. an array within packed data).

In theory one could have an e.g. 7-byte data type with 8-byte alignment requirements, but I think we can re-open the discussion when a processor manufacturer is masochistic enough to do that. ;-)



Re: Storable laws

Henning Thielemann

On Sat, 16 Dec 2017, Sven Panne wrote:

> 2017-12-16 15:16 GMT+01:00 Henning Thielemann <[hidden email]>:
>       I thought that arrays require alignment of their elements.
>
>
> Yes, and if you start aligned, the pokeElemOff law keeps you aligned. If you don't start aligned, Storable never
> magically aligns the first element, anyway, so this must have been intentional (e.g. an array within packed
> data).
>
> In theory one could have an e.g. 7-byte data type with 8-byte alignment requirements, but I think we can re-open
> the discussion when a processor manufacturer is masochistic enough to do that. ;-)

I am thinking more of a custom struct of 12 bytes consisting of a 64-bit
and a 32-bit word. It must be 8-byte aligned. You would have to align all
elements at multiples of 8 bytes, so the address difference between two
array elements is 16, not 12.

On x86 Linux there would be no problem because a 12-byte struct containing
a 64-bit word must already be padded to 16 bytes. But that's an ABI
definition and Storable wants to stay independent of that, right?
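
As an illustration of the padded layout, here is a minimal sketch of a Storable instance whose sizeOf is rounded up to 16 so that the documented pokeElemOff law keeps every array element 8-byte aligned. The type Pair and the function roundTrip are hypothetical, and this is only one of several layouts an author could choose.

```haskell
import Data.Word (Word32, Word64)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Storable (Storable (..))

-- | Hypothetical record with 12 bytes of payload.
data Pair = Pair Word64 Word32
  deriving (Eq, Show)

-- | One possible instance: sizeOf is padded from 12 to 16, a multiple of
-- the alignment, so idx * sizeOf stays aligned from an aligned base.
instance Storable Pair where
  sizeOf _    = 16
  alignment _ = 8
  -- The 64-bit field sits at offset 0, the 32-bit field at offset 8;
  -- bytes 12..15 are padding and are never read or written.
  peek p = Pair <$> peekByteOff p 0 <*> peekByteOff p 8
  poke p (Pair a b) = pokeByteOff p 0 a >> pokeByteOff p 8 b

-- | Round-trip two elements through a temporary array.
roundTrip :: IO [Pair]
roundTrip =
  allocaArray 2 $ \arr -> do
    pokeArray arr [Pair 1 2, Pair 3 4]
    peekArray 2 arr
```

Here pokeArray and peekArray go through the default pokeElemOff and peekElemOff, so the second element lands at byte offset 16.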

Re: Storable laws

Sven Panne-2
2017-12-16 16:45 GMT+01:00 Henning Thielemann <[hidden email]>:
> I am thinking more of a custom struct of 12 bytes consisting of a 64-bit and a 32-bit word. It must be 8-byte aligned. You would have to align all elements at multiples of 8 bytes, so the address difference between two array elements is 16, not 12.
>
> On x86 Linux there would be no problem because a 12-byte struct containing a 64-bit word must already be padded to 16 bytes. But that's an ABI definition and Storable wants to stay independent of that, right?

Yes, that's an ABI issue. Once again: Storable is *not* for structs, it never has been and will never be (because there is no single correct instance for composite struct-like types). Without any further assumptions, the gap-less definition is the only one which makes sense. And without instances for struct-like things, it even makes more sense.

So what you want is an instance of the proposed StorableAMD64 class. Taking your example: An array without gaps of your 12byte struct would e.g. be totally OK for OpenGL or a packed AMD64-conformant struct, so base shouldn't choose one for you.



Re: Storable laws

Henning Thielemann

On Sat, 16 Dec 2017, Sven Panne wrote:

> Yes, that's an ABI issue. Once again: Storable is *not* for structs, it
> never has been and will never be (because there is no single correct
> instance for composite struct-like types).

One of hsc2hs's most important tasks is to generate Storable instances for
Haskell records that are meant to be compatible with C structs. Is that
use completely invalid?

Re: Storable laws

Henning Thielemann

On Sun, 17 Dec 2017, Merijn Verstraaten wrote:

> Tools like hsc2hs already pad the result of sizeOf to the appropriately
> aligned size (at least the ones I've used), so this is a non-issue for
> that scenario.

But that's tailored to C. If I understand Sven correctly, then this is not
valid for Storable.

Re: Storable laws

Sven Panne-2
2017-12-17 13:41 GMT+01:00 Henning Thielemann <[hidden email]>:

> On Sun, 17 Dec 2017, Merijn Verstraaten wrote:
>
>> Tools like hsc2hs already pad the result of sizeOf to the appropriately aligned size (at least the ones I've used), so this is a non-issue for that scenario.
>
> But that's tailored to C. If I understand Sven correctly, then this is not valid for Storable.

I'm not sure if I understand the concerns about hsc2hs here correctly, but anyway: There are intentionally no Storable instances for struct-like things in the FFI spec, but that doesn't rule out that one can define instances for a special purpose (e.g. native ABI compliant ones) on your own. hsc2hs, c2hs, GreenCard etc. are tools to help you with that (and much more). This doesn't contradict anything in the FFI spec, you just choose one of several possible semantics in your program for types which are not even mentioned in the spec. OTOH, as an author of a general library I would be very hesitant to define such instances, they might not be what the library user expects.

Perhaps we really want to extend the FFI spec in such a way that only one kind of instance makes sense for a given type (including complex ones), e.g. by defining separate classes for the different purposes mentioned in this thread. But this would need some serious design thought to get it right and flexible enough while avoiding breaking tons of FFI code already out there.


Re: Storable laws

Evan Laforge
This is only roughly related, but long ago I completely replaced
Storable in my own hsc2hs-using code with CStorable, which is a copy
of Storable, except having only the instances that I choose.
Specifically, it has instances for CChar but not Char, for CBool but
not Bool, etc. because it's too dangerous to silently allow these
memory-corrupting instances.  I think reusing Storable for C
serialization was a mistake.  Replacing Storable was pretty easy
though, just copy-paste the 200-line Foreign module and change the
class name.
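
The idea can be sketched as follows. This is not the actual module described above; the class CStorable here, its method names, and roundTripCInt are assumptions made for illustration, and the instances simply delegate to the existing Storable instances of the C types.

```haskell
import Foreign.C.Types (CChar, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import qualified Foreign.Storable as S

-- | Hypothetical stand-in for a Storable copy with hand-picked instances.
class CStorable a where
  cSizeOf    :: a -> Int
  cAlignment :: a -> Int
  cPeek      :: Ptr a -> IO a
  cPoke      :: Ptr a -> a -> IO ()

-- Instances exist only for C types. Char and Bool deliberately have none,
-- so using cPoke on a Ptr Char is rejected at compile time.
instance CStorable CChar where
  cSizeOf    = S.sizeOf
  cAlignment = S.alignment
  cPeek      = S.peek
  cPoke      = S.poke

instance CStorable CInt where
  cSizeOf    = S.sizeOf
  cAlignment = S.alignment
  cPeek      = S.peek
  cPoke      = S.poke

-- | Write a CInt through the restricted class and read it back.
roundTripCInt :: CInt -> IO CInt
roundTripCInt x = alloca $ \p -> cPoke p x >> cPeek p
```

The safety comes purely from the missing instances: code that marshals a Haskell Char where C expects a char no longer type-checks.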

On Sun, Dec 17, 2017 at 11:02 AM, Sven Panne <[hidden email]> wrote:

> 2017-12-17 13:41 GMT+01:00 Henning Thielemann
> <[hidden email]>:
>>
>>
>> On Sun, 17 Dec 2017, Merijn Verstraaten wrote:
>>
>>> Tools like hsc2hs already pad the result of sizeOf to the appropriately
>>> aligned size (at least the ones I've used), so this is a non-issue for that
>>> scenario.
>>
>>
>> But that's tailored to C. If I understand Sven correctly, then this is not
>> valid for Storable.
>
>
> I'm not sure if I understand the concerns about hsc2hs here correctly, but
> anyway: There are intentionally no Storable instances for struct-like things
> in the FFI spec, but that doesn't rule out that one can define instances for
> a special purpose (e.g. native ABI compliant ones) on your own. hsc2hs,
> c2hs, GreenCard etc. are tools to help you with that (and much more). This
> doesn't contradict anything in the FFI spec, you just choose one of several
> possible semantics in your program for types which are not even mentioned in
> the spec. OTOH, as an author of a general library I would be very hesitant
> to define such instances, they might not be what the library user expects.
>
> Perhaps we really want to extend the FFI spec in such a way that only one
> kind of instance makes sense for a given type (including complex ones), e.g.
> by defining separate classes for the different purposes mentioned in this
> thread. But this would need some serious design thoughts to get this right
> and flexible enough while avoiding breaking tons of FFI code already out
> there.
>

Re: Storable laws

Henning Thielemann

On Sun, 17 Dec 2017, Evan Laforge wrote:

> This is only roughly related, but long ago I completely replaced
> Storable in my own hsc2hs-using code with CStorable, which is a copy
> of Storable, except having only the instances that I choose.
> Specifically, it has instances for CChar but not Char, for CBool but
> not Bool, etc. because it's too dangerous to silently allow these
> memory-corrupting instances.  I think reusing Storable for C
> serialization was a mistake.  Replacing Storable was pretty easy
> though, just copy-paste the 200-line Foreign module and change the
> class name.

Would you mind moving this class to a public package?

Re: Storable laws

Evan Laforge
On Sun, Dec 17, 2017 at 11:55 PM, Henning Thielemann
<[hidden email]> wrote:

> On Sun, 17 Dec 2017, Evan Laforge wrote:
>> This is only roughly related, but long ago I completely replaced
>> Storable in my own hsc2hs-using code with CStorable, which is a copy
>> of Storable, except having only the instances that I choose.
>> Specifically, it has instances for CChar but not Char, for CBool but
>> not Bool, etc. because it's too dangerous to silently allow these
>> memory-corrupting instances.  I think reusing Storable for C
>> serialization was a mistake.  Replacing Storable was pretty easy
>> though, just copy-paste the 200-line Foreign module and change the
>> class name.
>
> Would you mind moving this class to a public package?

Done, after much delay: http://hackage.haskell.org/package/c-storable-0.2

That said, from the README:

If you are writing a new C binding, I recommend something higher-level than
hsc2hs, such as c2hs, which I think should sidestep the problem entirely by
verifying your types.  But if you are already using hsc2hs and for whatever
reason don't want subtle memory corruption bugs, you can import ForeignC
instead of Foreign and Foreign.C, and see if you have any.

Re: Storable laws

John Wiegley-2
>>>>> "EL" == Evan Laforge <[hidden email]> writes:

EL> If you are writing a new C binding, I recommend something higher-level
EL> than hsc2hs, such as c2hs, which I think should sidestep the problem
EL> entirely by verifying your types. But if you are already using hsc2hs and
EL> for whatever reason don't want subtle memory corruption bugs, you can
EL> import ForeignC instead of Foreign and Foreign.C, and see if you have any.

Note that c2hsc is even higher, since it will generate the .hsc files for you
from the C headers.

--
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

Re: Storable laws

Evan Laforge
On Wed, Mar 7, 2018 at 11:13 AM, John Wiegley <[hidden email]> wrote:

>>>>>> "EL" == Evan Laforge <[hidden email]> writes:
>
> EL> If you are writing a new C binding, I recommend something higher-level
> EL> than hsc2hs, such as c2hs, which I think should sidestep the problem
> EL> entirely by verifying your types. But if you are already using hsc2hs and
> EL> for whatever reason don't want subtle memory corruption bugs, you can
> EL> import ForeignC instead of Foreign and Foreign.C, and see if you have any.
>
> Note that c2hsc is even higher, since it will generate the .hsc files for you
> from the C headers.

So many C FFI tools.  Is there a survey somewhere?

That might have saved me some time long ago, when it seemed like the
only choices were green card and hsc2hs, and green card was already
somewhat obsolete looking.  But come to think of it, I don't even know
why...

Re: Storable laws

Niklas Hambüchen
On 07/03/2018 20.28, Evan Laforge wrote:
> So many C FFI tools.  Is there a survey somewhere?

And there's also `inline-c`, which needs no preprocessor but has a
larger set of Haskell deps instead.