I've wanted the following before:
    foreign import ccall unsafe "strlen"
      cstringLength# :: Addr# -> Int#

    cstringLength :: CString -> Int
    cstringLength (Ptr s) = I# (cstringLength# s)

A natural place for this seems to be Foreign.C.String.

Thoughts?

_______________________________________________
Libraries mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
Seems reasonable to me.
On Thu, 21 Jan 2021 at 03:55, chessai <[hidden email]> wrote:
On Wed, Jan 20, 2021 at 09:54:30AM -0800, chessai wrote:
> I've wanted the following before:
>
>     foreign import ccall unsafe "strlen"
>       cstringLength# :: Addr# -> Int#
>
>     cstringLength :: CString -> Int
>     cstringLength (Ptr s) = I# (cstringLength# s)
>
> A natural place for this seems to be Foreign.C.String.

Why a new FFI call, rather than `cstringLength#` from ghc-prim: GHC.CString (as of GHC 9.0.1):

    9.0.1-notes.rst: ``ghc-prim`` library
    9.0.1-notes.rst: ~~~~~~~~~~~~~~~~~~~~
    9.0.1-notes.rst:
    9.0.1-notes.rst: - Add a known-key ``cstringLength#`` to ``GHC.CString`` that is eligible
    9.0.1-notes.rst:   for constant folding by a built-in rule.

    ghc-prim/changelog.md: - Add known-key `cstringLength#` to `GHC.CString`. This is just the
    ghc-prim/changelog.md:   C function `strlen`, but a built-in rewrite rule allows GHC to
    ghc-prim/changelog.md:   compute the result at compile time when the argument is known.

    CString.hs: -- | Compute the length of a NUL-terminated string. This address
    CString.hs: -- must refer to immutable memory. GHC includes a built-in rule for
    CString.hs: -- constant folding when the argument is a statically-known literal.
    CString.hs: -- That is, a core-to-core pass reduces the expression
    CString.hs: -- @cstringLength# "hello"#@ to the constant @5#@.
    CString.hs: cstringLength# :: Addr# -> Int#
    CString.hs: {-# INLINE[0] cstringLength# #-}
    CString.hs: cstringLength# = c_strlen

Which is in turn re-exported by GHC.Exts:

    GHC/Exts.hs: -- * CString
    GHC/Exts.hs: unpackCString#,
    GHC/Exts.hs: unpackAppendCString#,
    GHC/Exts.hs: unpackFoldrCString#,
    GHC/Exts.hs: unpackCStringUtf8#,
    GHC/Exts.hs: unpackNBytes#,
    GHC/Exts.hs: cstringLength#,

It is perhaps somewhat disappointing that the cstringLength# optimisations for `bytestring` (in master) aren't included in the `bytestring` version in 9.0.1.

--
    Viktor.
I forgot about that addition. In that case we would just need the lifted wrapper.

On Wed, Jan 20, 2021, 17:01 Viktor Dukhovni <[hidden email]> wrote:
> On Jan 21, 2021, at 1:39 AM, chessai <[hidden email]> wrote:
>
> I forgot about that addition. In that case we would just need the lifted wrapper

No worries, sure the lifted wrapper makes sense, and Foreign.C.String does look like a reasonable place in which to define, and from which to export it.

--
    Viktor.
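
For illustration, the lifted wrapper discussed above might be sketched as follows. This is only a sketch, assuming GHC 9.0 or later so that `cstringLength#` is available from GHC.Exts; the names `literalLength` and `hello` are hypothetical and appear nowhere in the thread. Per the GHC.CString documentation quoted earlier, the address must refer to immutable memory, so the example applies the wrapper only to a primitive string literal.

```haskell
{-# LANGUAGE MagicHash #-}

import Foreign.C.String (CString)
import GHC.Exts (Int (I#), Ptr (Ptr), cstringLength#)

-- Hypothetical name: a lifted (boxed) wrapper around the unboxed primitive.
-- Only intended for pointers into immutable memory, e.g. string literals.
literalLength :: CString -> Int
literalLength (Ptr addr) = I# (cstringLength# addr)

-- Hypothetical example value: a CString backed by a primitive string literal.
hello :: CString
hello = Ptr "hello"#
```

Loaded into GHCi, `literalLength hello` evaluates to 5, in line with the `cstringLength# "hello"#` example from the documentation.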
Both the unboxed variant and the wrapper are only sound on primitive string literals. You cannot use them on anything that was allocated at runtime, only on stuff baked into the rodata section. This is a pretty onerous restriction. What use case did you have in mind?
That doesn't sound right. I don't think it allocates any data on the heap which could cause reallocation and move an unpinned ByteArray#, which is the only way I can think of in which it would be unsafe.

On Thu, Jan 21, 2021, 17:50 Andrew Martin <[hidden email]> wrote:
This is unsound:
    x <- malloc ...
    memcpy ... copy a nul-terminated string into x
    let len = cstringLength x
    free x

Because GHC can float the let binding down to where it is used after free.

On Jan 21, 2021, at 7:45 PM, Zemyla <[hidden email]> wrote:
andrew! this is a really good point.

would the with# or touch# combinators be needed to fix it (to force gc liveness)? OR would we need to have the foreign c call defined to have an -> IO result, then use unsafePerformIO to "purify it correctly"?

i think the best way to explain *why* the proposed definition runs into trouble is to look at how delicate/complicated prims are annotated in primops.

otoh, the last time i was playing with an ostensibly pure primop that had really delicate effect ordering (the prefetch stuff in the NCG), my conclusion was that it *needed* explicit state tokens to make sure it didn't get reordered, and for this primop the pure version would need to be via unsafePerformIO, i think.

On Fri, Jan 22, 2021 at 8:46 AM Andrew Martin <[hidden email]> wrote:
On Fri, Jan 22, 2021 at 08:45:54AM -0500, Andrew Martin wrote:
>     x <- malloc ...
>     memcpy ... copy a nul-terminated string into x
>     let len = cstringLength x
>     free x

Isn't this broadly true for general uses of CString? Which is why we have `withCString`:

    https://hackage.haskell.org/package/base-4.14.1.0/docs/Foreign-C-String.html#v:withCString

Is there anything particularly different about the proposed `cstringLength`? Are you suggesting that it should have an "IO Int" result type to force sequencing? Is this warranted? Shouldn't users of CString (Ptr CChar) already be aware of the liveness issue in general?

--
    Viktor.
> Are you suggesting that it should have an "IO Int" result type to force sequencing? Is this warranted?

Yes. This is warranted. That's why Foreign.Storable.peek has IO in its result type. On any CString with a finite lifetime, it is necessary to sequence any reads and writes, and IO is the way this is done in base.

By contrast, on a CString that is both immutable and has an infinite lifetime, we do not need to sequence reads. What kinds of CStrings fit the bill? Only those backed by primitive string literals. So, for example, if you have:

    myString :: CString
    myString = Ptr "foobar"#

Since myString is backed by something in the rodata section of a binary (meaning that it will never change and it will never be deallocated), we do not care if reads get floated around. There are no functions in base for unsequenced reads, but in primitive, you'll find Data.Primitive.Ptr.indexOffPtr, which is unsequenced. So something like this would be ok:

    someOctet :: Word8
    someOctet = Data.Primitive.Ptr.indexOffPtr myString 3

The cstringLength# in GHC.CString is similar to indexOffPtr. In fact, it could be implemented using indexOffPtr. The reason that cstringLength# exists (and in ghc-prim of all places) is so that a built-in rewrite rule can perform this transformation:

    cstringLength# "foobar"# ==> 6#

This will eventually be used to great effect in bytestring. See https://github.com/haskell/bytestring/pull/191.

To get back to the original question, I think that any user-facing cstringLength function should probably be:

    cstringLength :: CString -> IO Int

We need a separate FFI call that returns its result in IO to accomplish this. But this can just be done in base rather than ghc-prim. There are no interesting rewrite rules that exist for such a function.

On Fri, Jan 22, 2021 at 3:31 PM Viktor Dukhovni <[hidden email]> wrote:

--
-Andrew Thaddeus Martin
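
To make the sequencing argument concrete, here is a minimal sketch of the malloc/free scenario with the length read performed in IO. The names `c_strlen` and `lengthThenFree` are hypothetical, and the binding uses `CSize` on the assumption that it matches C's `size_t` return type; it is not the definition proposed for base.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, newCString)
import Foreign.C.Types (CSize (..))
import Foreign.Marshal.Alloc (free)

-- Hypothetical binding: because the result is in IO, the call is an
-- ordinary IO action and cannot be floated past later actions such as free.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical helper: copy a String into malloc'd memory, measure it,
-- then release the buffer.
lengthThenFree :: String -> IO Int
lengthThenFree s = do
  p <- newCString s   -- malloc'd, NUL-terminated copy of s
  n <- c_strlen p     -- runs here, while p is still allocated
  free p              -- safe: the length has already been read
  pure (fromIntegral n)
```

With a pure `CString -> Int` binding, the equivalent `let` could be evaluated lazily after `free p`; with the IO binding, the `strlen` call is sequenced before it.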
On Fri, Jan 22, 2021 at 04:56:33PM -0500, Andrew Martin wrote:
> This will eventually be used to great effect in bytestring. See
> https://github.com/haskell/bytestring/pull/191.

Yes, you might recall that I'm well aware of that (already merged) PR, indeed that's how I happened to recall that cstringLength# is present in 9.0.

> To get back to the original question, I think that any user-facing
> cstringLength function should probably be:
>
>     cstringLength :: CString -> IO Int
>
> We need a separate FFI call that returns its result in IO to
> accomplish this. But this can just be done in base rather than ghc-prim.
> There are no interesting rewrite rules that exist for such a function.

So I guess your suggestion in response to @chessai's original post:

>> On Wed, Jan 20, 2021 at 09:54:30AM -0800, chessai wrote:
>>
>> I've wanted the following before:
>>
>>     foreign import ccall unsafe "strlen"
>>       cstringLength# :: Addr# -> Int#
>>
>>     cstringLength :: CString -> Int
>>     cstringLength (Ptr s) = I# (cstringLength# s)
>>
>> A natural place for this seems to be Foreign.C.String.

would be to instead directly implement the lifted FFI variant:

    foreign import ccall unsafe "strlen"
      cstringLength :: CString -> IO Int

which probably would not need a wrapper and can be exported directly.

    module Main (main) where

    import Control.Monad ( (>=>) )
    import Foreign.C.String (CString, withCString)

    foreign import ccall unsafe "strlen"
      cstringLength :: CString -> IO Int

    main :: IO ()
    main = withCString "Hello, World!" $ cstringLength >=> print

The cost of this safety net is that it results in more sequencing than is strictly necessary. It is enough for the enclosing IO action to not embed the length in its result in some not yet fully evaluated thunk.

I guess @chessai can let us know whether the more strictly sequenced variant meets his needs.

--
    Viktor.
I agree with Andrew, let's just export the lifted FFI call.

This suits my needs, but, regardless of my needs, seems like a perfectly sensible addition to Foreign.C.String.

Concrete addition:

    foreign import unsafe "strlen"
      cstringLength :: CString -> IO Int

On Fri, Jan 22, 2021, 17:09 Viktor Dukhovni <[hidden email]> wrote:
I’m on board with this import, but we’ll need to get the type right if we’re going to bind to libc’s strlen directly:

    foreign import unsafe "strlen"
      cstringLength :: CString -> IO CSize
On Fri, Jan 22, 2021 at 06:07:22PM -0800, Eric Mertens wrote:
> I’m on board with this import, but we’ll need to get the type right if
> we’re going to bind to libc’s strlen directly
>
>     foreign import unsafe "strlen"
>       cstringLength :: CString -> IO CSize

Yes, definitely. The final all-nits-addressed variant would be:

    foreign import ccall unsafe "string.h strlen"
      cstringLength :: CString -> IO CSize

which differs from the example in section 8.4.3 of the Haskell 2010 report

    https://www.haskell.org/onlinereport/haskell2010/haskellch8.html#x15-1590008.4.3

    foreign import ccall "string.h strlen"
      cstrlen :: Ptr CChar -> IO CSize

only in the addition of "unsafe" and the name of the resulting function.

--
    Viktor.
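
As a usage sketch of the `CSize`-returning binding above: callers that want an `Int` would convert the result explicitly. The helper `byteLength` is a hypothetical name introduced here for illustration, not part of the proposal.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- The binding as given in the thread: strlen returns size_t, hence CSize.
foreign import ccall unsafe "string.h strlen"
  cstringLength :: CString -> IO CSize

-- Hypothetical helper: marshal a String to a temporary C string and return
-- its length in bytes as an Int. withCString keeps the buffer alive for the
-- duration of the callback, so the read happens before it is released.
byteLength :: String -> IO Int
byteLength s = withCString s (fmap fromIntegral . cstringLength)
```

For the ASCII string "Hello, World!" used earlier in the thread, `byteLength` yields 13. For non-ASCII input the result counts bytes in the locale encoding used by `withCString`, not characters.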