Quickest way to pass Text to C code

Yves Parès-3
Hello,

I have to interact with a C++ library that accepts as string types (putting c++ strings aside) pointers of wchar_t (CWString in Haskell) or unsigned 32-bit int (Ptr Word32 for UTF-32 codepoints).

I have read what text, bytestring and base provide, but Text can only be converted directly to a (Ptr Word16), and if I use encodeUtf32LE (or encodeUtf32BE) to get a ByteString, I only get useAsCString; there is no direct conversion to CWString or Ptr WordXX.
Not to mention the extra memory allocations caused by the intermediate conversions.

base provides Foreign.C.String.withCWString, but it requires that I either use plain Strings in the first place or (same problem as before) convert from Text to String before passing to C.
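For illustration, the String route described above could look like the minimal sketch below. The names withTextAsCWString and roundTrip are hypothetical helpers, not part of text or base; only T.unpack, T.pack, withCWString and peekCWString are library functions.

```haskell
import qualified Data.Text as T
import Foreign.C.String (CWString, peekCWString, withCWString)

-- Marshal a Text as a NUL-terminated wchar_t string for the duration of
-- the action. This goes through an intermediate String, so it allocates
-- more than a direct copy would.
withTextAsCWString :: T.Text -> (CWString -> IO a) -> IO a
withTextAsCWString t = withCWString (T.unpack t)

-- Hypothetical check: marshal to wchar_t* and read the string back.
roundTrip :: T.Text -> IO T.Text
roundTrip t = withTextAsCWString t (fmap T.pack . peekCWString)
```

The buffer is only valid inside the callback, so this fits C functions that copy the string or do not keep the pointer.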

Am I missing something, or is this kind of conversion simply not that easy?

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Quickest way to pass Text to C code

James Cook
On Mar 21, 2012, at 4:35 AM, Yves Parès wrote:

> Hello,
>
> I have to interact with a C++ library that accepts as string types (putting c++ strings aside) pointers of wchar_t (CWString in Haskell) or unsigned 32-bit int (Ptr Word32 for UTF-32 codepoints).

The vector package has "storable" vectors, which are essentially raw C arrays.  It provides the function:

        Data.Vector.Storable.unsafeWith :: Storable a => Vector a -> (Ptr a -> IO b) -> IO b

This is probably the simplest way to do what you're describing.  You can also manually allocate and poke data into raw memory using Foreign.Marshal.Alloc and Foreign.Storable, if you're feeling particularly masochistic ;)
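As a rough sketch of the storable-vector route, assuming the text and vector packages are available: textToCodePoints and withTextAsVector are hypothetical names, and building the vector from a list is chosen for clarity rather than speed.

```haskell
import Data.Char (ord)
import qualified Data.Text as T
import qualified Data.Vector.Storable as VS
import Data.Word (Word32)
import Foreign.Ptr (Ptr)

-- Copy the code points of a Text into a storable vector of Word32,
-- followed by a 0 terminator for the C side.
textToCodePoints :: T.Text -> VS.Vector Word32
textToCodePoints t =
  VS.fromList (map (fromIntegral . ord) (T.unpack t) ++ [0])

-- Run an action on a pointer to the vector's buffer. The pointer must
-- not be used after the action returns.
withTextAsVector :: T.Text -> (Ptr Word32 -> IO a) -> IO a
withTextAsVector = VS.unsafeWith . textToCodePoints
```

This still copies the whole string once, so it trades Text's own representation for a C-compatible one only at the call boundary.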

-- James

Re: Quickest way to pass Text to C code

Yves Parès-3
> You can also manually allocate and poke data into raw memory using Foreign.Marshal.Alloc and Foreign.Storable, if you're feeling particularly masochistic ;)
That's more or less what I did in the past (aggregating Word8s into a single Word32), before I discovered Text for fast string handling.

I know about storable Vectors (and already use them, but not for text), but on the Haskell side I would lose the functionality of Text (I'm handling textual data in the first place, not raw bytes).
Text already provides all the string-handling and file-reading functions.
Or do you have a convenient way to convert a Text into a Vector?


Re: Quickest way to pass Text to C code

Antoine Latter-2
In reply to this post by Yves Parès-3
On Wed, Mar 21, 2012 at 3:35 AM, Yves Parès <[hidden email]> wrote:

> Hello,
>
> I have to interact with a C++ library that accepts as string types (putting
> c++ strings aside) pointers of wchar_t (CWString in Haskell) or unsigned
> 32-bit int (Ptr Word32 for UTF-32 codepoints).
>
> I have read what text, bytestring and base provide, but Text can only be
> directly converted to (Ptr Word16), and if I use encodeUTF32 to get a
> ByteString, then I only get useAsCString, no direct conversion to CWString
> or Ptr WordXX is possible.

A CString is a (Ptr CChar). You can then use castPtr to get whichever
pointer type you need, if you believe the underlying buffer has the
representation you want (in this case, UTF-32).

It still won't be null-terminated, however.
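A small sketch of the castPtr idea, assuming a little-endian host so that the bytes produced by encodeUtf32LE can be read back directly as Word32 values. The name codePointsViaCast is hypothetical; it only reads the buffer back on the Haskell side to show what a C function would see.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf32LE)
import Data.Word (Word32)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)

-- Encode to UTF-32LE, view the CString as a Ptr Word32 via castPtr,
-- and read back exactly (byte length / 4) code points.
codePointsViaCast :: T.Text -> IO [Word32]
codePointsViaCast t =
  B.useAsCString bytes $ \cstr ->
    peekArray (B.length bytes `div` 4) (castPtr cstr)
  where
    bytes = encodeUtf32LE t
```

Note that useAsCString appends only a single NUL byte to its copy, which is not a full 32-bit terminator, so the length has to be passed separately or a '\0' code point added before encoding.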

Antoine


Re: Quickest way to pass Text to C code

Yves Parès-3
Okay, in the end it boils down to this:

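import Data.ByteString.Unsafe (unsafeUseAsCString)
import Data.Text (Text, snoc)
import Data.Text.Encoding (encodeUtf32LE)
import Data.Word (Word32)
import Foreign.Ptr (Ptr, castPtr)

-- Append a NUL code point, encode as UTF-32LE, and pass the buffer
-- (without copying it) to the action as a Ptr Word32.
textAsPtrW32 :: Text -> (Ptr Word32 -> IO a) -> IO a
textAsPtrW32 t = unsafeUseAsCString (encodeUtf32LE $ t `snoc` '\0') . (. castPtr)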

Since the C function I pass the pointer to copies the data, or at least does not keep the pointer, I can use unsafeUseAsCString; I just have to append the null terminator manually.
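To check the null termination, here is a hedged usage sketch. readUntilNul is a hypothetical stand-in for the C function: it reads Word32 values until it meets a 0, as C code expecting a terminated UTF-32 string would. It assumes a little-endian host, and a Text containing an embedded '\0' would appear truncated.

```haskell
import Data.ByteString.Unsafe (unsafeUseAsCString)
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf32LE)
import Data.Word (Word32)
import Foreign.Marshal.Array (peekArray0)
import Foreign.Ptr (Ptr, castPtr)

-- Same definition as in the message above, with Data.Text qualified.
textAsPtrW32 :: T.Text -> (Ptr Word32 -> IO a) -> IO a
textAsPtrW32 t =
  unsafeUseAsCString (encodeUtf32LE (T.snoc t '\0')) . (. castPtr)

-- Stand-in for the C side: read code points up to (not including) the
-- first 0 terminator.
readUntilNul :: Ptr Word32 -> IO [Word32]
readUntilNul = peekArray0 0
```

Calling textAsPtrW32 (T.pack "hi") readUntilNul should yield the two code points of "hi" under those assumptions.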

