How to write a pure String to String function in Haskell FFI to C++


Ting Lei-2
Hi,
 
I want to implement a function in C++ via Haskell FFI, which should have the (final) type of String -> String.
Say, is it possible to re-implement the following function in C++ with the exact same signature?
 
import Data.Char
toUppers :: String -> String
toUppers s = map toUpper s
 
In particular, I want to avoid having an IO in the return type, because introducing the impurity
(by that I mean the IO monad) for this simple task is logically unnecessary. All examples involving
a C string I have seen so far return an IO something or a Ptr, which cannot be converted back to a pure String.
 
The reason I want to do this is that I have the impression that marshaling is not easy with FFI. Maybe
if I can fix the simplest case above (other than primitive types such as int), then I can do whatever data
parsing I want on the C++ side, which should be easy, practically.
 
The cost of parsing is negligible compared to the computation that I want to do between
the marshalling to/from strings.
 
Thanks in advance.
 

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: How to write a pure String to String function in Haskell FFI to C++

Brandon Allbery
On Sun, Jun 2, 2013 at 7:22 PM, Ting Lei <[hidden email]> wrote:
> In particular, I wanted to avoid having an IO in the return type because introducing the impurity
> (by that I mean the IO monad) for this simple task is logically unnecessary. All examples involving

Anything that comes into or goes out of a Haskell program is in IO, period. If you have an FFI function which is guaranteed to not change anything but its parameters and those only in a pure way, then you can use unsafeLocalState to "hide" the IO; but claiming that when it's not true can lead to problems ranging from incorrect results to core dumps, so don't try to lie about it.
 
> a C string I have seen so far involve returning an IO something or Ptr which cannot be converted back to a pure String.

Haskell String-s are *not* C strings. Not even slightly. C cannot work with Haskell's String type directly at all. Some kind of marshaling is absolutely necessary; there are functions in Foreign.C.String that will marshal Haskell String-s to and from C strings.

(String is a linked list of Char, which is also not a C char; it is a constructor and a machine word large enough to hold a Unicode codepoint. And because Haskell is non-strict, any part of that linked list can be an unevaluated thunk which requires forcing the evaluation of arbitrary Haskell code elsewhere to "reify" the value; this obviously cannot be done in the middle of random C code, so it must be done during marshaling.)
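
To make the marshaling step concrete, here is a minimal sketch using withCString and peekCString from Foreign.C.String. The name roundTrip is made up for illustration, and no C code is called; the point is only that the whole list is evaluated and copied into a NUL-terminated buffer before C could see it, so even a lazily produced String can be handed over.

```haskell
import Foreign.C.String (peekCString, withCString)

-- Hypothetical helper: copy a String into a temporary C buffer and read it
-- back. withCString forces the entire list while filling the buffer.
roundTrip :: String -> IO String
roundTrip s = withCString s peekCString

main :: IO ()
main = do
  -- The input is a lazily generated list; marshaling evaluates all of it.
  out <- roundTrip (take 6 (cycle "ab"))
  putStrLn out
```

Note that the result is still in IO at this point; whether it is legitimate to hide that IO depends entirely on what the C side does with the buffer.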

--
brandon s allbery kf8nh                               sine nomine associates
[hidden email]                                  [hidden email]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net


Re: How to write a pure String to String function in Haskell FFI to C++

Thomas Davie

On 2 Jun 2013, at 16:48, Brandon Allbery <[hidden email]> wrote:

> (String is a linked list of Char, which is also not a C char; it is a constructor and a machine word large enough to hold a Unicode codepoint. And because Haskell is non-strict, any part of that linked list can be an unevaluated thunk which requires forcing the evaluation of arbitrary Haskell code elsewhere to "reify" the value; this obviously cannot be done in the middle of random C code, so it must be done during marshalling.)

I'm not convinced that that's "obvious" – though it certainly requires functions (that go through the FFI) to grab each character at a time.

Thanks

Tom Davie



Re: How to write a pure String to String function in Haskell FFI to C++

Brandon Allbery
On Sun, Jun 2, 2013 at 8:01 PM, Thomas Davie <[hidden email]> wrote:
> I'm not convinced that that's "obvious" – though it certainly requires functions (that go through the FFI) to grab each character at a time.

I think you underestimate the complexity of the Haskell runtime and the interactions between it and the FFI. Admittedly it is probably not "obvious" in the sense of "anyone can tell without knowing anything about it that it can't possibly work", but it should be at least somewhat obvious to someone who sees why there needs to be an FFI in the first place that the situation is not trivial, and that they probably should not blindly assume that the only reason one can't just pass Haskell values directly to C is that some GHC developer was feeling lazy at the time.


Re: How to write a pure String to String function in Haskell FFI to C++

Ting Lei-2
Thanks for your answers so far.

It seems that the laziness of String, or [Char], is the problem.

My question then boils down to this. There are plenty of Haskell FFI examples where simple things like sin/cos in <math.h> can be imported into Haskell as pure functions. Is there a way to extend that to String without introducing an IO, perhaps at the cost of laziness?
If String has to be lazy, is there another Haskell data type convertible to String that can do the job?

The C++/C function (e.g. toUppers) is computation-only and as pure as cos and tan. The fact that marshaling a string incurs an IO monad in current examples is unintuitive and feels like a design bug. I don't mind making redundant copies under the hood from one type to another.


Re: How to write a pure String to String function in Haskell FFI to C++

Chris Wong
> The C++/C function (e.g. toUppers) is computation-only and as pure as cos
> and tan. The fact that marshaling string incurs an IO monad in current
> examples is kind of unintuitive and like a bug in design. I don't mind
> making redundant copies under the hood from one type to another..

If you can guarantee that the call is pure, then you can execute it
directly using `unsafePerformIO`. Simply call the external function as
usual, then invoke `unsafePerformIO` on the result.

See <http://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/System-IO-Unsafe.html>.
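
As a minimal sketch of that recipe, the following wraps the C library's strlen, which only reads its argument, so hiding the IO seems defensible here. The names c_strlen and cLength are made up for illustration; the length returned is a byte count in whatever encoding withCString uses, which matches the character count only for ASCII input.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize(..))
import System.IO.Unsafe (unsafePerformIO)

-- Imported in IO "as usual": the call takes a pointer to a buffer.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical pure wrapper: marshal, call, and hide the IO.
-- This is only sound because strlen has no observable side effects.
cLength :: String -> Int
cLength s = unsafePerformIO $
  withCString s $ \p -> fromIntegral <$> c_strlen p

main :: IO ()
main = print (cLength "haskell-cafe")
```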

On another note, if you really care about performance, you should use
the `bytestring` and `text` packages instead of String. They are
implemented in terms of byte arrays, instead of linked lists, hence
are both faster and more FFI-friendly.

--
Chris Wong, fixpoint conjurer
  e: [hidden email]
  w: http://lfairy.github.io/


Re: How to write a pure String to String function in Haskell FFI to C++

Carter Schonwald
As the others have said, if you want to have text data go between GHC and C++, please use Text or ByteString.

String... would get weird.

If you seriously want to experiment with writing low-level code that manipulates the String type, it *MIGHT* be possible using GHC's C-- (Cmm). This would be very subtle to do correctly, and also really complicated and hard.

Likewise, for writing a "pure"-looking FFI function, a good example is in the lz4hs lib, where all the allocation occurs on the Haskell side and the FFI call only mutates freshly allocated memory. Subject to this, unsafePerformIO can be safely used to give a pure, thread-safe API.
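
The pattern in that last paragraph can be sketched roughly as follows. This is not taken from lz4hs; it is an illustration that assumes the bytestring package and uses the C library's memcpy as a stand-in for a real C/C++ routine. The name copyViaC is hypothetical. unsafeCreate (from Data.ByteString.Internal) allocates the output buffer on the Haskell side and wraps the filling action in an unsafe-perform-IO internally, so the C code only ever writes into fresh memory that nothing else can observe yet.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Monad (void)
import qualified Data.ByteString.Char8 as B
import Data.ByteString.Internal (unsafeCreate)
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Data.Word (Word8)
import Foreign.C.Types (CChar, CSize(..))
import Foreign.Ptr (Ptr)

-- Stand-in for a real C routine that fills a destination buffer.
foreign import ccall unsafe "string.h memcpy"
  c_memcpy :: Ptr Word8 -> Ptr CChar -> CSize -> IO (Ptr Word8)

-- Hypothetical pure wrapper: Haskell allocates, C writes, result is frozen.
copyViaC :: B.ByteString -> B.ByteString
copyViaC bs
  | B.null bs = B.empty  -- avoid calling memcpy on an empty buffer
  | otherwise =
      unsafeCreate (B.length bs) $ \dst ->
        unsafeUseAsCStringLen bs $ \(src, len) ->
          void (c_memcpy dst src (fromIntegral len))

main :: IO ()
main = B.putStrLn (copyViaC (B.pack "hello from C"))
```

The input is only read and the output is not shared until the C call has returned, which is what makes the pure type signature reasonable under these assumptions.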

cheers
-Carter


Re: How to write a pure String to String function in Haskell FFI to C++

Adam Vogt
In reply to this post by Ting Lei-2
On Sun, Jun 2, 2013 at 10:19 PM, Ting Lei <[hidden email]> wrote:

> Thanks for your answers so far.
>
> It seems that the laziness of String or [char] is the problem.
>
> My question boils then down to this. There are plenty of Haskell FFI
> examples where simple things like sin/cos in <math.h> can be imported into
> Haskell as pure functions. Is there a way to extend that to String without
> introducing an IO (), but maybe sacrificing laziness?
> If String has to be lazy, is there another Haskell data type convertible to
> String that can do the job?
>
> The C++/C function (e.g. toUppers) is computation-only and as pure as cos
> and tan. The fact that marshaling string incurs an IO monad in current
> examples is kind of unintuitive and like a bug in design. I don't mind
> making redundant copies under the hood from one type to another..

Hi Ting,

In Foreign.C.String there are functions that convert a String to and from
an array (CString = Ptr CChar) which can be handled on the C side:

withCString :: String -> (CString -> IO a) -> IO a

peekCString :: CString -> IO String

It's slightly more convenient to use these functions through the
preprocessor c2hs, as in the following example
<http://code.haskell.org/~aavogt/c_toUpper_ffi_ex/>. c2hs also has a
'pure' keyword which makes it add the unsafePerformIO, but for
whatever reason the side-effects were not done in the right order (the
peekCString happened before the foreign function was called).
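
For comparison, a hand-written version without c2hs might look like the sketch below. It is not the code from the linked example. It uses withCStringLen and peekCStringLen (the length-carrying variants of the two functions above) and, so that it runs without a separate C++ file, calls the C library's toupper on each byte; a real C++ toUppers taking a char* would replace the loop. The names c_toupper and toUppers are illustrative. Two assumptions are worth stating: only ASCII bytes are touched, and toupper is treated as pure even though it depends on the C locale.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Monad (forM_, when)
import Foreign.C.String (peekCStringLen, withCStringLen)
import Foreign.C.Types (CInt(..))
import Foreign.Storable (peekElemOff, pokeElemOff)
import System.IO.Unsafe (unsafePerformIO)

-- Imported without IO, in the same way sin and cos usually are.
-- Assumption: the locale is not changed in a way that affects ASCII letters.
foreign import ccall unsafe "ctype.h toupper"
  c_toupper :: CInt -> CInt

toUppers :: String -> String
toUppers s = unsafePerformIO $
  withCStringLen s $ \(buf, len) -> do
    -- Mutate the temporary buffer in place, one byte at a time.
    forM_ [0 .. len - 1] $ \i -> do
      c <- peekElemOff buf i
      let n = fromIntegral c :: Int
      -- Leave non-ASCII bytes alone so multi-byte encodings stay intact.
      when (n >= 0 && n < 128) $
        pokeElemOff buf i (fromIntegral (c_toupper (fromIntegral n)))
    -- Read the result only after all writes; the do-block fixes the order.
    peekCStringLen (buf, len)

main :: IO ()
main = putStrLn (toUppers "hello, ffi")
```

Because the marshaling, the foreign calls, and the final peek are sequenced inside a single IO action before unsafePerformIO is applied, the ordering problem described above should not arise in this form.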

Regards,
Adam
