standard poll/select interface

Bulat Ziganshin
Hello John,

Thursday, February 09, 2006, 3:19:30 AM, you wrote:

>> JM> If we had a good standard poll/select interface in System.IO then we
>> JM> actually could implement a lot of concurrency as a library with no
>> JM> (required) run-time overhead. I'd really like to see such a thing get
>> JM> into the standard. Well, mainly it would just be a really useful thing
>> JM> to have in general. If others think it is a good idea I can try to come
>> JM> up with a suitable API and submit it to the repo.
>>
>> i have delayed answering to this letter until i announced my Streams
>> library. now i can say that such API already exists - in terms of my
>> library you need just to write an transformer that intercepts
>> vGetBuf/vPutBuf calls and pass them to the select/poll machinery. so
>> you can write such transformer just now and every program that uses
>> Streams will benefit from its usage. Converting programs that use
>> Handles to using Streams should be also an easy task.

JM> I was actually asking for something much more modest, which was the
JM> routine needed to pass them to the select/poll machinery. but yeah, what
JM> you say is one of my expected uses of such a routine. Once a standard IO
JM> library settles down, then I can start working on the exact API such a
JM> routine would have.

but if everyone waits until the library settles down, that will never
happen :)  your work can change the design of the library, just as my
library itself can change the shape of haskell' :)  at this moment, i have
just developed a library that meets the demand for extending the current
I/O library with new features, such as Unicode support, high speed,
portability to other compilers, binary i/o, i/o for packed strings,
and asynchronous i/o using methods other than select(). i have not
actually implemented all of these features; i have only developed the
infrastructure into which they can easily be added.
unlike with the System.IO library, you don't need to ask someone to
implement new features or to patch someone else's sources. you
just need to develop a module that implements the standard Stream
interface, and then it can be used as easily as the transformers from the
library itself

as i understand this idea, a transformer implementing async i/o should
intercept vGetBuf/vPutBuf calls for the FDs, start the appropriate
async operation, and then switch to another Haskell thread. the I/O
manager thread should run select() in a loop and, when a request is
finished, wake up the appropriate thread. that's all. if you ever
need it, this implementation can then be used to extend GHC's System.IO
internals with support for new async i/o managers (as i
understand, select() is now supported by GHC, but poll() and kqueue() are
not?). the only difference is that my lib gives an opportunity
to test this implementation without modifying GHC's I/O internals, which
is somewhat simpler. so, the interface for the async vGetBuf/vPutBuf
routines should be the same as for read/write:

type FD = Int
vGetBuf_async :: FD -> Ptr a -> Int -> IO Int
vPutBuf_async :: FD -> Ptr a -> Int -> IO Int

i think that the implementations for ghc and jhc should be slightly
different, though, because of their different ways of implementing
multi-threading. but the I/O manager should be the same - it just
receives info about I/O operations to run and returns information
about completed ones.

... well, this I/O manager should implement just one operation:

performIO :: Request a -> IO ()

-- the buffer's element type must be a parameter of the synonym
type Request a = (IOType, FD, Ptr a, Int, Notifier)
data IOType = Read | Write | ...
type Notifier = Result -> IO ()
data Result = OK Int | Fail ErrorInfo

"performIO" starts new I/O operation. On the completion of this
operation, Notifier is called with information about results of
execution.

so, for the GHC the following should work:

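vGetBuf_async fd ptr size = do
    done <- newEmptyMVar                 -- must start empty so takeMVar blocks
    let notifier result = putMVar done result
    performIO (Read, fd, ptr, size, notifier)
    result <- takeMVar done              -- sleep until the I/O manager reports back
    case result of
        OK n   -> return n
        Fail _ -> ioError (userError "vGetBuf_async failed")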

for JHC, the body of "vGetBuf_async" may be different
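
The notifier scheme above can be tried out without a real select() loop.
The following is a minimal sketch, not code from the Streams library: the
Request record, the String error payload and asyncOp are hypothetical
names, and the stand-in performIO simply runs the operation on a helper
thread at the point where a real I/O manager would multiplex descriptors.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (IOException, try)

-- Shapes follow the post; the concrete fields are assumptions.
data IOType = Read | Write deriving (Eq, Show)
data Result = OK Int | Fail String deriving (Eq, Show)
type Notifier = Result -> IO ()

-- The post passes (FD, Ptr, size); here they are captured inside the
-- blocking action so that the sketch needs nothing beyond base.
data Request = Request IOType (IO Int) Notifier

-- Stand-in I/O manager: run the operation on another thread, then
-- report success or failure through the notifier.
performIO :: Request -> IO ()
performIO (Request _ op notify) = do
    _ <- forkIO $ do
        outcome <- try op
        notify (either (\e -> Fail (show (e :: IOException))) OK outcome)
    return ()

-- The calling thread sleeps on an initially empty MVar until notified.
asyncOp :: IOType -> IO Int -> IO Int
asyncOp ty op = do
    done <- newEmptyMVar
    performIO (Request ty op (putMVar done))
    result <- takeMVar done
    case result of
        OK n     -> return n
        Fail msg -> ioError (userError msg)
```

Only the calling Haskell thread blocks in takeMVar; other threads keep
running, which is the property the transformer is meant to provide.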

if you find this interface reasonable, at least for the first
iteration, i will develop the appropriate transformer, so all that remains
for you is "only" the implementation of "performIO"

>> of course, Streams library is not some standard just now, and moreover
>> - it is not compatible with JHC. the greatest problem is what i using
>> type classes extensions available in GHC/Hugs what is not in H98
>> standard. so, i'm interested in pushing Haskell' to accept most
>> advanced possible extensions in this area and, of course, in actual
>> implementing these extensions in the Haskell compilers. alternative
>> way to make Streams available to wider range of Haskell compilers is
>> to strip support of streams working in monads other that IO.

JM> Don't take the absence of a feature in jhc to mean I don't like or want
JM> that feature. There are a lot of things I don't have but that I'd
JM> definitly want to see in the language simply because I was only shooting
JM> for H98 to begin with and was more interested in a lot of the back end
JM> stuff. You should figure out the nicest design that uses just the
JM> extensions needed for the design you want. it could help us decide what
JM> goes into haskell-prime to know what is absolutely needed for good
JM> design and what is just nice to have.

this simply means that the Streams library cannot be used with JHC,
which is bad news, because it is even richer than GHC's System.IO.
jhc had a chance to get a modern I/O library, but it lost that chance :)

>> if you can make select/poll transformer, at least for testing
>> purposes, that will be really great.

JM> Yeah, I will look into this. the basic select/poll call will have to be
JM> pretty low level, but hopefully it will allow interesting higher level
JM> constructs based on your streams or an evolution of them.

please do. at the moment the Streams library lacks only a few important
features that are already implemented in GHC's System.IO: sockets, line
buffering and async i/o. moreover, i have no experience in
implementing async i/o, so outside help is really necessary

addressing these three issues will make it possible to propose the Streams
library as a possible System.IO replacement. and as you can see,
implementing "performIO" will allow us to use async i/o for all
possible i/o operations, including "get/put_" or vGetContents, for
example


--
Best regards,
 Bulat                            mailto:[hidden email]



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: standard poll/select interface

Einar Karttunen
On 09.02 22:24, Bulat Ziganshin wrote:

> as i understand this idea, transformer implementing async i/o should
> intercept vGetBuf/vPutBuf calls for the FDs, start the appropriate
> async operation, and then switch to another Haskell threads. the I/O
> manager thread should run select() in cycle and when the request is
> finished, wake up the appropriate thread. what's all. if you will ever
> need, this implementation can then be used to extend GHC's System.IO
> internals with the support for new async i/o managers (as i
> understand, select() is now supported by GHC, but poll(), kqueue() is
> not supported?). the only difference that my lib gives an opportunity
> to test this implementation without modifying GHC I/O internals, what
> is somewhat simpler. so, interface for async vGetBuf/vPutBuf routines
> should be the same as for read/write:
>
> type FD = Int
> vGetBuf_async :: FD -> Ptr a -> Int -> IO Int
> vPutBuf_async :: FD -> Ptr a -> Int -> IO Int

Please don't fix FD = Int, this is not true on some systems,
and when implementing efficient sockets one usually wants
to hold more complex state.

> JM> Don't take the absence of a feature in jhc to mean I don't like or want
> JM> that feature. There are a lot of things I don't have but that I'd
> JM> definitly want to see in the language simply because I was only shooting
> JM> for H98 to begin with and was more interested in a lot of the back end
> JM> stuff. You should figure out the nicest design that uses just the
> JM> extensions needed for the design you want. it could help us decide what
> JM> goes into haskell-prime to know what is absolutely needed for good
> JM> design and what is just nice to have.
>
> this simply means that the Streams library cannot be used with JHC,
> what is bad news, because it is even more rich than GHC's System.IO.
> jhc had chance to get modern I/O library. but it lost that chance :)

I think it is more like "all haskell-prime programs". Seriously,
if we design a new IO subsystem it would be quite nice to be
able to use it from standard conforming programs.

Maybe things can be reformulated in a way that will be compatible
with haskell-prime.

> please look. at this moment Sreams library lacks only a few important
> features, already implemented in GHC's System.IO: sockets, line
> buffering and async i/o. moreover, i don't have an experience in
> implementing the async i/o, so foreign help is really necessary

If you want I can look at getting network-alt to implement the
interface.

- Einar Karttunen

Re[2]: standard poll/select interface

Bulat Ziganshin
Hello Einar,

Friday, February 10, 2006, 2:09:08 AM, you wrote:

>> as i understand this idea, transformer implementing async i/o should
>> intercept vGetBuf/vPutBuf calls for the FDs, start the appropriate
>>
>> type FD = Int
>> vGetBuf_async :: FD -> Ptr a -> Int -> IO Int
>> vPutBuf_async :: FD -> Ptr a -> Int -> IO Int

EK> Please don't fix FD = Int, this is not true on some systems,
EK> and when implementing efficient sockets one usually wants
EK> to hold more complex state.

the heart of the library is the Stream class. both "File" and "Socket"
should implement this interface. right now i use a plain "FD" to
represent files, but that is a temporary solution - a file really
must also carry additional information: filename, open mode, open/closed
state. This "File" will be an abstract datatype, which need not be based
on an FD on other operating systems.

The same applies to "Socket". it can be any type that carries enough
information to work with network i/o.

the implementation of async i/o should take the form of a Stream
Transformer, which intercepts only the vGetBuf/vPutBuf operations and
passes the other operations through as is:

data AsyncFD = AsyncFD FD ... {-additional fields-}

instance Stream IO AsyncFD where
  vIsEOF (AsyncFD h ...) = vIsEOF h
  vClose (AsyncFD h ...) = vClose h
  ........
  vGetBuf (AsyncFD h ...) ptr size = vGetBuf_async h ptr size

as far as i can see, select/epoll don't need to know anything about a
file/socket except its descriptor (FD)? in that case we can make
the Async transformer universal, compatible with both files and
sockets:

data Async h = Async h ... {-additional fields-}

addAsyncIO h = do
  .....
  return (Async h ...)

instance (Stream IO h) => Stream IO (Async h) where
  vIsEOF (Async h ...) = vIsEOF h
  vClose (Async h ...) = vClose h
  ........
  vGetBuf (Async h ...) ptr size = doBlockingOp "read" h $ vGetBuf h ptr size

this transformer can be made universal, supporting select/epoll/...
implementations via an additional parameter to "addAsyncIO", or it
can be a series of transformers, one for each method of async i/o. if
we develop a common API for async i/o as John suggested, then one
universal transformer working via this API can be used
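
The transformer pattern described above can be sketched in a
self-contained form. This is only an illustration under stated
assumptions: the three-method Stream class below is a cut-down stand-in
for the real one, buffers are Strings rather than Ptr/size pairs,
MemStream is a hypothetical in-memory stream, and the "additional fields"
are reduced to a call counter marking the spot where a real transformer
would hand the request to an I/O manager.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleContexts, FlexibleInstances #-}
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- Cut-down stand-in for the Stream class (assumption, not the real API).
class Monad m => Stream m h where
    vIsEOF  :: h -> m Bool
    vClose  :: h -> m ()
    vGetBuf :: h -> Int -> m String

-- Hypothetical raw stream: unread characters held in an IORef.
newtype MemStream = MemStream (IORef String)

instance Stream IO MemStream where
    vIsEOF (MemStream r) = null <$> readIORef r
    vClose (MemStream r) = writeIORef r ""
    vGetBuf (MemStream r) n = do
        pending <- readIORef r
        let (chunk, rest) = splitAt n pending
        writeIORef r rest
        return chunk

-- The transformer: any stream plus extra per-stream state.
data Async h = Async h (IORef Int)

addAsyncIO :: h -> IO (Async h)
addAsyncIO h = Async h <$> newIORef 0

instance Stream IO h => Stream IO (Async h) where
    vIsEOF (Async h _) = vIsEOF h          -- passed through unchanged
    vClose (Async h _) = vClose h          -- passed through unchanged
    vGetBuf (Async h calls) n = do
        modifyIORef' calls (+ 1)           -- intercepted: async hook goes here
        vGetBuf h n
```

Because the instance is written against any underlying Stream, the same
wrapper applies to files, sockets or other transformers without touching
their definitions.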


>> this simply means that the Streams library cannot be used with JHC,
>> what is bad news, because it is even more rich than GHC's System.IO.
>> jhc had chance to get modern I/O library. but it lost that chance :)

EK> I think it is more like "all haskell-prime programs". Seriously,
EK> if we design a new IO subsystem it would be quite nice to be
EK> able to use it from standard conforming programs.

EK> Maybe things can be reformulated in a way that will be compatible
EK> with haskell-prime.

or haskell-prime can be reformulated ;)  once h' is defined in its
first iteration, i will check my lib and tell the committee what i would
need to omit from my library to be compatible with this standard.
then we can decide :)  at the current moment, support for complex
class hierarchies outside of Hugs/GHC is very poor

>> please look. at this moment Sreams library lacks only a few important
>> features, already implemented in GHC's System.IO: sockets, line
>> buffering and async i/o. moreover, i don't have an experience in
>> implementing the async i/o, so foreign help is really necessary

EK> If you want I can look at getting network-alt to implement the
EK> interface.

please do. basically, Sock should be made an instance of Stream, with
implementations of vGetBuf/vPutBuf and as many of the other operations as
possible. it should be easy; see the FD/Handle instances of Stream for an
example

and your support for select/poll should go into the transformer(s). this
will allow async i/o to be used not only for your own Sock type, but also
for files, for the sockets from the old library and so on


--
Best regards,
 Bulat                            mailto:[hidden email]




Re: standard poll/select interface

Bulat Ziganshin
In reply to this post by Bulat Ziganshin
Hello Bulat,

Thursday, February 09, 2006, 10:24:59 PM, you wrote:

>>> if you can make select/poll transformer, at least for testing
>>> purposes, that will be really great.

JM>> Yeah, I will look into this. the basic select/poll call will have to be
JM>> pretty low level, but hopefully it will allow interesting higher level
JM>> constructs based on your streams or an evolution of them.

sorry, John, as i now see, Einar has already implemented the select/epoll
machinery in the alt-network lib. moreover, he has now promised to extract
this functionality into a universal async i/o layer library. the only
thing i don't know is whether he is ready to develop a universal API
for these modules, as you initially proposed.

once this universal API is done, i will roll out a Stream
transformer that uses it and therefore allows async i/o with both files
and sockets on any platform where this API can be implemented

--
Best regards,
 Bulat                            mailto:[hidden email]




Re: standard poll/select interface

Simon Marlow-5
In reply to this post by Bulat Ziganshin
Bulat Ziganshin wrote:

> Hello Einar,
>
> Friday, February 10, 2006, 2:09:08 AM, you wrote:
>
>
>>>as i understand this idea, transformer implementing async i/o should
>>>intercept vGetBuf/vPutBuf calls for the FDs, start the appropriate
>>>
>>>type FD = Int
>>>vGetBuf_async :: FD -> Ptr a -> Int -> IO Int
>>>vPutBuf_async :: FD -> Ptr a -> Int -> IO Int
>
>
> EK> Please don't fix FD = Int, this is not true on some systems,
> EK> and when implementing efficient sockets one usually wants
> EK> to hold more complex state.
>
> the heart of the library is class Stream. both "File" and "Socket"
> should implement this interface. just now i use plain "FD" to
> represent files, but that is temporary solution - really file also
> must carry additional information: filename, open mode, open/closed
> state. This "File" will be an abstract datatype, what can be based not
> on FD in other operating systems.
>
> The same applies to the "Socket". it can be any type what carry enough
> information to work with network i/o.
>
> implementation of async i/o should have a form of Stream Transformer,
> which intercepts only the vGetBuf/vPutBuf operations and pass other
> operations as is:

I don't think async I/O is a stream transformer, fitting it into the
stream hierarchy seems artificial to me.

It is just another way of doing I/O directly to/from file descriptors.
If your basic operation to read from an FD is

   readFD :: FD -> Int -> Ptr Word8 -> IO Int

then an async I/O layer simply provides you with the exact same
interface, but with an implementation that doesn't block other threads.
It is part of the file descriptor interface, not a stream transformer.
Also, you probably need

   readNonBlockingFD :: FD -> Int -> Ptr Word8 -> IO Int
   isReadyFD :: FD -> IO Bool

in fact, I think this should be the basic API, since you can implement
readFD in terms of it.  (readNonBlockingFD always reads at least one
byte, blocking until some data is available).  This is used to partially
fill an input buffer with the available data, for example.
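
The claim that readFD can be built on the partial-read primitive can be
sketched as a loop. This is an illustration only: readFullyWith,
ReadSome, chunkedSource and demo are hypothetical names, the primitive is
passed in as a function so that no real file descriptor is needed, and a
return value of 0 or less is assumed to mean end of input.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr, plusPtr)

-- Same argument order as readNonBlockingFD, minus the descriptor:
-- read up to 'size' bytes into the buffer, return how many arrived.
type ReadSome = Int -> Ptr Word8 -> IO Int

-- Greedy read built on the partial one: keep asking for the remainder
-- until the buffer is full or the source reports end of input.
readFullyWith :: ReadSome -> Int -> Ptr Word8 -> IO Int
readFullyWith readSome size buf = go 0
  where
    go got
        | got >= size = return got
        | otherwise = do
            n <- readSome (size - got) (buf `plusPtr` got)
            if n <= 0 then return got else go (got + n)

-- Toy source that hands out at most 3 bytes per call.
chunkedSource :: IORef [Word8] -> ReadSome
chunkedSource ref size buf = do
    pending <- readIORef ref
    let (now, later) = splitAt (min 3 size) pending
    pokeArray buf now
    writeIORef ref later
    return (length now)

-- Fill an 8-byte buffer from a 10-byte source in several partial reads.
demo :: IO (Int, [Word8])
demo = do
    ref <- newIORef [1 .. 10]
    allocaBytes 8 $ \buf -> do
        n <- readFullyWith (chunkedSource ref) 8 buf
        bytes <- peekArray n buf
        return (n, bytes)
```

The reverse direction does not work: a primitive that always fills the
buffer cannot express "give me whatever is available", which is why the
partial read is proposed as the basic operation.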

One problem here is that in order to implement readNonBlockingFD on Unix
you have to put the FD into O_NONBLOCK mode, which due to misdesign of
the Unix API affects other users of the same file descriptor, including
other programs.  GHC suffers from this problem.

Cheers,
        Simon

Re[2]: standard poll/select interface

Bulat Ziganshin
Hello Simon,

Friday, February 10, 2006, 3:26:30 PM, you wrote:

>>>>as i understand this idea, transformer implementing async i/o should
>>>>intercept vGetBuf/vPutBuf calls for the FDs, start the appropriate
>>>>
>>>>type FD = Int
>>>>vGetBuf_async :: FD -> Ptr a -> Int -> IO Int
>>>>vPutBuf_async :: FD -> Ptr a -> Int -> IO Int
>>
>>
>> EK> Please don't fix FD = Int, this is not true on some systems,
>> EK> and when implementing efficient sockets one usually wants
>> EK> to hold more complex state.
>>
>> the heart of the library is class Stream. both "File" and "Socket"
>> should implement this interface. just now i use plain "FD" to
>> represent files, but that is temporary solution - really file also
>> must carry additional information: filename, open mode, open/closed
>> state. This "File" will be an abstract datatype, what can be based not
>> on FD in other operating systems.
>>
>> The same applies to the "Socket". it can be any type what carry enough
>> information to work with network i/o.
>>
>> implementation of async i/o should have a form of Stream Transformer,
>> which intercepts only the vGetBuf/vPutBuf operations and pass other
>> operations as is:

SM> I don't think async I/O is a stream transformer, fitting it into the
SM> stream hierarchy seems artificial to me.

yes, it is possible that i'm trying to implement everything as a
transformer, regardless of real necessity. i really think that the
idea of transformers fits every need in extending functionality

here is a list of my reasons to implement this as a transformer:

1) there is no "common FD" interface. module System.FD implements
something, but it is really an interface only for file i/o. it's used
partially in System.MMFile, which implements memory-mapped files, and i
think these fd* operations will be used to partially implement Socket
operations, but some things will be different, including using recv/send
instead of read/write to implement the GetBuf/PutBuf operations. so, there
is no common "instance Stream FD", but different instances for files,
memory-mapped files and sockets. As Einar just mentioned, the Socket
datatype will include information that is absent from the File datatype.
So, these 3 types have in common that they use an FD to implement some of
their operations, but some operations will be different and the internal
datatype structures will be different. A transformer is an ideal way to
reimplement just the vGetBuf/vPutBuf operations while passing through all
the rest. Without it, instead of 3 methods of doing I/O (mmap/read/recv)
you will need to implement all 5
(mmap/read/recv/readAsync/recvAsync) - and that's without counting
select, epoll and kqueue separately

2) as you can see in the epoll()-based implementation of async i/o in the
alt-network library, Einar attaches additional data (read/write
queues) to the FD to support the epoll() interface. This data will be
different for select, epoll, kqueue and other methods of async i/o, and
without async i/o no such information is needed at all. A transformer
is an ideal way to attach additional data to the file/socket without
changing the "raw" datatype. Again, otherwise you would need to attach
all this data to the raw file, duplicate this work for the raw
socket and then repeat this for select, epoll and other async i/o
methods

on the other side, the reasons for your proposal, as i see them:

1) if FD incorporates async i/o support, the System.FD library
will become much more useful - anyone using the low-level fd* functions
will get async i/o support for free

but there is another deficiency in the System.FD library - it doesn't
include support for files >4Gb and files with unicode filenames
under Windows. it seems natural to include this support in fd* too.

now let's see. you are proposing to include in the fd* implementation
support for files, sockets, various async i/o methods and who knows what
else. don't you think that this library will become a successor to the
Handle library, implementing all possible functionality and not
giving 3rd-party libraries a chance to change anything partially?

i propose instead to divide the library into small manageable pieces
that can be easily studied/modified/replaced and that provide something
really useful only when used together. if that means that the low-level
fd* interface can't be used even to work with raw files without great
restrictions (no Unicode filenames in windows, no async i/o), then so
be it.


SM> It is just another way of doing I/O directly to/from file descriptors.
SM> If your basic operation to read from an FD is

SM>    readFD :: FD -> Int -> Ptr Word8 -> IO Int

SM> then an async I/O layer simply provides you with the exact same
SM> interface, but with an implementation that doesn't block other threads.
SM>   It is part of the file descriptor interface, not a stream transformer.
SM>   Also, you probably need

SM>    readNonBlockingFD :: FD -> Int -> Ptr Word8 -> IO Int
SM>    isReadyFD :: FD -> IO Bool

SM> in fact, I think this should be the basic API, since you can implement
SM> readFD in terms of it.  (readNonBlockingFD always reads at least one
SM> byte, blocking until some data is available).  This is used to partially
SM> fill an input buffer with the available data, for example.

this can be in the basic API, but not in the basic implementation :)))
really, i think that you are mixing two things - a readNonBlockingFD call
that may fill the buffer only partially, and a readAsync call that uses
some I/O manager to run other Haskell threads while the data is being read

well, i agree that there should be two GetBuf variants in the Stream
interface - greedy and non-greedy. say, vGetBuf and
vGetBufNonBlocking. does vPutBuf also need two variants?

then, maybe LineBuffering and BlockBuffering should use
vGetBufNonBlocking and vGetBuf, respectively?

but i don't know anything about the implementation. is the only difference
between the readNonBlockingFD and readFD calls the O_NONBLOCK mode
of the file handle, or are different functions used? what about Windows?
what about sockets? how does this interact with async i/o?

SM> One problem here is that in order to implement readNonBlockingFD on Unix
SM> you have to put the FD into O_NONBLOCK mode, which due to misdesign of
SM> the Unix API affects other users of the same file descriptor, including
SM> other programs.  GHC suffers from this problem.

does that mean it is better to decide at the "open" stage whether a
file will be used with readNonBlockingFD or with plain readFD?


--
Best regards,
 Bulat                            mailto:[hidden email]




Re: Re: standard poll/select interface

John Meacham
In reply to this post by Simon Marlow-5
On Fri, Feb 10, 2006 at 12:26:30PM +0000, Simon Marlow wrote:
> in fact, I think this should be the basic API, since you can implement
> readFD in terms of it.  (readNonBlockingFD always reads at least one
> byte, blocking until some data is available).  This is used to partially
> fill an input buffer with the available data, for example.

this is the behavior of standard file descriptors, not non-blocking
ones. We should definitely not guarantee that reads fill an input buffer
fully, at least for the lowest level calls; that is the job for the
layers on top of it.

>
> One problem here is that in order to implement readNonBlockingFD on Unix
> you have to put the FD into O_NONBLOCK mode, which due to misdesign of
> the Unix API affects other users of the same file descriptor, including
> other programs.  GHC suffers from this problem.

non-blocking ones will return immediately if no data is available rather
than make sure they return at least one byte.

In any case, the correct solution in the circumstances is to provide a
select/poll/epoll/devpoll interface. It is nicer than setting
O_NONBLOCK and more efficient. This is largely orthogonal to the
Streams design though.

        John

--
John Meacham - ⑆repetae.net⑆john⑈

Re: standard poll/select interface

Simon Marlow-5
John Meacham wrote:

> On Fri, Feb 10, 2006 at 12:26:30PM +0000, Simon Marlow wrote:
>
>>in fact, I think this should be the basic API, since you can implement
>>readFD in terms of it.  (readNonBlockingFD always reads at least one
>>byte, blocking until some data is available).  This is used to partially
>>fill an input buffer with the available data, for example.
>
>
> this is the behavior of standard file descriptors. not non-blocking
> ones. We should definitly not guarentee reads fill an input buffer
> fully at least for the lowest level calls, that is the job for the
> layers on top of it.

You're right - I was slightly confused there.  O_NONBLOCK isn't
necessary to implement readNonBlockingFD.

>>One problem here is that in order to implement readNonBlockingFD on Unix
>>you have to put the FD into O_NONBLOCK mode, which due to misdesign of
>>the Unix API affects other users of the same file descriptor, including
>>other programs.  GHC suffers from this problem.
>
>
> non blocking ones will return immediatly if no data is available rather
> than make sure they return at least one byte.
>
> In any case, the correct solution in the circumstances is to provide a
> select/poll/epoll/devpoll interface. It is nicer than setting
> NON_BLOCKING and more efficient. This is largely orthogonal to the
> Streams design though.

I think the reason we set O_NONBLOCK is so that we don't have to test
with select() before reading, we can just call read().  If you don't use
O_NONBLOCK, you need two system calls to read/write instead of one.
This probably isn't a big deal, given that we're buffering anyway.

I agree that a generic select/poll interface would be nice.  If it was
in terms of Handles though, that's not useful for implementing the I/O
library.  If it was in terms of FDs, that's not portable - we'd need a
separate one for Windows.  How would you design it?

Cheers,
        Simon

Re: standard poll/select interface

Simon Marlow-5
In reply to this post by Bulat Ziganshin
Bulat Ziganshin wrote:

> SM> I don't think async I/O is a stream transformer, fitting it into the
> SM> stream hierarchy seems artificial to me.
>
> yes, it is possible - what i'm trying to implement everything as
> tranformer, independent of real necessity. i really thinks that
> idea of transformers fit every need in extending functionality
>
> it is a list of my reasons to implement this as transformer:
>
> 1) there is no "common FD" interface.

Well, there's the unix package.  In theory, System.IO should layer on
top of System.Posix or System.Win32, depending on the platform.  In
practice we extract the important bits of System.Posix and put them in
the base package to avoid circular dependencies.  The current
implementation could use some cleaning up here (eg. FD vs. Fd).

> on the other side, reasons for your proposal, as i see:
>
> 1) if FD will incorporate async i/o support, the System.FD library
> will become much more useful - anyone using low-level fd* functions
> will get async i/o support for free
>
> but there is another defeciency in the System.FD library - it doesn't
> include support for the files>4Gb

Yes it does!

> and files with unicode filenames
> under Windows.

Under Windows I believe we should be using a Win32-specific substrate on
which to build the I/O library.

> it seems natural to include this support in fd* too.
>
> now let's see. you are proposing to include in fd* implementation
> support for files, sockets, various async i/o methods and what's not
> all. are you not think that this library will become a successor of
> Handle library, implementing all possible fucntionality and don't
> giving 3rd-party libraries chances to change anything partially?

Not at all - I'm just suggesting that there should be an API to FD-based
I/O, and that concurrency-safety can be layered on top of this,
providing exactly the same API but with concurrency-safety built in.

> i think that you mix two things - readNonBlockingFD call that can fill
> buffer only partially and readAsync call that use some I/O manager to
> perform other Haskell threads while data are read

Why do you want to expose readAsync at all?

> well, i agree that should be two GetBuf variants in the Stream
> interface - greedy and non-greedy. say, vGetBuf and
> vGetBufNonBlocking. vPutBuf also need two variants?
>
> then, may be LineBuffering and BlockBuffering should use
> vGetBufNonBlocking and vGetBuf, respectively?
>
> but i don't know anything about implementation. is the difference
> between readNonBlockingFD and readFD calls only in the O_NONBLOCK mode
> of file handle, or different functions are used? what for Windows? for
> sockets? how this interacts with the async i/o?

Never mind about this - just assume readNonBlockingFD as your
lowest-level primitive, and we can provide an implementation of
readNonBlockingFD that uses select/poll/whatever underneath.  I imagine
we'll stop using O_NONBLOCK.  The Windows version will look different at
this level, because we should be using Win32 native I/O, i.e HANDLE
instead of FD, but it will have a primitive similar to
readNonBlockingFD, also concurrency-safe.

Cheers,
        Simon

Re: standard poll/select interface

John Meacham
In reply to this post by Simon Marlow-5
On Tue, Feb 21, 2006 at 01:15:48PM +0000, Simon Marlow wrote:
> I agree that a generic select/poll interface would be nice.  If it was
> in terms of Handles though, that's not useful for implementing the I/O
> library.  If it was in terms of FDs, that's not portable - we'd need a
> separate one for Windows.  How would you design it?

Yeah, this is why I have held off on a specific design until we get a
better idea of what the new IO library will look like. I am thinking it
will have to involve some abstract event source type with primitive
routines for creating this type from things like handles, fds, or
anything else we might want to wait on. So it is system-extensible in
the sense that implementations can just provide new event source
creation primitives.

The other advantage of this sort of thing is that you would want things
like the X11 library to be able to provide an event source for when an
X11 event is ready to be read so you can seamlessly integrate your X11
loop into your main one.

The X11 library would create such an event source from the underlying
socket but just return the abstract event source so the implementation
can change (perhaps when using a shared memory based system like D11 for
instance) without affecting how the user uses the library in a portable
way.
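
As a rough illustration only, an abstract event source along these lines
might be sketched as follows. Everything here (EventSource, fromMVar,
fromQueue, pollOnce) is a hypothetical name, the readiness check is a
one-shot test rather than a blocking wait, and the IORef queue merely
stands in for something like an X11 connection.

```haskell
import Control.Concurrent.MVar (MVar, isEmptyMVar, newEmptyMVar, putMVar)
import Data.IORef (IORef, newIORef, readIORef)

-- Hypothetical abstract event source. A real library would hide the
-- constructor so that only creation primitives can build one.
data EventSource = EventSource
  { sourceName :: String
  , isReady    :: IO Bool  -- non-blocking readiness check
  }

-- Creation primitive: an MVar is "ready" when it holds a value.
fromMVar :: String -> MVar a -> EventSource
fromMVar name mv = EventSource name (not <$> isEmptyMVar mv)

-- Creation primitive: a queue of pending events, standing in for what
-- a GUI library could export without revealing its socket.
fromQueue :: String -> IORef [a] -> EventSource
fromQueue name q = EventSource name (not . null <$> readIORef q)

-- Check a whole set of sources of different origins in one call and
-- return the names of those that are ready.
pollOnce :: [EventSource] -> IO [String]
pollOnce srcs = do
  flags <- mapM isReady srcs
  return [sourceName s | (s, True) <- zip srcs flags]

main :: IO ()
main = do
  mv <- newEmptyMVar :: IO (MVar Int)
  q  <- newIORef ["expose"]
  let sources = [fromMVar "mvar" mv, fromQueue "gui queue" q]
  pollOnce sources >>= print  -- ["gui queue"]
  putMVar mv 1
  pollOnce sources >>= print  -- ["mvar","gui queue"]
```

Because callers only ever see EventSource, the library behind fromQueue
could switch from a socket to shared memory without changing user code.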

I will try to come up with something concrete for us to look at that we
can modify as the rest of the IO library congeals.

        John

--
John Meacham - ⑆repetae.net⑆john⑈

Re: Re: standard poll/select interface

Donn Cave-2
On Tue, 21 Feb 2006, John Meacham wrote:

> Yeah, this is why I have held off on a specific design until we get a
> better idea of what the new IO library will look like. I am thinking it
> will have to involve some abstract event source type with primitive
> routines for creating this type from things like handles, fds, or
> anything else we might want to wait on. So it is system-extensible in
> the sense that implementations can just provide new event source
> creation primitives.
>
> The other advantage of this sort of thing is that you would want things
> like the X11 library to be able to provide an event source for when an
> X11 event is ready to be read so you can seamlessly integrate your X11
> loop into your main one.
>
> The X11 library would create such an event source from the underlying
> socket but just return the abstract event source so the implementation
> can change (perhaps when using a shared memory based system like D11 for
> instance) without affecting how the user uses the library in a portable
> way.

Could an application reasonably choose between several dispatching
systems?  For example, I'm working on a Macintosh here, where instead
of X11 Apple provides its NextStep based GUI with its own apparently
fairly well defined event system.  I don't know that system very well,
but a MacOS Haskell GUI application would probably want to look in
that direction for event integration.  Meanwhile, I might want to
work with kqueue, on the same platform, because it supports filesystem
events along with the usual select stuff.

        Donn Cave, [hidden email]


Re: Re: standard poll/select interface

Bulat Ziganshin-2
In reply to this post by John Meacham
Hello John,

Wednesday, February 22, 2006, 3:32:34 AM, you wrote:

>> I agree that a generic select/poll interface would be nice.  If it was
>> in terms of Handles though, that's not useful for implementing the I/O
>> library.  If it was in terms of FDs, that's not portable - we'd need a
>> separate one for Windows.  How would you design it?

JM> Yeah, this is why I have held off on a specific design until we get a
JM> better idea of what the new IO library will look like. I am thinking it
JM> will have to involve some abstract event source type with primitive
JM> routines for creating this type from things like handles, fds, or
JM> anything else we might want to wait on. So it is system-extensible in
JM> the sense that implementations can just provide new event source
JM> creation primitives.

i don't think that we need some fixed interface. it can just be
parameterized:

type ReadBuf  h = h -> Ptr () -> Int -> IO Int
type WriteBuf h = h -> Ptr () -> Int -> IO ()

so Unix implementations will use FD, Windows implementations will work
with Handle, and all will be happy :)
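
A minimal sketch of how library code could be written once against these
two type synonyms and then instantiated per platform. The copyAll
function and the in-memory Mem handle are hypothetical stand-ins (a real
instance would wrap an FD or a Win32 HANDLE), and the sketch assumes
that a read result of 0 means end of input.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr, castPtr)

type ReadBuf  h = h -> Ptr () -> Int -> IO Int
type WriteBuf h = h -> Ptr () -> Int -> IO ()

-- Generic over both handle types: copy until the reader reports 0 bytes.
-- Returns the total number of bytes copied.
copyAll :: ReadBuf r -> WriteBuf w -> r -> w -> IO Int
copyAll rd wr src dst = allocaBytes chunk (go 0)
  where
    chunk = 4
    go total buf = do
      n <- rd src buf chunk
      if n <= 0
        then return total
        else wr dst buf n >> go (total + n) buf

-- Hypothetical in-memory "handle" used only for this demonstration.
newtype Mem = Mem (IORef [Word8])

memRead :: ReadBuf Mem
memRead (Mem ref) p n = do
  xs <- readIORef ref
  let (now, later) = splitAt n xs
  pokeArray (castPtr p) now
  writeIORef ref later
  return (length now)

memWrite :: WriteBuf Mem
memWrite (Mem ref) p n = do
  xs <- peekArray n (castPtr p)
  modifyIORef ref (++ xs)

main :: IO ()
main = do
  srcRef <- newIORef [1 .. 10]
  dstRef <- newIORef []
  n <- copyAll memRead memWrite (Mem srcRef) (Mem dstRef)
  bytes <- readIORef dstRef
  print (n, bytes)  -- (10,[1,2,3,4,5,6,7,8,9,10])
```

Note that this parameterizes single-handle operations only; it does not
by itself give a way to wait on several handles at once.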

JM> The other advantage of this sort of thing is that you would want things
JM> like the X11 library to be able to provide an event source for when an
JM> X11 event is ready to be read so you can seamlessly integrate your X11
JM> loop into your main one.

you don't need to have the same interface for X11 and file async
operations. The library can export "ReadBuf FD", "WriteBuf FD" and
"X11Op" implementations and you will use each one in the appropriate
place.

JM> The X11 library would create such an event source from the underlying
JM> socket but just return the abstract event source so the implementation
JM> can change (perhaps when using a shared memory based system like D11 for
JM> instance) without affecting how the user uses the library in a portable
JM> way.

JM> I will try to come up with something concrete for us to look at that we
JM> can modify as the rest of the IO library congeals.

as i already said, this IO library will not emerge by itself :)  there
is my library, which uses a Stream class so it can accept any form of
async library. there is a lib by Marcin Kowalczyk. and there is
Einar's Alt-Network lib, which already implements 2 async methods. so
what we need is to convert Einar's work to a single interface and make
a Stream interface around it. the latter will be better accomplished by
me, but i don't know whether he plans to work on the former. i can also
do it, but without any testing because i still don't have any Unix
installed :)  the Streams library by itself is now unix-compilable,
thanks to Peter Simons

--
Best regards,
 Bulat                            mailto:[hidden email]


Re[2]: Re: standard poll/select interface

Bulat Ziganshin-2
In reply to this post by Donn Cave-2
Hello Donn,

Wednesday, February 22, 2006, 4:23:28 AM, you wrote:

DC> Could an application reasonably choose between several dispatching
DC> systems?  For example, I'm working on a Macintosh here, where instead
DC> of X11 Apple provides its NextStep based GUI with its own apparently
DC> fairly well defined event system.  I don't know that system very well,
DC> but a MacOS Haskell GUI application would probably want to look in
DC> that direction for event integration.  Meanwhile, I might want to
DC> work with kqueue, on the same platform, because it supports filesystem
DC> events along with the usual select stuff.

this depends not on John's design of this low-level lib, but on the
design of the higher-level libs that will use it. just for example, the
Streams lib will allow switching this manager even at runtime

--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Re: standard poll/select interface

John Meacham
In reply to this post by Bulat Ziganshin-2
On Wed, Feb 22, 2006 at 03:28:26PM +0300, [hidden email] wrote:

> JM> Yeah, this is why I have held off on a specific design until we get a
> JM> better idea of what the new IO library will look like. I am thinking it
> JM> will have to involve some abstract event source type with primitive
> JM> routines for creating this type from things like handles, fds, or
> JM> anything else we might want to wait on. So it is system-extensible in
> JM> the sense that implementations can just provide new event source
> JM> creation primitives.
>
> i don't think that we need some fixed interface. it can just be
> parameterized:
>
> type ReadBuf  h = h -> Ptr () -> Int -> IO Int
> type WriteBuf h = h -> Ptr () -> Int -> IO ()
>
> so Unix implementations will use FD, Windows implementations will work
> with Handle, and all will be happy :)

I think you misunderstand: the poll interface will need to accept a
_set_ of events to wait for. This is independent of the buffer interface
and lower level than async IO (for the traditional definition of async
IO). Not all event sources will necessarily be FDs on Unix or handles on
Windows if, say, a Haskell RTS integrates with a system's built-in event
loop (such as the OS X example mentioned in another email).


> JM> The other advantage of this sort of thing is that you would want things
> JM> like the X11 library to be able to provide an event source for when an
> JM> X11 event is ready to be read so you can seamlessly integrate your X11
> JM> loop into your main one.
>
> you don't need to have the same interface for X11 and file async
> operations. The library can export "ReadBuf FD", "WriteBuf FD" and
> "X11Op" implementations and you will use each one in the appropriate
> place.

You can't treat them as independent types at the poll site, since you
need to wait on a set of events from potentially different types of
sources.

> JM> The X11 library would create such an event source from the underlying
> JM> socket but just return the abstract event source so the implementation
> JM> can change (perhaps when using a shared memory based system like D11 for
> JM> instance) without affecting how the user uses the library in a portable
> JM> way.
>
> JM> I will try to come up with something concrete for us to look at that we
> JM> can modify as the rest of the IO library congeals.
>
> as i already said, this IO library will not emerge by itself :)  there
> is my library, which uses a Stream class so it can accept any form of
> async library. there is a lib by Marcin Kowalczyk. and there is
> Einar's Alt-Network lib, which already implements 2 async methods. so
> what we need is to convert Einar's work to a single interface and make
> a Stream interface around it. the latter will be better accomplished by
> me, but i don't know whether he plans to work on the former. i can also
> do it, but without any testing because i still don't have any Unix
> installed :)  the Streams library by itself is now unix-compilable,
> thanks to Peter Simons

I am not quite sure what you mean by this. The poll/select interface
will be lower level than your Streams library and fairly independent.
The async methods I have seen have been non-blocking based and tend to
be system dependent, which is different from what the poll/select
interface is about. The poll/select interface is about providing the
minimum functionality to allow _portable_ async applications and
libraries to be written.

        John

--
John Meacham - ⑆repetae.net⑆john⑈

Re[2]: Re: standard poll/select interface

Bulat Ziganshin-2
Hello John,

Wednesday, February 22, 2006, 5:11:04 PM, you wrote:

it seems that we don't understand each other. let's be concrete:

my library reads and writes files. it uses read/write/recv/send to do
this in a blocking manner. now i want to have other operations that will
have the SAME INTERFACES but internally use something like poll in
order to allow the Haskell RTS to overlap i/o with "user
threads". agree?

these async operations should have the same interface as the blocking
ones, but that is impossible for Windows, so i propose to have slightly
more general interfaces:

 type ReadBuf  h = h -> Ptr () -> Int -> IO Int
 type WriteBuf h = h -> Ptr () -> Int -> IO ()

these are the functions which my library will call; everything else,
from my viewpoint, is an internal detail of this async lib. i don't know
(i really don't know) how to build this list of events and how to
manage it.

the same goes for the X11 library - the async lib should just provide
alternative implementations of some operations and not require the user
of the async lib to manage the event list.

it seems like you want to define something more low-level, but i, as
an i/o library author, will be happy just to call some non-blocking
equivalents of read/write provided by the async lib. and it seems that
i'm not competent enough to discuss details of its internal
implementation ;)

on the other side, you don't need to wait until some i/o library
becomes standard. anyway, such a library will need non-blocking
implementations of read() and write(), so this is the high-level
interface that the async lib should implement. agree?



--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Re: standard poll/select interface

Marcin 'Qrczak' Kowalczyk
In reply to this post by Simon Marlow-5
Simon Marlow <[hidden email]> writes:

> I think the reason we set O_NONBLOCK is so that we don't have to test
> with select() before reading, we can just call read().  If you don't
> use O_NONBLOCK, you need two system calls to read/write instead of
> one. This probably isn't a big deal, given that we're buffering anyway.

I've heard that for Linux sockets select/poll/epoll might say that
data is available where it in fact is not (it may be triggered by
socket activity which doesn't result in new data). Select/poll/epoll
are designed to work primarily with non-blocking I/O.

In my implementation of my language pthreads are optionally used in
a way very similar to your paper "Extending the Haskell Foreign
Function Interface with Concurrency". This means that I have a choice
of using blocking or non-blocking I/O for a given descriptor, both
work similarly, but blocking I/O takes up an OS thread. Each file
has a blocking flag kept in its data.

A non-blocking I/O is done in the same thread. The timer signal is
kept active, so if another process has switched the file to blocking,
it will be woken up by the timer signal and won't block the whole
process. The thread performing the I/O will only waste its timeslices.

A blocking I/O temporarily releases access to the runtime, setting
up a worker OS thread for other threads if needed etc. As an
optimization, if there are no other threads to be run by the scheduler
(no running threads, nor waiting for I/O, nor waiting for a timeout,
and we are the thread which handles system signals), then runtime is
not physically released (no worker OS threads, no unlinking of the
thread structure), only the signal mask is changed so the visible
semantics is maintained. This is common to other such potentially
blocking system calls. I don't know if GHC does something similar.

(I recently made it work even if a thread that my runtime has not
seen before wants to access the runtime. If the optimization of not
physically releasing the runtime was in place, the new thread performs
the actions on behalf of the previous thread.)

In either case EAGAIN causes the thread to block, asking the scheduler
to wake it up when I/O is ready. This means that even if some other
process has switched the file to non-blocking, the process will only
do unnecessary context switches.
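
In GHC Haskell the same "try the call, and on EAGAIN ask the scheduler
to wake us when the descriptor is ready" pattern might be sketched as
below. This assumes a POSIX system and GHC; readRetrying and the small
pipe-based demo are hypothetical, while throwErrnoIfMinus1RetryMayBlock
(from Foreign.C.Error) and threadWaitRead (from Control.Concurrent) are
existing base functions.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (threadWaitRead)
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1RetryMayBlock, throwErrnoIfMinus1_)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import System.Posix.Types (CSsize (..), Fd (..))

foreign import ccall unsafe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall unsafe "unistd.h write"
  c_write :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall unsafe "unistd.h pipe"
  c_pipe :: Ptr CInt -> IO CInt

-- Call read() directly. On EINTR the call is retried; on EAGAIN or
-- EWOULDBLOCK only this Haskell thread is parked in threadWaitRead and
-- the call is retried once the descriptor is reported readable.
readRetrying :: Fd -> Ptr Word8 -> Int -> IO Int
readRetrying fd@(Fd raw) buf n =
  fromIntegral <$>
    throwErrnoIfMinus1RetryMayBlock "readRetrying"
      (c_read raw buf (fromIntegral n))
      (threadWaitRead fd)

main :: IO ()
main =
  allocaArray 2 $ \fds ->
    allocaBytes 16 $ \buf -> do
      throwErrnoIfMinus1_ "pipe" (c_pipe fds)
      [r, w] <- peekArray 2 fds
      pokeArray buf [104, 105]                        -- the bytes "hi"
      throwErrnoIfMinus1_ "write" (c_write w buf 2)
      n <- readRetrying (Fd r) buf 16
      print n
```

The pipe in the demo is left in blocking mode with data already
written, so the wait branch is not exercised there; it only runs when
the descriptor is non-blocking and has nothing to deliver. Because the
read is always attempted and the wait happens only after a failure, a
spurious readiness report or another reader winning the race just
causes one more trip around the loop.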

It's important to make this work when the blocking flag is out
of sync. The Unix blocking flag is not even associated with the
descriptor but with an open file, i.e. it's shared with descriptors
created by dup(), so it might be hard to predict without asking the
OS.

If pthreads are available, stdin, stdout and stderr are kept blocking,
because they are often shared with other processes, and making them
blocking works well. Without pthreads they are non-blocking, because
I felt it was more important to not waste timeslices of the thread
performing I/O than to be nice to other processes. In both cases pipes
and sockets are non-blocking, while named files are blocking. The
programmer can change the blocking state explicitly, but this is
probably useful only when setting up redirections before exec*().

--
   __("<         Marcin Kowalczyk
   \__/       [hidden email]
    ^^     http://qrnik.knm.org.pl/~qrczak/

Re: Re: standard poll/select interface

Marcin 'Qrczak' Kowalczyk
In reply to this post by Simon Marlow-5
Simon Marlow <[hidden email]> writes:

> I agree that a generic select/poll interface would be nice.

We must be aware that epoll (and I think kqueue too) registers event
sources in advance, separately from waiting, which is its primary
advantage over poll.

The interface should use this model because it's easy to implement it
in terms of select/poll without losing efficiency, but the converse
would lose the benefit of epoll.
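
A small sketch of that direction, under stated assumptions: Registry,
PollFn and fromPoll are hypothetical names, and the poll-style primitive
is faked with a pure filter. It shows a register-in-advance interface
implemented on top of a primitive that takes the whole set on every
call.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)
import Data.List (delete)

-- Hypothetical epoll-style interface: sources are registered ahead of
-- time, and waiting takes no per-call list.
data Registry k = Registry
  { register   :: k -> IO ()
  , unregister :: k -> IO ()
  , wait       :: IO [k]
  }

-- Hypothetical poll-style primitive: given the full set of sources,
-- return those that are ready.
type PollFn k = [k] -> IO [k]

-- Emulate registration on top of a poll-style primitive by remembering
-- the registered set and passing all of it on each wait.
fromPoll :: Eq k => PollFn k -> IO (Registry k)
fromPoll pollFn = do
  ref <- newIORef []
  return Registry
    { register   = \k -> modifyIORef ref (\ks -> if k `elem` ks then ks else k : ks)
    , unregister = \k -> modifyIORef ref (delete k)
    , wait       = readIORef ref >>= pollFn
    }

main :: IO ()
main = do
  -- Fake primitive for the demo: even-numbered sources count as ready.
  let fakePoll = return . filter even :: PollFn Int
  reg <- fromPoll fakePoll
  mapM_ (register reg) [1, 2, 3, 4]
  unregister reg 2
  wait reg >>= print  -- [4]
```

An epoll-backed Registry could implement the same three fields without
rebuilding the set on each wait, which is the efficiency the
registration model preserves.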

(My runtime has a generic interface on the C level only, for hooking
another implementation to be used by the scheduler.)

--
   __("<         Marcin Kowalczyk
   \__/       [hidden email]
    ^^     http://qrnik.knm.org.pl/~qrczak/

Re: Re: standard poll/select interface

Donn Cave-2
In reply to this post by Marcin 'Qrczak' Kowalczyk
On Fri, 24 Feb 2006, Marcin 'Qrczak' Kowalczyk wrote:
> Simon Marlow <[hidden email]> writes:
>> I think the reason we set O_NONBLOCK is so that we don't have to test
>> with select() before reading, we can just call read().  If you don't
>> use O_NONBLOCK, you need two system calls to read/write instead of
>> one. This probably isn't a big deal, given that we're buffering anyway.
>
> I've heard that for Linux sockets select/poll/epoll might say that
> data is available where it in fact is not (it may be triggered by
> socket activity which doesn't result in new data).

Only UDP, from anything I'm able to find out about this.
Apparently a UDP packet may turn out to be invalid in some respect,
to be discovered too late during the recvmsg system call.  In a
similar situation, the TCP layer would have already accounted for
this by the time select sees anything.  Likewise of course any
local slow devices like a tty, pipe etc.

> Select/poll/epoll
> are designed to work primarily with non-blocking I/O.

That's what the Linux kernel developers say, anyway, since it
would be inconvenient for them to fix this, even though it
apparently violates the POSIX specification.

        Donn Cave, [hidden email]


Re: standard poll/select interface

Simon Marlow-5
In reply to this post by Marcin 'Qrczak' Kowalczyk
Marcin 'Qrczak' Kowalczyk wrote:

> Simon Marlow <[hidden email]> writes:
>
>
>>I think the reason we set O_NONBLOCK is so that we don't have to test
>>with select() before reading, we can just call read().  If you don't
>>use O_NONBLOCK, you need two system calls to read/write instead of
>>one. This probably isn't a big deal, given that we're buffering anyway.
>
>
> I've heard that for Linux sockets select/poll/epoll might say that
> data is available where it in fact is not (it may be triggered by
> socket activity which doesn't result in new data). Select/poll/epoll
> are designed to work primarily with non-blocking I/O.

Ah yes, you're right.  It's important for us to guarantee that calling
read() can't block, so even if we select() first there's a race
condition in that someone else can call read() before the current thread.

> In my implementation of my language pthreads are optionally used in
> a way very similar to your paper "Extending the Haskell Foreign
> Function Interface with Concurrency". This means that I have a choice
> of using blocking or non-blocking I/O for a given descriptor, both
> work similarly, but blocking I/O takes up an OS thread. Each file
> has a blocking flag kept in its data.

That's an interesting idea, and neatly solves the problem of making
stdin/stdout/stderr non-blocking, but at the expense of some heavyweight
OS-thread blocking.

Cheers,
        Simon