[GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

[GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

GHC - devs mailing list
#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
           Reporter:  nh2            |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:  8.6.1
          Component:  Runtime        |           Version:  8.2.2
  System                             |
           Keywords:                 |  Operating System:  Linux
       Architecture:                 |   Type of failure:  Runtime
  Unknown/Multiple                   |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 This is the outcome of https://mail.haskell.org/pipermail/ghc-
 devs/2018-May/015749.html

 Reading through the code of
 [http://hackage.haskell.org/package/base-4.11.1.0/docs/src/GHC.IO.FD.html#readRawBufferPtr
 readRawBufferPtr] the first line jumped to my eye:

 {{{
   | isNonBlocking fd = unsafe_read -- unsafe is ok, it can't block
 }}}

 This looks suspicious.

 On Linux, if `fd` is a a descriptor to a regular file (on disk, a
 networked filesystem, or a block device), then `O_NONBLOCK` will have no
 effect, yet `unsafe_read` is used which will block the running OS thread.

 You can read more about `O_NONBLOCK` not working on regular files on Linux
 here:

 * https://www.nginx.com/blog/thread-pools-boost-performance-9x/
 * https://stackoverflow.com/questions/8057892/epoll-on-regular-files
 * https://jvns.ca/blog/2017/06/03/async-io-on-linux--select--poll--and-
 epoll/
 *
 https://groups.google.com/forum/#!topic/comp.os.linux.development.system/K
 -fC-G6P4EA

 And indeed, the following program does NOT keep printing things in the
 printing thread, and instead blocks for 30 seconds:

 {{{
 module Main where

 import           Control.Concurrent
 import           Control.Monad
 import qualified Data.ByteString as BS
 import           System.Environment

 main :: IO ()
 main = do
   args <- getArgs
   case args of
     [file] -> do

       forkIO $ forever $ do
         putStrLn "still running"
         threadDelay 100000 -- 0.1 s
       bs <- BS.readFile file
       putStrLn $ "Read " ++ show (BS.length bs) ++ " bytes"

     _ -> error "Pass 1 argument (a file)"
 }}}

 when compiled with

 {{{
 ~/.stack/programs/x86_64-linux/ghc-8.2.2/bin/ghc --make -O -threaded
 blocking-regular-file-read-test.hs
 }}}

 on my Ubuntu 16.04 and on a 2GB file like

 {{{
 ./blocking-regular-file-read-test /mnt/images/ubuntu-18.04-desktop-
 amd64.iso
 }}}

 And `strace -f -e open,read` on it shows:

 {{{
 open("/mnt/images/ubuntu-18.04-desktop-amd64.iso",
 O_RDONLY|O_NOCTTY|O_NONBLOCK) = 11
 read(11,  <unfinished ...>
 }}}

 So GHC is trying to use `O_NONBLOCK` on regular files, which cannot work
 and will block when used through unsafe foreign calls like that.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15153>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

GHC - devs mailing list
#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nh2):

 This is mainly for Linux but may also apply to other OSs.

 I've read that FreeBSD has a better story for reading files
 asynchronously, but I don't know whether GHC has any special code to use
 those features.

 Input from FreeBSD users would be appreciated.

 Otherwise, likely the fix to this for Linux will also improve things over
 the status quo on FreeBSD.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15153#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nh2):

 Also funny but perhaps not too surprising:

 If in my code, you replace `forkIO` by e.g. `forkOn 2`, then
 nondeterministically, sometimes the program hangs and sometimes it works
 with `+RTS -N2`.

 The higher you set `-N`, the more likely it is to work.

 If you put both the putStrLn loop and the readFile into `forkOn 0` and
 `forkOn 1` each, and run with `+RTS -N3`, then it always works as
 expected.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15153#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by simonmar):

 Yeah, this is well known. O_NONBLOCK doesn't do anything for local disk
 files, but it also doesn't do any harm. It's not usually a big problem
 except when the filesystem actually has significant latency, like with NFS
 or an arbitrary fuse filesystem.

 I think I was worried about the overhead of using safe calls here, the
 problem is that if you have lots of threads all reading from Handles (e.g.
 sockets) then if those reads are safe calls we'll get a large number of
 real OS threads created.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15153#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nh2):

 Replying to [comment:3 simonmar]:
 > It's not usually a big problem except when the filesystem actually has
 significant latency, like with NFS or an arbitrary fuse filesystem.

 What about my example though?

 I'm just reading a 3GB file using `BS.readFile` and it blocks other
 threads for 30 seconds.

 (My file system doens't have unexpected latency in terms of access time,
 it's a normal spinning disk, and the problem is the time it takes to read
 the data. Though I find it reasonablye to argue that the ~8ms it may take
 to seek this disk is ''also'' a lot of wasted CPU time, that's not the key
 problem in my example.)

 > I think I was worried about the overhead of using safe calls here, the
 problem is that if you have lots of threads all reading from Handles (e.g.
 sockets) then if those reads are safe calls we'll get a large number of
 real OS threads created.

 The concern about overhead makes sense to me, but I find it easy to argue
 that there's more overheard in having all other resources of a machine
 (CPU, network, other disks) idle while a single disk arm is moving. And
 the time for a `pthread_create` is rather small compared to a read of any
 physical storage device.

 Another argument is that if examples like the one I gave don't work, it is
 very hard to build reasonable heartbeating of long-running IO tasks in
 Haskell.

 > O_NONBLOCK doesn't do anything for local disk files, but it also doesn't
 do any harm

 What I meant is that from a quick look at the code, it looked like if
 O_NONBLOCK isn't set, a `safe` call would be made instead.

 In the spirit of https://ghc.haskell.org/trac/ghc/ticket/13296#comment:3
 and the followup comment, would it be a good default to have reads to be
 `safe`, and offer `unsafe` equivalents when you know that the read will be
 very fast?

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15153#comment:4>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the runtime

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15153: GHC uses O_NONBLOCK on regular files, which has no effect, and blocks the
runtime
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.8.1
       Component:  Runtime System    |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by redneb):

 * cc: redneb (added)


--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15153#comment:6>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets