[GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

[GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
           Reporter:  nshimaza       |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  highest        |         Milestone:  8.6.1
          Component:  Runtime        |           Version:  8.4.2
  System                             |
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  Runtime crash
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 Detail:  https://github.com/nshimaza/race-tmvar-async-exception

 Runtime falls into high CPU under racing condition between async exception
 and unblocking retry on TVar.

 * Reproduces with +RTS -Nx where x > 1
 * Does NOT reproduce with +RTS -N1
 * Program stalls at `killThread`
 * High CPU based on given -Nx
     * CPU won't be 100% if you gave x smaller than available hardware
 threads of your platform.
 * Does NOT reproduce if TVar/retry is replaced by MVar
 * Reproduced with GHC 8.4.2 (macOS High Sierra (10.13.4))
 * Reproduced with GHC 8.4.2 (Docker for Mac Version 18.03.1-ce-mac65)
 * Reproduced with ghc-8.5.20180506 (Docker for Mac Version 18.03.1-ce-
 mac65)


 Minimal reproducing code here.  (You can find more verbose code on the
 above github repo.)
 {{{#!hs
 main :: IO ()
 main = do
     let volume = 1000
     forM_ [1..1000] $ \i -> do
         putStrFlush $ show i ++ " "

         -- Spawn massive number of threads.
         threads <- replicateM volume $ do
             trigger <- newTVarIO False
             tid <- forkIO $ void $ atomically $ do
                 t <- readTVar trigger
                 if t then pure t else retry
             pure (trigger, tid)

         -- Make sure all threads are spawned.
         threadDelay 30000

         -- Let threads start to exit normally.
         forkIO $ forM_ threads $ \(trigger, _) -> threadDelay 1 *>
 atomically (writeTVar trigger True)

         -- Concurrently kill threads in order to create race.
         -- TMVar operation and asynchronous exception can hit same thread
 simultaneously.
         -- Adjust threadDelay if you don't reproduce very well.
         threadDelay 1000
         forM_ threads $ \(_, tid) -> do
             putCharFlush 'A'
             killThread tid      -- When the issue reproduced, this
 killThread doesn't return.
             putCharFlush '\b'
 }}}

 This program intentionally creates race condition between asynchronous
 exception
 and unblocking operation of `retry` on TVar.  From one side, a `writeTVar
 trigger True` is attempted from external thread while target thread is
 blocking
 at `retry` on the same `TVar`.  On the other side, an asynchronous
 exception
 `ThreadKilled` is thrown by yet another external thread to the same target
 thread.

 In other word, it attempts to kill a thread about to unblock.

 I guess when the above two operation hit the same thread at the same time
 in
 parallel in SMP environment, GHC runtime falls into high CPU.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 Thanks for the small repro! Indeed I am able to easily reproduce this.

 The racing threads look like this,
 {{{

 Thread 8 (LWP 25960):
 #0  0x000000000049c0df in stmCommitTransaction ()
 #1  0x00000000004a9de4 in stg_atomically_frame_info ()
 #2  0x0000000000000000 in ?? ()

 Thread 1 (LWP 25948):
 #0  0x000000000049b290 in lock_tvar ()
 #1  0x000000000049b559 in remove_watch_queue_entries_for_trec ()
 #2  0x000000000049b95b in stmAbortTransaction ()
 #3  0x000000000049984c in raiseAsync ()
 #4  0x0000000000499ba3 in throwToMsg ()
 #5  0x0000000000499e82 in throwTo ()
 #6  0x00000000004a6aac in stg_killThreadzh ()
 #7  0x0000000000000000 in ?? ()
 }}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by carlostome):

 After some investigation I found that indeed there is a deadlock between
 the mutexes of a TSO and a TVar. An excerpt of the debugging log is the
 follwing:

 {{{
 Task 2: cap 1: running thread 7012 (ThreadRunGHC)
 Task 2: STM: 0xad6d08 : stmStartTransaction with 316 tokens
 Task 2: STM: 0xad6d08 : stmStartTransaction()=0x4200224000
 Task 2: STM: 0x4200224000 : stmWriteTVar(0x4200225c68, 0xad5e72)
 Task 2: STM: 0x4200224000 : get_entry_for TVar 0x4200225c68
 Task 2: STM: 0x4200224000 : FOR_EACH_ENTRY, current_chunk=0x420027e4d0
 limit=0
 Task 2: STM: 0x4200224000 : read_current_value(0x4200225c68)=0xad5e69
 Task 2: STM: 0x4200224000 : stmWriteTVar done
 Task 2: STM: 0x4200224000 : stmCommitTransaction()
 Task 2: STM: 0x4200224000 : lock_stm()
 Task 2: STM: 0x4200224000 : FOR_EACH_ENTRY, current_chunk=0x420027e4d0
 limit=1
 Task 2: STM: 0x4200224000 : trying to acquire 0x4200225c68
 Task 2: STM: 0x4200224000 : cond_lock_tvar(0x4200225c68, 0xad5e69)
 -- (1)
 Task 2: STM: 0x4200224000 : success
 Task 2: STM: 0x4200224000 : doing read check
 Task 2: STM: 0x4200224000 : FOR_EACH_ENTRY, current_chunk=0x420027e4d0
 limit=1
 Task 2: STM: 0x4200224000 : read-check succeeded
 Task 2: STM: 0x4200224000 : FOR_EACH_ENTRY, current_chunk=0x420027e4d0
 limit=1
 Task 2: STM: 0x4200224000 : writing 0xad5e72 to 0x4200225c68, waking
 waiters      -- (2)
 Task 2: STM: unpark_waiters_on tvar=0x4200225c68
 Task 2: STM: unpark_waiters_on tvar=0x4200225c68, status=0x4200223b80
 Task 2: STM: cap 1: unpark_waiters_on tvar=0x4200225c68,
 status2=0x4200223b80     -- (3)
 Task 2: STM: cap 1: unpark_tso id 6866
 -- (4)
 Task 1: BlockedOnSTM: lockTSO id 6866
 Task 1: cap 0 BlockedOnSTM: raiseAsync id 6866
 Task 1: cap 0: raising exception in thread 6866.
 Task 1: cap 0: raiseAsync: freezing atomically frame
 -- (5)
 Task 1: STM: 0x4200225f58 : stmAbortTransaction
 -- (6)
 Task 1: STM: 0x4200225f58 : lock_stm()
 Task 1: STM: 0x4200225f58 : aborting top-level transaction cap 0
 Task 1: STM: 0x4200225f58 : aborting top-level transaction
 Task 1: STM: 0x4200225f58 : stmAbortTransaction aborting waiting
 transaction
 Task 1: STM: 0x4200225f58 : remove_watch_queue_entries_for_trec()
 -- (7)
 Task 1: STM: 0x4200225f58 : FOR_EACH_ENTRY, current_chunk=0x4200223cc8
 limit=1
 Task 1: STM: 0x4200225f58 : lock_tvar (0x4200225c68)
 -- (8)
 }}}

 What is happening here has to do with how async exceptions are
 implemented.

 The thread 6866 is waiting on the TVar 0x4200225c68. In the meantime the
 thread 7012 is able to change the value of such TVar to True. To do so,
 first it has to acquire the lock for the TVar (1), then actually change
 its value (2), and finally wake any thread that was waiting on this TVar
 (3). To wake up a concrete waiter it needs to hold its mutex through
 lock_tso() (4).

 The problem is that when an async exception is thrown to a thread whose
 status is BlockedOnSTM, the first thing the thread throwing the exception
 does is to grab the mutex of the target thread with lock_tso()
 (https://github.com/ghc/ghc/blob/93220d46fceabf3afeae36f1fda94e1698c3639a/rts/RaiseAsync.c#L419),
 then because the target thread was in the middle of an STM transaction
 (5) that dependes on the same TVar, it has to abort it (6) for what it has
 to be removed from the TVar watch queue (7) needing the mutex on the TVar
 to do so (8).

 Basically the thread completing the STM grabs first the TVar lock and then
 the TSO lock while for the async exception the locks are taken in reverse
 order.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by carlostome):

 * cc: carlostome (added)


--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by simonmar):

 * cc: simonmar (added)


--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:4>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by simonmar):

 Thanks for the excellent diagnosis @bgamari.

 I'm not sure what the best fix is yet, but there is something distinctly
 strange here. `lockTSO` and `unlockTSO` are only used to synchronise for
 this particular case, between `raiseAsync` and `unpark_tso`. In all other
 cases, a TSO has a clear owner - in the case of a blocked TSO, the owner
 is usually the object on which the TSO is blocked, e.g. an MVar. Perhaps
 we should switch to using an owner semantics for BlockedOnSTM too - that
 is, if we see `BlockedOnSTM` in `raiseAsync`, we attempt to lock the
 `TVar` pointed to by `tso->block_info`.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 Just to make sure credit is given where credit is due, carlostome is the
 one responsible for the diagnosis (which is indeed excellent; great work
 carlostome!)

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:6>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 We will make a best-effort to look at this for 8.6.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by osa1):

 * owner:  (none) => osa1


Comment:

 I'll try implementing comment:5 see how it goes. This code path is not
 tested enough so I'll also try to port some of concurrency tests that use
 MVar and async exceptions (exercising `raiseAsync()`) to STM.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by osa1):

 Just to repeat comment:1 and comment:2 with my words:

 Thread 1 kills Thread 2 which is blocked on a TVar operation. For this it
 calls
 raiseAsync() and for that it has to lock Thread 2 (lockTSO). Then to abort
 the
 transaction it needs to lock the TVar (lock_tvar).

 At the same time Thread 3 succeeds to modify the TVar and to unblock any
 threads
 blocked on this TVar it needs to lock the TVar (lock_tvar), and then to
 actually
 unblock the thread it needs to lock the TSO (lockTSO).

 When the order of locking goes like this:

 - Thread 1 locks the TSO (lockTSO)
 - Thread 3 locks the TVar (lock_tvar)

 We get a deadlock because Thread 1 now wants to lock the TVar and Thread 3
 wants
 to lock the TSO, both of which are locked already.

 > Perhaps we should switch to using an owner semantics for BlockedOnSTM
 too -
 > that is, if we see BlockedOnSTM in raiseAsync, we attempt to lock the
 TVar
 > pointed to by tso->block_info.

 I only get `END_TSO_QUEUE` in `tso->block_info`. I think the TVar is only
 reachable from the array list `tso->trec->current_chunk`. I guess we could
 do
 this:

 - Lock the TSO
 - If BlockedOnSTM then check tso->trec entries. Expect to see only one
 TVar
   there (can we have more than on TVars here?). Lock the TVar and release
 the
   TSO.
 - Continue with raiseAsync()

 I don't know if we can see more than one TVar in tso->trec entries. Also,
 we
 need to modify stmAbortTransaction because we'll have the TVar locked
 already,
 but it still needs to lock it when it's called from other call sites (e.g.
 from
 `raise#`).

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:9>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by osa1):

 I documented why we can't apply ownership semantics of MVars to TVars (as
 described in comment:5) in Phab:D4954.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  patch
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by osa1):

 * status:  new => patch
 * differential:   => Phab:D4956


--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:11>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  patch
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by Ben Gamari <ben@…>):

 In [changeset:"7fc418df856d9b58034eeec48915646e67a7a562/ghc"
 7fc418df/ghc]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="7fc418df856d9b58034eeec48915646e67a7a562"
 Fix deadlock between STM and throwTo

 There was a lock-order reversal between lockTSO() and the TVar lock,
 see #15136 for the details.

 It turns out we can fix this pretty easily by just deleting all the
 locking code(!).  The principle for unblocking a `BlockedOnSTM` thread
 then becomes the same as for other kinds of blocking: if the TSO
 belongs to this capability then we do it directly, otherwise we send a
 message to the capability that owns the TSO. That is, a thread blocked
 on STM is owned by its capability, as it should be.

 The possible downside of this is that we might send multiple messages
 to wake up a thread when the thread is on another capability. This is
 safe, it's just not very efficient.  I'll try to do some experiments
 to see if this is a problem.

 Test Plan: Test case from #15136 doesn't deadlock any more.

 Reviewers: bgamari, osa1, erikd

 Reviewed By: osa1

 Subscribers: rwbarton, thomie, carter

 GHC Trac Issues: #15136

 Differential Revision: https://phabricator.haskell.org/D4956
 }}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  merge
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * status:  patch => merge


--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:13>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  merge
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by osa1):

 Phab:D4961 is a continuation of the commit above and should probably be
 merged with it.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:14>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  merge
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by Ben Gamari <ben@…>):

 In [changeset:"502640c90c3d0fbb6c46257be14fdc7e3c694c6c/ghc"
 502640c9/ghc]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="502640c90c3d0fbb6c46257be14fdc7e3c694c6c"
 Optimise wakeups for STM

 Avoids repeated wakeup messages being sent when a TVar is written to
 multiple times. See comments for details.

 Test Plan:
 * Test from #15136 (will be added to stm shortly)
 * existing stm tests

 Reviewers: bgamari, osa1, erikd

 Reviewed By: bgamari

 Subscribers: rwbarton, thomie, carter

 GHC Trac Issues: #15136

 Differential Revision: https://phabricator.haskell.org/D4961
 }}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:15>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  merge
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nshimaza):

 Let me report that my repro code no longer reproduces the issue with
 ghc-8.7.20180715 in my test environment!  (Platform is Docker for Mac
 Version 18.03.1-ce-mac65 (24312).)

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:16>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets
Reply | Threaded
Open this post in threaded view
|

Re: [GHC] #15136: High CPU when asynchronous exception and unblocking retry on TVar raced

GHC - devs mailing list
In reply to this post by GHC - devs mailing list
#15136: High CPU when asynchronous exception and unblocking retry on TVar raced
-------------------------------------+-------------------------------------
        Reporter:  nshimaza          |                Owner:  osa1
            Type:  bug               |               Status:  closed
        Priority:  highest           |            Milestone:  8.6.1
       Component:  Runtime System    |              Version:  8.4.2
      Resolution:  fixed             |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4956
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * status:  merge => closed
 * resolution:   => fixed


Comment:

 Thanks for the confirmation!

 comment:15 merged with c15ba1fb83575081bb6a1d4955f9e13626ef8d51.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15136#comment:17>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-tickets