Battling laziness

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

Battling laziness

Joel Reymont
Folks,

I have a huge space leak someplace and I suspect this code. The  
SrvServerInfo data structure is something like 50K compressed or  
uncompressed byte data before unpickling. My thousands of bots issue  
this request at least once and I almost run out of memory with 100  
bots on a 1Gb machine on FreeBSD. Do I need deepSeq somewhere below?

This is the read.

read :: Handle -> (SSL, BIO, BIO) -> IO Command
read h _ =
     do sa <- emptyByteArray 4
        hGetArray h sa 4
        (size', _) <- unpickle endian32 sa 0
        let size = fromIntegral $ size' - 4
        packet <- emptyByteArray size
        hGetArray h packet size
        unstuff packet 0

I suspect that I need to deepSeq cmd'' instead of return $! cmd''

unstuff :: MutByteArray -> Index -> IO Command
unstuff array ix =
     do (kind, ix1) <- unpickle puCmdType array ix
        (cmd', _) <- unpickle (puCommand kind) array ix1
        case cmd' of
          InvalidCommand -> do fail $ "unstuff: Cannot parse " ++  
show array
          SrvCompressedCommands sz bytes ->
              do bytes' <- uncompress bytes (fromIntegral sz)
                 cmd'' <- unstuff bytes' 4
                 return $! cmd''
          _ -> return cmd'

This is where the list of active tables is converted to a table id  
list of [Word32].

pickTable _ filters (Cmd cmd@(SrvServerInfo {})) =
     do let tables = filter (tableMatches filters) $ activeTables cmd
            ids = map tiTableID tables
        case tables of
          [] -> fail $ "pickTable: No tables found: " ++ show filters
          _ ->
              do pop
                 stoptimer "pickTable"
                 return $! Eat $! Just $! Custom $! Tables $! ids

This is where the table id list of [Word32] is consumed.

takeEmptySeat _ aff_id _ (Custom (Tables ids@(table:rest))) =
     do trace 85 $ "takeEmptySeat: " ++ show (length ids)
                  ++ " tables found"
        trace 100 $ "takeEmptySeat: tables: " ++ showTables ids
        trace 85 $ "takeEmptySeat: trying table# " ++ show table
        w <- get
        put_ $ w { tables_to_try = rest }
        push "goToTable" $ goToTable table aff_id
        -- kick off goToTable
        return $ Eat $ Just Go

This is the SrvServerInfo structure.

     | SrvServerInfo
       {
        activeTables :: ![TableInfo], -- Word16/
        removedTables :: ![Word32], -- Word16/
        version :: !Int32
       }

And this is the table info itself.

data TableInfo = TableInfo
     {
      tiAvgPot :: !Word64,
      tiNumPlayers :: !Word16,
      tiWaiting :: !Word16,
      tiPlayersFlop :: !Word8,
      tiTableName :: !String,
      tiTableID :: !Word32,
      tiGameType :: !GameType,
      tiInfoMaxPlayers :: !Word16,
      tiIsRealMoneyTable :: !Bool,
      tiLowBet :: !Word64,
      tiHighBet :: !Word64,
      tiMinStartMoney :: !Word64,
      tiMaxStartMoney :: !Word64,
      tiGamesPerHour :: !Word16,
      tiTourType :: !TourType,
      tiTourID :: !Word32,
      tiBetType :: !BetType,
      tiCantReturnLess :: !Word32,
      tiAffiliateID :: ![Word8],
      tiLangID :: !Word32
     }  deriving (Show, Typeable)

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

RE: Battling laziness

Simon Marlow
What does ordinary heap profiling (-hc, -hd, -hy) tell you about what's
in the heap?  These options should work fine with STM.

Cheers,
        Simon

On 16 December 2005 11:44, Joel Reymont wrote:

> Folks,
>
> I have a huge space leak someplace and I suspect this code. The
> SrvServerInfo data structure is something like 50K compressed or
> uncompressed byte data before unpickling. My thousands of bots issue
> this request at least once and I almost run out of memory with 100
> bots on a 1Gb machine on FreeBSD. Do I need deepSeq somewhere below?
>
> This is the read.
>
> read :: Handle -> (SSL, BIO, BIO) -> IO Command
> read h _ =
>      do sa <- emptyByteArray 4
>         hGetArray h sa 4
>         (size', _) <- unpickle endian32 sa 0
>         let size = fromIntegral $ size' - 4
>         packet <- emptyByteArray size
>         hGetArray h packet size
>         unstuff packet 0
>
> I suspect that I need to deepSeq cmd'' instead of return $! cmd''
>
> unstuff :: MutByteArray -> Index -> IO Command
> unstuff array ix =
>      do (kind, ix1) <- unpickle puCmdType array ix
>         (cmd', _) <- unpickle (puCommand kind) array ix1
>         case cmd' of
>           InvalidCommand -> do fail $ "unstuff: Cannot parse " ++
> show array
>           SrvCompressedCommands sz bytes ->
>               do bytes' <- uncompress bytes (fromIntegral sz)
>                  cmd'' <- unstuff bytes' 4
>                  return $! cmd''
>           _ -> return cmd'
>
> This is where the list of active tables is converted to a table id
> list of [Word32].
>
> pickTable _ filters (Cmd cmd@(SrvServerInfo {})) =
>      do let tables = filter (tableMatches filters) $ activeTables cmd
>             ids = map tiTableID tables
>         case tables of
>           [] -> fail $ "pickTable: No tables found: " ++ show filters
>           _ ->
>               do pop
>                  stoptimer "pickTable"
>                  return $! Eat $! Just $! Custom $! Tables $! ids
>
> This is where the table id list of [Word32] is consumed.
>
> takeEmptySeat _ aff_id _ (Custom (Tables ids@(table:rest))) =
>      do trace 85 $ "takeEmptySeat: " ++ show (length ids)
>                   ++ " tables found"
>         trace 100 $ "takeEmptySeat: tables: " ++ showTables ids
>         trace 85 $ "takeEmptySeat: trying table# " ++ show table
>         w <- get
>         put_ $ w { tables_to_try = rest }
>         push "goToTable" $ goToTable table aff_id
>         -- kick off goToTable
>         return $ Eat $ Just Go
>
> This is the SrvServerInfo structure.
>
>      | SrvServerInfo
>        {
>         activeTables :: ![TableInfo], -- Word16/
>         removedTables :: ![Word32], -- Word16/
>         version :: !Int32
>        }
>
> And this is the table info itself.
>
> data TableInfo = TableInfo
>      {
>       tiAvgPot :: !Word64,
>       tiNumPlayers :: !Word16,
>       tiWaiting :: !Word16,
>       tiPlayersFlop :: !Word8,
>       tiTableName :: !String,
>       tiTableID :: !Word32,
>       tiGameType :: !GameType,
>       tiInfoMaxPlayers :: !Word16,
>       tiIsRealMoneyTable :: !Bool,
>       tiLowBet :: !Word64,
>       tiHighBet :: !Word64,
>       tiMinStartMoney :: !Word64,
>       tiMaxStartMoney :: !Word64,
>       tiGamesPerHour :: !Word16,
>       tiTourType :: !TourType,
>       tiTourID :: !Word32,
>       tiBetType :: !BetType,
>       tiCantReturnLess :: !Word32,
>       tiAffiliateID :: ![Word8],
>       tiLangID :: !Word32
>      }  deriving (Show, Typeable)
>
> Thanks, Joel

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
-hc points to script#9 below.

script (_, _, affid) (Custom (JoinedTable 0)) =
     do {-# SCC "script#8" #-}push "takeEmptySeat" $
             {-# SCC "script#9" #-}takeEmptySeat Holdem affid []
        {-# SCC "script#10" #-}return $ Eat $ Just Go

What takeEmptySeat does it call pickTable

takeEmptySeat game_type _ filters Go =
     do push "pickTable" $ pickTable game_type filters
        return $ Eat $ Just Go

pickTable retrieves the list of SrvServerInfo structures, etc.

Overall, -hc does not help me figure out where my data is being  
retained. My understanding is that I need to do -hbdrag,void fo  
rthat. I did not try -hd and -hy, they would only help me narrow down  
the producers, right?

My program seems to spend 70% of the time collecting garbage. Notice  
the HUGE overall allocations. This is my trying to launch 4k bots  
over 8 hours. Only 1k bots were launched and just 300 of those got to  
play. Maybe because they did not have time with all the garbage  
collection :-).

The tests that I ran previously did not involve heavy network  
traffic, just a few very small packets. This is why I was able to get  
to thousands of bots in just a couple of hours and keep them there.

./randomplay +RTS -k3k -P -hc -srandomplay.gc
95,739,560,464 bytes allocated in the heap
887,633,330,904 bytes copied during GC
131,849,008 bytes maximum residency (8730 sample(s))

      330325 collections in generation 0 (557.40s)
        8730 collections in generation 1 (16370.05s)

         248 Mb total memory in use

   INIT  time    0.00s  (  0.03s elapsed)
   MUT   time  783.40s  (1872.75s elapsed)
   GC    time  16927.45s  (20075.68s elapsed)
   RP    time    0.00s  (  0.00s elapsed)
   PROF  time  6003.62s  (7058.40s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time  23714.47s  (29006.86s elapsed)

   %GC time      71.4%  (69.2% elapsed) <---- isn't this aweful?

   Alloc rate    122,210,314 bytes per MUT second

   Productivity   3.3% of total user, 2.7% of total elapsed

On Dec 16, 2005, at 11:53 AM, Simon Marlow wrote:

> What does ordinary heap profiling (-hc, -hd, -hy) tell you about  
> what's
> in the heap?  These options should work fine with STM.

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow
I uploaded the full reports to http://wagerlabs.com/randomplay.tgz

On Dec 16, 2005, at 11:53 AM, Simon Marlow wrote:

> What does ordinary heap profiling (-hc, -hd, -hy) tell you about  
> what's
> in the heap?  These options should work fine with STM.

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

RE: Battling laziness

Simon Marlow
In reply to this post by Joel Reymont
On 16 December 2005 12:08, Joel Reymont wrote:

> -hc points to script#9 below.
>
> script (_, _, affid) (Custom (JoinedTable 0)) =
>      do {-# SCC "script#8" #-}push "takeEmptySeat" $
>              {-# SCC "script#9" #-}takeEmptySeat Holdem affid []
>         {-# SCC "script#10" #-}return $ Eat $ Just Go
>
> What takeEmptySeat does it call pickTable
>
> takeEmptySeat game_type _ filters Go =
>      do push "pickTable" $ pickTable game_type filters
>         return $ Eat $ Just Go

It's hard to pick out the cause of a space leak from just a fragment of
the program, but I'll try to give you some pointers.

If script#9 is the cost center attached to all of your leaking heap
data, then you're already a long way to finding the problem.  It'll help
even more to find out whether it is just unevaluated copies of
"takeEmptySeat Holdem affid []", or something else (-hd, -hy will help
here).  Try +RTS -hy -hcscript#9, for example.

One obvious thing to try is replacing the '$' before {-# SCC "script#9"
#-} with '$!'.  And similarly in takeEmptySeat.

> Overall, -hc does not help me figure out where my data is being
> retained. My understanding is that I need to do -hbdrag,void fo
> rthat. I did not try -hd and -hy, they would only help me narrow down
> the producers, right?

Not necessarily; lag/drag/void only tells you about certain kinds of
space leaks.  It's another tool in the box, and quite often you can get
away without it.  Retainer profiling similarly.

(I should say that we definitely plan to update these for STM, but it's
not completely trivial (I checked).  Volunteers definitely welcome).

> My program seems to spend 70% of the time collecting garbage. Notice
> the HUGE overall allocations. This is my trying to launch 4k bots
> over 8 hours. Only 1k bots were launched and just 300 of those got to
> play. Maybe because they did not have time with all the garbage
> collection :-).

Note that your GC time is inflated quite a bit due to profiling (it
makes every object larger).

The plan to reduce GC time is, in this order: squash space leaks, reduce
allocation (to reduce GC load), and then tweak GC parameters.

Cheers,
        Simon
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont

On Dec 16, 2005, at 12:36 PM, Simon Marlow wrote:

> If script#9 is the cost center attached to all of your leaking heap
> data, then you're already a long way to finding the problem.  It'll  
> help
> even more to find out whether it is just unevaluated copies of
> "takeEmptySeat Holdem affid []", or something else (-hd, -hy will help
> here).  Try +RTS -hy -hcscript#9, for example.
>
> One obvious thing to try is replacing the '$' before {-# SCC  
> "script#9"
> #-} with '$!'.  And similarly in takeEmptySeat.

Let me try these and report my findings.

> (I should say that we definitely plan to update these for STM, but  
> it's
> not completely trivial (I checked).  Volunteers definitely welcome).

I volunteer! Just need some pointers on where to get started. I learn  
quickly but need to be guided ;-). Plus, I need this the most, right?

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

RE: Battling laziness

Simon Marlow
In reply to this post by Joel Reymont
On 16 December 2005 12:42, Joel Reymont wrote:

> On Dec 16, 2005, at 12:36 PM, Simon Marlow wrote:
>
>> If script#9 is the cost center attached to all of your leaking heap
>> data, then you're already a long way to finding the problem.  It'll
>> help even more to find out whether it is just unevaluated copies of
>> "takeEmptySeat Holdem affid []", or something else (-hd, -hy will
>> help here).  Try +RTS -hy -hcscript#9, for example.
>>
>> One obvious thing to try is replacing the '$' before {-# SCC
>> "script#9" #-} with '$!'.  And similarly in takeEmptySeat.
>
> Let me try these and report my findings.
>
>> (I should say that we definitely plan to update these for STM, but
>> it's not completely trivial (I checked).  Volunteers definitely
>> welcome).
>
> I volunteer! Just need some pointers on where to get started. I learn
> quickly but need to be guided ;-). Plus, I need this the most, right?

I was slightly mistaken: lag/drag/void profiling is pretty easy.  Take a
look at ghc/rts/LdvProfile.c and add relevant cases for STM objects to
processHeapClosureForDead().  If you fix this up and test it we should
be able to get it into 6.4.2.  Retainer profiling is much harder; the
code is in RetainerProfile.c/RetainerSet.c.

Cheers,
        Simon
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Bulat Ziganshin
In reply to this post by Joel Reymont
Hello Joel,

Friday, December 16, 2005, 2:44:00 PM, you wrote:

JR> I have a huge space leak someplace and I suspect this code. The
JR> SrvServerInfo data structure is something like 50K compressed or  
JR> uncompressed byte data before unpickling. My thousands of bots issue  
JR> this request at least once and I almost run out of memory with 100  
JR> bots on a 1Gb machine on FreeBSD. Do I need deepSeq somewhere below?

1. try to use 3-generations GC. this may greatly help in reducing GC
times

2. manually add {-# UNPACK #-} to all simple fields (ints, words,
chars). don't use "-f-unbox-strict-fields" because it can unbox whole
structures instead of sharing them

3. in my experience, it's enough to mark all fields in massively used
structures as strict and then eval highest level of such structures
(using "return $! x"). after that the whole structure will be fully
evaluated. but when you use a list, you must either manually eval whole
list (using "return $! length xs") or use DeepSeq, as you suggest,
because lists remain unevaluated depite all these sctrictness annotations

4. you can try to use packed strings or unboxed arrays instead of
lists. in my experience this can greatly reduce GC time just because
this array don't need to be scanned on each GC

5. what is the "uncompress" function here? can i see its code?

6. why EACH bot receives and processes this 50k structure itself?
can't that be done only one time for all?


JR>      do let tables = filter (tableMatches filters) $ activeTables cmd
JR>             ids = map tiTableID tables
JR>                  return $! Eat $! Just $! Custom $! Tables $! ids

here `ids` definitely will be unevaluated, except for the first
element. add "return $! length ids" before the last line

ps: last week i also fight against memory requirements of my own
program. as a result, they was reduced 3-4 times :)



--
Best regards,
 Bulat                            mailto:[hidden email]



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow
Simon,

I'm approaching this methodically, as you are suggesting. I re-ran  
the program with -hc again and got the following. I suppose it tells  
me that I need to investigate launchScripts#8.

COST CENTRE                    MODULE               %time %alloc
launchScripts#8                Main                  85.7   86.0
takeEmptySeat#8                Snippets               8.0    7.0
CAF                            Main                   4.1    5.9

{-# SCC "launchScripts#8" #-}launch host $ script (bot, bot, affid)

I added some strictness and ran again

{-# SCC "launchScripts#8" #-}launch host $! script (bot, bot, affid)

COST CENTRE                    MODULE               %time %alloc
launchScripts#8                Main                  81.0   81.6
takeEmptySeat#8                Snippets              12.1    9.2
CAF                            Main                   5.1    8.3

Did $! make a difference of 4%? I'm running -hy -hclaunchScripts#8 now.

I ran ./randomplay +RTS -p -hy -hclaunchScripts#8, results at http://
wagerlabs.com/randomplay1.tgz results from -hc -hclaunchScripts#8 at  
http://wagerlabs.com/randomplay2.tgz

COST CENTRE                    MODULE               %time %alloc
launchScripts#8                Main                  92.1   92.0
takeEmptySeat#8                Snippets               4.3    5.1
CAF                            Main                   2.1    1.9

What do the "by type" (-hy) results tell you and how should I proceed?

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow
The result of ./randomplay +RTS -p -hd -hclaunchScripts#8 is at

http://wagerlabs.com/randomplay.hd.ps

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow
Looking at http://wagerlabs.com/randomplay.hd.ps I see closures  
(constructors?) in this order

<Script.Array.sat_s46N>
W8#
I#
<Script.Array.fromIntegral_s453>
<Script.Endian.sat_s1WxM>
:
<Script.Endian.sat_s1WF2>
W16#
<Script.PicklePlus.sat_s38YS>
stg_ap_2_upd_info

This tells me it's something having to do with array code. I'm  
attaching the Script.Array module at the end. This report does not  
tell me who is retaining the data, though.

Looking at http://wagerlabs.com/randomplay.hy.ps I see types ordered  
like this

*
Word8
Int
->*
[]
Char
Word16
TableInfo

What do I make of all these?

This is Script.Array:

--
module Script.Array where

import Data.Array.IO
import Data.Array.Unboxed
import Foreign hiding (newArray)
import Foreign.Ptr

type MutByteArray = IOUArray Int Word8
type ByteArray = UArray Int Word8
type Index = Int

arraySize :: HasBounds a =>  a Int e -> Int
arraySize a = (snd (bounds a)) + 1

emptyByteArray :: Int -> IO MutByteArray
emptyByteArray sz = newArray (0, sz - 1) 0

mkPureArray :: MutByteArray -> IO ByteArray
mkPureArray array = freeze array

copyMArray :: MutByteArray -> Index -> MutByteArray -> Index -> Int -
 > IO ()
copyMArray _ _ _ _ 0 = return ()
copyMArray dest ix src src_ix n =
     do e <- readArray src src_ix
        writeArray dest ix e
        copyMArray dest (ix + 1) src (src_ix + 1) (n - 1)

copyIArray :: MutByteArray -> Index -> ByteArray -> Index -> Int ->  
IO ()
copyIArray _ _ _ _ 0 = return ()
copyIArray dest ix src src_ix n =
     do let e = src ! src_ix
        writeArray dest ix e
        copyIArray dest (ix + 1) src (src_ix + 1) (n - 1)

readBits :: forall a.(Num a, Bits a) => MutByteArray -> Index -> IO a
readBits array ix =
     readBits' array ix bitsize 0
         where bitsize = bitSize (undefined :: a)
               readBits' _ _ 0 acc = return acc
               readBits' array ix count acc =
                   do e <- readArray array ix
                      let e' = (fromIntegral e) `shiftL` (count - 8)
                      readBits' array (ix + 1) (count - 8) (acc + e')

writeBits :: (Integral a, Bits a) => MutByteArray -> Index -> a -> IO ()
writeBits array ix a =
     writeBits' array ix (bitSize a)
         where writeBits' _ _ 0 = return ()
               writeBits' array ix count =
                   do let mask = 0xff `shiftL` (count - 8)
                          a' = (a .&. mask) `shiftR` (count - 8)
                          a'' = fromIntegral a'
                      writeArray array ix a''
                      writeBits' array (ix + 1) (count - 8)

withByteArray :: ByteArray -> (Ptr Word8 -> IO a) -> IO a
withByteArray array fun =
     do let size = arraySize array
        allocaBytes size $ \ptr ->
            do copyBytes ptr array 0 size
               fun ptr
            where copyBytes _ _ _ 0 = return ()
                  copyBytes ptr arr ix sz =
                      do poke ptr (arr ! ix)
                         copyBytes (advancePtr ptr 1) arr (ix + 1)  
(sz - 1)

byteArrayFromPtr :: Ptr Word8 -> Int -> IO MutByteArray
byteArrayFromPtr ptr sz =
     do array <- emptyByteArray sz
        copyBytes array ptr 0 sz
        return array
            where copyBytes _ _ _ 0 = return ()
                  copyBytes array ptr ix n =
                      do e <- peek ptr
                         writeArray array ix e
                         copyBytes array (advancePtr ptr 1) (ix + 1)  
(n - 1)

instance Show MutByteArray where
     show a = show $ unsafePerformIO $ getElems a

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

RE: Battling laziness

Simon Marlow
In reply to this post by Joel Reymont
On 16 December 2005 15:23, Joel Reymont wrote:

> Looking at http://wagerlabs.com/randomplay.hd.ps I see closures
> (constructors?) in this order
>
> <Script.Array.sat_s46N>
> W8#
> I#
> <Script.Array.fromIntegral_s453>
> <Script.Endian.sat_s1WxM>
>>
> <Script.Endian.sat_s1WF2>
> W16#
> <Script.PicklePlus.sat_s38YS>
> stg_ap_2_upd_info

Ok, so your heap is mainly full of (a) thunks generated by something in
Script.Array, (b) Word8s, and (c) Ints.

> This tells me it's something having to do with array code. I'm
> attaching the Script.Array module at the end. This report does not
> tell me who is retaining the data, though.
>
> Looking at http://wagerlabs.com/randomplay.hy.ps I see types ordered
> like this
>
> *
> Word8
> Int
> ->*
> []
> Char
> Word16
> TableInfo

interesting... Word8 and Int correspond to the -hd output above, but '*'
indicates that the type of the <Script.Array.sat_s46N> is polymorphic.
Completely polymorphic closures like this are usually (error
"something"), which is a silly thing to fill up your heap with :-)

I'm a bit mystified though, because looking at the code for
Script.Array, all your arrays are unboxed, so I don't know where all the
Word8s and Ints are coming from.  It might be useful to do "+RTS
-hyWord8 -hc" to see who generated the Word8s.  Oh, and it looks like
you aren't doing -auto-all, that would probably be helpful.

Cheers,
        Simon
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
On Dec 16, 2005, at 3:47 PM, Simon Marlow wrote:

> interesting... Word8 and Int correspond to the -hd output above,  
> but '*'
> indicates that the type of the <Script.Array.sat_s46N> is polymorphic.
> Completely polymorphic closures like this are usually (error
> "something"), which is a silly thing to fill up your heap with :-)

Hmm... I'm attaching the pickling code that I use at the end,  
together with a sample of how I use it to pickle/unpickle SrvServerInfo.

> I'm a bit mystified though, because looking at the code for
> Script.Array, all your arrays are unboxed, so I don't know where  
> all the
> Word8s and Ints are coming from.  It might be useful to do "+RTS
> -hyWord8 -hc" to see who generated the Word8s.

I will do it. Why bother with Word8, though? Shouldn't I be looking  
for the polymorphic closures instead?

>   Oh, and it looks like
> you aren't doing -auto-all, that would probably be helpful.

I compile like this:

ghc -O --make -prof -auto-all randomplay.hs -o randomplay -lssl -
lcrypto -lz

and run like this:

./randomplay +RTS -p -hd -hclaunchScripts#8

Did I miss -auto-all somewhere?

I have Cabal 1.1.4 and I give configure the -p option which builds  
the profiled libraries for me. Do I need to separately give -auto-all  
to the compiler below

ghc-options: -fglasgow-exts -Wall -threaded -fno-warn-name-shadowing

        Thanks, Joel

----
{-# OPTIONS_GHC -fglasgow-exts -fth #-}
module Script.Pickle where

import Data.Word
import Data.Int
import Data.Bits
import Data.Char
import Data.Maybe
import Data.Array.MArray
import Script.Array
import Control.Monad

data PU a = PU
     {
      appP :: MutByteArray -> Index -> a -> IO Index,
      appU :: MutByteArray -> Index -> IO (a, Index),
      appS :: a -> IO Int
     }

pickle :: PU a -> MutByteArray -> Index -> a -> IO Index
pickle p array ix value = appP p array ix value

unpickle :: PU a -> MutByteArray -> Index -> IO (a, Index)
unpickle p array ix = appU p array ix

sizeup :: PU a -> a -> IO Int
sizeup p value = appS p value

lift :: a -> PU a
lift x = PU (\_ ix _ -> return ix) (\_ ix -> return (x, ix)) (\_ ->  
return 0)

sequ :: (b -> a) -> PU a -> (a -> PU b) -> PU b
sequ f pa k = PU
               (\array ix b ->
                    do let a = f b
                           pb = k a
                       ix1 <- appP pa array ix a
                       appP pb array ix1 b)
               (\array ix ->
                    do (a, ix1) <- appU pa array ix
                       let pb = k a
                       appU pb array ix1)
               (\b ->
                    do let a = f b
                           pb = k a
                       sz1 <- appS pa a
                       sz2 <- appS pb b
                       return $ sz1 + sz2)

pair :: PU a -> PU b -> PU (a,b)
pair pa pb = sequ fst pa (\ a -> sequ snd pb
                           (\ b -> lift (a, b)))

triple :: PU a -> PU b -> PU c -> PU (a, b, c)
triple pa pb pc = sequ (\ (x, _, _) -> x) pa
                   (\a -> sequ (\ (_, y, _) -> y) pb
                    (\b -> sequ (\ (_, _, z) -> z) pc
                     (\c -> lift (a, b, c))))

quad :: PU a -> PU b -> PU c -> PU d -> PU (a, b, c, d)
quad pa pb pc pd = sequ (\ (w, _, _, _) -> w) pa
                    (\a -> sequ (\ (_, x, _, _) -> x) pb
                     (\b -> sequ (\ (_, _, y, _) -> y) pc
                      (\c -> sequ (\ (_, _, _, z) -> z) pd
                       (\d -> lift (a, b, c, d)))))

wrap :: (a -> b, b -> a) -> PU a -> PU b
wrap (i, j) pa = sequ j pa (lift . i)

unit :: PU ()
unit = lift ()

{-# SPECIALIZE num :: PU Word8 #-}
{-# SPECIALIZE num :: PU Word16 #-}
{-# SPECIALIZE num :: PU Word32 #-}
{-# SPECIALIZE num :: PU Word64 #-}
{-# SPECIALIZE num :: PU Int16 #-}
{-# SPECIALIZE num :: PU Int32 #-}

num :: (Integral a, Bits a) => PU a
num = PU appP_num appU_num (return . byteSize)

char :: PU Char
char = wrap (fromByte, toByte) num

bool :: PU Bool
bool = wrap (toenum, fromenum) byte

enum :: (Integral a, Bits a, Enum b) => PU a -> PU b
enum pa = wrap (toenum, fromenum) pa

byte :: PU Word8
byte = num

short :: PU Word16
short = num

uint :: PU Word32
uint = num

fixlist :: PU a -> Int -> PU [a]
fixlist _ 0 = lift []
fixlist pa n = wrap (\(a, b) -> a : b,
                      \(a : b) -> (a, b))
                (pair pa (fixlist pa (n - 1)))

list :: (Integral a, Bits a) => PU a -> PU b -> PU [b]
list pa pb = sequ (fromIntegral . length) pa (\a -> fixlist pb  
(fromIntegral a))

alt :: (a -> Word8) -> [PU a] -> PU a
alt tag ps = sequ tag byte (((!!) ps) . fromIntegral)

optional :: PU a -> PU (Maybe a)
optional pa = alt tag [lift Nothing, wrap (Just, fromJust) pa]
     where tag Nothing = 0; tag (Just _) = 1

chunk :: Integral a => PU a -> PU ByteArray
chunk pa = sequ
            (fromIntegral . (+ 1) . snd . bounds)
            pa
            (\a -> bytearray $ fromIntegral a)

bytearray :: Int -> PU ByteArray
bytearray sz = PU
                (\array ix a ->
                     do let count = (snd $ bounds a) + 1
                        copyIArray array ix a 0 count
                        return $ ix + sz)
                (\array ix ->
                     do new <- emptyByteArray sz
                        copyMArray new 0 array ix sz
                        pure <- freeze new
                        return (pure, ix + sz))
                (\a -> return $ (snd $ bounds a) + 1)

--- Basic implementation

byteSize :: forall a.(Num a, Bits a) => a -> Int
byteSize a = bitSize a `div` 8

appP_num :: (Num a, Integral a, Bits a) => MutByteArray -> Index -> a  
-> IO Index
appP_num array ix a =
     do writeBits array ix a
        return $ ix + byteSize a

appU_num :: (Num a, Integral a, Bits a) => MutByteArray -> Index ->  
IO (a, Index)
appU_num array ix =
     do a <- readBits array ix
        return (a, ix + byteSize a)

--- Utility

toenum :: forall a b.(Enum a, Integral b) => b -> a
toenum = toEnum . fromIntegral

fromenum :: forall b a. (Num b, Enum a) => a -> b
fromenum = fromIntegral . fromEnum

fromByte :: Enum a => Word8 -> a
fromByte = toEnum . fromIntegral

toByte ::  Enum a => a -> Word8
toByte = fromIntegral . fromEnum

And I use it like this:

puTableInfo :: PU TableInfo
puTableInfo =
     sequ tiAvgPot endian64
          (\a -> sequ tiNumPlayers endian16
           (\b -> sequ tiWaiting endian16
            (\c -> sequ tiPlayersFlop byte
             (\d -> sequ tiTableName wstring
              (\e -> sequ tiTableID endian32
               (\f -> sequ tiGameType (enum endian16 :: PU GameType)
                (\g -> sequ tiInfoMaxPlayers endian16
                 (\h -> sequ tiIsRealMoneyTable bool
                  (\i -> sequ tiLowBet endian64
                   (\j -> sequ tiHighBet endian64
                    (\k -> sequ tiMinStartMoney endian64
                     (\l -> sequ tiMaxStartMoney endian64
                      (\m -> sequ tiGamesPerHour endian16
                       (\n -> sequ tiTourType (enum byte)
                        (\o -> sequ tiTourID endian32
                         (\p -> sequ tiBetType (enum byte)
                          (\q -> sequ tiCantReturnLess endian32
                           (\r -> sequ tiAffiliateID (list endian32  
byte)
                            (\v -> sequ tiLangID endian32
                             (\w -> lift $
                                    TableInfo a b c d e f g
                                              h i j k l m n
                                                o p q r v w
                                ))))))))))))))))))))

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow

On Dec 16, 2005, at 3:47 PM, Simon Marlow wrote:

> Ok, so your heap is mainly full of (a) thunks generated by  
> something in
> Script.Array, (b) Word8s, and (c) Ints.

Would it be worth investigaiting who is holding on to them?

> interesting... Word8 and Int correspond to the -hd output above,  
> but '*'
> indicates that the type of the <Script.Array.sat_s46N> is polymorphic.
> Completely polymorphic closures like this are usually (error
> "something"), which is a silly thing to fill up your heap with :-)

So what do I do then? If I add cost center annotations to  
Script.Array, will they show up in the -hd report?

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow

On Dec 16, 2005, at 3:47 PM, Simon Marlow wrote:

> I'm a bit mystified though, because looking at the code for
> Script.Array, all your arrays are unboxed, so I don't know where  
> all the
> Word8s and Ints are coming from.  It might be useful to do "+RTS
> -hyWord8 -hc" to see who generated the Word8s.

Done. http://wagerlabs.com/randomplay.word8.ps

        {-# SCC "launchScripts#8" #-}launch host $! script (bot, bot,  
affid)

The xx, xx, are Word8. affiliateIDs is all Word8 and looks like this:

affiliateIDs = [ [xx,xx,xx,xx,xx,xx,xx],
                  99 more like the above ]

I guess the whole affid list of lists is being pulled into script?  
How do I prevent this?

-----
launchScripts  :: Int
                -> NamePick
                -> TMVar (ClockTime, (Event CustomEvent))
                -> IO ()
launchScripts 0 _ _ = return ()
launchScripts n pick mbx =
     do n' <- case pick of
                Random -> {-# SCC "launchScripts#1" #-}liftIO $  
randomRIO (0, 8500)
                Straight -> {-# SCC "launchScripts#2" #-}return n
        let botnum = {-# SCC "launchScripts#3" #-}firstbot + n'
            bot = {-# SCC "launchScripts#4" #-}"m" ++ show botnum
            cell = {-# SCC "launchScripts#5" #-}botnum `mod` 100 - 1
            affid = {-# SCC "launchScripts#6" #-}if cell == -1
                       then [xx,xx,xx,xx,xx,xx,xx]
                       else affiliateIDs !! cell
        {-# SCC "launchScripts#7" #-}trace_ $ "Launching bot..." ++  
show n
        {-# SCC "launchScripts#8" #-}launch host $! script (bot, bot,  
affid)
        {-# SCC "launchScripts#9" #-}liftIO $ sleep_ 1000
        -- quit if we have been told to
        empty <- {-# SCC "launchScripts#10" #-}atomically $  
isEmptyTMVar mbx
        {-# SCC "launchScripts#11" #-}unless empty $ trace_  
"launchScripts: Done, exiting"
        {-# SCC "launchScripts#12" #-}when empty $ launchScripts (n -  
1) pick mbx


--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow
Most of the samples in randomplay.hp look like this:

BEGIN_SAMPLE 1.76
(170)Script.Array.CAF   8
(154)Script.CmdType.CAF 64
(165)Script.PickleCmd.CAF       760
(197)Script.PokerClient.CAF     8
(156)Script.Command.CAF 24
(282)Main.CAF   285752
(163)Script.Pickle.CAF  16
(311)/launchScripts#8/laun...   93464
END_SAMPLE 1.76

I'm pickling to/from unboxed arrays of Word8

type MutByteArray = IOUArray Int Word8
type ByteArray = UArray Int Word8
type Index = Int

CmdType is (Word8, Word8) that tells me what pickler to use.

PickleCmd looks like this:

puCommand :: (Word8, Word8) -> PU Command

puCommand (116, 2) =
     sequ tableID endian32
          (\a -> sequ password wstring
           (\b -> sequ localIP wstring
            (\c -> sequ affiliateID (list endian32 byte)
             (\d -> lift $
                    ClConnectGame a b c d
              ))))

puCommand (36, 1) =
...

Command has about 250 constructors for the different records that can  
be send/received. These records can be somewhat nested and have lists  
of other records inside them. Like SrvServerInfo. Could this be where  
the polymorphism is coming from, i.e. the "*" are my Commands that  
are being unpickled? Fields in command all have strictness  
annotations, btw.

        Thanks, Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Battling laziness

Joel Reymont
In reply to this post by Simon Marlow
On Dec 16, 2005, at 3:47 PM, Simon Marlow wrote:

> Oh, and it looks like
> you aren't doing -auto-all, that would probably be helpful.

Apparently, when you give -p to configure (with Cabal 1.1+) it does  
add -prof but does not add -auto-all. I added this to my cabal file  
and my profiling suddenly bloomed! Now I really have something to  
chew on!

COST CENTRE                    MODULE               %time %alloc
byteArrayFromPtr               Script.Array          34.1   34.7
readBits                       Script.Array          32.3   36.2
appU_endian                    Script.Endian          5.7    3.2
sequ                           Script.Pickle          5.3    3.7
emptyByteArray                 Script.Array           5.3    4.5
appU_num                       Script.Pickle          3.6    4.0
copyMArray                     Script.Array           2.4    2.7
bytearray                      Script.Pickle          1.9    2.6
appU_wstr                      Script.Endian          1.7    0.8
withByteArray                  Script.Array           1.4    1.7
byteSize                       Script.Pickle          1.1    0.9
puTableInfo                    Script.PicklePlus      0.6    1.3

It makes me wonder how I managed to convert pickling to mutable  
arrays from [Word8] without complete profiling info! The memory hogs  
are at http://wagerlabs.com/randomplay.autohc.ps

        Joel

--
http://wagerlabs.com/





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe