Noobie attempt to process log output into dependency graph

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Noobie attempt to process log output into dependency graph

John Lusk-2
Hi, all,

Here's my question:

I thought, for grins, I'd try to turn some log output into a dependency graph (using GraphViz's dot(1)). I'm having difficulty forcing my stateful paradigm into a functional one, so I need some help.

If I was to do this with an imperative (stateful) language, I'd build a set of edges (or a map to a frequency count, really, since I'll use freq > 1 to add some output text noting the repeated occurrences), and then dump out the set elements to a text file that would look something like this fragment:

a -> q
q -> d
d -> e [color=red]
d -> f [color=red

My big problem now is that if I process a subtree that looks like:

a
  b
    c
    d
  b
    d
    e

my current plan is to proces the first b-c-d subtree and then process the b-d-e subtree, BUT I need to pass the updated edge set to the second processing call, which is pretty stateful.

Do I need to just bite the bullet and find some succinct way to do that, or is my entire approach just wrong, stuck in my stateful mindset?

My (awful) code looks like this:

-- Emit to stdout a series of dot(1) edges specifying dependencies.
-- "A -> B" means "A depends on B".
--
-- Build with 'ghc dependency-graph.hs'
-- 
-- Input is a text file containing lines as follows:
--      (some indentation) (some extraneous text) (file-A) in (some directory)
--          (some extra indentation) (some extraneous text) (file-B) in (some directory)
--      (some indentation matching the first line above) (some extraneous text) (file-C) in (some directory)
--
-- This means that file-A depends on file-B, but neither file-A nor file-B depend on file-C.
--
-- Sample:
--    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.ViewModel.dll  in C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\
--    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Adding C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\SXA.Compass.Config.ViewModel.dll (IsPresent=true)        to assemblyList at beginning of GetAssemblyListEx()
--      Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.Utils.dll    in C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\
--
-- (Need to skip the line containing "Adding", and only process the ones containing "Processing".)
-- 
-- Algorithm:
--      Read first line, parse, remember indentation
--      Repeat for other lines, but if indentation increases, store pair A -> B in hashset.
--      At end, dump out hashset.

-- import Debug.Trace
-- import System.Environment
-- import System.Console.GetOpt
-- import Data.Maybe (fromMaybe)
-- import Data.List.Split
import Prelude -- hiding (readFile) -- Because we want the System.IO.Strict version
-- import System.IO (hPutStr, hPutStrLn, stderr)
-- import System.IO.Strict
-- import Control.Monad
-- import System.Directory
-- import System.FilePath
import Text.Regex.TDFA
-- import Text.Regex.TDFA.String
-- import Text.Printf

-- import qualified Data.Map.Lazy as Map
import qualified Data.Map.Strict as Map

---------------------------------------------------------------- Test Data
l1 = "    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.ViewModel.dll\tin C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"
l2 = "    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Adding C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\SXA.Compass.Config.ViewModel.dll\t(IsPresent=true)\tto assemblyList at beginning of GetAssemblyListEx()"
l3 = "      Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.Utils.dll\tin C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"
---------------------------------------------------------------- Test Data Ends
-- See http://stackoverflow.com/q/32149354/370611
-- toRegex = makeRegexOpts defaultCompOpt{multiline=False} defaultExecOpt

-- Escape parens?
-- initialFillerRegex :: String
-- initialFillerRegex = "Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList\\(\\) Information: 0 : Processing"

-- Regex matching (marking) a line to be processed
-- valuableLineRegex :: String
-- valuableLineRegex = "\\bProcessing\\b"

-- |Regex matching line to be parsed
parseLineRegex :: String
parseLineRegex = "^(.* Information: 0 : Processing )([^ ]*)[ \t]+in (.*)" -- 3subexpressions

main :: IO()
main = do
  logContents <- getContents
  putStrLn $ unlines $ fst $ edges (parseIndent $ lines logContents) Map.empty

----------------------------------------------------------------
-- |Parses out the leading indentation of the given String into a string of spaces and the rest of the line
parseIndent :: String -> (String,String)
parseIndent s = ((fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 0,
                 (fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 1)

----------------------------------------------------------------
-- |Returns a list of strings describing edges in the form "a -> b /* comment */"
edges :: 
  [(String,String)]             -- ^ Input tuples: (indent, restOfString)
  -> Map.Map String Int -- ^ Map of edges in form "a -> b" with a count of the number of times that edge occurs
  -> [String]           -- ^ Output list of edge descriptions in form "a -> b optionalExtraText"

edges [] edgeSet =
  (edgeDump $ Map.assocs edgeSet, 0)

edges (lastLine:[]) edgeSet =
  (edgeDump $ Map.assocs edgeSet, 1)

edges (fstLogLine:sndLogLine:[]) edgeSet =
  let fstFields = (snd fstLogLine) =~ parseLineRegex :: (String,String,String,[String])
      sndFields = (snd sndLogLine) =~ parseLineRegex :: (String,String,String,[String])
  in
    if length (fourth fstFields) == 0
    then error ("Unmatched: " ++ (first fstFields)) -- First line must always match
    else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
    then edges (fstLogLine:[]) edgeSet -- Skip useless line
    else if indentLength fstLogLine >= indentLength sndLogLine
    then edges (sndLogLine:[]) edgeSet -- Can't be an edge from first to second line; drop first line and keep going.
    else edges (sndLogLine:[])
         (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)

edges (fstLogLine:sndLogLine:thdLogLine:logLines) edgeSet =
  let fstFields = (snd fstLogLine) =~ parseLineRegex :: (String,String,String,[String])
      sndFields = (snd sndLogLine) =~ parseLineRegex :: (String,String,String,[String])
      thdFields = (snd thdLogLine) =~ parseLineRegex :: (String,String,String,[String])
  in
    if length (fourth fstFields) == 0
    then error ("Unmatched: " ++ (first fstFields)) -- First line must always match

    else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
    then edges (fstLogLine:thdLogLine:logLines) edgeSet -- Skip useless line

    else if indentLength fstLogLine >= indentLength sndLogLine
    then []                     -- Stop processing at outdent

    else
      -- Looking one of:
      --       1
      --          2 -- process 1 -> 2, then process 2.. as subtree
      --             3 -- Need to process as subtree rooted at 2, then drop subtree (zero or more lines at same level as 3)
      -- or
      --       1
      --          2 -- processs, then drop this line (process 2.. as empty subtree?)
      --          3
      -- or
      --       1
      --          2 -- process, then drop this line (drop entire subtree rooted at 1) (same as above, drop empty subtree? (2))
      --       3
      -- or
      --       1
      --          2 -- same as above? Drop empty subtree rooted at 2
      --    3
      edges (sndLogLine:thdLogLine:logLines) (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1) -- now what? I need to pass the UPDATED edgeSet on to the next call, after the subtree rooted at 2 is dropped.



    then edges (sndLogLine:logLines) edgeSet -- Can't be an edge from first to second line; drop first line and keep going.
    else edges (sndLogLine:(takeWhile (increasingIndent $ length $ fst fstLogLine) logLines))
         (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)
    else ((fst $ edges (sndLogLine:logLines) edgeSet)
           ++ (fst $ edges (fstLogLine:(drop
                                        (snd $ edges (sndLogLine:logLines) edgeSet) -- # of lines processed
                                        logLines)) edgeSet),
          (snd $ edges (sndLogLine:logLines) edgeSet)
          + (snd $ edges (fstLogLine:(drop
                                      (snd $ edges (sndLogLine:logLines) edgeSet) -- # of lines processed
                                      logLines)) edgeSet)
         )

----------------------------------------------------------------
fullname :: (String,String,String,[String]) -> String
fullname (_,_,_,[_,fileName,directoryName]) = directoryName ++ fileName

----------------------------------------------------------------
-- |Edges from the first line to all following lines
edgesFrom :: String             -- ^ First line
  -> [String]                   -- ^ Following lines
  -> Map.Map String Int         -- ^ Set of edges built so far
  -> [String]
edgesFrom a b c = []

----------------------------------------------------------------
-- |Return length of indent or error
indentLength :: (String,String,String,[String]) -- ^ Regex match context
  -> Int                                        -- ^ Length of indent
indentLength (prefix,_,_,[]) = error $ "Not matched: " ++ prefix
indentLength (_,_,_,subexprs) =
  length $ subexprs !! 0

----------------------------------------------------------------
-- |Returns a list of edges, possibly with comments indicating occurrence counts > 1
edgeDump :: [(String,Int)]     -- ^ List of (edge,count) tuples
  -> [String]                  -- ^ List of edges, possibly w/comments
edgeDump [] = []
edgeDump ((edge,count):rest)
  | count <= 1  = edge:(edgeDump rest)
  | otherwise   = (edge ++ " /* " ++ (show count) ++ " occurrences */"):(edgeDump rest)

----------------------------------------------------------------
first :: (a,b,c,d) -> a
first (x,_,_,_) = x

fourth :: (a,b,c,d) -> d
fourth (_,_,_,x) = x



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

John Lusk-2
(Or you could find it here: https://github.com/JohnL4/DependencyGraph)

On Wed, Dec 14, 2016 at 6:26 PM, John Lusk <[hidden email]> wrote:
Hi, all,

Here's my question:

I thought, for grins, I'd try to turn some log output into a dependency graph (using GraphViz's dot(1)). I'm having difficulty forcing my stateful paradigm into a functional one, so I need some help.

If I was to do this with an imperative (stateful) language, I'd build a set of edges (or a map to a frequency count, really, since I'll use freq > 1 to add some output text noting the repeated occurrences), and then dump out the set elements to a text file that would look something like this fragment:

a -> q
q -> d
d -> e [color=red]
d -> f [color=red

My big problem now is that if I process a subtree that looks like:

a
  b
    c
    d
  b
    d
    e

my current plan is to proces the first b-c-d subtree and then process the b-d-e subtree, BUT I need to pass the updated edge set to the second processing call, which is pretty stateful.

Do I need to just bite the bullet and find some succinct way to do that, or is my entire approach just wrong, stuck in my stateful mindset?

My (awful) code looks like this:

-- Emit to stdout a series of dot(1) edges specifying dependencies.
-- "A -> B" means "A depends on B".
--
-- Build with 'ghc dependency-graph.hs'
-- 
-- Input is a text file containing lines as follows:
--      (some indentation) (some extraneous text) (file-A) in (some directory)
--          (some extra indentation) (some extraneous text) (file-B) in (some directory)
--      (some indentation matching the first line above) (some extraneous text) (file-C) in (some directory)
--
-- This means that file-A depends on file-B, but neither file-A nor file-B depend on file-C.
--
-- Sample:
--    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.ViewModel.dll  in C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\
--    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Adding C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\SXA.Compass.Config.ViewModel.dll (IsPresent=true)        to assemblyList at beginning of GetAssemblyListEx()
--      Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.Utils.dll    in C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\
--
-- (Need to skip the line containing "Adding", and only process the ones containing "Processing".)
-- 
-- Algorithm:
--      Read first line, parse, remember indentation
--      Repeat for other lines, but if indentation increases, store pair A -> B in hashset.
--      At end, dump out hashset.

-- import Debug.Trace
-- import System.Environment
-- import System.Console.GetOpt
-- import Data.Maybe (fromMaybe)
-- import Data.List.Split
import Prelude -- hiding (readFile) -- Because we want the System.IO.Strict version
-- import System.IO (hPutStr, hPutStrLn, stderr)
-- import System.IO.Strict
-- import Control.Monad
-- import System.Directory
-- import System.FilePath
import Text.Regex.TDFA
-- import Text.Regex.TDFA.String
-- import Text.Printf

-- import qualified Data.Map.Lazy as Map
import qualified Data.Map.Strict as Map

---------------------------------------------------------------- Test Data
l1 = "    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.ViewModel.dll\tin C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"
l2 = "    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Adding C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\SXA.Compass.Config.ViewModel.dll\t(IsPresent=true)\tto assemblyList at beginning of GetAssemblyListEx()"
l3 = "      Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.Utils.dll\tin C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"
---------------------------------------------------------------- Test Data Ends
-- See http://stackoverflow.com/q/32149354/370611
-- toRegex = makeRegexOpts defaultCompOpt{multiline=False} defaultExecOpt

-- Escape parens?
-- initialFillerRegex :: String
-- initialFillerRegex = "Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList\\(\\) Information: 0 : Processing"

-- Regex matching (marking) a line to be processed
-- valuableLineRegex :: String
-- valuableLineRegex = "\\bProcessing\\b"

-- |Regex matching line to be parsed
parseLineRegex :: String
parseLineRegex = "^(.* Information: 0 : Processing )([^ ]*)[ \t]+in (.*)" -- 3subexpressions

main :: IO()
main = do
  logContents <- getContents
  putStrLn $ unlines $ fst $ edges (parseIndent $ lines logContents) Map.empty

----------------------------------------------------------------
-- |Parses out the leading indentation of the given String into a string of spaces and the rest of the line
parseIndent :: String -> (String,String)
parseIndent s = ((fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 0,
                 (fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 1)

----------------------------------------------------------------
-- |Returns a list of strings describing edges in the form "a -> b /* comment */"
edges :: 
  [(String,String)]             -- ^ Input tuples: (indent, restOfString)
  -> Map.Map String Int -- ^ Map of edges in form "a -> b" with a count of the number of times that edge occurs
  -> [String]           -- ^ Output list of edge descriptions in form "a -> b optionalExtraText"

edges [] edgeSet =
  (edgeDump $ Map.assocs edgeSet, 0)

edges (lastLine:[]) edgeSet =
  (edgeDump $ Map.assocs edgeSet, 1)

edges (fstLogLine:sndLogLine:[]) edgeSet =
  let fstFields = (snd fstLogLine) =~ parseLineRegex :: (String,String,String,[String])
      sndFields = (snd sndLogLine) =~ parseLineRegex :: (String,String,String,[String])
  in
    if length (fourth fstFields) == 0
    then error ("Unmatched: " ++ (first fstFields)) -- First line must always match
    else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
    then edges (fstLogLine:[]) edgeSet -- Skip useless line
    else if indentLength fstLogLine >= indentLength sndLogLine
    then edges (sndLogLine:[]) edgeSet -- Can't be an edge from first to second line; drop first line and keep going.
    else edges (sndLogLine:[])
         (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)

edges (fstLogLine:sndLogLine:thdLogLine:logLines) edgeSet =
  let fstFields = (snd fstLogLine) =~ parseLineRegex :: (String,String,String,[String])
      sndFields = (snd sndLogLine) =~ parseLineRegex :: (String,String,String,[String])
      thdFields = (snd thdLogLine) =~ parseLineRegex :: (String,String,String,[String])
  in
    if length (fourth fstFields) == 0
    then error ("Unmatched: " ++ (first fstFields)) -- First line must always match

    else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
    then edges (fstLogLine:thdLogLine:logLines) edgeSet -- Skip useless line

    else if indentLength fstLogLine >= indentLength sndLogLine
    then []                     -- Stop processing at outdent

    else
      -- Looking one of:
      --       1
      --          2 -- process 1 -> 2, then process 2.. as subtree
      --             3 -- Need to process as subtree rooted at 2, then drop subtree (zero or more lines at same level as 3)
      -- or
      --       1
      --          2 -- processs, then drop this line (process 2.. as empty subtree?)
      --          3
      -- or
      --       1
      --          2 -- process, then drop this line (drop entire subtree rooted at 1) (same as above, drop empty subtree? (2))
      --       3
      -- or
      --       1
      --          2 -- same as above? Drop empty subtree rooted at 2
      --    3
      edges (sndLogLine:thdLogLine:logLines) (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1) -- now what? I need to pass the UPDATED edgeSet on to the next call, after the subtree rooted at 2 is dropped.



    then edges (sndLogLine:logLines) edgeSet -- Can't be an edge from first to second line; drop first line and keep going.
    else edges (sndLogLine:(takeWhile (increasingIndent $ length $ fst fstLogLine) logLines))
         (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)
    else ((fst $ edges (sndLogLine:logLines) edgeSet)
           ++ (fst $ edges (fstLogLine:(drop
                                        (snd $ edges (sndLogLine:logLines) edgeSet) -- # of lines processed
                                        logLines)) edgeSet),
          (snd $ edges (sndLogLine:logLines) edgeSet)
          + (snd $ edges (fstLogLine:(drop
                                      (snd $ edges (sndLogLine:logLines) edgeSet) -- # of lines processed
                                      logLines)) edgeSet)
         )

----------------------------------------------------------------
fullname :: (String,String,String,[String]) -> String
fullname (_,_,_,[_,fileName,directoryName]) = directoryName ++ fileName

----------------------------------------------------------------
-- |Edges from the first line to all following lines
edgesFrom :: String             -- ^ First line
  -> [String]                   -- ^ Following lines
  -> Map.Map String Int         -- ^ Set of edges built so far
  -> [String]
edgesFrom a b c = []

----------------------------------------------------------------
-- |Return length of indent or error
indentLength :: (String,String,String,[String]) -- ^ Regex match context
  -> Int                                        -- ^ Length of indent
indentLength (prefix,_,_,[]) = error $ "Not matched: " ++ prefix
indentLength (_,_,_,subexprs) =
  length $ subexprs !! 0

----------------------------------------------------------------
-- |Returns a list of edges, possibly with comments indicating occurrence counts > 1
edgeDump :: [(String,Int)]     -- ^ List of (edge,count) tuples
  -> [String]                  -- ^ List of edges, possibly w/comments
edgeDump [] = []
edgeDump ((edge,count):rest)
  | count <= 1  = edge:(edgeDump rest)
  | otherwise   = (edge ++ " /* " ++ (show count) ++ " occurrences */"):(edgeDump rest)

----------------------------------------------------------------
first :: (a,b,c,d) -> a
first (x,_,_,_) = x

fourth :: (a,b,c,d) -> d
fourth (_,_,_,x) = x




_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

Magnus Therning
In reply to this post by John Lusk-2

John Lusk <[hidden email]> writes:

> Hi, all,
[.. cut ..]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

If I understand you correctly you want to parse a set of lines and keep
track of indentation. This is not entirely unlike parsing a programming
language where indentation is significant, like Haskell :) Is that
correct?

A quick look at Hackage gives several libs with combinators dealing with
indentaion-aware parsers. Have you looked at any of them?

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
     — R.P. Feynman

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

signature.asc (847 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

John Lusk
I have not, but I might. This was a little work project that I've now run out of time for.

I was really hoping for a deeper discussion of state management than "just use this package." This seems kind of like receiving a stream of inputs from a user and needing to keep track of several items of state that are changing independently (as opposed to the neat problems usually used in basic FP education).

Should I be taking a more monadic approach?

On Thu, Dec 15, 2016 at 5:17 AM, Magnus Therning <[hidden email]> wrote:

John Lusk <[hidden email]> writes:

> Hi, all,
[.. cut ..]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

If I understand you correctly you want to parse a set of lines and keep
track of indentation. This is not entirely unlike parsing a programming
language where indentation is significant, like Haskell :) Is that
correct?

A quick look at Hackage gives several libs with combinators dealing with
indentaion-aware parsers. Have you looked at any of them?

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
     — R.P. Feynman

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

Imants Cekusins

A graph library of your choice + State monad would do the trick. 
e.g.:  fgl Data-Graph-Inductive-Monad


Or you could store Graph state in an MVar and work as you would with stateful approach

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

Magnus Therning
In reply to this post by John Lusk

John Lusk <[hidden email]> writes:

> I have not, but I might. This was a little work project that I've now run
> out of time for.
>
> I was really hoping for a deeper discussion of state management than
> "just use this package." This seems kind of like receiving a stream of
> inputs from a user and needing to keep track of several items of state
> that are changing independently (as opposed to the neat problems
> usually used in basic FP education).
>
> Should I be taking a more monadic approach?
Well, we have to start somewhere :)

Anyway, you don't necessarily have to resort to the state monad. I
believe, based you your other code that you quite easily can go from
your list of lines to a list of `(Int, String)`, where the integer
indicates the indentation level. Then you can look at `Data.Tree` (in
containers) and `Data.Tree.Zipper` (in rosezipper) to build your tree.

This is my quick hack:

~~~
buildTree _ zipPos [] = zipPos
buildTree n zipPos xx@((lvl, s):xs)
  | lvl > n =
    let newZipPos = children zipPos
        node = Node s []
    in buildTree lvl (insert node newZipPos) xs
  | lvl == n =
    let newZipPos = nextSpace zipPos
        node = Node s []
    in buildTree lvl (insert node newZipPos) xs
  | lvl < n =
    let (Just newZipPos) = parent zipPos
    in buildTree (n - 1) newZipPos xx
~~~

With the following definitions in place:

~~~
ils = [ (1, "The root")
      , (2, "Child 1")
      , (3, "Child 1.1")
      , (4, "Child 1.1.1")
      , (3, "Child 1.2")
      , (2, "Child 2")
      ]

zipRoot = fromTree $ Node "absolute top" []
~~~

I build the tree, and print it, like this:

~~~
putStrLn $ drawTree $ toTree $ buildTree 0 zipRoot ils
top
|
`- The root
   |
   +- Child 1
   |  |
   |  +- Child 1.1
   |  |  |
   |  |  `- Child 1.1.1
   |  |
   |  `- Child 1.2
   |
   `- Child 2
~~~

Whether this is usable for you depends a lot on how big your logs are, I
suppose.

If this was something that I'd keep around for a while I'd probably
look into rewriting `buildTree` so that it would fit for use with
`mapAccumL`.

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

The British have "the perfect temperament to be hackers—technically
skilled, slightly disrespectful of authority, and just a touch of
criminal behavior".
     — Mary Ann Davidson, Oracle's Security Chief

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

signature.asc (847 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

John Lusk
Thanks, all, that gives me something to chew on.

It occurred to me (during my 45-minute commute to work) that all Haskell programs (listen to the noob <eyeroll/>) have the following structure (modulo my fractured syntax):

main :: IO()
main = do
  inputs <- getInputs
  doOutput $ f inputs initialState

f :: [input] -> state -> outputs

f [] state =
  transformToOutputs state

f (input:inputs) state =
  f inputs (newState state input)

doOutput :: [output] -> IO()

doOutput outputs = do
  putStr $ unlines outputs

So all I have to do is write newState and I'm good! ^_^

(transformToOutputs will, of course, be a snap.)

Right?

John.

On Thu, Dec 15, 2016 at 2:38 PM, Magnus Therning <[hidden email]> wrote:

John Lusk <[hidden email]> writes:

> I have not, but I might. This was a little work project that I've now run
> out of time for.
>
> I was really hoping for a deeper discussion of state management than
> "just use this package." This seems kind of like receiving a stream of
> inputs from a user and needing to keep track of several items of state
> that are changing independently (as opposed to the neat problems
> usually used in basic FP education).
>
> Should I be taking a more monadic approach?

Well, we have to start somewhere :)

Anyway, you don't necessarily have to resort to the state monad. I
believe, based you your other code that you quite easily can go from
your list of lines to a list of `(Int, String)`, where the integer
indicates the indentation level. Then you can look at `Data.Tree` (in
containers) and `Data.Tree.Zipper` (in rosezipper) to build your tree.

This is my quick hack:

~~~
buildTree _ zipPos [] = zipPos
buildTree n zipPos xx@((lvl, s):xs)
  | lvl > n =
    let newZipPos = children zipPos
        node = Node s []
    in buildTree lvl (insert node newZipPos) xs
  | lvl == n =
    let newZipPos = nextSpace zipPos
        node = Node s []
    in buildTree lvl (insert node newZipPos) xs
  | lvl < n =
    let (Just newZipPos) = parent zipPos
    in buildTree (n - 1) newZipPos xx
~~~

With the following definitions in place:

~~~
ils = [ (1, "The root")
      , (2, "Child 1")
      , (3, "Child 1.1")
      , (4, "Child 1.1.1")
      , (3, "Child 1.2")
      , (2, "Child 2")
      ]

zipRoot = fromTree $ Node "absolute top" []
~~~

I build the tree, and print it, like this:

~~~
putStrLn $ drawTree $ toTree $ buildTree 0 zipRoot ils
top
|
`- The root
   |
   +- Child 1
   |  |
   |  +- Child 1.1
   |  |  |
   |  |  `- Child 1.1.1
   |  |
   |  `- Child 1.2
   |
   `- Child 2
~~~

Whether this is usable for you depends a lot on how big your logs are, I
suppose.

If this was something that I'd keep around for a while I'd probably
look into rewriting `buildTree` so that it would fit for use with
`mapAccumL`.

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

The British have "the perfect temperament to be hackers—technically
skilled, slightly disrespectful of authority, and just a touch of
criminal behavior".
     — Mary Ann Davidson, Oracle's Security Chief

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

Imants Cekusins
> f :: [input] -> state -> outputs

.. or with state monad m it could be a combination of

f :: [input] -> m outputs

f :: [input] -> outputs  - for intermediate results where state is not R/W


state + IO is not too difficult. here is a very good explanation
I understood the last one - Use liftIO - best

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

John Lusk-2
Thanks!!

John.

On Thu, Dec 15, 2016 at 4:23 PM, Imants Cekusins <[hidden email]> wrote:
> f :: [input] -> state -> outputs

.. or with state monad m it could be a combination of

f :: [input] -> m outputs

f :: [input] -> outputs  - for intermediate results where state is not R/W


state + IO is not too difficult. here is a very good explanation
I understood the last one - Use liftIO - best

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

John Lusk-2
Ha! Fixed! And committed to the GitHub repo mentioned previously, if anybody's interested.

I spent too much time on it, but I couldn't let it go and now I have to brag.

John.

On Thu, Dec 15, 2016 at 4:25 PM, John Lusk <[hidden email]> wrote:
Thanks!!

John.

On Thu, Dec 15, 2016 at 4:23 PM, Imants Cekusins <[hidden email]> wrote:
> f :: [input] -> state -> outputs

.. or with state monad m it could be a combination of

f :: [input] -> m outputs

f :: [input] -> outputs  - for intermediate results where state is not R/W


state + IO is not too difficult. here is a very good explanation
I understood the last one - Use liftIO - best

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners




_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

Magnus Therning
In reply to this post by John Lusk

John Lusk <[hidden email]> writes:

> Thanks, all, that gives me something to chew on.
>
> It occurred to me (during my 45-minute commute to work) that all
> Haskell programs (listen to the noob <eyeroll/>) have the following
> structure (modulo my fractured syntax):
>
> main :: IO()
> main = do
>   inputs <- getInputs
>   doOutput $ f inputs initialState
>
> f :: [input] -> state -> outputs
>
> f [] state =
>   transformToOutputs state
>
> f (input:inputs) state =
>   f inputs (newState state input)
>
> doOutput :: [output] -> IO()
>
> doOutput outputs = do
>   putStr $ unlines outputs
>
> So all I have to do is write newState and I'm good! ^_^
>
> (transformToOutputs will, of course, be a snap.)
>
> Right?
Very many do, yes. One thing though, it is worth thinking about the
order of arguments. I often order it

     f state [] = ...
     f state (x:xs) = ...

because that fits better with `foldl` and `map` :)

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

The early bird may get the worm, but the second mouse gets the cheese.

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

signature.asc (847 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

Magnus Therning
In reply to this post by John Lusk-2

John Lusk <[hidden email]> writes:

> Ha! Fixed! And committed to the GitHub repo mentioned previously, if
> anybody's interested.
>
> I spent too much time on it, but I couldn't let it go and now I have
> to brag.

Excellent!

If you have use for it in the future, but find that it's too slow or
demanding on memory then I *think* it's possible to skip building the
full tree and instead use a stack :)

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

Finagle's Fourth Law:
Once a job is fouled up, anything done to improve it only makes it
worse.

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

signature.asc (847 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Noobie attempt to process log output into dependency graph

John Lusk-2
I'm actually already building a stack and using a set (disguised as a map) to coalesce duplicate edges, but I have a big file to process next week, so I'll let you know.

I had actually thought that I could find an artful way to conceal the stack-as-data-structure as a stack-as-runtime-call-structure, but that was beyond my capabilities, alas. :(

Maybe someday.

On Fri, Dec 16, 2016 at 8:05 PM, Magnus Therning <[hidden email]> wrote:

John Lusk <[hidden email]> writes:

> Ha! Fixed! And committed to the GitHub repo mentioned previously, if
> anybody's interested.
>
> I spent too much time on it, but I couldn't let it go and now I have
> to brag.

Excellent!

If you have use for it in the future, but find that it's too slow or
demanding on memory then I *think* it's possible to skip building the
full tree and instead use a stack :)

/M

--
Magnus Therning              OpenPGP: 0x927912051716CE39
email: [hidden email]   jabber: [hidden email]
twitter: magthe               http://therning.org/magnus

Finagle's Fourth Law:
Once a job is fouled up, anything done to improve it only makes it
worse.

_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners



_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners