optimisation of code

PICCA Frederic-Emmanuel
Hello,

I would like your advice on how to optimize this code.
The purpose is to trigger an action 'a' once every file in a list (thousands of files) exists.
A separate process copies the files from one directory to another.

allFilesThere :: MonadIO m => [Path Abs File] -> m Bool
allFilesThere fs = liftIO $ allM (doesFileExist . fromAbsFile) fs

trigOnAllFiles :: MonadIO m => m r -> [Path Abs File] -> m r
trigOnAllFiles a fs = go
    where
      go = do
        r <- allFilesThere fs
        if r then a else
            ( do liftIO $ threadDelay 1000000
                 go)

It works, but it consumes a lot of resources while some of the files do not exist yet.
So I would like your advice on how to optimize it :)

Thanks for your help.

Frederic
_______________________________________________
Beginners mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

Re: optimisation of code

Oleg Nykolyn
Hi,

The current code always re-checks file existence in the same order, so the worst case is N files of which only the last one does not exist.
In that case the code re-checks (N-1) files on every retry.
This can be optimized by moving the files that already exist to the end of the list (or dropping them from the list completely, if files are only added but never removed).
For this you could rewrite `allFilesThere` to something like:
allFilesThere fs = liftIO $ do
  -- returns the reordered list together with the "all present" flag
  (existing, non_existing) <- partitionM (doesFileExist . fromAbsFile) fs
  return (non_existing ++ existing, null non_existing)

The next iteration can then start by checking the files that were missing last time, and will probably fail much faster.
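
A minimal sketch of the dropping variant, assuming files are only added and never removed. To stay within base and directory it uses plain FilePath instead of Path Abs File and filterM instead of partitionM; stillMissing is a hypothetical name, not something from the original code.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (filterM)
import Control.Monad.IO.Class (MonadIO, liftIO)
import System.Directory (doesFileExist)

-- Return only the files that do not exist yet.
-- (stillMissing is a hypothetical helper name.)
stillMissing :: MonadIO m => [FilePath] -> m [FilePath]
stillMissing = liftIO . filterM (fmap not . doesFileExist)

-- Run `a` once every file exists. Each round re-checks only the files
-- that were missing in the previous round, so the list keeps shrinking.
-- Assumption: a file never disappears once it has been seen.
trigOnAllFiles :: MonadIO m => m r -> [FilePath] -> m r
trigOnAllFiles a = go
  where
    go fs = do
      missing <- stillMissing fs
      if null missing
        then a
        else liftIO (threadDelay 1000000) >> go missing
```

Each round still visits every file that is currently missing, but a file that has been found is never checked again.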

On Fri, Sep 21, 2018 at 11:25 AM PICCA Frederic-Emmanuel <[hidden email]> wrote:
> [...]

Re: optimisation of code

David McBride
In reply to this post by PICCA Frederic-Emmanuel
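My first instinct is to just use anyM instead of allM:

allFilesThere :: MonadIO m => [Path Abs File] -> m Bool
-- doesFileExist returns IO Bool, so `not` has to be lifted with fmap;
-- the outer `not` turns "some file is missing" back into "all files are there"
allFilesThere fs = liftIO $ not <$> anyM (fmap not . doesFileExist . fromAbsFile) fs

However, you'll now have the opposite problem: it will take a lot of resources when all the files are there. But maybe that is okay for your use case?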
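
To make the cost concrete, here is a small sketch that counts how many elements a short-circuiting anyM inspects. anyM' and countChecks are hypothetical names written for this illustration (library versions of anyM may differ in detail), and a pure predicate stands in for the file check.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hand-written short-circuiting anyM: stops at the first element for
-- which the predicate holds. (Stand-in for a library version.)
anyM' :: Monad m => (a -> m Bool) -> [a] -> m Bool
anyM' _ [] = pure False
anyM' p (x:xs) = do
  hit <- p x
  if hit then pure True else anyM' p xs

-- Run anyM' with a counting wrapper around the predicate and report
-- the result together with the number of elements inspected.
countChecks :: (a -> Bool) -> [a] -> IO (Bool, Int)
countChecks p xs = do
  counter <- newIORef (0 :: Int)
  found <- anyM' (\x -> modifyIORef' counter (+ 1) >> pure (p x)) xs
  n <- readIORef counter
  pure (found, n)
```

With `countChecks (== 3) [1 .. 1000 :: Int]` the predicate first holds on the third element, so only three checks are made; with `countChecks (== 0) [1 .. 1000 :: Int]` it never holds and all 1000 elements are inspected, which corresponds to the "all files are there" case.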

On Fri, Sep 21, 2018 at 4:25 AM PICCA Frederic-Emmanuel <[hidden email]> wrote:
> [...]

Re: optimisation of code

PICCA Frederic-Emmanuel
In reply to this post by Oleg Nykolyn
> Hi,

> The current code always re-checks file existence in the same order, so the worst case is N files of which only the last one does not exist.
> In that case the code re-checks (N-1) files on every retry.
> This can be optimized by moving the files that already exist to the end of the list (or dropping them from the list completely, if files are only added but never removed).
> For this you could rewrite `allFilesThere` to something like:
> allFilesThere fs = liftIO $ do
>   (existing, non_existing) <- partitionM (doesFileExist . fromAbsFile) fs
>   return (non_existing ++ existing, null non_existing)

> The next iteration can then start by checking the files that were missing last time, and will probably fail much faster.

Thanks a lot. Files are never removed, so I can forget the files that have already been checked :)
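
One possible way to "forget" checked files, sketched under the assumption that a file never disappears once it exists: stop at the first missing file and resume from it on the next round. dropWhileM is written locally here as a hypothetical helper, and FilePath stands in for Path Abs File so that only base and directory are needed.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class (MonadIO, liftIO)
import System.Directory (doesFileExist)

-- Hypothetical helper: drop the leading elements for which the monadic
-- predicate holds, stopping at the first element for which it fails.
dropWhileM :: Monad m => (a -> m Bool) -> [a] -> m [a]
dropWhileM _ [] = pure []
dropWhileM p (x:xs) = do
  ok <- p x
  if ok then dropWhileM p xs else pure (x : xs)

-- Run `a` once every file exists. A round stops at the first missing
-- file, and the next round resumes from that file, so a file that has
-- been found is never checked again.
trigOnAllFiles :: MonadIO m => m r -> [FilePath] -> m r
trigOnAllFiles a = go
  where
    go fs = do
      rest <- liftIO (dropWhileM doesFileExist fs)
      if null rest
        then a
        else liftIO (threadDelay 1000000) >> go rest
```

In this sketch each unsuccessful round costs one failed check plus the files newly confirmed before it, instead of a pass over the whole list.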

Re: optimisation of code

PICCA Frederic-Emmanuel
In reply to this post by David McBride
> My first instinct is to just use anyM instead of allM:

> allFilesThere :: MonadIO m => [Path Abs File] -> m Bool
> allFilesThere fs = liftIO $ not <$> anyM (fmap not . doesFileExist . fromAbsFile) fs

> However, you'll now have the opposite problem: it will take a lot of resources when all the files are there. But maybe that is okay for your use case?

I need to reduce the workload when a file is missing.
I like the partition idea a lot.

Cheers

Frederic