Read lines of a file into a Vector


Read lines of a file into a Vector

Jake

I want to write a function that will read the contents of a file and return a Vector of its lines:

readLines :: Handle -> IO (Vector String)

I have an implementation that works but seems too naive and inefficient to me, because it reads the entire file contents, then splits them into lines, and then converts the resulting list to a Vector.

readLines h = fromList . lines <$> hGetContents h

I would like to a) use hGetLine and continue until I get an isEOFError and b) read directly into a Vector instead of a list. I started something using Data.Vector.unfoldr, but the presence of the IO monad around my String was causing issues I couldn't figure out how to solve.
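For the unfoldr attempt, one possible way around the IO problem is the monadic variant `Data.Vector.unfoldrM`, whose step function runs in a monad. The sketch below is only an illustration: `readLinesUnfold` is a made-up name, it checks `hIsEOF` before each `hGetLine` instead of catching an EOF error, and it assumes a version of the vector package that exports `unfoldrM`.

```haskell
import qualified Data.Vector as V
import System.IO

-- Hypothetical name; same type as the readLines in the post.
readLinesUnfold :: Handle -> IO (V.Vector String)
readLinesUnfold h = V.unfoldrM step ()
  where
    -- Stop at end of file; otherwise emit one line and continue.
    -- The seed is () because the handle itself carries the read position.
    step () = do
      eof <- hIsEOF h
      if eof
        then pure Nothing
        else do
          line <- hGetLine h
          pure (Just (line, ()))
```

Whether this is actually faster than the `fromList . lines` version would need to be measured; it mainly shows how the types line up.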

I'd also be curious to see how to do either one of these things separately and I assume they'd each help make my program more efficient on their own. Or, maybe my implementation is not as terrible as I thought because some laziness/fusion/optimization magic is happening behind the scenes?
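As a sketch of part a) on its own, the loop could call `hGetLine` repeatedly and stop when `isEOFError` matches the exception. The names `getLinesList` and `readLinesLoop` are hypothetical, and this version still goes through a list before calling `fromList`, so it is not obviously more efficient than the original.

```haskell
import Control.Exception (tryJust)
import Control.Monad (guard)
import qualified Data.Vector as V
import System.IO
import System.IO.Error (isEOFError)

-- Hypothetical helper: collect lines with hGetLine until it raises an EOF error.
getLinesList :: Handle -> IO [String]
getLinesList h = go []
  where
    go acc = do
      -- tryJust only catches exceptions for which isEOFError holds;
      -- any other IOError propagates to the caller.
      r <- tryJust (guard . isEOFError) (hGetLine h)
      case r of
        Left ()    -> pure (reverse acc)  -- end of file reached
        Right line -> go (line : acc)

-- Hypothetical name; same type as the readLines in the post.
readLinesLoop :: Handle -> IO (V.Vector String)
readLinesLoop h = V.fromList <$> getLinesList h
```

Unlike the `hGetContents` version, this reads everything strictly, so the handle can be closed as soon as the function returns.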

Thanks,
Jake Waksbaum


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Re: Read lines of a file into a Vector

Oleg Grenrus
Thanks to lazy IO, deforestation and the fusion framework in `Vector`,

    readLines h = fromList . lines <$> hGetContents h

shouldn’t create intermediate lists and/or read the whole file into memory at once.

- Oleg
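One caveat when relying on lazy IO here: `hGetContents` only reads as the result is demanded, so the handle has to stay open until the vector has been forced. The sketch below assumes the file is opened with `withFile` (which closes the handle on exit) and uses a hypothetical wrapper name, `readLinesFrom`, to show one way of forcing the contents first.

```haskell
import Control.Exception (evaluate)
import qualified Data.Vector as V
import System.IO

-- Hypothetical wrapper: force the lazily read contents while the handle is still open.
readLinesFrom :: FilePath -> IO (V.Vector String)
readLinesFrom path = withFile path ReadMode $ \h -> do
  v <- V.fromList . lines <$> hGetContents h
  -- Forcing each line's length reads the whole file before withFile closes h.
  V.mapM_ (evaluate . length) v
  pure v
```

Without the forcing step, the vector would be demanded only after `withFile` has closed the handle, and the read would be cut short or fail.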


On 11 Aug 2016, at 10:03, Jake <[hidden email]> wrote:

> I want to write a function that will read the contents of a file and return a Vector of its lines: [...]


Re: Read lines of a file into a Vector

Tobias Dammers

If I'm not mistaken, lazy lists and lazy IO alone make this a reasonably efficient implementation; even without further optimizations, the intermediate lists would most likely never reside in memory all at once.
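A small illustration of the laziness being described, under the assumption that the lines are consumed once and not retained: `countLines` (a hypothetical helper, not part of the original question) walks the lazily produced list, and lines that have already been counted should become eligible for garbage collection.

```haskell
import System.IO

-- Hypothetical helper: count lines without keeping them around.
countLines :: Handle -> IO Int
countLines h = do
  contents <- hGetContents h
  let n = length (lines contents)
  -- Force the count before returning so the read finishes while h is open.
  n `seq` pure n
```

This differs from the `Vector String` case in one respect: a vector keeps a reference to every line, so all of the lines stay in memory once it has been built.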


On Aug 11, 2016 9:27 AM, "Oleg Grenrus" <[hidden email]> wrote:

> Thanks to lazy IO, deforestation and the fusion framework in `Vector`, [...]