parsing a CSV file

Roger Mason
Hello,

I'm attempting to write a parser for files that look like this:

Bruker Nano GmbH Berlin, Germany
Esprit 1.9

Date: 02/05/2013 10:06:49 AM
Real time: 15000
Energy Counts
-0.474    0
.....

The line before the ellipsis is repeated many times (together, these
lines represent a spectrum).  I need to be able to extract numbers from lines
containing <string: > and I want to extract the number pairs following
"Energy Counts\n".  The extracted data will then be written to a file in
a different format.  For now I'll be satisfied with reading the "header"
info, i.e. down to "Energy Counts\n".

Thus far, I have:
-- derived from RWH
-- file: ch16/csv2.hs
import Text.ParserCombinators.Parsec

headerLines = endBy csvFile endHeader
csvFile = endBy line eol
line = sepBy cell (char ',')
cell = many (noneOf ",\n")
eol = char '\n'

parseCSV :: String -> Either ParseError [[String]]
parseCSV input = parse csvFile "(unknown)" input

parseHDR :: String -> Either ParseError [[String]]
parseHDR input = parse headerLines "(unknown)" input

endHeader = string "Energy Counts"

This loads into GHCi (7.6.2) OK.  However, when I test it:

parseHDR "Bruker Nano GmbH Berlin, Germany\nEsprit 1.9\n\nDate: 02/05/2013 10:06:49 AM\nReal time: 15000\nEnergy Counts"

Not in scope: `parseHDR'

which makes sense because

ghci> :t endHeader

<interactive>:1:1: Not in scope: `endHeader'

Clearly, my naive implementation of endHeader is no good.

I appreciate any pointers.

Thanks,
Roger


This electronic communication is governed by the terms and conditions at
http://www.mun.ca/cc/policies/electronic_communications_disclaimer_2012.php

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: parsing a CSV file

Roman Cheplyaka-2
* Roger Mason <[hidden email]> [2013-05-21 12:22:53-0230]

> Thus far, I have:
> -- derived from RWH
> -- file: ch16/csv2.hs
> import Text.ParserCombinators.Parsec
>
> headerLines = endBy csvFile endHeader
> csvFile = endBy line eol
> line = sepBy cell (char ',')
> cell = many (noneOf ",\n")
> eol = char '\n'
>
> parseCSV :: String -> Either ParseError [[String]]
> parseCSV input = parse csvFile "(unknown)" input
>
> parseHDR :: String -> Either ParseError [[String]]
> parseHDR input = parse headerLines "(unknown)" input
>
> endHeader = string "Energy Counts"
>
> This loads into GHCi (7.6.2) OK.  However, when I test it:
>
> parseHDR "Bruker Nano GmbH Berlin, Germany\nEsprit 1.9\n\nDate:
> 02/05/2013 10:06:49 AM\nReal time: 15000\nEnergy Counts"
>
> Not in scope: `parseHDR'
>
> which makes sense because
>
> ghci> :t endHeader
>
> <interactive>:1:1: Not in scope: `endHeader'
>
> Clearly, my naive implementation of endHeader is no good.

Hi Roger,

"Not in scope" means that that thing is not defined.

So it's not a problem with your implementation, but with the way you
load it.

If you copy-paste your ghci session here, you may get further help.

Roman

Re: parsing a CSV file

Roger Mason
Hi Roman,

On 05/21/2013 12:36 PM, Roman Cheplyaka wrote:

>> Clearly, my naive implementation of endHeader is no good.
>
> Hi Roger,
>
> "Not in scope" means that that thing is not defined.
>
> So it's not a problem with your implementation, but with the way you
> load it.
>
> If you copy-paste your ghci session here, you may get further help.
>
> Roman

Starting with a clean ghci session I get this:

ghci> :l csv.hs
[1 of 1] Compiling Main             ( csv.hs, interpreted )

csv.hs:15:24:
     Couldn't match type `[Char]' with `Char'
     Expected type: Text.Parsec.Prim.Parsec String () [[String]]
       Actual type: Text.Parsec.Prim.ParsecT
                      String () Data.Functor.Identity.Identity [[[[Char]]]]
     In the first argument of `parse', namely `headerLines'
     In the expression: parse headerLines "(unknown)" input
     In an equation for `parseHDR':
         parseHDR input = parse headerLines "(unknown)" input
Failed, modules loaded: none.

Thanks,
Roger

Re: parsing a CSV file

Roman Cheplyaka-2
* Roger Mason <[hidden email]> [2013-05-21 13:33:47-0230]

> Hi Roman,
>
> On 05/21/2013 12:36 PM, Roman Cheplyaka wrote:
> >> Clearly, my naive implementation of endHeader is no good.
> >
> >Hi Roger,
> >
> >"Not in scope" means that that thing is not defined.
> >
> >So it's not a problem with your implementation, but with the way you
> >load it.
> >
> >If you copy-paste your ghci session here, you may get further help.
> >
> >Roman
> Starting with a clean ghci session I get this:
>
> ghci> :l csv.hs
> [1 of 1] Compiling Main             ( csv.hs, interpreted )
>
> csv.hs:15:24:
>     Couldn't match type `[Char]' with `Char'
>     Expected type: Text.Parsec.Prim.Parsec String () [[String]]
>       Actual type: Text.Parsec.Prim.ParsecT
>                      String () Data.Functor.Identity.Identity [[[[Char]]]]
>     In the first argument of `parse', namely `headerLines'
>     In the expression: parse headerLines "(unknown)" input
>     In an equation for `parseHDR':
>         parseHDR input = parse headerLines "(unknown)" input
> Failed, modules loaded: none.

So this is the real error. If you read it carefully, it says that it
expected [[String]] but got [[[[Char]]]] (i.e. [[[String]]]) as a result
of the headerLines parser.
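
The extra list layer can be traced by annotating each definition from the
original post with the type its combinator produces. The signatures in this
sketch are an illustrative addition (the original code has none), written
with the Parser synonym exported by Text.ParserCombinators.Parsec. The
result type of parseHDR is changed here only so that the module loads.

```haskell
-- Definitions from the original post, with type signatures added.
-- The signatures are an illustrative addition, not part of the posted code.
import Text.ParserCombinators.Parsec

cell :: Parser String             -- one field: characters up to ',' or '\n'
cell = many (noneOf ",\n")

line :: Parser [String]           -- sepBy collects the cells into a list
line = sepBy cell (char ',')

eol :: Parser Char
eol = char '\n'

csvFile :: Parser [[String]]      -- endBy collects the lines into a list
csvFile = endBy line eol

endHeader :: Parser String
endHeader = string "Energy Counts"

-- A second endBy adds a third list layer: [[[String]]], not [[String]].
headerLines :: Parser [[[String]]]
headerLines = endBy csvFile endHeader

parseCSV :: String -> Either ParseError [[String]]
parseCSV input = parse csvFile "(unknown)" input

-- The result type must match headerLines, otherwise GHC rejects the module.
parseHDR :: String -> Either ParseError [[[String]]]
parseHDR input = parse headerLines "(unknown)" input
```

Each sepBy or endBy wraps its result in one more list, which is where the
third pair of brackets comes from. This only makes the types agree; it does
not show that headerLines is the right parser for the file.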

I don't have time right now to look closer at your code, but I suggest
studying the types of combinators you use (such as endBy) and trying to
write down type signatures for the rest of the values you define.
This way you'll find the error and better understand your program.
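
As one possible outcome of that exercise, here is a minimal sketch that reads
the header as a plain list of lines, using manyTill in place of the nested
endBy. It is not a fix proposed by anyone in the thread. The names
headerLine, headerEnd, header and parseHeader are hypothetical, and the
sketch assumes the marker line reads exactly "Energy Counts" with a single
space.

```haskell
-- A sketch only; headerLine, headerEnd, header and parseHeader are
-- hypothetical names, not taken from the original post.
import Text.ParserCombinators.Parsec

eol :: Parser Char
eol = char '\n'

-- One raw header line without its trailing newline (commas are kept as-is).
headerLine :: Parser String
headerLine = many (noneOf "\n") <* eol

-- The marker that closes the header. 'try' lets the parser backtrack when a
-- line merely starts with the same letter, e.g. "Esprit 1.9".
-- Assumption: the marker is spelled exactly "Energy Counts".
headerEnd :: Parser ()
headerEnd = try (string "Energy Counts") *> optional eol

-- Collect every line that comes before the marker.
header :: Parser [String]
header = manyTill headerLine headerEnd

parseHeader :: String -> Either ParseError [String]
parseHeader input = parse header "(unknown)" input
```

On the sample input from the original post, parseHeader should return the
five lines before the marker (including the empty one) and leave the
spectrum data unconsumed for a later parser.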

A useful trick is to start ghci with -fdefer-type-errors and use ":t" to
inspect types of various expressions that you encounter.

Roman

Re: parsing a CSV file

Roger Mason
Thank you.

Roger

On 05/21/2013 03:15 PM, Roman Cheplyaka wrote:
> So this is the real error. If you read it carefully, it says that it
> expected [[String]] but got [[[[Char]]]] (i.e. [[[String]]]) as a
> result of the headerLines parser.
>
> I don't have time right now to look closer at your code, but I suggest
> studying the types of combinators you use (such as endBy) and trying to
> write down type signatures for the rest of the values you define.
> This way you'll find the error and better understand your program.
>
> A useful trick is to start ghci with -fdefer-type-errors and use ":t" to
> inspect types of various expressions that you encounter.
>
> Roman

