Convert String to List/Array of Numbers

Lorenzo Isella
Dear All,
I must be stuck on something pretty basic (I am struggling badly with I/O).
Let us assume you have a rather simple file mydata.dat (3 columns of
integer numbers), see below.



1246191122 1336 1337
1246191142 1336 1337
1246191162 1336 1337
1246191182 1336 1337
1246191202 1336 1337
1246191222 1336 1337
1246191242 1336 1337
1246191262 1336 1337
1246191282 1336 1337
1246191302 1336 1337
1246191322 1336 1337
1246191342 1336 1337
1246191362 1336 1337
1246191382 1336 1337
1246191402 1336 1337
1246191422 1336 1337

Now, my intended pipeline could be

read file as string--> convert to list of integers-->pass it to hmatrix
(or try to convert it into a matrix/array).
Leaving aside the last step, I can easily do something like

let dat=readFile "mydata.dat"


in the interactive shell and get a string, but I am having problems in
converting this to a list or anything more manageable (where every entry
is an integer number i.e. something which can be summed, subtracted
etc...). Ideally even a list where every entry is a row (a list in
itself) would do.
I found online this suggestion
http://bit.ly/9jv1WG
but I am not sure if it really applies to this case.
Many thanks

Lorenzo

Felipe Lessa
On Wed, Sep 8, 2010 at 10:31 AM, Lorenzo Isella
<[hidden email]> wrote:
> in the interactive shell and get a string, but I am having
> problems in converting this to a list or anything more
> manageable (where every entry is an integer number
> i.e. something which can be summed, subtracted etc...). Ideally
> even a list where every entry is a row (a list in itself) would
> do.

Well, first of all you can split your input into lists using

    lines :: String -> [String]

Then, you can split each line into columns by using

    words :: String -> [String]

Now, on each of these columns you can convert to an integer by
using:

    read :: Read a => String -> a

So in the end, you'll end up with something of type [[Int]].
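
A minimal sketch of how those three pieces fit together, one stage at a time. The names `rows`, `cells` and `numbers` are made up for illustration, and `Int` is only one possible choice of element type.

```haskell
-- Stage 1: split the file contents into lines.
rows :: String -> [String]
rows = lines

-- Stage 2: split every line into whitespace-separated columns.
cells :: String -> [[String]]
cells = map words . rows

-- Stage 3: read every column as an Int.
-- Note that 'read' raises a runtime error on malformed input.
numbers :: String -> [[Int]]
numbers = map (map read) . cells
```

In GHCi, `numbers "1 2\n3 4\n"` evaluates to `[[1,2],[3,4]]`.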


Does this help you go in the right direction?

Cheers! =)

PS: Yes, that link's information sort of applies, but you'll be
handling lists of lists (i.e. rows of columns).

--
Felipe.

Daniel Fischer-4
In reply to this post by Lorenzo Isella
On Wednesday 08 September 2010 15:31:19, Lorenzo Isella wrote:

> Leaving aside the last step, I can easily do something like
>
> let dat=readFile "mydata.dat"
>
>
> in the interactive shell and get a string,

Not quite. `dat' is the IO-action that reads the file, of type (IO String)
and not a String.
In a programme, you'd do something like

main = do
    ... -- argument parsing perhaps
    txt <- readFile "mydata.dat"
    let dat = convert txt
    doSomething with dat

> but I am having problems in
> converting this to a list or anything more manageable (where every entry
> is an integer number i.e. something which can be summed, subtracted
> etc...). Ideally even a list where every entry is a row (a list in
> itself) would do.

Depending on what the result type should be, different solutions are
required.
The simplest solutions for such a file format are built from

read  -- to convert e.g. "135" to 135
lines :: String -> [String]
words :: String -> [String]
map :: (a -> b) -> [a] -> [b]

If you want a flat list of Integers from that file,

convert = map read . words

will do. First, `words' splits the String on whitespace (spaces and
newlines), producing a list of digit-strings, those are then read as
Integers.

If you want a list of lists, each line its own list inside the top level
list,

convert = map (map read . words) . lines

is what you want.

If you want to convert each line into a different data structure, say
(Integer, Double, Int64), the general form would still be

convert = map parseLine . lines

and parseLine would depend on the structure you want. For the above,

parseLine str
    = case words str of
        (a : b : c : _) -> (read a, read b, read c)
        _ -> error "Bad line format"

would be a solution.
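
As a sketch, here is the same idea with the types written out. The signatures are an addition for illustration; `Int64` has to be imported from `Data.Int`. With the result type fixed, each `read` knows which instance to use.

```haskell
import Data.Int (Int64)

-- One record per line. The signature fixes the type of each field,
-- so every 'read' is unambiguous.
parseLine :: String -> (Integer, Double, Int64)
parseLine str =
  case words str of
    (a : b : c : _) -> (read a, read b, read c)
    _               -> error "Bad line format"

-- Apply parseLine to every line of the input.
convert :: String -> [(Integer, Double, Int64)]
convert = map parseLine . lines
```

Note that `read "1336" :: Double` is accepted and gives `1336.0`, so an integer-looking column can be read as a `Double`.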

For any but the simplest formats, you should write a real parser to deal
with possible bad formatting though (writing parsers is fun in Haskell).
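
Short of a full parser, a lighter-weight way to survive bad input is `readMaybe` from `Text.Read`, which returns `Nothing` instead of crashing. The sketch below is not a parser-combinator solution; the names `parseRow` and `parseRows` and the error message format are assumptions made for illustration.

```haskell
import Text.Read (readMaybe)

-- Parse one line; Nothing if any column is not an integer.
-- An empty line yields Just [] because 'words' returns no columns.
parseRow :: String -> Maybe [Integer]
parseRow = traverse readMaybe . words

-- Parse the whole input, reporting the 1-based number of the first bad line.
parseRows :: String -> Either String [[Integer]]
parseRows = traverse check . zip [1 :: Int ..] . lines
  where
    check (n, l) = maybe (Left ("bad line " ++ show n)) Right (parseRow l)
```

The caller then pattern-matches on `Left` or `Right` instead of relying on `error`.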

> I found online this suggestion
> http://bit.ly/9jv1WG
> but I am not sure if it really applies to this case.
> Many thanks
>
> Lorenzo


Brent Yorgey-2
In reply to this post by Lorenzo Isella
On Wed, Sep 08, 2010 at 03:31:19PM +0200, Lorenzo Isella wrote:

> Now, my intended pipeline could be
>
> read file as string--> convert to list of integers-->pass it to
> hmatrix (or try to convert it into a matrix/array).
> Leaving aside the last step, I can easily do something like
>
> let dat=readFile "mydata.dat"
>
>
> in the interactive shell and get a string, but I am having problems

Note, this may be a bit misleading!  The interactive shell does some
special handling of things involving I/O.  The type of readFile
"mydata.dat" is

  readFile "mydata.dat" :: IO String

That is, an *I/O operation which, when performed*, will yield a String.
This is not at all the same thing as having a String!  In order to get
your hands on the String, you will want to do something like this:

  do dat <- readFile "mydata.dat"    -- dat :: String
     let mat = parseMat dat
     ... do other stuff with mat ...

  parseMat :: String -> [[Integer]]
  parseMat = ...
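
One possible way to fill this in, as a sketch only: `parseMat` below borrows the `convert` definition given elsewhere in this thread, and `loadMat` is a hypothetical helper name. It shows that the parsed matrix is still only available inside `IO`.

```haskell
-- Pure part: turn the file contents into rows of Integers.
parseMat :: String -> [[Integer]]
parseMat = map (map read . words) . lines

-- IO part: an action that, when performed, reads the file
-- and yields the parsed matrix.
loadMat :: FilePath -> IO [[Integer]]
loadMat path = do
  dat <- readFile path      -- dat :: String
  return (parseMat dat)
```

Inside another `do` block, `mat <- loadMat "mydata.dat"` then binds `mat :: [[Integer]]`.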

You may want to read

  http://www.haskell.org/haskellwiki/Introduction_to_IO

or, really, any good Haskell tutorial (e.g. LYAH [1] or RWH [2]) will
cover this.

-Brent

[1] http://learnyouahaskell.com/
[2] http://book.realworldhaskell.org/

Lorenzo Isella
In reply to this post by Daniel Fischer-4
Hi Daniel,
Thanks for your help.
I have a couple of questions left
(1) The first one is quite down to earth.
The snippet below

---------------------------------------------------
main :: IO ()

main = do
   txt <- readFile "mydata.dat"

   let dat = convert txt

   print dat -- this prints out my chunk of data

   return ()

convert x = lines x

-----------------------------------------------

pretty much does what it is supposed to do, but if I use this definition
of convert x

convert x = map (map read . words) . lines x

I bump into compilation errors. Is that the way I am supposed to deal
with your function?

(2) This is a bit more about I/O in general. I start an action with "do"
to read some files and I define outside the action some functions which
are supposed to operate (within the do action) on the read data.
Is this the way it always has to be? I read something about monads but
did not get very far (and hope that they are not badly needed for simple
I/O). Is there a way in Haskell to have the action return to the outside
world e.g. the value of dat and then work with it elsewhere?
That is what I would do in Python or R, but I think I understood that
Haskell's philosophy is different...
Am I on the right track here? And what is the benefit of this?

Cheers

Lorenzo



Tim Perry-2
You either need to write:

convert x = (map (map read . words) . lines) x

or you need to write:

convert x = map (map read . words) $ lines x


The original function was written as
convert = map (map read . words) . lines

The original is in what is called "point free" form. Values are called "points"
so you have left out the value making the function "point free".  I think this
is one of the most annoying "features" of Haskell because you can't glance at a
function and know how many parameters it takes unless you also know how many
parameters each of the functions it uses need. But that aside, it is very
common. Real World Haskell covers it in Chapter 5.
http://book.realworldhaskell.org/read/writing-a-library-working-with-json-data.html


Good luck,
Tim





Rafael Gustavo da Cunha Pereira Pinto-2
In reply to this post by Lorenzo Isella
You should use:


convert = map (map read . words) . lines

Or:


convert x = (map (map read . words) . lines) x

Or, yet:


convert x = map (map read . words) . lines $ x


The function application occurs before the dot composition!







--
Rafael Gustavo da Cunha Pereira Pinto





Chaddaï Fouché
In reply to this post by Tim Perry-2
On Wed, Sep 8, 2010 at 8:06 PM, Tim Perry <[hidden email]> wrote:
> The original is in what is called "point free" form. Values are called "points"
> so you have left out the value making the function "point free".  I think this
> is one of the most annoying "features" of Haskell because you can't glance at a
> function and know how many parameters it takes unless you also know how many
> parameters each of the functions it uses need. But that aside, it is very
> common. Real World Haskell covers it in Chapter 5.
> http://book.realworldhaskell.org/read/writing-a-library-working-with-json-data.html

Given that the notion of argument number isn't quite right in Haskell
and that you should put a type signature on all exported functions
which provides more exact information on the function behaviour
anyway... I would say that point-free is worth it for the clarity it
affords to the accustomed Haskeller (all but the most twisted
functions written in point-free style will only take "one" argument
anyway).

--
Jedaï

Brent Yorgey-2
In reply to this post by Lorenzo Isella
On Wed, Sep 08, 2010 at 07:24:12PM +0200, Lorenzo Isella wrote:

> Hi Daniel,
> Thanks for your help.
> I have a couple of questions left
> (1) The first one is quite down to earth.
> The snippet below
>
> ---------------------------------------------------
> main :: IO ()
>
> main = do
>   txt <- readFile "mydata.dat"
>
>   let dat = convert txt
>
>   print dat -- this prints out my chunk of data
>
>   return ()
>
> convert x = lines x
>
> -----------------------------------------------

Looks good.  Note that the return () is not necessary since 'print
dat' already results in ().

> pretty much does what it is supposed to do, but if I use this
> definition of convert x
>
> convert x = map (map read . words) . lines x

That ought to be

  convert = map (map read . words) . lines

or alternatively

  convert x = map (map read . words) (lines x)

The dot (.) is function composition, which lets you make "pipelines"
of functions.  So the first one says "convert is the function obtained
by first running 'lines' on the input, and then running 'map (map read
. words)' on the output of 'lines'".  You can also say explicitly what
to do with the input x, as in the second definition.  These two
definitions are exactly equivalent.
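
To make the equivalence concrete, here is a small sketch with the definitions side by side. The names `convertPF`, `convertParens` and `convertDollar` and the `[[Integer]]` result type are chosen for illustration only.

```haskell
-- (f . g) x is defined as f (g x), so all three definitions agree.

-- Point-free: no argument named.
convertPF :: String -> [[Integer]]
convertPF = map (map read . words) . lines

-- Argument named, nested application.
convertParens :: String -> [[Integer]]
convertParens x = map (map read . words) (lines x)

-- Argument named, composition applied with ($).
convertDollar :: String -> [[Integer]]
convertDollar x = map (map read . words) . lines $ x
```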

> (2) This is a bit more about I/O in general. I start an action with
> "do" to read some files and I define outside the action some
> functions which are supposed to operate (within the do action) on the
> read data.
> Is this the way it always has to be? I read something about monads
> but did not get very far (and hope that they are not badly needed for
> simple I/O).

When you do I/O you are using monads whether you know it or not!  But
no, you don't need a deep understanding of monads to do simple I/O.

In any event, this has nothing to do with monads in general, but is
particular to IO.  And yes, this is the way it always has to be with
I/O: there is no way to "escape", that is, there is no function* with
the type

  escapeIO :: IO a -> a

The problem is that because of Haskell's laziness, if there were such
a function you would have no idea when all the effects (like reading a
file, writing to disk, displaying something on the screen) would
happen -- or they might happen twice, or not at all!  Because of
Haskell's purity, the compiler is free to reorder and schedule
computations however it likes, and throwing side effects into the mix
would simply wreak havoc.

> Am I on the right track here? And what is the benefit of this?

The benefit is precise control of side effects, and what is known as
"referential transparency": if you have a function of type

  Int -> Int

then you know for certain that it only computes a numerical function.
Calling it will never result in things getting written to disk or the
screen or anything like that, and calling it with the same input will
always give you the same result.  This is a very strong guarantee that
gives you powerful ways to reason about programs.

-Brent

* Actually, there is, but it is only for use in very special low-level
  sorts of situations by those who really know what they are doing.

Daniel Fischer-4
In reply to this post by Lorenzo Isella
On Wednesday 08 September 2010 19:24:12, Lorenzo Isella wrote:

> Hi Daniel,
> Thanks for your help.
> I have a couple of questions left
> (1) The first one is quite down to earth.
> The snippet below
>
> ---------------------------------------------------
> main :: IO ()
>
> main = do
>    txt <- readFile "mydata.dat"
>
>    let dat = convert txt
>
>    print dat -- this prints out my chunk of data
>
>    return ()

That `return ()' is superfluous, print already has the appropriate type,

print :: Show a => a -> IO ()

return () is only needed to

- fill in do-nothing branches, if condition then doSomething else return ()
or
case expression of
    pat1 -> doSomething
    pat2 -> doSomethingElse
    _ -> return ()

- convert something to the appropriate type, e.g. if
action :: IO ExitCode
and you need an IO () in some place, then you use
action >> return ()
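
Both patterns also have ready-made helpers in `Control.Monad`, namely `when` and `void`. The sketch below uses them; `action`, `ignoreResult` and `logIf` are hypothetical names standing in for real code.

```haskell
import Control.Monad (void, when)
import System.Exit (ExitCode (..))

-- A stand-in for some action whose result we do not care about.
action :: IO ExitCode
action = return ExitSuccess

-- Discard the result to obtain an IO ().
-- 'void action' behaves like 'action >> return ()'.
ignoreResult :: IO ()
ignoreResult = void action

-- 'when c act' behaves like 'if c then act else return ()'.
logIf :: Bool -> String -> IO ()
logIf verbose msg = when verbose (putStrLn msg)
```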

>
> convert x = lines x
>
> -----------------------------------------------
>
> pretty much does what it is supposed to do, but if I use this definition
> of convert x
>
> convert x = map (map read . words) . lines x
>
> I bump into compilation errors. Is that the way I am supposed to deal
> with your function?

Yes and no.
First of all, function application binds tighter than composition, so

convert x = map (map read . words) . lines x

is parsed as

convert x = (map ((map read) . words)) . (lines x)

which gives a type error because (lines x) :: [String], while the
composition expects something of type (a -> b) as second argument.
The correct form of convert could be

convert x = (map (map read . words) . lines) x

or

convert x = map (map read . words) . lines $ x

or, point-free,

convert = map (map read . words) . lines

In the latter case, you have to give it a type signature,

convert :: Read a => String -> [[a]]

or disable the monomorphism restriction
({-# LANGUAGE NoMonomorphismRestriction #-} pragma in the file resp. the
command-line flag -XNoMonomorphismRestriction), otherwise it'll likely give
rise to other type errors.

Once that is fixed, your problems aren't over yet.

Then you get compilation errors because the compiler has no way to infer at
which type to use read, should it try to read Integers, Bools, ... ?

Usually, in real code the type can be inferred from the context, at least
enough for the defaulting rules to apply (if you pass dat to something
expecting [[Bool]], the compiler knows it should use Bool's Read instance,
if it's expecting (Num a => [[a]]), it can be defaulted (and will be
defaulted to Integer unless you have an explicit default declaration
stating otherwise)).

In the code above, all the compiler can find out is that

dat :: (Read a, Show a) => [[a]]

GHC will compile it if you pass -XExtendedDefaultRules on the command line
(or put {-# LANGUAGE ExtendedDefaultRules #-} at the top of the module),
then the type variable a will be defaulted to () [which is rather useless].

More realistically, you need to tell the compiler the type of dat,

    let dat :: [[Integer]]  -- or ((Num a, Read a) => [[a]])
        dat = convert txt
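
A sketch of how the annotation picks the `Read` instance: the same polymorphic `convert` is used at two different element types. The wrapper names `asIntegers` and `asDoubles` are made up for illustration.

```haskell
-- Polymorphic in the element type; the caller decides what to read.
convert :: Read a => String -> [[a]]
convert = map (map read . words) . lines

-- The signature selects Integer's Read instance.
asIntegers :: String -> [[Integer]]
asIntegers = convert

-- The signature selects Double's Read instance.
asDoubles :: String -> [[Double]]
asDoubles = convert
```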

>
> (2) This is a bit more about I/O in general. I start an action with "do"
> to read some files and I define outside the action some functions which
> are supposed to operate (within the do action) on the read data.

Yes, you define the functions that do the actual work as pure functions
(mostly) and then bind them together in a - preferably small - main
function doing the necessary I/O (reading data or configuration files,
outputting results).

> Is this the way it always has to be? I read something about monads but
> did not get very far (and hope that they are not badly needed for simple
> I/O).

To do basic I/O, you don't need to know anything about monads, all you need
is a little knowledge of the do-notation.

> Is there a way in Haskell to have the action return to the outside
> world e.g. the value of dat and then work with it elsewhere?

For the few cases where it's necessary, there is such a beast.
Its name begins with the word `unsafe', for good reasons (the full name is
unsafePerformIO, available from System.IO.Unsafe).
When you're tempted to use it, ask yourself "Is this really a good idea?"
(like if you're tempted to use goto in C, only more so - sometimes it is,
but rarely).

> That is what I would do in Python or R, but I think I understood that
> Haskell's philosophy is different...

Well, you pass it as a parameter to other functions and IO-actions.

> Am I on the right track here? And what is the benefit of this?

Purity allows some optimisations that can't be done for functions which
might have side-effects.
And it's much easier to reason about pure (side-effect-free) functions.

>
> Cheers
>
> Lorenzo

Daniel Fischer-4
In reply to this post by Chaddaï Fouché
On Wednesday 08 September 2010 20:13:17, Chaddaï Fouché wrote:
> Given that the notion of argument number isn't quite right in Haskell

Since, strictly speaking, a function always takes exactly one argument.
Haskell is like mathematics in that respect.

But since saying "a function which takes an argument of type a, returning a
function which takes an argument of type b, returning a function which
takes an argument of type c, returning ..." is much more cumbersome than
saying "a function taking five arguments of types a, b, c, d, e
respectively and returning a value of type f", we are using the more
convenient, albeit inexact, language habitually.
Haskell is like mathematics in that respect too.

Be aware however, that the same function may be referred to as a function
taking three, four or perhaps six arguments in other circumstances.
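
A small sketch of that point, with made-up names (`volume`, `slab`, `column`): the same function can be described as taking three, two or one argument depending on how much of it has been applied.

```haskell
-- Written as "three arguments", but really a function returning a
-- function returning a function: Int -> (Int -> (Int -> Int)).
volume :: Int -> Int -> Int -> Int
volume w h d = w * h * d

-- Supplying one argument yields a "two-argument" function ...
slab :: Int -> Int -> Int
slab = volume 2

-- ... and supplying another yields a "one-argument" function.
column :: Int -> Int
column = slab 3
```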

> and that you should put a type signature on all exported functions

And also on nontrivial internal functions.

> which provides more exact information on the function behaviour
> anyway... I would say that point-free is worth it for the clarity it
> affords to the accustomed Haskeller

It takes a bit to get used to (having a mathematical background helps).
And point-freeing is not always a win in readability.
Judge on a case-by-case basis.

> (all but the most twisted
> functions written in point-free style will only take "one" argument
> anyway).

Possibly two.

foo = (sum .) . enumFromThenTo 0

hasn't yet clearly crossed the border.
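
For readers still getting used to the style, here is a sketch of that definition next to a version with both arguments written out. The `Integer` signatures and the name `foo'` are additions for illustration; the signature also sidesteps the monomorphism restriction.

```haskell
-- Point-free, as above; the signature fixes the numeric type.
foo :: Integer -> Integer -> Integer
foo = (sum .) . enumFromThenTo 0

-- The same function with both arguments named:
-- sum the arithmetic sequence starting 0, next, ... up to limit.
foo' :: Integer -> Integer -> Integer
foo' next limit = sum (enumFromThenTo 0 next limit)
```

For example, `foo 2 10` sums `[0,2,4,6,8,10]` and gives `30`.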