hGetContents Illegal byte sequence / ghc-pkg

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

hGetContents Illegal byte sequence / ghc-pkg

Benjamin Edwards
Hello café,

I have a program that is crashing, and I have no idea why:

module Main
  where

import System.Process (readProcessWithExitCode)


main :: IO ()
main = do _ <- readProcessWithExitCode "ghc-pkg" ["describe", "hoopl"] ""
          putStrLn "Should never get here"

this is using the process package from hackage. The program crashes with 

minimal-test: fd:5: hGetContents: invalid argument (invalid byte sequence)
minimal-test: thread blocked indefinitely in an MVar operation

inspecting the source of readProcessWithExitCode yields an obvious explanation to the MVar problem, but I don't understand why hGetContents is so offended.

For the lazy it is defined as follows:

readProcessWithExitCode
    :: FilePath                 -- ^ command to run
    -> [String]                 -- ^ any arguments
    -> String                   -- ^ standard input
    -> IO (ExitCode,String,String) -- ^ exitcode, stdout, stderr
readProcessWithExitCode cmd args input = do
    (Just inh, Just outh, Just errh, pid) <-
        createProcess (proc cmd args){ std_in  = CreatePipe,
                                       std_out = CreatePipe,
                                       std_err = CreatePipe }

    outMVar <- newEmptyMVar

    -- fork off a thread to start consuming stdout
    out  <- hGetContents outh
    _ <- forkIO $ C.evaluate (length out) >> putMVar outMVar ()

    -- fork off a thread to start consuming stderr
    err  <- hGetContents errh
    _ <- forkIO $ C.evaluate (length err) >> putMVar outMVar ()

    -- now write and flush any input
    when (not (null input)) $ do hPutStr inh input; hFlush inh
    hClose inh -- done with stdin

    -- wait on the output
    takeMVar outMVar
    takeMVar outMVar
    hClose outh
    hClose errh

    -- wait on the process
    ex <- waitForProcess pid

    return (ex, out, err)

Now having looked at the source of ghc-pkg it is dumping it's output using putStr and friends, so that should be using my local encoding on the system, right? and so should hGetContents in my program..?

Now, for the curious: the reason I care is that this problem has effectively prevented me from using virthualenv. Sadness and woe.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: hGetContents Illegal byte sequence / ghc-pkg

Brent Yorgey-2
On Sat, Aug 11, 2012 at 12:13:45PM +0100, Benjamin Edwards wrote:

> Hello café,
>
> I have a program that is crashing, and I have no idea why:
>
> module Main
>   where
>
> import System.Process (readProcessWithExitCode)
>
>
> main :: IO ()
> main = do _ <- readProcessWithExitCode "ghc-pkg" ["describe", "hoopl"] ""
>           putStrLn "Should never get here"
>
> this is using the process package from hackage. The program crashes with
>
> minimal-test: fd:5: hGetContents: invalid argument (invalid byte sequence)
> minimal-test: thread blocked indefinitely in an MVar operation
>
> inspecting the source of readProcessWithExitCode yields an obvious
> explanation to the MVar problem, but I don't understand why hGetContents is
> so offended.

The 'invalid argument' error from hGetContents indicates that a wrong
encoding is being assumed.  I don't know enough about how
putStr/hGetContents decide on an encoding, but in any case it works
for me (that is, it prints "Should never get here").  The likely
sticking point is that one of the authors of hoopl, João Dias, has a
name which contains U+00E3: LATIN SMALL LETTER A WITH TILDE.

Try doing  

  ghc-pkg describe hoopl > hoopl.txt
  file hoopl.txt

to get an indication of what encoding is being used, or manually take
a look at the bytes being generated using

  ghc-pkg describe hoopl | hexdump -C

I don't know what the solution is but at least this should give some
additional information.

-Brent

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: hGetContents Illegal byte sequence / ghc-pkg

Brandon Allbery
In reply to this post by Benjamin Edwards
On Sat, Aug 11, 2012 at 7:13 AM, Benjamin Edwards <[hidden email]> wrote:
inspecting the source of readProcessWithExitCode yields an obvious explanation to the MVar problem, but I don't understand why hGetContents is so offended.

I think last time I looked into this there was no normalization of package.conf data; if it was read in in a particular encoding, it was stored in that encoding and you can get an exception trying to dump it in a different encoding.  Certainly I have found ISO8859-1 encoded text in my package.conf.d/* files, when I would expect (and was configured for) UTF-8.  This also suggests that ISO8859-1 text in a *.cabal file would remain ISO8859-1 in the resulting package.conf file even if UTF-8 encoding was active at the time.

--
brandon s allbery                                      [hidden email]
wandering unix systems administrator (available)     (412) 475-9364 vm/sms


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: hGetContents Illegal byte sequence / ghc-pkg

David McBride
In reply to this post by Benjamin Edwards
I had this same problem a couple weeks ago when trying to install virthualenv and I don't really understand it got into a bad state, but the way I solved it was by fixing the locale settings on my gentoo machine so that I'm using UTF8.  That just involved a few changes in /etc and then the problem went away.

On Sat, Aug 11, 2012 at 7:13 AM, Benjamin Edwards <[hidden email]> wrote:
Hello café,

I have a program that is crashing, and I have no idea why:

module Main
  where

import System.Process (readProcessWithExitCode)


main :: IO ()
main = do _ <- readProcessWithExitCode "ghc-pkg" ["describe", "hoopl"] ""
          putStrLn "Should never get here"

this is using the process package from hackage. The program crashes with 

minimal-test: fd:5: hGetContents: invalid argument (invalid byte sequence)
minimal-test: thread blocked indefinitely in an MVar operation

inspecting the source of readProcessWithExitCode yields an obvious explanation to the MVar problem, but I don't understand why hGetContents is so offended.

For the lazy it is defined as follows:

readProcessWithExitCode
    :: FilePath                 -- ^ command to run
    -> [String]                 -- ^ any arguments
    -> String                   -- ^ standard input
    -> IO (ExitCode,String,String) -- ^ exitcode, stdout, stderr
readProcessWithExitCode cmd args input = do
    (Just inh, Just outh, Just errh, pid) <-
        createProcess (proc cmd args){ std_in  = CreatePipe,
                                       std_out = CreatePipe,
                                       std_err = CreatePipe }

    outMVar <- newEmptyMVar

    -- fork off a thread to start consuming stdout
    out  <- hGetContents outh
    _ <- forkIO $ C.evaluate (length out) >> putMVar outMVar ()

    -- fork off a thread to start consuming stderr
    err  <- hGetContents errh
    _ <- forkIO $ C.evaluate (length err) >> putMVar outMVar ()

    -- now write and flush any input
    when (not (null input)) $ do hPutStr inh input; hFlush inh
    hClose inh -- done with stdin

    -- wait on the output
    takeMVar outMVar
    takeMVar outMVar
    hClose outh
    hClose errh

    -- wait on the process
    ex <- waitForProcess pid

    return (ex, out, err)

Now having looked at the source of ghc-pkg it is dumping it's output using putStr and friends, so that should be using my local encoding on the system, right? and so should hGetContents in my program..?

Now, for the curious: the reason I care is that this problem has effectively prevented me from using virthualenv. Sadness and woe.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Fwd: hGetContents Illegal byte sequence / ghc-pkg

Benjamin Edwards
Responding to the list..

---------- Forwarded message ----------
From: Benjamin Edwards <[hidden email]>
Date: 11 August 2012 17:37
Subject: Re: [Haskell-cafe] hGetContents Illegal byte sequence / ghc-pkg
To: David McBride <[hidden email]>


Thank you Gents,

Most useful. One thing that had escaped me is that of course, even if hGetContents and friends are respecting *my* locale, I have a binary distro and the package database probably isn't in my locale. I will try and find that.

Thanks,
 Ben


On 11 August 2012 15:26, David McBride <[hidden email]> wrote:
I had this same problem a couple weeks ago when trying to install virthualenv and I don't really understand it got into a bad state, but the way I solved it was by fixing the locale settings on my gentoo machine so that I'm using UTF8.  That just involved a few changes in /etc and then the problem went away.

On Sat, Aug 11, 2012 at 7:13 AM, Benjamin Edwards <[hidden email]> wrote:
Hello café,

I have a program that is crashing, and I have no idea why:

module Main
  where

import System.Process (readProcessWithExitCode)


main :: IO ()
main = do _ <- readProcessWithExitCode "ghc-pkg" ["describe", "hoopl"] ""
          putStrLn "Should never get here"

this is using the process package from hackage. The program crashes with 

minimal-test: fd:5: hGetContents: invalid argument (invalid byte sequence)
minimal-test: thread blocked indefinitely in an MVar operation

inspecting the source of readProcessWithExitCode yields an obvious explanation to the MVar problem, but I don't understand why hGetContents is so offended.

For the lazy it is defined as follows:

readProcessWithExitCode
    :: FilePath                 -- ^ command to run
    -> [String]                 -- ^ any arguments
    -> String                   -- ^ standard input
    -> IO (ExitCode,String,String) -- ^ exitcode, stdout, stderr
readProcessWithExitCode cmd args input = do
    (Just inh, Just outh, Just errh, pid) <-
        createProcess (proc cmd args){ std_in  = CreatePipe,
                                       std_out = CreatePipe,
                                       std_err = CreatePipe }

    outMVar <- newEmptyMVar

    -- fork off a thread to start consuming stdout
    out  <- hGetContents outh
    _ <- forkIO $ C.evaluate (length out) >> putMVar outMVar ()

    -- fork off a thread to start consuming stderr
    err  <- hGetContents errh
    _ <- forkIO $ C.evaluate (length err) >> putMVar outMVar ()

    -- now write and flush any input
    when (not (null input)) $ do hPutStr inh input; hFlush inh
    hClose inh -- done with stdin

    -- wait on the output
    takeMVar outMVar
    takeMVar outMVar
    hClose outh
    hClose errh

    -- wait on the process
    ex <- waitForProcess pid

    return (ex, out, err)

Now having looked at the source of ghc-pkg it is dumping it's output using putStr and friends, so that should be using my local encoding on the system, right? and so should hGetContents in my program..?

Now, for the curious: the reason I care is that this problem has effectively prevented me from using virthualenv. Sadness and woe.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe





_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: hGetContents Illegal byte sequence / ghc-pkg

David Fox-7
In reply to this post by Benjamin Edwards
On Sat, Aug 11, 2012 at 4:13 AM, Benjamin Edwards <[hidden email]> wrote:
Hello café,

I have a program that is crashing, and I have no idea why:

module Main
  where

import System.Process (readProcessWithExitCode)


main :: IO ()
main = do _ <- readProcessWithExitCode "ghc-pkg" ["describe", "hoopl"] ""
          putStrLn "Should never get here"

this is using the process package from hackage. The program crashes with 

minimal-test: fd:5: hGetContents: invalid argument (invalid byte sequence)
minimal-test: thread blocked indefinitely in an MVar operation

inspecting the source of readProcessWithExitCode yields an obvious explanation to the MVar problem, but I don't understand why hGetContents is so offended.

For the lazy it is defined as follows:

readProcessWithExitCode
    :: FilePath                 -- ^ command to run
    -> [String]                 -- ^ any arguments
    -> String                   -- ^ standard input
    -> IO (ExitCode,String,String) -- ^ exitcode, stdout, stderr
readProcessWithExitCode cmd args input = do
    (Just inh, Just outh, Just errh, pid) <-
        createProcess (proc cmd args){ std_in  = CreatePipe,
                                       std_out = CreatePipe,
                                       std_err = CreatePipe }

    outMVar <- newEmptyMVar

    -- fork off a thread to start consuming stdout
    out  <- hGetContents outh
    _ <- forkIO $ C.evaluate (length out) >> putMVar outMVar ()

    -- fork off a thread to start consuming stderr
    err  <- hGetContents errh
    _ <- forkIO $ C.evaluate (length err) >> putMVar outMVar ()

    -- now write and flush any input
    when (not (null input)) $ do hPutStr inh input; hFlush inh
    hClose inh -- done with stdin

    -- wait on the output
    takeMVar outMVar
    takeMVar outMVar
    hClose outh
    hClose errh

    -- wait on the process
    ex <- waitForProcess pid

    return (ex, out, err)

Now having looked at the source of ghc-pkg it is dumping it's output using putStr and friends, so that should be using my local encoding on the system, right? and so should hGetContents in my program..?

Now, for the curious: the reason I care is that this problem has effectively prevented me from using virthualenv. Sadness and woe.

I would recommend using ByteStrings.  There is a link to a version of readProcessWithExitCode that uses ByteString instead of String here: http://www.haskell.org/pipermail/libraries/2012-August/018263.html


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe