Addition to unix: raw ByteString APIs

Simon Marlow-7
I propose to commit the attached patch to the unix package and release
it with GHC 7.4.1.  The commit log is reproduced below.  Comments please!

The unix version number will of course be bumped appropriately.

Cheers,
        Simon

commit d5e43be90d3c6f8869dd2b0c65800c9a6dd0ac70
Author: Simon Marlow <[hidden email]>
Date:   Fri Nov 11 16:18:48 2011 +0000

     Provide a raw ByteString version of FilePath and environment APIs

     The new module System.Posix.ByteString provides exactly the same API
     as System.Posix, except that:

       - There is a new type: RawFilePath = ByteString

       - All functions mentioning FilePath in the System.Posix API
         use RawFilePath in the System.Posix.ByteString API

       - RawFilePaths are not subject to Unicode locale encoding and
         decoding, unlike FilePaths.  They are the exact bytes passed to
         and returned from the underlying POSIX API.

       - Similarly for functions that deal in environment
         strings (System.Posix.Env): these use untranslated ByteStrings
         in System.Posix.Environment

       - There is a new function

          System.Posix.ByteString.getArgs :: IO [ByteString]

         returning the raw untranslated arguments as passed to exec()
         when the program was started.
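
A minimal sketch of how the raw API might be used, not taken from the patch. It assumes the module layout of the released unix package, where getArgs lives in System.Posix.Env.ByteString and fileExist in System.Posix.Files.ByteString; `describe` is a hypothetical helper introduced only for illustration.

```haskell
import qualified Data.ByteString.Char8 as BC
import System.Posix.ByteString.FilePath (RawFilePath)
import System.Posix.Env.ByteString (getArgs)
import System.Posix.Files.ByteString (fileExist)

-- Hypothetical helper: label a raw path with whether it exists.
describe :: RawFilePath -> Bool -> BC.ByteString
describe path exists =
  BC.concat [path, BC.pack (if exists then ": exists" else ": missing")]

main :: IO ()
main = do
  -- The arguments arrive as raw bytes; no locale decoding is applied.
  args <- getArgs
  -- Each argument is used directly as a RawFilePath.
  mapM_ (\p -> fileExist p >>= BC.putStrLn . describe p) args
```

Because the path bytes are never decoded, a file name that is not valid in the current locale should still round-trip unchanged from the command line to the POSIX call.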

_______________________________________________
Libraries mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/libraries

Attachment: 0001-Provide-a-raw-ByteString-version-of-FilePath-and-env.patch (219K)

Re: Addition to unix: raw ByteString APIs

Johan Tibell-2
I'm in favor. Two comments:

* System/Posix/ByteString/FilePath.hsc sticks out a bit as it's the only module that doesn't follow the Foo.Bar.ByteString pattern (i.e. ByteString as the leaf module).

* Should we newtype RawFilePath? I cannot come up with a really good argument for it, but every time we don't hide our representations we end up getting screwed (see String, FilePath). We could provide an IsString instance and toPath/fromPath (or similarly named) helpers.
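
A rough sketch of what such a newtype could look like. This is only an illustration of the suggestion, not part of the patch: the type name `RawPath` is hypothetical, and `toPath`/`fromPath` are simply the helper names floated above.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.String (IsString (..))

-- Hypothetical opaque wrapper; the patch itself uses a plain type alias.
newtype RawPath = RawPath ByteString
  deriving (Eq, Ord, Show)

-- Lets string literals be used as paths when OverloadedStrings is on.
-- Note: BC.pack truncates characters above 255, so this is only safe
-- for Latin-1 literals.
instance IsString RawPath where
  fromString = RawPath . BC.pack

-- Wrap raw bytes as a path.
toPath :: ByteString -> RawPath
toPath = RawPath

-- Recover the raw bytes from a path.
fromPath :: RawPath -> ByteString
fromPath (RawPath bytes) = bytes

main :: IO ()
main = BC.putStrLn (fromPath (fromString "/etc/passwd"))
```

The trade-off is that every ByteString operation on a path would then need an explicit unwrap and rewrap.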

-- Johan



Re: Addition to unix: raw ByteString APIs

Evan Laforge
In reply to this post by Simon Marlow-7
>      - There is a new function
>
>         System.Posix.ByteString.getArgs :: IO [ByteString]
>
>        returning the raw untranslated arguments as passed to exec()
>        when the program was started.

Is this one similar to the [String] getArgs in that it drops unix's
argv[0]?  I was recently surprised by that in the standard getArgs
because I wanted a program to restart itself.  I can't figure out how
to do that without access to argv[0].

I suppose for consistency the ByteString version should have the same
behaviour, so maybe this is just an opportunity to wonder why it does
that in the first place.


Re: Addition to unix: raw ByteString APIs

Bas van Dijk-2
On 11 November 2011 18:16, Evan Laforge <[hidden email]> wrote:

>>      - There is a new function
>>
>>         System.Posix.ByteString.getArgs :: IO [ByteString]
>>
>>        returning the raw untranslated arguments as passed to exec()
>>        when the program was started.
>
> Is this one similar to the [String] getArgs in that it drops unix's
> argv[0]?  I was recently surprised by that in the standard getArgs
> because I wanted a program to restart itself.  I can't figure out how
> to do that without access to argv[0].
>
> I suppose for consistency the ByteString version should have the same
> behaviour, so maybe this is just an opportunity to wonder why it does
> that in the first place.

System.Environment exports:

getProgName :: IO String

maybe System.Posix.ByteString should export a similar function:

getProgName :: IO ByteString

Bas


Re: Addition to unix: raw ByteString APIs

Gregory Collins-3
In reply to this post by Simon Marlow-7
On Fri, Nov 11, 2011 at 5:23 PM, Simon Marlow <[hidden email]> wrote:
> I propose to commit the attached patch to the unix package and release it
> with GHC 7.4.1.  The commit log is reproduced below.  Comments please!

Sweet!

G
--
Gregory Collins <[hidden email]>


Re: Addition to unix: raw ByteString APIs

Evan Laforge
In reply to this post by Bas van Dijk-2
> System.Environment exports:
>
> getProgName :: IO String
>
> maybe System.Posix.ByteString should export a similar function:
>
> getProgName :: IO ByteString

Yeah, that's actually not the same as argv[0], it has the path to the
binary stripped.  So you can't really use it to restart yourself
because you have no way to know what directory the binary was in.
It's frustrating because you can see in the source that it's going to
some effort to intentionally strip off information that you can't get
elsewhere.
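
To make the difference concrete, here is a small sketch. `baseName` is a hypothetical model of the stripping described above (keep only what follows the last '/'); it is an assumption for illustration, not the code in base.

```haskell
import qualified Data.ByteString.Char8 as BC
import System.Environment (getProgName)

-- Hypothetical model of the stripping: keep only the part after the
-- last '/', or the whole string if there is no '/'.
baseName :: BC.ByteString -> BC.ByteString
baseName = snd . BC.breakEnd (== '/')

main :: IO ()
main = do
  -- The real function: the program name without its directory.
  name <- getProgName
  putStrLn name
  -- The model applied to a made-up argv[0] value.
  BC.putStrLn (baseName (BC.pack "/usr/local/bin/tool"))
```

Once the directory part is gone there is no way to reconstruct it from the result alone, which is the complaint here.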

Anyway, it probably would make sense to have the ByteString version
since it's hand-in-hand with getArgs and is a FilePath.


Re: Addition to unix: raw ByteString APIs

Herbert Valerio Riedel
In reply to this post by Simon Marlow-7
On Fri, 2011-11-11 at 16:23 +0000, Simon Marlow wrote:
> I propose to commit the attached patch to the unix package and release
> it with GHC 7.4.1.  The commit log is reproduced below.  Comments please!

+1 :-)


Just one minor thing:

>        - There is a new type: RawFilePath = ByteString

Can't that be made a proper type (e.g. via a newtype) instead of being a
mere type-alias?

-- hvr




Re: Addition to unix: raw ByteString APIs

Vincent Hanquez
In reply to this post by Simon Marlow-7
On 11/11/2011 04:23 PM, Simon Marlow wrote:
> I propose to commit the attached patch to the unix package and release it with
> GHC 7.4.1.  The commit log is reproduced below.  Comments please!
>
> The unix version number will of course be bumped appropriately.

That's great!
+1

Out of curiosity, is this a step towards abstracting FilePath away from String,
or is that too complicated compatibility-wise?

--
Vincent



Re: Addition to unix: raw ByteString APIs

Simon Marlow-7
In reply to this post by Herbert Valerio Riedel
On 11/11/2011 20:55, Herbert Valerio Riedel wrote:

> On Fri, 2011-11-11 at 16:23 +0000, Simon Marlow wrote:
>> I propose to commit the attached patch to the unix package and release
>> it with GHC 7.4.1.  The commit log is reproduced below.  Comments please!
>
> +1 :-)
>
>
> Just one minor thing:
>
>>         - There is a new type: RawFilePath = ByteString
>
> Can't that be made a proper type (e.g. via a newtype) instead of being a
> mere type-alias?

I'd rather *not* do that:

  - The unix library doesn't generally make newtypes - take a look at
    all the other types it exports.

  - It's a low-level API, abstraction is not the goal here.

  - We know from the POSIX spec that a path is a sequence of bytes
    and nothing more.  This interface makes that explicit.
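
As a sketch of what the plain alias allows: ordinary ByteString functions apply to paths with no wrapping. The alias is redefined locally so the example stands alone, and `joinPath` is a hypothetical helper, not something exported by the unix package.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- Mirrors the alias from the proposal; redefined here so the sketch
-- does not depend on the unix package.
type RawFilePath = ByteString

-- Hypothetical helper: join a directory and a name, adding a '/' only
-- when the directory does not already end in one.
joinPath :: RawFilePath -> RawFilePath -> RawFilePath
joinPath dir name
  | BC.null dir = name
  | BC.last dir == '/' = BC.append dir name
  | otherwise = BC.concat [dir, BC.pack "/", name]

main :: IO ()
main = BC.putStrLn (joinPath (BC.pack "/usr/bin") (BC.pack "env"))
```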

Cheers,
        Simon


Re: Addition to unix: raw ByteString APIs

Balazs Komuves
In reply to this post by Evan Laforge

On Fri, Nov 11, 2011 at 7:59 PM, Evan Laforge <[hidden email]> wrote:
> > System.Environment exports:
> >
> > getProgName :: IO String
> >
> > maybe System.Posix.ByteString should export a similar function:
> >
> > getProgName :: IO ByteString
>
> Yeah, that's actually not the same as argv[0], it has the path to the
> binary stripped.  So you can't really use it to restart yourself
> because you have no way to know what directory the binary was in.
> It's frustrating because you can see in the source that it's going to
> some effort to intentionally strip off information that you can't get
> elsewhere.


FYI, there are at least two libraries out there trying to solve this problem:

http://hackage.haskell.org/package/executable-path
http://hackage.haskell.org/package/FindBin

Unfortunately, there is no standardized way on different unix systems
to access the path of the executable running (it's not even fully
clear what it means in the presence of symlinks, etc). Actually it seems
to be impossible to do this (without argv[0]) on certain BSD systems.
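
As one example of a platform-specific approach, here is a minimal sketch that assumes Linux, where /proc/self/exe is a symlink to the running executable. It will not work on systems without that procfs entry, and it is not claimed to be how either library above works.

```haskell
import qualified Data.ByteString.Char8 as BC
import System.Posix.ByteString.FilePath (RawFilePath)
import System.Posix.Files.ByteString (readSymbolicLink)

-- Linux-specific assumption: this symlink points at the running binary.
procSelfExe :: RawFilePath
procSelfExe = BC.pack "/proc/self/exe"

main :: IO ()
main = do
  -- Read the link target as raw bytes and print it unchanged.
  exe <- readSymbolicLink procSelfExe
  BC.putStrLn exe
```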

Balazs




Re: Addition to unix: raw ByteString APIs

Brandon Allbery
On Mon, Nov 14, 2011 at 12:05, Balazs Komuves <[hidden email]> wrote:
> Unfortunately, there is no standardized way on different unix systems
> to access the path of the executable running (it's not even fully
> clear what it means in the presence of symlinks, etc). Actually it seems
> to be impossible to do this (without argv[0]) on certain BSD systems.

Also note:

- argv[0] won't be a full pathname if the program was found via $PATH search

- it is possible for users to pass arbitrary argv[0] to the exec() family of system calls

- some programs use special argv[0] values (this probably doesn't practically matter), notably shells look for a leading "-" (which is normally provided by "login" or "sshd" etc.) to indicate a login shell that should source ~/.profile etc.

- there are various other special cases, such as a number of Unix-likes implementing setuid shell scripts securely by passing a /dev/fd/* reference as argv[0] to avoid symlink attacks.  Again, you *probably* don't need to care about this one, but there may be others on various systems.

In short, argv[0] should not be relied on as the executable name.
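
The cases above can be sketched as a small classifier. The `Argv0` type and `classify` function are hypothetical and cover only the shapes mentioned in this message; a /dev/fd/* value, for instance, just shows up as an absolute path.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical classification of an argv[0] value by its shape alone.
data Argv0
  = LeadingDash   -- e.g. "-bash", the login-shell convention
  | AbsolutePath  -- starts with '/'
  | RelativePath  -- contains a '/' but does not start with one
  | BareName      -- no '/', typically found via a $PATH search
  deriving (Eq, Show)

classify :: BC.ByteString -> Argv0
classify arg
  | BC.isPrefixOf (BC.pack "-") arg = LeadingDash
  | BC.isPrefixOf (BC.pack "/") arg = AbsolutePath
  | BC.elem '/' arg = RelativePath
  | otherwise = BareName

main :: IO ()
main = mapM_ (print . classify . BC.pack) ["-bash", "/bin/ls", "./tool", "tool"]
```

Only the absolute case even looks like a usable executable path, and since the caller of exec() chooses the value freely, that one cannot be trusted either.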

(The usual way this is managed is that the real executable is something like foo.real and foo is a shell script which passes in the path to foo.real as a parameter.  During installation/configuration the shell script is modified as necessary to provide the correct path.)

--
brandon s allbery                                      [hidden email]
wandering unix systems administrator (available)     (412) 475-9364 vm/sms



Re: Addition to unix: raw ByteString APIs

Evan Laforge
On Mon, Nov 14, 2011 at 9:45 AM, Brandon Allbery <[hidden email]> wrote:
> On Mon, Nov 14, 2011 at 12:05, Balazs Komuves <[hidden email]> wrote:
>>
>> Unfortunately, there is no standardized way on different unix systems
>> to access the path of the executable running (it's not even fully
>> clear what it means in the presence of symlinks, etc). Actually it seems
>> to be impossible to do this (without argv[0]) on certain BSD systems.
>
> Also note:
> - argv[0] won't be a full pathname if the program was found via $PATH search

Well yes, granted it's not reliable under all possible circumstances
and all possible unixes.  But it works fine for a personal tool run
under controlled circumstances.  I wound up just always calling it as
'$program $(dirname $program)' which is a bit noisy but works fine.
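
A sketch of how the program side of that convention might look, assuming the directory arrives as the first argument. `selfPath` and `restart` are hypothetical names; executeFile is taken from System.Posix.Process.ByteString as in the released unix package.

```haskell
import qualified Data.ByteString.Char8 as BC
import System.Environment (getProgName)
import System.Posix.Env.ByteString (getArgs)
import System.Posix.Process.ByteString (executeFile)

-- Hypothetical helper: rebuild the binary's path from the directory
-- passed as the first argument and the stripped program name.
-- Note: BC.pack truncates characters above 255 in the name.
selfPath :: BC.ByteString -> String -> BC.ByteString
selfPath dir name = BC.concat [dir, BC.pack "/", BC.pack name]

-- Re-exec the current program with the same arguments.
-- Defined for illustration only; main does not call it.
restart :: IO ()
restart = do
  args <- getArgs
  name <- getProgName
  case args of
    -- False: use the path as given, with no $PATH search.
    (dir : _) -> executeFile (selfPath dir name) False args Nothing
    [] -> putStrLn "usage: program <directory-of-program> [args...]"

main :: IO ()
main = BC.putStrLn (selfPath (BC.pack "/opt/tools") "mytool")
```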
