Question about negative Integers

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Question about negative Integers

Sylvain Henry-2
Hi GHC devs,

As some of you may know, I am working on fixing several longstanding
issues with GHC's big numbers implementation (Integer, Natural). You can
read more about it here:
https://gitlab.haskell.org/hsyl20/ghc/raw/hsyl20-integer/libraries/ghc-bignum/docs/ghc-bignum.rst

To summarize, we would have a single `ghc-bignum` package with different
backends (GMP, pure Haskell, etc.). The backend is chosen with a Cabal
flag and new backends are way easier to add. All the backends use the
same representation which allows Integer and Natural types and datacons
to be wired-in which has a lot of nice consequences (remove some
dependency hacks in base package, make GHC agnostic of the backend used,
etc.).

A major roadblock in previous attempts was that integer-simple doesn't
use the same representations for numbers as integer-gmp. But I have
written a new pure Haskell implementation which happens to be faster
than integer-simple (see perf results in the document linked above) and
that uses the common representation (similar to what was used in
integer-gmp).

I am very close to submit a merge request but there is a remaining
question about the Bits instance for negative Integer numbers:

We don't store big negative Integer using two's complement encoding,
instead we use signed magnitude representation (i.e. we use constructors
to distinguish between (big) positive or negative numbers). It's already
true today in integer-simple and integer-gmp. However integer-gmp and
integer-simple fake two's complement encoding for Bits operations. As a
consequence, every Bits operation on negative Integers does *a lot* of
stuff. E.g. testing a single bit with `testBit` is linear in the size of
the number, a logical `and` between two numbers involves additions and
subtractions, etc.

Question is: do we need/want to keep this behavior? There is nothing in
the report that says that Integer's Bits instance has to mimic two's
complement encoding. What's the point of slowly accessing a fake
representation instead of the actual one? Could we deprecate this? The
instance isn't even coherent: popCount returns the negated numbers of 1s
in the absolute value as it can't return an infinite value.

Thanks,
Sylvain
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

RE: Question about negative Integers

GHC - devs mailing list
I'm not *at all* close to this, but what you say sounds sensible.

What is the user-facing change you propose? Something about the behaviour of (Bits Integer)?  If so, fly it past the Core Libraries Committee, but in concrete form: I propose to change X to Y.

Simon

|  -----Original Message-----
|  From: ghc-devs <[hidden email]> On Behalf Of Sylvain Henry
|  Sent: 15 November 2019 16:04
|  To: ghc-devs <[hidden email]>
|  Subject: Question about negative Integers
|  
|  Hi GHC devs,
|  
|  As some of you may know, I am working on fixing several longstanding
|  issues with GHC's big numbers implementation (Integer, Natural). You can
|  read more about it here:
|  https://gitlab.haskell.org/hsyl20/ghc/raw/hsyl20-integer/libraries/ghc-
|  bignum/docs/ghc-bignum.rst
|  
|  To summarize, we would have a single `ghc-bignum` package with different
|  backends (GMP, pure Haskell, etc.). The backend is chosen with a Cabal
|  flag and new backends are way easier to add. All the backends use the
|  same representation which allows Integer and Natural types and datacons
|  to be wired-in which has a lot of nice consequences (remove some
|  dependency hacks in base package, make GHC agnostic of the backend used,
|  etc.).
|  
|  A major roadblock in previous attempts was that integer-simple doesn't
|  use the same representations for numbers as integer-gmp. But I have
|  written a new pure Haskell implementation which happens to be faster
|  than integer-simple (see perf results in the document linked above) and
|  that uses the common representation (similar to what was used in
|  integer-gmp).
|  
|  I am very close to submit a merge request but there is a remaining
|  question about the Bits instance for negative Integer numbers:
|  
|  We don't store big negative Integer using two's complement encoding,
|  instead we use signed magnitude representation (i.e. we use constructors
|  to distinguish between (big) positive or negative numbers). It's already
|  true today in integer-simple and integer-gmp. However integer-gmp and
|  integer-simple fake two's complement encoding for Bits operations. As a
|  consequence, every Bits operation on negative Integers does *a lot* of
|  stuff. E.g. testing a single bit with `testBit` is linear in the size of
|  the number, a logical `and` between two numbers involves additions and
|  subtractions, etc.
|  
|  Question is: do we need/want to keep this behavior? There is nothing in
|  the report that says that Integer's Bits instance has to mimic two's
|  complement encoding. What's the point of slowly accessing a fake
|  representation instead of the actual one? Could we deprecate this? The
|  instance isn't even coherent: popCount returns the negated numbers of 1s
|  in the absolute value as it can't return an infinite value.
|  
|  Thanks,
|  Sylvain
|  _______________________________________________
|  ghc-devs mailing list
|  [hidden email]
|  http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Question about negative Integers

Herbert Valerio Riedel-3
In reply to this post by Sylvain Henry-2



On Fri, Nov 15, 2019 at 5:04 PM Sylvain Henry <[hidden email]> wrote:
Question is: do we need/want to keep this behavior?

Yes ;-)

I chose it quite intentionally after benchmarking and carefully examining various approaches, and with the intent to use a common-denominator representation which would be easy to supplement with alternative bignum implementations.

Since you didn't seem to reference it, I wonder if you even saw the page where some of my design rationale was written down as well as hinting at how I intended to make the backend selectable via a link-time flag (similar to how you'd select the RTS via -threaded or -prof, you'd also be able to select the integer backend at link-time w/o the need to recompile anything). See


However, some time ago I did discuss picking up that plan again, but he did point out that it would make a lot more sense to leverage Backpack for this, as it seems to be a much more elegant solution to this problem than the simple but platform-specific link-time approach I was aiming for. Ben put it quite bluntly that if Backpack can't be used for this thing it was basically designed for, we should consider ripping it out again as it would have effectively failed its promise.

And I do agree! Back when I originally redesigned and rewrote integer-gmp from scratch, Backpack wasn't available yet. But now we have it, and a Backpack based solution would IMO indeed be the proper solution at this point to the problem of abstracting over integer-backends as well as representations -- it could even be combined with my original plan for C FFI based platforms (but that's mostly an optimization at that point for the special case where the backends share said common representation at the ABI level).

 


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Question about negative Integers

Edward Kmett-2
In reply to this post by Sylvain Henry-2
> Question is: do we need/want to keep this behavior?

I think we really do want to keep this behavior. 

And not just because I for one have a decent cross-section of code that would just become horribly broken (and would have to find some way to jerry-rig the existing behavior anyways) if we randomly changed it.

The current underlying representation if more directly exposed would be quite surprising to users and doesn't at all fit the mental model of what an Int-like thing is.

Other examples: Conor McBride's work on co-deBruijn syntax exploits the current Bits instance heavily (and can be greatly streamlined by making more use of it that he doesn't, quite yet.)

-Edward

On Fri, Nov 15, 2019 at 9:34 PM Sylvain Henry <[hidden email]> wrote:
Hi GHC devs,

As some of you may know, I am working on fixing several longstanding
issues with GHC's big numbers implementation (Integer, Natural). You can
read more about it here:
https://gitlab.haskell.org/hsyl20/ghc/raw/hsyl20-integer/libraries/ghc-bignum/docs/ghc-bignum.rst

To summarize, we would have a single `ghc-bignum` package with different
backends (GMP, pure Haskell, etc.). The backend is chosen with a Cabal
flag and new backends are way easier to add. All the backends use the
same representation which allows Integer and Natural types and datacons
to be wired-in which has a lot of nice consequences (remove some
dependency hacks in base package, make GHC agnostic of the backend used,
etc.).

A major roadblock in previous attempts was that integer-simple doesn't
use the same representations for numbers as integer-gmp. But I have
written a new pure Haskell implementation which happens to be faster
than integer-simple (see perf results in the document linked above) and
that uses the common representation (similar to what was used in
integer-gmp).

I am very close to submit a merge request but there is a remaining
question about the Bits instance for negative Integer numbers:

We don't store big negative Integer using two's complement encoding,
instead we use signed magnitude representation (i.e. we use constructors
to distinguish between (big) positive or negative numbers). It's already
true today in integer-simple and integer-gmp. However integer-gmp and
integer-simple fake two's complement encoding for Bits operations. As a
consequence, every Bits operation on negative Integers does *a lot* of
stuff. E.g. testing a single bit with `testBit` is linear in the size of
the number, a logical `and` between two numbers involves additions and
subtractions, etc.

Question is: do we need/want to keep this behavior? There is nothing in
the report that says that Integer's Bits instance has to mimic two's
complement encoding. What's the point of slowly accessing a fake
representation instead of the actual one? Could we deprecate this? The
instance isn't even coherent: popCount returns the negated numbers of 1s
in the absolute value as it can't return an infinite value.

Thanks,
Sylvain
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Question about negative Integers

Joachim Breitner-2
In reply to this post by Sylvain Henry-2
Hi,

Am Freitag, den 15.11.2019, 17:04 +0100 schrieb Sylvain Henry:
> However integer-gmp and
> integer-simple fake two's complement encoding for Bits operations.

just a small factoid: the Coq standard library provide the same
semantics. I’d lean towards leaving it as it is. If someone need the
“other” semantics, they can easily throw in a (very efficient) `abs` in
the right places.

Cheers,
Joachim

--
Joachim Breitner
  [hidden email]
  http://www.joachim-breitner.de/

_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Reply | Threaded
Open this post in threaded view
|

Re: Question about negative Integers

Sylvain Henry-2
Alright. Thanks everyone for the convincing answers. I will keep the
current behavior and I will document that operations may be slower than
could be expected.

Cheers,
Sylvain

On 16/11/2019 12:04, Joachim Breitner wrote:

> Hi,
>
> Am Freitag, den 15.11.2019, 17:04 +0100 schrieb Sylvain Henry:
>> However integer-gmp and
>> integer-simple fake two's complement encoding for Bits operations.
> just a small factoid: the Coq standard library provide the same
> semantics. I’d lean towards leaving it as it is. If someone need the
> “other” semantics, they can easily throw in a (very efficient) `abs` in
> the right places.
>
> Cheers,
> Joachim
>
_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs