Optimization of IORefs and STRefs - comparison to g++

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Optimization of IORefs and STRefs - comparison to g++

Mateusz Kłoczko
Hello!

I've performed a few simple tests using Haskell and C++ on primitives.
I've compilled all Haskell programs with -O2 optimizations, and C++ ones with -O3
ghc version - 7.10.2, gcc version : 5.1.1

I'm sending the codes in the zip file.

Problem1 -  100 000 000  iterations.

Time of execution (in seconds):
Int  pure tail recursion: 6.911011299962411e-2
Int# pure tail recursion : 4.587398100011342e-2
IORef for loop 1.1533970820000832
IORef 'tail' recursion 1.0696569040001123
STRef for loop 1.1545546840006864
STRef tail recursion 1.1110019479992843
C++ : 2.7e-07

The llvm version could be as fast as C++ one in this problem.

Buuut... then there's problem 2 (using if or case) - 100 000 000 iterations:
Int# tail recursion 1.315346227000191
IORef for loop: 2.6442542390004746
STRef for loop: 2.669217500999366
C++: 0.158056


Here haskell is about 9 times slower than C++.

Problem 4 - executing the same functionality using bit operations - 100 000 000 iterations:
Int# tail recursion: 0.12361123500068061
C++: 0.131141

So, Haskell here is as fast as C++.

My question is: can IORefs and STRefs be optimized ? I would like very much to rely on them, but I would like to achieve great speeds.
Also, is can one achieve good speeds of execution when using cases or ifs ? If not, what is the other way to do it ? Compiler flags, Cmm optimizations - they are all welcome.

And the final question - how hard would it be to implement such optimizations in ghc compiler?

Regards,
Mateusz Kłoczko

_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users

EfficientMutability.zip (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Optimization of IORefs and STRefs - comparison to g++

Akio Takano
Hi Mateusz,

IORef and STRef are boxed references. That is, they are a mutable cell
that contains a pointer to some immutable Haskell value. When you
increment a (STRef Int), you first dereference the pointer, allocate a
new immutable heap object to represent the new integer value, then
mutate the reference to point to the new object. This costs much more
than updating a plain mutable integer.

As far as I know there is no particularly popular library that
provides mutable references like this. As a workaround, you can create
a 1-sized unboxed mutable vector using the vector package, and use it
like a reference.

On 3 December 2015 at 01:10, Mateusz Kłoczko
<[hidden email]> wrote:

> Hello!
>
> I've performed a few simple tests using Haskell and C++ on primitives.
> I've compilled all Haskell programs with -O2 optimizations, and C++ ones
> with -O3
> ghc version - 7.10.2, gcc version : 5.1.1
>
> I'm sending the codes in the zip file.
>
> Problem1 -  100 000 000  iterations.
>
> Time of execution (in seconds):
> Int  pure tail recursion: 6.911011299962411e-2
> Int# pure tail recursion : 4.587398100011342e-2
> IORef for loop 1.1533970820000832
> IORef 'tail' recursion 1.0696569040001123
> STRef for loop 1.1545546840006864
> STRef tail recursion 1.1110019479992843
> C++ : 2.7e-07

On this one, g++ manages to eliminate the loop entirely, but GHC doesn't.

>
> The llvm version could be as fast as C++ one in this problem.
>
> Buuut... then there's problem 2 (using if or case) - 100 000 000 iterations:
> Int# tail recursion 1.315346227000191
> IORef for loop: 2.6442542390004746
> STRef for loop: 2.669217500999366
> C++: 0.158056
>
>
> Here haskell is about 9 times slower than C++.

The main difference on this comes from the fact that GHC does not
optimize (n `remInt#` 2#) into (n `andI#` 1#). There is a ticket [0]
that contains some discussion of this issue.

[0]: https://ghc.haskell.org/trac/ghc/ticket/5615

Hope this helps,
Takano Akio
_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
Reply | Threaded
Open this post in threaded view
|

Re: Optimization of IORefs and STRefs - comparison to g++

Michael Snoyman

My mutable-containers package has unboxed and storable references actually.


On Fri, Dec 11, 2015, 12:26 PM Akio Takano <[hidden email]> wrote:
Hi Mateusz,

IORef and STRef are boxed references. That is, they are a mutable cell
that contains a pointer to some immutable Haskell value. When you
increment a (STRef Int), you first dereference the pointer, allocate a
new immutable heap object to represent the new integer value, then
mutate the reference to point to the new object. This costs much more
than updating a plain mutable integer.

As far as I know there is no particularly popular library that
provides mutable references like this. As a workaround, you can create
a 1-sized unboxed mutable vector using the vector package, and use it
like a reference.

On 3 December 2015 at 01:10, Mateusz Kłoczko
<[hidden email]> wrote:
> Hello!
>
> I've performed a few simple tests using Haskell and C++ on primitives.
> I've compilled all Haskell programs with -O2 optimizations, and C++ ones
> with -O3
> ghc version - 7.10.2, gcc version : 5.1.1
>
> I'm sending the codes in the zip file.
>
> Problem1 -  100 000 000  iterations.
>
> Time of execution (in seconds):
> Int  pure tail recursion: 6.911011299962411e-2
> Int# pure tail recursion : 4.587398100011342e-2
> IORef for loop 1.1533970820000832
> IORef 'tail' recursion 1.0696569040001123
> STRef for loop 1.1545546840006864
> STRef tail recursion 1.1110019479992843
> C++ : 2.7e-07

On this one, g++ manages to eliminate the loop entirely, but GHC doesn't.

>
> The llvm version could be as fast as C++ one in this problem.
>
> Buuut... then there's problem 2 (using if or case) - 100 000 000 iterations:
> Int# tail recursion 1.315346227000191
> IORef for loop: 2.6442542390004746
> STRef for loop: 2.669217500999366
> C++: 0.158056
>
>
> Here haskell is about 9 times slower than C++.

The main difference on this comes from the fact that GHC does not
optimize (n `remInt#` 2#) into (n `andI#` 1#). There is a ticket [0]
that contains some discussion of this issue.

[0]: https://ghc.haskell.org/trac/ghc/ticket/5615

Hope this helps,
Takano Akio
_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users

_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users