Advice needed on best way to simulate an STL vector

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Advice needed on best way to simulate an STL vector

Brian Hulley
Hi -
In C++, STL provides a vector class which behaves as an array except you can
insert/delete elements from it. I'm wondering what is the best Haskell data
structure to use to simulate this, either mutable or immutable.

I've looked at the Array interface, but although there is the // operation
to replace some elements, there does not appear to be a simple way to
delete/insert elements.

Ideally I'd like functions like:

-- Insert new elements starting at the specified index, moving the others up
to make room
insert:: Array i e -> i -> [e] -> Array i e

-- Delete a range of elements, moving later elements back down
delete:: Array i e -> i -> i -> Array e

-- Append a new element to the end of an array
push_back :: Array i e -> e -> Array i e

Is there an efficient way of implementing these operations in Haskell, based
on arrays, or should I be using some other data structure altogether eg a
list?

Also, for large arrays, am I right in thinking that it is still more
efficient to use immutable arrays rather than mutable arrays (because it is
easier for gc to always just deal with immutable values)?

Thanks, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Sebastian Sylvan
On 4/19/06, Brian Hulley <[hidden email]> wrote:

> Hi -
> In C++, STL provides a vector class which behaves as an array except you can
> insert/delete elements from it. I'm wondering what is the best Haskell data
> structure to use to simulate this, either mutable or immutable.
>
> I've looked at the Array interface, but although there is the // operation
> to replace some elements, there does not appear to be a simple way to
> delete/insert elements.
>
> Ideally I'd like functions like:
>
> -- Insert new elements starting at the specified index, moving the others up
> to make room
> insert:: Array i e -> i -> [e] -> Array i e
>
> -- Delete a range of elements, moving later elements back down
> delete:: Array i e -> i -> i -> Array e
>
> -- Append a new element to the end of an array
> push_back :: Array i e -> e -> Array i e
>
> Is there an efficient way of implementing these operations in Haskell, based
> on arrays, or should I be using some other data structure altogether eg a
> list?
>

Well not efficient in the C++ sense. You'd still have to create a
whole new array for all of those operations. You can use the (//)
operator to update all the elements you need to update...

insert i xs arr = arr // shift_list
where n = length xs
         shift_list = zip [i..] xs ++ [ (j,arr!(j - n)) | j <- [i+n ..
snd (bounds arr)]]

> Also, for large arrays, am I right in thinking that it is still more
> efficient to use immutable arrays rather than mutable arrays (because it is
> easier for gc to always just deal with immutable values)?

Well that depends on what you're doing. Mutable arrays are more
efficient for most operation (replace is O(1) instead of O(n), for
example), but it's possible that for small arrays immutable has a
smaller constant factor (I'm certainly not sure that it does!)

/S

--
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Cale Gibbard
In reply to this post by Brian Hulley
Well, you could use something like C++ does for implementing STL
vectors in terms of arrays -- iirc, it generally doubles the allocated
size each time applying an operation to the vector would make it
exceed its capacity (the current allocation size), and keeps track of
the actual number of elements stored separately. It's important to do
allocation which is proportional to the existing capacity, or repeated
insertion will become quadratic time rather than linear. So
essentially, some data structure like

data Vector a = Vector { store :: Array Int a, end :: Int }

would be a sufficient minimal way to start. When the store needs to be
increased, you simply create a new array with twice as many elements,
copy the initial elements in from the old array which is then thrown
away, and only update end to the position of the actual last element.
This is analogous to what C++ implementations of the vector class do.

What will bite you is if you try to generalise from indices of type
Int to instances of Ix -- the Ix operations assume that there are
lower and upper bounds on indices. The upper bound of course quickly
becomes a problem. You could however use Enum, which has toEnum and
fromEnum, which are sufficient to use with the implementation in terms
of Ints. It could also be claimed that Int isn't always the best
initial type to use, and indeed I'd still feel safer with Integer
somehow, but then, fromEnum and toEnum use Int, and if you have more
than 2 billion elements in your vector at the same time, maybe you
should be looking at your algorithm more carefully and/or doing your
own low level memory allocation via FFI. :)

 - Cale

On 19/04/06, Brian Hulley <[hidden email]> wrote:

> Hi -
> In C++, STL provides a vector class which behaves as an array except you can
> insert/delete elements from it. I'm wondering what is the best Haskell data
> structure to use to simulate this, either mutable or immutable.
>
> I've looked at the Array interface, but although there is the // operation
> to replace some elements, there does not appear to be a simple way to
> delete/insert elements.
>
> Ideally I'd like functions like:
>
> -- Insert new elements starting at the specified index, moving the others up
> to make room
> insert:: Array i e -> i -> [e] -> Array i e
>
> -- Delete a range of elements, moving later elements back down
> delete:: Array i e -> i -> i -> Array e
>
> -- Append a new element to the end of an array
> push_back :: Array i e -> e -> Array i e
>
> Is there an efficient way of implementing these operations in Haskell, based
> on arrays, or should I be using some other data structure altogether eg a
> list?
>
> Also, for large arrays, am I right in thinking that it is still more
> efficient to use immutable arrays rather than mutable arrays (because it is
> easier for gc to always just deal with immutable values)?
>
> Thanks, Brian.
>
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Udo Stenzel
In reply to this post by Brian Hulley
Brian Hulley wrote:
> In C++, STL provides a vector class which behaves as an array except you
> can insert/delete elements from it.

Though you shouldn't.  If you constantly insert and delete in the middle
of a std::vector, you're not using the right data structure.  In fact,
std::vector is almost always wrong and std::deque would probably serve
you better.


> I'm wondering what is the best Haskell
> data structure to use to simulate this, either mutable or immutable.

The obvious mutable data structure is an (STRef (STArray i a)).  You can
implement std::vector in terms of that, almost literally translating
from C++.  If you want Haskell code that looks as ugly as C++, you
should do exactly that.

Immutable array-like thing with insertion and deletion are an
ill-conceived idea, imho.  Every write operation would require a
complete copy and often a reallocation, too.

Instead, use some functional sequence implementation, like Finger Trees.
Operations in the middle of the sequence incur a logarithmic cost, but
thats better than constantly copying the whole thing around.  Being
immutable it also results in more idiomatic code where you don't need to
drag the ST monad around everywhere.  You might also consider a
Finger Tree of smallish Arrays, that's about the closest equivalent to
std::deque you can get.


Udo.
--
"In the software business there are many enterprises for which it is not
clear that science can help them; that science should try is not clear
either."
        -- E. W. Dijkstra

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (196 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Cale Gibbard
In reply to this post by Cale Gibbard
Oops, replace Array there with DiffArray.

On 19/04/06, Cale Gibbard <[hidden email]> wrote:

> Well, you could use something like C++ does for implementing STL
> vectors in terms of arrays -- iirc, it generally doubles the allocated
> size each time applying an operation to the vector would make it
> exceed its capacity (the current allocation size), and keeps track of
> the actual number of elements stored separately. It's important to do
> allocation which is proportional to the existing capacity, or repeated
> insertion will become quadratic time rather than linear. So
> essentially, some data structure like
>
> data Vector a = Vector { store :: Array Int a, end :: Int }
>
> would be a sufficient minimal way to start. When the store needs to be
> increased, you simply create a new array with twice as many elements,
> copy the initial elements in from the old array which is then thrown
> away, and only update end to the position of the actual last element.
> This is analogous to what C++ implementations of the vector class do.
>
> What will bite you is if you try to generalise from indices of type
> Int to instances of Ix -- the Ix operations assume that there are
> lower and upper bounds on indices. The upper bound of course quickly
> becomes a problem. You could however use Enum, which has toEnum and
> fromEnum, which are sufficient to use with the implementation in terms
> of Ints. It could also be claimed that Int isn't always the best
> initial type to use, and indeed I'd still feel safer with Integer
> somehow, but then, fromEnum and toEnum use Int, and if you have more
> than 2 billion elements in your vector at the same time, maybe you
> should be looking at your algorithm more carefully and/or doing your
> own low level memory allocation via FFI. :)
>
>  - Cale
>
> On 19/04/06, Brian Hulley <[hidden email]> wrote:
> > Hi -
> > In C++, STL provides a vector class which behaves as an array except you can
> > insert/delete elements from it. I'm wondering what is the best Haskell data
> > structure to use to simulate this, either mutable or immutable.
> >
> > I've looked at the Array interface, but although there is the // operation
> > to replace some elements, there does not appear to be a simple way to
> > delete/insert elements.
> >
> > Ideally I'd like functions like:
> >
> > -- Insert new elements starting at the specified index, moving the others up
> > to make room
> > insert:: Array i e -> i -> [e] -> Array i e
> >
> > -- Delete a range of elements, moving later elements back down
> > delete:: Array i e -> i -> i -> Array e
> >
> > -- Append a new element to the end of an array
> > push_back :: Array i e -> e -> Array i e
> >
> > Is there an efficient way of implementing these operations in Haskell, based
> > on arrays, or should I be using some other data structure altogether eg a
> > list?
> >
> > Also, for large arrays, am I right in thinking that it is still more
> > efficient to use immutable arrays rather than mutable arrays (because it is
> > easier for gc to always just deal with immutable values)?
> >
> > Thanks, Brian.
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > [hidden email]
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
In reply to this post by Udo Stenzel
On Wednesday 19th April 2006 18:09PM Udo Stenzel wrote:

> Brian Hulley wrote:
>> In C++, STL provides a vector class which behaves as an array except you
>> can insert/delete elements from it.

> Though you shouldn't.  If you constantly insert and delete in the middle
> of a std::vector, you're not using the right data structure.  In fact,
> std::vector is almost always wrong and std::deque would probably serve
> you better.

std::deque only gives fast insert/delete at the ends so for insert/delete in
the middle it is still slow, and any speedup relative to std::vector might
be offset by extra slowness in subscripting if multiple physical blocks of
memory are used to simulate a contiguous array. I could have used a
std::list (which is doubly linked) but then I'd lose the constant time
random element access, so in my particular case (which was a text buffer for
an edit control implemented as a std::vector of lines where each line
contains some book-keeping info plus a std::vector of character info) the
std::vector seemed to work out to be the best one to use, since there are
more read operations (rendering, parsing etc) than write operations (user
typing a character).

>> I'm wondering what is the best Haskell
>> data structure to use to simulate this, either mutable or immutable.

> The obvious mutable data structure is an (STRef (STArray i a)).  You can
> implement std::vector in terms of that, almost literally translating
> from C++.  If you want Haskell code that looks as ugly as C++, you
> should do exactly that.

I'm keen to learn what the Haskell way is rather than just porting my old
C++ code directly.

> Immutable array-like thing with insertion and deletion are an
> ill-conceived idea, imho.  Every write operation would require a
> complete copy and often a reallocation, too.

It depends how many write operations there are in practice, versus how many
times you need to read from it using array access. A reallocation (amortized
cost O(0)) and copy (a simple memcpy) might be very fast compared to the
time it might take for generational garbage collection to deal with the
problem of cells in a previous generation referencing new cells as happens
in mutable data structures. But of course it's probably not optimal.

> Instead, use some functional sequence implementation, like Finger Trees.
> Operations in the middle of the sequence incur a logarithmic cost, but
> thats better than constantly copying the whole thing around.  Being
> immutable it also results in more idiomatic code where you don't need to
> drag the ST monad around everywhere.  You might also consider a
> Finger Tree of smallish Arrays, that's about the closest equivalent to
> std::deque you can get.

Thanks, I've downloaded a paper about them from
http://www.informatik.uni-bonn.de/~ralf/publications/FingerTrees.pdf so I'll
see if I can understand it! Looks interesting...

Best regards, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Christophe Poucet
In reply to this post by Cale Gibbard
Either way,

Even an STL vector has O(N) insert because it needss to shift all the
items past the current element where insertion is taking place.  If your
application is insert intensive the most ideal structure is a map.  
Concerning the suggestion regarding doubling the capacity, I don't
believe that STL actually doubles the allocated size.  Most applications
that use vector-like data structures (STL Vector, Ruby Array) typically
use the fibonacci sequence for the sizes because the doubling grows too
fast.

Cheers

Cale Gibbard wrote:

>Oops, replace Array there with DiffArray.
>
>On 19/04/06, Cale Gibbard <[hidden email]> wrote:
>  
>
>>Well, you could use something like C++ does for implementing STL
>>vectors in terms of arrays -- iirc, it generally doubles the allocated
>>size each time applying an operation to the vector would make it
>>exceed its capacity (the current allocation size), and keeps track of
>>the actual number of elements stored separately. It's important to do
>>allocation which is proportional to the existing capacity, or repeated
>>insertion will become quadratic time rather than linear. So
>>essentially, some data structure like
>>
>>data Vector a = Vector { store :: Array Int a, end :: Int }
>>
>>would be a sufficient minimal way to start. When the store needs to be
>>increased, you simply create a new array with twice as many elements,
>>copy the initial elements in from the old array which is then thrown
>>away, and only update end to the position of the actual last element.
>>This is analogous to what C++ implementations of the vector class do.
>>
>>What will bite you is if you try to generalise from indices of type
>>Int to instances of Ix -- the Ix operations assume that there are
>>lower and upper bounds on indices. The upper bound of course quickly
>>becomes a problem. You could however use Enum, which has toEnum and
>>fromEnum, which are sufficient to use with the implementation in terms
>>of Ints. It could also be claimed that Int isn't always the best
>>initial type to use, and indeed I'd still feel safer with Integer
>>somehow, but then, fromEnum and toEnum use Int, and if you have more
>>than 2 billion elements in your vector at the same time, maybe you
>>should be looking at your algorithm more carefully and/or doing your
>>own low level memory allocation via FFI. :)
>>
>> - Cale
>>
>>On 19/04/06, Brian Hulley <[hidden email]> wrote:
>>    
>>
>>>Hi -
>>>In C++, STL provides a vector class which behaves as an array except you can
>>>insert/delete elements from it. I'm wondering what is the best Haskell data
>>>structure to use to simulate this, either mutable or immutable.
>>>
>>>I've looked at the Array interface, but although there is the // operation
>>>to replace some elements, there does not appear to be a simple way to
>>>delete/insert elements.
>>>
>>>Ideally I'd like functions like:
>>>
>>>-- Insert new elements starting at the specified index, moving the others up
>>>to make room
>>>insert:: Array i e -> i -> [e] -> Array i e
>>>
>>>-- Delete a range of elements, moving later elements back down
>>>delete:: Array i e -> i -> i -> Array e
>>>
>>>-- Append a new element to the end of an array
>>>push_back :: Array i e -> e -> Array i e
>>>
>>>Is there an efficient way of implementing these operations in Haskell, based
>>>on arrays, or should I be using some other data structure altogether eg a
>>>list?
>>>
>>>Also, for large arrays, am I right in thinking that it is still more
>>>efficient to use immutable arrays rather than mutable arrays (because it is
>>>easier for gc to always just deal with immutable values)?
>>>
>>>Thanks, Brian.
>>>
>>>_______________________________________________
>>>Haskell-Cafe mailing list
>>>[hidden email]
>>>http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>
>>>      
>>>
>_______________________________________________
>Haskell-Cafe mailing list
>[hidden email]
>http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>  
>

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
In reply to this post by Cale Gibbard
Thanks. I might try this if I don't have any luck with finger trees (from
Udo's post), or if they seem too heavy for the simple thing I'm planning to
use them for (implementing the text buffer for an edit control which needs a
mutable array of lines where each line contains a mutable array of character
info). I don't need non-Int indices so your data type for Vector would be
fine.

Best regards, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
In reply to this post by Sebastian Sylvan
Sebastian Sylvan wrote:

> On 4/19/06, Brian Hulley <[hidden email]> wrote:
>> [snip]
>> Ideally I'd like functions like:
>>
>> -- Insert new elements starting at the specified index, moving the
>> others up to make room
>> insert:: Array i e -> i -> [e] -> Array i e
>>
>> -- Delete a range of elements, moving later elements back down
>> delete:: Array i e -> i -> i -> Array e
>>
>> -- Append a new element to the end of an array
>> push_back :: Array i e -> e -> Array i e
>>
>> Is there an efficient way of implementing these operations in
>> Haskell, based on arrays, or should I be using some other data
>> structure altogether eg a list?
>>
>
> Well not efficient in the C++ sense. You'd still have to create a
> whole new array for all of those operations. You can use the (//)
> operator to update all the elements you need to update...
>
> insert i xs arr = arr // shift_list
> where n = length xs
>          shift_list = zip [i..] xs ++ [ (j,arr!(j - n)) | j <- [i+n ..
> snd (bounds arr)]]

Thanks, but I think this would be too slow, unless the compiler could
somehow optimize out the list argument from // . I imagined that
insert/delete would just use C memcpy internally to quickly blit the cells
up or down.

>
>> Also, for large arrays, am I right in thinking that it is still more
>> efficient to use immutable arrays rather than mutable arrays
>> (because it is easier for gc to always just deal with immutable
>> values)?
>
> Well that depends on what you're doing. Mutable arrays are more
> efficient for most operation (replace is O(1) instead of O(n), for
> example), but it's possible that for small arrays immutable has a
> smaller constant factor (I'm certainly not sure that it does!)


I was thinking perhaps generational garbage collection suffers badly when
faced with mutable arrays but perhaps I'm wrong or it is not a necessary
consequence of using mutable data structures.

Best regards, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Bugzilla from robdockins@fastmail.fm
In reply to this post by Brian Hulley

On Apr 19, 2006, at 3:06 PM, Brian Hulley wrote:

> Thanks. I might try this if I don't have any luck with finger trees  
> (from Udo's post), or if they seem too heavy for the simple thing  
> I'm planning to use them for (implementing the text buffer for an  
> edit control which needs a mutable array of lines where each line  
> contains a mutable array of character info). I don't need non-Int  
> indices so your data type for Vector would be fine.

In that case, you may be interested in this paper, which discusses a  
data structure specifically designed for strings called 'ropes':

http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/ 
issue12/spe986.pdf


I'm not aware of a Haskell implementation of ropes, but there may  
well be one floating around.  Actually, I'd be kind of surprised if  
someone hasn't implemented this already (does YI use ropes?); it  
seems like such a great fit for Haskell.




Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
           -- TMBG



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Cale Gibbard
In reply to this post by Brian Hulley
I should perhaps point out that in the development GHC (iirc), there
is a library called Data.Sequence which uses 2-3 finger trees to get
rather nice efficient sequences. Operations on both ends (appending or
dropping) are constant time, while operations in the middle tend to be
on the order of the logarithm of the distance to the closer of the two
ends. For instance, concatenation and splitting at a point both have
complexity proportional to the logarithm of the smaller of the two
parts involved.

 - Cale

On 19/04/06, Brian Hulley <[hidden email]> wrote:

> Thanks. I might try this if I don't have any luck with finger trees (from
> Udo's post), or if they seem too heavy for the simple thing I'm planning to
> use them for (implementing the text buffer for an edit control which needs a
> mutable array of lines where each line contains a mutable array of character
> info). I don't need non-Int indices so your data type for Vector would be
> fine.
>
> Best regards, Brian.
>
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Udo Stenzel
In reply to this post by Brian Hulley
Brian Hulley wrote:
> in my particular case (which was a text buffer
> for an edit control implemented as a std::vector of lines where each line
> contains some book-keeping info plus a std::vector of character info)
> [...]
> I'm keen to learn what the Haskell way is rather than just porting my old
> C++ code directly.

Well, in this case I'd even say the Haskell way and the C++ way
coincide.  The best data structure is a rope (not standard in C++, but
widely available), which is basically a (balanced) tree of small
immutable strings that can share substrings.

Ropes provide indexing, concatenation and substring extraction with
logarithmic costs and traversal in amortized linear time.  Operations
through iterators have amortized constant cost, and the overall cost is
quite low.  They work best with garbage collection and actually sound
very Haskellish.  You could even dump your whole text into a single
rope, you don't need to split it into lines.  In fact, a text editor
implemented in exactly this way is the major selling point of the Rope
library.

We don't have an implementation of Ropes (yet?).  I think, a Finger Tree
of Fast Packed Strings might be the closest thing.  I'd even implement
it this way, if the low performance of raw Strings finally overcame my
inertia...


> A reallocation
> (amortized cost O(0)) and copy (a simple memcpy) might be very fast

Doing a memcpy every time a character is inserted is a Bad Thing.  It
gets slower the longer the edited line is.  Vim seems to do something
like that and I positively hate this behavior.  Use two Ropes instead
(or two Finger Trees) and the cost becomes amortized O(1).


> Thanks, I've downloaded a paper about them from
> http://www.informatik.uni-bonn.de/~ralf/publications/FingerTrees.pdf so
> I'll see if I can understand it! Looks interesting...

Even better, thanks to Ross Paterson you can get code at
http://www.soi.city.ac.uk/~ross/software/html/Data.Sequence.html or
simply get a recent version of GHC:
http://www.haskell.org/ghc/dist/current/docs/libraries/base/Data-Sequence.html


Udo.
--
Fuchs zu sein hei├čt nicht nur, einen Schwanz zu haben...
        -- Gipfelbuch auf dem Postakegel

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (196 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
In reply to this post by Cale Gibbard
Cale Gibbard wrote:
> I should perhaps point out that in the development GHC (iirc), there
> is a library called Data.Sequence which uses 2-3 finger trees to get
> rather nice efficient sequences. Operations on both ends (appending or
> dropping) are constant time, while operations in the middle tend to be
> on the order of the logarithm of the distance to the closer of the two
> ends. For instance, concatenation and splitting at a point both have
> complexity proportional to the logarithm of the smaller of the two
> parts involved.

Does anyone know where I can download this from (since I was just about to
try to implement exactly this myself based on the paper)? I can't find it
listed in the hierarchical libraries for ghc.

Thanks, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
In reply to this post by Bugzilla from robdockins@fastmail.fm
Robert Dockins wrote:

> On Apr 19, 2006, at 3:06 PM, Brian Hulley wrote:
>
>> Thanks. I might try this if I don't have any luck with finger trees
>> (from Udo's post), or if they seem too heavy for the simple thing
>> I'm planning to use them for (implementing the text buffer for an
>> edit control which needs a mutable array of lines where each line
>> contains a mutable array of character info). I don't need non-Int
>> indices so your data type for Vector would be fine.
>
> In that case, you may be interested in this paper, which discusses a
> data structure specifically designed for strings called 'ropes':
>
> http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/
> issue12/spe986.pdf
>
>
> I'm not aware of a Haskell implementation of ropes, but there may
> well be one floating around.  Actually, I'd be kind of surprised if
> someone hasn't implemented this already (does YI use ropes?); it
> seems like such a great fit for Haskell.

Thanks, I'll look into this paper too.
Incidentally, there are quite a lot of interesting issues that come up with
the task of implementing an edit control (or any other user interface
element) in Haskell. For example, if the user types several characters, a
naive implementation would enter each character individually into the text
buffer, resulting in a sequence of updates, even though the control would
not be re-rendered until there are no more (character) messages left in the
message queue, whereas another way is to represent the operations explicitly
eg Insert 'b' (Insert 'a' (Buffer ...)) so that this can be optimized to
InsertString "ab" (Buffer ...) resulting in only one update before the
control is re-rendered...

Best regards, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
In reply to this post by Udo Stenzel
On Wednesday 19th April 2006 20:39 Udo Stenzel wrote:
[snip]

> Even better, thanks to Ross Paterson you can get code at
> http://www.soi.city.ac.uk/~ross/software/html/Data.Sequence.html or
> simply get a recent version of GHC:
> http://www.haskell.org/ghc/dist/current/docs/libraries/base/Data-Sequence.html

Thanks for the link. On the wiki, all the links I found point to old
documentation ie docs/latest/html/libraries instead of
dist/current/docs/libraries.

Best regards, Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Greg Fitzgerald
In reply to this post by Brian Hulley
Brian,

> implementing the text buffer for an edit control

Just a few weeks ago I was thinking about a functional way to implement an edit control.  I ended up with the code below.  The function 'text' takes a stream of user input and returns the text to be displayed.

Rather than using a mutable buffer, I chose to implement this using a stream of user input and two stacks where each stack contains the characters on opposite sides of the cursor.  I'm certainly no expert, but I believe all operations are constant time, except that returning the final string at the end of the input stream is O(n).  Of course, if you have a huge amount of text to display and need to display it after each key is pressed, then this is pretty worthless.

text s = mySort s [] []

mySort      []     stack     sorted  = (reverse sorted) ++ stack
mySort ('R':xs)       []     sorted  = mySort xs       []     sorted
mySort ('R':xs) (s:stack)    sorted  = mySort xs    stack  (s:sorted)
mySort ('L':xs)    stack         []  = mySort xs    stack         []
mySort ('L':xs)    stack  (s:sorted) = mySort xs (s:stack)    sorted
mySort   (x:xs)    stack     sorted  = mySort xs    stack  (x:sorted)

Here's how it works:
The string 's' in the 'text' function is a string of numbers (sorry, my app only needed to handle numbers) and two special characters 'L' and 'R' which translate to MoveCursorOnePositionRight and MoveCursorOnePositionLeft

In 'mySort', numeric characters in the input stream are pushed onto the 'sorted' stack.    A 'Left' movement causes the head of the 'sorted' stack to be transferred to 'stack'.  A 'Right' movement causes the head of the 'stack' to be transferred to 'sorted'.  At the end of the input stream, the characters to the right of the cursor (the characters in 'stack') are appended to the characters to the left of the cursor (the reverse of 'sorted').

I'm new to Haskell so maybe Ropes are better, but if the problem you need to solve is as simple as mine, it's hard to read research papers when you can get the job finished with 7 lines of Prelude code.

Thanks,
Greg

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Brian Hulley
On Wednesday 19th April 2006 21::55 Greg Fitzgerald wrote:
> Brian,

> > implementing the text buffer for an edit control

> Just a few weeks ago I was thinking about a functional way to
> implement an edit control.  I ended up with the code below.
> The function 'text' takes a stream of  user input and returns the
> text to be displayed.

> Rather than using a mutable buffer, I chose to implement this
> using a stream of user input and two stacks where each stack
> contains the characters on opposite sides of the cursor.
>  I'm certainly no expert, but I believe all operations are constant
> time, except that returning the final string at the end of the input
> stream is O(n).  Of course, if you have a huge amount of text
> to display and need to display it after each key is pressed,
> then this is pretty worthless.
> [snip]

Thanks! I see you are representing the text buffer from the point of view of
the cursor therefore all edit ops are constant time or O(k) where k is the
numbers of chars involved in the change (eg for cut and paste).
Some ops such as clicking the mouse on a different part of the text after
scrolling up/down would be O(n) in the difference between the old and new
cursor position, but since most editing is presumably sequential, perhaps
the amortized cost would still be constant.

I think this would also work even for very large texts, because many things
can probably be done without having the create the whole string each time,
but I'll have to try it out to see how it works in practice.

In any case it is a very useful pattern to make use of the locality of
editing to get such a direct and efficient purely functional representation.
Before I was just looking at the data structure from a "bird's eye" view,
but I see the key to functional efficiency here is to represent it from a
place where everything non-local is not changing ie from the point of change
itself...

Thanks for this pattern,
Brian.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re[2]: Advice needed on best way to simulate an STL vector

Bulat Ziganshin-2
In reply to this post by Brian Hulley
Hello Brian,

Thursday, April 20, 2006, 12:36:12 AM, you wrote:

> Thanks for the link. On the wiki, all the links I found point to old
> documentation ie docs/latest/html/libraries instead of
> dist/current/docs/libraries.

the first link is for STABLE version (6.4), the second for the HEAD
(i.e. beta-stage) one (6.5, what will become 6.6 when it will be
released)

--
Best regards,
 Bulat                            mailto:[hidden email]

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Advice needed on best way to simulate an STL vector

Ross Paterson
In reply to this post by Udo Stenzel
On Wed, Apr 19, 2006 at 09:32:19PM +0200, Udo Stenzel wrote:
> We don't have an implementation of Ropes (yet?).  I think, a Finger Tree
> of Fast Packed Strings might be the closest thing.  I'd even implement
> it this way, if the low performance of raw Strings finally overcame my
> inertia...

Sounds like a good idea.  I've placed the general annotated finger tree
structure from our paper on the webpage

        http://www.soi.city.ac.uk/~ross/papers/FingerTree.html

It might be useful if anyone would like to attempt this.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe