k-minima in Haskell

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
37 messages Options
12
R M
Reply | Threaded
Open this post in threaded view
|

k-minima in Haskell

R M
What's the best way to implement the following function in haskell: Given a list and an integer k as input return the indices of the least k elements in the list. The code should be elegant and also, more importantly, must not make more than the minimum O(k*length(list)) number of operations.

R M


Check out what you're missing if you're not on Yahoo! Messenger
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Donald Bruce Stewart
mrvr84:
>
>    What's the best way to implement the following function in
>    haskell: Given a list and an integer k as input return the
>    indices of the least k elements in the list. The code should
>    be elegant and also, more importantly, must not make more
>    than the minimum O(k*length(list)) number of operations.
>    R M

Is this a homework question?

-- Don
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Stefan O'Rear
In reply to this post by R M
On Thu, Apr 12, 2007 at 08:58:33AM +0530, raghu vardhan wrote:
> What's the best way to implement the following function in haskell:
> Given a list and an integer k as input return the indices of the least
> k elements in the list. The code should be elegant and also, more
> importantly, must not make more than the minimum O(k*length(list))
> number of operations.

Go read and thoroughly understand "Why Functional Programming
Matters."

Also, your asyptotic complexity bound is just plain wrong.  I'd give
faster code, but Don is suspicious (and I can never tell these things
myself).

Stefan
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Stefan O'Rear
On Wed, Apr 11, 2007 at 08:38:48PM -0700, Stefan O'Rear wrote:

> On Thu, Apr 12, 2007 at 08:58:33AM +0530, raghu vardhan wrote:
> > What's the best way to implement the following function in haskell:
> > Given a list and an integer k as input return the indices of the least
> > k elements in the list. The code should be elegant and also, more
> > importantly, must not make more than the minimum O(k*length(list))
> > number of operations.
>
> Go read and thoroughly understand "Why Functional Programming
> Matters."
>
> Also, your asyptotic complexity bound is just plain wrong.  I'd give
> faster code, but Don is suspicious (and I can never tell these things
> myself).
>
> Stefan

Don tells me (in #haskell) that you are legitimate, so here is the
example:

kminima k lst = take k $ sort lst

If you want to be really explicit about it, here is a sort that will
work:

sort [] = []
sort l@(x:_) = filter (<x) l ++ filter (==x) l ++ filter (>x) l

(A stable quicksort, btw)

Note that this is FASTER than your bound - somewhere between O(n) and
O(n*log n).

Ain't lazy evaluation great? :)

Stefan
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Tim Chevalier
On 4/11/07, Stefan O'Rear <[hidden email]> wrote:
>
> If you want to be really explicit about it, here is a sort that will
> work:
>
> sort [] = []
> sort l@(x:_) = filter (<x) l ++ filter (==x) l ++ filter (>x) l
>
> (A stable quicksort, btw)

You may be missing a few recursive calls there :-)

Cheers,
Tim

--
Tim Chevalier * [hidden email] * Often in error, never in doubt
Confused? See http://catamorphism.org/transition.html
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Stefan O'Rear
On Wed, Apr 11, 2007 at 09:20:12PM -0700, Tim Chevalier wrote:

> On 4/11/07, Stefan O'Rear <[hidden email]> wrote:
> >
> >If you want to be really explicit about it, here is a sort that will
> >work:
> >
> >sort [] = []
> >sort l@(x:_) = filter (<x) l ++ filter (==x) l ++ filter (>x) l
> >
> >(A stable quicksort, btw)
>
> You may be missing a few recursive calls there :-)

Indeed.

Stefan
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
R M
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

R M
In reply to this post by R M
And just to remind people, the question is to find the indices and not the numbers themselves. For example on input '3, [10,9,8,..., 3,2,1]' the output must be '[9,8,7]'.

----- Original Message ----
From: Stefan O'Rear <[hidden email]>
To: raghu vardhan <[hidden email]>
Sent: Wednesday, 11 April, 2007 11:17:15 PM
Subject: Re: [Haskell-cafe] k-minima in Haskell

On Thu, Apr 12, 2007 at 09:30:22AM +0530, raghu vardhan wrote:
> Hmmm.  That's not something I was looking for. I'm not sure the
> running time is good enough (think of k as being 2 - then you should
> not make more than 2n comparisons) - even with lazy evaluation,
> quick sort won't have a bound of O(k*n).

Muahahahaha!
Muahahahahahahahahahaha!
Muahaha!

Actually it DOES.

(this list courtesy of a ghci one-liner)

find the 3 minima of [71,71,17,14,16,91,18,71,58,75,65,79,76,18,4,45,87,51,93,36]

============

take 3 (sort [71,71,17,14,16,91,18,71,58,75,65,79,76,18,4,45,87,51,93,36])

vvvvvvvvvvvv

take 3 (sort (filter (<71) [71,71,17,14,16,91,18,71,58,75,65,79,76,18,4,45,87,51,93,36]) ++ a bunch of stuff I won't track because it won't be evaluated)

vvvvvvvvvvvv  (comparisons so far: 20)

take 3 (sort [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36])

take 3 (sort (filter (<17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]) ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

vvvvvvvvvvvv (31)

take 3 (sort [14, 16, 18, 4] ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

take 3 (sort (filter (<14) [14, 16, 18, 4]) ++ sort (filter (==14) [14, 16, 18, 4]) ++ sort (filter (>14) [14, 16, 18, 4]) ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

vvvvvvvvvvvv (39)

take 3 (sort [4] ++ sort [14] ++ sort (filter (>14) [14, 16, 18, 4]) ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

take 3 (4 : 14 : sort (filter (>14) [14, 16, 18, 4]) ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

4 : 14 : take 1 (sort (filter (>14) [14, 16, 18, 4]) ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

vvvvvvvvvvvv (43)

4 : 14 : take 1 (sort [16, 18] ++
        filter (==17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36] ++
        sort (filter (>17) [17, 14, 16, 18, 58, 65, 18, 4, 45, 51, 36]))

vvvvvvvvvvvv (47)

[4, 14, 16]

47 close enough to O(n*k) for you?  (remember this is quicksort we are
dealing with, O(n^2) worst case)

Stefan



Check out what you're missing if you're not on Yahoo! Messenger
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

ajb@spamcop.net
In reply to this post by R M
G'day all.

Quoting raghu vardhan <[hidden email]>:

> What's the best way to implement the following function in haskell:
> Given a list and an integer k as input return the indices of the least k
> elements in the list. The code should be elegant and also, more importantly,
> must not make more than the minimum O(k*length(list)) number of operations.

Pretty much like everyone has says, although it's best to use a real
lazy O(n log n) sort, not quicksort-with-dumbest-pivot.  To get the
indices, use the Schwartzian transform:

sortWith :: (Ord b) => (a -> b) -> [a] -> [a]
sortWith f = mergeRuns . runs
  where
    runs = map (:[])

    mergeRuns [] = []
    mergeRuns [xs] = xs
    mergeRuns xss = mergeRuns (mergeRun xss)

    mergeRun (xs1:xs2:xss) = mergeOne xs1 xs2 : mergeRun xss
    mergeRun xss = xss

    mergeOne [] ys = ys
    mergeOne xs [] = xs
    mergeOne xs'@(x:xs) ys':(y:ys)
        = case compare (f x) (f y) of
            LT -> x : mergeOne xs ys'
            GT -> y : mergeOne xs' ys
            EQ -> x : y : mergeOne xs ys

getKMinima :: (Ord a) => [a] -> [Int]
getKMinima k = map fst . take k . sortWith snd . zip [0..]

Cheers,
Andrew Bromage
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
R M
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

R M
In reply to this post by R M

There seems to be some confusion about the question. There are two key things to keep in mind here:
1) You must make at most O(k*n) comparisons (in the worst case) if the list has length n.
2) The output must be the indices and not the numbers themselves.

So, any algorithm that sorts is no good (think of n as huge, and k small). Another interesting caveat to this is that if k=2, you can actually solve the problem with just (n+log n) comparisons in worst case(instead of 2*n, that you get by a naive approach), and it's a nice exercise to do this.

As a further clarification, this is not a homework question. I genereally do whatever programming I do in Matlab, as I work with matrices (huge ones) and use this function a lot. I just wanted to see how different an implementation you can get in Haskell (I am new to Haskell so I might not be able to come up with the best way to do this).

----- Original Message ----
From: Stefan O'Rear <[hidden email]>
To: raghu vardhan <[hidden email]>
Cc: [hidden email]
Sent: Wednesday, 11 April, 2007 10:47:08 PM
Subject: Re: [Haskell-cafe] k-minima in Haskell

On Wed, Apr 11, 2007 at 08:38:48PM -0700, Stefan O'Rear wrote:

> On Thu, Apr 12, 2007 at 08:58:33AM +0530, raghu vardhan wrote:
> > What's the best way to implement the following function in haskell:
> > Given a list and an integer k as input return the indices of the least
> > k elements in the list. The code should be elegant and also, more
> > importantly, must not make more than the minimum O(k*length(list))
> > number of operations.
>
> Go read and thoroughly understand "Why Functional Programming
> Matters."
>
> Also, your asyptotic complexity bound is just plain wrong.  I'd give
> faster code, but Don is suspicious (and I can never tell these things
> myself).
>
> Stefan

Don tells me (in #haskell) that you are legitimate, so here is the
example:

kminima k lst = take k $ sort lst

If you want to be really explicit about it, here is a sort that will
work:

sort [] = []
sort l@(x:_) = filter (<x) l ++ filter (==x) l ++ filter (>x) l

(A stable quicksort, btw)

Note that this is FASTER than your bound - somewhere between O(n) and
O(n*log n).

Ain't lazy evaluation great? :)

Stefan



Check out what you're missing if you're not on Yahoo! Messenger
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

ajb@spamcop.net
G'day all.

Quoting raghu vardhan <[hidden email]>:

> So, any algorithm that sorts is no good (think of n as huge, and k small).

The essence of all the answers that you've received is that it doesn't
matter if k is small, sorting is still the most elegant answer in Haskell.

The reason is that in Haskell, a function which sort function will produce the
sorted
list lazily.  If you don't demand the while list, you don't perform
the whole sort.

Some algorithms are better than others for minimising the amount of
work if not all of the list is demanded, but ideally, producing the
first k elements will take O(n log k + k) time.

Cheers,
Andrew Bromage
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Nicolas Frisby
[sorry for the double, ajb]

Since there seemed to be a disconnect between the expectation and
the previous answers, I thought an alternative suggestion might help
out. This sort of thing (haha) usually isn't my cup o' tea, so please
point out any blunders.

RM, is this more like what you had in mind? It leans more towards an
iterative approach. If so, I encourage you to compare this to the
other solutions and try understand how those solutions rely upon
laziness. Stefan and Andrew, feel free to point out the drawbacks to
this approach that I'm probably overlooking.


kminima n l = map fst (foldr insert' (List.sort pre) suf)
    where (pre, suf) = (splitAt n . zip [0..]) l

-- I think this insertion sort could be
-- O(log k) with a better data structure.
insert' x [] = []
insert' x (y : ys) | snd x < snd y = x : y : dropLast ys'
                           | otherwise = y : insert' x ys
    where dropLast ys = take (length ys - 1) ys

We grab the first k elements and sort them, this list is our first
approximation to the k-minima. We then process the rest of the list,
checking if the current element is smaller than any of the minima of
the current approximation. If the current element is smaller, we
improve the current approximation by inserting the new element and
dropping the biggest (i.e. last) minimum from the minima accumulator.

The worst behavior is: sort(k) + (n-k) * k comparisons. This could be
improved (to: sort(k) + (n-k) * log k comparisons, I think) with a
better data structure for the minima approximation.

On 4/12/07, [hidden email] <[hidden email]> wrote:

> G'day all.
>
> Quoting raghu vardhan <[hidden email]>:
>
> > So, any algorithm that sorts is no good (think of n as huge, and k small).
>
> The essence of all the answers that you've received is that it doesn't
> matter if k is small, sorting is still the most elegant answer in Haskell.
>
> The reason is that in Haskell, a function which sort function will produce the
> sorted
> list lazily.  If you don't demand the while list, you don't perform
> the whole sort.
>
> Some algorithms are better than others for minimising the amount of
> work if not all of the list is demanded, but ideally, producing the
> first k elements will take O(n log k + k) time.
>
> Cheers,
> Andrew Bromage
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Heinrich Apfelmus
In reply to this post by ajb@spamcop.net
raghu vardhan <[hidden email]>:
> So, any algorithm that sorts is no good (think of n as huge, and k small).

With lazy evaluation, it is!

  http://article.gmane.org/gmane.comp.lang.haskell.general/15010


[hidden email] wrote:
> The essence of all the answers that you've received is that it doesn't
> matter if k is small, sorting is still the most elegant answer in Haskell.

(It's not only most elegant, it can even be put to work.)

> The reason is that in Haskell, a function which sort function will produce the
> sorted list lazily. If you don't demand the while list, you don't perform
> the whole sort.
>
> Some algorithms are better than others for minimising the amount of
> work if not all of the list is demanded, but ideally, producing the
> first k elements will take O(n log k + k) time.

You mean O(k * log n + n) of course.

Regards,
apfelmus

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Dino Morelli
In reply to this post by R M
On Thu, 12 Apr 2007, raghu vardhan wrote:

>
> What's the best way to implement the following function in haskell:
> Given a list and an integer k as input return the indices of the least k
> elements in the list. The code should be elegant and also, more importantly, must not make more than the minimum O(k*length(list)) number of operations.
>
> R M
>


I don't know about performance, but trying this problem I was struck
again by the gorgeous, terse code that can be created:


import Data.List
import Data.Ord

kminima :: (Enum a, Num a, Ord b) => Int -> [b] -> [a]
kminima k lst = take k $ map fst $ sortBy (comparing snd) $ zip [0 ..] lst


commented:

    kminima k lst =
       take k                     -- We want k items off the front
       $ map fst                  -- Just the list of indices
       $ sortBy (comparing snd)   -- Sort the pairs by their snd
       $ zip [0 ..] lst           -- Preserve indices in tuples


Prelude> :l kminima.hs
[1 of 1] Compiling Main             ( kminima.lhs, interpreted )
Ok, modules loaded: Main.
*Main> kminima 3 [71,71,17,14,16,91,18,71,58,75,65,79,76,18,4,45,87,51,93,36]
[14,3,4]
*Main> kminima 4 [10,9,8,7,6,5,4,3,2,1]
[9,8,7,6]


--
  .~.    Dino Morelli
  /V\    email: [hidden email]
/( )\   irc: dino-
^^-^^   preferred distro: Debian GNU/Linux  http://www.debian.org
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Kurt Hutchinson
In reply to this post by ajb@spamcop.net
On 4/12/07, [hidden email] <[hidden email]> wrote:

> To get the indices, use the Schwartzian transform:
>
> sortWith :: (Ord b) => (a -> b) -> [a] -> [a]
> sortWith f = mergeRuns . runs
>   where
>     runs = map (:[])
>
>     mergeRuns [] = []
>     mergeRuns [xs] = xs
>     mergeRuns xss = mergeRuns (mergeRun xss)
>
>     mergeRun (xs1:xs2:xss) = mergeOne xs1 xs2 : mergeRun xss
>     mergeRun xss = xss
>
>     mergeOne [] ys = ys
>     mergeOne xs [] = xs
>     mergeOne xs'@(x:xs) ys':(y:ys)
>         = case compare (f x) (f y) of
>             LT -> x : mergeOne xs ys'
>             GT -> y : mergeOne xs' ys
>             EQ -> x : y : mergeOne xs ys
>
> getKMinima :: (Ord a) => [a] -> [Int]
> getKMinima k = map fst . take k . sortWith snd . zip [0..]

For my own edification, what is the benefit of this sortWith over sortBy?

f `on` g = \ x y -> f ( g x ) ( g y )
kminima k = map fst . take k . sortBy ( compare `on` snd ) . zip [0..]
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

vincent kraeutler
Kurt Hutchinson wrote:

> On 4/12/07, [hidden email] <[hidden email]> wrote:
>> To get the indices, use the Schwartzian transform:
>>
>> sortWith :: (Ord b) => (a -> b) -> [a] -> [a]
>> sortWith f = mergeRuns . runs
>>   where
>>     runs = map (:[])
>>
>>     mergeRuns [] = []
>>     mergeRuns [xs] = xs
>>     mergeRuns xss = mergeRuns (mergeRun xss)
>>
>>     mergeRun (xs1:xs2:xss) = mergeOne xs1 xs2 : mergeRun xss
>>     mergeRun xss = xss
>>
>>     mergeOne [] ys = ys
>>     mergeOne xs [] = xs
>>     mergeOne xs'@(x:xs) ys':(y:ys)
>>         = case compare (f x) (f y) of
>>             LT -> x : mergeOne xs ys'
>>             GT -> y : mergeOne xs' ys
>>             EQ -> x : y : mergeOne xs ys
>>
>> getKMinima :: (Ord a) => [a] -> [Int]
>> getKMinima k = map fst . take k . sortWith snd . zip [0..]
>
> For my own edification, what is the benefit of this sortWith over sortBy?
>
> f `on` g = \ x y -> f ( g x ) ( g y )
> kminima k = map fst . take k . sortBy ( compare `on` snd ) . zip [0..]

possibly related (newbie question):

pairs are instances of Ord, why not directly sort those (implying the
item to be sorted is fst):

> kminima k = \list -> map snd $ take k $ sort $ zip list [0..]




_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (257 bytes) Download Attachment
R M
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

R M
In reply to this post by Heinrich Apfelmus
 The link pretty much answers my question, though you probably require a little bit more book keeping to get the indices out. Compared to the iterative version or the matlab version, this definitely is more elegant, but also is trickier.

apfelmus wrote
raghu vardhan <mrvr84@yahoo.co.in>:
> So, any algorithm that sorts is no good (think of n as huge, and k small).

With lazy evaluation, it is!

  http://article.gmane.org/gmane.comp.lang.haskell.general/15010


ajb@spamcop.net wrote:
> The essence of all the answers that you've received is that it doesn't
> matter if k is small, sorting is still the most elegant answer in Haskell.

(It's not only most elegant, it can even be put to work.)

> The reason is that in Haskell, a function which sort function will produce the
> sorted list lazily. If you don't demand the while list, you don't perform
> the whole sort.
>
> Some algorithms are better than others for minimising the amount of
> work if not all of the list is demanded, but ideally, producing the
> first k elements will take O(n log k + k) time.

You mean O(k * log n + n) of course.

Regards,
apfelmus

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Yitzchak Gale
\begin{code}

kmin :: Ord a => Int -> [a] -> [Int]
kmin k x = map snd $ Set.toList $ foldl' insertIfSmall (Set.fromList h) t
  where
    (h, t) = splitAt k $ zip x [0..]

insertIfSmall :: Ord a => Set.Set a -> a -> Set.Set a
insertIfSmall s e
 | e < mx    = Set.insert e s'
 | otherwise = s
 where
   (mx, s') = Set.deleteFindMax s

\end{code}

This gives O(log k * (n + k)) execution in constant memory.
If you need the result indices to be in order, you can put in
a sort at the end without changing the complexity.

This could be improved by a significant constant factor
by using a priority queue instead of Set. Any Edison people
out there?

Regards,
Yitz
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

Yitzchak Gale
In reply to this post by R M
\begin{code}

kmin :: Ord a => Int -> [a] -> [Int]
kmin k x = map snd $ Set.toList $ foldl' insertIfSmall (Set.fromList h) t
  where
    (h, t) = splitAt k $ zip x [0..]

insertIfSmall :: Ord a => Set.Set a -> a -> Set.Set a
insertIfSmall s e
 | e < mx    = Set.insert e s'
 | otherwise = s
 where
   (mx, s') = Set.deleteFindMax s

\end{code}

This gives O(log k * (n + k)) execution in constant memory.
If you need the result indices to be in order, you can put in
a sort at the end without changing the complexity.

This could be improved by a significant constant factor
by using a priority queue instead of Set. Any Edison people
out there?

Regards,
Yitz
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Re: k-minima in Haskell

Steve Downey-2
In reply to this post by Heinrich Apfelmus
Does the answer change if the data source isn't a list, already in memory, but a stream? That is, will the sort end up pulling the entire stream into memory, when we only need k elements from the entire stream.

Interestingly, this is almost exactly the same as one of my standard interview questions, with the main difference being looking for the k elements closest to a target value, rather than the smallest. Not that it really changes the problem, but recognizing that is one of the things I'm looking for.

On 4/12/07, apfelmus <[hidden email]> wrote:
raghu vardhan <[hidden email]>:
> So, any algorithm that sorts is no good (think of n as huge, and k small).

With lazy evaluation, it is!

   http://article.gmane.org/gmane.comp.lang.haskell.general/15010


[hidden email] wrote:
> The essence of all the answers that you've received is that it doesn't
> matter if k is small, sorting is still the most elegant answer in Haskell.

(It's not only most elegant, it can even be put to work.)

> The reason is that in Haskell, a function which sort function will produce the
> sorted list lazily. If you don't demand the while list, you don't perform
> the whole sort.
>
> Some algorithms are better than others for minimising the amount of
> work if not all of the list is demanded, but ideally, producing the
> first k elements will take O(n log k + k) time.

You mean O(k * log n + n) of course.

Regards,
apfelmus

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: k-minima in Haskell

ajb@spamcop.net
In reply to this post by Kurt Hutchinson
G'day all.

Quoting Kurt Hutchinson <[hidden email]>:

> For my own edification, what is the benefit of this sortWith over sortBy?

None.  I wanted to write my own sort, illustrating how the lazy
evaluation thing works, and I didn't want a name clash with an
existing library function.

Cheers,
Andrew Bromage
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
12