Faster set intersections?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

Faster set intersections?

Jeffrey Brown
The following expressions both cause GHCI to hang:

> S.intersection (S.fromList [1,2]) (S.fromList [1..])
fromList ^CInterrupted.
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
fromList ^CInterrupted.
>

Is there a smarter way to take the intersection of sets when at least one of them is small (but I don't know which one that is)?

--
Jeff Brown | Jeffrey Benjamin Brown
Website   |   Facebook   |   LinkedIn(spammy, so I often miss messages here)   |   Github   

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Brandon Allbery
I suspect not. Maybe with fromAscList or fromDistinctAscList so it can know the list won't pop a smaller (here, duplicate) value on it, if it's lazy enough (I haven't checked). Lists don't record that they were generated from enumerations.

On Sat, Dec 8, 2018 at 7:46 PM Jeffrey Brown <[hidden email]> wrote:
The following expressions both cause GHCI to hang:

> S.intersection (S.fromList [1,2]) (S.fromList [1..])
fromList ^CInterrupted.
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
fromList ^CInterrupted.
>

Is there a smarter way to take the intersection of sets when at least one of them is small (but I don't know which one that is)?

--
Jeff Brown | Jeffrey Benjamin Brown
Website   |   Facebook   |   LinkedIn(spammy, so I often miss messages here)   |   Github   
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


--
brandon s allbery kf8nh

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Jan-Willem Maessen
In reply to this post by Jeffrey Brown
On Sat, Dec 8, 2018 at 7:46 PM Jeffrey Brown <[hidden email]> wrote:
The following expressions both cause GHCI to hang:

> S.intersection (S.fromList [1,2]) (S.fromList [1..])
fromList ^CInterrupted.
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
fromList ^CInterrupted.
>

Is there a smarter way to take the intersection of sets when at least one of them is small (but I don't know which one that is)?

Taking the intersection of already-constructed, finite sets is very fast if one of them is small (should be O(n lg m) where n is the size of the smaller set and m the size of the larger one).  However, Data.Set doesn't handle infinite sets as it tries to construct balanced trees and is strict in the keys.  Even fromAscList won't help you here.  To represent infinite (or bigger than you actually want to construct) sets you'll likely need to roll your own custom representation, and I suspect it will in general need to either have limitations on the key space or worse asymptotic performance.

That said, you can often get the behavior you're looking for by representing big sets by a membership predicate and using filter.  Here's an (untested) example of the right sort of thing:

data MySet k = Pred (k -> Bool) | Concrete (S.Set k)

intersection (Pred a) (Pred b) = Pred (\k -> a k && b k)
intersection (Concrete a) (Pred b) = Concrete (S.filter b a)
intersection (Pred a) (Concrete b) = Concrete (S.filter a b)
intersection (Concrete a) (Concrete b) = Concrete (S.intersection a b)

Note that a concrete set "concretizes" anything it touches.  Don't take unions of these sets, though, it'll just be a mess.

-Jan-Willem Maessen
 

--
Jeff Brown | Jeffrey Benjamin Brown
Website   |   Facebook   |   LinkedIn(spammy, so I often miss messages here)   |   Github   
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

David Feuer
In reply to this post by Jeffrey Brown
Not like that, no. The Set type is explicitly for *finite* sets only. fromList [1..] is bottom and will run out of memory. You'd need a *very* different implementation to be able to support infinite sets at all, and even then you'd only catch certain special cases.

On Sat, Dec 8, 2018, 7:46 PM Jeffrey Brown <[hidden email] wrote:
The following expressions both cause GHCI to hang:

> S.intersection (S.fromList [1,2]) (S.fromList [1..])
fromList ^CInterrupted.
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
fromList ^CInterrupted.
>

Is there a smarter way to take the intersection of sets when at least one of them is small (but I don't know which one that is)?

--
Jeff Brown | Jeffrey Benjamin Brown
Website   |   Facebook   |   LinkedIn(spammy, so I often miss messages here)   |   Github   
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Ivan Lazar Miljenovic
In reply to this post by Jan-Willem Maessen
On Dec 9, 2018 11:36 AM, "Jan-Willem Maessen" <[hidden email]> wrote:

Note that a concrete set "concretizes" anything it touches.  Don't take unions of these sets, though, it'll just be a mess.

Won't a union just be the same as intersection but using || instead of && ?

-Jan-Willem Maessen
 

--
Jeff Brown | Jeffrey Benjamin Brown
Website   |   Facebook   |   LinkedIn(spammy, so I often miss messages here)   |   Github   
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Olaf Klinke
In reply to this post by Jeffrey Brown
> Note that a concrete set "concretizes" anything it touches.  Don't take
> unions of these sets, though, it'll just be a mess.
>
>
> Won't a union just be the same as intersection but using || instead of && ?
>
>
> -Jan-Willem Maessen

Unions of predicates and concrete sets are easy, thanks to Set.member:

union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)

What you can not do, of course, is enumerate and fold these sets.
There is a set type [1] which supports a litte bit more:

Set a = Maybe ((a -> Bool) -> a)

It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
Caveat: Performance can be poor, depending on how the function inside the set was defined.

Cheers,
Olaf

[1] http://hackage.haskell.org/package/infinite-search
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Siddharth Bhat
I don't understand, how does 

(a -> Bool) -> a

model a set?

Thanks
Siddharth

On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> Note that a concrete set "concretizes" anything it touches.  Don't take
> unions of these sets, though, it'll just be a mess.
>
>
> Won't a union just be the same as intersection but using || instead of && ?
>
>
> -Jan-Willem Maessen

Unions of predicates and concrete sets are easy, thanks to Set.member:

union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)

What you can not do, of course, is enumerate and fold these sets.
There is a set type [1] which supports a litte bit more:

Set a = Maybe ((a -> Bool) -> a)

It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
Caveat: Performance can be poor, depending on how the function inside the set was defined.

Cheers,
Olaf

[1] http://hackage.haskell.org/package/infinite-search
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
Sending this from my phone, please excuse any typos!

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Brandon Allbery
Naïvely, a set implemented as a predicate determining membership?

On Sun, Dec 9, 2018 at 1:32 PM Siddharth Bhat <[hidden email]> wrote:
I don't understand, how does 

(a -> Bool) -> a

model a set?

Thanks
Siddharth

On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> Note that a concrete set "concretizes" anything it touches.  Don't take
> unions of these sets, though, it'll just be a mess.
>
>
> Won't a union just be the same as intersection but using || instead of && ?
>
>
> -Jan-Willem Maessen

Unions of predicates and concrete sets are easy, thanks to Set.member:

union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)

What you can not do, of course, is enumerate and fold these sets.
There is a set type [1] which supports a litte bit more:

Set a = Maybe ((a -> Bool) -> a)

It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
Caveat: Performance can be poor, depending on how the function inside the set was defined.

Cheers,
Olaf

[1] http://hackage.haskell.org/package/infinite-search
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
Sending this from my phone, please excuse any typos!
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.


--
brandon s allbery kf8nh

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Tom Ellis-5
That would be `a -> Bool`.

On Sun, Dec 09, 2018 at 01:36:06PM -0500, Brandon Allbery wrote:

> Naïvely, a set implemented as a predicate determining membership?
>
> On Sun, Dec 9, 2018 at 1:32 PM Siddharth Bhat <[hidden email]> wrote:
>
> > I don't understand, how does
> >
> > (a -> Bool) -> a
> >
> > model a set?
> >
> > Thanks
> > Siddharth
> >
> > On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> >
> >> > Note that a concrete set "concretizes" anything it touches.  Don't take
> >> > unions of these sets, though, it'll just be a mess.
> >> >
> >> >
> >> > Won't a union just be the same as intersection but using || instead of
> >> && ?
> >> >
> >> >
> >> > -Jan-Willem Maessen
> >>
> >> Unions of predicates and concrete sets are easy, thanks to Set.member:
> >>
> >> union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)
> >>
> >> What you can not do, of course, is enumerate and fold these sets.
> >> There is a set type [1] which supports a litte bit more:
> >>
> >> Set a = Maybe ((a -> Bool) -> a)
> >>
> >> It has unions, intersections and a Monad instance and can represent
> >> infinite sets. If the base type has an Ord instance, the set can be
> >> enumerated. If the base type has an Eq instance, so has (Set a). Some
> >> functions usually implemented using Foldable are also possible, e.g.
> >> minimum and maximum.
> >> Caveat: Performance can be poor, depending on how the function inside the
> >> set was defined.
> >>
> >> Cheers,
> >> Olaf
> >>
> >> [1] http://hackage.haskell.org/package/infinite-search
> >> _______________________________________________
> >> Haskell-Cafe mailing list
> >> To (un)subscribe, modify options or view archives go to:
> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> >> Only members subscribed via the mailman list are allowed to post.
> >
> > --
> > Sending this from my phone, please excuse any typos!
> > _______________________________________________
> > Haskell-Cafe mailing list
> > To (un)subscribe, modify options or view archives go to:
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> > Only members subscribed via the mailman list are allowed to post.
>
>
>
> --
> brandon s allbery kf8nh
> [hidden email]

> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

migmit-2
In reply to this post by Brandon Allbery
My guess — by finding a member that satisfies a predicate, if it's at all possible, and any member if the predicate is const False. It's actually pretty awesome.

> On 9 Dec 2018, at 19:36, Brandon Allbery <[hidden email]> wrote:
>
> Naïvely, a set implemented as a predicate determining membership?
>
> On Sun, Dec 9, 2018 at 1:32 PM Siddharth Bhat <[hidden email]> wrote:
> I don't understand, how does
>
> (a -> Bool) -> a
>
> model a set?
>
> Thanks
> Siddharth
>
> On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> > Note that a concrete set "concretizes" anything it touches.  Don't take
> > unions of these sets, though, it'll just be a mess.
> >
> >
> > Won't a union just be the same as intersection but using || instead of && ?
> >
> >
> > -Jan-Willem Maessen
>
> Unions of predicates and concrete sets are easy, thanks to Set.member:
>
> union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)
>
> What you can not do, of course, is enumerate and fold these sets.
> There is a set type [1] which supports a litte bit more:
>
> Set a = Maybe ((a -> Bool) -> a)
>
> It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
> Caveat: Performance can be poor, depending on how the function inside the set was defined.
>
> Cheers,
> Olaf
>
> [1] http://hackage.haskell.org/package/infinite-search
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
> --
> Sending this from my phone, please excuse any typos!
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
>
>
> --
> brandon s allbery kf8nh
> [hidden email]
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

David Feuer
That does seem to be what the source code says. The code only handles non-empty sets, but I *believe* the Maybe-wrapped version can handle intersections, albeit inefficiently. Can a version that looks like (a -> Bool) -> Maybe a work?

On Sun, Dec 9, 2018, 1:41 PM MigMit <[hidden email] wrote:
My guess — by finding a member that satisfies a predicate, if it's at all possible, and any member if the predicate is const False. It's actually pretty awesome.

> On 9 Dec 2018, at 19:36, Brandon Allbery <[hidden email]> wrote:
>
> Naïvely, a set implemented as a predicate determining membership?
>
> On Sun, Dec 9, 2018 at 1:32 PM Siddharth Bhat <[hidden email]> wrote:
> I don't understand, how does
>
> (a -> Bool) -> a
>
> model a set?
>
> Thanks
> Siddharth
>
> On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> > Note that a concrete set "concretizes" anything it touches.  Don't take
> > unions of these sets, though, it'll just be a mess.
> >
> >
> > Won't a union just be the same as intersection but using || instead of && ?
> >
> >
> > -Jan-Willem Maessen
>
> Unions of predicates and concrete sets are easy, thanks to Set.member:
>
> union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)
>
> What you can not do, of course, is enumerate and fold these sets.
> There is a set type [1] which supports a litte bit more:
>
> Set a = Maybe ((a -> Bool) -> a)
>
> It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
> Caveat: Performance can be poor, depending on how the function inside the set was defined.
>
> Cheers,
> Olaf
>
> [1] http://hackage.haskell.org/package/infinite-search
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
> --
> Sending this from my phone, please excuse any typos!
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
>
>
> --
> brandon s allbery kf8nh
> [hidden email]
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Olaf Klinke
In reply to this post by Siddharth Bhat

> Am 09.12.2018 um 19:31 schrieb Siddharth Bhat <[hidden email]>:
>
> I don't understand, how does
>
> (a -> Bool) -> a
>
> model a set?
MigMit was right. When I learned about this, I thought is was black magic. Suppose
find :: (a -> Bool) -> a
This function 'find' models a hypothetical non-empty set S. The specification is that for every predicate
p :: a -> Bool
that terminates on every element of S (this condition is important!), find p will be a member of S that satisfies p, if there is such a member, and any member of S otherwise. Since find is specified to always return a member of S, the set S is guaranteed to be non-empty. You can select some element from S by calling 'find' on (const True).

For example, the singleton x is defined as the constant function
\p -> x
The doubleton {x,y} is defined as
\p -> if p x then x else y
Binary union is defined as
union find find' = \p -> if p (find p) then find p else find' p
Existential quantification over S:
exists p = p (find p)
Universal quantification over S:
forall p = not (exists (not.p))

In order to represent the empty set as well, I stuck in the Maybe, so that Nothing represents the empty set and (Just find) represents a non-empty set.

Olaf

>
> Thanks
> Siddharth
>
> On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> > Note that a concrete set "concretizes" anything it touches.  Don't take
> > unions of these sets, though, it'll just be a mess.
> >
> >
> > Won't a union just be the same as intersection but using || instead of && ?
> >
> >
> > -Jan-Willem Maessen
>
> Unions of predicates and concrete sets are easy, thanks to Set.member:
>
> union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)
>
> What you can not do, of course, is enumerate and fold these sets.
> There is a set type [1] which supports a litte bit more:
>
> Set a = Maybe ((a -> Bool) -> a)
>
> It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
> Caveat: Performance can be poor, depending on how the function inside the set was defined.
>
> Cheers,
> Olaf
>
> [1] http://hackage.haskell.org/package/infinite-search
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
> --
> Sending this from my phone, please excuse any typos!

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
KC
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

KC
In reply to this post by Jeffrey Brown
Bloom Filters?
If some false positives are OK 

--
Sent from an expensive device which will be obsolete in a few months
Casey

On Sat, Dec 8, 2018, 4:46 PM Jeffrey Brown <[hidden email] wrote:
The following expressions both cause GHCI to hang:

> S.intersection (S.fromList [1,2]) (S.fromList [1..])
fromList ^CInterrupted.
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
fromList ^CInterrupted.
>

Is there a smarter way to take the intersection of sets when at least one of them is small (but I don't know which one that is)?

--
Jeff Brown | Jeffrey Benjamin Brown
Website   |   Facebook   |   LinkedIn(spammy, so I often miss messages here)   |   Github   
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Olaf Klinke
In reply to this post by David Feuer

> Am 09.12.2018 um 19:47 schrieb David Feuer <[hidden email]>:
>
> That does seem to be what the source code says. The code only handles non-empty sets, but I *believe* the Maybe-wrapped version can handle intersections, albeit inefficiently. Can a version that looks like (a -> Bool) -> Maybe a work?
I think so. The empty set is then represented by the function that returns Nothing even on (const True).
Indeed, with Maybe on the outside the intersection function requires the Eq constraint on the base type: We have
member :: Eq a => a -> ((a -> Bool) -> a) -> Bool
member x find = x == (find (x==))

Then the intersection of find and find' can be implemented using
\p -> find (\x -> p x && member x find')

But I fail to see how having Maybe on the inside remedies this situation. Furthermore, on Eq types these sets are not so interesting, for the following reason.

One can write a function
Eq a => ((a -> Bool) -> a) -> [a]
that enumerates the elements of the set. Because we have universal quantification, this list can not be infinite. Which makes sense, topologically: These so-called searchable sets are topologically compact, and the Eq constraint means the space is discrete. Compact subsets of a discrete space are finite.

Olaf

>
> On Sun, Dec 9, 2018, 1:41 PM MigMit <[hidden email] wrote:
> My guess — by finding a member that satisfies a predicate, if it's at all possible, and any member if the predicate is const False. It's actually pretty awesome.
>
> > On 9 Dec 2018, at 19:36, Brandon Allbery <[hidden email]> wrote:
> >
> > Naïvely, a set implemented as a predicate determining membership?
> >
> > On Sun, Dec 9, 2018 at 1:32 PM Siddharth Bhat <[hidden email]> wrote:
> > I don't understand, how does
> >
> > (a -> Bool) -> a
> >
> > model a set?
> >
> > Thanks
> > Siddharth
> >
> > On Sun, 9 Dec, 2018, 22:08 Olaf Klinke, <[hidden email]> wrote:
> > > Note that a concrete set "concretizes" anything it touches.  Don't take
> > > unions of these sets, though, it'll just be a mess.
> > >
> > >
> > > Won't a union just be the same as intersection but using || instead of && ?
> > >
> > >
> > > -Jan-Willem Maessen
> >
> > Unions of predicates and concrete sets are easy, thanks to Set.member:
> >
> > union (Pred p) (Concrete s) = Pred (\k -> p k || member k s)
> >
> > What you can not do, of course, is enumerate and fold these sets.
> > There is a set type [1] which supports a litte bit more:
> >
> > Set a = Maybe ((a -> Bool) -> a)
> >
> > It has unions, intersections and a Monad instance and can represent infinite sets. If the base type has an Ord instance, the set can be enumerated. If the base type has an Eq instance, so has (Set a). Some functions usually implemented using Foldable are also possible, e.g. minimum and maximum.
> > Caveat: Performance can be poor, depending on how the function inside the set was defined.
> >
> > Cheers,
> > Olaf
> >
> > [1] http://hackage.haskell.org/package/infinite-search
> > _______________________________________________
> > Haskell-Cafe mailing list
> > To (un)subscribe, modify options or view archives go to:
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> > Only members subscribed via the mailman list are allowed to post.
> > --
> > Sending this from my phone, please excuse any typos!
> > _______________________________________________
> > Haskell-Cafe mailing list
> > To (un)subscribe, modify options or view archives go to:
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> > Only members subscribed via the mailman list are allowed to post.
> >
> >
> > --
> > brandon s allbery kf8nh
> > [hidden email]
> > _______________________________________________
> > Haskell-Cafe mailing list
> > To (un)subscribe, modify options or view archives go to:
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> > Only members subscribed via the mailman list are allowed to post.
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

David Feuer
On Sun, Dec 9, 2018, 2:30 PM Olaf Klinke <[hidden email] wrote:

But I fail to see how having Maybe on the inside remedies this situation. Furthermore, on Eq types these sets are not so interesting, for the following reason.

With Maybe on the outside, you can't jump straight to defining the function; you must first determine whether the intersection is empty. To my mind, a more natural definition for (possibly empty) sets is

data Set a = Set {find :: (a -> Maybe b) -> Maybe b}

which really explains how this is a set.

One can write a function
Eq a => ((a -> Bool) -> a) -> [a]
that enumerates the elements of the set. Because we have universal quantification, this list can not be infinite. Which makes sense, topologically: These so-called searchable sets are topologically compact, and the Eq constraint means the space is discrete. Compact subsets of a discrete space are finite.

I don't understand how that's finite and not just countable.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

David Feuer


On Sun, Dec 9, 2018, 3:54 PM David Feuer <[hidden email] wrote:

I don't understand how that's finite and not just countable.

Ah, I guess because there's no way to look at all elements of an infinite set. D'oh.

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Doug McIlroy
In reply to this post by Jeffrey Brown
> The following expressions both cause GHCI to hang:
>
> S.intersection (S.fromList [1,2]) (S.fromList [1..])
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
>
> Is there a smarter way to take the intersection of sets when at least one
> of them is small (but I don't know which one that is)?

"Small" is not enough. If all you know is that the lists
represent sets of integers, then

   S.intersection (S.fromList [-1,-2]) (S.fromList [1..])

must diverge. A sufficient condition for success in such examples
is that the sets come from a well-ordered domain and are presented
in order. Then it's easy to write a merge-like algorithm.
Both sets can be infinite; every finite prefix of the intersection
will eventually appear.

(Note that the integers are not well-ordered, though the positive
integers are.)

Doug
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Siddharth Bhat
According to this construction, enumeration will need some SAT enumeration like process, where you search with a new predicate barring all previous candidates? 

On Mon, 10 Dec, 2018, 08:50 Doug McIlroy, <[hidden email]> wrote:
> The following expressions both cause GHCI to hang:
>
> S.intersection (S.fromList [1,2]) (S.fromList [1..])
> S.intersection (S.fromList [1..]) (S.fromList [1,2])
>
> Is there a smarter way to take the intersection of sets when at least one
> of them is small (but I don't know which one that is)?

"Small" is not enough. If all you know is that the lists
represent sets of integers, then

   S.intersection (S.fromList [-1,-2]) (S.fromList [1..])

must diverge. A sufficient condition for success in such examples
is that the sets come from a well-ordered domain and are presented
in order. Then it's easy to write a merge-like algorithm.
Both sets can be infinite; every finite prefix of the intersection
will eventually appear.

(Note that the integers are not well-ordered, though the positive
integers are.)

Doug
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
--
Sending this from my phone, please excuse any typos!

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Clinton Mead
In reply to this post by Olaf Klinke

One can write a function
Eq a => ((a -> Bool) -> a) -> [a]
that enumerates the elements of the set. Because we have universal quantification, this list can not be infinite. Which makes sense, topologically: These so-called searchable sets are topologically compact, and the Eq constraint means the space is discrete. Compact subsets of a discrete space are finite.

Olaf


Olaf (or anyone else), can you help me out here and write this function, with some example inputs and outputs? 

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
Reply | Threaded
Open this post in threaded view
|

Re: Faster set intersections?

Olaf Klinke

> Am 11.12.2018 um 00:50 schrieb Clinton Mead <[hidden email]>:
>
>
> One can write a function
> Eq a => ((a -> Bool) -> a) -> [a]
> that enumerates the elements of the set. Because we have universal quantification, this list can not be infinite. Which makes sense, topologically: These so-called searchable sets are topologically compact, and the Eq constraint means the space is discrete. Compact subsets of a discrete space are finite.
>
> Olaf
>
>
> Olaf (or anyone else), can you help me out here and write this function, with some example inputs and outputs?

Below is my testing version of the searchable sets. The names differ slightly from the ones in the infinite-search package. The function you are looking for is called 'enumerate' -- Olaf

module Searchable where
import Control.Monad
import qualified Control.Applicative
import Data.Set (Set)
import qualified Data.Set as Set
-- Escardo's Monad for non-empty compact subsets.
-- Data type of searchable subsets of a.
-- Specification:
-- For every predicate p :: a -> Bool and searchable set k :: S a,
-- there exists some x in k with p(x) if and only if  p(find k p).
newtype S a = Finder ((a -> Bool) -> a)

find :: S a -> (a -> Bool) -> a
find (Finder f) p = f p

exists :: S a -> (a -> Bool) -> Bool
exists (Finder f) p = p (f p)

forall :: S a -> (a -> Bool) -> Bool
forall k p = not $ exists k (not.p)

instance Functor S where
    fmap g (Finder f) = Finder (\p -> g(f(p.g)))

singleton :: a -> S a
singleton x = Finder (const x)

-- Compact subsets of totally ordered types have least and greatest elements.
isSingleton :: (Ord a) => S a -> Bool
isSingleton k = (inf k id) >= (sup k id)

doubleton :: a -> a -> S a
doubleton x y = Finder (\p -> if p x then x else y)

-- doubleton x y = finite [x,y]
finite :: [a] -> S a
finite [] = error "{} is compact, but empty"
finite [x] = singleton x
finite (x:xs) = (singleton x) \/ (finite xs)

-- the union of compactly many compact sets is compact.
-- union (doubleton x y) = x \/ y
union :: S (S a) -> S a
union (Finder ff) = Finder (\p ->
    let Finder f = ff (\k -> exists k p)
    in f p)

-- binary union. Generalises doubleton.
infixl 5 \/
(\/) :: S a -> S a -> S a
(Finder f) \/ (Finder f') = Finder (\p -> if p (f p) then f p else f' p)

instance Monad S where
    return = singleton
    k >>= f = union $ fmap f k
instance Control.Applicative.Applicative S where
    pure = return
    (<*>) = ap

-- Searchable subsets of discrete spaces are clopen,
-- searchable subsets of Hausdorff spaces are closed.  
-- Notice that
--
-- @
-- (flip member) :: S a -> (a -> Bool)
-- @
--
-- This predicate returns 'False' iff the element is not member of the set
-- and 'True' for every element of the set, provided that equality is decidable.
member :: (Eq a) => a -> S a -> Bool
x `member` k = exists k (x ==)

-- output the part of the list that is in the compact set.
filterS :: (Eq a) => S a -> [a] -> [a]
filterS k = filter (flip member k)

-- In every sober space, the intersection of a compact set
-- with a closed set is compact (but may be empty).
-- If the intersection is not empty, this function will compute it.
intersect :: (a -> Bool) -> S a -> S a
intersect c k = Finder (\p -> find k (\x -> p x && c x))

-- instances of Ord have searchable subsets that can be well-ordered.
sort :: (Ord a) => S a -> Set a
sort k = let
    i = inf k id
    extend current_largest set = if y > current_largest then extend y (Set.insert y set) else set where
        y = find k (current_largest <)
    in extend i (Set.singleton i)

-- instances of Eq have searchable subsets that can be enumerated. O(n^2).
-- The Eq constraint means the underlying space is discrete,
-- and compact subsets of a discrete space are finite.
enumerate :: (Eq a) => S a -> [a]
enumerate k = let
    another p = let y = find k p in if p y then Just y else Nothing
    Just x0 = another (const True)
    extend enumeration = case another (not.flip elem enumeration) of
        Nothing -> enumeration
        Just e  -> extend (e:enumeration)
    in extend [x0]

-- fold of a function into a commutative monoid.
-- Since searchable sets have no intrinsic order,
-- the result of the fold is only well-defined if the monoid is commutative.
foldMapEq :: (Monoid m, Eq m) => (a -> m) -> S a -> m
foldMapEq f k = mconcat (enumerate (fmap f k))

-- show at most 8 elements of a compact set.
instance (Show a, Ord a) => Show (S a) where
    show k = begin $ take 8 $ Set.elems $ sort k where
        begin l
            | length l < 8 = show l
            | otherwise = (init $ show l)++", ...]"

-- Equality lifts to searchable subsets,
-- because the subset relation is computable.
instance (Eq a) => Eq (S a) where
    k == k' = let
        subset k k' = forall k (\x -> member x k')
        in (subset k k') && (subset k' k)

-- A continuous map on a compact set attains its infimum and supremum.
inf :: (Ord b) => S a -> (a -> b) -> a
inf k@(Finder f) g = f (\x -> forall k (\y -> g(x) <= g(y)))

sup :: (Ord b) => S a -> (a -> b) -> a
sup k@(Finder f) g = f (\x -> forall k (\y -> g(y) <= g(x)))

-- Hausdorff distance, given a metric.
-- This is a special case of the so-called Egli-Milner relation lifting.
dHaus :: (Ord r) => (a -> a -> r) -> S a -> S a -> r
dHaus d k k' = let
    h k1 k2 = i (sup k1 i) where
        i = \x -> d x (inf k2 (d x))
    in max (h k k') (h k' k)

{-- Examples --}

-- [()] = ℕ ∪ {∞} is compact.
nats :: S [()]
nats = Finder f where
    f p = if p [] then [] else ():f (\n -> p (():n))

-- Cantor space (ℕ → 2) is a compact subspace of Baire space (ℕ → ℕ).
cantor :: S [Int]
cantor = sequence $ repeat $ doubleton 0 1

-- Drinker's paradox:
-- For evers non-empty compact pub,
-- there is a person x such that
-- if x drinks, then everybody in the pub drinks.
drinker :: S person -> (person -> Bool) -> person
drinker in_pub drinks = find in_pub (not.drinks)
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.