Abstraction in data types

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Abstraction in data types

Darrin Chandler
Hi,

Trying to get up to speed in Haskell, I'm playing with doing some
abstraction in data types. Specifically, I have this:

type Cartesian_coord = Float

type Latitude  = Float
type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord)
                | Spherical (Latitude, Longitude)

type Center = Point
type Radius = Float

data Shape = Circle Center Radius
                | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and
Spherical points. Polygon needs to be one or the other, but not mixed.

I could define seperate types for Cartesian and Spherical and seperate
CartesianPoly and SphericalPoly, but that doesn't seem very elegant and
also increases as I add more coordinate systems and shapes. I read a
little on GADTs, et al, but I'm not sure if that's what I want for this
or not.

Any help appreciated!

--
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
[hidden email]   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

attachment0 (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Alexander Solla-2
I wrote this to Darrin, but didn't CC cafe:

On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:

type Cartesian_coord = Float

type Latitude  = Float
type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord)
| Spherical (Latitude, Longitude)

type Center = Point
type Radius = Float

data Shape = Circle Center Radius
| Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and
Spherical points. Polygon needs to be one or the other, but not mixed.

My suggestion would be to use an alternate representation of "spherical" points in terms of polar coordinates, and then to normalize and mix at will:

type Theta = Float
type Radius = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord)
           | Polar   (Theta, Radius)

normalize_point :: Point -> Point
normalize_point Cartesian x y = Cartesian x y
normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin t;

It really depends on what you want to do with your points.  If you want to do linear algebra, you might want your points to depend on a basis, for example.  But your "spherical" points don't really form a basis in three-space, or even over all of two-space.

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Lyndon Maydwell
You could probably also use a typeclass for pointy things rather than
a data type, this would then require you to use existential
quantification to construct a hetrogenous list.

For example:

Class Point where
    getCartesian :: ...
    getPolar :: ...

data Shape = Point p => ... | Polygon [p]

Correct me if this is wrong though :-)

On Thu, Mar 18, 2010 at 12:56 PM, Alexander Solla <[hidden email]> wrote:

> I wrote this to Darrin, but didn't CC cafe:
> On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:
>
> type Cartesian_coord = Float
>
> type Latitude  = Float
> type Longitude = Float
>
> data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> | Spherical (Latitude, Longitude)
>
> type Center = Point
> type Radius = Float
>
> data Shape = Circle Center Radius
> | Polygon [Point]
>
> This obviously stinks since a Polygon could contain mixed Cartesian and
> Spherical points. Polygon needs to be one or the other, but not mixed.
>
> My suggestion would be to use an alternate representation of "spherical"
> points in terms of polar coordinates, and then to normalize and mix at will:
> type Theta = Float
> type Radius = Float
> data Point = Cartesian (Cartesian_coord, Cartesian_coord)
>            | Polar   (Theta, Radius)
> normalize_point :: Point -> Point
> normalize_point Cartesian x y = Cartesian x y
> normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin
> t;
> It really depends on what you want to do with your points.  If you want to
> do linear algebra, you might want your points to depend on a basis, for
> example.  But your "spherical" points don't really form a basis in
> three-space, or even over all of two-space.
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Alexander Solla-2
In reply to this post by Alexander Solla-2

On Mar 17, 2010, at 9:56 PM, Alexander Solla wrote:

But your "spherical" points don't really form a basis in three-space, or even over all of two-space.

I'll take this back.  Lattitude and longitude is enough to "form a basis" on R^2, by taking a basis for the surface of the sphere in terms of latitude and longitude and projecting it stereographically.  So if you wanted to use the normalization idea, you could use the stereographic projection formulas to turn a spherical point into a Cartesian point.


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Darrin Chandler
In reply to this post by Alexander Solla-2
On Wed, Mar 17, 2010 at 09:56:14PM -0700, Alexander Solla wrote:

> I wrote this to Darrin, but didn't CC cafe:
>
> On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:
>
> >type Cartesian_coord = Float
> >
> >type Latitude  = Float
> >type Longitude = Float
> >
> >data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> > | Spherical (Latitude, Longitude)
> >
> >type Center = Point
> >type Radius = Float
> >
> >data Shape = Circle Center Radius
> > | Polygon [Point]
> >
> >This obviously stinks since a Polygon could contain mixed
> >Cartesian and
> >Spherical points. Polygon needs to be one or the other, but not mixed.
>
> My suggestion would be to use an alternate representation of
> "spherical" points in terms of polar coordinates, and then to
> normalize and mix at will:
>
> type Theta = Float
> type Radius = Float
>
> data Point = Cartesian (Cartesian_coord, Cartesian_coord)
>            | Polar   (Theta, Radius)
>
> normalize_point :: Point -> Point
> normalize_point Cartesian x y = Cartesian x y
> normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r
> * sin t;
>
> It really depends on what you want to do with your points.  If you
> want to do linear algebra, you might want your points to depend on a
> basis, for example.  But your "spherical" points don't really form a
> basis in three-space, or even over all of two-space.
I see what you mean, but I don't think that's what I need. I want to
have keep Lat/Lon, as I may have large groups of shapes in Lat/Lon and
want to do things with them as is. And the same for cartesian coords.
Sometimes I will translate betweem lat/lon and cartesian, but many times
I will be doing calculations in "native" coordinates. But it's a nice
technique you show, and it will come in handy elsewhere.

--
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
[hidden email]   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

attachment0 (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Darrin Chandler
In reply to this post by Alexander Solla-2
On Wed, Mar 17, 2010 at 10:20:02PM -0700, Alexander Solla wrote:

>
> On Mar 17, 2010, at 9:56 PM, Alexander Solla wrote:
>
> >But your "spherical" points don't really form a basis in three-
> >space, or even over all of two-space.
>
> I'll take this back.  Lattitude and longitude is enough to "form a
> basis" on R^2, by taking a basis for the surface of the sphere in
> terms of latitude and longitude and projecting it stereographically.
> So if you wanted to use the normalization idea, you could use the
> stereographic projection formulas to turn a spherical point into a
> Cartesian point.
>
> http://en.wikipedia.org/wiki/Stereographic_projection
Yes. I believe other projections can be used as well (orthographic,
etc).

--
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
[hidden email]   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

attachment0 (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Darrin Chandler
In reply to this post by Lyndon Maydwell
On Thu, Mar 18, 2010 at 01:06:25PM +0800, Lyndon Maydwell wrote:

> You could probably also use a typeclass for pointy things rather than
> a data type, this would then require you to use existential
> quantification to construct a hetrogenous list.
>
> For example:
>
> Class Point where
>     getCartesian :: ...
>     getPolar :: ...
>
> data Shape = Point p => ... | Polygon [p]
>
> Correct me if this is wrong though :-)
So in "normal" use Polygon list would be homogeneous, but could be made
heterogeneous with effort? If I have that right it's closer, but I'd
love to have the compiler whine if someone tried to mix them.

> On Thu, Mar 18, 2010 at 12:56 PM, Alexander Solla <[hidden email]> wrote:
> > I wrote this to Darrin, but didn't CC cafe:
> > On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:
> >
> > type Cartesian_coord = Float
> >
> > type Latitude  = Float
> > type Longitude = Float
> >
> > data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> > | Spherical (Latitude, Longitude)
> >
> > type Center = Point
> > type Radius = Float
> >
> > data Shape = Circle Center Radius
> > | Polygon [Point]
> >
> > This obviously stinks since a Polygon could contain mixed Cartesian and
> > Spherical points. Polygon needs to be one or the other, but not mixed.
> >
> > My suggestion would be to use an alternate representation of "spherical"
> > points in terms of polar coordinates, and then to normalize and mix at will:
> > type Theta = Float
> > type Radius = Float
> > data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> >            | Polar   (Theta, Radius)
> > normalize_point :: Point -> Point
> > normalize_point Cartesian x y = Cartesian x y
> > normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin
> > t;
> > It really depends on what you want to do with your points.  If you want to
> > do linear algebra, you might want your points to depend on a basis, for
> > example.  But your "spherical" points don't really form a basis in
> > three-space, or even over all of two-space.
> > _______________________________________________
> > Haskell-Cafe mailing list
> > [hidden email]
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> >
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
--
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
[hidden email]   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

attachment0 (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Lyndon Maydwell
Well you need to define a new datatype to make a hertrogenous list, so
I don't think there's any real way you can get around people doing
that...

On Thu, Mar 18, 2010 at 1:27 PM, Darrin Chandler
<[hidden email]> wrote:

> On Thu, Mar 18, 2010 at 01:06:25PM +0800, Lyndon Maydwell wrote:
>> You could probably also use a typeclass for pointy things rather than
>> a data type, this would then require you to use existential
>> quantification to construct a hetrogenous list.
>>
>> For example:
>>
>> Class Point where
>>     getCartesian :: ...
>>     getPolar :: ...
>>
>> data Shape = Point p => ... | Polygon [p]
>>
>> Correct me if this is wrong though :-)
>
> So in "normal" use Polygon list would be homogeneous, but could be made
> heterogeneous with effort? If I have that right it's closer, but I'd
> love to have the compiler whine if someone tried to mix them.
>
>> On Thu, Mar 18, 2010 at 12:56 PM, Alexander Solla <[hidden email]> wrote:
>> > I wrote this to Darrin, but didn't CC cafe:
>> > On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:
>> >
>> > type Cartesian_coord = Float
>> >
>> > type Latitude  = Float
>> > type Longitude = Float
>> >
>> > data Point = Cartesian (Cartesian_coord, Cartesian_coord)
>> > | Spherical (Latitude, Longitude)
>> >
>> > type Center = Point
>> > type Radius = Float
>> >
>> > data Shape = Circle Center Radius
>> > | Polygon [Point]
>> >
>> > This obviously stinks since a Polygon could contain mixed Cartesian and
>> > Spherical points. Polygon needs to be one or the other, but not mixed.
>> >
>> > My suggestion would be to use an alternate representation of "spherical"
>> > points in terms of polar coordinates, and then to normalize and mix at will:
>> > type Theta = Float
>> > type Radius = Float
>> > data Point = Cartesian (Cartesian_coord, Cartesian_coord)
>> >            | Polar   (Theta, Radius)
>> > normalize_point :: Point -> Point
>> > normalize_point Cartesian x y = Cartesian x y
>> > normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin
>> > t;
>> > It really depends on what you want to do with your points.  If you want to
>> > do linear algebra, you might want your points to depend on a basis, for
>> > example.  But your "spherical" points don't really form a basis in
>> > three-space, or even over all of two-space.
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > [hidden email]
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >
>> >
>> _______________________________________________
>> Haskell-Cafe mailing list
>> [hidden email]
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
> --
> Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
> [hidden email]   |  http://phxbug.org/      |  http://metabug.org/
> http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Brandon S Allbery KF8NH
In reply to this post by Darrin Chandler
On Mar 18, 2010, at 01:27 , Darrin Chandler wrote:
> On Thu, Mar 18, 2010 at 01:06:25PM +0800, Lyndon Maydwell wrote:
>> You could probably also use a typeclass for pointy things rather than
>> a data type, this would then require you to use existential
>> quantification to construct a hetrogenous list.
>
> So in "normal" use Polygon list would be homogeneous, but could be  
> made
> heterogeneous with effort? If I have that right it's closer, but I'd
> love to have the compiler whine if someone tried to mix them.

They can be mixed only with significant effort.  And you can't really  
prevent it once your users know the magic of existential  
quantification; *but* those lists won't typecheck when passed to your  
routines expecting a Polygon, because you will be expecting a (Point a  
=> Polygon [a]) but they will be passing a (Polygon [forall a. Point a  
=> a]) or something similar.

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [hidden email]
system administrator [openafs,heimdal,too many hats] [hidden email]
electrical and computer engineering, carnegie mellon university    KF8NH



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

PGP.sig (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Alexander Solla-2
In reply to this post by Darrin Chandler

On Mar 17, 2010, at 10:27 PM, Darrin Chandler wrote:

Let's go back to your original code:

data Point = Cartesian (Cartesian_coord, Cartesian_coord)
                | Spherical (Latitude, Longitude)

type Center = Point
type Radius = Float

data Shape = Circle Center Radius
                | Polygon [Point]


normalize_shape :: Shape -> Shape
normalize_shape Circle c r = Circle c r
normalize_shape Polygon ps = Polygon $ fmap normalize_point ps where
        normalize_point = something appropriate for the function.

In fact, you could lift this into a higher order function, that takes  
a normalize_point function as an argument:

normalize_shape :: (Point -> Point) -> Shape -> Shape
normalize_shape f (Circle c r)= Circle (f c) r
normalize_shape f (Polygon ps) = Polygon $ fmap f ps

Now, I'm not suggesting that you should always normalize shapes, as I  
had with normalize_point before.  But this combinator captures some  
nice, generic logic. For example, you can do stuff like:

cartesian_shape :: Shape -> Shape
cartesian_shape = normalize_shape cartesian_point where ...

normalize_shape is the sort of function you would use while defining a  
function, and possibly provide function specific behavior in the  
function's where clause.

double_shape :: Shape -> Shape
double_shape (Circle c r) = Circle c (2 * r)
double_shape (Polygon ps) = Polygon $ normalize_shape (double_point .  
cartesian_point) ps where
        double_point Cartesian (x, y) = Cartesian (sqrt(2) * x, sqrt(2) * y)
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Stephen Tetley-2
In reply to this post by Brandon S Allbery KF8NH
On 18 March 2010 05:48, Brandon S. Allbery KF8NH <[hidden email]> wrote:

> They can be mixed only with significant effort.  And you can't really
> prevent it once your users know the magic of existential quantification;
> *but* those lists won't typecheck when passed to your routines expecting a
> Polygon, because you will be expecting a (Point a => Polygon [a]) but they
> will be passing a (Polygon [forall a. Point a => a]) or something similar.


If you can live without constructors, an ADT view is another possibility...

Jeremy Gibbons gives a clear explanation of using an ADT to implement
complex numbers with polar and cartesian representations. Both
representations share the same type, so can be stored together in
lists - figuratively speaking the the existential type has been moved
"further down" inside the type.

See section 2:
http://www.comlab.ox.ac.uk/jeremy.gibbons/publications/adt.pdf

Best wishes

Stephen
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Tillmann Rendel-5
In reply to this post by Darrin Chandler
Darrin Chandler wrote:

> data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> | Spherical (Latitude, Longitude)
>
> type Center = Point
> type Radius = Float
>
> data Shape = Circle Center Radius
> | Polygon [Point]
>
> This obviously stinks since a Polygon could contain mixed Cartesian and
> Spherical points. Polygon needs to be one or the other, but not mixed.

The problem is that you cannot distinguish in the type system whether a
Point was created by the Cartesian or the Spherical constructor. These
constructors have the following types:

   Cartesian :: (Cartesian_coord, Cartesion_coor) -> Point
   Spherical :: (Latitude, Latitude) -> Point

The types of both constructors end with the same type, the type Point.
If you manage to change your code so that the constructors end in
different types, then the difference between them will be visible to the
type system, and you will have the static knowledge you want.

I know of two strategies to make the constructors end in different
types. They could belong to different datatypes, or they could belong to
different instances of a family of datatypes.

You could split the type Point into two types as follows.

   data Cartesian
     = Cartesian Cartesian_coord Cartesian_coord

   data Spherical
     = Spherical Latitude Longitude

Now the constructors have the following types:

   Cartesian :: Cartesian_coord -> Cartesian_coord -> Cartesian
   Spherical :: Latitude -> Longitude -> Spherical

Since the types of the constructors end in different types, we can
distinguish values created with Cartesian from values created with
Spherical in the type system.

(Note that I dropped the use of tuples from your types to make the
example more idiomatic Haskell).

However, how are we going to write the Shape datatype now that we have
two different types of points? We write a family of Shape datatypes
parameterized in the type of points we want to use.

   data Shape point
     = Circle point Radius
     | Polygon [point]

Note that point is a type variable, which can be instantiated with
whatever type we like, including Cartesian and Spherical. For example,
Shape Cartesian is the type of shapes described with cartesian
coordinates, and Shape Spherical is the type of shapes described with
spherical coordinates. But we could also have a type like Shape Integer,
which is the type of shapes described with Integer points, whatever that
means.

This generality of the Shape family of types has advantages and
disadvantages. An advantage is that we can make Shape an instance of the
type class Functor as follows.

   instance Functor Shape where
     fmap f (Circle p r) = Circle (f p) r
     fmap f (Polygon ps) = Polygon (map f ps)

Note that fmap applies a function to every point in the shape, but does
not change the overall shape structure. This could be used, for example,
to convert between a spherical and a cartesian shape, given some
function to convert between spherical and cartesian points.

   convertPoint :: Spherical -> Cartesian
   convertPoint = ...

   convertShape :: Shape Spherical -> Shape Cartesian
   convertShape = fmap convertPoint

Another example could be a shape with all the cartesian points stored in
same database. Lets assume that we have a function to lookup which point
is stored in the database under some key.

   lookupPoint :: Database -> Key -> Cartesian
   lookupPoint = ...

We can use fmap to write the function which takes a Shape with keys as
points, and looks up all the keys in the database, returning a Shape of
cartesian coordinates.

   lookupShape :: Database -> Shape Key -> Shape Cartesian
   lookupShape db = fmap (lookupPoint db)

The Functor typeclass is further extended and specialized by typeclasses
like Traversable, Foldable, Applicative, Alternative, Monad, etc.
Probably some of them are applicable here.

So an advantage of this kind of open family Shape is that we can write
instances for standard typeclasses, and another advantage is that we may
find good uses for types such as (Shape Key).

However, a disadvantage may be that we cannot restrict a priori which
types of points are useable in our program. That also means that we
cannot use pattern matching to discover at runtime which type of point
was used in some shape.

If we make cartesian and spherical points belong to the same family of
types, we relate them somewhat, but keep their distinction visible for
the type system. We want to introduce a family of types with two
members, one for spherical, and one for cartesian coordinates. We
introduce empty data types to index the family.

   data Cartesian
   data Spherical

And we express Point as a GADT using Spherical and Cartesian as follows.

   data Point i
     = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian
     | Spherical :: Latitude -> Longitude -> Point Spherical

Again, the constructors have different types. The types differ in the
type argument to Point. Since the type argument i is never used in the
definition of Point to describe values, it is called a phantom type.

Now, we can use pattern matching to figure out the type of points used
in some statically unknown shape. Lets again assume that we know how to
convert a spherical into an cartesian point.

   convertPoint :: Point Spherical -> Point Cartesian
   convertPoint (Spherical latitude longitude) = ...

Note that we do not have to write a clause for Cartesian points, because
a (Point Spherical) value could never have been created by the Cartesian
constructor.

We can now write a normalizePoint function which converts spherical
points into cartesian points, but which does not change already
cartesian points.

   normalize :: Point i -> Point Cartesian
   normalize p@(Spherical _ _) = convertPoint p
   normalize p@(Cartesian _ _) = p

Note that after matching p with Spherical, we are allowed to call
convertPoint, and after matching p with Cartesian, we are allowed to
return p as a Point Cartesian. This is allowed by the type checker,
because the type checker tracks which statically unknown information we
rediscover at runtime.

We can use the same phantom types Cartesian and Spherical to write the
Shape datatype.

   data Shape i
     = Circle (Point i) Radius
     | Polygon [Point i]

But we cannot make Shape an instance of Functor.

So if you want the type system to dinstiguish between values created by
different constructors of the same datatype, you have to make the
difference between the constructors visible to the type system. You
could either split the datatype into a bunch of unrelated types, or you
could rewrite the datatype into a family of datatypes. The most
important difference between these approaches is whether you want to
have an open or a closed family of types.

   Tillmann
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Ozgur Akgun
I was just reading through the discussion, and Tillmann, your reply is one of the best written descriptions I've ever seen here. (or even in any other mail list!)

Of course, I see many good replies here, but they almost always turn out to be irrelevant to the original question. Yours on the other hand directly answers the original question.

Just wanted to thank.

-- Ozgur

On 18 March 2010 13:58, Tillmann Rendel <[hidden email]> wrote:
Darrin Chandler wrote:
data Point      = Cartesian (Cartesian_coord, Cartesian_coord)
               | Spherical (Latitude, Longitude)

type Center = Point
type Radius = Float

data Shape      = Circle Center Radius
               | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and
Spherical points. Polygon needs to be one or the other, but not mixed.

The problem is that you cannot distinguish in the type system whether a Point was created by the Cartesian or the Spherical constructor. These constructors have the following types:

 Cartesian :: (Cartesian_coord, Cartesion_coor) -> Point
 Spherical :: (Latitude, Latitude) -> Point

The types of both constructors end with the same type, the type Point. If you manage to change your code so that the constructors end in different types, then the difference between them will be visible to the type system, and you will have the static knowledge you want.

I know of two strategies to make the constructors end in different types. They could belong to different datatypes, or they could belong to different instances of a family of datatypes.

You could split the type Point into two types as follows.

 data Cartesian

   = Cartesian Cartesian_coord Cartesian_coord

 data Spherical
   = Spherical Latitude Longitude

Now the constructors have the following types:

 Cartesian :: Cartesian_coord -> Cartesian_coord -> Cartesian
 Spherical :: Latitude -> Longitude -> Spherical

Since the types of the constructors end in different types, we can distinguish values created with Cartesian from values created with Spherical in the type system.

(Note that I dropped the use of tuples from your types to make the example more idiomatic Haskell).

However, how are we going to write the Shape datatype now that we have two different types of points? We write a family of Shape datatypes parameterized in the type of points we want to use.

 data Shape point
   = Circle point Radius
   | Polygon [point]

Note that point is a type variable, which can be instantiated with whatever type we like, including Cartesian and Spherical. For example, Shape Cartesian is the type of shapes described with cartesian coordinates, and Shape Spherical is the type of shapes described with spherical coordinates. But we could also have a type like Shape Integer, which is the type of shapes described with Integer points, whatever that means.

This generality of the Shape family of types has advantages and disadvantages. An advantage is that we can make Shape an instance of the type class Functor as follows.

 instance Functor Shape where
   fmap f (Circle p r) = Circle (f p) r
   fmap f (Polygon ps) = Polygon (map f ps)

Note that fmap applies a function to every point in the shape, but does not change the overall shape structure. This could be used, for example, to convert between a spherical and a cartesian shape, given some function to convert between spherical and cartesian points.

 convertPoint :: Spherical -> Cartesian
 convertPoint = ...

 convertShape :: Shape Spherical -> Shape Cartesian
 convertShape = fmap convertPoint

Another example could be a shape with all the cartesian points stored in same database. Lets assume that we have a function to lookup which point is stored in the database under some key.

 lookupPoint :: Database -> Key -> Cartesian
 lookupPoint = ...

We can use fmap to write the function which takes a Shape with keys as points, and looks up all the keys in the database, returning a Shape of cartesian coordinates.

 lookupShape :: Database -> Shape Key -> Shape Cartesian
 lookupShape db = fmap (lookupPoint db)

The Functor typeclass is further extended and specialized by typeclasses like Traversable, Foldable, Applicative, Alternative, Monad, etc. Probably some of them are applicable here.

So an advantage of this kind of open family Shape is that we can write instances for standard typeclasses, and another advantage is that we may find good uses for types such as (Shape Key).

However, a disadvantage may be that we cannot restrict a priori which types of points are useable in our program. That also means that we cannot use pattern matching to discover at runtime which type of point was used in some shape.

If we make cartesian and spherical points belong to the same family of types, we relate them somewhat, but keep their distinction visible for the type system. We want to introduce a family of types with two members, one for spherical, and one for cartesian coordinates. We introduce empty data types to index the family.

 data Cartesian
 data Spherical

And we express Point as a GADT using Spherical and Cartesian as follows.

 data Point i
   = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian
   | Spherical :: Latitude -> Longitude -> Point Spherical

Again, the constructors have different types. The types differ in the type argument to Point. Since the type argument i is never used in the definition of Point to describe values, it is called a phantom type.

Now, we can use pattern matching to figure out the type of points used in some statically unknown shape. Lets again assume that we know how to convert a spherical into an cartesian point.

 convertPoint :: Point Spherical -> Point Cartesian
 convertPoint (Spherical latitude longitude) = ...

Note that we do not have to write a clause for Cartesian points, because a (Point Spherical) value could never have been created by the Cartesian constructor.

We can now write a normalizePoint function which converts spherical points into cartesian points, but which does not change already cartesian points.

 normalize :: Point i -> Point Cartesian
 normalize p@(Spherical _ _) = convertPoint p
 normalize p@(Cartesian _ _) = p

Note that after matching p with Spherical, we are allowed to call convertPoint, and after matching p with Cartesian, we are allowed to return p as a Point Cartesian. This is allowed by the type checker, because the type checker tracks which statically unknown information we rediscover at runtime.

We can use the same phantom types Cartesian and Spherical to write the Shape datatype.

 data Shape i
   = Circle (Point i) Radius
   | Polygon [Point i]

But we cannot make Shape an instance of Functor.

So if you want the type system to dinstiguish between values created by different constructors of the same datatype, you have to make the difference between the constructors visible to the type system. You could either split the datatype into a bunch of unrelated types, or you could rewrite the datatype into a family of datatypes. The most important difference between these approaches is whether you want to have an open or a closed family of types.

 Tillmann

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe



--
Ozgur Akgun

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Darrin Chandler
In reply to this post by Tillmann Rendel-5
Thanks to those who responded. Solutions that didn't work for my
specific case still taught me more about expressing things in the
Haskell type system.

And... this particular response is extremely well written and useful.
You've made the issues involved very clear and understandable. I really
appreciate it. Thanks!

On Thu, Mar 18, 2010 at 02:58:04PM +0100, Tillmann Rendel wrote:

> Darrin Chandler wrote:
> >data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> > | Spherical (Latitude, Longitude)
> >
> >type Center = Point
> >type Radius = Float
> >
> >data Shape = Circle Center Radius
> > | Polygon [Point]
> >
> >This obviously stinks since a Polygon could contain mixed Cartesian and
> >Spherical points. Polygon needs to be one or the other, but not mixed.
>
> The problem is that you cannot distinguish in the type system
> whether a Point was created by the Cartesian or the Spherical
> constructor. These constructors have the following types:
>
>   Cartesian :: (Cartesian_coord, Cartesion_coor) -> Point
>   Spherical :: (Latitude, Latitude) -> Point
>
> The types of both constructors end with the same type, the type
> Point. If you manage to change your code so that the constructors
> end in different types, then the difference between them will be
> visible to the type system, and you will have the static knowledge
> you want.
>
> I know of two strategies to make the constructors end in different
> types. They could belong to different datatypes, or they could
> belong to different instances of a family of datatypes.
>
> You could split the type Point into two types as follows.
>
>   data Cartesian
>     = Cartesian Cartesian_coord Cartesian_coord
>
>   data Spherical
>     = Spherical Latitude Longitude
>
> Now the constructors have the following types:
>
>   Cartesian :: Cartesian_coord -> Cartesian_coord -> Cartesian
>   Spherical :: Latitude -> Longitude -> Spherical
>
> Since the types of the constructors end in different types, we can
> distinguish values created with Cartesian from values created with
> Spherical in the type system.
>
> (Note that I dropped the use of tuples from your types to make the
> example more idiomatic Haskell).
>
> However, how are we going to write the Shape datatype now that we
> have two different types of points? We write a family of Shape
> datatypes parameterized in the type of points we want to use.
>
>   data Shape point
>     = Circle point Radius
>     | Polygon [point]
>
> Note that point is a type variable, which can be instantiated with
> whatever type we like, including Cartesian and Spherical. For
> example, Shape Cartesian is the type of shapes described with
> cartesian coordinates, and Shape Spherical is the type of shapes
> described with spherical coordinates. But we could also have a type
> like Shape Integer, which is the type of shapes described with
> Integer points, whatever that means.
>
> This generality of the Shape family of types has advantages and
> disadvantages. An advantage is that we can make Shape an instance of
> the type class Functor as follows.
>
>   instance Functor Shape where
>     fmap f (Circle p r) = Circle (f p) r
>     fmap f (Polygon ps) = Polygon (map f ps)
>
> Note that fmap applies a function to every point in the shape, but
> does not change the overall shape structure. This could be used, for
> example, to convert between a spherical and a cartesian shape, given
> some function to convert between spherical and cartesian points.
>
>   convertPoint :: Spherical -> Cartesian
>   convertPoint = ...
>
>   convertShape :: Shape Spherical -> Shape Cartesian
>   convertShape = fmap convertPoint
>
> Another example could be a shape with all the cartesian points
> stored in same database. Lets assume that we have a function to
> lookup which point is stored in the database under some key.
>
>   lookupPoint :: Database -> Key -> Cartesian
>   lookupPoint = ...
>
> We can use fmap to write the function which takes a Shape with keys
> as points, and looks up all the keys in the database, returning a
> Shape of cartesian coordinates.
>
>   lookupShape :: Database -> Shape Key -> Shape Cartesian
>   lookupShape db = fmap (lookupPoint db)
>
> The Functor typeclass is further extended and specialized by
> typeclasses like Traversable, Foldable, Applicative, Alternative,
> Monad, etc. Probably some of them are applicable here.
>
> So an advantage of this kind of open family Shape is that we can
> write instances for standard typeclasses, and another advantage is
> that we may find good uses for types such as (Shape Key).
>
> However, a disadvantage may be that we cannot restrict a priori
> which types of points are useable in our program. That also means
> that we cannot use pattern matching to discover at runtime which
> type of point was used in some shape.
>
> If we make cartesian and spherical points belong to the same family
> of types, we relate them somewhat, but keep their distinction
> visible for the type system. We want to introduce a family of types
> with two members, one for spherical, and one for cartesian
> coordinates. We introduce empty data types to index the family.
>
>   data Cartesian
>   data Spherical
>
> And we express Point as a GADT using Spherical and Cartesian as follows.
>
>   data Point i
>     = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian
>     | Spherical :: Latitude -> Longitude -> Point Spherical
>
> Again, the constructors have different types. The types differ in
> the type argument to Point. Since the type argument i is never used
> in the definition of Point to describe values, it is called a
> phantom type.
>
> Now, we can use pattern matching to figure out the type of points
> used in some statically unknown shape. Lets again assume that we
> know how to convert a spherical into an cartesian point.
>
>   convertPoint :: Point Spherical -> Point Cartesian
>   convertPoint (Spherical latitude longitude) = ...
>
> Note that we do not have to write a clause for Cartesian points,
> because a (Point Spherical) value could never have been created by
> the Cartesian constructor.
>
> We can now write a normalizePoint function which converts spherical
> points into cartesian points, but which does not change already
> cartesian points.
>
>   normalize :: Point i -> Point Cartesian
>   normalize p@(Spherical _ _) = convertPoint p
>   normalize p@(Cartesian _ _) = p
>
> Note that after matching p with Spherical, we are allowed to call
> convertPoint, and after matching p with Cartesian, we are allowed to
> return p as a Point Cartesian. This is allowed by the type checker,
> because the type checker tracks which statically unknown information
> we rediscover at runtime.
>
> We can use the same phantom types Cartesian and Spherical to write
> the Shape datatype.
>
>   data Shape i
>     = Circle (Point i) Radius
>     | Polygon [Point i]
>
> But we cannot make Shape an instance of Functor.
>
> So if you want the type system to dinstiguish between values created
> by different constructors of the same datatype, you have to make the
> difference between the constructors visible to the type system. You
> could either split the datatype into a bunch of unrelated types, or
> you could rewrite the datatype into a family of datatypes. The most
> important difference between these approaches is whether you want to
> have an open or a closed family of types.
>
>   Tillmann
>
--
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
[hidden email]   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

attachment0 (202 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Job Vranish-2
In reply to this post by Darrin Chandler
A phantom type might do what you want:

-- notice the type parameter on point that isn't used in the type
data Point  a  = Cartesian (Cartesian_coord, Cartesian_coord)
                         | Spherical (Latitude, Longitude)

-- make some dummy types
data SphericalP
data CartesianP

--make some constructors that add a restriction to the phantom type
-- notice the CartesianP type restriction, this isn't needed but allows us to restrict our type later if we want
mkCartesian :: (Cartesian_coord, Cartesian_coord) -> Point  CartesianP
mkCartesian = Cartesian

mkSherical :: (Latitude, Longitude) -> Point  SphericalP
mkSherical = Spherical


type Center = Point
type Radius = Float

-- now the shape type doesn't care which type of point you have, but requires that all the points are the same
data Shape  a = Circle Center Radius
                           | Polygon [Point a]


The main problem here, is that you want to hide the Cartesian and Spherical constructors and only use mkCartesian and mkSherical to make Points (so that they have the proper restrictions). But this prevents you from using pattern matching where you have the constructors hidden.

GADTs however will solve that:

data Point a where
  Cartesian :: (Cartesian_coord, Cartesian_coord) -> Point CartesianP
  Spherical :: (Latitude, Longitude)-> Point SphericalP

Hope that helps :)

- Job

On Thu, Mar 18, 2010 at 12:20 AM, Darrin Chandler <[hidden email]> wrote:
Hi,

Trying to get up to speed in Haskell, I'm playing with doing some
abstraction in data types. Specifically, I have this:

type Cartesian_coord = Float

type Latitude  = Float
type Longitude = Float

data Point      = Cartesian (Cartesian_coord, Cartesian_coord)
               | Spherical (Latitude, Longitude)

type Center = Point
type Radius = Float

data Shape      = Circle Center Radius
               | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and
Spherical points. Polygon needs to be one or the other, but not mixed.

I could define seperate types for Cartesian and Spherical and seperate
CartesianPoly and SphericalPoly, but that doesn't seem very elegant and
also increases as I add more coordinate systems and shapes. I read a
little on GADTs, et al, but I'm not sure if that's what I want for this
or not.

Any help appreciated!

--
Darrin Chandler            |  Phoenix BSD User Group  |  MetaBUG
[hidden email]   |  http://phxbug.org/      |  http://metabug.org/
http://www.stilyagin.com/  |  Daemons in the Desert   |  Global BUG Federation

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

John Meacham
In reply to this post by Darrin Chandler
On Wed, Mar 17, 2010 at 09:20:49PM -0700, Darrin Chandler wrote:
> data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> | Spherical (Latitude, Longitude)

Just a quick unrelated note, though you are probably aware of this,
doing

> data Foo = Foo (X,Y)
means something subtly different than  
> data Foo = Foo X Y
and can be less efficient.

A quick way to see they are different is to count the bottoms,

in the first case (where _ is bottom and X is some value)
you have the cases
> Foo _
> Foo (_,_)
> Foo (X,_)
> Foo (_,X)
> Foo (X,X)
and in the other case you have
> Foo _ _
> Foo X _
> Foo _ X
> Foo X X

so one has 5 distinct values, and the other has 4, hence they are not
isomorphic. All things being equal, this means the second case will be
more efficient as there is one less case it needs to distinguish (every
potential bottom implys an 'eval' somewhere). Depending on your code,
all things may not be equal and there are rare times when the tupled
version is more efficient however.

        John


--
John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Luke Palmer-2
On Thu, Mar 18, 2010 at 12:17 PM, John Meacham <[hidden email]> wrote:

> On Wed, Mar 17, 2010 at 09:20:49PM -0700, Darrin Chandler wrote:
>> data Point    = Cartesian (Cartesian_coord, Cartesian_coord)
>>               | Spherical (Latitude, Longitude)
>
> Just a quick unrelated note, though you are probably aware of this,
> doing
>
>> data Foo = Foo (X,Y)
> means something subtly different than
>> data Foo = Foo X Y
> and can be less efficient.

On the other hand, the latter is equivalent to:

newtype Foo = Foo (X,Y)

> A quick way to see they are different is to count the bottoms,
>
> in the first case (where _ is bottom and X is some value)
> you have the cases
>> Foo _
>> Foo (_,_)
>> Foo (X,_)
>> Foo (_,X)
>> Foo (X,X)
> and in the other case you have
>> Foo _ _
>> Foo X _
>> Foo _ X
>> Foo X X
>
> so one has 5 distinct values, and the other has 4, hence they are not
> isomorphic. All things being equal, this means the second case will be
> more efficient as there is one less case it needs to distinguish (every
> potential bottom implys an 'eval' somewhere). Depending on your code,
> all things may not be equal and there are rare times when the tupled
> version is more efficient however.
>
>        John
>
>
> --
> John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
> _______________________________________________
> Haskell-Cafe mailing list
> [hidden email]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

John Meacham
In reply to this post by Darrin Chandler
On Wed, Mar 17, 2010 at 09:20:49PM -0700, Darrin Chandler wrote:

> Trying to get up to speed in Haskell, I'm playing with doing some
> abstraction in data types. Specifically, I have this:
>
> type Cartesian_coord = Float
>
> type Latitude  = Float
> type Longitude = Float
>
> data Point = Cartesian (Cartesian_coord, Cartesian_coord)
> | Spherical (Latitude, Longitude)
>
> type Center = Point
> type Radius = Float
>
> data Shape = Circle Center Radius
> | Polygon [Point]

phantom types can help you, providing the ability to distinguish the two
without the run-time overhead of checking the Cartesioan and Spherical
constructors

> data Cartesian -- empty, just used for the type constructor
> data Spherical
> data Point a = Point Float Float
> data Shape a = Circle (Point a) Radius
>              | Polygon [Point a]

now you can have routines like
> spPoint :: Latitude -> Longitude -> Point Spherical
> cPoint :: Cartesian_coord -> Cartesian_coord -> Point Cartesian
to create points of each, yet you can still have functions on 'Point a'
that will work on any type of point.

You may want to create a class that converts between the two

> class Coordinated f where
>    toCartesian :: f Spherical -> f Cartesian
>    toSpherical :: f Cartesian -> f Spherical
>
> instance Coordinated Point where ...
> instance Coordinated Shape where ...

        John

--
John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Reply | Threaded
Open this post in threaded view
|

Re: Abstraction in data types

Brandon S Allbery KF8NH
In reply to this post by Tillmann Rendel-5
Only one small nit here:

On Mar 18, 2010, at 09:58 , Tillmann Rendel wrote:
> And we express Point as a GADT using Spherical and Cartesian as  
> follows.
>
>  data Point i
>    = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point  
> Cartesian
>    | Spherical :: Latitude -> Longitude -> Point Spherical

The actual syntax for this is

 > data Point i where
 >   Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian
 >   Spherical :: Latitude -> Longitude -> Point Spherical

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [hidden email]
system administrator [openafs,heimdal,too many hats] [hidden email]
electrical and computer engineering, carnegie mellon university    KF8NH



_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

PGP.sig (202 bytes) Download Attachment