Consensus about databases / serialization


bf3

As I’m a self-made man, I never really studied relational databases in detail. My intuition told me that the “relational” part was not really suitable for the 3D data, 2D images, animation curves, state machines, and other data I encountered in the videogame and animation business. I could always get away with files, and for the applications I needed to deploy, plugging in a couple of extra gigabytes of RAM and serializing the “object” state to disk was more practical, cheaper and faster.

However, a couple of years ago I started studying computer science (I seem to do the theory after the practice, weird behavior ;-) at the Open University, and one of the exams I took was about databases. Initially this course convinced me that databases are actually very nice, but the course ended with a topic on object-oriented databases, which were designed to make storing data like “3D models, graphs, networks, and complex data structures” more practical. Duh.

Since then, I have deployed a few commercial applications for customers using databases, which worked fine for the typical “simple/flat” database data. I hated embedding a dynamic untyped language like SQL, as much as I hated embedding code in HTML or XML… IMHO it feels UGLY and unsafe. Regarding the other popular data storage format – XML – I did use that a lot, but it seems like going back to the stone ages, when hierarchical stores/databases got invented (and ditched?).

Now, first after an introduction to Microsoft’s LINQ, and more recently after reading a very brief overview of HAppS, it seems I’m not the only one with those “feelings”.

Ouch, this introduction got way too long, sorry about that ;-)

Finally some practical questions:

· Regarding Haskell and databases, the page http://haskell.org/haskellwiki/Libraries_and_tools/Database_interfaces describes a few, but which are the ones that are stable and practical? Any user experiences?

· HAppS is not listed on the page above; is that because it does not use databases? Is HAppS reliable or experimental, and does it scale well? Any success stories?

· Regarding Haskell and serialization, I don’t think that implementing Read/Show is a good way to do real serialization, so what other options exist? I could find some libraries at http://hackage.haskell.org/packages/archive/pkg-list.html#cat:Data, but again, which are the most practical and stable? When programming in C++/MFC and C#/.NET, I tended to develop my own serialization frameworks because I used them for many things, like logging commands to disk, performing undo/redo, intra- and inter-process cut/copy/paste, save/load, etc…

· Regarding serialization, I’m kinda curious how ADTs and even GADTs are stored in and retrieved from a relational database. I guess it could be done using BLOBs and serialization to ByteStrings, so bypassing a lot of the database table structures?

· If I wanted to experiment with, say, HAppS, the way I understand it, I would first have to study “Scrap your boilerplate” and Template Haskell, and maybe some other language features? I’m still new to Haskell, and the road to understanding all language elements and extensions is very long, so learning it all sequentially would be insane, I guess. I have no practical experience with TH, but I spent a long time trying to do “aspect-oriented programming” in C# without success, so TH looks uber to me…

Thanks a lot and best wishes for 2008!

Peter


_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Consensus about databases / serialization

Salvatore Insalaco
> ·        regarding Haskell and databases, the page
> http://haskell.org/haskellwiki/Libraries_and_tools/Database_interfaces
> describes a few, but which are the ones that are stable and practical? Any
> user experiences?

During my experiments I found Takusen
(http://darcs.haskell.org/takusen/) and HDBC
(http://software.complete.org/hdbc) very useful, although I liked
Takusen's interface more.

> ·        regarding Haskell and serialization, I don't think that
> implementing Read/Show is a good way for real serialization, so what other
> options exist?

I would suggest Data.Binary (http://code.haskell.org/binary/), which
performs very well and is well supported.

There are ways to generate instances of Binary automatically. I like
the "Derive" approach most
(http://www.cs.york.ac.uk/fp/darcs/derive/derive.htm), as it uses
Template Haskell and does not require separate pre-processing.
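
For readers who have not seen Data.Binary, here is a minimal sketch of a hand-written instance, which is roughly the kind of code a deriving tool would otherwise generate. The `Key` type and the tag values 0 and 1 are assumptions made up for this illustration; only `Binary`, `put`, `get`, `putWord8`, `getWord8`, `encode` and `decode` come from the binary package.

```haskell
import Data.Binary (Binary (..), decode, encode, getWord8, putWord8)
import qualified Data.ByteString.Lazy as BL

-- Hypothetical animation-key type, invented for this example.
data Key
  = Step Double Double          -- time, value
  | Linear Double Double Double -- time, value, slope
  deriving (Eq, Show)

-- Each constructor is written as a one-byte tag followed by its fields.
instance Binary Key where
  put (Step t v)     = putWord8 0 >> put t >> put v
  put (Linear t v s) = putWord8 1 >> put t >> put v >> put s
  get = do
    tag <- getWord8
    case tag of
      0 -> Step <$> get <*> get
      1 -> Linear <$> get <*> get <*> get
      _ -> fail ("unknown Key tag: " ++ show tag)

-- Serialize to a lazy ByteString and read it back.
roundTrip :: [Key] -> [Key]
roundTrip = decode . encode

main :: IO ()
main = do
  let keys  = [Step 0 1.5, Linear 1 2 0.25]
      bytes = encode keys :: BL.ByteString
  BL.writeFile "keys.bin" bytes     -- the same bytes could go into a BLOB column
  print (roundTrip keys == keys)    -- prints True
```

The resulting ByteString is opaque to a database, so storing it as a BLOB trades queryability for convenience.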

Re: Consensus about databases / serialization

Cristian Baboi
In reply to this post by bf3
I recommend you read "Extending the database relational model to capture  
more meaning" by E.F. Codd.


On Wed, 02 Jan 2008 13:50:46 +0200, Peter Verswyvelen <[hidden email]>  
wrote:

> As I'm a self-made man, I never really studied relational databases in
> detail. My intuition told me that the "relational" part was not really
> suitable for the 3D data, 2D images, animation curves, state machines,
> and other data I encountered in the videogame and animation business.
> I could always get away with files, and for the applications I needed
> to deploy, plugging in a couple of extra gigabytes of RAM and
> serializing the "object" state to disk was more practical, cheaper and
> faster.

Re: Consensus about databases / serialization

Jeff Polakow
In reply to this post by bf3

Hello,

I use HDBC for ODBC database access, and HAppS as a web server. I am fairly happy with both. Here are some further thoughts...

> Finally some practical questions:
> ·        regarding Haskell and databases, the page
> http://haskell.org/haskellwiki/Libraries_and_tools/Database_interfaces
> describes a few, but which are the ones that are stable and practical?
> Any user experiences?

HDBC is fairly stable (although its ODBC driver crashes GHC 6.8 on Windows). I think HSQL is similarly stable. Takusen offers a slightly higher-level interface and some performance guarantees; it is a nice system but lacks support for ODBC (supposedly this is in the works). HaskellDB is probably the ideal database access system for Haskell; however, the distribution was in bad shape (no documentation, hard to compile, etc.) when I last looked, maybe 6 months ago.

> ·        HAppS is not listed in the page above, because it does not
> use databases? Is HAppS reliable or experimental, and does it scale
> well? Any success stories?

HAppS is a general server framework for Haskell. HAppS is very appealing because it allows you to dynamically create pages directly with Haskell. HAppS encourages storing your server state in memory, but it is easy to read in state on the fly from external sources. The only caveat with HAppS is that the system has been in active development for the past few months and is just starting (I hope) to settle down; thus useful documentation/examples are hard to find, but the HAppS developers are pretty good at replying to help requests on the HAppS IRC channel and the HAppS mailing list. I am currently using an old (and stable) version of HAppS but expect to upgrade to the latest version soon.

> ·        If I would want to experiment with say HAppS, the way I
> understand it, I would first have to study “Scrap your
> boilerplate” and Template Haskell, and maybe some other language
> features? I’m still new to Haskell, and the road to understanding
> all language elements and extensions is very long, so sequentially
> learning it would be insane I guess. I have no practical experience
> with TH, but I spent a long time trying to do “aspect oriented
> programming” in C# without success, so TH looks uber to me…

While HAppS does use SYB and TH, you don't need to understand them to effectively use HAppS; of course you'll need to understand them, at least basic TH, to understand the details of what HAppS is doing.

hope that helps,
  Jeff


Re: Consensus about databases / serialization

Steve Lihn
I have started documenting databases in the Haskell Wikibook, in
particular HDBC. It is still very rough at this time, but something is
better than nothing :-) If you want to add more content, you are
certainly welcome!

http://en.wikibooks.org/wiki/Haskell/Database


Re: Consensus about databases / serialization

Justin Bailey
In reply to this post by bf3
I can speak to haskelldb a little, see below:

On Jan 2, 2008 3:50 AM, Peter Verswyvelen <[hidden email]> wrote:
> ·        regarding Haskell and databases, the page
> http://haskell.org/haskellwiki/Libraries_and_tools/Database_interfaces
> describes a few, but which are the ones that are stable and practical? Any
> user experiences?

I started looking at Haskell database libraries to generate SQL for
me. HaskellDB does this well - it uses a higher-level representation
of queries based on "relational algebra" (also the basis of SQL) which
is pretty easy to understand if you know SQL.  It takes care of a lot
of the details of generating SQL strings, and does it in a mostly
type-safe way.
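
To illustrate the general idea of building a query as a relational-algebra value and rendering it to SQL, here is a toy sketch. It is not the HaskellDB API: the `Expr`, `Rel` and `Query` types and the `toSql` function are all invented for this example, and table and column names are assumed to be trusted identifiers (no quoting or escaping is done).

```haskell
import Data.List (intercalate)

infix  4 :==:, :>:
infixr 3 :&&:

-- Hypothetical expression language for restrictions.
data Expr
  = Col String
  | IntLit Int
  | Expr :==: Expr
  | Expr :>: Expr
  | Expr :&&: Expr

-- Hypothetical mini relational algebra: tables, selection, projection.
data Rel
  = Table String
  | Restrict Expr Rel     -- becomes a WHERE condition
  | Project [String] Rel  -- becomes the SELECT column list

-- Flat form of a single-table query.
data Query = Query
  { qCols  :: Maybe [String]  -- Nothing means "all columns"
  , qTable :: String
  , qConds :: [Expr]
  }

-- Collapse nested operators into one flat query.
flatten :: Rel -> Query
flatten (Table t)      = Query Nothing t []
flatten (Restrict e r) = let q = flatten r in q { qConds = qConds q ++ [e] }
flatten (Project cs r) = (flatten r) { qCols = Just cs }

renderExpr :: Expr -> String
renderExpr (Col c)    = c
renderExpr (IntLit n) = show n
renderExpr (a :==: b) = renderExpr a ++ " = " ++ renderExpr b
renderExpr (a :>: b)  = renderExpr a ++ " > " ++ renderExpr b
renderExpr (a :&&: b) = "(" ++ renderExpr a ++ ") AND (" ++ renderExpr b ++ ")"

toSql :: Rel -> String
toSql rel = "SELECT " ++ cols ++ " FROM " ++ qTable q ++ whereClause
  where
    q    = flatten rel
    cols = maybe "*" (intercalate ", ") (qCols q)
    whereClause
      | null (qConds q) = ""
      | otherwise       = " WHERE " ++ intercalate " AND " (map renderExpr (qConds q))

example :: Rel
example =
  Project ["name", "age"]
    (Restrict (Col "age" :>: IntLit 30 :&&: Col "dept" :==: IntLit 7)
      (Table "employees"))

main :: IO ()
main = putStrLn (toSql example)
-- prints: SELECT name, age FROM employees WHERE (age > 30) AND (dept = 7)
```

Because the query is an ordinary value, the renderer is the single place that decides what SQL text comes out, which is also where per-database tweaks would have to live.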

It is a bit complicated to install the library and all its
dependencies, because it can work with 3+ (MySQL, PostgreSQL, ODBC)
databases using two different backends (HDBC and HSQL). I chose to go
with HDBC because it compiled on Windows, and PostgreSQL because that's
what we have at my workplace. Once I got it built and installed, it
has worked well for me.

Until the most recent versions though, it added a "distinct" operator
to all select statements. I submitted a patch which was accepted and
now that behavior is no longer the default. It is semi-actively
maintained by the original authors and Bjorn, at least, has been very
responsive to my queries on the haskelldb-users mailing list. He also
has made minor updates to keep it compiling with the latest GHC and
Cabal.

Hope that helps!

Justin
Re: Consensus about databases / serialization

bf3
Looks good! I liked relational algebra much much more than SQL, so I'll certainly have to look into that.

Thanks,
Peter


Re: Consensus about databases / serialization

Yitzchak Gale
Peter Verswyvelen wrote:
>  Looks good! I liked relational algebra much much more than SQL, so I'll
> certainly have to look into that.

I agree. I have not tried haskelldb yet, but I would
like to.

My impression from some previous posts is that
because of the high-level approach, it is difficult
to control the precise SQL that is generated. In practice,
you almost always have to do some tweaking that is
at least DB-dependent, and often application-dependent.

Is there any way to do that in haskelldb? If not,
is there an obvious way to add it?

Thanks,
Yitz

RE: Consensus about databases / serialization

bf3
Yitz wrote:
> My impression from some previous posts is that
> because of the high-level approach, it is difficult
> to control the precise SQL that is generated. In practice,
> you almost always have to do some tweaking that is
> at least DB-dependent, and often application dependent.

Can't the same be said regarding SQL itself? It sometimes needs tweaking.
That's the problem with any high-level abstraction, no? Just like in Haskell
you sometimes have to use strictness tweaks. Of course, having an extra layer
on top of SQL will make the tweaking more difficult :)

Peter



RE: Consensus about databases / serialization

Lihn, Steve
 
For small queries, it does not matter much which approach you choose.
But for large, complex queries, such as a 3-table join (especially a Star
Transformation), and/or large data sets (millions of rows involved in
large data warehouses), the performance will differ by an order of
magnitude, depending on how things are optimized.

Steve

Re: Consensus about databases / serialization

Yitzchak Gale
In reply to this post by bf3
I wrote:
>>... to control the precise SQL that is generated. In practice,
>> you almost always have to do some tweaking that is
>> at least DB-dependent, and often application dependent.

Peter Verswyvelen wrote:
> Can't the same be said regarding SQL itself? It sometimes needs tweaking.
> That's the problem with any high-level abstraction, no?

Certainly. In an ideal world, you could just write your queries
in straightforward SQL and the DB would figure out what to
do. But in real life, that is not how it works.

So that complexity then gets passed up to the Haskell
interface layers. Again, in an ideal world you would like to
imagine that a high-level interface like haskelldb would
be smart enough to compile any relational algebraic
expression into SQL that will do the Right Thing for the
given backend.

But that would be very difficult. For example - there may
be things you need to tweak that are both
application-dependent and DB-dependent.

So to be usable in a serious DB project, there
would have to be some kind of hooks that would allow
you to tweak the SQL. After doing that - what have we
gained by taking the high-level approach to begin with?
I'm not sure.

I would like to hear about people's thoughts and experiences
on this.

-Yitz
Re: Consensus about databases / serialization

Yitzchak Gale
In reply to this post by Lihn, Steve
Lihn, Steve wrote:
> For small queries, it does not matter much which approach you choose.
> But for large, complex queries, such as a 3-table join (especially a Star
> Transformation), and/or large data sets (millions of rows involved in
> large data warehouses), the performance will differ by an order of
> magnitude, depending on how things are optimized.

Ah, yes. And that brings up another issue - how do the various
backends scale for:

- large SQL passed in
- results with many records
- records with many fields
- records/fields with many bytes
- several cursors

What laziness options are available?

-Yitz
RE: Consensus about databases / serialization

bf3
In reply to this post by Yitzchak Gale
I see. But ouch, exactly the same could be said for Haskell no? :)

Naaah...

Re: Consensus about databases / serialization

Yitzchak Gale
Peter Verswyvelen wrote:
> I see. But ouch, exactly the same could be said for Haskell no? :)
> Naaah...

Actually, that is one of the things that is so impressive
about Haskell. It starts at such a high level, with such
beautiful and powerful abstractions. But if needed, you
can optimize down through many layers. All the way
down to what they do on the Shootout, where they
compete against C.

It took a huge amount of effort over many years to
achieve that. DB support still has a long way to go,
but it is great to see that people are working on it
at varying levels of abstraction.

-Yitz
Re: Consensus about databases / serialization

Brandon S Allbery KF8NH
In reply to this post by bf3

On Jan 3, 2008, at 16:32 , Peter Verswyvelen wrote:

> I see. But ouch, exactly the same could be said for Haskell no? :)

Optimization by quasirandom insertion of bangs / seq?  Already there :)
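
For anyone unfamiliar with what "bangs / seq" refers to, here is a small sketch of the usual strictness tweaks on a summing loop. The function names are made up for this example; `foldl'`, `seq` and the BangPatterns extension are standard. Whether the lazy version actually misbehaves depends on the compiler and optimisation level, so treat the comments as the typical unoptimised behaviour rather than a guarantee.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Lazy left fold: without optimisation the accumulator can grow into a
-- long chain of unevaluated (+) thunks before anything is added.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Bang pattern: the accumulator is forced on every step.
sumBang :: [Int] -> Int
sumBang = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- The same idea spelled out with seq.
sumSeq :: [Int] -> Int
sumSeq = go 0
  where
    go acc []       = acc
    go acc (x : xs) = let acc' = acc + x in acc' `seq` go acc' xs

-- The library version of the strict fold.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumBang [1 .. 1000000], sumSeq [1 .. 1000000], sumStrict [1 .. 1000000])
-- prints (500000500000,500000500000,500000500000)
```

All four functions compute the same result; the difference is only in when the additions happen and how much memory is held in the meantime.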

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [hidden email]
system administrator [openafs,heimdal,too many hats] [hidden email]
electrical and computer engineering, carnegie mellon university    KF8NH

