[ANN] Laborantin: experimentation framework

lucas di cioccio
Dear all,

I am happy to announce Laborantin. Laborantin is a Haskell library and DSL for
running and analyzing controlled experiments.

Repository: https://github.com/lucasdicioccio/laborantin-hs
Hackage page: http://hackage.haskell.org/package/laborantin-hs

Laborantin's premise is that running proper experiments is a non-trivial and
often overlooked problem. Therefore, we should provide good tools to assist
experimenters. The hope is that, with Laborantin, experimenters will spend more
time on their core problem and race through menial tasks such as editing
scripts because one data point is missing from a plot. At the same time,
Laborantin is also an effort within the broad open-science movement. Indeed,
Laborantin's DSL separates boilerplate from the actual experiment
implementation. Thus, Laborantin could reduce the friction of code and data
reuse.

One family of experiments that fits Laborantin well is benchmarks with tedious
setup and teardown procedures (for instance, starting, configuring, and stopping
remote machines). Analyses that require measurements from a variety of data
points in a multi-dimensional parameter space also fall within the scope of
Laborantin.

When using Laborantin, the experimenter:

* Can express experimental scenarios using a readable and familiar DSL.
  This feature, albeit subjective, was confirmed by non-Haskeller colleagues.
* Saves time on boilerplate such as writing command-line parsers or
  encoding dependencies between experiments and analysis results in a Makefile.
* Benefits from auto-documentation and result introspection features when
  coming back to a project, possibly weeks or months later.
* Harnesses the power of Haskell's type system to catch common errors at
  compile time.

If you had to read one story to understand the pain points that Laborantin
tries to address, it should be Section 5 of "Strategies for Sound Internet
Measurement" (V. Paxson, IMC 2004).

I'd be glad to take questions and comments (or, even better, code reviews and
pull requests).

Kind regards,
--Lucas DiCioccio (@lucasdicioccio on GitHub/Twitter)

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [ANN] Laborantin: experimentation framework

Tom Nielsen
Hi Lucas,

In connection with your work on Laborantin, you may be interested in our papers:

Braincurry: A domain-specific language for integrative neuroscience

A formal mathematical framework for physiological observations, experiments and analyses.

I found it difficult to excite experimental biologists about the benefit of adopting experiment description languages. I am now concentrating on a functional language for statistical data analysis - see https://bayeshive.com

Tom


On 23 December 2013 09:27, lucas di cioccio <[hidden email]> wrote:
[...]

Re: [ANN] Laborantin: experimentation framework

Corey O'Connor
In reply to this post by lucas di cioccio
This looks really cool!

Cheers,
Corey




Re: [ANN] Laborantin: experimentation framework

Adam Vogt
In reply to this post by lucas di cioccio
Hello Lucas,

Am I correct to say that laborantin only does full factorial
experiments? Perhaps there is a straightforward way for users to
specify which model parameters should be confounded in a fractional
factorial design. Another extension would be to move towards
sequential designs, where the trials to run depend on the results so
far. Then more time is spent on the "interesting" regions of the
parameter space.

I think getVar/param could be re-worked to give errors at compile
time. Now you get a runtime error if you typo a parameter or get the
type wrong. Another mistake is to include parameters in the experiment
that do not have any effect on the `run` action, unless those
parameters are there for doing replicates.

Those might be addressed by doing something like:

    a <- parameter "destination" $ do ...
    run $ print =<< param a

Where the types are something like:

  param :: Data.Tagged.Tagged a Text -> M a
  values :: [T a] -> M (Tagged a Text)
  str :: Text -> T Text
  num :: Double -> T Double

with M being whatever state monad you currently use, and param does
the same thing it always has, except now it knows which type you put
in the values list, and it cannot be called with any string. The third
requirement might be met by requiring -fwarn-unused-matches.
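
A minimal sketch of how such typed handles could look, using only `base`. It is an illustration under assumptions, not Laborantin's API: `Value`, `Handle`, `strings`, `numbers`, and this pure `param` are all hypothetical names, the handle carries its own decoder instead of using `Data.Tagged.Tagged`, and a scenario is modelled as an association list rather than the library's monad.

```haskell
-- Untyped values, as they might be stored for one scenario (hypothetical).
data Value = VStr String | VNum Double
  deriving (Eq, Show)

-- One point of the parameter space.
type Scenario = [(String, Value)]

-- A handle stores the parameter name and a decoder for the type 'a'
-- that the parameter was declared with.
data Handle a = Handle String (Value -> Maybe a)

-- Declare a string-valued parameter: returns a typed handle and the
-- untyped (name, candidate values) pair a framework could store.
strings :: String -> [String] -> (Handle String, (String, [Value]))
strings name vs = (Handle name fromStr, (name, map VStr vs))
  where
    fromStr (VStr s) = Just s
    fromStr _        = Nothing

-- Declare a numeric parameter in the same way.
numbers :: String -> [Double] -> (Handle Double, (String, [Value]))
numbers name vs = (Handle name fromNum, (name, map VNum vs))
  where
    fromNum (VNum n) = Just n
    fromNum _        = Nothing

-- Lookup takes a handle, not a string, and returns the declared type.
param :: Handle a -> Scenario -> Maybe a
param (Handle name decode) scenario = lookup name scenario >>= decode
```

In this sketch, a handle obtained from `strings` can only yield a `String`, and a misspelled handle variable is an unbound-variable error at compile time, provided the `Handle` constructor is kept private to the declaring module.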

An alternative strategy is to change your type Step, into an algebraic
data type with a function to convert it into what it is currently.
Before the experiment happens, you can have a function go through that
data to make sure it will succeed with its getVar/param. This is
called a deep embedding:
<http://www.haskell.org/haskellwiki/Embedded_domain_specific_language>.
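
The deep-embedding idea can be sketched as follows. This is a toy model, not Laborantin's actual `Step` type: the constructors `UseParam`, `Log`, and `Seq`, and the helpers `usedParams`, `undeclared`, `unused`, and `interpret` are hypothetical names chosen for illustration.

```haskell
import Data.List (nub, (\\))
import Data.Maybe (fromMaybe)

-- A step described as data instead of as an opaque action (hypothetical).
data Step
  = UseParam String   -- read and print the named parameter
  | Log String        -- emit a message
  | Seq [Step]        -- run several steps in order

-- Every parameter name the description refers to.
usedParams :: Step -> [String]
usedParams (UseParam name) = [name]
usedParams (Log _)         = []
usedParams (Seq steps)     = concatMap usedParams steps

-- Names the step reads that were never declared (typos, for instance).
undeclared :: [String] -> Step -> [String]
undeclared declared step = nub (usedParams step) \\ declared

-- Declared names the step never reads.
unused :: [String] -> Step -> [String]
unused declared step = declared \\ usedParams step

-- The conversion back to an executable action.
interpret :: [(String, String)] -> Step -> IO ()
interpret env (UseParam name) =
  putStrLn (name ++ " = " ++ fromMaybe "<missing>" (lookup name env))
interpret _   (Log msg)   = putStrLn msg
interpret env (Seq steps) = mapM_ (interpret env) steps
```

Because `Step` is plain data, `undeclared` and `unused` can be run before any experiment starts, and `interpret` is only called once both checks return an empty list.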

Regards,
Adam


Re: [ANN] Laborantin: experimentation framework

lucas di cioccio
In reply to this post by Tom Nielsen
Hi Tom,

Thanks for the pointers.

It is interesting to see that Braincurry and Laborantin have similar designs although we come from very different application domains. You've picked paths that I was unsure about exploring (e.g., making experiment parameters a parameterizable datatype rather than a value in a pre-defined datatype).
I hadn't thought about enabling algebraic composition of "experiments". It looks like I can incorporate this idea into Laborantin too, as a way to "combine" setup/run/teardown hooks. I'll definitely have a second look at Braincurry, but first I'll have to read the second paper.

One thing I really would like to support is a way to "inject experiments" into another system and run experiments "live". For instance, A/B testing web pages in a Warp application.

BayesHive looks very nice! Congrats.

Enjoy a nice year 2014 and best wishes,
--Lucas


2013/12/30 Tom Nielsen <[hidden email]>
[...]

Re: [ANN] Laborantin: experimentation framework

lucas di cioccio
In reply to this post by Adam Vogt
Hi Adam, thanks for your input.

2013/12/31 adam vogt <[hidden email]>
> Hello Lucas,
>
> Am I correct to say that laborantin only does full factorial
> experiments? Perhaps there is a straightforward way for users to
> specify which model parameters should be confounded in a fractional
> factorial design. Another extension would be to move towards
> sequential designs, where the trials to run depend on the results so
> far. Then more time is spent on the "interesting" regions of the
> parameter space.

Actually, the parameters specified in the DSL are "indicative" values for a full-factorial default.
At this point, a command-line handler is responsible for exploring the parameter space and executing scenarios. This command-line handler has a way to specify fractional factorial designs by evaluating a query like: "(@sc.param 'foo' > @sc.param 'bar') and @sc.param 'baz' in [1,2,3,'toto']" .
This small query language was my first attempt at "expression parsing and evaluation" and the code might be ugly, but it works and fits most of my current needs. Bonus: with this design, the algorithm to "explore" the satisfiable parameter space is easy to express.
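
To illustrate the design, here is a small sketch of a query evaluated against a full-factorial expansion. It is not the library's parser or evaluator: the `Value`, `Expr`, and `Query` types and the functions `fullFactorial`, `eval`, `matches`, and `select` are assumptions made for this example, and the query is written directly as a Haskell value rather than parsed from a string.

```haskell
-- Untyped parameter values (hypothetical representation).
data Value = VNum Double | VStr String
  deriving (Eq, Show)

type Scenario = [(String, Value)]

-- Every combination of the candidate values: the full-factorial default.
fullFactorial :: [(String, [Value])] -> [Scenario]
fullFactorial = mapM (\(name, vs) -> [(name, v) | v <- vs])

-- A tiny query AST covering the operators used in the example query.
data Expr = Param String | Lit Value

data Query
  = Gt Expr Expr
  | In Expr [Value]
  | And Query Query

eval :: Scenario -> Expr -> Maybe Value
eval scenario (Param name) = lookup name scenario
eval _        (Lit v)      = Just v

-- A comparison on missing or non-numeric values simply fails to match.
matches :: Query -> Scenario -> Bool
matches (Gt a b) sc = case (eval sc a, eval sc b) of
  (Just (VNum x), Just (VNum y)) -> x > y
  _                              -> False
matches (In e vs) sc = maybe False (`elem` vs) (eval sc e)
matches (And p q) sc = matches p sc && matches q sc

-- Keep only the scenarios that satisfy the query.
select :: Query -> [(String, [Value])] -> [Scenario]
select q = filter (matches q) . fullFactorial

-- (@sc.param 'foo' > @sc.param 'bar') and @sc.param 'baz' in [1,2,3,'toto']
exampleQuery :: Query
exampleQuery =
  And (Gt (Param "foo") (Param "bar"))
      (In (Param "baz") [VNum 1, VNum 2, VNum 3, VStr "toto"])
```

Since the query is only a predicate over scenarios in this sketch, exploring the satisfiable part of the parameter space reduces to a `filter` over the full-factorial expansion.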

One direction to enrich this small query language would be to express that a parameter takes a continuous value in a range or should fulfill a boolean test function. Then we could use techniques such as rapidly exploring random trees to explore "exotic feasibility regions".

Another direction to improve the query language is to require ScenarioDescriptions to have a sort of "cost/fitness function" so that we can later build a parameter-space explorer that performs an optimization. We could even extend the query language to bind a parameter to a value that optimizes another experiment.
 
> I think getVar/param could be re-worked to give errors at compile
> time. Now you get a runtime error if you typo a parameter or get the
> type wrong. Another mistake is to include parameters in the experiment
> that do not have any effect on the `run` action, unless those
> parameters are there for doing replicates.
>
> Those might be addressed by doing something like:
>
>     a <- parameter "destination" $ do ...
>     run $ print =<< param a
>
> Where the types are something like:
>
>   param :: Data.Tagged.Tagged a Text -> M a
>   values :: [T a] -> M (Tagged a Text)
>   str :: Text -> T Text
>   num :: Double -> T Double
>
> with M being whatever state monad you currently use, and param does
> the same thing it always has, except now it knows which type you put
> in the values list, and it cannot be called with any string. The third
> requirement might be met by requiring -fwarn-unused-matches.

That's one thing I am torn about. In my experience, it is sometimes handy to branch on whether a value is a number or a string (e.g., to say things like 1, 2, 3, or "all"). Somehow, tagged values do not prevent this either. Similarly, I don't know whether I should let users specify any type for their ParameterDescription at the cost of writing serializer/deserializer boilerplate (although we could provide some useful default types, as is the case now).
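
As a rough illustration of that trade-off, the sketch below models the "1, 2, 3, or all" case as a user-defined type together with the serializer/deserializer boilerplate it would require. Everything here is hypothetical (the `Probes` type and the `serialize`, `deserialize`, and `probesToSend` functions are invented for the example and are not part of Laborantin).

```haskell
import Text.Read (readMaybe)

-- How many probes to send: a definite count, or all available ones.
data Probes = Exactly Int | AllProbes
  deriving (Eq, Show)

-- The boilerplate a user would have to write for a custom parameter type.
serialize :: Probes -> String
serialize (Exactly n) = show n
serialize AllProbes   = "all"

deserialize :: String -> Maybe Probes
deserialize "all" = Just AllProbes
deserialize s     = Exactly <$> readMaybe s

-- Branching happens on constructors rather than on "number or string".
probesToSend :: Int -> Probes -> Int
probesToSend _         (Exactly n) = n
probesToSend available AllProbes   = available
```

The branch on "number or string" becomes a branch on constructors, at the price of one `serialize`/`deserialize` pair per custom type.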

> An alternative strategy is to change your type Step, into an algebraic
> data type with a function to convert it into what it is currently.
> Before the experiment happens, you can have a function go through that
> data to make sure it will succeed with its getVar/param. This is
> called a deep embedding:
> <http://www.haskell.org/haskellwiki/Embedded_domain_specific_language>.

That could be an idea. I haven't gone that far yet, but I'll keep an eye on it.

Best wishes for a happy new year,
--Lucas
