Suggestions for an empirical master thesis

Jon Kristensen-3
Hi, everyone!

I'm looking for a master thesis topic that is empirical in nature (like,
statistics, hypothesis testing, etc.).

The work could involve analyzing either package metadata (Cabal
information), code (AST), and/or data from some other source, possibly
comparing similar data from some non-Haskell domain.

As an example, one idea that was suggested to me was to look at usage
aggregation, that is, how much of a given package another package
actually uses (which could be relevant in any orphan-instance discussion).

From an academic standpoint, it would be good to pick a metric that can
be validated in some way.
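
A minimal sketch of the usage-aggregation idea, assuming the names exported by each dependency and the names a client package references have already been extracted (for instance from parsed source). The package names are real, but the name sets, the helper `usage_ratio`, and the flat treatment of names (ignoring modules and qualified imports) are illustrative assumptions rather than an established metric.

```python
from typing import Dict, Set


def usage_ratio(exported: Set[str], used: Set[str]) -> float:
    """Fraction of a dependency's exported names that a client references."""
    if not exported:
        return 0.0
    return len(exported & used) / len(exported)


# Made-up data: names exported by each dependency (not the real export lists).
exports: Dict[str, Set[str]] = {
    "containers": {"Map", "Set", "IntMap", "Seq"},
    "text": {"Text", "pack", "unpack", "strip", "toUpper"},
}

# Made-up data: names that one client package references.
client_uses: Set[str] = {"Map", "Text", "pack"}

# One ratio per dependency; a low ratio means the client uses little of it.
ratios = {dep: usage_ratio(names, client_uses) for dep, names in exports.items()}
print(ratios)
```

A ratio like this could be validated by checking it against a manual inspection of a sample of packages.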

Thank you!

Best,
Jon
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Re: Suggestions for an empirical master thesis

Suzen, Mehmet
Jon,

Why don't you do some graph analysis on the entire hackage using the
metadata? Build a graph and analyse its statistical properties.
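
A rough sketch of what such a graph analysis could start from, assuming the direct dependencies of each package have already been read out of the Cabal metadata. The tiny `deps` table and the helper names are hypothetical; a real study would load the full Hackage index instead.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical metadata: package -> direct dependencies (as in build-depends).
deps: Dict[str, List[str]] = {
    "base": [],
    "bytestring": ["base"],
    "text": ["base", "bytestring"],
    "aeson": ["base", "bytestring", "text"],
}


def in_degrees(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """Number of packages that directly depend on each package."""
    counts = Counter({pkg: 0 for pkg in graph})
    for targets in graph.values():
        counts.update(targets)
    return dict(counts)


def degree_histogram(degrees: Dict[str, int]) -> Dict[int, int]:
    """How many packages have each in-degree value."""
    return dict(Counter(degrees.values()))


indeg = in_degrees(deps)
hist = degree_histogram(indeg)
print(indeg)
print(hist)
```

The in-degree histogram is one of the simplest statistical properties of the graph; the same table could feed other measures, such as path lengths or connected components.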

Best,
-m

On 4 August 2016 at 21:15, Jon Kristensen <[hidden email]> wrote:

> Hi, everyone!
>
> I'm looking for a master thesis topic that is empirical in nature (like,
> statistics, hypothesis testing, etc.).

Re: Suggestions for an empirical master thesis

Jon Kristensen-3
Hi, Mehmet!

Thank you for your suggestion!

I think that I will need to describe some concrete goals, and provide
some evidence from literature that the area includes, or is related to,
particular scientific and engineering challenges. I will take a look at
related papers about this, but any thoughts on it would be welcome!

Best,
Jon

On 08/05/2016 02:28 PM, Suzen, Mehmet wrote:

> Jon,
>
> Why don't you do some graph analysis on the entire hackage using the
> metadata? Build a graph and analyse its statistical properties.
>
> Best,
> -m

Re: Suggestions for an empirical master thesis

Damian Nadales
In reply to this post by Jon Kristensen-3
Hi Jon,

Allow me to brainstorm with you. I don't know what has been done on code metrics for Haskell programs, but you could try to find correlations (or the lack thereof) between metrics such as cyclomatic complexity or fan-in/fan-out and bugs. For this you could use GitHub repositories. The number of bugs would have to be normalized using the numbers of users of a project (maybe measured in number of downloads).
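
A minimal sketch of that correlation check, assuming one complexity score, one bug count, and one download count per project. All numbers are made up, and using the Pearson coefficient is an assumption; a rank correlation may suit skewed data better.

```python
import math
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return cov / math.sqrt(vx * vy)


# Made-up per-project data.
complexity = [2.0, 4.0, 6.0, 8.0]  # e.g. mean cyclomatic complexity
bugs = [1, 4, 2, 8]                # reported bugs
downloads = [100, 200, 100, 200]   # stand-in for the number of users

# Normalize bug counts by usage: bugs per 1000 downloads.
bugs_per_k = [1000 * b / d for b, d in zip(bugs, downloads)]

r = pearson(complexity, bugs_per_k)
print(round(r, 3))
```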

Just an idea...


Re: Suggestions for an empirical master thesis

Doug McIlroy
In reply to this post by Jon Kristensen-3

> The number of bugs would have to be normalized using the
> numbers of users of a project (maybe measured in number of downloads).

Do you expect more bugs for frequently downloaded packages,
or fewer downloads for packages with many bug reports
or bug rate diminishing with downloads? It's a slippery
topic.

Doug

Re: Suggestions for an empirical master thesis

Damian Nadales
On Sun, Aug 7, 2016 at 1:22 AM, Doug McIlroy <[hidden email]> wrote:

> > The number of bugs would have to be normalized using the
> > numbers of users of a project (maybe measured in number of downloads).
>
> Do you expect more bugs for frequently downloaded packages,
> or fewer downloads for packages with many bug reports
> or bug rate diminishing with downloads? It's a slippery
> topic.
>
The question is whether code metrics could serve as a proxy for
code quality/sustainability. I didn't suggest looking for associations
between downloads and bug reports, but rather using the number of users
as an extra variable in the model (to normalize the data). So, if we
assume that all the other variables remain the same, does code with
better metrics exhibit better quality in practice?
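
One way to read "an extra variable in the model" is a multiple linear regression of bug counts on a code metric and the number of users. The sketch below makes that assumption and fits ordinary least squares through the normal equations. The data is constructed (bugs = 1 + 2 * complexity + 0.5 * users) so that the fit can be checked; `fit_ols` is a hypothetical helper, not a library function.

```python
from typing import List, Sequence


def fit_ols(rows: Sequence[Sequence[float]], ys: Sequence[float]) -> List[float]:
    """Least-squares coefficients [intercept, b1, b2, ...] for y ~ rows."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented normal equations [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Constructed data: (complexity, users in thousands) per project.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
bugs = [4.0, 5.5, 9.0, 10.5, 14.0]

intercept, per_complexity, per_users = fit_ols(predictors, bugs)
print(intercept, per_complexity, per_users)
```

The coefficient `per_complexity` then estimates how bug counts change with the metric while the number of users is held fixed, which is the "all other variables remain the same" question.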


Re: Suggestions for an empirical master thesis

Jon Kristensen-3
That's interesting! Thank you very much!

Best,
Jon

On 08/07/2016 09:06 AM, Damian Nadales wrote:

> The question is whether code metrics could serve as a proxy for
> code quality/sustainability. I didn't suggest looking for associations
> between downloads and bug reports, but rather using the number of users
> as an extra variable in the model (to normalize the data). So, if we
> assume that all the other variables remain the same, does code with
> better metrics exhibit better quality in practice?