Using lzip instead of xz for distributed tarballs

Vanessa McHale-2
Hello all,


GHC is distributed as .tar.xz tarballs; I assume this is because it
produces small tarballs. However, xz is ill-suited for archiving due to
its lack of error recovery. Moreover, lzip produces smaller tarballs
with GHC (I tested with ghc-8.8.2-x86_64-deb9-linux.tar) and
decompression takes about the same amount of time.
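
A comparison like the one above could be reproduced with a short script. The sketch below is not the procedure used for the numbers in this message; it assumes that `xz` and `lzip` are installed and on PATH, and the names `run_filter`, `benchmark`, and `TOOLS` are made up for illustration. It pipes the uncompressed tarball through each tool at level 9, checks that decompression reproduces the input, and reports the compressed size and the decompression time. It reads the whole tarball into memory, which is acceptable for a one-off test.

```python
import shutil
import subprocess
import sys
import time


def run_filter(cmd, data):
    """Pipe data through cmd and return (stdout bytes, elapsed seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout, time.perf_counter() - start


def benchmark(data, tools):
    """Compress and decompress data with each available tool.

    tools maps a label to (compress_cmd, decompress_cmd). A tool whose
    executable cannot be found is skipped rather than treated as an error.
    Returns {label: (compressed size in bytes, decompression seconds)}.
    """
    rows = {}
    for name, (compress_cmd, decompress_cmd) in tools.items():
        if shutil.which(compress_cmd[0]) is None:
            continue
        packed, _ = run_filter(compress_cmd, data)
        unpacked, seconds = run_filter(decompress_cmd, packed)
        if unpacked != data:
            raise RuntimeError(f"{name}: round trip did not reproduce the input")
        rows[name] = (len(packed), seconds)
    return rows


# Assumed invocations: both tools read stdin when no file is given,
# and -c writes the result to stdout.
TOOLS = {
    "xz": (["xz", "-9", "-c"], ["xz", "-d", "-c"]),
    "lzip": (["lzip", "-9", "-c"], ["lzip", "-d", "-c"]),
}

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        tarball = f.read()
    for name, (size, seconds) in benchmark(tarball, TOOLS).items():
        print(f"{name}: {size} bytes, decompressed in {seconds:.2f} s")
```

Saved under a name of your choosing, the script would be run with the path to the uncompressed tarball (for instance ghc-8.8.2-x86_64-deb9-linux.tar) as its only argument. A single timing run is noisy, so repeating it a few times before drawing conclusions is advisable.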

There's more information on the project page:
https://www.nongnu.org/lzip/lzip.html.

Cheers,
Vanessa McHale


_______________________________________________
ghc-devs mailing list
[hidden email]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: Using lzip instead of xz for distributed tarballs

Ben Gamari-3
Vanessa McHale <[hidden email]> writes:

> Hello all,
>
>
> GHC is distributed as .tar.xz tarballs; I assume this is because it
> produces small tarballs. However, xz is ill-suited for archiving due to
> its lack of error recovery. Moreover, lzip produces smaller tarballs
> with GHC (I tested with ghc-8.8.2-x86_64-deb9-linux.tar) and
> decompression takes about the same amount of time.
>
Indeed I recall seeing the "Why xz is not suitable for archival
purposes" blog post quite a while ago and considered moving away from xz
at the time but wasn't entirely convinced that the benefits would
justify the churn, especially since xz tends to be pretty ubiquitous at
this point while lzip is a fair bit less so.

I'd be happy to hear further reasons why we should switch but I'll admit
that I still don't quite see what switching would buy us; we do have
a few backups spread across the planet, so the probability of us having
to rely on the compressor for error recovery is pretty small.

Cheers,

- Ben

Re: Using lzip instead of xz for distributed tarballs

Vanessa McHale-2
Would it be plausible to distribute both? That way users would not have to install lzip.

Cheers,
Vanessa McHale

> On Jan 20, 2020, at 4:15 PM, Ben Gamari <[hidden email]> wrote:
>
> Indeed I recall seeing the "Why xz is not suitable for archival
> purposes" blog post quite a while ago and considered moving away from xz
> at the time but wasn't entirely convinced that the benefits would
> justify the churn, especially since xz tends to be pretty ubiquitous at
> this point while lzip is a fair bit less so.
>
> I'd be happy to hear further reasons why we should switch but I'll admit
> that I still don't quite see what switching would buy us; we do have
> a few backups spread across the planet, so the probability of us having
> to rely on the compressor for error recovery is pretty small.
>
> Cheers,
>
> - Ben

Re: Using lzip instead of xz for distributed tarballs

Ben Gamari-2
On January 21, 2020 11:44:15 AM EST, Vanessa McHale <[hidden email]> wrote:

>Would it be plausible to distribute both? That way users would not have
>to install lzip.

There is indeed precedent for this. IIRC, we distributed both bzip2 and xz tarballs for several years.

I'm not opposed to offering both; the biggest cost is the storage, and that is relatively minor. I have opened #17726 to track this.

Cheers,

- Ben