[Yum] metadata compression
Joshua Bahnsen
archrival at gmail.com
Mon Apr 20 17:33:23 UTC 2009
I don't know the roadmap for yum, so I didn't realize that sqlite files were
the way to go. It does make sense, though.
I won't go into details about the RHEL 3 and RHEL 4 yum setup I have...
I have posted 3 compressed versions of other.xml from rhel-i386-server-5
here:
http://thejoshwa.com/upload/other.xml.7z
http://thejoshwa.com/upload/other.xml.bz2
http://thejoshwa.com/upload/other.xml.gz
All were compressed using the maximum compression available for each (gzip
--best, bzip2 --best, 7z a -t7z -mx=9 -m0=lzma).
You can see the difference for yourself.
Format        Size (bytes)
Uncompressed     128946449
7zip               2014112
bzip2             22233028
gzip              32705803
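
For anyone who wants the ratios without downloading the files, here is a
minimal Python sketch that works them out from the byte counts above (roughly
64x for 7zip, 5.8x for bzip2 and 3.9x for gzip). The compressed_sizes() helper
is only an assumption about how one might repeat the comparison on a local
file: it uses Python's standard gzip, bz2 and lzma modules, and the lzma
module writes .xz streams rather than 7z archives, so its numbers will not
match the 7z figure exactly. The names SIZES, ratios and compressed_sizes are
made up for this illustration.

```python
import bz2
import gzip
import lzma

# Byte counts copied from the message above (other.xml, rhel-i386-server-5).
SIZES = {
    "uncompressed": 128946449,
    "7zip": 2014112,
    "bzip2": 22233028,
    "gzip": 32705803,
}


def ratios(sizes):
    """Return {format: uncompressed size / compressed size}."""
    base = sizes["uncompressed"]
    return {
        name: base / size
        for name, size in sizes.items()
        if name != "uncompressed"
    }


def compressed_sizes(data):
    """Compress bytes in memory at each codec's highest preset.

    Assumption: stdlib codecs stand in for the gzip, bzip2 and 7z
    command-line tools; lzma.compress() produces an .xz stream, not a
    7z archive.
    """
    return {
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
        "lzma": len(lzma.compress(data, preset=9)),
    }


if __name__ == "__main__":
    # Print the best ratio first.
    for name, ratio in sorted(ratios(SIZES).items(), key=lambda kv: -kv[1]):
        print(f"{name:6s} {SIZES[name]:>10d} bytes  {ratio:5.1f}x smaller")
```

To try it on real metadata, read the file in binary mode and pass the bytes to
compressed_sizes(); note that this holds the whole file in memory.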
On Sun, Apr 19, 2009 at 11:03 PM, James Antill <james-yum at and.org> wrote:
> Joshua Bahnsen <archrival at gmail.com> writes:
>
> > I am creating repository data based on ALL rpms available to a specific
> > Red Hat channel (6000 or so per channel)
> >
> > rhel-i386-as-3
> > rhel-i386-es-3
> > rhel-i386-ws-3
> > rhel-i386-as-4
> > rhel-i386-es-4
> > rhel-i386-ws-4
>
> There's little or nothing yum can do for these.
>
> > rhel-i386-client-5
> > rhel-i386-server-5
>
> These we can probably try and help with, but we've been asking and
> waiting for 12+ months for RHN and CentOS to move to generating
> .sqlite files server side. So I wouldn't bet that we can help in the
> general case, quickly. Plus any client side support for lzma probably
> wouldn't get into 5.x until at least 5.5 (more likely 5.6 or 5.7).
> So realistically you are targeting Fedora and 6.x for a change like
> this.
>
> [...]
>
> > With rhel-i386-as-4, other.xml is nearly 300 MB uncompressed, with gzip
> > it is 66 MB, with lzma on max compression is 2.4 MB.
>
> Trying to do a mental s/4/5/
>
> Ok, what is the .sqlite size ... what is bzip2 vs. lzma on that?
>
> Can you post your *.xml files somewhere, so we can all see the same
> data? ... I picked some random pieces because I assumed it'd scale
> close to linear. I'm still pretty surprised by 20x differences.
>
> > I'm personally not even concerned with storing the data in sqlite
>
> Then we probably have little to discuss as downloading .sqlite
> instead of .xml is a major win, and moving to generating it is the
> plan for everyone AFAIK (and yum always prefers it). So anything that
> doesn't help .sqlite transfer isn't worth much.
>
> > I will state I have been using 7z for the compression and not lzma from
> > the SDK, 7z has much better results.
>
> Fair enough, I just used lzip on CentOS-5, as that was all that came
> up for "yum search lzma" there. I'm by no means a compression expert,
> just trying to get some usable real world data.
>
> --
> James Antill -- james at and.org
> _______________________________________________
> Yum mailing list
> Yum at lists.baseurl.org
> http://lists.baseurl.org/mailman/listinfo/yum
>