[Yum] metadata compression
Seth Vidal
skvidal at fedoraproject.org
Mon Apr 20 21:34:50 UTC 2009
On Mon, 20 Apr 2009, James Antill wrote:
> Joshua Bahnsen <archrival at gmail.com> writes:
>
>> I don't know the roadmap for yum, so I didn't realize that sqlite files were
>> the way to go. It does make sense, though.
>> I won't go into details about the RHEL 3 and RHEL 4 yum setup I have...
>>
>> I have posted 3 compressed versions of other.xml from rhel-i386-server-5
>> here:
>>
>> http://thejoshwa.com/upload/other.xml.7z
>> http://thejoshwa.com/upload/other.xml.bz2
>> http://thejoshwa.com/upload/other.xml.gz
>>
>> All were compressed using the maximum compression available for each (gzip
>> --best, bzip2 --best, 7z a -t7z -mx=9 -m0=lzma).
>>
>> You can see the difference for yourself.
>
> Interesting, "7z" appears to be much faster than lzip ... and
> joyfully incompatible with either lzip or lzma (at least in your above
> use case), but the file sizes are close enough. After converting to
> .sqlite I get:
>
> 121M other.xml.sqlite
> 5.0M other.xml.sqlite.7z
> 27M other.xml.sqlite.bz2
> 39M other.xml.sqlite.gz
> 5.5M other.xml.sqlite.lz
> 5.4M other.xml.sqlite.lzma
>
> ...which is interesting, but looking at the data (e.g. primary,
> filelists, etc.) it looks like lzip/lzma/7z/etc. are just seeing
> that for each version of each package most of the changelog is
> identical. This means that it's only a big win for changelog data,
> and maybe only enough of one while we continue to put every changelog
> entry since the beginning of time into the metadata.
Fedora, at least, should only be putting the last 10 changelog entries in.
-sv
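
A rough sketch of the effect James describes above, not a measurement of real
repodata. It assumes made-up metadata in which every version of a package
repeats the same 1 MiB changelog, with random bytes standing in for the
changelog text. The names build_metadata and compressed_sizes are hypothetical.
Python's lzma module writes the .xz container rather than the 7z/lzip/.lzma
containers used in the thread, but the matching behaviour should be comparable:
gzip only looks back 32 KiB and bzip2 works in blocks of at most 900 kB, so
neither can see a repeat that sits about 1 MiB back, while LZMA's much larger
dictionary can.

```python
import bz2
import gzip
import lzma
import random

MIB = 1024 * 1024


def build_metadata(changelog_size=MIB, versions=4, seed=0):
    """Fake 'other' metadata: every package version repeats one changelog."""
    rng = random.Random(seed)
    # Random bytes stand in for changelog text, so the only redundancy is
    # the repetition across versions. Real changelogs also compress well on
    # their own; that is deliberately left out here (an assumption).
    changelog = rng.getrandbits(8 * changelog_size).to_bytes(changelog_size, "little")
    parts = []
    for n in range(versions):
        parts.append(b"pkg-1.0-%d.noarch\n" % n)  # small per-version header
        parts.append(changelog)                   # identical every time
    return b"".join(parts)


def compressed_sizes(data):
    """Return the size in bytes of data under each compressor."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data, compresslevel=9)),
        "bz2": len(bz2.compress(data, compresslevel=9)),
        # Default preset; its dictionary is several MiB, which is enough to
        # reach back to the previous copy of the changelog.
        "xz": len(lzma.compress(data)),
    }


if __name__ == "__main__":
    for name, size in compressed_sizes(build_metadata()).items():
        print("%-4s %6.2f MiB" % (name, size / MIB))
```

With these assumptions gz and bz2 should stay close to the raw size, and xz
should land near the size of a single changelog. Trimming the changelog (the
"last 10 entries" approach) shrinks what is repeated, which is why the gap
between the formats would narrow.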