[Yum-devel] yum on an olpc machine (slooooooooooow)
Panu Matilainen
pmatilai at laiskiainen.org
Mon Dec 18 07:33:55 UTC 2006
On Sat, 16 Dec 2006, seth vidal wrote:
>
> If we were going to get rid of the xml format altogether I would
> recommend: 1. using the sqlitedb's as an optimization and
> non-api-breaking test in 3.0.X or so
>
> 2. figure out what improvements we could make to the db format to make
> searching faster or to make it smaller on disk. This would mean working
> out the right indexes, etc.
>
> 3. See if we could figure out a nice or simple way of providing an
> sql-diff or even a binary-diff for updates to the metadata.
>
> The thing I like about having the xml around though is that it is human
> readable in a pinch. The sqlite is not.

The xml isn't really human readable either unless it's done with
createrepo -p or converted to prettyprint manually :)

Some observations on the subject...

An sqlite dump of the database is roughly the same size as the current xml
data (at least when compressed). So transferring the repodata as an sql dump
would keep bandwidth requirements roughly the same as now. Of course
that would still leave some work for the clients, but surely initializing the
database from sql statements is a helluva lot faster than parsing those huge
xml files and creating the sql from that. I haven't timed it yet though. Raw
sql could serve as "diffs" as well, and is "human readable". Not that I
particularly like the idea of repodata as sql statements, but ...
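
To make the sql dump idea a bit more concrete, here is a minimal sketch
using Python's stdlib sqlite3 and gzip modules. The one-table schema, the
sample rows and the helper names are all made up for illustration; this is
not what yum or createrepo actually do, just the general shape of "ship a
compressed dump, rebuild the db from it, then apply sql as a diff".

```python
import gzip
import sqlite3

# Hypothetical, heavily simplified schema; a real repodata db has far more.
SCHEMA = "CREATE TABLE packages (pkgKey INTEGER PRIMARY KEY, name TEXT, version TEXT);"


def build_sample_db():
    """Server side: build a tiny in-memory db with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO packages (name, version) VALUES (?, ?)",
        [("yum", "3.0.1"), ("rpm", "4.4.2"), ("createrepo", "0.4.6")],
    )
    conn.commit()
    return conn


def dump_to_gzip(conn):
    """Serialize the whole database as SQL text and gzip it for transfer."""
    sql_text = "\n".join(conn.iterdump())
    return gzip.compress(sql_text.encode("utf-8"))


def load_from_gzip(blob):
    """Client side: rebuild the db by executing the dumped statements."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(gzip.decompress(blob).decode("utf-8"))
    return conn


server = build_sample_db()
blob = dump_to_gzip(server)
client = load_from_gzip(blob)

# A later update shipped as plain SQL statements instead of a full dump.
diff = """
BEGIN;
UPDATE packages SET version = '3.0.2' WHERE name = 'yum';
INSERT INTO packages (name, version) VALUES ('yum-utils', '1.0');
COMMIT;
"""
client.executescript(diff)

rows = client.execute(
    "SELECT name, version FROM packages ORDER BY name"
).fetchall()
print(rows)
```

Whether this really beats parsing the xml would still need to be timed
against real repodata.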

The overall repodata size could be cut down somewhat in at least a couple of
ways:
- Drop the filenames redundancy from primary.xml. Of course that means
  the full filelists file has to be downloaded at all times (diffs
  would help a lot here), but that's what apt and smart need to do
  anyway (because both calculate the full dependency tree at all times). Only
  yum benefits from the primary.xml stuff to some extent, and sooner or
  later it needs the full filelists too.
- other.xml is not typically loaded, but it could be made quite a bit
  smaller by storing the changelogs just once per source rpm. The
  difference is *huge* - e.g. FC6 SRPMS/repodata/other.xml.gz is roughly 2M,
  but the binary ones are ~6M for i386 and ~8M for x86_64. With that kind of
  size savings somebody might even want to use it for something :)
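
The changelog point can be sketched roughly like this. The package names,
source rpm names and changelog texts below are made-up sample data, and
the two-dict layout is just one possible way to do it, assuming that all
binary packages built from one source rpm carry the same changelog.

```python
# Made-up sample records: (binary package, source rpm, changelog text).
packages = [
    ("foo", "foo-1.0-1.src.rpm", "* initial build"),
    ("foo-devel", "foo-1.0-1.src.rpm", "* initial build"),
    ("foo-libs", "foo-1.0-1.src.rpm", "* initial build"),
    ("bar", "bar-2.0-1.src.rpm", "* rebuilt"),
]


def dedupe_changelogs(records):
    """Store each changelog once per source rpm, plus a package -> srpm map."""
    changelogs = {}   # source rpm -> changelog text, stored once
    pkg_to_srpm = {}  # binary package -> source rpm it was built from
    for name, srpm, log in records:
        changelogs.setdefault(srpm, log)
        pkg_to_srpm[name] = srpm
    return changelogs, pkg_to_srpm


def changelog_for(name, changelogs, pkg_to_srpm):
    """Look up a binary package's changelog through its source rpm."""
    return changelogs[pkg_to_srpm[name]]


changelogs, pkg_to_srpm = dedupe_changelogs(packages)

# Compare bytes of changelog text stored per package vs. per source rpm.
naive_size = sum(len(log) for _, _, log in packages)
deduped_size = sum(len(log) for log in changelogs.values())
print(naive_size, deduped_size)
```

The more subpackages a source rpm produces, the bigger the savings.
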
- Panu -