[Rpm-metadata] metadata layout problems and some history

Klaus Kaempf kkaempf at suse.de
Tue Aug 10 13:53:28 UTC 2010


* seth vidal <skvidal at fedoraproject.org> [Aug 09. 2010 20:11]:
> On Sat, 2010-08-07 at 17:37 +0200, Michael Schroeder wrote:
> > So it takes about a second to convert the 2.8 Mbytes xml file to
> > solv. I guess creating the sqlite database is a bit slower, so you
> > chose to do it on the server and not on the client.
> 
> As the size of this file increases how much time is eaten up? You showed
> a f12 updates repo, it seems.
> 
> Fedora 13 GA for x86_64 - the compressed primary.xml.gz is 7.9M.
> 
> So does that mean the solv-creation is going to take ~3s?

Yes. Compared to all the other stuff (xml download, reading the rpm
db), solv-creation is almost negligible.

> 
> I don't see the number of pkgs decreasing at any point in time and since
> we were pushing fedora onto some REALLY underpowered boxes we found the
> xml->sqlite conversion can SUCK on some machines (OLPC XO-1's for
> example were horrendous)
> 

Agreed. That's why we were also thinking about making cached metadata
(solv files) available in the repository in addition to the 'pristine'
xml data.

> 
> 
> > As parsing the data file is not the bottleneck for us, anything that
> > reduces the transfer size is a good thing. A switch to sqlite
> > would hurt us, as the sqlite database is currently bigger than the
> > compressed xml. (This is probably worse if lzma is used
> > instead of gzip.)
> 
> Do you end up storing the xml AND the solv files in the cache dir?

Yes.

> I ask b/c this was another reason for doing server-side-generated
> sqlite b/c we were keeping both in the cache and, again, on the XO-1
> and other quasi-small-disk systems (and most importantly SSDs) we were
> trying to keep the disk writes and disk use as low as possible.

I see.
But wouldn't operating on a database actually require more disk I/O?

Parsing xml to solv is one streamed read (of the xml) and one streamed
write (of the solv). Reading the solv is similar, only a couple of
mallocs and reading in big chunks.
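
To illustrate the I/O pattern only, here is a minimal sketch in Python. It
is not the solv format or the libsolv code: the record layout
(length-prefixed names), the function names convert_stream and
read_records, and the simplified, namespace-free xml sample are all
assumptions made up for this example.

```python
import io
import struct
import xml.etree.ElementTree as ET


def convert_stream(xml_in, out):
    """One sequential pass over the xml, one sequential write of records.

    Hypothetical layout: a 2-byte little-endian length followed by the
    UTF-8 package name. This is NOT the real solv file format.
    """
    count = 0
    for _event, elem in ET.iterparse(xml_in, events=("end",)):
        if elem.tag == "package":
            name = (elem.findtext("name") or "").encode("utf-8")
            out.write(struct.pack("<H", len(name)))
            out.write(name)
            count += 1
            # Drop the element's children and text once they are written.
            elem.clear()
    return count


def read_records(blob):
    """Decode names from a buffer that was read in one big chunk."""
    names = []
    pos = 0
    while pos < len(blob):
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        names.append(blob[pos:pos + length].decode("utf-8"))
        pos += length
    return names


if __name__ == "__main__":
    sample = (b"<metadata>"
              b"<package><name>bash</name></package>"
              b"<package><name>zlib</name></package>"
              b"</metadata>")
    out = io.BytesIO()
    print(convert_stream(io.BytesIO(sample), out))
    print(read_records(out.getvalue()))
```

For a gzip-compressed input, the file object returned by
gzip.open(path, "rb") could be passed as xml_in; the access pattern stays
one forward read and one forward write, with no seeking.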


Klaus
---
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)


