[Rpm-metadata] Re: Better repodata performance

Jeff Pitman symbiont at berlios.de
Mon Jan 31 04:01:25 UTC 2005


On Monday 31 January 2005 11:08, Alexandre Oliva wrote:
> Generating an xdelta from the previous versions of the .xml.gz files
> to the current versions, along with the relative location of the
> alternate repomd.xml that described them, modified to indicate
> they're deltas between the two given timestamps doesn't sound like
> such a difficult or wasteful thing to do.

This could be driven by an optional parameter to createrepo, which 
provides a list of packages to create a delta with.  If it were fully 
automatic, it would only be a download win for the user.  If it were 
maintainer-driven, it would be a win for both user and repo.

I would rather not use xdelta, because you're still regenerating the 
entire thing.  Having xmlets that virtually add/subtract as a delta 
against primary.xml.gz would be optimal for both sides of the equation.
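
To make the add/subtract idea concrete, here is a rough Python sketch. 
The delta layout (<delta>, <remove>, <add>) and the name attribute on 
<package> are made up purely for illustration; the real primary.xml 
uses namespaces and child elements, and the actual delta contents are 
still undecided.

```python
import xml.etree.ElementTree as ET


def apply_delta(primary_xml: str, delta_xml: str) -> str:
    """Apply a hypothetical add/remove delta to a simplified primary.xml."""
    root = ET.fromstring(primary_xml)
    delta = ET.fromstring(delta_xml)

    # Drop every package named by a <remove> element (hypothetical format).
    removed = {r.get("name") for r in delta.findall("remove")}
    for pkg in list(root.findall("package")):
        if pkg.get("name") in removed:
            root.remove(pkg)

    # Append every <package> carried inside an <add> element.
    for pkg in delta.findall("add/package"):
        root.append(pkg)

    return ET.tostring(root, encoding="unicode")
```

The repo only has to publish the small delta document, and the client 
only has to patch what it already holds.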

The XML formatting recommended is the way to go. The contents of the 
delta are still up in the air.

Another advantage of the delta method is that the on-disk pickled 
objects (or whatever back-end store is used) could be updated 
incrementally from the xml snippets coming in, instead of being 
regenerated from scratch every time.
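
A minimal sketch of that client-side update, assuming the store is just 
a pickled dict keyed by package name. The function name, the dict 
layout, and the added/removed arguments are hypothetical, not how yum 
stores anything today.

```python
import pickle
from pathlib import Path


def update_store(store_path, added, removed):
    """Patch a pickled {name: metadata} dict with one delta's changes."""
    path = Path(store_path)
    # Load the existing store, or start empty on the first run.
    store = pickle.loads(path.read_bytes()) if path.exists() else {}

    # Forget packages the delta removed; ignore names we never had.
    for name in removed:
        store.pop(name, None)

    # Add or overwrite the packages the delta introduced.
    store.update(added)

    path.write_bytes(pickle.dumps(store))
    return store
```

The pickle file itself is still rewritten in full here; what gets 
skipped is re-downloading and re-parsing the whole primary.xml.gz.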

So, anyway, we can talk until our faces are blue.  What we need is a 
candidate for this feature that can run against larger repositories. 
Rawhide and third parties could participate without major borkage with 
current yum.  

Anyway, I'll poke at it and see what materializes.

-- 
-jeff


