[Yum-devel] Implementing delta metadata

seth vidal skvidal at fedoraproject.org
Tue Sep 27 17:12:50 UTC 2011


On Mon, 2011-09-26 at 07:21 -0400, Zdenek Pavlas wrote:
> Hi!
> 
> Thanks for the interest in improving MD download!
> Just a few thoughts (I don't consider myself experienced
> in the codebase, esp on the createrepo part).
> 
> - Sharding metadata is very likely not an option.
> 
> The per-file overhead (to download, to store, to query) 
> is significant, and to get a significant fraction of files 
> not modified, we'd need quite a lot of them (100+ I guess).
> 
> - Rsync-friendly metadata is IMO a better option, but..
> 
> 1) pkgKey values are assigned sequentially, so adding/removing
> a package in the middle touches 50% of the metadata.
> 
> 2) It's very likely (although I'm not sure) that building
> an sqlite DB from scratch from two slightly different inputs
> produces two very different databases that rsync poorly
> (due to records ending up at different page offsets).
> 
> So, keeping persistent pkgKeys (1), and building
> new metadata database by copying the old one and performing
> a set of insert/delete/updates (2) would help a lot.
> 
> Then there's another issue.. compressed sqlite files
> are currently the primary means of metadata distribution,
> but that's likely to change.
> 
> On yum side, there are other problems:
> 
> 3) non-existent rsync:// support in libcurl and urlgrabber.
> 
> Yum would probably have to exec() rsync, and that integrates
> badly (no mirror failovers, different progress meters etc).
> 
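
The persistent-pkgKey idea in the quoted points (1) and (2) can be sketched
as follows. This is only an illustration under stated assumptions: the
two-column packages table is a simplified stand-in for the real primary
database schema, and apply_delta() is a hypothetical helper, not an existing
createrepo or yum function. It updates an existing database in place instead
of rebuilding it, so the rows that survive keep their pkgKey.

```python
import sqlite3


def apply_delta(conn, removed, added):
    """Hypothetical helper: update the table in place so that
    surviving rows keep the pkgKey they already had."""
    with conn:  # one transaction for the whole delta
        conn.executemany(
            "DELETE FROM packages WHERE pkgId = ?",
            [(pkg_id,) for pkg_id in removed],
        )
        conn.executemany(
            "INSERT INTO packages (pkgId, name) VALUES (?, ?)",
            added,
        )


# Simplified stand-in schema (assumption). AUTOINCREMENT makes sqlite
# hand out keys that were never used before, so a deleted key is not
# recycled for a new package.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages ("
    " pkgKey INTEGER PRIMARY KEY AUTOINCREMENT,"
    " pkgId TEXT UNIQUE,"
    " name TEXT)"
)
conn.executemany(
    "INSERT INTO packages (pkgId, name) VALUES (?, ?)",
    [("a1", "bash"), ("b2", "coreutils"), ("c3", "zlib")],
)
conn.commit()

before = dict(conn.execute("SELECT pkgId, pkgKey FROM packages"))

# Remove a package from the middle and add a new one.
apply_delta(conn, removed=["b2"], added=[("d4", "curl")])

after = dict(conn.execute("SELECT pkgId, pkgKey FROM packages"))

# Packages present in both versions keep their keys.
unchanged = {p for p in before if p in after and before[p] == after[p]}
print(sorted(unchanged))
```

Whether an in-place update like this also keeps the on-disk pages stable
enough to transfer well is the open question from point (2); the sketch only
shows the key assignment.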


rsync:// as a requirement is a non-starter for yum and for most of our
users. Setting up an HTTP server sensibly to do this is trivial. Setting
up rsync is a pain and is not going to happen for many, many users.

I do not recommend getting caught up in the idea that rsync will work
for anyone.

-sv



