[Rpm-metadata] update patch, take 3

Hans-Peter Jansen hpj at urpla.net
Tue Mar 27 12:40:22 UTC 2007


Hi Mike,

On Monday, 26 March 2007 22:03, Mike McLean wrote:
> http://people.redhat.com/mikem/software/createrepo-update5.patch
>
> This patch adds a --update option to createrepo. If that option is
> specified and the directory provided has preexisting repodata, then that
> data is read and compared to the present file list. Only RPMs that have
> been added or changed (detected by change in mtime or size) are scanned;
> the rest of the repodata is recycled.
>
> When the amount of change is slight, this greatly reduces the amount of
> IO for the run. Unchanged files get by with just an os.lstat. In my
> testing this is at least twice as fast as using --cachedir.
>
> In a system like Koji, where repositories are recreated with minor
> updates on a regular basis, these savings can really add up. Please take
> this patch, so that we don't have to choose between performance and
> upstream compatibility.
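
For illustration, the change detection described in the quoted text could be sketched roughly as follows. This is not code from the patch: the function names and the shape of the cached record (a dict with "mtime" and "size" per path) are assumptions made up for the example. Only the idea of comparing os.lstat results against previously recorded mtime and size comes from the description above.

```python
import os


def needs_rescan(path, cached):
    """Return True if the RPM at `path` has to be read again.

    `cached` is a hypothetical record {"mtime": int, "size": int} taken
    from the old repodata, or None if the file was not listed there.
    """
    if cached is None:
        return True  # new file, nothing to recycle
    st = os.lstat(path)  # the only per-file cost for unchanged packages
    return int(st.st_mtime) != cached["mtime"] or st.st_size != cached["size"]


def split_packages(paths, old_index):
    """Partition `paths` into (reuse, rescan) lists.

    `old_index` maps path -> cached record (hypothetical layout).
    """
    reuse, rescan = [], []
    for path in paths:
        if needs_rescan(path, old_index.get(path)):
            rescan.append(path)
        else:
            reuse.append(path)
    return reuse, rescan
```

Packages in the reuse list would keep their old metadata entries; only those in the rescan list would need their headers read again.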

This sounds like a promising idea, but I would rather make it the default
behavior whenever metadata already exists, as I don't like the notion of
semantically separating initial setup and update modes. Consider my case:
I'm mirroring many different repos (via rsync), and the Python script that
does this is smart enough to call createrepo after rsync. Your update mode
would force me to handle the initial setup case separately :-(.

IMO, a better approach would be to provide an option that forcibly replaces
the metadata (saving the user an rm -r in rare cases).
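
A minimal sketch of that default, assuming a hypothetical helper choose_mode and a boolean force flag (neither exists in createrepo as far as this mail is concerned), and assuming that "metadata exists" means a repodata/repomd.xml file is present in the repository directory:

```python
import os


def choose_mode(repo_dir, force=False):
    """Pick "update" when old metadata exists, otherwise "full".

    Hypothetical helper: `force` stands in for the proposed option
    that replaces the metadata unconditionally.
    """
    repomd = os.path.join(repo_dir, "repodata", "repomd.xml")
    if force or not os.path.isfile(repomd):
        return "full"    # initial setup, or the user asked for a clean rebuild
    return "update"      # recycle whatever the old repodata still covers
```

With this logic, a mirroring script could call the same command for a fresh directory and for an existing one, without treating the initial setup as a special case.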

Wouldn't this also make the cachedir option obsolete, if the metadata could
be regenerated from the existing repodata directly?

BTW, does anybody use the checkts option in their setup? If it's used widely
enough, maybe it deserves the same logic swap: making it the default and
providing a --notscheck option, or even combining it with the --force option
suggested above ;-)...

Pete


