Parallel deltarpm creation
Ian Mcleod
imcleod at redhat.com
Thu Feb 20 17:26:05 UTC 2014
Posting here at James Antill's suggestion.
In his talk at devconf Dennis Gilmore discussed the current bottlenecks
in the Fedora compose/release process. One thing he mentioned was
deltarpm creation.
The current upstream createrepo is single-threaded/single-process for
all deltarpm actions. I've written some code to allow parallel workers
for these tasks, similar to the multi-process workers that can be used
in the initial package XML parsing tasks.
GIT -
https://github.com/imcleod/createrepo/tree/feature/parallel_deltas_full
RPMS -
http://imcleod.fedorapeople.org/createrepo/
The patch adds two options to the createrepo command line and the
associated config object:
--delta-workers - The number of worker processes to use for delta
related tasks
--max-concurrent-delta-rpm-size - The maximum total size of uncompressed
rpm payloads that are actively being processed by makedeltarpm at any
given time.
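
To illustrate how the two limits interact, here is a minimal sketch of the
admission decision. It is not the code from the patch: the function name
pick_next_jobs, the (name, size) job tuples, and the largest-first ordering
are all assumptions made for the example.

```python
def pick_next_jobs(pending, active_sizes, workers, max_concurrent_size):
    """Choose which pending delta jobs may start right now.

    pending             -- list of (name, uncompressed_payload_size) tuples
    active_sizes        -- payload sizes of jobs already being processed
    workers             -- worker limit (what --delta-workers would carry)
    max_concurrent_size -- size budget (what --max-concurrent-delta-rpm-size
                           would carry)

    Hypothetical helper for illustration only; the real patch may
    schedule differently.
    """
    started = []
    in_flight = sum(active_sizes)
    free_slots = workers - len(active_sizes)

    # Assumption: try the largest packages first so they are not all left
    # for the end of the run.
    for name, size in sorted(pending, key=lambda job: job[1], reverse=True):
        if free_slots <= 0:
            break
        # Start a job only if the total in-flight payload stays within budget.
        if in_flight + size <= max_concurrent_size:
            started.append(name)
            in_flight += size
            free_slots -= 1
    return started


if __name__ == "__main__":
    jobs = [("a", 500), ("b", 300), ("c", 300)]
    print(pick_next_jobs(jobs, active_sizes=[200], workers=4,
                         max_concurrent_size=1000))
```

Note that in this sketch a package whose payload alone exceeds the size
budget is never started; a real implementation has to decide whether to
skip such a package or run it by itself.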
The deltarpm documentation suggests that its peak RAM use is typically
4x the uncompressed RPM payload size. This is consistent with
my experience. So, a reasonable use case is to set --delta-workers to
the number of CPU cores and --max-concurrent-delta-rpm-size to ~25% of
RAM size (or ~25% of whatever quantity of memory you want to devote to
the parallel deltas).
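
The sizing rule above can be sketched as a small calculation. The helper
names are hypothetical, the RAM detection assumes a Linux host, and the
result is expressed in bytes; the unit the option itself expects is not
stated here, so convert as needed.

```python
import os


def suggest_delta_settings(cpu_count, memory_budget_bytes, peak_factor=4):
    """Return (workers, max_concurrent_payload_bytes) for a memory budget.

    peak_factor is the rule of thumb that makedeltarpm peaks at roughly
    4x the uncompressed payload size, so the payload budget is the
    memory budget divided by that factor (about 25%).
    """
    workers = max(1, cpu_count)
    max_concurrent_payload = memory_budget_bytes // peak_factor
    return workers, max_concurrent_payload


def detect_ram_bytes():
    """Total physical RAM in bytes (assumes Linux sysconf names)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    workers, size_limit = suggest_delta_settings(os.cpu_count() or 1,
                                                 detect_ram_bytes())
    print("delta workers:", workers)
    print("max concurrent payload (bytes):", size_limit)
```

Passing a smaller memory_budget_bytes than the machine's total RAM leaves
headroom for the rest of the compose.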
For my development stress test, I re-created an F20 x86_64
Everything repo with F19 Everything as the "old" rpm source for deltas.
On a 32-core test system this task ran in 8 hours with a single deltarpm
worker versus 20 minutes when all 32 cores were used with a concurrent
size limit of 16 GB. In total this creates about 32,000 drpms. So,
this helps.
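
For reference, the arithmetic behind those figures, using only the numbers
quoted above (the function name is just for illustration):

```python
def speedup_summary(serial_minutes, parallel_minutes, cores, drpm_count):
    """Derive speedup, per-core efficiency, and throughput from run times."""
    speedup = serial_minutes / parallel_minutes
    efficiency = speedup / cores
    drpms_per_second = drpm_count / (parallel_minutes * 60)
    return speedup, efficiency, drpms_per_second


# 8 hours with one worker, 20 minutes with 32 workers, about 32,000 drpms.
speedup, efficiency, rate = speedup_summary(8 * 60, 20, 32, 32000)
print("speedup: %.0fx" % speedup)
print("parallel efficiency: %.0f%%" % (efficiency * 100))
print("throughput: %.1f drpms/second" % rate)
```

That works out to a 24x speedup on 32 cores, or roughly 75% parallel
efficiency.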
Thoughts?
-Ian