[Yum-devel] [PATCH 3/3] Implement getPackageAsync() and getPackageDone()
zpavlas at redhat.com
Fri Jul 29 12:36:45 UTC 2011
> There is no way to get progress data out, AFAICS.
I can feed the progress through a pipe to yum and handle it
in getIdleProcess(), multiplexing progress with errorlevels.
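Roughly what I have in mind (a hypothetical sketch, not the actual patch; the
protocol and names are made up):

```python
# Sketch: a downloader child reports progress lines and its final errorlevel
# over the same pipe, so the parent can multiplex both on one fd.
import os

def downloader(wfd, total=100):
    # child side: emit "progress <done>/<total>" lines, then "done <errorlevel>"
    with os.fdopen(wfd, "w") as out:
        for done in range(0, total + 1, 25):
            out.write("progress %d/%d\n" % (done, total))
            out.flush()
        out.write("done 0\n")  # 0 = success, like a shell errorlevel

def parent():
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(rfd)
        downloader(wfd)
        os._exit(0)
    os.close(wfd)
    status = None
    with os.fdopen(rfd) as pipe:
        for line in pipe:
            kind, arg = line.split()
            if kind == "progress":
                pass  # feed this to yum's progress meter
            elif kind == "done":
                status = int(arg)
    os.waitpid(pid, 0)
    return status
```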
> The API both returns before what you've requested is downloaded _and_
> can block for an indeterminate amount of time.
I never thought of this as a problem. The blocking is necessary for
a simple one-pass API, and to limit the number of spawned downloaders.
The progress display and proper timeout handling in the download
helper should prevent being blocked for too long.
> The downloaders are global, so one downloader might talk to
> rpmforge.org and redhat.com ...
They're not global, each repo has a separate pool of downloaders.
I think it makes sense to treat repositories independently, e.g.
repo A should not care how many processes download from repo B.
Since every downloader is started in a particular repo's package
directory and never chdirs, it could chroot later.
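Something along these lines (a hypothetical sketch; the class and constant
names are made up):

```python
# Sketch: one bounded pool of downloaders per repo, so repo A never
# competes with repo B for process slots.
MAX_PER_REPO = 5

class DownloaderPool:
    def __init__(self):
        self.pools = {}  # repoid -> list of active downloader handles

    def acquire(self, repoid):
        pool = self.pools.setdefault(repoid, [])
        if len(pool) >= MAX_PER_REPO:
            return None  # caller must wait for a slot in *this* repo's pool
        handle = object()  # stands in for a spawned helper process
        pool.append(handle)
        return handle

    def release(self, repoid, handle):
        self.pools[repoid].remove(handle)
```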
> also means keepalive is going to be interesting.
Keepalives and chroot should work fine with a proper download helper
that doesn't spawn a new process for each URL.
> This uses select directly instead of poll, is there some reasoning?
Asyncore next (might be the best one yet? :).
The select() loop only considers the active downloaders for a particular
repo. I expect the usual number of downloaders to be very small
(usually < 5), since a large number of processes ruins connection keep-alives.
So using more efficient (but more complex to set up) interfaces like poll()
or epoll() is (IMHO) not necessary.
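For the record, the loop I have in mind is no more than this (a sketch with
hypothetical names; fds would be the progress-pipe read ends of one repo's
active downloaders):

```python
# Sketch: with < 5 downloaders per repo, a plain select() over their
# progress pipes is simple and cheap enough.
import os
import select

def wait_for_progress(fds, timeout=30.0):
    readable, _, _ = select.select(fds, [], [], timeout)
    events = []
    for fd in readable:
        data = os.read(fd, 4096)
        events.append((fd, data))  # empty data means the downloader finished
    return events
```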
> If you are stuck trying to solve "the big problem" all at once, and are
> sending out this is an update of where you are atm. ... I can
> understand, but you'll probably go crazy trying to do it that way (and
> maybe take me with you ;).
Hope not ;)
> My suggestion would be to solve a small part of the problem fully. Eg.
> get an "almost 100%" patch for urlgrabber.grab() that spawns a single
> process, does the download and returns progress info. Then when we've
That's doable, sure. We'd have downloads in a separate process, with 99%
compatible semantics. But because of the blocking API, this can't be
easily parallelized later.
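The single-process variant would look roughly like this (a sketch only; the
helper name and the progress protocol are assumptions, not urlgrabber's API):

```python
# Sketch: a blocking grab that forks one child to do the download and
# relays its progress to a callback -- same semantics, separate process.
import os

def grab_in_child(do_download, progress_cb):
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child: run the download, report bytes
        os.close(rfd)
        with os.fdopen(wfd, "w") as out:
            for n in do_download():    # yields byte counts as it goes
                out.write("%d\n" % n)
                out.flush()
        os._exit(0)
    os.close(wfd)                      # parent: block, relaying progress
    with os.fdopen(rfd) as pipe:
        for line in pipe:
            progress_cb(int(line))
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

The parent blocks until the child exits, which is exactly why this shape
can't be parallelized without a new API on top.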
> got that, we can start from that base so it can run 2 procs. at once ...
> then ... eventually the pain needed to get it integrated into yum.
You mean, define a new API (on top of the old one), that allows parallel
downloading, and wrap it further up to the point when it's usable
in rpms/drpm's download code, and in metadata download code?
I considered taking the route the other way round. Anyway, all the stuff
in between must be reimplemented, and that's what bothers me. I can write
new, straightforward code, implementing only the necessary features.
Or I could try to reuse and patch the old code, keeping all the seemingly
unused bits and features-to-be. That's to be discussed, I think.