[Yum-devel] [PATCH 2/4] use pipe instead of tempfiles.. ~5% speedup

James Antill james at fedoraproject.org
Tue Nov 27 18:14:22 UTC 2012


On Fri, 2012-11-23 at 16:28 +0100, Zdeněk Pavlas wrote:
> ---
>  createrepo/__init__.py |   45 ++++++++++++---------------------------------
>  1 files changed, 12 insertions(+), 33 deletions(-)
> 
> diff --git a/createrepo/__init__.py b/createrepo/__init__.py
> index 167d384..622e442 100644
> --- a/createrepo/__init__.py
> +++ b/createrepo/__init__.py
>              for (num, cmdline) in worker_cmd_dict.items():
>                  if not self.conf.quiet:
>                      self.callback.log("Spawning worker %s with %s pkgs" % (num, 
>                                                        len(worker_chunks[num])))
> -                job = subprocess.Popen(cmdline, stdout=subprocess.PIPE,
> -                                        stderr=subprocess.PIPE)
> +                job = subprocess.Popen(cmdline, stdout=subprocess.PIPE)

 One downside to this is that you need to deal with all the
possible deadlock issues. E.g. you now have:

worker1 => buf1 (stdout) => STDOUT_FILENO
        => STDERR_FILENO
worker2 => buf1 (stdout) => STDOUT_FILENO
        => STDERR_FILENO

reader <= buf1 <= worker1 stdout
       <= buf2 <= worker1 stderr
       <= buf3 <= worker2 stdout
       <= buf4 <= worker2 stderr

...and if we ever block on one side of a worker while it's blocked on
the other side, we deadlock.
 E.g. the automatic flushing that stdout does makes no guarantee that
stream.readline() can ever finish on the reader side, so the reader can
block there until the next flush happens for that worker.
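
 One way to avoid blocking in readline() on a single worker is to
multiplex all the worker pipes and only read from the ones that are
ready. What follows is a rough sketch, not what createrepo does: the
drain_workers() name and the per-worker line buffering are made up for
illustration, it assumes a POSIX system where pipes can be passed to
select(), and it uses the Python 3 selectors module.

```python
import os
import selectors
import subprocess
import sys


def drain_workers(cmdlines):
    """Hypothetical helper: run each command and collect its stdout lines,
    never blocking on one worker's pipe while another has data waiting."""
    sel = selectors.DefaultSelector()
    jobs = {}
    for num, cmdline in enumerate(cmdlines):
        job = subprocess.Popen(cmdline, stdout=subprocess.PIPE)
        sel.register(job.stdout, selectors.EVENT_READ, num)
        jobs[num] = job

    pending = {num: b"" for num in jobs}   # partial line per worker
    lines = {num: [] for num in jobs}      # complete lines per worker

    while sel.get_map():                   # until every pipe hit EOF
        for key, _events in sel.select():
            num = key.data
            # os.read() returns whatever is available; it does not wait
            # for a full line the way readline() would.
            chunk = os.read(key.fd, 65536)
            if not chunk:                  # EOF: worker closed stdout
                sel.unregister(key.fileobj)
                key.fileobj.close()
                if pending[num]:
                    lines[num].append(pending[num])
                    pending[num] = b""
                continue
            parts = (pending[num] + chunk).split(b"\n")
            pending[num] = parts.pop()     # trailing partial line, if any
            lines[num].extend(parts)

    sel.close()
    for job in jobs.values():
        job.wait()
    return lines
```

 Something like drain_workers([cmdline1, cmdline2]) then returns a dict
of byte-string lines keyed by worker number. stderr is left alone here,
as in the patch, so it goes straight to the parent's stderr.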


