[Yum] Threading the IO
David Farning
dfarning at sbcglobal.net
Thu Feb 26 14:00:07 UTC 2004
I have been looking at adding the capability to perform IO-intensive
functions in parallel within thread pools.
There are a number of areas within yum that could take advantage of
these optimizations. The ones that jump out are the file download and
header check sections.
Option one--Make urlgrabber threaded.
Advantages
--all thread stuff and thread bugs would be in urlgrabber, not spread
throughout yum
--yum coders (beyond Michel) would not need to be aware of threading
Disadvantages
--threading would only be available to grabber
--currently calls to urlgrabber are of the form
    calc(URL, filename)
    filename = grabber.urlgrab(URL, filename, **kwargs)
    doSomeThingTo(filename)
urlgrabber would need to be modified so that the parameters could take
any of the forms
    string, string -> use unthreaded grabber, return filename
    (string, string) -> use unthreaded grabber, return filename
    [(string, string), (string, string)] -> use threaded grabber, return
        [(filename, successFlag)]
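To make the three call forms concrete, here is a rough sketch of how the argument check might look. It is written for a current Python, and classify_call is a made-up helper name, not anything in urlgrabber; it only decides which path a call would take and normalizes the arguments into a list of (URL, filename) pairs.

```python
def classify_call(*args):
    """Return (mode, pairs) for the three proposed call forms.

    Hypothetical helper for illustration; not part of urlgrabber.
    """
    # Form 1: two plain strings -> unthreaded
    if len(args) == 2 and all(isinstance(a, str) for a in args):
        return "unthreaded", [(args[0], args[1])]
    # Form 2: one (url, filename) tuple -> unthreaded
    if len(args) == 1 and isinstance(args[0], tuple):
        url, filename = args[0]
        return "unthreaded", [(url, filename)]
    # Form 3: a list of (url, filename) pairs -> threaded
    if len(args) == 1 and isinstance(args[0], list):
        return "threaded", [tuple(pair) for pair in args[0]]
    raise TypeError("expected url, filename; a pair; or a list of pairs")
```

One consequence of this scheme is that a tuple and a list mean different things, which callers would have to keep straight.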
Calls to urlgrabber would then be of the form
    listToGrab = []
    for item in list:
        listToGrab.append(calc(URL, filename))
    fileNames = grabber.urlgrab(listToGrab, **kwargs)
    for fileName, successFlag in fileNames:
        if not successFlag:
            Oops()
        else:
            doSomeThingTo(fileName)
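A minimal sketch of what the threaded path could do internally, assuming a current Python with concurrent.futures available. grab_many and its fetch parameter are hypothetical names: fetch stands in for whatever single-file download call is used, is expected to return the local filename, and is expected to raise on failure. None of this is urlgrabber's actual API.

```python
from concurrent.futures import ThreadPoolExecutor


def grab_many(fetch, requests, max_workers=4):
    """Run fetch(url, filename) for each pair in parallel.

    Returns [(filename, successFlag)] in the same order as requests.
    """
    def attempt(pair):
        url, filename = pair
        try:
            return (fetch(url, filename), True)
        except Exception:
            # Report the failure through the flag instead of raising,
            # so one bad download does not abort the whole batch.
            return (filename, False)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order even if downloads finish out of order
        return list(pool.map(attempt, requests))
```

The caller's loop over (fileName, successFlag) pairs stays exactly as written above; all of the threading is hidden behind the one call.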
Option two--create a generic thread pool
Advantage
--Any parallelizable functions could run in the pool
    tp = threadPool()   # create pool
    tp.init()           # initialize pool
    for item in list:
        # pass the function and its argument; calling it here would
        # run it in the caller's thread instead of in the pool
        tp.addToInQueue(functionToDoSomeThing, item)
    output = tp.cleanUp()  # ensure all threads have finished and return [outPut]
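For illustration, a generic pool along these lines could be sketched as follows, assuming a current Python with the standard threading and queue modules. The class and method names are hypothetical and only loosely follow the pseudocode above; creation and initialization are folded into the constructor, and jobs are assumed to be queued from a single thread.

```python
import queue
import threading


class ThreadPool:
    """Hypothetical minimal pool: queue jobs, then collect ordered results."""

    def __init__(self, num_workers=4):
        self._in_queue = queue.Queue()
        self._results = {}
        self._lock = threading.Lock()
        self._count = 0
        self._workers = [threading.Thread(target=self._work)
                         for _ in range(num_workers)]
        for worker in self._workers:
            worker.start()

    def add_to_in_queue(self, func, *args):
        # The callable is queued, not called; a worker thread runs it later.
        self._in_queue.put((self._count, func, args))
        self._count += 1

    def _work(self):
        while True:
            job = self._in_queue.get()
            if job is None:          # sentinel: no more work
                break
            index, func, args = job
            try:
                outcome = (func(*args), True)
            except Exception as exc:
                outcome = (exc, False)
            with self._lock:         # results dict is shared between threads
                self._results[index] = outcome

    def clean_up(self):
        # One sentinel per worker, queued after all real jobs.
        for _ in self._workers:
            self._in_queue.put(None)
        for worker in self._workers:
            worker.join()
        return [self._results[i] for i in range(self._count)]
```

Even in a sketch this small, the shared results dictionary needs a lock, which is the kind of detail every caller-side function would also have to get right.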
Disadvantage
--all coders would need to be aware of threading issues and the resulting bugs.
David Farning