[Yum-devel] [Patch] Resolver Performance and Correctness

Jeremy Katz katzj at redhat.com
Tue Jun 5 21:48:10 UTC 2007


On Tue, 2007-06-05 at 13:22 +0200, Florian Festi wrote:
> I put together my patches in my own working dir (as proposed by James Bowes 
> ;)= ) to see if they have the expected performance gain and if they fit 
> together. It turned out that the performance gain is within the expected 
> range and that it is possible to achieve that with increasing the code base 
> by only 20 lines. My test cases report the following timings:
[snip]

Hrmm... something doesn't seem right to me.  Adjusting for today's
changes and applying it gives some pretty bogus looking conflicts and
different broken deps.  The time was also about twice as long.  Didn't
dig much deeper (yet at least).  

> I have the feeling that some tests could benefit even more by merging 
> mytscheck into resolveDeps to reduce the number of ._checkConflicts and 
> _checkFilerequirements calls. There are also some other optimizations I did 
> not yet try out to keep the patch size below one MB. So I expect that 
> further performance improvements are possible.

The other thing that would be nice (and you alluded to below) would be
actually splitting into a patch series.  As it stands right now, it's a
good sized diff and it's hard to analyze pieces independently.  Which
really is important both for review and later bisection in case of
problems.  I know it makes things a lot less fun as you have to do
things like move old code around first and then replace it, but it will
make things a lot clearer.  And fwiw, either using quilt or doing a
local import to a git or hg or bzr repo makes doing things like this a
lot simpler.

> There is a short wrap up of what has been changed:
>   * whatProvides/Requires - returns {po -> [matching Ps/Rs]}
>    * added to PackageSackBase, MetaSack, PackageSack,
>      YumAvailablePackageSqlite

For consistency, it'd be better if these also took the aspo argument and
worked similarly to the rpmdb methods.  And we could potentially make
things look a little cleaner with a wrapper of whatPoProvides and
whatPoRequires -- just calling the underlying method with aspo=True.

>    * Depsolver: .whatInNewProvides, .whatInTsProvides, .whatInTsRequires
>     * rename/move to tsinfo?

Yeah, the naming doesn't seem entirely right here.  And the ts bits
probably should be in the tsInfo.  The formatting of the methods is also
a little off from the normal style used in yum; please don't split

>    * rpmdb.whatProvides returns {po -> [hits]} -> code clean up
>     * still needs param for compatibility -> 2 level deprecation needed!!!
>   * Improved mytscheck
>    * TransactionData.removedmembers - used in _mytscheck to treat
>      removed INSTALL members as Removes and
>      removed REMOVE members as installs

I'm not quite clear on what exactly this is trying to solve...

>     * needs testing/test cases
>    * TransactionData.relatedto and .depends_on converted to sets

Should be fine.  This is the sort of thing that becomes a lot easier to
integrate with separated out patchsets :)

>    * new _checkInstall, _checkRemove,
>     * dropped _provideToPkg and _requiredByPkg
>    * _checkConflicts, _checkFileRequires - work globally
>     * ignoring installed problems and problems in updates properly?

Have to look at these closer to really assess correctness.  And where
one change at a time helps a lot.

>    * dropped Depsolve.deps cache
>    * added local rpms to pkgSack via localSack (cli.py)

Hrmm... what are you trying to accomplish with this?  Other API users
are almost certainly not doing this

>     * adjusted interface of PackageSackBase.matchPackageNames()
>       to other Sacks
>      * Does this break anything?

This should be safe.  Or if it's not, it's already going to be breaking
things :)  But it's been like this since 3.0.x, so probably fine

>   * Prco helpers
>    * Package.matchingPrcos
>    * rpmUtils.miscutils.rangeCompare() - moved code from Package

Again, small changes like this are a lot easier to verify alone but
seems sort of okay.  Although overall, I think we were trying to kill of
rpmUtils, not move more there.

>   * SqliteSack
>    * .prco uses empty tuple as default value
>      to destinct from empty list and avoid multiple db queries
>    * _search_cache

Didn't you previously have a patch to do this with None instead of an
empty tuple?  Also, what memory cost does the _search_cache end up
having?

>   * added better profiling output

Better profiling in the eyes of the beholder... all the outputs have
their uses.  I tend to end up analyzing more than once.  Or using
hotshot2calltree and then using kcachegrind

> Interesting points that need further discussion are:
>   * rpmdb.whatProvides should change its return value. I currently use an
>     additional parameter (as suggested by Jeremy) that could go a way be two
>     deprecation cycles, but this is obviously very ugly.

It's not ideal, but less ugly than breaking all of our (developer) users

>   * PackageSackBase.matchPackageNames() is incompatible to the same
>     methods in other Sack classes. Is changing/rewriting it ok?

As I said, this looks like it should be okay 

Jeremy




More information about the Yum-devel mailing list