[Yum-devel] [Patch] Resolver Performance and Correctness

Jeremy Katz katzj at redhat.com
Tue Jun 12 15:19:38 UTC 2007


On Mon, 2007-06-11 at 18:05 +0200, Florian Festi wrote:
> seth vidal wrote:
> > On Wed, 2007-06-06 at 15:10 +0200, Florian Festi wrote:
> >> Jeremy Katz wrote:
> >>> On Tue, 2007-06-05 at 13:22 +0200, Florian Festi wrote:
> >>>>   * whatProvides/Requires - returns {po -> [matching Ps/Rs]}
> >>>>    * added to PackageSackBase, MetaSack, PackageSack,
> >>>>      YumAvailablePackageSqlite
> >>> For consistency, it'd be better if these also took the aspo argument and
> >>> worked similarly to the rpmdb methods.  And we could potentially make
> >>> things look a little cleaner with a wrapper of whatPoProvides and
> >>> whatPoRequires -- just calling the underlying method with aspo=True.
> >> My point here is that the current return value of 
> >> RpmSack.whatProvides/Requires doesn't make any sense at all. All users within 
> >> yum call .getInstalledPackageObject(pkgtup) right afterwards. And I cannot 
> >> see any reason why that method should not return the pkg objects it builds 
> >> anyway. So IMHO the question is how do we get rid of that method and how can 
> >> we introduce the new ones.
> > 
> > Except we do need to retain some consistency. We've got a lot of diverse
> > callers of the yum module these days. We can't count on being able to
> > track all of them down and fix them anymore.
> 
> I see this change as a way to increase consistency as it offers a "one size 
> fits all" solution for searching in sacks. It is also clear that parts of 
> the API cannot be deleted easily. As fixing all callers is not possible there 
> hopefully is a well defined process of removing/changing parts of the API. 
> If not it might be a good time to put one in place right now - like raising 
> DeprecationWarnings for one minor release.

I don't think anyone is disagreeing that it increases consistency and is
therefore good.  It's just the question of the horizon for removing
things.  As Seth says, we have a lot of diverse callers.  And removing
things in minor releases just _isn't_ a good thing.  The way we've been
marking things for future removal is marking them with
YumFutureDeprecationWarning.  But actually removing things marked that
way probably can't happen until we decide to do yum 4.0 (IMHO).
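As a rough illustration of that marking, here is a minimal sketch of an old lookup kept alive next to its replacement. ToySack and its data are hypothetical and are not yum's RpmSack; the warning class is redefined locally (assumed here to derive from PendingDeprecationWarning) so the example runs without yum installed.

```python
import warnings


class YumFutureDeprecationWarning(PendingDeprecationWarning):
    """Local stand-in for yum's warning class (an assumption, so this runs alone)."""


class ToySack:
    """Hypothetical sack used only for illustration; not yum's RpmSack."""

    def __init__(self, packages):
        # packages: dict mapping a package name to the list of names it provides
        self._packages = packages

    def whatPoProvides(self, name):
        """New-style lookup: return {package: [matching provides]}."""
        result = {}
        for pkg, provides in self._packages.items():
            matches = [p for p in provides if p == name]
            if matches:
                result[pkg] = matches
        return result

    def whatProvides(self, name):
        """Old-style lookup: still works for existing callers, but warns."""
        warnings.warn(
            "whatProvides() is marked for future removal; use whatPoProvides()",
            YumFutureDeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this method
        )
        # Delegate to the new method and return only the package names, sorted.
        return sorted(self.whatPoProvides(name))
```

Python ignores PendingDeprecationWarning subclasses by default, so existing callers keep working silently unless they opt in (for example with `python -W always`), which fits a long removal horizon.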

> >> One solution would be to rename all new whatProvides/whatRequires to 
> >> whatPoProvides/whatPoRequires and deprecate RpmSack.whatProvides/Requires.
> >>
> >> I very much dislike the idea of fortifying the old interface by introducing 
> >> the aspo parameter everywhere. But in fact that doesn't matter much. Any 
> >> solution we can agree upon is fine with me.
> >>
> >>>>    * Depsolver: .whatInNewProvides, .whatInTsProvides, .whatInTsRequires
> >>>>     * rename/move to tsinfo?
> >>> Yeah, the naming doesn't seem entirely right here.  And the ts bits
> >>> probably should be in the tsInfo.  The formatting of the methods is also
> >>> a little off from the normal style used in yum; please don't split
> >> The reason why I didn't move them to tsInfo is that this would require a 
> >> much closer coupling between the tsInfo and the Depsolve(r) as these methods 
> >> need access to Depsolve.pkgSack and Depsolve.rpmSack. Is that wanted?
> >>
> >> Another issue not yet mentioned is Depsolve.whatProvides and 
> >> .doSackFilelistPopulate. They look pretty unnecessary. Can we push that 
> >> functionality down into the SqliteSack? Using a callback if the main app 
> >> really has to care about it?
> > 
> >  Maybe I'm missing something about the point here - but is your goal to
> > just cut out and replace all the code currently in yum with new and
> > as-yet untested code? B/c I'm having a hard time understanding why we
> > would want to do this inside 3.2.X.
> 
> Sorry. I should have stated the goal at the beginning: My goal is increasing 
> the performance of the resolver by a factor of 10 for ~1000 pkg operations. I 
> admit that I should have asked the mailing list if this is really desirable. 
> Maybe I am just wasting my own and everyone else's time with something no one 
> needs or wants...

Where do you get that we're not interested in improving the performance?
We're just interested in doing it in a way that's sustainable for the
long-term health of the codebase.

> If there are any ideas to achieve a significant speed up without "cut out 
> and replace all the code" I am glad to offer my help. But I currently cannot 
> imagine anything like that.

The key is just that instead of "put together your patches in your own
working dir", we really need each set of changes to be able to stand on
its own.  Because if a change can't, maybe it's not worth changing that
one piece.  And patches can stand on their own for a variety of reasons:
  * makes the code clearer
  * combines commonly used patterns into helpers
  * makes things faster
  * not helpful in and of itself but becomes useful when later things
    are applied.
  * correctness -- even at the cost of speed

Like I said in my initial reply -- we really want to be able to bisect
changes going in to find where problems crop up rather than making
wholesale changes.  A little bit more up-front work to get the patches
in saves work in the long run.
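
To make the bisecting point concrete, here is a minimal sketch of the idea as a binary search over an ordered patch series. The function name and the is_good callback are hypothetical; it assumes the tree is good with no patches applied and that once a patch breaks something, every later state stays broken.

```python
def first_bad_patch(n_patches, is_good):
    """Return the 1-based number of the first patch that breaks things.

    is_good(k) reports whether the tree works with the first k patches
    applied.  Returns None if the full series is still good.
    """
    if is_good(n_patches):
        return None
    # Invariant: the state after `lo` patches is good (assumed for lo == 0),
    # and the state after `hi` patches is bad.
    lo, hi = 0, n_patches
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

This only narrows things down to a single patch if every intermediate state can be built and tested, which is why each patch needs to stand on its own.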

Jeremy



