[Yum-devel] [PATCH] Use state-aware filename rpmdb index if rpm supports it

Panu Matilainen pmatilai at laiskiainen.org
Wed Aug 31 18:36:33 UTC 2011


On 08/31/2011 08:52 PM, James Antill wrote:
> On Wed, 2011-08-31 at 19:57 +0300, Panu Matilainen wrote:
>> On 08/31/2011 05:59 PM, James Antill wrote:
>>> On Wed, 2011-08-31 at 11:44 +0300, Panu Matilainen wrote:
>>>> This fixes some cases where yum depsolver and rpm >= 4.9.0
>>>> disagree on dependencies on removal (including but not limited
>>>> to the example in BZ 729973), causing "ERROR with transaction check
>>>> vs depsolve" errors. The state-aware file index is currently only
>>>> in rpm.org HEAD but likely to get backported to 4.9.x series.
>>>
>>>    This is altering searchFiles() ... and changing it from "search for any
>>> rpms that own this file" to something a bit like "search for any rpms
>>> that own this file, _and_ that file is marked as 'on disk' by the
>>> rpmdb".
>>>
>>>    Pretty sure we can't do that.
>>
>> Well, that's how rpm behaves now, and unless yum does the same you'll
>> get bugreports from the cases where rpm and yum disagree.
>
>   Maybe, but I'm worried about cases like:
>
> yum install  /usr/bin/blah =>  gets both multilib variants
> yum provides /usr/bin/blah =>  only returns one

Yup, that's exactly what would happen, because there's exactly one 
package whose removal will remove /usr/bin/blah (i.e. the one that truly 
owns it).
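
To make the difference concrete, here is a minimal toy model of the two
lookups. It is only a sketch: the state names, the INSTALLED table and
search_files() are made up for illustration and are not rpm or yum APIs.

```python
# Toy model only: every name here is hypothetical, not an rpm or yum API.
NORMAL = "normal"            # file was laid down on disk by this package
WRONG_COLOR = "wrong color"  # file skipped, another arch variant won

# package -> {path: state}, roughly what the rpmdb records per file
INSTALLED = {
    "blah-1.0-1.x86_64": {"/usr/bin/blah": NORMAL},
    "blah-1.0-1.i686": {"/usr/bin/blah": WRONG_COLOR},
}


def search_files(path, state_aware=False):
    """Return the sorted names of packages that list `path`.

    With state_aware=True, only packages whose copy of the file is
    in the NORMAL state count as owners.
    """
    owners = []
    for pkg, files in INSTALLED.items():
        state = files.get(path)
        if state is None:
            continue  # package does not carry this path at all
        if state_aware and state != NORMAL:
            continue  # listed in the header, but not actually installed
        owners.append(pkg)
    return sorted(owners)


print(search_files("/usr/bin/blah"))
# ['blah-1.0-1.i686', 'blah-1.0-1.x86_64']
print(search_files("/usr/bin/blah", state_aware=True))
# ['blah-1.0-1.x86_64']
```

The plain lookup corresponds to the "gets both multilib variants" case,
the state-aware one to the "only returns one" case.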

>> Think about it - how can a package claim to own and provide a file that
>> is not 'on disk'?
>
> 1. Well it has done in rpm for the last N years :).

Sure. It has also done many other absurd things :)

>
> 2. In some ways this is similar to the user doing "rm -f /bin/zsh" and
> then asking if zsh owns that file.

That's a bit different, and it still works the way it always has: the 
fact that somebody *else* removed the file does not change rpm's idea of 
who owns it. But yeah, there are similarities.

>
>>>    I guess we could add a new API ... but using that API for file requires
>>> logic, when we'd have to ignore our current caches, seems less than
>>> optimal.
>>
>> Optimal or not, I just think correctness should always come before speed.
>
>   That's far from an absolute rule,

Of course. BTW I need to double-check what exactly happens with the 
caching. IIRC it looked like dropping caches might only be necessary 
when the rpmdb was changed outside of yum, in which case it wouldn't be 
that bad, but we'll see...
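
One conservative way to handle this, sketched under assumptions: key the
cache on some token that changes whenever the rpmdb changes, and drop
everything when the token differs. FileOwnerCache, the "v1"/"v2" tokens
and the package name below are all hypothetical; this is not how yum's
caches are implemented, and it drops the cache on any change rather than
only on changes made outside of yum.

```python
class FileOwnerCache:
    """Hypothetical cache of path -> owners, valid for one rpmdb version."""

    def __init__(self):
        self._version = None
        self._owners = {}

    def lookup(self, path, current_version, compute):
        # Drop everything if the rpmdb changed since the cache was filled.
        if current_version != self._version:
            self._owners.clear()
            self._version = current_version
        if path not in self._owners:
            self._owners[path] = compute(path)
        return self._owners[path]


calls = []


def slow_lookup(path):
    """Stand-in for the expensive rpmdb query; records each real call."""
    calls.append(path)
    return ["zsh-4.3-1.x86_64"]  # made-up package name


cache = FileOwnerCache()
cache.lookup("/bin/zsh", "v1", slow_lookup)  # computed
cache.lookup("/bin/zsh", "v1", slow_lookup)  # served from the cache
cache.lookup("/bin/zsh", "v2", slow_lookup)  # rpmdb changed: recomputed
print(len(calls))  # 2
```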

> as I'm sure you wouldn't argue that
> we need to have the erlang model of 100,000 provides/requires per
> package ... or even taking this case to the extreme means that we can't
> use "filelists" for anything, so whenever we'd now hit that we should
> (to be correct) download all the package headers and pass them to rpm?

Not perhaps quite so dramatic, but since you brought this up... ;)

I am in fact looking into getting rpm back into the depsolve loop 
somehow. I don't mean going back to the state where all package headers 
are downloaded and fed into rpm, no. It has to be something else. But 
the point is, dependencies are not (and don't have to be) quite as static 
as they appear in the repository metadata, and getting rpm back into the 
loop would open up some interesting possibilities.

Yesterday I hacked up some prototype code to turn yum package objects 
into headers that can in fact be fed into rpm and used for dependency 
checking purposes. This isn't particularly useful in itself as various 
useful bits and pieces are missing (from rpm POV) from the repodata, so 
this is just early exploration of possibilities :)

>   I'm not saying this is a bad thing for rpm to be doing ... but it's not
> just a simple bugfix to get to 100% "fixed". Eg. I can well imagine
> people wanting "yum remove foo.x86_64" doing an automatic reinstall of
> foo.i686 instead of deleting anything depending on /usr/bin/foo.

Certainly, the various behaviors around this are nowhere near 100% fixed 
on the rpm side either. The patch in question only aligns yum's behavior 
with what is implemented in rpm right now. There will be more over time, 
I'm sure :)

>
>>>> Also it requires 'yum clean rpmdb' on first go to wipe previously cached
>>>> data, pointing out another problem: the state-aware files index is dynamic,
>>>> and should not be cached across transactions as a transaction can change
>>>> file states of packages that aren't included in that transaction.
>>>
>>>    How well does that work now (installing foo-1.i686 after foo-1.x86_64
>>> was already installed?)
>>
>> Installing foo.i686 after foo.x86_64 in the normal multilib
>> configuration typically doesn't change anything outside foo.i686, it
>> just gets any shared elf-files marked with "wrong color". It's the other
>> way around where ownership changes: if foo.i686 is installed first, and
>> then foo.x86_64 in another transaction, foo.i686 loses ownership of
>> shared elf-files.
>
>   Right, I meant has rpm got rid of the old problems where you could
> install both packages at the same time but not separately?

Ah, that. Yes, that particular misbehavior has been fixed since rpm 
4.6.0.
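
As for the ordering effect from the quoted text above (foo.i686 first,
then foo.x86_64 in another transaction), here is a deliberately
simplified sketch of why a transaction can change the file states of a
package that isn't part of it. The install() helper and the dict-based
"db" are hypothetical, and real rpm does far more than compare colors.

```python
NORMAL, WRONG_COLOR = "normal", "wrong color"
COLOR = {"i686": 1, "x86_64": 2}  # 32-bit ELF = 1, 64-bit ELF = 2


def install(db, name, arch, elf_paths):
    """Add name.arch to db; the higher color wins each shared ELF file."""
    db[(name, arch)] = {path: NORMAL for path in elf_paths}
    for path in elf_paths:
        # Every installed package carrying this path, including the new one.
        carriers = [pkg for pkg, files in db.items() if path in files]
        winner = max(carriers, key=lambda pkg: COLOR[pkg[1]])
        for pkg in carriers:
            db[pkg][path] = NORMAL if pkg == winner else WRONG_COLOR
    return db


# First transaction: foo.i686 alone, so it owns the file.
db = install({}, "foo", "i686", ["/usr/bin/foo"])
before = db[("foo", "i686")]["/usr/bin/foo"]

# Second transaction: only foo.x86_64, yet foo.i686's state changes.
install(db, "foo", "x86_64", ["/usr/bin/foo"])
after = db[("foo", "i686")]["/usr/bin/foo"]

print(before, "->", after)  # normal -> wrong color
```

Which is why an owner lookup cached after the first transaction is stale
after the second one.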

>>>    And while it's "dynamic" it isn't random, so we could act on the
>>> knowledge of what happens.
>>
>> Maybe, although I can't think of any way to easily find that out inside
>> yum, without duplicating some seriously expensive calculations that rpm
>> does. The whole point of the new state-aware (pseudo) index is to export
>> that data in an "easy" way to API users.
>
>   There are a number of ways we could work around it, from rpm telling us
> which files will go "missing" as part of the transaction ... to just
> ignoring .i686 dup. files when there are .x86_64 packages, and maybe
> some way for rpm to say "something weird happened, drop all file
> caches".
>   On the other side we could fix the file requires lookup APIs in rpm so
> we don't need the cache anyway (I believe Florian was going to look at
> this after the NEVRA+pkgid index went in).

Getting rid of caches (at least the ones that are carried across 
transactions) would be by far the best solution; how it gets 
implemented is another story. I need to take a look at what is 
already possible with the new index iterators that were basically added 
for this very purpose.

	- Panu -

