[Yum-devel] rpm verify in YumInstalledPackages

Panu Matilainen pmatilai at laiskiainen.org
Mon Jan 28 09:22:45 UTC 2008


On Sat, 26 Jan 2008, seth vidal wrote:
> On Sat, 2008-01-26 at 13:18 +0200, Panu Matilainen wrote:
>> Yup... Where you will get into trouble is checksumming, taking prelinking
>> into account and that kind of fun, plus any other special cases I don't
>> recall offhand. Rpmlib knows how to deal with those, and the responsibility
>> for fetching the on-disk info should be rpmlib's, not each and every API
>> user's.
>>
>> I've been thinking about the "verify API" on and off. What I have in mind
>> at the moment is an (rpmfi) method that'll give you a new rpmfi object,
>> populated with the on-disk information. With that, all you have to do is
>> iterate over the header-fi and ondisk-fi objects and compare the data.
>> So verifying a package would look somewhat like this:
>>
>> fi = hdr.fiFromHeader()
>> dfi = fi.onDisk(patterns=[])
>> while ... iterate over both rpmfi objects...:
>>      if fi.group != dfi.group:
>>          problems.append("group mismatch")
>>      if fi.mtime != dfi.mtime:
>>          problems.append("mtime mismatch")
>>      ...
>>
>> This would (should ;) work on both C and Python level pretty much "just
>> like that", without requiring any new data structures or methods to
>> access the data. rpmlib naturally already knows how to fetch all the
>> on-disk info; it just throws away the actual data and gives "modified" vs
>> "not modified" answers. All that's needed is the rpmfiFromDisk()
>> method (on C-level); it should be fairly straightforward to lift the
>> existing verification code and stuff it into rpmfi...
>>
>> There you'd have a very simple to use lowlevel "verification API", on top
>> of which you can then build whatever fancy python verification objects if
>> you wish.
>>
>> Thoughts?
>
> The above makes a lot of sense to me. Especially since a number of the
> fields in the rpmfi tuple are, umm, not obvious to discern. Doing a
> simple comparison would be great. Thanks.

The rpmfi tuple (and a bunch of others, like the one in the depsolve callback)
is just #¤%#¤%¤#, as tuples can't be extended or changed without breaking
every user... Note that you can access the rpmfi data via methods too, e.g.:

fi = h.fiFromHeader()
for x in fi:
     print fi.FN(), fi.FMode(), fi.FFlags()

If you think that's quirky and weird... well I don't disagree ;)
The rpmfi and rpmds "objects" don't have separate iterators at the C level
(unlike db, ts etc. items), and in this case the brokenness is visible
directly in Python too, as rpmfi is only very thinly wrapped for Python.
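
To illustrate what "no separate iterator" means in practice, here is a rough sketch in plain Python with no rpm involved. FileInfoCursor is a made-up stand-in, not the real rpmfi wrapper; it only mimics an object that is its own iterator, so the accessors read whatever the single shared cursor currently points at:

```python
class FileInfoCursor(object):
    """Hypothetical stand-in for rpmfi: the object is its own iterator."""

    def __init__(self, entries):
        self._entries = list(entries)   # (name, mode) pairs
        self._pos = -1                  # shared cursor, -1 means "not started"

    def __iter__(self):
        self._pos = -1                  # starting a loop rewinds the one cursor
        return self

    def __next__(self):
        self._pos += 1
        if self._pos >= len(self._entries):
            raise StopIteration
        return self._entries[self._pos]

    next = __next__                     # Python 2 spelling of the same method

    def FN(self):
        return self._entries[self._pos][0]

    def FMode(self):
        return self._entries[self._pos][1]


fi = FileInfoCursor([
    ("/usr/bin/telnet", 33261),
    ("/usr/share/man/man1/telnet.1.gz", 33188),
])

# The loop variable is ignored; the accessors follow the cursor.
names = []
for x in fi:
    names.append(fi.FN())

# Nested loops share the cursor: the inner loop exhausts it, so the
# outer loop ends after its first item instead of visiting all pairs.
pairs = []
for a in fi:
    for b in fi:
        pairs.append((a[0], b[0]))
```

With two entries you'd expect four pairs from the nested loop, but this design only ever produces two, which is the kind of surprise a separate iterator object would avoid.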

Not to mention other issues, like integer signedness conversion bugs in
the numerical fields. At least FMode() suffers from this, and probably
others do too:

import os

for x in fi:
     print fi.FN(), fi.FMode(), os.stat(fi.FN()).st_mode

/usr/bin/telnet -32275 33261
/usr/share/man/man1/telnet.1.gz -32348 33188

This is already fixed in rpm.org HEAD; for 4.4.x the only option is to add
casts in the bindings (will do...)
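
For what it's worth, the two bogus values above are consistent with a 16-bit unsigned mode being read back as a signed 16-bit integer (33261 - 65536 = -32275). A minimal sketch of that arithmetic in plain Python, no rpm involved; the helper names as_signed16() and as_unsigned16() are made up for illustration, and this is not the actual bindings fix:

```python
import stat
import struct


def as_signed16(value):
    """Reinterpret an unsigned 16-bit value as a signed 16-bit one."""
    return struct.unpack("<h", struct.pack("<H", value))[0]


def as_unsigned16(value):
    """Keep only the low 16 bits of a (possibly negative) integer."""
    return value & 0xFFFF


# st_mode values taken from the os.stat() column of the output above
samples = [
    ("/usr/bin/telnet", 33261),
    ("/usr/share/man/man1/telnet.1.gz", 33188),
]

for path, st_mode in samples:
    broken = as_signed16(st_mode)    # matches the FMode() column above
    fixed = as_unsigned16(broken)    # back to the os.stat() value
    print("%s %d %d %o" % (path, broken, fixed, stat.S_IMODE(fixed)))
```

Until the casts are in, masking the FMode() result with 0xFFFF on the caller's side should recover the real mode, assuming the mode field really is 16 bits wide.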

> Though, if you really want to make me happy by working on some of the
> python bindings I have a hankering for rpmbuild. :)

What I've been thinking of, and actually already started on at some point
before getting side-tracked by a variety of other issues, is:

The Python bindings should be split out of the rpm sources to free up
their development. And rpmbuild should be a separate module from the "core"
bindings, to avoid needlessly dragging librpmbuild.so into things like
installer images.

The current bindings are so tangled in ugly legacy issues, which can't be
changed without breaking half the world, that I'm thinking of developing
a new set of bindings instead: designed from the ground up to be
extensible without breakage, using only public librpm APIs, and
parallel-installable with the current ones to make the transition easier.
Take what's sane in the current bindings, scrap and redesign the rest,
and once it's in a usable state, mark the in-rpm bindings deprecated (but
leave them around for, sigh, legacy support). Implemented in C where
necessary and/or where it makes sense, otherwise in Python.

Since build bindings don't exist ATM, there are no legacy holdups... so 
all that's needed is a design and then just do it ;) yum-devel is not the 
best place to discuss that though... :)

 	- Panu -


