[Yum-devel] [PATCH] Fix /var/lib/rpm/Packages mtime race. BZ 973375

Panu Matilainen pmatilai at laiskiainen.org
Thu Jun 20 06:50:29 UTC 2013


On 06/19/2013 06:29 PM, James Antill wrote:
> On Wed, 2013-06-19 at 06:15 -0400, Zdenek Pavlas wrote:
>>> So, again, I'm heavily inclined to just say "stop
>>> doing that" unless we really need to come up with some workaround.
>>
>> Came up with this..  We can prefix rpmdb version with h_nelem
>> loaded from the Berkeley DB header.  This way, most additions
>> and removals should be detected.  But yes, it's ugly.
>>
>> diff --git a/yum/rpmsack.py b/yum/rpmsack.py
>> index 56c3793..96f4fc1 100644
>> --- a/yum/rpmsack.py
>> +++ b/yum/rpmsack.py
>> @@ -31,6 +31,7 @@ from packageSack import PackageSackBase, PackageSackVersion
>>   # For returnPackages(patterns=)
>>   import fnmatch
>>   import re
>> +import struct
>>
>>   from yum.i18n import to_unicode, _
>>   import constants
>> @@ -1157,7 +1158,10 @@ class RPMDBPackageSack(PackageSackBase):
>>                   if fo is None:
>>                       return None
>>                   rpmdbv = fo.readline()[:-1]
>> -                self._have_cached_rpmdbv_data  = rpmdbv
>> +                rpmdbv_nrec, rpmdbv = rpmdbv.split(':', 1)
>> +                nrec = struct.unpack('<88xI', open(rpmdbfname).read(92))[0]
>
>   If Panu signs off on this, I don't mind ... but please put the nrec
> getter line in a function :).

I would've hoped you don't need me to tell you that making decisions
based on stuff read out of somebody else's private, undocumented file
format at an offset that seems to contain something useful is not a good
idea.

The format of /var/lib/rpm/Packages is private to rpm, and yum has no
business poking into it directly. That aside, the exact underlying
format of BDB databases is private to BDB and can and does change every
now and then, so it depends on the BDB version rpm was linked against.

Also, h_nelem does not represent the number of entries in the database;
it's the *estimated* size of the *hash table*:

http://docs.oracle.com/cd/E17076_03/html/api_reference/C/dbget_h_nelem.html
http://docs.oracle.com/cd/E17076_03/html/api_reference/C/dbset_h_nelem.html

If there were a fast (i.e. without having to walk the entire db), reliable
way of pulling the number of entries in the database, rpm would export
it. Hysterical as it is, BDB does not have one, at least for the hash
database.

For something that tells you whether the contents have *really* changed
and is reasonably fast, calculate a hash/checksum from one of the
indexes. This is probably huge overkill for what you need, but just as
an example:

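import hashlib
import rpm

ts = rpm.ts()
h = hashlib.sha1()
# Iterate over the keys of the sha1header index.
ii = ts.dbIndex('sha1header')
for s in ii:
    h.update(s)
    # Mix in the db offset of every header instance indexed under this key.
    for (dboffset, dbfileno) in ii.instances():
        h.update('%s' % dboffset)
print h.hexdigest() # profit!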

For a more lightweight version you could just calculate a numeric hash
from the dboffset values (on e.g. the sha1header or name index), which will
change whenever packages are removed or installed or --rebuilddb is used
(none of which the number of packages will reliably indicate).
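
A rough sketch of that lightweight variant, just to make the idea
concrete. The helper names offsets_checksum() and rpmdb_offsets() are made
up for illustration, and using CRC32 over sorted, packed offsets is only
one possible choice of "numeric hash", not something rpm or yum prescribes.
The rpm calls are the same ones used in the example above (ts.dbIndex()
and ii.instances()); everything else is an assumption.

```python
import struct
import zlib


def offsets_checksum(offsets):
    """Fold an iterable of integer db offsets into one 32-bit number.

    Offsets are sorted first so the result does not depend on the order
    in which the index happens to return its keys (an assumption made
    here for stability, not an rpm requirement).
    """
    crc = 0
    for off in sorted(offsets):
        # Pack each offset as an unsigned 64-bit little-endian value and
        # feed it into a running CRC32.
        crc = zlib.crc32(struct.pack('<Q', off), crc)
    return crc & 0xffffffff


def rpmdb_offsets(tag='name'):
    """Collect header offsets from one rpm index (hypothetical helper).

    Requires the rpm Python bindings; imported lazily so that
    offsets_checksum() stays usable without them.
    """
    import rpm
    ts = rpm.ts()
    ii = ts.dbIndex(tag)
    offsets = []
    for _key in ii:
        for (dboffset, _dbfileno) in ii.instances():
            offsets.append(dboffset)
    return offsets
```

Usage would be along the lines of offsets_checksum(rpmdb_offsets('name')),
with the result stored next to the cached rpmdb version and compared on
the next run.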

	- Panu -


