[Yum-devel] [PATCH] Fix /var/lib/rpm/Packages mtime race. BZ 973375
Panu Matilainen
pmatilai at laiskiainen.org
Thu Jun 20 06:50:29 UTC 2013
On 06/19/2013 06:29 PM, James Antill wrote:
> On Wed, 2013-06-19 at 06:15 -0400, Zdenek Pavlas wrote:
>>> So, again, I'm heavily inclined to just say "stop
>>> doing that" unless we really need to come up with some workaround.
>>
>> Came up with this.. We can prefix rpmdb version with h_nelem
>> loaded from the Berkeley DB header. This way, most additions
>> and removals should be detected. But yes, it's ugly.
>>
>> diff --git a/yum/rpmsack.py b/yum/rpmsack.py
>> index 56c3793..96f4fc1 100644
>> --- a/yum/rpmsack.py
>> +++ b/yum/rpmsack.py
>> @@ -31,6 +31,7 @@ from packageSack import PackageSackBase, PackageSackVersion
>> # For returnPackages(patterns=)
>> import fnmatch
>> import re
>> +import struct
>>
>> from yum.i18n import to_unicode, _
>> import constants
>> @@ -1157,7 +1158,10 @@ class RPMDBPackageSack(PackageSackBase):
>> if fo is None:
>> return None
>> rpmdbv = fo.readline()[:-1]
>> - self._have_cached_rpmdbv_data = rpmdbv
>> + rpmdbv_nrec, rpmdbv = rpmdbv.split(':', 1)
>> + nrec = struct.unpack('<88xI', open(rpmdbfname).read(92))[0]
>
> If Panu signs off on this, I don't mind ... but please put the nrec
> getter line in a function :).
I would've hoped you don't need me to tell you that making decisions
based on stuff read out of somebody else's private, undocumented file
format, at an offset that seems to contain something useful, is not a
good idea.
The format of /var/lib/rpm/Packages is private to rpm and yum has no
business poking into it directly. That aside, the exact underlying
format of BDB databases is private to BDB and can and does change every
now and then so it depends on the version rpm was linked against.
Also, h_nelem does not represent the number of entries in the database;
it's the *estimated* size of the *hash table*:
http://docs.oracle.com/cd/E17076_03/html/api_reference/C/dbget_h_nelem.html
http://docs.oracle.com/cd/E17076_03/html/api_reference/C/dbset_h_nelem.html
If there were a fast (i.e. without having to walk the entire db), reliable
way of pulling the number of entries in the database, rpm would export
it. Hysterical as it is, BDB does not have one, at least for the hash
database.
For something that tells you whether the contents have *really* changed
and is reasonably fast, calculate a hash/checksum from one of the
indexes. This is probably a huge overkill for what you need, but just as
an example:
    import hashlib
    import rpm

    ts = rpm.ts()
    h = hashlib.sha1()
    ii = ts.dbIndex('sha1header')
    for s in ii:
        # hash the index key, then the offsets of the headers it points to
        h.update(s)
        for (dboffset, dbfileno) in ii.instances():
            h.update('%s' % dboffset)
    print h.hexdigest() # profit!
For a more lightweight version you could just calculate a numeric hash
from the dboffset values (on e.g. the sha1header or name index), which
will change whenever packages are removed or installed or --rebuilddb is
used (none of which the number of packages will reliably indicate).
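
A rough sketch of that lightweight variant, just to illustrate the idea.
The function names and the choice of CRC32 are my own assumptions, not
anything rpm or yum provides; only ts.dbIndex() and ii.instances() are
taken from the example above. The checksum part is kept separate from
the rpm part so it can be exercised without an rpmdb around:

```python
import zlib


def offsets_checksum(offsets):
    """Fold an iterable of integer db offsets into one 32-bit number.

    Hypothetical helper: offsets are sorted first so the result does not
    depend on index iteration order, and each one is newline-terminated
    so that e.g. [1, 23] and [12, 3] do not feed identical bytes.
    """
    crc = 0
    for off in sorted(offsets):
        crc = zlib.crc32(("%d\n" % off).encode("ascii"), crc)
    return crc & 0xffffffff


def rpmdb_offsets(index="name"):
    """Yield the dboffset of every entry reachable through one rpmdb index.

    Hypothetical helper: walks the index the same way as the sha1 example
    above. rpm is imported lazily so offsets_checksum() works without it.
    """
    import rpm
    ii = rpm.ts().dbIndex(index)
    for _key in ii:
        for (dboffset, _dbfileno) in ii.instances():
            yield dboffset
```

Usage would be something like offsets_checksum(rpmdb_offsets('name')),
compared against the value stored alongside the cache. A 32-bit CRC can
collide, so treat it as a cheap change detector and nothing stronger.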
- Panu -