[Rpm-metadata] Request-For-Ideas: requires statistics and thoughts on making our metadata smaller

Seth Vidal skvidal at fedoraproject.org
Fri Oct 30 21:00:36 UTC 2009


I'm not entirely sure why I started looking at this, but I started looking 
at the number of Requires in rawhide (i686) and then at what provided 
those Requires most of the time.

Summary version:
211011 Requires in rawhide
71359 are provided by glibc.

8165 packages provide all the requirements for all 23823 pkgs in the 
distro.
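
For scale, the ratios implied by those numbers can be checked with a few 
lines of Python. This is only arithmetic on the figures quoted above; the 
variable names are made up for the example.

```python
# Figures quoted in the summary above (rawhide, i686).
total_requires = 211011
glibc_requires = 71359
providing_pkgs = 8165
total_pkgs = 23823

# Share of all Requires that are satisfied by glibc.
glibc_share = glibc_requires / total_requires
# Share of packages in the distro that provide something another pkg requires.
provider_share = providing_pkgs / total_pkgs

print(f"glibc satisfies {glibc_share:.1%} of all Requires")    # 33.8%
print(f"{provider_share:.1%} of packages act as providers")    # 34.3%
```

So roughly one Requires in three points at glibc.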

The top 20 requirements and the top 20 providing packages are here:
http://skvidal.fedorapeople.org/misc/top-20-requires-and-providers.txt

For any pkg which has a Requires that is provided by glibc, on average 
that package has 7 more Requires that are provided by glibc.
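
A rough sketch of how a per-provider count and an average like that can be 
computed, using a toy mapping rather than a real repo. The data and the 
names `requires_by_pkg`, `provider_of`, `provider_counts` and 
`mean_requires_from` are all hypothetical; this is not the 
requires-frequency.py script itself.

```python
from collections import Counter

# Toy data, made up for illustration: package -> list of Requires strings.
requires_by_pkg = {
    "foo": ["libc.so.6", "libc.so.6(GLIBC_2.0)", "libm.so.6", "libz.so.1"],
    "bar": ["libc.so.6", "rtld(GNU_HASH)"],
    "baz": ["python(abi)"],
}
# Toy data: Requires string -> the package that provides it.
provider_of = {
    "libc.so.6": "glibc",
    "libc.so.6(GLIBC_2.0)": "glibc",
    "libm.so.6": "glibc",
    "rtld(GNU_HASH)": "glibc",
    "libz.so.1": "zlib",
    "python(abi)": "python",
}

def provider_counts(requires_by_pkg, provider_of):
    """Count how many Requires across all packages each provider satisfies."""
    counts = Counter()
    for reqs in requires_by_pkg.values():
        for req in reqs:
            counts[provider_of[req]] += 1
    return counts

def mean_requires_from(provider, requires_by_pkg, provider_of):
    """Average number of Requires satisfied by `provider`, taken only over
    packages that have at least one such Requires."""
    per_pkg = [
        sum(1 for req in reqs if provider_of[req] == provider)
        for reqs in requires_by_pkg.values()
    ]
    per_pkg = [n for n in per_pkg if n > 0]
    return sum(per_pkg) / len(per_pkg) if per_pkg else 0.0

counts = provider_counts(requires_by_pkg, provider_of)
glibc_mean = mean_requires_from("glibc", requires_by_pkg, provider_of)
print(counts.most_common(1))  # [('glibc', 5)]
print(glibc_mean)             # 2.5
```

In the toy data, "foo" has 3 glibc-provided Requires and "bar" has 2, so the 
average over packages that touch glibc at all is 2.5.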

What this means for our metadata is that if we can find a way to reduce 
how many duplicate glibc requirements we store, in the pkgs and/or in the 
repodata, we can trim down our repodata size by a fairly good amount.
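
To make the idea concrete, here is one hypothetical way to collapse 
duplicates: keep at most one Requires per providing package. It is only a 
sketch for estimating how many entries would go away, with invented names 
(`collapse_by_provider`, `provider_of`) and toy data. It ignores versioned 
requirements entirely and is not a proposal for the actual repodata format.

```python
def collapse_by_provider(requires, provider_of):
    """Keep the first Requires seen for each providing package, in order."""
    seen = set()
    kept = []
    for req in requires:
        provider = provider_of[req]
        if provider not in seen:
            seen.add(provider)
            kept.append(req)
    return kept

# Toy data, made up for illustration.
provider_of = {
    "libc.so.6": "glibc",
    "libc.so.6(GLIBC_2.0)": "glibc",
    "libm.so.6": "glibc",
    "libz.so.1": "zlib",
}
requires = ["libc.so.6", "libc.so.6(GLIBC_2.0)", "libm.so.6", "libz.so.1"]

kept = collapse_by_provider(requires, provider_of)
dropped = len(requires) - len(kept)
print(kept)     # ['libc.so.6', 'libz.so.1']
print(dropped)  # 2
```

Whether dropping the extra entries is actually safe depends on what the 
depsolver needs from them, which this sketch does not address.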


Run this script to see for yourself:

http://skvidal.fedorapeople.org/misc/requires-frequency.py

I think there are some reasonable assumptions we can make in our repodata 
which might reduce both the size of the metadata we have to transfer and 
the number of items we have to traverse.

I'm curious what folks think as to how we might be able to make this 
better.

It is worth noting that the major rpm-based distros all use the same 
naming for their 'glibc' pkg.

-sv


