[Yum-devel] group DB

Florian Festi ffesti at redhat.com
Wed Jun 3 15:33:00 UTC 2009


Hi everybody!

I very much appreciate the effort to improve the handling of groups in 
yum. Nevertheless, I think this effort is too short-sighted. The 
mechanisms we use to handle packages still date back to the time when 
Fedora had 1500 packages. We've seen growth (in number of packages) by 
a factor of ten since then, and I expect another factor of ten soonish. I 
believe it is time to sit back and rethink the whole package selection 
topic from the beginning. And while better "group" (whatever this 
means) handling will be part of it, it surely will not be sufficient to 
solve the problems we are currently running into.

Right now we mix several things in the comps groups:

1. Grouping similar things so users can select the ones they like (Office 
applications, Browsers, Games, ...)
2. Preselected installation patterns with one of each kind (GNOME, KDE)
3. Offering related packages that the user might also be interested in
4. Hiding packages that the user is most likely not interested in (libs, 
system core)
5. Adding a more powerful selection layer (conditionals - mainly for 
language support) (anaconda only)

In addition to mixing things up, comps is not scaling with the number of 
packages.
Right now we only have 64 groups (not counting the language 
groups). This is simply not enough to structure 10k packages and surely 
not enough for 100k. So IMHO we need to think about how we can further 
decentralize the grouping process and encourage people to take care of 
special application domains.
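
To put rough numbers on the scaling problem, here is a small sketch. It 
assumes, purely for illustration, that every package sits in exactly one 
of the 64 groups; in reality many packages are in no group and some are 
in several, so treat the result as an order of magnitude only.

```python
# Average packages per group, assuming (hypothetically) that every
# package belongs to exactly one of the 64 non-language comps groups.
GROUPS = 64

def avg_packages_per_group(total_packages, groups=GROUPS):
    """Return the mean group size for a given package count."""
    return total_packages / groups

for total in (1500, 10_000, 100_000):
    size = avg_packages_per_group(total)
    print(f"{total:>7} packages -> about {size:.0f} per group")
```

Under that assumption a group grows from a couple of dozen packages to 
well over a thousand, which is far more than anyone can sensibly browse.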

Another area of great pain is language support. There are hundreds of 
packages waiting to be split up by language, if only there were a way to 
handle the huge number of new subpackages (yes, we'll reach 100k 
packages much faster if this happens).

Another thing that has been silently shifting is the ratio between the 
typical installation and the distribution. While it used to be 
(guessing) 800/1500 packages, it is now 1500/15000. The burden of the 
metadata is getting heavier and heavier compared to the actual install 
data. This will probably not become critical soon, but it is something 
we should keep an eye on.
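
As a quick back-of-the-envelope check of that shift, using the guessed 
figures from the paragraph above (they are estimates, not measurements):

```python
# Share of the distribution that a typical installation actually uses.
# The package counts are the rough guesses quoted above, not measured data.
def installed_share(installed, available):
    """Return the fraction of available packages that are installed."""
    return installed / available

then = installed_share(800, 1500)
now = installed_share(1500, 15000)
print(f"then: {then:.0%} of the distribution installed, now: {now:.0%}")
```

So a typical system used to install roughly half of what the metadata 
describes and now installs about a tenth of it.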

But back to package selection: IMHO we need to reinvent and 
reimplement package selection from scratch within the next two years 
- the earlier the better. The first step should be collecting all the use 
cases and finding out what level of control is desirable for the 
different pieces of metadata.

Florian



