[Yum-devel] Requesting feedback on some plans for yum

Seth Vidal skvidal at fedoraproject.org
Mon Jun 7 15:08:27 UTC 2010



On Sun, 6 Jun 2010, Hedayat Vatankhah wrote:
> 
> hmmm... it needs thinking about what information will help users. An example would be some metadata about
> the package type (GUI application, console application, etc for normal users, -devel packages for
> developers, other use cases). For example, IMHO people rarely look for library packages (excluding -devel
> packages): they usually need applications, or (e.g. developers) are interested in library development
> packages.
> This is an important area for thinking by itself.

I would encourage you to not just think about users when you think about 
the metadata - a lot of applications use this metadata so working on the 
basis that only users will need to know about certain things is going to 
be overly limiting.

>
>       Now - with those out of the way let me suggest a few specific tasks:
>
>       1. change the filelist metadata - break the dirs up by paths so if we know the file is in
>       /usr/lib - we don't have to download all of /usr/share to get it. You can do some good
>       statistical analysis of all the files in fedora rawhide, for example, and figure out the best
>       way to break up the filelists into smaller chunks
> 
> Yes, it certainly needs some statistical analysis. I was thinking that the first letters (e.g. first two
> letters) of the hash of each file path might bring better results and distribute the files better among
> different chunks.

You'll end up with a massive number of files in /usr, since that is where, 
I'd bet, 90% of the files are.
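
To make the two bucketing ideas concrete, here is a minimal sketch that 
assigns paths to chunks either by leading directory or by the first two 
hex characters of a hash of the path. The function names, the choice of 
SHA-256, and the sample paths are all assumptions for illustration, not 
anything yum or createrepo does today.

```python
import hashlib
from collections import Counter


def top_dir_bucket(path, depth=1):
    """Chunk name based on the first `depth` path components."""
    parts = path.strip("/").split("/")
    return "/" + "/".join(parts[:depth])


def hash_bucket(path, prefix_len=2):
    """Chunk name based on the first hex characters of the path's SHA-256."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return digest[:prefix_len]


# Made-up sample paths, only to exercise the two functions.
paths = [
    "/usr/bin/yum",
    "/usr/lib/python2.6/site-packages/yum/__init__.py",
    "/usr/share/doc/yum/README",
    "/usr/share/man/man8/yum.8.gz",
    "/etc/yum.conf",
    "/var/cache/yum",
]

by_dir = Counter(top_dir_bucket(p) for p in paths)
by_hash = Counter(hash_bucket(p) for p in paths)

print("by directory:", by_dir.most_common())
print("by hash prefix:", by_hash.most_common())
```

Hash buckets should spread the entries more evenly, but a client can only 
pick the right chunk when it knows the exact path; a directory-based split 
keeps prefix lookups possible at the cost of a lopsided /usr chunk.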

> This is more like what I'm thinking about (this is only the server side layout, as mentioned in the blog
> post):
>    repodata/
>         repomd.xml
>         packagelist.sqlite  <-- nevra + required checksums + some other really needed data
>         info/        <-- package summaries and descriptions

You have to provide some sort of index file for these so we can provide a 
checksum for that index file in repomd.xml. That way we can verify and 
rely on the results.
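
A rough sketch of that chain of trust, assuming a hypothetical plain-text 
index with one "checksum  name" line per file and SHA-256 throughout 
(neither the format nor the function names come from an existing tool): 
repomd.xml vouches for the index, and the index vouches for each small 
file.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def build_index(files):
    """Build index text from a dict of {file name: content bytes}."""
    lines = ["%s  %s" % (sha256_hex(content), name)
             for name, content in sorted(files.items())]
    return "\n".join(lines) + "\n"


def parse_index(index_text):
    """Return {file name: checksum} parsed from the index text."""
    entries = {}
    for line in index_text.splitlines():
        checksum, name = line.split("  ", 1)
        entries[name] = checksum
    return entries


def verify_file(name, content, index_text, index_checksum_from_repomd):
    """Check the index against repomd.xml, then the file against the index."""
    if sha256_hex(index_text.encode("utf-8")) != index_checksum_from_repomd:
        return False  # the index itself cannot be trusted
    expected = parse_index(index_text).get(name)
    return expected is not None and sha256_hex(content) == expected


# Hypothetical per-package info files.
files = {"foo-1.0-1.info": b"summary: foo", "bar-2.1-3.info": b"summary: bar"}
index_text = build_index(files)
repomd_checksum = sha256_hex(index_text.encode("utf-8"))  # would live in repomd.xml

print(verify_file("foo-1.0-1.info", files["foo-1.0-1.info"],
                  index_text, repomd_checksum))
```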

>         provides.sqlite  <-- provides. (might be split into smaller parts like file lists if can grow)

provides aren't that big, really.
For all of rawhide, uncompressed, they are 2.8M, total. 105296 entries for 
17073 pkgs.
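
Working those numbers through, assuming "2.8M" means 2.8 MiB, gives 
roughly six provides per package and under 30 bytes per entry:

```python
entries = 105296
packages = 17073
size_bytes = 2.8 * 1024 * 1024  # assumption: "2.8M" read as 2.8 MiB

provides_per_pkg = entries / packages
bytes_per_entry = size_bytes / entries

print("%.1f provides per package, ~%.0f bytes per entry"
      % (provides_per_pkg, bytes_per_entry))
```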

>         requires/     <-- (might also contain conflicts/obsoletes if they are usually required at the same
> time)
>             package_1_full_name.requires 
>             ...
>         conflicts_obsoletes/    <-- if not merged in the requirements files
>             package_1_full_name.confobs

Something MAYBE worth doing is this - for each pkg - in packagelist.sqlite 
- only mention if they have any obsoletes/conflicts - that way we can do a 
shorthand lookup to see if we even need to bother fetching those other 
files.
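
Something like the following, as a sketch only: the table layout, the 
column names, and the sample packages are hypothetical, since 
packagelist.sqlite does not exist yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages ("
    " nevra TEXT PRIMARY KEY,"
    " has_obsoletes INTEGER NOT NULL DEFAULT 0,"
    " has_conflicts INTEGER NOT NULL DEFAULT 0)"
)
# Made-up rows: only 'bar' declares any obsoletes.
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("foo-0:1.0-1.fc14.x86_64", 0, 0),
        ("bar-0:2.1-3.fc14.noarch", 1, 0),
    ],
)


def needs_confobs_fetch(conn, nevra):
    """True only if the package is flagged as having obsoletes or conflicts."""
    row = conn.execute(
        "SELECT has_obsoletes OR has_conflicts FROM packages WHERE nevra = ?",
        (nevra,),
    ).fetchone()
    return bool(row[0]) if row is not None else False


for nevra in ("foo-0:1.0-1.fc14.x86_64", "bar-0:2.1-3.fc14.noarch"):
    print(nevra, needs_confobs_fetch(conn, nevra))
```

With flags like these the client skips the per-package .confobs download 
for every package where both columns are 0.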

              
>         ...
>         filelists.xml <-- index file to point to the files-by-path
>         filelists/
>                 ?!   <-- depends on the way of splitting file lists, TBD

I'm dumping out a list of all files in rawhide and I'll see if I can 
generate some statistics by dir and post them.
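
The per-directory counting could look roughly like this. It is a sketch 
that assumes the dump is just a list of absolute file paths; the function 
names and the sample input are made up.

```python
from collections import Counter


def dir_prefix(path, depth):
    """Directory holding `path`, truncated to at most `depth` components."""
    dirs = path.strip("/").split("/")[:-1]  # drop the file name itself
    return "/" + "/".join(dirs[:depth])


def stats_by_dir(paths, depth=2):
    """Return (directory, file count, percent of total), largest first."""
    counts = Counter(dir_prefix(p, depth) for p in paths)
    total = sum(counts.values())
    return [(d, n, 100.0 * n / total) for d, n in counts.most_common()]


# Made-up sample; the real input would be the full rawhide file list.
sample = ["/usr/bin/a", "/usr/bin/b", "/usr/share/doc/x", "/etc/y"]
for directory, count, percent in stats_by_dir(sample):
    print("%-12s %5d %6.1f%%" % (directory, count, percent))
```

Running it at a few different depths should show where the split points 
for the filelists chunks would have to go.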


-sv

