[Rpm-metadata] two other areas needed

Jeff Licquia licquia at progeny.com
Wed Oct 8 21:19:50 UTC 2003

On Tue, 2003-10-07 at 17:13, Adrian Likins wrote:
> On Sat, Oct 04, 2003 at 04:01:42AM -0400, seth vidal wrote:
> >  handful of files idea - this is adrian's - the idea is to have 3 or 4
> > files which house all the data. The first file maybe lists the
> > channels/repositories and checksums on them - that way if that file has
> > changed you know if you need to get the others. The second is the file I
> > posted a little bit ago - the main package information file. The third
> > is a file containing the complete list of all the files for every
> > package. 
> 	I was thinking that file #2 would probably only be
> something like:
> name version release epoch arch size headersize [url]
> for each package in the channel/repo/dir/whatever
> (url in brackets since it wouldn't be needed if you stick
> all the rpms in the same dir, but adding it would be
> theoretically more flexible).

I've been chewing on this for a bit.  It's an interesting proposal.

What I wonder about with the scheme as described is that you're
extremely limited in specifying per-package metadata.  If it's not in
the RPM header, you've got to stuff it into a file with a rather brittle
syntax that will be difficult to extend, especially as you consider the
number of package managers you're dealing with.
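
To make the brittleness concrete, here is a minimal sketch of a parser for
the one-line-per-package format quoted above.  The field order comes from
Adrian's description; the names PackageEntry and parse_entry, the
whitespace splitting, and the integer types for size and headersize are
assumptions made for illustration, not part of any proposal.

```python
from typing import NamedTuple, Optional


class PackageEntry(NamedTuple):
    """One line of the hypothetical flat index (field order from the thread)."""
    name: str
    version: str
    release: str
    epoch: str
    arch: str
    size: int
    headersize: int
    url: Optional[str] = None


def parse_entry(line: str) -> PackageEntry:
    """Parse 'name version release epoch arch size headersize [url]'.

    Assumption: fields are whitespace-separated and contain no spaces.
    """
    fields = line.split()
    if len(fields) not in (7, 8):
        # Any extra column is an error: the format has no room to grow.
        raise ValueError(f"expected 7 or 8 fields, got {len(fields)}")
    name, version, release, epoch, arch, size, headersize = fields[:7]
    url = fields[7] if len(fields) == 8 else None
    return PackageEntry(name, version, release, epoch, arch,
                        int(size), int(headersize), url)
```

With purely positional fields, a producer that appends a new column is
either rejected outright or, if the URL was omitted, has its new value
silently read as the URL.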

To my mind, that's one of the strengths of the XML file Seth posted: you
can extend it when you need to without breaking other people's stuff. 
Of course, if you rely on the RPM header for most of your metadata, the
index file doesn't have to be as comprehensive.

From an apt perspective, Seth's file also matches current practice a bit

> For an update only case, the win is you grab maybe
> 20k of data, see what files you want to update,
> grab the needed headers, solve deps, etc. Then fetch
> the rest of the packages (skipping the header, since
> you already have it). 

If we're already talking about byte ranges, then a monolithic file can
accommodate a similar update strategy as well.  Publish a file with name,
version, start offset, and end offset alongside the metadata file.  The
downloader fetches this file first, figures out which package metadata has
changed, grabs just that metadata via byte ranges, and reconstructs the

Personally, I see disadvantages in only downloading headers you care
about.  But software that works this way could do the same thing, by
skipping the byte ranges of packages not installed on the system.
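
The byte-range strategy above could be sketched roughly as follows.  This
is only an illustration under stated assumptions: the offset file is taken
to hold one "name version start end" line per package, end offsets are
treated as inclusive (as in an HTTP Range header), and the function names
parse_index, changed_ranges, and range_header are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Maps package name -> (version, start offset, end offset).
Index = Dict[str, Tuple[str, int, int]]


def parse_index(text: str) -> Index:
    """Parse the hypothetical offset file: 'name version start end' per line."""
    index: Index = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, version, start, end = line.split()
        index[name] = (version, int(start), int(end))
    return index


def changed_ranges(old: Index, new: Index,
                   installed: Optional[Set[str]] = None) -> List[Tuple[int, int]]:
    """Return byte ranges in the new metadata file that need downloading.

    A package counts as changed if it is new or its version differs.
    If 'installed' is given, packages not in that set are skipped.
    """
    ranges = []
    for name, (version, start, end) in new.items():
        if installed is not None and name not in installed:
            continue
        previous = old.get(name)
        if previous is None or previous[0] != version:
            ranges.append((start, end))
    return sorted(ranges)


def range_header(ranges: List[Tuple[int, int]]) -> str:
    """Format the ranges as an HTTP Range header value."""
    return "bytes=" + ",".join(f"{start}-{end}" for start, end in ranges)
```

Passing the set of installed package names as 'installed' gives the
"only what I care about" behaviour; leaving it out fetches every changed
entry.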
