[Yum-devel] [RFC] Questions about sqlitesack and list vs. generators

James Antill james.antill at redhat.com
Mon Dec 10 20:09:29 UTC 2007


On Mon, 2007-12-10 at 14:41 -0500, seth vidal wrote:
> On Mon, 2007-12-10 at 14:12 -0500, James Antill wrote:
> > A couple of questions about sqlite functions and usage:
> > 
> > 1. Does anyone know why returnPackages() re-runs the _excluded() method
> > on cached content? AFAICS we just redo excluding for all pkgs for
> > nothing.
> > 
> 
> I think it is b/c we can update the exclude list whenever and from
> wherever. It's not just a one time event, necessarily.

 Hmm, ok. It's worth noting, though, that the current simplePkgList() only
runs the exclude check when the data is first generated.
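
 To make that difference concrete, here is a minimal sketch. ToySack and its
attributes are hypothetical stand-ins, not yum's real sqlitesack code: the
only behaviour taken from this thread is that returnPackages() re-runs
_excluded() on cached content while simplePkgList() filters once.

```python
class ToySack:
    """Hypothetical stand-in for a package sack; not yum's actual class."""

    def __init__(self, names):
        self._names = list(names)
        self.excludes = set()       # may be updated at any time by callers
        self._pkg_cache = None      # cached package list
        self._simple_cache = None   # cached "simple" list

    def _excluded(self, name):
        return name in self.excludes

    def returnPackages(self):
        # The data is cached once, but the exclude filter runs on every call,
        # so later changes to self.excludes are always honoured.
        if self._pkg_cache is None:
            self._pkg_cache = list(self._names)
        return [n for n in self._pkg_cache if not self._excluded(n)]

    def simplePkgList(self):
        # The exclude filter only runs when the list is first generated,
        # so later changes to self.excludes are not seen.
        if self._simple_cache is None:
            self._simple_cache = [n for n in self._names
                                  if not self._excluded(n)]
        return self._simple_cache
```

 If an exclude is added after both lists have been built once, the two
methods disagree in this toy model, which is the inconsistency noted above.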

> > 2. AFAICS simplePkgList would be better off written like:
> > 
> >     def simplePkgList(self):
> >         """returns a list of pkg tuples (n, a, e, v, r) from the sack"""
> > 
> >         simplelist = []
> >         for pkg in self.returnPackages():
> >             simplelist.append((pkg.name, pkg.arch, pkg.epoch, pkg.version, pkg.release))
> >         return simplelist
> > 
> > ...this saves about half a second (20-25%) for pretty much all the
> > commands, as the above loop is instant in comparison to the executeSQL()
> > version[1].
> 
> is that before or after the pkglist has been made?

 Atm. I just replaced simplePkgList() with the above and commented its call
out of buildIndexes(), then measured CPU/real times before and after for
"list foo" and "search foo". So it's after the pkglist has been made. AFAICT
it's all sqlite overhead.
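
 For anyone who wants to try a comparison like this outside of yum, a rough
sketch follows. Everything in it is an assumption for illustration: the
table layout, the Pkg namedtuple and the helper names are made up, it uses
an in-memory database rather than a real primary.sqlite, and it is written
for current Python rather than the Python yum ran on at the time. It only
shows the shape of the measurement, not the numbers quoted above.

```python
import sqlite3
import time
from collections import namedtuple

# Hypothetical package object; yum's real package objects are richer.
Pkg = namedtuple("Pkg", "name arch epoch version release")


def build_db(n):
    """Create an in-memory table with n made-up package rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE packages "
        "(name TEXT, arch TEXT, epoch TEXT, version TEXT, release TEXT)"
    )
    conn.executemany(
        "INSERT INTO packages VALUES (?, ?, ?, ?, ?)",
        (("pkg%d" % i, "noarch", "0", "1.0", "1") for i in range(n)),
    )
    return conn


def simple_list_sql(conn):
    """Build the (n, a, e, v, r) list by scanning the whole table."""
    cur = conn.execute(
        "SELECT name, arch, epoch, version, release "
        "FROM packages ORDER BY rowid"
    )
    return [tuple(row) for row in cur]


def simple_list_cached(pkgs):
    """Build the same list from package objects already held in memory."""
    return [(p.name, p.arch, p.epoch, p.version, p.release) for p in pkgs]


def timed(fn, *args):
    """Return (result, elapsed wall-clock seconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

 Calling timed(simple_list_sql, conn) and timed(simple_list_cached, pkgs) on
the same data gives two wall-clock figures to compare; how large the gap is
will depend on the row count and the machine.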

> > [1] The main Fedora repo. is the main problem here, due to all the
> > items ... and given how little that changes I'm almost tempted to try
> > putting another layer of caching in there just for that, almost.
> 
> The smarter move in fedora would be to set the metadata timeout for the
> primary repository to something much higher so it is checked for much
> less often. Since it doesn't change much (if ever) setting it to a 2 day
> value probably wouldn't hurt anyone.

 Yeh, that's a good idea. Whacking the cache up in fedora-release would
be pretty useful, I think.

 Although I wasn't talking about just network caching. This is purely
local parsing overhead: we hit that repo. and get a list of packages for
basically every operation, and it takes ~0.5 sec each time.
 I know Florian looked at things we could do to speed up sqlite
operations, so maybe he knows something we could do, but that probably
wasn't for scans of entire tables.
 But as I said, it's _almost_ enough for me to look at doing something
atm. :)

-- 
James Antill <james.antill at redhat.com>
Red Hat