[Yum-devel] [RFC] dbversion 10

seth vidal skvidal at linux.duke.edu
Tue Apr 10 20:24:06 UTC 2007


On Tue, 2007-04-10 at 16:11 -0400, Jeremy Katz wrote:
> On Tue, 2007-04-10 at 11:43 -0400, James Bowes wrote:
> > Jeremy Katz wrote:
> > > On Mon, 2007-04-09 at 18:41 -0400, James Bowes wrote:
> > >> So here, for your amusement and subject to your general mockery, are two
> > >> patches that modify the database in such a way as to break compatibility
> > >> with existing code.
> > > [snip]
> > >> Some things that could be done but probably shouldn't:
> > >> - Convert the values from ints to strings as they are pulled out of the
> > >> database. I'd rather not do this for performance reasons, but it would
> > >> mean that the API is not broken (see epoch related code in the yum patch)
> > > 
> > > The advantage, though, is that the API is not broken.  I think this is
> > > pretty compelling, even if it's at the cost of some performance.  As
> > > more and more people build tools on top of the API, we have to be more
> > > and more aware of this.  Because if we're constantly changing the API,
> > > then the value of the yum API drops substantially.
> > > 
> > > It might be interesting to see what the performance difference is if we
> > > just leave epoch as a string rather than an int and do the rest of the
> > > changes.  I have a hunch that it wouldn't be that different.  And API
> > > consistency is then far less painful.
> > 
> > I can run some numbers, unless someone else wants to do it. I feel that
> > representing numbers as numbers is the more correct thing, but we can
> > always do that later.
> 
> Yeah, but if we have to use it as a string in the code, then it's
> probably going to be faster to just keep it as a string in the db.  
> 
> > >> - Store the pkgId as raw data rather than a printable string. This would
> > >> save 20 bytes per pkgId, but pysqlite makes it very difficult to do
> > >> anything with this value once you've gotten it out of the db.
> > > 
> > > How so?
> > 
> > You store the value as text, and pysqlite will complain because it's not
> > unicode. So you store it as a blob, and pysqlite will return it as a
> > buffer, which can't be used as a hash key, so you have to convert it to
> > a string before you can use it. The code is then icky. I'd be happy to
> > dig up what I have done so far with it, so that others can make it
> > better. Come to think of it, if we're not breaking the API, then every
> > time you take a pkgId out of the database, or put one in (say, for a
> > query), you're going to need to do some magic to go from 20 to 40 bytes.
> 
> Ewww :)
> 
> > And if we're gonna change the db, we may as well get as many changes in at
> > once.
> > 
> > Now, if the api isn't broken, who's up for putting this in pre 3.2? Then
> > we can just forget about generating dbv9 and dbv10 at the same time, and
> > leave that problem for later.
> 
> If we're careful not to break API, then I'd be okay with going with it
> for pre 3.2 as long as we got it in very soon and got a package into
> rawhide for a little testing prior to F7 test4.  

I think we'll be chasing down places where we've done str(epoch) or
int(epoch) without really thinking it through and had it magically work
in the past. I'm not opposed to it for 3.2, but time seems tight.
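
To make the concern concrete, here is a rough sketch of the kind of
mismatch I mean and one way to guard against it. This is not actual yum
code: normalize_epoch() and same_epoch() are hypothetical helper names,
and treating a missing epoch as "0" is an assumption made for the
example.

```python
def normalize_epoch(epoch):
    """Return the epoch as a canonical string (hypothetical helper).

    Accepts an int, a numeric string, None, or '' so callers do not have
    to care whether the value came back from the db as text or integer.
    Assumption: a missing epoch is treated as '0'.
    """
    if epoch is None or epoch == "":
        return "0"
    # int() first, so that 0, "0" and "00" all collapse to the same string.
    return str(int(epoch))


def same_epoch(a, b):
    """Compare two epochs regardless of their stored type."""
    return normalize_epoch(a) == normalize_epoch(b)


# A direct comparison quietly gives the wrong answer when the types
# differ, which is the sort of place that would need chasing down.
print(0 == "0")
print(same_epoch(0, "0"))
```

If every comparison went through something like this, the column type in
the db would matter much less to callers, at the cost of an extra
conversion on each access.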

-sv




