[Yum-devel] yum on an olpc machine (slooooooooooow)

Panu Matilainen pmatilai at laiskiainen.org
Sat Dec 16 11:48:24 UTC 2006


On Sat, 16 Dec 2006, Paul Nasrat wrote:
>
>> The only way I can think of would be a different format so we don't have
>> to parse the xml or pre-parsing the metadata into a sqlite db. This
>> would make downloads of the metadata larger but maybe it would be faster
>> for operations.
>>
>> For example - fedora extras:
>> -rw-r--r--  1 root root 1.6M Dec 16 03:06 primary.xml.gz
>> -rw-r--r--  1 root root 2.2M Dec 16 03:09 primary.xml.sqlite.bz2
>>
>> Bzipped, the primary xml sqlite db is 2.2M vs 1.6M for the xml itself.
>
> The reason I didn't go this route for FC5 anaconda is that it's just
> the same problem as having hdlist, etc.  Multiple versions of the same
> metadata, the problem we were trying to avoid by moving to repodata.
> I'd strongly argue this is the wrong approach.

+1

I suggest looking closer at where the time is *really* spent. Remember the
libxml2 "slowness" which turned out to be something in the way things are
copied between C and python? Is it really the xml parsing where most of
the time is spent, or is it something else like sqlite interactions or...?
Parsing those xml files sure isn't cheap, but it's not *that* slow in
C/C++ - I'd look for other places first.
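
One way to check where the time goes is to run the suspect code under
Python's standard profiler and sort by cumulative time. The following is
only a minimal sketch in modern Python: profile_call and
build_cache_standin are hypothetical names, and the stand-in workload is
a placeholder, not yum's actual metadata import.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(15)
    return result, stream.getvalue()


def build_cache_standin(n=2931):
    # Hypothetical placeholder for the real work (parsing, db inserts, ...).
    return sum(len(str(i)) for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(build_cache_standin)
    print(report)
```

If the xml parser's functions do not dominate the top of such a report,
the parsing itself is probably not the main cost.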

Here's one easy target for optimization (the time difference is
consistent over successive runs):

[root at turre yum]# yum clean dbcache
Loading "installonlyn" plugin
3 cache files removed
[root at turre yum]# time ./yummain.py -C --disablerepo='*' --enablerepo='core' makecache
Loading "installonlyn" plugin
Setting up repositories
################################################## 2931/2931
################################################## 2931/2931
################################################## 2931/2931
Metadata Cache Created

real    0m9.509s
user    0m6.634s
sys     0m0.664s
[root at turre yum]# yum clean dbcache
Loading "installonlyn" plugin
3 cache files removed
[root at turre yum]# time ./yummain.py -d0 -C --disablerepo='*' --enablerepo='core' makecache

real    0m8.093s
user    0m6.003s
sys     0m0.469s

---

9.5 vs 8.1 seconds is one helluva big difference (roughly 15% of the
run) just to tell the user "something is happening". This is with a
reasonably fast display adapter; I could imagine OLPC suffers even more
from this. I didn't try it, but simply making the progress callbacks
(well, the writes to the screen) less frequent should shave off quite a
bit of time. The user doesn't *really* need to know we're now processing
exactly the 1654th of 2001 records; a rough idea of progress (an update
every 5 or 10 percent, for example) is quite enough.
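
The throttling idea could be sketched roughly like this. It is an
untested-against-yum illustration in modern Python: ThrottledProgress and
print_progress are hypothetical names, not yum's actual callback API, and
the 10 percent step is just the example value from above.

```python
import sys


class ThrottledProgress:
    """Hypothetical wrapper: forward an update only when the completed
    percentage reaches the next step boundary (or the work is done)."""

    def __init__(self, callback, step_percent=10):
        self.callback = callback
        self.step_percent = step_percent
        self._next_threshold = 0  # first update always goes through

    def update(self, current, total):
        if total <= 0:
            return
        percent = current * 100 // total
        if current >= total or percent >= self._next_threshold:
            self.callback(current, total)
            # Next boundary is the first multiple of step_percent above
            # the percentage just reported.
            self._next_threshold = (
                percent // self.step_percent + 1
            ) * self.step_percent


def print_progress(current, total):
    # The expensive part: writing to the terminal.
    sys.stdout.write("\r%d/%d" % (current, total))
    sys.stdout.flush()


if __name__ == "__main__":
    progress = ThrottledProgress(print_progress, step_percent=10)
    total = 2931
    for i in range(1, total + 1):
        progress.update(i, total)
    sys.stdout.write("\n")
```

With 2931 records and a 10 percent step, the screen is written 11 times
instead of 2931, while the final count is still shown.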

         - Panu -



