[Yum-devel] [PATCH] Fix main speed issue in to_xml(), slows down new createrepo a lot. BZ 716235.

James Antill james at and.org
Wed Nov 14 18:04:45 UTC 2012


 The problem is that _ugly_utf8_string_hack() walks every byte of its
input unless the input is already a unicode() object. This "fix" first
tries to decode the bytes as utf-8 into a unicode() object and, if that
works, returns it and skips the expensive byte mangling.
 I'm not 100% sure this still covers the original issues the byte
mangling was added for, but in my random test of 2.5k pkgs. this makes
new createrepo --update go from ~1:25 to ~0:35.
---
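 For illustration only, here is a minimal Python 3 sketch of the fast
path described above: try one whole-string utf-8 decode, and only fall
back to the per-byte loop if that fails. It is not the code in
yum/misc.py. The names to_text_fast and BAD_SMALL_BYTES are
hypothetical, the set of ignored control bytes is an assumption, and
the legacy iso-8859-* attempts are left out to keep it short.

```python
# Illustrative Python 3 sketch; not the yum implementation.
# Assumption: which low control bytes count as "bad" is a guess here.
BAD_SMALL_BYTES = frozenset(range(0, 8)) | {11, 12} | frozenset(range(14, 32))


def to_text_fast(item):
    """Return text for item, trying a single utf-8 decode first."""
    if isinstance(item, str):
        # Already text: nothing to do.
        return item
    try:
        # Fast path: one decode call instead of a Python-level byte loop.
        return item.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Slow path: only reached for input that is not valid utf-8.
    out = []
    for byte in item:
        if byte in BAD_SMALL_BYTES:
            continue  # just ignore these bytes
        elif byte > 127:
            out.append("?")  # byte by byte equivalent of escaping
        else:
            out.append(chr(byte))
    return "".join(out)
```

 In this sketch, input that decodes cleanly as utf-8 is returned as-is,
so any control bytes in it are not stripped. Whether that matters for
the cases the byte mangling was added for is the open question above.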
 yum/misc.py |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/yum/misc.py b/yum/misc.py
index a0bac7b..072c99b 100644
--- a/yum/misc.py
+++ b/yum/misc.py
@@ -910,12 +910,10 @@ def _ugly_utf8_string_hack(item):
         return item
     
     # this handles any bogon formats we see
-    du = False
     try:
-        x = unicode(item, 'ascii')
-        du = True
+        return unicode(item, 'utf-8')
     except UnicodeError:
-        encodings = ['utf-8', 'iso-8859-1', 'iso-8859-15', 'iso-8859-2']
+        encodings = ['iso-8859-1', 'iso-8859-15', 'iso-8859-2']
         for enc in encodings:
             try:
                 x = unicode(item, enc)
@@ -938,7 +936,7 @@ def _ugly_utf8_string_hack(item):
     for char in item:
         if ord(char) in bad_small_bytes:
             pass # Just ignore these bytes...
-        elif not du and ord(char) > 127:
+        elif ord(char) > 127:
             newitem = newitem + '?' # byte by byte equiv of escape
         else:
             newitem = newitem + char
-- 
1.7.6.5


