[Rpm-metadata] createrepo/utils.py

James Antill james at linux.duke.edu
Wed Apr 16 15:24:25 UTC 2008


 createrepo/utils.py |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

New commits:
commit ff90a8502f7f302f4d866a58771befc1f7b9ebcd
Author: James Antill <james at and.org>
Date:   Wed Apr 16 11:22:40 2008 -0400

    Talk to libxml maintainer ... tweak

diff --git a/createrepo/utils.py b/createrepo/utils.py
index 1af6b94..1dc3b0c 100644
--- a/createrepo/utils.py
+++ b/createrepo/utils.py
@@ -94,11 +94,14 @@ def utf8String(string):
                 if x.encode(enc) == string:
                     return x.encode('utf-8')
     newstring = ''
-    # Allow BS, HT, LF, VT, FF, CR
-    bad_small_bytes = range(0, 8) + range(14, 32)
+    # Drop small bytes (below 0x20) that are not valid XML characters, or
+    # libxml will die. See: http://www.w3.org/TR/REC-xml/#NT-Char
+    # High bytes are allowed if the string passed the utf8 check above.
+    # E.g. good chars = #x9 | #xA | #xD | [#x20-...]
+    bad_small_bytes = range(0, 8) + [11, 12] + range(14, 32)
     for char in string:
         if ord(char) in bad_small_bytes:
-            newstring = newstring + '?'
+            pass # Just ignore these bytes...
         elif not du and ord(char) > 127:
             newstring = newstring + '?'
         else:
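
For readers who want to try the filtering outside createrepo, the loop in the hunk above can be restated as a small standalone sketch. This is not the createrepo code: it is written for Python 3 bytes rather than the Python 2 str used in the patch, the function name drop_bad_small_bytes and the is_utf8 flag are hypothetical, is_utf8 is assumed to play the role of the du variable, and the else branch (cut off in the diff) is assumed to append the byte unchanged.

```python
# Sketch only: a Python 3 restatement of the filtering loop in this patch.
# drop_bad_small_bytes and is_utf8 are hypothetical names, not createrepo API.

# Same set as the patched bad_small_bytes: 0-7, 11 (VT), 12 (FF), 14-31.
BAD_SMALL_BYTES = frozenset(range(0, 8)) | {11, 12} | frozenset(range(14, 32))


def drop_bad_small_bytes(data: bytes, is_utf8: bool) -> bytes:
    """Drop bad control bytes; replace high bytes with '?' unless is_utf8."""
    out = bytearray()
    for byte in data:
        if byte in BAD_SMALL_BYTES:
            continue  # silently dropped, as in the patched loop
        if not is_utf8 and byte > 127:
            out += b"?"  # high byte in input that did not pass the utf8 check
        else:
            out.append(byte)  # assumed behaviour of the truncated else branch
    return bytes(out)


if __name__ == "__main__":
    print(drop_bad_small_bytes(b"name\x00\x0b\x0c: caf\xc3\xa9\n", True))
    print(drop_bad_small_bytes(b"caf\xe9", False))
```

Note that range(0, 8) stops at 7, so byte 8 (BS) still passes through both the patch and this sketch, even though the XML Char production linked in the patch comment does not include #x8.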


