[Rpm-metadata] createrepo/utils.py

Luke Macken lmacken at redhat.com
Wed Apr 16 16:58:14 UTC 2008


On Wed, Apr 16, 2008 at 12:16:13PM -0400, Luke Macken wrote:
> Or, if there is a reason to try falling back to ['iso-8859-1', 'iso-8859-15', 'iso-8859-2']
> encodings, we could probably do something like this:
>
>     def utf8String(string):
>         """hands back a unicoded string"""
>         if string is None:
>             return u''
>         elif isinstance(string, unicode):
>             return string
>         try:
>             x = unicode(string, 'utf-8')
>         except UnicodeError:
>             encodings = ['iso-8859-1', 'iso-8859-15', 'iso-8859-2']
>             for enc in encodings:
>                 try:
>                     x = unicode(string, enc)
>                     break
>                 except UnicodeError:
>                     pass
>             x = unicode(string, 'utf-8', errors='replace')
>         return x

Oops, the last unicode conversion attempt should be in the else clause of
the for loop; as quoted above, it runs unconditionally and overwrites
whatever a fallback encoding produced.  Something like this:

     def utf8String(string):
         """hands back a unicoded string"""
         if string is None:
             return u''
         elif isinstance(string, unicode):
             return string
         try:
             # try UTF-8 first
             x = unicode(string, 'utf-8')
         except UnicodeError:
             # not valid UTF-8: try each fallback encoding in turn
             encodings = ['iso-8859-1', 'iso-8859-15', 'iso-8859-2']
             for enc in encodings:
                 try:
                     x = unicode(string, enc)
                     break
                 except UnicodeError:
                     pass
             else:
                 # no fallback worked (loop finished without break)
                 x = unicode(string, 'utf-8', errors='replace')
         return x
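
For illustration only, here is a minimal Python 3 sketch of the same
for/else fallback pattern. The name to_text and the fallbacks parameter are
hypothetical and not part of createrepo. One thing worth noting: Python's
iso-8859-1 codec maps every byte value to a code point, so with it first in
the list the loop breaks on the first attempt and the else branch is never
reached. The second call below passes a stricter codec (ascii), purely as an
assumption for the demo, to show the else branch firing.

```python
def to_text(data, fallbacks=("iso-8859-1", "iso-8859-15", "iso-8859-2")):
    """Decode bytes to str: UTF-8 first, then each fallback in order."""
    if data is None:
        return ""
    if isinstance(data, str):
        return data
    try:
        return data.decode("utf-8")
    except UnicodeError:
        pass  # not valid UTF-8, fall through to the fallback encodings
    for enc in fallbacks:
        try:
            text = data.decode(enc)
            break  # first fallback that decodes cleanly wins
        except UnicodeError:
            continue
    else:
        # Only reached when the loop ended without a break,
        # i.e. no fallback encoding could decode the data.
        text = data.decode("utf-8", errors="replace")
    return text


raw = b"caf\xe9"  # 'cafe' with e-acute encoded as iso-8859-1; invalid as UTF-8
print(repr(to_text(raw)))                        # decoded via iso-8859-1
print(repr(to_text(raw, fallbacks=("ascii",))))  # else branch, bad byte replaced
```

If the later encodings in the list are meant to matter, a codec that can
actually reject input would have to come before iso-8859-1.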


