[Rpm-metadata] createrepo/utils.py
Toshio Kuratomi
a.badger at gmail.com
Thu Apr 17 18:03:04 UTC 2008
Luke Macken wrote:
> Ah, I see. Ok then, my examples are assuming the method wanted to
> return a "unicoded" string, like the docstring says. Since this is not
> the case, my examples are not valid. (Although, if we were handling
> unicode properly within the program they would be. Ideally we should be
> decoding (str->unicode) early, and encoding (unicode->str) late. It
> seems like here we are never decoding, and encoding all of the freaking
> time. Pain and suffering ensues :)
>
> After discussion on IRC, I agree with Toshio and find his solution to be
> the sanest.
>
Note that the patch I posted here also fixes the following problems with
the current git HEAD:
1) Passing in a unicode object with control characters (aka
"small bytes") lets those control characters through.
2) Passing in a non-ascii byte string with control characters lets
those control characters through.
2a) This includes utf-8. The present git HEAD only strips control
characters if the source string is ascii.
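
For illustration, the cases above can be sketched as follows. This is not the posted patch; it is a minimal sketch written for Python 3, where str is the text type and bytes is the byte-string type. The name strip_control_chars, the default encoding, and the choice to keep tab, newline and carriage return are all assumptions made for the example.

```python
# Hypothetical sketch, not the posted patch. Python 3 semantics.
# Assumption: tab, newline and carriage return are kept; every other
# code point below 0x20 is dropped.
_KEEP = {0x09, 0x0A, 0x0D}
_CONTROL = [c for c in range(0x20) if c not in _KEEP]
# Map each control code point to None so translate() deletes it.
_TABLE = dict(zip(_CONTROL, [None] * len(_CONTROL)))


def strip_control_chars(value, encoding="utf-8"):
    """Return text with control characters removed.

    Accepts either text (str) or a byte string. Bytes are decoded first,
    so ascii, utf-8 and already-decoded input all take the same path.
    """
    if isinstance(value, bytes):
        # errors="replace" is an assumption: undecodable bytes become U+FFFD.
        value = value.decode(encoding, errors="replace")
    return value.translate(_TABLE)
```

Because the input is decoded before filtering, a utf-8 byte string and a text object go through the same translate() call, which is the property items 1), 2) and 2a) are about.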
I'm pretty sure it's faster as well since the patch:
1) Does its iteration in C builtins (zip() and translate()) instead of
in Python
2) Avoids having to call ord(char) from within a loop.
but that's theoretical until someone does a benchmark :-)
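
A benchmark along those lines could be sketched with timeit. Both functions below are hypothetical stand-ins written for Python 3, not the code in git HEAD or in the patch, and the set of characters kept (tab, newline, carriage return) is an assumption. No timings are claimed here; the sketch only shows how the two approaches could be compared.

```python
import timeit

# Hypothetical stand-ins, not the code in git HEAD or in the patch.
_CONTROL = [c for c in range(0x20) if c not in (0x09, 0x0A, 0x0D)]
_TABLE = dict(zip(_CONTROL, [None] * len(_CONTROL)))
_CONTROL_SET = frozenset(_CONTROL)


def strip_loop(text):
    """Filter in a Python-level loop, calling ord() on every character."""
    out = []
    for char in text:
        if ord(char) not in _CONTROL_SET:
            out.append(char)
    return "".join(out)


def strip_translate(text):
    """Filter with a single translate() call, so the loop runs in C."""
    return text.translate(_TABLE)


def compare(sample, number=1000):
    """Time both approaches on the same sample; returns seconds per approach."""
    return {
        "loop": timeit.timeit(lambda: strip_loop(sample), number=number),
        "translate": timeit.timeit(lambda: strip_translate(sample), number=number),
    }


if __name__ == "__main__":
    sample = "package summary \x01 with \x1f control bytes\n" * 100
    print(compare(sample))
```

The two functions should return identical output for the same input, so it is worth asserting that before trusting any timing difference.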
-Toshio