[Rpm-metadata] createrepo: initial comments and a UTF-8 patch

seth vidal skvidal at phy.duke.edu
Sat Jul 24 14:23:16 UTC 2004


> It does not currently do a decent job of UTF-8'ifying content.  Not that
> it would generate broken XML, but for example the UTF-8 "ä" in my
> surname turns into two "?"s, even though it's already UTF-8 in the RPM
> header!
> 
> Patch attached.  This is a simplified version of what I use in fancix,
> and the idea originates from decode() in Skip Montanaro's query.py at
> http://manatee.mojam.com/~skip/python/query.py

My only concern is whether it works with some of the suse and pld rpms.
When I tested them before, it was very difficult to guess what encoding
they were in. I'll take a look again, thanks.
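
A minimal sketch of the fallback decoding being discussed, assuming the
header value arrives as raw bytes. The function name to_utf8() and the
encoding list are illustrative assumptions, not the patch's utf8String().
It also shows why guessing is hard: iso-8859-1 accepts any byte sequence,
so it has to come last and will happily mis-decode other 8-bit encodings.

```python
def to_utf8(raw, encodings=("utf-8", "iso-8859-1")):
    """Return raw header bytes re-encoded as UTF-8.

    Each candidate encoding is tried in order; the first one that
    decodes without error wins. The list is an assumption for
    illustration only.
    """
    for enc in encodings:
        try:
            return raw.decode(enc).encode("utf-8")
        except UnicodeDecodeError:
            continue
    # Nothing matched: keep what decodes and mark the rest.
    return raw.decode("utf-8", "replace").encode("utf-8")


already_utf8 = to_utf8("ä".encode("utf-8"))   # passes through unchanged
from_latin1 = to_utf8(b"\xe4")                # Latin-1 "ä" is converted
```

Content that is already UTF-8 passes through untouched, which is the case
the original report complains about.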


> The other gotcha in the patch is that when adding content to a libxml2
> tree, one does not need to XML escape it.  AFAIK that happens
> automatically and correctly at serialization time.  XML escaping would
> only be needed when printing stuff directly somewhere outside of the
> libxml objects; that is not currently done so I nuked xmlCleanString()
> altogether.  While at it, I added explicit encodings to serialize()
> calls.

That only happens if you serialize the whole tree, not just a node.
I can't serialize the whole tree because memory use would grow without
bound. That's why I use xmlCleanString().
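
A rough sketch of the bounded-memory pattern, assuming one package is
written out at a time. It does not use libxml2; xml_clean_string() and
write_package() are hypothetical stand-ins that only show where manual
escaping is needed when text goes straight to the output.

```python
import io
from xml.sax.saxutils import escape


def xml_clean_string(text):
    """Escape &, < and > so text can be written straight into XML."""
    return escape(text)


def write_package(out, name, summary):
    """Write one <package> fragment; nothing is retained afterwards."""
    out.write("<package>\n")
    out.write("  <name>%s</name>\n" % xml_clean_string(name))
    out.write("  <summary>%s</summary>\n" % xml_clean_string(summary))
    out.write("</package>\n")


buf = io.StringIO()
write_package(buf, "foo", "Tools for <bar> & baz")
print(buf.getvalue())
```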


> With this patch applied, the output is improved quite a bit here.  Add
> new encodings to the list in utf8String() if you like.
> 
> Issue 2:
> 
> The name of the "author" attribute in <changelog> is not a very good
> choice IMO.  RPM defines it as the "name" of the changelog entry.  It is
> very common that for RPMs the author attribute will contain stuff like
> "John Doe <john at doe dot com> - 2.6.8-0.1", i.e. it's not only the
> author -> suggesting changing "author" to "name" unless it causes too
> many problems.

There is no standard for the 'author' field and it is what rpm calls it
for the changelog. I think just dumping the output as it occurs in the
rpm and letting the client program mangle it would be best.
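
As a sketch of what "letting the client mangle it" could look like,
assuming the common "Author <mail> - version" layout quoted above. The
helper name split_changelog_name() is hypothetical, and since there is no
standard for the field, a client has to cope with entries that do not
match.

```python
def split_changelog_name(name):
    """Split 'Author <mail> - version' into (author, version).

    Returns (name, None) when there is no ' - ' separator.
    """
    author, sep, version = name.rpartition(" - ")
    if not sep:
        return name.strip(), None
    return author.strip(), version.strip()


entry = "John Doe <john at doe dot com> - 2.6.8-0.1"
author, version = split_changelog_name(entry)
```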



> Issue 3:
> 
>   $ createrepo .
>   [...]
>   Saving Primary metadata
>   Saving file lists metadata
>   Saving other metadata
>   $ echo foo > repodata/foo.txt
>   $ createrepo .
>   [...]
>   Saving Primary metadata
>   Saving file lists metadata
>   Saving other metadata
>   Could not remove old metadata dir: .olddata
>   Error was [Errno 39] Directory not empty: '.olddata'
>   $ createrepo .
>   Old data directory exists, please remove: .olddata
> 
> Bug or feature?

Feature, I think. Why would you be putting extra data into the repodata
dir?
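
The "[Errno 39] Directory not empty" message is consistent with a plain
rmdir on a directory that still holds an unexpected file. A small sketch
of that behaviour, assuming the cleanup uses os.rmdir(); the helper name
remove_old_dir() and the temporary paths are made up for the example.

```python
import errno
import os
import shutil
import tempfile


def remove_old_dir(path):
    """Try a plain rmdir; return False if the directory is not empty."""
    try:
        os.rmdir(path)
        return True
    except OSError as e:
        if e.errno in (errno.ENOTEMPTY, errno.EEXIST):
            return False
        raise


work = tempfile.mkdtemp()
olddata = os.path.join(work, ".olddata")
os.mkdir(olddata)
with open(os.path.join(olddata, "foo.txt"), "w") as f:
    f.write("foo\n")

removed_with_stray_file = remove_old_dir(olddata)  # stray file blocks rmdir
os.remove(os.path.join(olddata, "foo.txt"))
removed_when_empty = remove_old_dir(olddata)       # succeeds once empty
shutil.rmtree(work)
```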

-sv




