[Yum-devel] [UG] url encoding advice

Michael Stenner mstenner at linux.duke.edu
Wed Dec 21 19:42:05 UTC 2005


On Wed, Dec 21, 2005 at 01:41:27AM -0500, seth vidal wrote:
> I think the most expected behavior is, of course, that urlgrabber will
> magically handle all items. :)

In this case, that truly would be magic.  The fundamental problem is
that there's no way to know if a url is encoded.  Let's play the
guessing game:

  http://place.com/foo%20bar            probably encoded
  http://place.com/foo%bar              probably NOT encoded
  http://place.com/foo%25bar            uh... hard to say
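
To make the guessing game concrete, here is a minimal Python 3 sketch
(not urlgrabber code; the helper name wire_forms is made up for
illustration).  It uses the standard urllib.parse.quote to show what
would be sent under each assumption:

```python
from urllib.parse import quote

def wire_forms(path):
    """Return (sent_if_already_encoded, sent_if_raw) for one path string."""
    if_encoded = path                # already encoded: pass it through untouched
    if_raw = quote(path, safe="/")   # raw: escape it, so '%' itself becomes '%25'
    return if_encoded, if_raw

for p in ("/foo%20bar", "/foo%bar", "/foo%25bar"):
    print(p, "->", wire_forms(p))
```

Both readings are self-consistent for every input, so nothing in the
string itself says which one the caller meant.  Note that the raw
reading of the second url produces exactly the third one, which is why
that case is hard to call.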

> I'm guessing that they have a regex looking for certain items to
> determine if it's encoded. I'm sure it's not error-free, i.e.:
> 
> what happens if http://place.com/foo%20bar is the raw url. Do wget and
> curl vomit about it?

At least wget assumes that is encoded (according to the bug
submitter).  Since the encoding is always of the form "%xx" where "xx"
is a pair of hex digits, it might be reasonable to use something of
the form:  /%[0-9a-fA-F][0-9a-fA-F]/ as a regex (although better
constructed).  It's probably not THAT error-prone.
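
As a rough sketch of what "better constructed" might mean (Python 3;
the function name looks_encoded is hypothetical, and this is only one
possible heuristic, not what wget or curl actually do): treat a url as
encoded only if it contains at least one '%' and every '%' begins a
valid two-hex-digit escape.

```python
import re

# Whole string must consist of non-'%' characters or complete %xx escapes.
_ONLY_VALID_ESCAPES = re.compile(r"(?:[^%]|%[0-9a-fA-F]{2})*")

def looks_encoded(url):
    """Guess whether url is already percent-encoded (heuristic only)."""
    return "%" in url and _ONLY_VALID_ESCAPES.fullmatch(url) is not None

for u in ("http://place.com/foo%20bar",
          "http://place.com/foo bar",
          "http://place.com/100%",
          "http://place.com/foo%bar"):
    print(u, looks_encoded(u))
```

One caveat: this still reports http://place.com/foo%bar as encoded,
because "ba" happens to be two hex digits.  A stray '%' is only caught
when what follows is not hex (e.g. "foo%zz") or the string ends too
soon (e.g. "100%").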

					-Michael
-- 
  Michael D. Stenner                            mstenner at ece.arizona.edu
  ECE Department and Optical Sciences Center                520-626-1619
  University of Arizona                                         ECE 524G


