[Yum-devel] URLGrabber and escaping characters in http request
Toshio Kuratomi
a.badger at gmail.com
Mon Sep 24 04:16:00 UTC 2012
On Sun, Sep 23, 2012 at 09:14:20PM +0100, andrea wrote:
> Hi,
>
> I'm using URLGrabber in Fedora 17.
>
> I want to get this url
>
> http://rai-i.akamaihd.net/i/20120920/unpostoalsole-2009201220.35.00_,600,800,1200,1500,.mp4.csmil/master.m3u8
>
> But URLGrabber instead tries to get
>
> http://rai-i.akamaihd.net/i/20120920/unpostoalsole-2009201220.35.00_%2C600%2C800%2C1200%2C1500%2C.mp4.csmil/master.m3u8
>
> which fails.
>
> I got that from wireshark.
>
> If I try to use wget or curl, they pass the url unescaped which then works fine.
>
> Any idea how to make it work?
>
Reading the RFC for URIs ( http://www.ietf.org/rfc/rfc2396.txt ), the server
probably should unescape %2C to be a comma. However, you can probably work
around the server's problem with something like this:
import urllib

from urlgrabber.grabber import URLParser, URLGrabber

myurl = 'http://rai-i.akamaihd.net/i/20120920/unpostoalsole-2009201220.35.00_,600,800,1200,1500,.mp4.csmil/master.m3u8'

class MyParser(URLParser):
    def quote(self, parts):
        # Debug output: confirms that this override is being called.
        print 'here'
        (scheme, host, path, parm, query, frag) = parts
        # Quote only the path, leaving '/' and ',' unescaped.
        path = urllib.quote(path, safe='/,')
        return (scheme, host, path, parm, query, frag)

def test(url=myurl):
    mygrabber = URLGrabber()
    # Use the custom parser in place of the default URLParser.
    mygrabber.opts.urlparser = MyParser()
    mygrabber.urlgrab(url)
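
The effect of the safe argument can be checked without touching the network. What follows is only a minimal sketch, assuming Python 3, where quote lives in urllib.parse rather than in urllib as in the Python 2 code above. The helper name quote_path is hypothetical and is not part of urlgrabber.

```python
from urllib.parse import quote, urlsplit, urlunsplit

URL = ('http://rai-i.akamaihd.net/i/20120920/'
       'unpostoalsole-2009201220.35.00_,600,800,1200,1500,.mp4.csmil/master.m3u8')


def quote_path(url, safe='/'):
    """Percent-encode only the path component of url (hypothetical helper)."""
    parts = urlsplit(url)
    quoted_path = quote(parts.path, safe=safe)
    return urlunsplit(parts._replace(path=quoted_path))


# With only '/' marked safe, every comma in the path is escaped.
default_quoted = quote_path(URL)

# With ',' added to the safe set, the commas are left untouched.
comma_safe = quote_path(URL, safe='/,')

print(default_quoted)
print(comma_safe)
```

The first line printed should contain the escaped commas seen in the wireshark capture, and the second should match the URL as originally given.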
-Toshio