[Yum-devel] urlgrabber socket timeouts

Ryan Tomayko rtomayko at gmail.com
Fri Oct 8 06:53:45 UTC 2004


I looked into this a bit. There are two methods for establishing a
timeout. First, there's an instance method for sockets: settimeout.
This lets you set the timeout on a socket after it is created. The
second is, as you mentioned, the global socket.setdefaulttimeout,
which sets the timeout globally for all sockets created after the call
is made.
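To make the distinction concrete, here's a small illustration of the two APIs (the address family and type are just for the example; none of this is urlgrabber code):

```python
import socket

# Per-socket timeout: affects only this one socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(10.0)
t1 = sock.gettimeout()

# Global default: affects every socket created after the call.
socket.setdefaulttimeout(10.0)
other = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
t2 = other.gettimeout()

sock.close()
other.close()
socket.setdefaulttimeout(None)  # restore the process-wide default
```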

I was hoping to use the instance-level method because it seems a bit
safer to me: we can control exactly which sockets get a timeout set
and which do not. This proved to be very hard. The actual socket
creation is buried deep in httplib.py and ftplib.py, and urllib2
doesn't care much about the socket, so it doesn't expose it up to the
calling application. So, unless urllib2 is enhanced or we rethink how
we're opening connections, socket-level timeouts are a bit beyond us.

Now, since we are completely single threaded at this point, it may be
possible to use the global setdefaulttimeout to get the exact same
effect as the instance-level settimeout. E.g., the following two snips
should yield equivalent results if only a single thread is creating
sockets at a time:

Snip 1: Using settimeout

import socket
sock = socket.socket(..)
sock.settimeout(10.0)
sock.connect(..)

Snip 2: Using setdefaulttimeout

import socket
old_to = socket.getdefaulttimeout()
socket.setdefaulttimeout(10.0)
try:
  sock = socket.socket(..)
finally:
  socket.setdefaulttimeout(old_to)
sock.connect(..)

Using setdefaulttimeout is a bit more of a kludge, but believe me,
this is much less kludgy than trying to inject Handler and
HTTPConnection subclasses into urllib2.
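For what it's worth, the save/restore dance can be kept contained in one helper. This is just a sketch of the idea using contextlib (which postdates the Pythons under discussion here), not anything in urlgrabber, and it's only safe while we stay single threaded, since the default is process-global:

```python
import socket
from contextlib import contextmanager

@contextmanager
def default_socket_timeout(timeout):
    """Temporarily set the module-wide default socket timeout.

    Only safe when a single thread is creating sockets, because
    setdefaulttimeout mutates global state.
    """
    old = socket.getdefaulttimeout()
    socket.setdefaulttimeout(timeout)
    try:
        yield
    finally:
        socket.setdefaulttimeout(old)

# Any socket created inside the block picks up the timeout.
with default_socket_timeout(10.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

timeout_inside = sock.gettimeout()          # 10.0
timeout_after = socket.getdefaulttimeout()  # restored
sock.close()
```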

I'll commit something with this shortly, just wanted to dump my
findings out here to see if anyone can spot something I'm missing.

Ryan

On Wed, 29 Sep 2004 01:44:47 -0400, seth vidal <skvidal at phy.duke.edu> wrote:
> 
> > You're not in left field at all.  The only issue is that this is only
> > available in 2.3.  In previous versions, you'd need to include a third
> > party module.
> 
> Works for me - let's do a check for the function, if it's not there,
> then it doesn't get set, sucks to be running python 2.2. :)
> 
> > I agree that this is a good way to go.  It doesn't completely solve
> > timeout problems though because tcp timeouts won't always save you
> > from a stupid/slow server.  A tcp timeout happens when one side sends
> > a request but doesn't get an acknowledgement.  It's quite possible
> > (and common in ssh or imap connections) to simply have no requests
> > made for a very long time.  In that case, you'd also need higher-level
> > timeouts, which I was looking into a couple weeks ago before I got
> > swamped in work.
> 
> My tests:
> 
> set the timeout to 30s
> start downloading something big
> login to server, set -j DROP on the iptables from the downloading host
> wait 30s, it times out
> restart the download
> login to server, set -j DROP on the iptables from the downloading host
> wait 20s
> unset the -j DROP
> it continues the download
> set the -j DROP
> wait 30s, it times out.
> 
> Repeat the above with -j REJECT and turning off webserver.
> Now I know that's not all of the possible situations but I'd be willing
> to bet it's a good hunk of them.
> 
> >
> > In short, I'd have no problem implementing this as an "only >=2.3"
> > feature.  However, we should probably be clear from the start that we
> > will eventually remove the checks, meaning: if you want to use new
> > urlgrabbers with old pythons for a long time, you should simply not
> > use this option.
> 
> Fine by me.
> 
> This would make a lot of people happy, I'm certain.
> -sv
> 
> 
> 
> 
> _______________________________________________
> Yum-devel mailing list
> Yum-devel at lists.linux.duke.edu
> https://lists.dulug.duke.edu/mailman/listinfo/yum-devel
>



More information about the Yum-devel mailing list