Problem Description
I'm using urllib2 to load files from FTP and HTTP servers.
Some of the servers support only one connection per IP. The problem is that urllib2 does not close the connection immediately. Look at the example program:
from urllib2 import urlopen
from time import sleep

url = 'ftp://user:pass@host/big_file.ext'

def load_file(url):
    f = urlopen(url)
    loaded = 0
    while True:
        data = f.read(1024)
        if data == '':
            break
        loaded += len(data)
    f.close()
    #sleep(1)
    print('loaded {0}'.format(loaded))

load_file(url)
load_file(url)
The code loads two files (here the two files are the same) from an FTP server that supports only one connection. This prints the following log:
loaded 463675266
Traceback (most recent call last):
  File "conection_test.py", line 20, in <module>
    load_file(url)
  File "connection_test.py", line 7, in load_file
    f = urlopen(url)
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1331, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File "/usr/lib/python2.6/urllib2.py", line 1352, in connect_ftp
    fw = ftpwrapper(user, passwd, host, port, dirs, timeout)
  File "/usr/lib/python2.6/urllib.py", line 854, in __init__
    self.init()
  File "/usr/lib/python2.6/urllib.py", line 860, in init
    self.ftp.connect(self.host, self.port, self.timeout)
  File "/usr/lib/python2.6/ftplib.py", line 134, in connect
    self.welcome = self.getresp()
  File "/usr/lib/python2.6/ftplib.py", line 216, in getresp
    raise error_temp, resp
urllib2.URLError: <urlopen error ftp error: 421 There are too many connections from your internet address.>
So the first file is loaded and the second fails because the first connection was not closed.
But when I use sleep(1) after f.close(), the error does not occur:
loaded 463675266
loaded 463675266
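In other words, waiting a moment before reconnecting gives the server time to drop the old connection. A minimal sketch that turns this observation into a retry loop (the function name, retry count, and delay are made up for illustration and are not part of the original program):

from urllib2 import urlopen, URLError
from time import sleep

def load_file_retrying(url, retries=3, delay=1.0):
    # Retry when the server still counts the previous connection
    # (e.g. the FTP "421 too many connections" error seen above).
    for attempt in range(retries):
        try:
            f = urlopen(url)
        except URLError:
            if attempt == retries - 1:
                raise
            sleep(delay)
            continue
        try:
            loaded = 0
            while True:
                data = f.read(1024)
                if data == '':
                    break
                loaded += len(data)
        finally:
            f.close()
        print('loaded {0}'.format(loaded))
        return loaded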
Is there any way to force the connection to close so that the second download does not fail?
The cause is indeed a file descriptor leak. We also found that the problem is much more obvious with Jython than with CPython. A colleague proposed this solution:
fdurl = urllib2.urlopen(req, timeout=self.timeout)
realsock = fdurl.fp._sock.fp._sock  # we want to close the "real" socket later
req = urllib2.Request(url, header)
try:
    fdurl = urllib2.urlopen(req, timeout=self.timeout)
except urllib2.URLError, e:
    print "urlopen exception", e
realsock.close()
fdurl.close()
The fix is ugly, but it does the job: no more "too many open connections".
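If the workaround is needed in more than one place, it can be wrapped in a small helper. The following is only a sketch under the same assumptions as the snippet above: the fdurl.fp._sock.fp._sock attribute chain is a private CPython 2.6 implementation detail (it may look different on other versions or on Jython), and the helper name urlopen_hard_close is invented for illustration.

import urllib2

def urlopen_hard_close(url, timeout=30):
    # Open url and return (response, close_all). Calling close_all() closes
    # the underlying "real" socket as well as the response object, so the
    # server sees the connection drop right away.
    # Sketch only: relies on the same private urllib2 internals as above.
    req = urllib2.Request(url)
    fdurl = urllib2.urlopen(req, timeout=timeout)
    realsock = fdurl.fp._sock.fp._sock  # private attribute chain, see above

    def close_all():
        realsock.close()
        fdurl.close()

    return fdurl, close_all

# Usage:
# f, close_all = urlopen_hard_close('ftp://user:pass@host/big_file.ext')
# try:
#     data = f.read()
# finally:
#     close_all()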