Problem description
I'm trying to process the contents of a tarfile using multiprocessing.Pool. I'm able to successfully use the ThreadPool implementation within the multiprocessing module, but would like to be able to use processes instead of threads, as it would possibly be faster and eliminate some changes made for Matplotlib to handle the multithreaded environment. I'm getting an error that I suspect is related to processes not sharing address space, but I'm not sure how to fix it:
Traceback (most recent call last):
  File "test_tarfile.py", line 32, in <module>
    test_multiproc()
  File "test_tarfile.py", line 24, in test_multiproc
    pool.map(read_file, files)
  File "/ldata/whitcomb/epd-7.1-2-rh5-x86_64/lib/python2.7/multiprocessing/pool.py", line 225, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/ldata/whitcomb/epd-7.1-2-rh5-x86_64/lib/python2.7/multiprocessing/pool.py", line 522, in get
    raise self._value
ValueError: I/O operation on closed file
The actual program is more complicated, but this is an example of what I'm doing that reproduces the error:
from multiprocessing.pool import ThreadPool, Pool
import StringIO
import tarfile

def write_tar():
    tar = tarfile.open('test.tar', 'w')
    contents = 'line1'
    info = tarfile.TarInfo('file1.txt')
    info.size = len(contents)
    tar.addfile(info, StringIO.StringIO(contents))
    tar.close()

def test_multithread():
    tar = tarfile.open('test.tar')
    files = [tar.extractfile(member) for member in tar.getmembers()]
    pool = ThreadPool(processes=1)
    pool.map(read_file, files)
    tar.close()

def test_multiproc():
    tar = tarfile.open('test.tar')
    files = [tar.extractfile(member) for member in tar.getmembers()]
    pool = Pool(processes=1)
    pool.map(read_file, files)
    tar.close()

def read_file(f):
    print f.read()

write_tar()
test_multithread()
test_multiproc()
I suspect that something's wrong when the TarInfo object is passed into the other process but the parent TarFile is not, but I'm not sure how to fix it in the multiprocess case. Can I do this without having to extract files from the tarball and write them to disk?
Answer
You're not passing a TarInfo object into the other process; you're passing the result of tar.extractfile(member) into the other process, where member is a TarInfo object. The extractfile(...) method returns a file-like object which has, among other things, a read() method which operates upon the original tar file you opened with tar = tarfile.open('test.tar').
However, you can't use an open file from one process in another process; you have to re-open the file. I replaced your test_multiproc() with this:
def test_multiproc():
    tar = tarfile.open('test.tar')
    files = [name for name in tar.getnames()]
    pool = Pool(processes=1)
    result = pool.map(read_file2, files)
    tar.close()
and added this:
def read_file2(name):
    t2 = tarfile.open('test.tar')
    print t2.extractfile(name).read()
    t2.close()
并且能夠讓您的代碼正常工作.
and was able to get your code working.