Problem description
I am trying to read 12-bit binary files containing images (a video) using Python 3.
To read a similar file but encoded in 16 bits, the following works very well:
import numpy as np
images = np.memmap(filename_video, dtype=np.uint16, mode='r', shape=(nb_frames, height, width))
where filename_video is the file, and nb_frames, height, and width are characteristics of the video that can be read from another file. By 'working very well' I mean fast: reading a 640x256 video that has 140 frames takes about 1 ms on my computer.
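As a minimal sketch of the 16-bit case described above, the following writes a small synthetic file and maps it back. The dimensions and the temporary file path are assumptions chosen only for illustration. np.memmap maps the file rather than reading it up front, which is consistent with the very short timing reported: pixel data is only paged in when it is accessed.

```python
import os
import tempfile

import numpy as np

# Hypothetical, tiny dimensions chosen only for illustration.
nb_frames, height, width = 3, 4, 5

# Write a synthetic 16-bit "video" to a temporary file.
frames = np.arange(nb_frames * height * width, dtype=np.uint16)
frames = frames.reshape(nb_frames, height, width)
path = os.path.join(tempfile.mkdtemp(), "video16.bin")
frames.tofile(path)

# Map the file; the data is paged in only when it is accessed.
images = np.memmap(path, dtype=np.uint16, mode="r",
                   shape=(nb_frames, height, width))

# Copy a single frame into regular memory.
second_frame = np.array(images[1])
```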
As far as I know I cannot use this when the file is encoded in 12 bits because there is no uint12 type. So what I am trying to do is to read a 12-bit file and store it in a 16-bit uint array. The following, taken from (Python: reading 12 bit packed binary image), works:
import bitstring

with open(filename_video, 'rb') as f:
    data = f.read()
images = np.zeros(int(2 * len(data) / 3), dtype=np.uint16)
ii = 0
# Unpack each 3-byte group into two 12-bit values.
for jj in range(0, int(len(data)) - 2, 3):
    a = bitstring.Bits(bytes=data[jj:jj+3], length=24)
    images[ii], images[ii+1] = a.unpack('uint:12,uint:12')
    ii = ii + 2
images = np.reshape(images, (nb_frames, height, width))
However, this is very slow: reading a 640x256 video that has only 5 frames takes about 11.5 s on my machine. Ideally I would like to be able to read 12-bit files as efficiently as I can read 8- or 16-bit files using memmap, or at least not 10^5 times slower. How can I speed things up?
Here is a file example: http://s000.tinyupload.com/index.php?file_id=26973488795334213426 (nb_frames=5, height=256, width=640).
Recommended answer
I have a slightly different implementation from the one proposed by @max9111 that doesn't require a call to unpackbits.
It creates two uint12 values from three consecutive uint8 values directly, by cutting the middle byte in half and using numpy's binary operations. In the following, data_chunk is assumed to be a binary string containing the information for an arbitrary number of 12-bit integers (hence its length must be a multiple of 3).
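def read_uint12(data_chunk):
    data = np.frombuffer(data_chunk, dtype=np.uint8)
    # Split into 3-byte groups and widen to uint16 so the shifts cannot overflow.
    fst_uint8, mid_uint8, lst_uint8 = np.reshape(data, (data.shape[0] // 3, 3)).astype(np.uint16).T
    # First value: first byte plus the high nibble of the middle byte.
    fst_uint12 = (fst_uint8 << 4) + (mid_uint8 >> 4)
    # Second value: low nibble of the middle byte plus the last byte.
    snd_uint12 = ((mid_uint8 % 16) << 8) + lst_uint8
    # Interleave the two sequences back into their original order.
    return np.reshape(np.concatenate((fst_uint12[:, None], snd_uint12[:, None]), axis=1), 2 * fst_uint12.shape[0])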
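To make the bit layout concrete, here is a small worked example: the three bytes 0xAB 0xCD 0xEF decode to the two 12-bit values 0xABC and 0xDEF. The sketch repeats read_uint12 so that it runs on its own. The pack_uint12 helper is hypothetical (it is not part of the original answer) and is only there to check a round trip. It assumes the same byte layout as above, and files from other sources may pack their bits differently.

```python
import numpy as np


def read_uint12(data_chunk):
    data = np.frombuffer(data_chunk, dtype=np.uint8)
    fst_uint8, mid_uint8, lst_uint8 = np.reshape(
        data, (data.shape[0] // 3, 3)).astype(np.uint16).T
    fst_uint12 = (fst_uint8 << 4) + (mid_uint8 >> 4)
    snd_uint12 = ((mid_uint8 % 16) << 8) + lst_uint8
    return np.reshape(
        np.concatenate((fst_uint12[:, None], snd_uint12[:, None]), axis=1),
        2 * fst_uint12.shape[0])


def pack_uint12(values):
    """Hypothetical inverse of read_uint12, used only for testing.

    Expects an even number of values, each in the range 0..4095.
    """
    pairs = np.asarray(values, dtype=np.uint16).reshape(-1, 2)
    fst, snd = pairs[:, 0], pairs[:, 1]
    packed = np.empty((pairs.shape[0], 3), dtype=np.uint8)
    packed[:, 0] = (fst >> 4).astype(np.uint8)
    packed[:, 1] = (((fst & 0x0F) << 4) | (snd >> 8)).astype(np.uint8)
    packed[:, 2] = (snd & 0xFF).astype(np.uint8)
    return packed.tobytes()


# Worked example: 0xAB 0xCD 0xEF -> 0xABC, 0xDEF
decoded = read_uint12(bytes([0xAB, 0xCD, 0xEF]))

# Round trip on random 12-bit values.
rng = np.random.default_rng(0)
original = rng.integers(0, 4096, size=1000).astype(np.uint16)
restored = read_uint12(pack_uint12(original))
```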
I benchmarked it against the other implementation, and this approach proved to be ~4x faster on a ~5 Mb input:
read_uint12_unpackbits
65.5 ms ± 1.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
read_uint12
14 ms ± 513 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
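For the video from the question, the decoded values still need to be reshaped into frames. The following is a sketch of one way to do that, not the answer's own code: read_12bit_video is a hypothetical helper name, and it uses a strided-slice variant of the same bit arithmetic, writing into a preallocated array instead of concatenating. The synthetic file and its dimensions are assumptions for illustration. Timings depend on the machine and input, so none are quoted here.

```python
import os
import tempfile
import timeit

import numpy as np


def read_12bit_video(path, nb_frames, height, width):
    """Hypothetical helper: decode a packed 12-bit file into uint16 frames."""
    raw = np.fromfile(path, dtype=np.uint8)
    n_pixels = nb_frames * height * width
    # Two pixels are stored in every three bytes.
    if n_pixels % 2 or raw.size != n_pixels * 3 // 2:
        raise ValueError("file size does not match the given dimensions")
    fst = raw[0::3].astype(np.uint16)
    mid = raw[1::3].astype(np.uint16)
    lst = raw[2::3].astype(np.uint16)
    out = np.empty(n_pixels, dtype=np.uint16)
    out[0::2] = (fst << 4) | (mid >> 4)        # first pixel of each pair
    out[1::2] = ((mid & 0x0F) << 8) | lst      # second pixel of each pair
    return out.reshape(nb_frames, height, width)


# Synthetic file: 2 frames of 2x3 pixels = 12 pixels = 18 bytes.
path = os.path.join(tempfile.mkdtemp(), "video12.bin")
with open(path, "wb") as f:
    f.write(bytes([0xAB, 0xCD, 0xEF]) * 6)

frames = read_12bit_video(path, nb_frames=2, height=2, width=3)

# Rough timing of the decoder on this (tiny) input, in seconds.
elapsed = timeit.timeit(lambda: read_12bit_video(path, 2, 2, 3), number=10)
```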