Problem description
I am getting really weird timings for the following code:
import numpy as np
s = 0
for i in range(10000000):
    s += np.float64(1)  # replace with np.float32 and built-in float
- built-in float: 4.9 s
- float64: 10.5 s
- float32: 45.0 s
Why is float64 twice as slow as float? And why is float32 5 times slower than float64?
Is there any way to avoid the penalty of using np.float64, and have numpy functions return built-in float instead of float64?
I found that using numpy.float64 is much slower than Python's float, and numpy.float32 is even slower (even though I'm on a 32-bit machine).
I use numpy.float32 on my 32-bit machine. Therefore, every time I use various numpy functions such as numpy.random.uniform, I convert the result to float32 (so that further operations would be performed at 32-bit precision).
Is there any way to set a single variable somewhere in the program or on the command line, and make all numpy functions return float32 instead of float64?
EDIT #1:
numpy.float64 is 10 times slower than float in arithmetic calculations. It's so bad that even converting to float and back before the calculations makes the program run 3 times faster. Why? Is there anything I can do to fix it?
I want to emphasize that my timings are not due to any of the following:
- function calls
- conversion between numpy and Python float
- creation of objects
I updated my code to make it clearer where the problem lies. With the new code, it would seem I see a ten-fold performance hit from using numpy data types:
from datetime import datetime
import numpy as np
START_TIME = datetime.now()
# one of the following lines is uncommented before execution
#s = np.float64(1)
#s = np.float32(1)
#s = 1.0
for i in range(10000000):
    s = (s + 8) * s % 2399232
print(s)
print('Runtime:', datetime.now() - START_TIME)
The timings are:
- float64: 34.56 s
- float32: 35.11 s
- float: 3.53 s
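For anyone who wants to repeat the measurement, here is a minimal sketch using the standard-library timeit module instead of datetime. The helper names run_loop and time_loop are made up for this illustration, and the iteration count is reduced so it finishes quickly; absolute numbers will differ by machine, Python version, and numpy version.

```python
import timeit

import numpy as np


def run_loop(s, n):
    """Apply the update from the question n times, starting from s."""
    for _ in range(n):
        s = (s + 8) * s % 2399232
    return s


def time_loop(start, n=100_000, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    return min(timeit.repeat(lambda: run_loop(start, n), number=1, repeat=repeat))


if __name__ == "__main__":
    starts = [
        ("float", 1.0),
        ("float64", np.float64(1)),
        ("float32", np.float32(1)),
    ]
    for label, start in starts:
        print(f"{label:8s} {time_loop(start):.3f} s")
```

Taking the minimum over a few repeats is a common way to reduce noise from other processes; it only measures the loop itself, not the import of numpy.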
Just for the sake of it, I also tried:
from datetime import datetime
import numpy as np
START_TIME = datetime.now()
s = np.float64(1)
for i in range(10000000):
    s = float(s)
    s = (s + 8) * s % 2399232
    s = np.float64(s)
print(s)
print('Runtime:', datetime.now() - START_TIME)
The execution time is 13.28 s; it's actually 3 times faster to convert the float64 to float and back than to use it as is. Still, the conversion takes its toll, so overall it's more than 3 times slower compared to the pure-Python float.
My machine is:
- Intel Core 2 Duo T9300 (2.5GHz)
- WinXP Professional (32-bit)
- ActiveState Python 3.1.3.5
- Numpy 1.5.1
EDIT #2:
Thank you for the answers, they help me understand how to deal with this problem.
But I would still like to know the precise reason (based on the source code, perhaps) why the code below runs 10 times slower with float64 than with float.
EDIT #3:
I reran the code under Windows 7 x64 (Intel Core i7 930 @ 3.8GHz).
Again, the code is:
from datetime import datetime
import numpy as np
START_TIME = datetime.now()
# one of the following lines is uncommented before execution
#s = np.float64(1)
#s = np.float32(1)
#s = 1.0
for i in range(10000000):
    s = (s + 8) * s % 2399232
print(s)
print('Runtime:', datetime.now() - START_TIME)
The timings are:
- float64: 16.1 s
- float32: 16.1 s
- float: 3.2 s
Now both np floats (either 64 or 32) are 5 times slower than the built-in float. Still, a significant difference. I'm trying to figure out where it comes from.
END OF EDITS
Recommended answer
Summary
If an arithmetic expression contains both numpy and built-in numbers, Python arithmetic runs slower. Avoiding the resulting conversion removes almost all of the performance degradation I reported.
Details
Note that in my original code:
s = np.float64(1)
for i in range(10000000):
    s = (s + 8) * s % 2399232
the built-in number types and numpy.float64 are mixed up in one expression. Perhaps Python had to convert them all to one type? To test this, the constants can be converted explicitly:
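One thing that can be checked directly is the result type of each form. The short sketch below (variable names mixed and explicit are just for illustration) suggests that the mixed expression still yields a numpy.float64, so any handling of the built-in constants happens inside numpy on every operation. Note that the literals 8 and 2399232 are Python ints here. This does not explain the timings by itself; it only confirms that both forms compute the same thing.

```python
import numpy as np

s = np.float64(1)

# numpy scalar combined with built-in Python numbers
mixed = (s + 8) * s % 2399232

# numpy scalars only
explicit = (s + np.float64(8)) * s % np.float64(2399232)

print(type(mixed).__name__, type(explicit).__name__)
print(mixed == explicit)
```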
s = np.float64(1)
for i in range(10000000):
    s = (s + np.float64(8)) * s % np.float64(2399232)
If the runtime is unchanged (rather than increased), it would suggest that this is indeed what Python was doing under the hood, explaining the performance drag.
Actually, the runtime fell by a factor of 1.5! How is that possible? Aren't these two conversions the worst thing that Python could possibly have to do?
I don't really know. Perhaps Python had to dynamically check what needs to be converted into what, which takes time, and being told what precise conversions to perform makes it faster. Perhaps some entirely different mechanism is used for arithmetic (which doesn't involve conversions at all), and it happens to be super-slow on mismatched types. Reading the numpy source code might help, but it's beyond my skill.
Anyway, now we can obviously speed things up more by moving the conversions out of the loop:
q = np.float64(8)
r = np.float64(2399232)
for i in range(10000000):
    s = (s + q) * s % r
As expected, the runtime is reduced substantially: by another factor of 2.3.
To be fair, we now need to change the float version slightly, by moving the literal constants out of the loop. This results in a tiny (10%) slowdown.
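The comparison described above can be sketched as follows. The function names loop_literals, loop_hoisted, and best_time are invented for this example, and the iteration count is far smaller than in the original runs, so treat it as a way to reproduce the trend rather than the exact ratios quoted here.

```python
import timeit

import numpy as np


def loop_literals(s, n):
    """Original form: the constants are literals inside the loop body."""
    for _ in range(n):
        s = (s + 8) * s % 2399232
    return s


def loop_hoisted(s, q, r, n):
    """Constants are passed in, already converted to the same type as s."""
    for _ in range(n):
        s = (s + q) * s % r
    return s


def best_time(func, *args, repeat=3):
    """Best wall-clock time in seconds over `repeat` runs."""
    return min(timeit.repeat(lambda: func(*args), number=1, repeat=repeat))


if __name__ == "__main__":
    n = 100_000  # far fewer iterations than the 10,000,000 used above
    cases = {
        "float, literals": (loop_literals, 1.0, n),
        "float, hoisted": (loop_hoisted, 1.0, 8.0, 2399232.0, n),
        "float64, literals": (loop_literals, np.float64(1), n),
        "float64, hoisted": (
            loop_hoisted, np.float64(1), np.float64(8), np.float64(2399232), n,
        ),
    }
    for label, (func, *args) in cases.items():
        print(f"{label:18s} {best_time(func, *args):.3f} s")
```

Passing the constants as arguments keeps the loop body identical for both types, so any difference between the "hoisted" rows comes from the scalar type rather than from how the constants are written.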
Accounting for all these changes, the np.float64 version of the code is now only 30% slower than the equivalent float version; the ridiculous 5-fold performance hit is largely gone.
Why do we still see the 30% delay? numpy.float64 numbers take the same amount of space as float, so that won't be the reason. Perhaps the resolution of the arithmetic operators takes longer for user-defined types. Certainly not a major concern.
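A few of these properties can be inspected directly. The sketch below shows that numpy.float64 is a subclass of the built-in float with an 8-byte numeric payload; the per-object sizes reported by sys.getsizeof include object overhead and may differ between the two types and between versions, so no particular values are assumed here.

```python
import sys

import numpy as np

x = np.float64(1.5)
y = 1.5

# numpy.float64 inherits from the built-in float
print(isinstance(x, float))

# Size of the numeric payload in bytes (a C double)
print(x.itemsize)

# Whole-object sizes, including interpreter overhead
print(sys.getsizeof(x), sys.getsizeof(y))

# Arithmetic on a numpy scalar stays in the numpy type
print(type(x + y).__name__)
```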