Problem description
Consider the following PHP code:
// Method 1
$array = array(1, 2, 3, 4, 5);
foreach ($array as $i => $number) {
    $number++;
    $array[$i] = $number;
}
print_r($array);
// Method 2
$array = array(1, 2, 3, 4, 5);
foreach ($array as &$number) {
    $number++;
}
print_r($array);
Both methods accomplish the same task, one by assigning a reference and the other by re-assigning based on the key. I want to use good programming techniques in my work, and I wonder which method is the better programming practice. Or is this one of those things that doesn't really matter?
Recommended answer
Since the highest-scoring answer states that the second method is better in every way, I feel compelled to post an answer here. True, looping by reference is more performant, but it isn't without risks and pitfalls.
Bottom line, as always: when you ask "Which is better, X or Y?", the only real answers you can get are:
- It depends on what you're after/what you're doing
- Oh, both are OK, if you know what you're doing
- X is good for this, Y is better for that
- Don't forget about Z, and even then... ("Which is better, X, Y or Z?" is the same question, so the same answers apply: it depends, both are OK if...)
Be that as it may, as Orangepill showed, the reference approach offers better performance. In this case, the tradeoff is one of performance versus code that is less error-prone and easier to read and maintain. In general, it's considered better to go for safer, more reliable, and more maintainable code:
'Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.' — Brian Kernighan
I guess that means the first method has to be considered best practice. But that doesn't mean the second approach should be avoided at all times, so what follows are the downsides, pitfalls and quirks that you'll have to take into account when using a reference in a foreach loop:
Scope:
For a start, PHP isn't truly block-scoped like C(++), C#, Java, Perl or (with a bit of luck) ECMAScript6... That means that the $value variable will not be unset once the loop has finished. When looping by reference, this means a reference to the last value of whatever object/array you were iterating is still floating around. The phrase "an accident waiting to happen" should spring to mind.
Consider what happens to $value, and subsequently $array, in the following code:
$array = range(1,10);
foreach($array as &$value)
{
$value++;
}
echo json_encode($array);
$value++;
echo json_encode($array);
$value = 'Some random value';
echo json_encode($array);
The output of this snippet will be:
[2,3,4,5,6,7,8,9,10,11]
[2,3,4,5,6,7,8,9,10,12]
[2,3,4,5,6,7,8,9,10,"Some random value"]
In other words, by reusing the $value variable (which still references the last element in the array), you're actually manipulating the array itself. This makes for error-prone code and difficult debugging. As opposed to:
$array = range(1, 10);
$array[] = 'foobar';
foreach ($array as $k => $v)
{
    $array[$k]++; // increments 'foobar' to 'foobas'!
    // on PHP < 8, $v + 1 yields 1 if $v === 'foobar'; PHP 8+ throws a TypeError instead
    if ($array[$k] === ($v + 1))
    { // 'foobas' === 1 => false, so this branch only runs for the numbers
        $array[$k] = $v; // restore the initial value from the copy in $v
    }
}
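The contrast with the by-reference loop can be sketched more directly. This is a minimal illustration, not part of the original answer: it reuses the $array and $v names from above, and the $before/$after variables are introduced here only to make the comparison visible.

```php
<?php
// Key-based loop: $v is a plain copy of each element.
$array = range(1, 10);
foreach ($array as $k => $v) {
    $array[$k] = $v + 1; // the array only changes through its key
}
$before = json_encode($array);

// Reusing $v after the loop does not touch the array,
// because $v never referenced any element.
$v = 'Some random value';
$after = json_encode($array);

echo $before, PHP_EOL; // [2,3,4,5,6,7,8,9,10,11]
echo $after, PHP_EOL;  // [2,3,4,5,6,7,8,9,10,11]
```

Compare this with the by-reference snippet earlier, where the same assignment to the loop variable overwrote the last element.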
Maintainability/idiot-proofness:
Of course, you might say that the dangling reference is an easy fix, and you'd be right:
foreach($array as &$value)
{
$value++;
}
unset($value);
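To see what the unset() call buys you, here is a small sketch of the classic follow-up bug: a second loop that reuses the same variable name by value. The two function names are hypothetical helpers made up for this illustration.

```php
<?php
// Hypothetical helper: forgets to unset the reference.
function incrementWithoutUnset(array $array): array
{
    foreach ($array as &$value) {
        $value++;
    }
    // $value still references the last element, so this by-value loop
    // writes every element it visits into that last slot.
    foreach ($array as $value) {
        // nothing to do; the assignment to $value is the damage
    }
    return $array;
}

// Hypothetical helper: same code, but the reference is broken first.
function incrementWithUnset(array $array): array
{
    foreach ($array as &$value) {
        $value++;
    }
    unset($value); // $value no longer points into the array
    foreach ($array as $value) {
        // harmless now: $value is an ordinary variable
    }
    return $array;
}

echo json_encode(incrementWithoutUnset([1, 2, 3])), PHP_EOL; // [2,3,3]
echo json_encode(incrementWithUnset([1, 2, 3])), PHP_EOL;    // [2,3,4]
```

The only difference between the two functions is the single unset() line, which is exactly the kind of line that gets forgotten or deleted.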
But after you've written your first 100 loops with references, do you honestly believe you won't have forgotten to unset a single reference? Of course not! It's so uncommon to unset variables that have been used in a loop (we assume the GC will take care of it for us) that most of the time, you don't bother. When references are involved, this is a source of frustration, mysterious bug reports, or traveling values, especially where you're using complex nested loops, possibly with multiple references... The horror, the horror.
Besides, as time passes, who's to say that the next person working on your code won't forget about unset? Who knows, they might not even know about references, or they might see your numerous unset calls, deem them redundant, a sign of your being paranoid, and delete them altogether. Comments alone won't help you: they need to be read, and everyone working with your code should be thoroughly briefed, perhaps by having them read a full article on the subject. The examples listed in the linked article are bad, but I've seen worse still:
foreach ($nestedArr as &$array)
{
    if (count($array) % 2 === 0)
    {
        foreach ($array as &$value)
        { // pointless, but you get the idea...
            $value = array($value, 'Part of even-length array');
        }
        // $value now references the last index of $array
    }
    else
    {
        $value = array_pop($array); // assigns new value to var that might be a reference!
        $value = is_numeric($value) ? $value / 2 : null;
        array_push($array, $value); // congrats, X-references ==> traveling value!
    }
}
This is a simple example of a traveling value problem. I did not make this up, by the way: I've come across code that boils down to this... honestly. The references make the code harder to understand and the bug harder to spot, yet the problem is still fairly obvious in this example, mainly because it's a mere 15 or so lines long, even using the spacious Allman coding style... Now imagine this basic construct being used in code that actually does something even slightly more complex and meaningful. Good luck debugging that.
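A runnable sketch of the traveling value may help here. It wraps the construct above in a hypothetical function, halveOrTag, with a made-up $unsetRefs flag so both behaviours can be compared. The sample input and the shortened 'even' tag are assumptions chosen only to keep the output short.

```php
<?php
// Hypothetical wrapper around the nested loop shown above.
// $unsetRefs toggles whether the inner reference is broken after use.
function halveOrTag(array $nestedArr, bool $unsetRefs): array
{
    foreach ($nestedArr as &$array) {
        if (count($array) % 2 === 0) {
            foreach ($array as &$value) {
                $value = [$value, 'even'];
            }
            if ($unsetRefs) {
                unset($value); // break the reference to the last element
            }
        } else {
            // Without the unset above, $value may still reference the last
            // element of the previous sub-array, so these assignments leak.
            $value = array_pop($array);
            $value = is_numeric($value) ? $value / 2 : null;
            array_push($array, $value);
        }
    }
    unset($array);
    return $nestedArr;
}

$input = [[1, 2], [3, 4, 6]];
echo json_encode(halveOrTag($input, false)), PHP_EOL;
// [[[1,"even"],3],[3,4,3]]  <- the 3 traveled into the first sub-array
echo json_encode(halveOrTag($input, true)), PHP_EOL;
// [[[1,"even"],[2,"even"]],[3,4,3]]
```

In the first call the halved value ends up in a sub-array that the else branch never meant to touch, which is the traveling value in miniature.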
Side-effects:
It's often said that functions shouldn't have side-effects, because side-effects are (rightfully) considered to be a code smell. Though foreach is a language construct and not a function, the same mindset should apply in your example. When using too many references, you're being too clever for your own good, and you might find yourself having to step through a loop just to know what is being referenced by which variable, and when.
The first method hasn't got this problem: you have the key, so you know where you are in the array. What's more, with the first method, you can perform any number of operations on the value without changing the original value in the array (no side-effects):
function recursiveFunc($n, $max = 10)
{
if (--$max)
{
return $n === 1 ? 10-$max : recursiveFunc($n%2 ? ($n*3)+1 : $n/2, $max);
}
return null;
}
$array = range(10,20);
foreach($array as $k => $v)
{
$v = recursiveFunc($v);//reassigning $v here
if ($v !== null)
{
$array[$k] = $v;//only now, will the actual array change
}
}
echo json_encode($array);
This generates the output:
[7,11,12,13,14,15,5,17,18,19,8]
As you can see, the first, seventh and eleventh elements have been altered; the others haven't. If we were to rewrite this code using a loop by reference, the loop looks a lot smaller, but the output will be different (we have a side-effect):
$array = range(10, 20);
foreach ($array as &$v)
{
    $v = recursiveFunc($v); // changes the original array, even when the result is null
    // granted, if your version permits it (?: needs PHP 5.3+), you'd probably do this instead:
    // $v = recursiveFunc($v) ?: $v;
}
echo json_encode($array);
// [7,null,null,null,null,null,5,null,null,null,8]
To counter this, we'll either have to create a temporary variable, or call the function twice, or add a key and recalculate the initial value of $v, but that's just plain stupid (that's adding complexity to fix what shouldn't be broken):
foreach ($array as &$v)
{
    $temp = recursiveFunc($v); // creating a copy here, anyway
    $v = $temp ? $temp : $v; // assignment doesn't require the lookup, though
}
// or:
foreach ($array as &$v)
{
    $v = recursiveFunc($v) ? recursiveFunc($v) : $v; // 2 calls === twice the overhead!
}
// or:
$base = reset($array); // get the base value
foreach ($array as $k => &$v)
{ // silly: combine both methods to fix what needn't be a problem to begin with
    $v = recursiveFunc($v);
    if ($v === null) // recursiveFunc returns null, not 0, when it gives up
    {
        $v = $base + $k; // recalculate the initial value (works because of range())
    }
}
Anyway, adding branches, temp variables and what have you rather defeats the point. For one, it introduces extra overhead, which will eat away at the performance benefits references gave you in the first place.
If you have to add logic to a loop to fix something that shouldn't need fixing, you should step back and think about what tools you're using. Nine times out of ten, you chose the wrong tool for the job.
The last thing that, to me at least, is a compelling argument for the first method is simple: readability. The reference operator (&) is easily overlooked if you're doing some quick fixes or trying to add functionality. You could be creating bugs in code that was working just fine. What's more: because it was working fine, you might not test the existing functionality as thoroughly, because there were no known issues.
Discovering a bug that went into production because you overlooked an operator might sound silly, but you wouldn't be the first to have encountered this.
Note:
Call-time pass-by-reference was removed in PHP 5.4. Be wary of features and functionality that are subject to change. The standard iteration of an array hasn't changed in years. I guess it's what you could call "proven technology". It does what it says on the tin, and it is the safer way of doing things. So what if it's slower? If speed is an issue, you can optimize your code and introduce references to your loops then.
When writing new code, go for the easy-to-read, most failsafe option. Optimization can (and indeed should) wait until everything's tried and tested.
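If speed does become an issue, it's worth measuring before switching to references. What follows is only a rough sketch, not a rigorous benchmark: the function names incrementByKey, incrementByReference and timeIt are hypothetical, the array size is arbitrary, and the timings will vary with PHP version and machine.

```php
<?php
// Hypothetical helper: method 1, re-assigning through the key.
function incrementByKey(array $array): array
{
    foreach ($array as $k => $v) {
        $array[$k] = $v + 1;
    }
    return $array;
}

// Hypothetical helper: method 2, looping by reference (with the unset).
function incrementByReference(array $array): array
{
    foreach ($array as &$v) {
        $v++;
    }
    unset($v);
    return $array;
}

// Hypothetical helper: wall-clock time of a single call, in seconds.
function timeIt(callable $fn, array $input): float
{
    $start = microtime(true);
    $fn($input);
    return microtime(true) - $start;
}

$input = range(1, 100000); // arbitrary size, chosen for illustration
printf("by key:       %.5f s\n", timeIt('incrementByKey', $input));
printf("by reference: %.5f s\n", timeIt('incrementByReference', $input));
```

Both helpers return the same array, so the faster one can be swapped in later without changing behaviour, once the numbers show that it matters.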
And as always: premature optimization is the root of all evil. Choose the right tool for the job, not the one that happens to be new and shiny.