問題描述
所以我在這里問了另一個相關(guān)問題:java string hash function with avalanche effect,但我現(xiàn)在有一個不同的相關(guān)問題.
So I asked another related question here: java string hash function with avalanche effect, but I have a different, related question now.
我在那個問題中確定的是 String 的 hashCode() 函數(shù)沒有雪崩效應(yīng).這意味著,例如,如果我有字符串k1"、k2"、k3",并且我在每個字符串上調(diào)用 hashCode(),則返回的值將是連續(xù)的.
What I established in that question was that the hashCode() function for String does not have an avalanche effect. This means, for example, that if I have strings "k1", "k2", "k3", and I call hashCode() on each, the values returned will be contiguous.
現(xiàn)在,根據(jù)我對數(shù)據(jù)結(jié)構(gòu) 101 的回憶,我的印象是這是一件壞事.因為假設(shè) HashMap 通過以下算法選擇存儲桶:
Now, based on my recollection of data structures 101, I was under the impression that this is a bad thing. Because assuming that HashMap chooses buckets by an algorithm something like:
class HashMap {
private int capacity;
private int chooseBucket(String key) {
return key.hashCode() % capacity;
}
}
這意味著相似的密鑰存儲在連續(xù)的桶中,導(dǎo)致更高的沖突率,將 big-O 查找時間從 O(1) 降級為......誰知道有多糟糕......可能比 O 更糟(日志 n).
It would mean that similar keys are stored in contiguous buckets, leading to a higher rate of collisions, degrading big-O lookup time from O(1) to be...who knows how bad...maybe worse than O(log n).
我對第一個問題的回答類型大致是這里不需要雪崩效應(yīng)"、它僅適用于加密哈希函數(shù)"和字符串的 hashCode 實現(xiàn)速度很快,并且適用于小哈希圖".
The types of answers I got to my first question were along the lines of 'avalanche effect isn't needed here', 'it's only for cryptography hash functions', and 'the hashCode implementation for strings is fast and works well for small hash maps'.
這讓我很困惑.所有數(shù)據(jù)結(jié)構(gòu)在很小的時候都很快.Sun 不會提供適用于 large 數(shù)據(jù)集的默認(rèn) hashCode 函數(shù)嗎?那時 HashMap 的性能真的很重要,不是嗎?
Which confuses me. All data structures are fast when they're small. Wouldn't Sun provide a default hashCode function that will work well for large data sets? That's when the performance of HashMap really matters anyway, isn't it?
或者,我錯過了什么?請賜教.
Or, am I missing something? Please enlighten me.
推薦答案
將密鑰存儲在連續(xù)的桶中不會導(dǎo)致性能下降.將密鑰存儲在 same 存儲桶中(例如,chaining).使用鏈?zhǔn)浇鉀Q哈希沖突時:
Storing keys in contiguous buckets does not cause performance degradation. Storing keys in the same bucket (e.g., chaining) does. When using chaining to resolve hash collisions:
- 最壞的情況:每個哈希值都是相同的,所以所有元素最終都在同一個桶中,在這種情況下你會得到 O(n) 的性能(假設(shè)鏈?zhǔn)擎湵?
- 最佳情況:每個哈希值都不同,因此每個元素最終都位于不同的存儲桶中,因此您可以獲得預(yù)期的 O(1) 性能.
用于哈希表(等)的哈希代碼不需要需要 雪崩效應(yīng).
Hash codes for use in hash tables (and the like) do not need an avalanche effect.
這篇關(guān)于hashCode、實現(xiàn)以及與 HashMap 的關(guān)系的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網(wǎng)!