Question
Is there a way to strictly ensure the number of entries per HashMap bucket without tampering with the Object.hashCode() function in Java?
The load factor is an average: (number of entries) / (number of buckets). In essence, let's say I have a HashMap with a capacity of 1000 and, for the sake of this example, a load factor of 1. The 100 objects I'm going to store in the HashMap have a bad hashCode() function that always returns the same value for every object. Once I'm done storing the 100 objects, they will all map to the same HashMap bucket, and I end up with LinkedList performance. The load factor stays silent, because 100 entries / 1000 buckets = 0.1 < 1. Now what happens if I put in 1 M such objects? Resizing the HashMap (even once the load-factor threshold is eventually crossed) is no use anyway, since every entry still lands in the same bucket.
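To make the scenario concrete, here is a minimal, runnable sketch; the BadKey class is hypothetical, with a deliberately constant hashCode(), so every entry lands in the same bucket:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key whose hashCode() is deliberately broken:
// every instance returns the same value.
final class BadKey {
    private final int id;

    BadKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return 42; // constant: forces all keys into one bucket
    }
}

public class CollisionDemo {
    public static void main(String[] args) {
        // Capacity 1000, load factor 1.0: the resize threshold (1000
        // entries) is never crossed with only 100 entries.
        Map<BadKey, Integer> map = new HashMap<>(1000, 1.0f);
        for (int i = 0; i < 100; i++) {
            map.put(new BadKey(i), i);
        }
        // All 100 entries share one bucket; this lookup walks the chain.
        System.out.println(map.get(new BadKey(50))); // prints 50
    }
}
```

(As an aside, Java 8 and later convert long collision chains into balanced trees, which bounds the damage when keys are Comparable; the underlying problem remains.)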
I know this is an uncommon scenario in the real world, but I would like to improve my understanding. Is there a way in HashMap to prevent this, or at least to get some warning from the structure itself?
Answer
A HashMap will always calculate which bucket to use from the key's hash code. If every key has the same hash code, all keys will map to the same bucket. You cannot prevent the behavior you describe without providing a better hashCode() implementation.
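For completeness, here is a minimal sketch of what a better hashCode() typically looks like; GoodKey and its fields are hypothetical, and the key point is that the hash is derived from the same fields that equals() compares:

```java
import java.util.Objects;

// Hypothetical key with a reasonable hashCode(): it is computed
// from the same fields that equals() compares.
final class GoodKey {
    private final int id;
    private final String name;

    GoodKey(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GoodKey)) return false;
        GoodKey other = (GoodKey) o;
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Objects.hash spreads distinct keys across buckets far
        // better than a constant value.
        return Objects.hash(id, name);
    }
}
```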
You could look at Map implementations that use open addressing (e.g. Trove's THashMap). They always hold just one entry per slot. But performance will not improve: they merely deal with collisions in a different way, and they won't solve your root problem, a bad hash code.
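To illustrate the difference, here is a toy open-addressing map using linear probing; ProbingMap is a hypothetical sketch, not Trove's actual API, and it omits resizing, deletion, and full-table handling for brevity:

```java
// Toy open-addressing map with linear probing: on a collision,
// the entry goes into the next free slot instead of a chain.
// No resizing or full-table handling; illustration only.
final class ProbingMap {
    private final Object[] keys = new Object[16];
    private final Object[] values = new Object[16];

    void put(Object key, Object value) {
        int i = Math.floorMod(key.hashCode(), keys.length);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length; // probe the next slot
        }
        keys[i] = key;
        values[i] = value;
    }

    Object get(Object key) {
        int i = Math.floorMod(key.hashCode(), keys.length);
        while (keys[i] != null) {
            if (keys[i].equals(key)) {
                return values[i];
            }
            i = (i + 1) % keys.length;
        }
        return null;
    }
}
```

With a constant hash code, every probe sequence starts at the same slot, so lookups still degrade to a linear scan; open addressing changes how collisions are stored, not how often they happen.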