Why does Java's HashMap use (n - 1) & hash to compute the bucket array index?

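Anyone who has read the HashMap source knows how it derives the bucket array index from a hash value: it computes (n - 1) & hash. So why does it use a bitwise operation rather than the modulo operation (hash % n)?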

if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);
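To make the snippet above concrete, here is a minimal sketch of how a key ends up at a bucket index. The class BucketIndexDemo and the method names spread and indexFor are illustrative names chosen for this example; only the spreading expression (h ^ (h >>> 16)) and the mask (n - 1) & hash mirror what the JDK 8+ HashMap does, and n is assumed to be a power of two.

```java
// A minimal sketch, not the JDK source: class and method names are illustrative.
class BucketIndexDemo {
    // Same spreading step as java.util.HashMap.hash(Object) in JDK 8+:
    // XOR the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n (assumed to be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;            // default HashMap table length
        String key = "a";      // "a".hashCode() is 97
        int hash = spread(key);
        // prints "hash = 97, index = 1"
        System.out.println("hash = " + hash + ", index = " + indexFor(hash, n));
    }
}
```

Here 97 is 1100001 in binary, and masking with 15 (1111) keeps only the low four bits, 0001, so the key lands in bucket 1.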

1. Timing comparison of bitwise AND and modulo

public class Test {
	int a = 1;
	int number = 100000; // loop start value, initially one hundred thousand

	// Bitwise AND
	public long bitwise() {
		long start = System.currentTimeMillis();

		// Count from 100000 up to Integer.MAX_VALUE and measure the time taken.
		// The loop ends when i++ overflows to a negative value.
		for (int i = number; i > 0; i++) {
			a &= i;
		}
		long end = System.currentTimeMillis();
		long time = end - start;
		System.out.println("Bitwise time: " + time + "ms");
		return time;
	}

	// Modulo
	public long module() {
		long start = System.currentTimeMillis();
		// Same range as bitwise(): 100000 up to Integer.MAX_VALUE.
		for (int i = number; i > 0; i++) {
			a %= i;
		}
		long end = System.currentTimeMillis();
		long time = end - start;
		System.out.println("Modulo time: " + time + "ms");
		return time;
	}

	// Rough timing only: there is no JIT warm-up and each method runs once.
	public static void main(String[] args) {
		Test t = new Test();
		t.bitwise();
		t.module();
	}
}

Test results:

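The test shows that, for a sufficiently large data set, the modulo loop takes more than ten times as long as the bitwise loop (exact numbers will differ from machine to machine, but in my runs the bitwise loop was far faster). That is only one side of the story. With that much data, HashMap's initial capacity is certainly not enough, so its resize mechanism is triggered. In JDK 1.7, a resize moves every entry into the new table, recomputing each entry's bucket index (and, in some cases, its hash) and rebuilding the linked lists. The index computation is therefore repeated for every entry on every resize, and the gap in total running time can widen further. In terms of running time, the bitwise operation is far more efficient than modulo. (PS: JDK 1.8 optimized resize(); I do not go into the details here.)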
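For readers curious about the JDK 1.8 resize() optimization mentioned in passing, the core idea can be sketched as follows. When the table doubles from oldCap to 2 * oldCap, an entry either stays at its old index or moves to old index + oldCap, depending on the single bit (hash & oldCap). The class ResizeSplitDemo and its method names are hypothetical; this is a sketch of the index rule only, not the JDK's resize() code.

```java
// A minimal sketch of the index rule behind JDK 8's resize(); names are illustrative.
class ResizeSplitDemo {
    // New index derived from the old index and one extra hash bit.
    static int newIndexBySplit(int hash, int oldCap) {
        int oldIndex = (oldCap - 1) & hash;
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // New index computed from scratch against the doubled table length.
    static int newIndexByMask(int hash, int oldCap) {
        return (2 * oldCap - 1) & hash;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53};
        for (int hash : hashes) {
            System.out.println("hash " + hash
                    + ": split rule -> " + newIndexBySplit(hash, oldCap)
                    + ", full mask -> " + newIndexByMask(hash, oldCap));
        }
    }
}
```

Both columns agree for every hash, which is why the new position can be decided from one bit of the stored hash. This shortcut depends on the capacity being a power of two.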

2. How the bitwise operation keeps the index in bounds

This brings us to a related question: why is HashMap's capacity always a power of two? The two questions are closely linked.

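When n is a power of two, say n = 2^x, the binary form of n - 1 is x consecutive 1 bits in the low positions (for example, 16 - 1 = 15 = 1111). ANDing (n - 1) with hash keeps only the low x bits of hash and clears every bit above them, so the result always lies between 0 and n - 1 and the index can never run past the end of the array.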
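The effect of the mask can be seen by collecting the indices it produces. The sketch below uses a hypothetical class MaskDemo and a helper distinctIndices, and assumes the hashes are simply 0 to count - 1. With n = 16 every bucket is reachable; with a non-power-of-two length such as 10, n - 1 is 1001 in binary, so only four buckets can ever be selected.

```java
import java.util.TreeSet;

// A minimal sketch; MaskDemo and distinctIndices are illustrative names.
class MaskDemo {
    // Collect every index that (n - 1) & hash yields for hashes 0 .. count - 1.
    static TreeSet<Integer> distinctIndices(int n, int count) {
        TreeSet<Integer> indices = new TreeSet<>();
        for (int hash = 0; hash < count; hash++) {
            indices.add((n - 1) & hash);
        }
        return indices;
    }

    public static void main(String[] args) {
        // 15 is 1111 in binary: all 16 buckets are reachable.
        System.out.println("n = 16, mask = " + Integer.toBinaryString(15)
                + ", indices = " + distinctIndices(16, 100));
        // 9 is 1001 in binary: only buckets 0, 1, 8 and 9 are reachable.
        System.out.println("n = 10, mask = " + Integer.toBinaryString(9)
                + ", indices = " + distinctIndices(10, 100));
    }
}
```

In both cases the index stays within 0 to n - 1, but only a power-of-two length spreads the entries over all buckets.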

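In addition, when n is a power of two, the following identity holds for every non-negative hash: (n - 1) & hash == hash % n. For a negative hash, Java's % returns a result that is zero or negative, whereas the mask always yields a value between 0 and n - 1.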
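The identity can be checked directly. The sketch below uses a hypothetical class MaskVersusModuloDemo; it compares the mask with % for non-negative hashes and shows how the two differ for a negative hash, where the mask matches Math.floorMod rather than %.

```java
// A minimal sketch; the class and method names are illustrative.
class MaskVersusModuloDemo {
    static int byMask(int hash, int n) {
        return (n - 1) & hash;
    }

    static int byModulo(int hash, int n) {
        return hash % n;
    }

    // True if mask and % agree for every hash in 0 .. limit - 1 (n a power of two).
    static boolean agreeForNonNegative(int n, int limit) {
        for (int hash = 0; hash < limit; hash++) {
            if (byMask(hash, n) != byModulo(hash, n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int n = 16;
        System.out.println("agree on 0..99999: " + agreeForNonNegative(n, 100000));

        int negative = -7;
        System.out.println("mask:     " + byMask(negative, n));        // 9
        System.out.println("%:        " + byModulo(negative, n));      // -7
        System.out.println("floorMod: " + Math.floorMod(negative, n)); // 9
    }
}
```

Since hashCode() can return negative values, hash % n alone could produce a negative array index, while the mask cannot.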
