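Java Collections Study 2: JDK 1.7 HashMap Explained

1. Introduction
HashMap is a container that stores key-value pairs. Its underlying data structure is an array plus linked lists. It is not thread-safe, so it is suited to single-threaded use, but in return it is comparatively fast. Internally it makes heavy use of bitwise operations to speed up computation. HashMap is fairly complex, so before diving in, let's go over some prerequisites.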

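2. Prerequisites

Bitwise AND (&): 1&1=1, 0&0=0, 0&1=0, 1&0=0. A result bit is 1 only when both corresponding bits are 1. (Used for an operation similar to taking a remainder, but more efficient.)
XOR (^): 0^0=0, 0^1=1, 1^0=1, 1^1=0. Equal bits give 0, different bits give 1. (Used when computing hash codes.)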
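
To make the two operators concrete, here is a minimal sketch. The class name BitOpsDemo and the helper maskIndex are my own names for illustration, not part of the JDK. It assumes n is a power of two and h is non-negative; for a negative h the two expressions differ, because % can return a negative number while & cannot.

```java
class BitOpsDemo {
    // For a power-of-two n, h & (n - 1) keeps only the low bits of h,
    // which equals h % n when h is non-negative.
    static int maskIndex(int h, int n) {
        return h & (n - 1);
    }

    public static void main(String[] args) {
        int n = 16; // power of two, so n - 1 is 1111 in binary
        for (int h : new int[] {5, 16, 21, 100}) {
            System.out.println(h + " & 15 = " + maskIndex(h, n)
                    + ", " + h + " % 16 = " + (h % n));
        }
        // XOR: equal bits give 0, different bits give 1
        System.out.println(Integer.toBinaryString(0b1100 ^ 0b1010)); // 110
    }
}
```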

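3. HashMap structure diagram
[Figure: HashMap structure diagram]

Structure summary:

The structure combines an array with linked lists; new entries are added by head insertion.
If key == null, the key-value pair is stored in the linked list at index = 0.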

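4. Important parameters

 DEFAULT_INITIAL_CAPACITY = 1 << 4  default table capacity, 16
 MAXIMUM_CAPACITY = 1 << 30  maximum table capacity
 DEFAULT_LOAD_FACTOR = 0.75f  default load factor
 size   number of key-value pairs in the map
 threshold  size at which the table is resized (capacity * load factor)
 loadFactor  load factor of the table
 modCount   number of structural modifications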
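
As a quick illustration of how capacity, load factor and threshold relate, the sketch below reuses the threshold formula that appears in resize() later in this article. The class name ThresholdDemo and the method threshold are hypothetical names of mine; only the formula comes from the source.

```java
class ThresholdDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same formula as in resize(): capacity * loadFactor, capped at MAXIMUM_CAPACITY + 1.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
    }
}
```

With the defaults, the map therefore becomes a candidate for resizing once it holds 12 entries.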

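5. The linked-list node class

Two methods matter most here: the constructor of the Entry node class, and the add method addEntry, which belongs to HashMap itself. Both are used frequently in HashMap. The Entry class also overrides equals and hashCode. The add method in turn calls a very important resize method, which I cover in detail in this section.

Constructor

// The node class constructor takes four parameters:
// h, k, v, n. h is the hash value, k and v are the key and value,
// and n is a reference to the next node.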
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }

Adding a node


    void addEntry(int hash, K key, V value, int bucketIndex) {
        // Resize only if size has reached the threshold
        // AND the target bucket is already occupied
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // Double the table
            resize(2 * table.length);
            // A null key always hashes to 0; otherwise recompute the hash
            hash = (null != key) ? hash(key) : 0;
            // Recompute the bucket index against the new table length
            bucketIndex = indexFor(hash, table.length);
        }
        // Create and link the new entry
        createEntry(hash, key, value, bucketIndex);
    }


    // This method calls several other methods; I explain them one by one below.

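Computing the hash value

    // This method computes the hash value with heavy use of bit operations. No need to dig into the details; it is enough to know that it returns an int.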
     final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
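
For readers who do want a feel for what the shifts achieve, here is a small sketch. It copies only the two mixing lines from hash() and assumes hashSeed is 0; SpreadDemo and spread are names I made up for the demo. Two values that differ only in their high bits collide when masked directly, but land in different buckets after mixing.

```java
class SpreadDemo {
    // The bit-mixing step from hash(), with hashSeed assumed to be 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        int a = 0x10000; // a and b differ only above the low 4 bits
        int b = 0x20000;
        // Without mixing, both fall into bucket 0 of a 16-slot table
        System.out.println((a & 15) + " " + (b & 15)); // 0 0
        // After mixing, the high bits influence the index
        System.out.println((spread(a) & 15) + " " + (spread(b) & 15)); // 1 2
    }
}
```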

Computing the table index

    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        // Bitwise AND replaces %, which is more efficient
        return h & (length-1);
    }

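Summary: this method is interesting. A careful reader will notice that DEFAULT_INITIAL_CAPACITY (the default table capacity), MAXIMUM_CAPACITY (the maximum table capacity) and the resize factor are all even, and in fact all powers of two. Why never odd? The reason lies in this method. First, a comparison:

a&b (decimal)	Binary	Result
7&5	111&101	5
7&4	111&100	4
8&5	1000&0101	0
8&4	1000&0100	0

In each row a&b, a is the left operand and b is the right operand.
The table shows a pattern: if b is even, the result is even no matter what a is; if b is odd, the result may be odd or even. The reason is that an even b has 0 as its last binary bit, and 0 ANDed with 0 or 1 is always 0. An odd b ends in 1, and 1 ANDed with 0 or 1 can give 0 or 1.
Applying this to indexFor(): length-1 must be odd, otherwise the lowest bit of every index is 0, only even indexes are ever used, and roughly half of the table is wasted. Any even length makes length-1 odd, but a power of two goes further: length-1 is then all ones in binary (16-1 = 1111), so every index from 0 to length-1 can be produced, and h & (length-1) equals h % length for non-negative h. A mistake that wastes half the table would not be acceptable in the JDK source.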
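
The effect is easy to check by brute force. The sketch below is my own illustration (IndexCoverageDemo and reachable are hypothetical names): it collects every index that h & (length-1) can produce for a range of hash values, once for a power-of-two length and once for an odd length.

```java
import java.util.TreeSet;

class IndexCoverageDemo {
    // Collects every index that h & (length - 1) produces for h in [0, 1024).
    static TreeSet<Integer> reachable(int length) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 1024; h++) {
            seen.add(h & (length - 1));
        }
        return seen;
    }

    public static void main(String[] args) {
        // length - 1 = 15 = 1111 in binary: all 16 slots are reachable
        System.out.println(reachable(16));
        // length - 1 = 14 = 1110 in binary: only the even slots are reachable
        System.out.println(reachable(15));
    }
}
```

With length 15 only 8 of the 15 slots are ever used, which is the "half the table" waste described above.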

Creating a node and adding it to the list

    void createEntry(int hash, K key, V value, int bucketIndex) {
        // Save the current head of this bucket as e
        Entry<K,V> e = table[bucketIndex];
        // Put the new node into the bucket, with its next pointing at e
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        // size + 1
        size++;
    }
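
Head insertion means the most recently added entry sits at the front of its bucket. A minimal sketch with a stripped-down node type shows the resulting order; HeadInsertDemo, Node, addFirst and render are illustrative names of mine, not JDK classes.

```java
class HeadInsertDemo {
    // Minimal stand-in for HashMap.Entry: just a key and a next pointer.
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Same shape as createEntry(): the new node points at the old head
    // and becomes the new head of the bucket.
    static Node addFirst(Node head, String key) {
        return new Node(key, head);
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node bucket = null;
        for (String k : new String[] {"A", "B", "C"}) {
            bucket = addFirst(bucket, k);
        }
        System.out.println(render(bucket)); // C -> B -> A
    }
}
```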

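Resizing

    // Resize method
    void resize(int newCapacity) {
        // Keep a reference to the current table as oldTable
        Entry[] oldTable = table;
        // Length of the old array (its capacity), not of a linked list
        int oldCapacity = oldTable.length;
        // If the array is already at maximum capacity, set the threshold
        // to Integer.MAX_VALUE (31 one-bits) and skip the resize
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        // Allocate a new array of size newCapacity (the argument)
        Entry[] newTable = new Entry[newCapacity];
        // Move the entries into the new array
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        // Point table at the new array
        table = newTable;
        // New threshold = capacity * load factor, capped at MAXIMUM_CAPACITY + 1
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }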


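    // Move every entry from the old table into the new one
    void transfer(Entry[] newTable, boolean rehash) {

        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                // Remember e's successor before e is relinked
                Entry<K,V> next = e.next;
                // rehash is true only when initHashSeedAsNeeded() changed the hash seed;
                // the stored hashes are then stale and must be recomputed (a null key stays 0)
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                // Index of e in the new table
                int i = indexFor(e.hash, newCapacity);
                // Head insertion: link e in front of the new bucket's current head
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }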
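
One consequence of using head insertion in transfer() is that entries that stay together in a bucket end up in reversed order after a resize. The sketch below reproduces the loop on a stripped-down node type, without the optional rehash step; TransferDemo and its members are my own illustrative names.

```java
class TransferDemo {
    // Minimal stand-in for HashMap.Entry: a stored hash and a next pointer.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Same loop shape as transfer(), minus the rehash branch.
    static Node[] transfer(Node[] oldTable, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (Node e : oldTable) {
            while (e != null) {
                Node next = e.next;                 // save the successor
                int i = e.hash & (newCapacity - 1); // index in the new table
                e.next = newTable[i];               // head insertion
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hashes 1, 5 and 9 all land in bucket 1 of a 4-slot table: 1 -> 5 -> 9
        Node[] old = new Node[4];
        old[1] = new Node(1, new Node(5, new Node(9, null)));
        Node[] grown = transfer(old, 8);
        System.out.println(render(grown[1])); // 9 -> 1
        System.out.println(render(grown[5])); // 5
    }
}
```

Entries 1 and 9 share a bucket before and after the resize, but their relative order flips from 1, 9 to 9, 1.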

6. Insert (put)

    public V put(K key, V value) {
        // If the table has not been allocated yet, initialize it
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // A null key is stored in table[0]
        if (key == null)
            return putForNullKey(value);
        // Compute the hash, then the bucket index
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        // If an entry with an equal key already exists, overwrite its value
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        // Modification count + 1
        modCount++;
        // Add the new entry
        addEntry(hash, key, value, i);
        return null;
    }

    /**
     * Offloaded version of put for null keys
     */
    private V putForNullKey(V value) {
        // Walk the list in table[0]; if an entry with key == null exists, overwrite its value
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        // Modification count + 1
        modCount++;
        // Otherwise insert a new entry with key == null
        addEntry(0, null, value, 0);
        return null;
    }
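
From the caller's side, the two branches of put() show up in its return value. The sketch below uses the public java.util.HashMap API; PutDemo and putTwice are hypothetical helper names of mine.

```java
import java.util.HashMap;
import java.util.Map;

class PutDemo {
    // Inserts the same key twice and reports what the second put() returned.
    static Integer putTwice(Map<String, Integer> map, String key, int first, int second) {
        map.put(key, first);         // no previous mapping: returns null
        return map.put(key, second); // overwrites and returns the old value
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(putTwice(map, "a", 1, 2));  // 1
        System.out.println(putTwice(map, null, 7, 8)); // 7 (a null key is allowed)
        System.out.println(map.size());                // 2
        System.out.println(map.get("a"));              // 2
    }
}
```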

7. Delete (remove)

    // Remove the entry for the given key
    public V remove(Object key) {
        Entry<K,V> e = removeEntryForKey(key);
        return (e == null ? null : e.value);
    }

    // The actual removal logic
    final Entry<K,V> removeEntryForKey(Object key) {
        if (size == 0) {
            return null;
        }
        // Compute the hash from the key, then the bucket index
        int hash = (key == null) ? 0 : hash(key);
        int i = indexFor(hash, table.length);
        Entry<K,V> prev = table[i];
        Entry<K,V> e = prev;
        // Walk the bucket's list. Two cases: the entry to remove is the head,
        // or it is a later node. Each case is unlinked differently.
        while (e != null) {
            Entry<K,V> next = e.next;
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k)))) {
                modCount++;
                size--;
                if (prev == e)
                    table[i] = next;
                else
                    prev.next = next;
                e.recordRemoval(this);
                return e;
            }
            prev = e;
            e = next;
        }

        return e;
    }
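
The head versus non-head distinction can be tried out on a plain singly linked list. This is a simplified sketch, not the JDK code: it matches on the key only and returns the new head instead of writing into table[i]. UnlinkDemo and its members are illustrative names.

```java
class UnlinkDemo {
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Removes the first node whose key equals `key` and returns the new head.
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e) {
                    return next;  // removing the head: the list now starts at next
                }
                prev.next = next; // removing a later node: bypass it
                return head;
            }
            prev = e;
            e = next;
        }
        return head; // key not found, list unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node list = new Node("A", new Node("B", new Node("C", null)));
        list = remove(list, "B");
        System.out.println(render(list)); // A -> C
        list = remove(list, "A");
        System.out.println(render(list)); // C
    }
}
```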

8. Lookup (get)

    public V get(Object key) {
        // A null key is handled by a dedicated method
        if (key == null)
            return getForNullKey();
        // Look up the entry for the key
        Entry<K,V> entry = getEntry(key);
        // Return null if no entry was found, otherwise the entry's value
        return null == entry ? null : entry.getValue();
    }

    // Handles lookup for key == null
    private V getForNullKey() {
        if (size == 0) {
            return null;
        }
        // Walk table[0]; return the value if a null-key entry exists, otherwise null
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
    }

    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }
        // Walk the list in the bucket for this key's hash; return the entry if found, otherwise null
        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }
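
Because get() returns null both when there is no entry and when the entry's value is null, a caller cannot tell the two apart from get() alone. A small sketch using the public java.util.HashMap API shows containsKey() resolving the ambiguity; GetDemo and describe are hypothetical names of mine.

```java
import java.util.HashMap;
import java.util.Map;

class GetDemo {
    // Describes a lookup, using containsKey() to tell "absent" from "mapped to null".
    static String describe(Map<String, String> map, String key) {
        if (!map.containsKey(key)) {
            return "absent";
        }
        String v = map.get(key);
        return v == null ? "present, value null" : "present, value " + v;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("k", null);
        map.put(null, "v");
        System.out.println(map.get("k"));             // null
        System.out.println(map.get("missing"));       // null
        System.out.println(describe(map, "k"));       // present, value null
        System.out.println(describe(map, "missing")); // absent
        System.out.println(describe(map, null));      // present, value v
    }
}
```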

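9. Summary
While writing this I wondered why there are insert, delete and lookup operations but no update. After reading the source I found that put itself covers updating: the same key cannot appear twice in one table. This article still has gaps. The problem of a HashMap linked list forming a cycle is not explained yet, nor is the fail-fast mechanism and how it relates to thread-safety issues. I will cover these in a later update.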
