A Detailed Look at the Internals of HashSet's add Method

Article link: https://blog.csdn.net/weixin_45104211/article/details/99130106

I. What you will learn in this article:

1. HashSet's member field map and its constant PRESENT

    private transient HashMap<E,Object> map;

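2. The no-argument constructor of HashSet

3. The put() method of HashMap in detail

4. The role of putVal inside put(), and the role of the hash() method

5. The role and return value of resize() inside putVal

6. newNode(), the method putVal() uses to construct a node

7. How the second if in putVal() handles three different cases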

II. The code behind add in detail

1. When a HashSet is created:

    public HashSet() {
        map = new HashMap<>();
    }

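As noted above, creating the object calls the no-argument constructor HashSet(), which creates a HashMap and assigns it to map, the HashSet's member field.

2. Calling HashSet's add method: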

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

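As you can see, add returns a boolean and delegates to the put() method of map. The argument e is the element being added (of the set's type parameter E), and PRESENT is a constant dummy value shared by every entry. Next, the put method: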

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

(1) put returns whatever putVal returns. Its first argument is the result of hash(key), so let's look at what hash() does:

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

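The return value is determined entirely by key.hashCode(): equal hashCode() values always give equal results, and because XOR-ing an int with its own upper 16 bits is reversible, different hashCode() values give different results. (Question to consider: what is hashCode() for?)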
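To make the effect of hash() concrete, here is a minimal sketch that re-implements the same expression outside HashMap (the real method is not accessible from user code). The names HashSpreadDemo, spread and indexFor are made up for this illustration; indexFor mirrors the (n - 1) & hash expression that putVal uses to pick a bucket.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(): fold the upper 16 bits into the lower 16.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int hash = spread("Tom");
        System.out.println("Tom".hashCode());       // 84274
        System.out.println(hash);                   // 84275
        System.out.println(indexFor(hash, 16));     // 3
        System.out.println(spread(null));           // 0
    }
}
```

Folding the upper bits down matters because the bucket index only looks at the low bits of the hash when the table is small.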

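Next, consider toString(). The default implementation inherited from Object returns the class name followed by @ and the object's hash code in hexadecimal (println calls toString() implicitly, so it does not need to be written out). Note that String overrides this method, so printing a String shows its characters instead: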

    public String toString() {
        return getClass().getName() + "@" + Integer.toHexString(hashCode());
    }

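So the value after the @ is simply hashCode() converted to hexadecimal by Integer.toHexString(hashCode()). For a class that does not override hashCode(), this is an identity hash code supplied by the JVM; it is often described as the object's address, but it is not guaranteed to be the actual memory address: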

public class Test {
	public static void main(String[] args) {
		Test test=new Test();
		System.out.println(test.toString());
		System.out.println(Integer.toHexString(test.hashCode()));
	}
}

Output (the exact value differs from run to run):
moon.Test@52e922
52e922
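The relationship between the default toString() and hashCode() can also be checked without relying on a particular printed value. The sketch below is only an illustration; ToStringDemo and defaultToString are hypothetical names, while System.identityHashCode is a standard JDK method.

```java
class ToStringDemo {
    // Rebuilds the string that Object.toString() produces by default.
    static String defaultToString(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object o = new Object();
        // The default toString() is exactly class name + "@" + hex hash code.
        System.out.println(o.toString().equals(defaultToString(o)));    // true
        // Without an override, hashCode() is the identity hash code.
        System.out.println(o.hashCode() == System.identityHashCode(o)); // true
    }
}
```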

Note: String overrides hashCode(), so two String objects with the same characters always return the same hashCode:

    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

A demonstration:

public class Test {
	public static void main(String[] args) {
		String str="Tom";
		String str2=new String("Tom");
		System.out.println(str==str2);
		System.out.println(str.hashCode());
		System.out.println(str2.hashCode());
	}
}

Output:
false
84274
84274

The hashCode values are equal even though the two references point to different objects.
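The value 84274 can be reproduced by hand from the loop in String.hashCode(). The following sketch recomputes it; StringHashDemo and polynomialHash are names invented for this example.

```java
class StringHashDemo {
    // Same recurrence as the loop in String.hashCode(): h = 31 * h + c.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'T' = 84, 'o' = 111, 'm' = 109
        // ((84 * 31) + 111) * 31 + 109 = 84274
        System.out.println(polynomialHash("Tom"));                     // 84274
        System.out.println(polynomialHash("Tom") == "Tom".hashCode()); // true
    }
}
```

Because the result depends only on the characters, a literal and a copy created with new String(...) must agree.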

(2) The putVal method:

First, the source code:

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // the remainder of putVal (++modCount ... return null) is shown further below
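Before going through the method branch by branch, its control flow can be condensed into a much smaller sketch. This is a simplification written for illustration, not the JDK code: MiniMap and its Node are hypothetical, the table is fixed at 16 buckets, and resizing and tree bins are left out entirely.

```java
import java.util.Objects;

class MiniMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];
    private int size;

    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if the key was absent.
    V put(K key, V value) {
        int hash = hash(key);
        int i = (table.length - 1) & hash;
        Node<K, V> p = table[i];
        if (p == null) {                      // empty bucket: store directly
            table[i] = new Node<>(hash, key, value);
            size++;
            return null;
        }
        while (true) {
            // Same test as putVal: equal hash, then == or equals on the key.
            if (p.hash == hash && Objects.equals(key, p.key)) {
                V old = p.value;
                p.value = value;              // the key is kept, only the value changes
                return old;
            }
            if (p.next == null) {             // end of chain: append a new node
                p.next = new Node<>(hash, key, value);
                size++;
                return null;
            }
            p = p.next;
        }
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        MiniMap<String, Integer> map = new MiniMap<>();
        System.out.println(map.put("Tom", 1));             // null
        System.out.println(map.put(new String("Tom"), 2)); // 1
        System.out.println(map.size());                    // 1
    }
}
```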

The first if:

    if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

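1. The resize() method: on the first call it allocates the bucket array and makes both the member field table and the local variable tab refer to it (both are arrays of Node).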

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

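The lines that matter on the first call are:

    newCap = DEFAULT_INITIAL_CAPACITY;

    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];

    table = newTab;

    return newTab;

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

So on the first call, resize() creates the Node array, stores it in the field table, and returns it so that putVal can store the same reference in its local tab. This is also where the array length is first fixed: n = 16.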
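The arithmetic inside resize() can be tried out in isolation. The sketch below copies two constants from HashMap (DEFAULT_INITIAL_CAPACITY, and DEFAULT_LOAD_FACTOR, which is 0.75f in the JDK) and reproduces two computations from the listing: the initial threshold, and the (e.hash & oldCap) test that decides whether a node stays at index j or moves to j + oldCap when the table doubles. ResizeMathDemo and its method names are made up for this example.

```java
class ResizeMathDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16, as in HashMap
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // as in HashMap

    // Number of entries after which the table is resized.
    static int threshold(int capacity) {
        return (int) (DEFAULT_LOAD_FACTOR * capacity);
    }

    // Where a node at oldIndex ends up after the table doubles from oldCap.
    static int indexAfterDoubling(int hash, int oldIndex, int oldCap) {
        return ((hash & oldCap) == 0) ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        System.out.println(threshold(DEFAULT_INITIAL_CAPACITY)); // 12

        int hash = 84275;                       // spread hash of "Tom"
        int oldIndex = hash & (16 - 1);         // 3
        int newIndex = indexAfterDoubling(hash, oldIndex, 16);
        System.out.println(newIndex);                      // 19
        System.out.println(newIndex == (hash & (32 - 1))); // true
    }
}
```

The last line shows why the split works: checking one extra bit gives the same index as recomputing hash & (newCap - 1) from scratch.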

The second if statement:

        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);

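The condition checks whether the bucket at index (n - 1) & hash is still empty. The hash value follows from hashCode(), so a key whose hash maps to an unused bucket takes this branch: a new node is stored directly in the table, and execution continues to the end of the method: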

        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

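The element has now been added: putVal returns null, so put returns null and add returns true.

If the condition is false, the bucket is already occupied, and the assignment inside the condition has left p pointing at the first node of that bucket. Note that p shares the bucket index with the new key; its full hash may or may not be the same.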
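The link between put's return value and add's boolean result can be reproduced with a small wrapper around HashMap. This is a sketch of the idea only; TinySet is a hypothetical class, not the JDK's HashSet, although the add body is the same one-liner shown earlier.

```java
import java.util.HashMap;

class TinySet<E> {
    // One shared dummy value for every key, like HashSet's PRESENT.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null only when the key was not in the map before.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        TinySet<String> set = new TinySet<>();
        System.out.println(set.add("Tom")); // true
        System.out.println(set.add("Tom")); // false
        System.out.println(set.size());     // 1
    }
}
```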

Now let's look at what the else branch does when the bucket is already occupied:

    else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
    }

Three cases follow:

Case 1: adding the same string literal twice:

import java.util.HashSet;
public class Test {
	public static void main(String[] args) {
		HashSet<String> set =new HashSet<>();
		set.add("Tom");
		set.add("Tom");
	}
}

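Because the same string literal is added both times, the two keys are the same object and have the same hash, so the if right after else is true. p is assigned to e, and execution moves on to: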

        if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
        }

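Here e is not null, and since onlyIfAbsent is false the inner if is also true. The value stored in the node is overwritten with the new value (the key itself is not replaced), the old value, which is non-null, is returned from put(), and add therefore returns false.

Case 2: adding a String created with new: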

import java.util.HashSet;
public class Test {
	public static void main(String[] args) {
		HashSet<String> set =new HashSet<>();
		set.add("Tom");
		String name =new String("Tom");
		System.out.println(set.add(name));
	}
}

As mentioned above, String overrides hashCode(), so the hash is again the same. We go straight to this piece of code:

    else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
    }

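The two hashes are equal, but the references are not, so the == check fails and equals is called. String also overrides equals to compare contents, so the result is true, and the same overwrite step follows: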

        if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
        }

So add returns false here as well, and the program prints false.
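Cases 1 and 2 can be checked side by side in one short program. The sketch below also confirms that the key already stored in the set is kept when a duplicate arrives; StringAddDemo is just a name chosen for this example.

```java
import java.util.HashSet;
import java.util.Set;

class StringAddDemo {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        String literal = "Tom";
        String copy = new String("Tom");

        System.out.println(set.add(literal)); // true  (first insertion)
        System.out.println(set.add("Tom"));   // false (case 1: same object)
        System.out.println(literal == copy);  // false (different objects)
        System.out.println(set.add(copy));    // false (case 2: equals is true)
        System.out.println(set.size());       // 1

        // The stored key is still the first object that was added.
        System.out.println(set.iterator().next() == literal); // true
    }
}
```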

Case 3: adding newly created objects of a class other than String

import java.util.HashSet;
public class Test {
	public static void main(String[] args) {
		HashSet<Test> set =new HashSet<>();
		set.add(new Test());
		set.add(new Test());
	}
}

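Test overrides neither hashCode() nor equals(), so the two objects are distinct keys: their identity hash codes normally differ, and equals falls back to reference comparison. Both are added.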
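A short check of case 3, using a hypothetical empty class Plain in place of Test so that the example is self-contained:

```java
import java.util.HashSet;
import java.util.Set;

class IdentityAddDemo {
    // No hashCode()/equals() override: every instance is its own key.
    static final class Plain { }

    public static void main(String[] args) {
        Set<Plain> set = new HashSet<>();
        Plain a = new Plain();

        System.out.println(set.add(a));           // true
        System.out.println(set.add(new Plain())); // true  (a different object)
        System.out.println(set.add(a));           // false (the very same object)
        System.out.println(set.size());           // 2
    }
}
```

Only adding the very same reference again is treated as a duplicate.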

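III. In a real project:

You can override hashCode() and equals() so that the set itself decides what counts as a duplicate, which keeps the project code simple. This is also why it is worth studying the underlying source: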

public class Student {
	public String id;
	public String name; 
	public Student(String id, String name) {
		this.id = id;
		this.name = name;
		
	}
	@Override
	public int hashCode() {
		return id.hashCode();
	}
	@Override
	public boolean equals(Object obj) {
		if(obj instanceof Student) {
			Student student =(Student)obj;
			return this.id.equals(student.id);
		}
		return false;
	}
}
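With these two overrides in place, a HashSet treats students with the same id as one element. The sketch below repeats a compact version of the class as a nested type so that it runs on its own; StudentSetDemo is a name made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

class StudentSetDemo {
    // Same rule as the Student class above: identity is decided by id alone.
    static final class Student {
        final String id;
        final String name;

        Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Student && id.equals(((Student) obj).id);
        }
    }

    public static void main(String[] args) {
        Set<Student> set = new HashSet<>();
        System.out.println(set.add(new Student("100", "Tom")));   // true
        System.out.println(set.add(new Student("100", "Jerry"))); // false: same id
        System.out.println(set.add(new Student("101", "Tom")));   // true
        System.out.println(set.size());                           // 2
    }
}
```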

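Be aware that with an overly broad type parameter (Object in the example below), objects of different classes that carry the same id can both be added. Dog is a separate class with an accessible id field; its definition is not shown: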

import java.util.HashSet;
public class Test {
	public static void main(String[] args) {
		HashSet<Object> set =new HashSet<>();
		Student student=new Student("100", null);
		Dog dog=new Dog();
		dog.id="100";
		set.add(dog);
		set.add(student);
	}
}

The following code then runs:

    if (p.hash == hash &&
        ((k = p.key) == key || (key != null && key.equals(k))))
        e = p;
    else if (p instanceof TreeNode)
        e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
    else {
        for (int binCount = 0; ; ++binCount) {
            if ((e = p.next) == null) {
                p.next = newNode(hash, key, value, null);
                if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                    treeifyBin(tab, hash);
                break;
            }
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                break;
            p = e;
        }
    }

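Because of the way Student overrides equals, the first if is false: the student is compared with a Dog, which is not a Student. If both objects land in the same bucket (for example, when Dog's hashCode() is also derived from id), execution reaches the for loop, which walks the nodes of that bucket; when it reaches the end of the chain (p.next == null) without finding an equal key, the new node is appended there.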
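The outcome can be observed from the outside with the sketch below. Dog is not shown in the original, so the version here is an assumption: it exposes an id field, hashes by id (so that it collides with the student), and keeps the inherited equals. MixedTypeSetDemo is likewise a made-up name.

```java
import java.util.HashSet;
import java.util.Set;

class MixedTypeSetDemo {
    static final class Student {
        final String id;
        final String name;

        Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Student && id.equals(((Student) obj).id);
        }
    }

    // Hypothetical Dog: same hash as a Student with the same id, default equals.
    static final class Dog {
        String id;

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    public static void main(String[] args) {
        Set<Object> set = new HashSet<>();
        Student student = new Student("100", null);
        Dog dog = new Dog();
        dog.id = "100";

        System.out.println(dog.hashCode() == student.hashCode()); // true: same bucket
        System.out.println(set.add(dog));     // true
        System.out.println(set.add(student)); // true: student.equals(dog) is false
        System.out.println(set.size());       // 2
    }
}
```

Declaring the set as HashSet<Student> instead would turn the accidental mix into a compile-time error.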

That is the end of this section. Thank you.
