[Data structure] HashMap and HashSet

Table of contents

1. HashMap

1.1 Introduction to HashMap 

1.2 Part of the source code of HashMap 

1.2.1 HashMap property 

1.2.2 Construction method of hash table

1.2.3 HashMap method

2. HashSet


1. HashMap

1.1 Introduction to HashMap 

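  • Under the hood, HashMap is a hash table with separate chaining: an array of buckets, each holding a linked list of entries. When a list reaches the treeify threshold (8 nodes) and the array length is at least 64, that list is converted into a red-black tree; if the array is still shorter than 64, the table is expanded instead
  • Because a key is mapped straight to its bucket, insertion, deletion and lookup take O(1) time on average. A bucket with many collisions costs O(n) while it is a linked list and O(log n) once it has become a red-black tree
  • HashMap does not keep its keys in sorted order

HashMap does not implement the SortedMap interface, so keys do not have to implement Comparable and override compareTo. As a result, HashMap makes no guarantee about the order of its keys.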
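The difference in ordering can be seen by filling a HashMap and a TreeMap (a SortedMap implementation) with the same keys. This is a minimal sketch; the class name OrderingDemo and the helper keysInIterationOrder are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OrderingDemo {
    // Returns the keys of the given map in the order its iterator yields them.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> hashMap = new HashMap<>();
        Map<String, Integer> treeMap = new TreeMap<>();
        for (String key : new String[] {"pear", "apple", "mango"}) {
            hashMap.put(key, key.length());
            treeMap.put(key, key.length());
        }
        // HashMap: the order depends on hash codes and table size and is not specified.
        System.out.println(keysInIterationOrder(hashMap));
        // TreeMap: keys come back sorted by compareTo: [apple, mango, pear]
        System.out.println(keysInIterationOrder(treeMap));
    }
}
```

The HashMap line is deliberately left without an expected output, because the iteration order is an implementation detail that should not be relied on.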

  • The bucket index (hash address) is calculated from the key by a hash function

HashMap does not implement SortedMap, so keys need not provide compareTo and cannot be compared with each other. How, then, does the hash function calculate the bucket index?

Answer: Every object has a hashCode method. HashMap takes the hash value it returns and feeds it to the hash function to compute the bucket index

  • When the key in a HashMap is a custom type, that type needs to override the equals and hashCode methods

Why do custom types override equals?

Answer: Keys in a HashMap must be unique, and since keys cannot be ordered, equals is the only way to decide whether a new key is the same as a key already in the map. The inherited Object.equals compares references, so two distinct objects with the same contents would count as different keys

Why do custom types override hashCode?

Suppose we have the following code:

public class Student {
    private String name;

    public Student(String name) {
        this.name = name;
    }
}
class Demo {
    public static void main(String[] args) {
        Student student1 = new Student("张三");
        Student student2 = new Student("张三");
        System.out.println(student1.hashCode());
        System.out.println(student2.hashCode());
    }
}

 Print result: two different values

Answer: Even though the names are the same, the inherited Object.hashCode returns different hash values for the two objects, so the hash function maps them to different positions. HashMap only searches for an equal key inside the bucket that the new key hashes to, so these two keys would never be compared with each other.

In that situation the uniqueness of the key cannot be guaranteed: both objects would be stored as separate keys.

 Code with equals and hashCode overridden:

import java.util.Objects;

public class Student {
    private String name;

    public Student(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student student = (Student) o;
        return Objects.equals(name, student.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

class Demo {
    public static void main(String[] args) {
        Student student1 = new Student("张三");
        Student student2 = new Student("张三");
        System.out.println(student1.hashCode());
        System.out.println(student2.hashCode());
    }
}

 Result: both calls print the same value

Only when two keys return the same hashCode value are they guaranteed to land in the same position, and only then can equals decide whether the key is already present
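To see the effect on the map itself, the following sketch uses a Student with both methods overridden as a HashMap key. The wrapper class StudentKeyDemo, the helper buildScores and the score values are invented for the example, and the name is written in ASCII only to keep the file encoding-neutral.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class StudentKeyDemo {
    // Same shape as the Student class above, nested here so the sketch is self-contained.
    static final class Student {
        private final String name;

        Student(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(name, ((Student) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Puts two equal (but distinct) Student objects into a map.
    static Map<Student, Integer> buildScores() {
        Map<Student, Integer> scores = new HashMap<>();
        scores.put(new Student("Zhang San"), 90);
        scores.put(new Student("Zhang San"), 95); // equal key: the value is overwritten
        return scores;
    }

    public static void main(String[] args) {
        Map<Student, Integer> scores = buildScores();
        System.out.println(scores.size());                         // 1
        System.out.println(scores.get(new Student("Zhang San")));  // 95
    }
}
```

A third, freshly created Student finds the entry because it hashes to the same bucket and then passes the equals check.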

So now there are two questions:

Question 1: If two keys have the same hashCode, must equals return true?

Answer: Not necessarily. The hash code only decides which position a key goes to. Different keys can share a hash code (a collision), and the linked list stored at one position can hold many nodes with different keys, so equal hash codes do not guarantee that equals returns true
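A concrete collision makes this easy to check. The strings "Aa" and "BB" have the same hash code under String.hashCode (65 * 31 + 97 = 66 * 31 + 66 = 2112) but are not equal. The class and helper names below are made up for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // True when the two strings collide on hashCode without being equal.
    static boolean sameHashButNotEqual(String a, String b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());                   // 2112
        System.out.println("BB".hashCode());                   // 2112
        System.out.println(sameHashButNotEqual("Aa", "BB"));   // true

        // Both keys go to the same bucket, yet they stay separate entries
        // because equals tells them apart.
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        System.out.println(map.size());                        // 2
    }
}
```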

Question 2: If equals returns true, must the hashCode be the same?

Answer: Yes. Keys that are equal represent the same key, and the hashCode contract requires equal objects to return equal hash codes, so they always map to the same position

1.2 Part of the source code of HashMap 

1.2.1 HashMap property 

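Default initial capacity: DEFAULT_INITIAL_CAPACITY = 1 << 4, that is 16

Maximum capacity: MAXIMUM_CAPACITY = 1 << 30

Default load factor: DEFAULT_LOAD_FACTOR = 0.75f

Prerequisites for treeification: TREEIFY_THRESHOLD = 8 and MIN_TREEIFY_CAPACITY = 64

 When a linked list reaches 8 nodes and the array length is at least 64, the list is converted into a red-black tree

 De-treeification (red-black tree back to linked list): UNTREEIFY_THRESHOLD = 6

 The main storage structure of HashMap is an array of Node<K,V> type: transient Node<K,V>[] table

Each storage unit of the array stores a linked list

 Node is the node structure of the linked list; it holds hash, key, value and next

Number of elements in the HashMap: transient int size

 Load factor of the current hash table: final float loadFactor

The load factor is not compared with the default load factor. It is multiplied by the array length to obtain the threshold (capacity * load factor), and the table is expanded when the number of elements exceeds that threshold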

1.2.2 Construction method of hash table

1. No-argument constructor: HashMap()

The first constructor takes no arguments and simply assigns the default load factor to the current load factor. No array is allocated yet

2. Constructor taking a Map: HashMap(Map<? extends K, ? extends V> m)

The second constructor assigns the default load factor to the current load factor and builds the new map from the entries of the Map passed in

3. Constructor taking a capacity and a load factor: HashMap(int initialCapacity, float loadFactor)

The third constructor takes the specified initial capacity and a custom load factor

4. Constructor taking a capacity: HashMap(int initialCapacity)

The fourth constructor calls the third one through this(...), passing in the specified capacity and the default load factor
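The four constructors can be exercised as follows. This is only a usage sketch: the class name ConstructorDemo, the helper copyOf and the capacity and load factor values are arbitrary choices for the example.

```java
import java.util.HashMap;
import java.util.Map;

class ConstructorDemo {
    // Uses the Map-taking constructor to build an independent copy.
    static Map<String, Integer> copyOf(Map<String, Integer> source) {
        return new HashMap<>(source);
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();          // 1. default load factor
        first.put("one", 1);

        Map<String, Integer> second = copyOf(first);           // 2. entries copied from another Map
        Map<String, Integer> third = new HashMap<>(64, 0.5f);  // 3. capacity and load factor
        Map<String, Integer> fourth = new HashMap<>(64);       // 4. capacity, default load factor

        first.put("two", 2);                                   // does not affect the copy
        System.out.println(second.size());                       // 1
        System.out.println(third.isEmpty() && fourth.isEmpty()); // true
    }
}
```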

1.2.3 HashMap method

When we construct a HashMap without specifying a capacity, construction still succeeds, but no array has been allocated at that point: the table is still null.

Yet inserting elements afterwards works. Why?

Answer: Since no array is created during construction, it must be created on insertion, so let's look for the answer in the put method

put method:

The put method simply calls the putVal method. The first argument of that call is hash(key), the hash method applied to the key we want to insert. So let's first understand what the hash method does, and then look at the putVal method

hash method:

The hash method checks whether the key is null. If it is, it returns 0. Otherwise it calls the key's hashCode method and assigns the result to h (if the key is a custom type, hashCode should be overridden, otherwise the hashCode method of Object is called), then XORs h with h >>> 16 and returns the result

Why XOR h with h >>> 16?

Answer: The array index is computed as (n - 1) & hash, where n is the array length, so for a small array only the low bits of the hash matter. XORing in h >>> 16 lets the upper 16 bits take part as well as the lower 16 bits, so keys are spread more evenly
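The effect can be checked with two hash codes that differ only in their upper 16 bits. The sketch assumes the index is taken as (n - 1) & hash with a table length of 16; spread and indexFor are hypothetical helper names, not HashMap methods.

```java
class HashSpreadDemo {
    // The mixing step described above: fold the upper 16 bits into the lower 16.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index of a hash in a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int h1 = 0x10000;
        int h2 = 0x20000;
        int n = 16;

        // Without mixing, both hashes fall into slot 0.
        System.out.println(indexFor(h1, n));          // 0
        System.out.println(indexFor(h2, n));          // 0

        // With mixing, the upper bits separate them.
        System.out.println(indexFor(spread(h1), n));  // 1
        System.out.println(indexFor(spread(h2), n));  // 2
    }
}
```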

putVal method:

The putVal method is more complicated, so let's analyze it in sections

It first declares the array tab of type Node<K,V>[], the variable p of type Node<K,V>, and the int variables n and i. It assigns table to tab and checks whether tab is null or has length 0; if so, it calls the resize method to initialize the array. The initial length of the hash table is 16 and the threshold is 16 * 0.75 = 12, so the array is expanded once the number of elements exceeds 12

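What should be paid attention to when HashMap is expanded?

Answer: The capacity doubles, so the position of every existing element has to be recalculated for the new array length. Each element either stays at its old index or moves to old index + old capacity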
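Why elements have to be re-placed can be shown with plain index arithmetic. The sketch below assumes the index is (capacity - 1) & hash and that the table grows from 16 to 32; indexFor and movesOnResize are hypothetical helper names.

```java
class ResizeIndexDemo {
    // Index of a hash in a table of the given power-of-two capacity.
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    // After doubling, an entry moves exactly when the bit worth oldCapacity is set in its hash.
    static boolean movesOnResize(int hash, int oldCapacity) {
        return (hash & oldCapacity) != 0;
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        int newCapacity = 32;
        // All four hashes share slot 5 in the old table.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println(hash + ": " + indexFor(hash, oldCapacity)
                    + " -> " + indexFor(hash, newCapacity));
        }
        // 5: 5 -> 5
        // 21: 5 -> 21
        // 37: 5 -> 5
        // 53: 5 -> 21
    }
}
```

Entries that move always land at old index + old capacity (here 5 + 16 = 21), which is why one crowded slot is split across two slots after expansion.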

Now let's continue through the rest of the putVal method

 It computes the index of the insertion position in the array as i = (n - 1) & hash. If that position is empty, the new node is inserted there directly

Otherwise the position is not empty. If the first node there has the same hash value and an equal key, the key already exists and its value is overwritten. If not, putVal checks whether the node at index i is a red-black tree node, and if so inserts the element the red-black tree way. If index i holds a linked list, the list is walked: an equal key found along the way has its value overwritten; otherwise the new node is appended at the tail, and the list is then checked to see whether it should be converted into a red-black tree

Elements in a red-black tree must be comparable, so what if the custom type does not implement compareTo?

Answer: Nodes in the tree are ordered by the hash value of the key first. compareTo is used only to break ties between equal hash values when the keys implement Comparable; otherwise HashMap falls back to an internal tie-break (tieBreakOrder, which compares class names and then System.identityHashCode). So even after conversion into a red-black tree, the map is not ordered by key. Also, only the structure at one index of the hash table may be a red-black tree, while the others may still be linked lists
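That a missing compareTo does not break anything can be checked with keys that all share one hash code and do not implement Comparable. BadKey and fill are hypothetical names for this sketch. Whether and when the crowded position is turned into a tree cannot be observed through the public API, so the sketch only verifies that every key is still stored and found.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeysDemo {
    // Hypothetical key type: constant hash code, no Comparable.
    static final class BadKey {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int hashCode() {
            return 42; // every key collides on purpose
        }
    }

    // Inserts count distinct keys that all hash to the same position.
    static Map<BadKey, Integer> fill(int count) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(100);
        System.out.println(map.size());               // 100
        System.out.println(map.get(new BadKey(57)));  // 57
    }
}
```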

 ++modCount records one more structural modification of the map (iterators use it to fail fast); it is not the element count. After the node is inserted, ++size increments the number of elements, and if it exceeds the threshold the array is expanded
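The put path walked through above can be condensed into a toy map. This is a simplified sketch, not the JDK source: it has no red-black tree conversion and no modCount, it fixes the load factor at 0.75, and its resize relinks nodes at the head of the new list for brevity, whereas the JDK preserves their relative order. TinyHashMap and its members are names made up for the illustration.

```java
import java.util.Objects;

class TinyHashMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private Node<K, V>[] table; // stays null until the first put
    private int size;
    private int threshold;

    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if the key was absent.
    V put(K key, V value) {
        if (table == null) resize();               // lazy allocation on first insert
        int hash = hash(key);
        int i = (table.length - 1) & hash;
        Node<K, V> p = table[i];
        if (p == null) {
            table[i] = new Node<>(hash, key, value);
        } else {
            while (true) {
                if (p.hash == hash && Objects.equals(p.key, key)) {
                    V old = p.value;               // key exists: overwrite the value
                    p.value = value;
                    return old;
                }
                if (p.next == null) {              // end of list: tail insertion
                    p.next = new Node<>(hash, key, value);
                    break;
                }
                p = p.next;
            }
        }
        if (++size > threshold) resize();
        return null;
    }

    V get(Object key) {
        if (table == null) return null;
        int hash = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & hash]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) return e.value;
        }
        return null;
    }

    int size() { return size; }

    int capacity() { return table == null ? 0 : table.length; }

    @SuppressWarnings({"unchecked", "rawtypes"})
    private void resize() {
        Node<K, V>[] old = table;
        int newCapacity = (old == null) ? 16 : old.length * 2;
        Node<K, V>[] fresh = (Node<K, V>[]) new Node[newCapacity];
        threshold = (int) (newCapacity * 0.75f);
        table = fresh;
        if (old == null) return;
        for (Node<K, V> head : old) {
            for (Node<K, V> e = head; e != null; ) {
                Node<K, V> next = e.next;
                int i = (newCapacity - 1) & e.hash; // recompute the index for the new length
                e.next = fresh[i];
                fresh[i] = e;
                e = next;
            }
        }
    }

    public static void main(String[] args) {
        TinyHashMap<Integer, String> map = new TinyHashMap<>();
        System.out.println(map.capacity());     // 0: nothing allocated yet
        for (int i = 0; i < 12; i++) map.put(i, "v" + i);
        System.out.println(map.capacity());     // 16: 12 entries do not exceed the threshold
        map.put(12, "v12");
        System.out.println(map.capacity());     // 32: the 13th entry triggers expansion
        System.out.println(map.put(3, "new"));  // v3: existing key, old value returned
    }
}
```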

2. HashSet

HashSet is built on top of HashMap. The difference is that HashSet is a key-only model, while HashMap is a key-value pair model

The constructors of HashSet show this: each of them creates a HashMap internally

add method:

 When an element is added to a HashSet, it is stored as a key in the underlying HashMap, with a shared placeholder object (PRESENT) as its value
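The idea fits in a few lines. The following is a minimal sketch of a set backed by a HashMap, not the actual HashSet source; TinyHashSet is a made-up name, and only add, contains and size are shown.

```java
import java.util.HashMap;

class TinyHashSet<E> {
    // One shared dummy value stored for every key.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null when the key was absent, so add reports true only for new elements.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>();
        System.out.println(set.add("a"));        // true
        System.out.println(set.add("a"));        // false: already present
        System.out.println(set.size());          // 1
        System.out.println(set.contains("a"));   // true
    }
}
```

Because the elements are the keys of the map, the set inherits the same requirement as HashMap: custom element types need consistent equals and hashCode.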

Origin blog.csdn.net/m0_66488562/article/details/128853746