12. Redis secrets you may not know: how SortedSet, one of the five basic data structures, is implemented

Preface

A SortedSet (zset) can be seen as a Set in which every element also carries a sort value, the score, so that the elements can be ordered by score. Classic use cases include ranking candidates by exam score, ranking the players of a game by points, showing a leaderboard on a site's home page, and listing the latest comments by time.

Redis is an in-memory database, so besides read and write speed it also has to keep memory overhead in check. A SortedSet has to keep its elements in order, and the obvious candidates for the underlying structure are an array, a linked list, a balanced tree or a red-black tree. SortedSet uses none of them. Inserting into and deleting from an array is expensive because elements have to be shifted, and lookups in a linked list are slow. A balanced tree or a red-black tree offers fast lookups, but it has to be rebalanced on inserts and deletes, and the implementation is complicated.

Therefore, SortedSet is built on a different data structure: the skip list.

How the skip list (skiplist) works

A skip list performs comparably to a red-black tree and is much simpler to implement. So what is a skip list? Before getting to it, let's look at the following linked list.

[Figure: sortedset.assets/1615907166172.png, an ordered singly linked list]

Suppose we want to find the node with value 13. In the singly linked list above we have to walk the nodes one by one from the head, which is O(n) and performs poorly. How can the lookup be made faster? Even a sorted linked list cannot be binary searched, because it has no random access. We could turn it into something like a red-black tree, but a red-black tree is too much trouble to implement. So what if we treat the linked list like this instead?

[Figure: sortedset.assets/1615907457430.png, the linked list with a second level built on top]

Every other element of the first-level list is promoted to form a second-level list, as in the figure above. To find 13, the search starts at the top level. On reaching 10 it sees that the next node, 18, is greater than 13, so it drops down one level from 10 and finds 13 there. The number of nodes visited is roughly half of what the plain singly linked list needed, which saves a lot of lookup time. What if we add one more level on top?

[Figure: sortedset.assets/1615907763389.png, the linked list with a third level built on top]

Following the same rule we extract one more level, and the number of comparisons drops again. This layered structure is how a skip list stores its data. Its lookup cost is comparable to that of a red-black tree, O(log n) on average, yet it is far simpler to implement.
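
The layered lookup described above can be sketched in C as follows. This is only an illustration under stated assumptions, not Redis's code: the names `Node`, `MAX_LEVEL`, `build_demo` and `search` are hypothetical, and the list is built with a fixed "promote every other node" rule to mirror the figures, whereas a real skip list picks each node's height at random.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LEVEL 3 /* three layers, like the figures (assumption) */

typedef struct Node {
    int value;
    struct Node *next[MAX_LEVEL]; /* next[k] is the successor on layer k */
} Node;

/* Build a list from a sorted array. Node i is linked into layer k when
 * i is a multiple of 2^k, so every other node is promoted one layer up. */
Node *build_demo(const int *sorted, size_t n) {
    Node *last[MAX_LEVEL];
    Node *head = calloc(1, sizeof(*head)); /* sentinel, holds no value */
    assert(head != NULL);
    for (int k = 0; k < MAX_LEVEL; k++) last[k] = head;
    for (size_t i = 0; i < n; i++) {
        Node *node = calloc(1, sizeof(*node));
        assert(node != NULL);
        node->value = sorted[i];
        for (int k = 0; k < MAX_LEVEL && i % ((size_t)1 << k) == 0; k++) {
            last[k]->next[k] = node;
            last[k] = node;
        }
    }
    return head;
}

/* Start on the top layer: move right while the next value is smaller than
 * the target, otherwise drop down one layer. *steps counts moves to the right. */
Node *search(Node *head, int target, int *steps) {
    Node *x = head;
    *steps = 0;
    for (int k = MAX_LEVEL - 1; k >= 0; k--) {
        while (x->next[k] != NULL && x->next[k]->value < target) {
            x = x->next[k];
            (*steps)++;
        }
    }
    x = x->next[0];
    return (x != NULL && x->value == target) ? x : NULL;
}
```

With the sorted values 1, 4, 7, 10, 13, 18, 21, 25, looking up 13 takes three moves to the right (1, 7, 10), where a scan of the bottom layer alone would take four (1, 4, 7, 10). The gap widens as the list grows.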

SortedSet low-level implementation

Under the hood, SortedSet uses two storage structures: the ziplist (compressed list) and the skip list. Two settings in the Redis configuration file control which one is used:

  • zset-max-ziplist-entries 128: the maximum number of elements a zset may hold while it is stored as a ziplist. The default is 128.
  • zset-max-ziplist-value 64: the maximum string length of any element while the zset is stored as a ziplist. The default is 64.

When the first element is added to a zset, Redis checks two conditions: whether zset-max-ziplist-entries equals 0, and whether zset-max-ziplist-value is less than the length of the string being inserted. If either condition holds, Redis uses a skip list as the underlying implementation; otherwise it uses a ziplist. See the source code in t_zset.c:

void zaddGenericCommand(client *c, int flags) {
    /* ... omitted ... */
    if (zobj == NULL) {
        if (xx) goto reply_to_client; /* No key + XX option: nothing to do. */
        if (server.zset_max_ziplist_entries == 0 ||
            server.zset_max_ziplist_value < sdslen(c->argv[scoreidx+1]->ptr))
        {
            zobj = createZsetObject(); /* create a skip list */
        } else {
            zobj = createZsetZiplistObject(); /* create a ziplist */
        }
        dbAdd(c->db,key,zobj);
    }
    /* ... omitted ... */
}

Normally zset-max-ziplist-entries is not set to 0 and member strings are not very long, so a newly created sorted set uses the ziplist by default. Each time a new element is inserted, Redis checks two conditions: whether the number of elements in the zset is greater than zset_max_ziplist_entries, and whether the string length of the inserted element is greater than zset_max_ziplist_value. If either condition holds, Redis converts the zset from a ziplist to a skip list. See the zsetAdd function in t_zset.c:

if (zzlLength(zobj->ptr) > server.zset_max_ziplist_entries ||
    sdslen(ele) > server.zset_max_ziplist_value)
    zsetConvert(zobj,OBJ_ENCODING_SKIPLIST); /* convert to a skip list */

Note that once a zset has been converted to a skip list, it is not converted back to a ziplist, even if its elements are gradually deleted.
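
The two checks can be modeled as a pair of small C functions. This is a simplified sketch of the rules described above, not Redis code: `Encoding`, `ZsetLimits`, `initial_encoding` and `encoding_after_insert` are hypothetical names, and only the two thresholds are modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ENC_ZIPLIST, ENC_SKIPLIST } Encoding;

typedef struct {
    size_t max_entries; /* models zset-max-ziplist-entries */
    size_t max_value;   /* models zset-max-ziplist-value */
} ZsetLimits;

/* Encoding chosen when the first element of a new zset is added. */
Encoding initial_encoding(ZsetLimits lim, size_t first_len) {
    if (lim.max_entries == 0 || lim.max_value < first_len)
        return ENC_SKIPLIST;
    return ENC_ZIPLIST;
}

/* Encoding after inserting an element of length ele_len.
 * count_after_insert is the element count including the new element.
 * A zset that is already a skip list stays a skip list. */
Encoding encoding_after_insert(ZsetLimits lim, Encoding current,
                               size_t count_after_insert, size_t ele_len) {
    assert(count_after_insert >= 1);
    if (current == ENC_SKIPLIST)
        return ENC_SKIPLIST;
    if (count_after_insert > lim.max_entries || ele_len > lim.max_value)
        return ENC_SKIPLIST;
    return ENC_ZIPLIST;
}
```

With the default limits of 128 and 64, a zset stays a ziplist up to and including its 128th short element, and switches to a skip list on the 129th element or on the first element longer than 64 bytes.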

Structure of skiplist

A skip list mainly consists of skip list nodes, a head node, a tail node, the node count and the current maximum node level. See server.h for the source:

typedef struct zskiplist {
    struct zskiplistNode *header, *tail; /* head and tail nodes of the skip list */
    unsigned long length;                /* number of nodes */
    int level;                           /* largest level of any node currently in the list */
} zskiplist;

typedef struct zset {
    dict *dict;     /* maps each member to its score */
    zskiplist *zsl; /* keeps the members ordered by score */
} zset;

Explanation:

  • header: points to the head node of the skip list. The head node is a special node whose level array has 64 elements (the value of ZSKIPLIST_MAXLEVEL in the source discussed here). It stores no member or score of the sorted set: its ele is NULL and its score is 0, and it is not counted in the length of the skip list. When the head node is initialized, the forward pointers of all 64 levels point to NULL and every span is 0.
  • tail: points to the tail node of the skip list.
  • length: the length of the skip list, that is, the number of nodes excluding the head node.
  • level: the height of the tallest node in the skip list.
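
The initial state described in this list can be sketched as follows. The code is loosely modeled on how Redis creates a skip list but is a simplified stand-in: `Node`, `SkipList`, `node_create` and `skiplist_create` are hypothetical names, `MAXLEVEL` is assumed to be 64, a plain `char *` replaces `sds`, and starting an empty list at level 1 is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXLEVEL 64 /* stands in for ZSKIPLIST_MAXLEVEL (assumed to be 64) */

typedef struct Node {
    char *ele;             /* plain C string instead of Redis's sds */
    double score;
    struct Node *backward;
    struct Level {
        struct Node *forward;
        unsigned long span;
    } level[];             /* flexible array, sized per node */
} Node;

typedef struct SkipList {
    Node *header, *tail;
    unsigned long length;  /* number of nodes, head node excluded */
    int level;             /* height of the tallest node */
} SkipList;

/* Allocate a node with `level` levels; every level starts unlinked. */
Node *node_create(int level, double score, char *ele) {
    Node *n = malloc(sizeof(*n) + (size_t)level * sizeof(struct Level));
    assert(n != NULL);
    n->ele = ele;
    n->score = score;
    n->backward = NULL;
    for (int i = 0; i < level; i++) {
        n->level[i].forward = NULL;
        n->level[i].span = 0;
    }
    return n;
}

/* Create an empty skip list: only the full-height head node exists. */
SkipList *skiplist_create(void) {
    SkipList *sl = malloc(sizeof(*sl));
    assert(sl != NULL);
    sl->header = node_create(MAXLEVEL, 0, NULL);
    sl->tail = NULL;
    sl->length = 0;
    sl->level = 1;
    return sl;
}
```

Giving the head node the maximum height up front means that inserting a taller node later never requires reallocating the head.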

The structure of zskiplist is shown in the figure:
[Figure: structure of zskiplist]

zskiplistNode structure

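/* skip list node */
typedef struct zskiplistNode {
    sds ele;                           /* member, stored as a string */
    double score;                      /* score */
    struct zskiplistNode *backward;    /* backward pointer */
    struct zskiplistLevel {            /* one entry per level the node appears on */
        struct zskiplistNode *forward; /* forward pointer on this level */
        unsigned int span;             /* number of nodes the forward pointer crosses on this level */
    } level[]; /* per-level array; each new node gets a random size in [1, ZSKIPLIST_MAXLEVEL] */
} zskiplistNode;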

Explanation:

  • ele: stores the member as a string.

  • backward: the backward pointer. It exists only on the bottom level and points to the node's predecessor there; for the head node and the first node, backward is NULL. It is used to traverse the skip list from tail to head.

  • score: stores the score used for sorting.

  • level: a flexible array whose length differs from node to node. When a skip list node is created, a height between 1 and ZSKIPLIST_MAXLEVEL is generated at random (the maximum is 64 in the source discussed here and 32 in some other Redis versions). The larger the value, the lower its probability.

    • forward: points to the next node on this level; the forward of the tail node is NULL.
    • span: the number of nodes crossed on the way from this node to the node that forward points to. The larger the span, the more nodes are skipped.
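
How a node's height can be drawn so that taller nodes are rarer is sketched below. The function names and `MAXLEVEL` are hypothetical, and the promotion probability of 0.25 is taken from the ZSKIPLIST_P constant in the Redis source; the standard `rand()` is used here purely for illustration and is not what Redis calls.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXLEVEL 64 /* stands in for ZSKIPLIST_MAXLEVEL (assumed to be 64) */

/* Start at height 1 and keep growing with probability of about 1/4,
 * so height h occurs with probability of about 0.75 * 0.25^(h-1). */
int random_level(void) {
    int level = 1;
    while (level < MAXLEVEL && rand() < RAND_MAX / 4)
        level++;
    return level;
}

/* Draw `samples` heights and count how often each one occurs.
 * counts needs MAXLEVEL + 1 entries; index 0 is never used. */
void level_histogram(unsigned long *counts, unsigned long samples) {
    assert(counts != NULL);
    for (int i = 0; i <= MAXLEVEL; i++) counts[i] = 0;
    for (unsigned long s = 0; s < samples; s++) counts[random_level()]++;
}
```

Under these assumptions about three quarters of all nodes have height 1, and each extra level is roughly four times rarer than the one below it, which keeps the expected number of pointers per node small.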

Each skip list node stores a sorted-set member in ele and that member's score in score. Nodes are ordered by score from low to high; when members have the same score, the nodes are ordered lexicographically by member.
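
The ordering rule can be written as a comparison function. This is a sketch with hypothetical names (`Member`, `member_cmp`, `sort_members`); `strcmp` stands in for the byte-wise string comparison Redis applies to its own string type, and `qsort` is used only to demonstrate the resulting order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Member {
    const char *ele; /* member name */
    double score;
} Member;

/* Order by score first; break ties by comparing the member strings. */
int member_cmp(const void *pa, const void *pb) {
    const Member *a = pa;
    const Member *b = pb;
    assert(a->ele != NULL && b->ele != NULL);
    if (a->score < b->score) return -1;
    if (a->score > b->score) return 1;
    return strcmp(a->ele, b->ele);
}

/* Sort an array of members into sorted-set order. */
void sort_members(Member *m, size_t n) {
    qsort(m, n, sizeof(*m), member_cmp);
}
```

For example, the members bob (90), alice (90), carol (75) and dave (100) come out as carol, alice, bob, dave: carol has the lowest score, and alice precedes bob because the two share a score and "alice" sorts first.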

That's the end of the article. I hope it helps; if you liked it, please leave a positive comment!

Origin blog.csdn.net/u014494148/article/details/114915016