Learning Sharing, Redis (Phase 1): Using the Hash Type to Save Memory

Opening

The content I shared previously was a scattering of separate knowledge points rather than a systematic whole. In future weekly posts I will do my best to connect the articles to one another, so I have decided to start a column called "Learning Sharing". This is the first article in the series.

"Learning Sharing" is published every Monday or Tuesday. Most of the content comes from the notes I take during day-to-day study; the notes repository is on Github: studeyang/technotes. The parts that I find in-depth and helpful for my work are published in this column as articles. They appear first on my official account, Juejin and Toutiao, and are also maintained on Github: studeyang/leanrning-share.

Review

In the previous article, "Redis's String type, it takes up so much memory", we used the String type to store picture IDs and picture storage object IDs, and found that one pair of Long IDs occupied 68 bytes of memory. The verification process is reproduced here for review.

1. View the initial memory usage of Redis.

127.0.0.1:6379> info memory
# Memory
used_memory:871840

2. Then insert 10 records.

10.118.32.170:0> set 1101000060 3302000080
10.118.32.170:0> set 1101000061 3302000081
10.118.32.170:0> set 1101000062 3302000082
10.118.32.170:0> set 1101000063 3302000083
10.118.32.170:0> set 1101000064 3302000084
10.118.32.170:0> set 1101000065 3302000085
10.118.32.170:0> set 1101000066 3302000086
10.118.32.170:0> set 1101000067 3302000087
10.118.32.170:0> set 1101000068 3302000088
10.118.32.170:0> set 1101000069 3302000089

3. Check the memory again.

127.0.0.1:6379> info memory
# Memory
used_memory:872528

As you can see, storing 10 pictures used 688 bytes of memory (872528 - 871840), so one record of a picture ID and a picture storage object ID uses about 68 bytes on average.

This is the scenario we discussed last time.

It left us with a question to think about: since the String type takes up so much memory, is there a better way to save memory?

Today, let's look at it in detail.

What data structure can be used to save memory?

Redis provides a very memory-efficient data structure called a compressed list (ziplist). It is a sequential data structure composed of a series of specially encoded contiguous memory blocks. A compressed list can contain multiple nodes, and each node can store a byte array or an integer value.

The parts of a compressed list have the following meanings.

  • zlbytes: the number of bytes of memory occupied by the compressed list.
  • zltail: the offset, in bytes, of the tail node of the compressed list from its starting address.
  • zllen: the number of nodes contained in the compressed list.
  • entry: a node of the compressed list.
  • zlend: the special value 0xFF (decimal 255), which marks the end of the compressed list.

For example, if the zlbytes value of a compressed list is 0x50 (80 in decimal), the compressed list occupies 80 bytes. If the zltail value is 0x3c (60 in decimal), then given a pointer p to the starting address of the compressed list, adding the offset 60 to p yields the address of entry3, the tail node of the list. A zllen value of 0x3 (3 in decimal) means that the compressed list has three nodes.
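The header example above can be reproduced with a small sketch. It assumes the header is laid out as two 4-byte integers and one 2-byte integer in little-endian order (the 4/4/2 widths are the ones this article uses in its size estimate); the entry bytes are zero-filled placeholders, and the helper names build_example_ziplist and parse_header are made up for illustration.

```python
import struct

HEADER_FORMAT = "<IIH"  # assumed layout: zlbytes (4), zltail (4), zllen (2), little-endian


def build_example_ziplist() -> bytes:
    """Build an 80-byte buffer whose header matches the example values."""
    header = struct.pack(HEADER_FORMAT, 0x50, 0x3C, 0x3)
    body = bytes(0x50 - len(header) - 1)  # placeholder bytes standing in for the entries
    return header + body + b"\xff"        # zlend marker


def parse_header(buf: bytes) -> dict:
    """Read zlbytes, zltail and zllen from the start, and zlend from the last byte."""
    zlbytes, zltail, zllen = struct.unpack_from(HEADER_FORMAT, buf, 0)
    return {"zlbytes": zlbytes, "zltail": zltail, "zllen": zllen, "zlend": buf[zlbytes - 1]}


info = parse_header(build_example_ziplist())
print(info)  # {'zlbytes': 80, 'zltail': 60, 'zllen': 3, 'zlend': 255}
```

In this sketch the tail node would start at index info["zltail"] of the buffer, which is the Python equivalent of adding the offset 60 to the pointer p.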

The compressed list saves memory because it stores data in a series of consecutive entries. Each entry consists of the following parts.

  • prev_len: the length of the previous entry. prev_len itself takes either 1 byte or 5 bytes: 1 byte if the previous entry is shorter than 254 bytes, and 5 bytes otherwise;

  • encoding: the encoding method, 1 byte;

  • len: the entry's own length, 4 bytes;

  • data: the actual data.
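As a rough sketch, the entry layout listed above can be turned into a size calculation. It uses only the simplified field sizes from this list (1 or 5 bytes for prev_len, 1 byte for encoding, 4 bytes for len), not the exact layout in the Redis source; the function names are made up for illustration.

```python
def prev_len_size(prev_entry_len: int) -> int:
    """Bytes needed for prev_len: 1 if the previous entry is under 254 bytes, else 5."""
    return 1 if prev_entry_len < 254 else 5


def entry_size(data_len: int, prev_entry_len: int = 0) -> int:
    """Size of one entry under the simplified model: prev_len + encoding + len + data."""
    encoding_size = 1
    len_size = 4
    return prev_len_size(prev_entry_len) + encoding_size + len_size + data_len


print(entry_size(8))                      # 14: an 8-byte value after a short entry
print(entry_size(8, prev_entry_len=300))  # 18: the same value after a 300-byte entry
```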

Because the ziplist is so memory-efficient, hash keys (Hash), list keys (List) and ordered set keys (Sorted Set) all start out with a ziplist as their underlying implementation.

Let's first see whether the Sorted Set type can be used to store the data.

When using the Sorted Set type, the first problem we face is this: each key corresponds to a single value, so how do we store such a single-value key-value pair in a collection type?

We know that each element of a Sorted Set has a member and a score, so we can split the picture ID into two parts. Specifically, use the first 7 digits of the picture ID as the key of the Sorted Set, the last 3 digits of the picture ID as the member, and the picture storage object ID as the score.

When a Sorted Set contains few elements, Redis stores it in a compressed list, which saves memory. However, a Sorted Set has to keep its elements ordered by score, so inserting data is comparatively slow.

Therefore, although the Sorted Set type can be used to save the picture ID and picture storage object ID, it is not the best option.

What about the List type?

The List type is not well suited to the one-to-one scenario of storing a picture ID and its picture storage object ID. We can use the Hash type instead.

Use the Hash type

We still split the picture ID into two parts as described above: use the first 7 digits of the picture ID as the key of the Hash, the last 3 digits of the picture ID as the field of the Hash, and the picture storage object ID as the value of that field.
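A minimal sketch of the split follows. It treats the last 3 digits as the Hash field and the storage object ID as the field's value, which is how the HSET command in this article's verification is written; the helper names split_picture_id and hset_command are hypothetical.

```python
from typing import Tuple


def split_picture_id(picture_id: int, field_digits: int = 3) -> Tuple[str, str]:
    """Split a picture ID into (hash key, hash field), keeping leading zeros in the field."""
    text = str(picture_id)
    return text[:-field_digits], text[-field_digits:]


def hset_command(picture_id: int, storage_id: int) -> str:
    """Render the HSET command that stores one picture ID -> storage object ID pair."""
    key, field = split_picture_id(picture_id)
    return f"HSET {key} {field} {storage_id}"


print(split_picture_id(1101000060))          # ('1101000', '060')
print(hset_command(1101000060, 3302000080))  # HSET 1101000 060 3302000080
```

The split is done on the decimal string rather than with integer division so that a field such as 060 keeps its leading zero.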

For the data 060, the encoding 11000000 is selected; similarly, the encoding for the data 3302000080 is 11100000.

Why these encodings? If this is not clear, it doesn't matter: it does not affect your understanding of this article. If you are interested, you can check the source code yourself.

Some entries store the last 3 digits of a picture ID (4 bytes), and others store a storage object ID (8 bytes). The prev_len of each entry needs only 1 byte, because the entry before it is always shorter than 254 bytes. An entry holding the last 3 digits of a picture ID therefore occupies 10 bytes (1+1+4+4 = 10), and an entry holding a storage object ID occupies 14 bytes (1+1+4+8 = 14), for which 16 bytes are actually allocated.

The memory occupied by the ziplist for 10 pictures is: 4 (zlbytes) + 4 (zltail) + 2 (zllen) + 10*10 (entry) + 16*10 (entry) + 1 (zlend) = 271 bytes.

Combined with the global hash table, the total is as follows:

10 pictures occupy 32 (dictEntry) + 8 (key) + 16 (redisObject) + 271 = 327 bytes.

This is less than half of the 688 bytes used when storing the same data with the String type.

Let's verify this in practice.

127.0.0.1:6379> info memory
# Memory
used_memory:871872
127.0.0.1:6379> hset 1101000 060 3302000080 061 3302000081 ...
(integer) 1
127.0.0.1:6379> info memory
# Memory
used_memory:872152

280 bytes are actually used.
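The two measurements can be compared with a few lines of arithmetic. The sketch below only uses the used_memory values quoted in this article; the function name used_delta is made up for illustration.

```python
def used_delta(before: int, after: int) -> int:
    """Growth of used_memory between two `info memory` readings, in bytes."""
    return after - before


string_bytes = used_delta(871840, 872528)  # 10 records stored as String keys
hash_bytes = used_delta(871872, 872152)    # the same 10 records stored in one Hash

print(string_bytes, hash_bytes)             # 688 280
print(string_bytes - hash_bytes)            # 408 bytes saved
print(round(string_bytes / hash_bytes, 2))  # 2.46
```

On these figures the Hash version uses roughly 40% of the memory of the String version.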

However, you may ask: must the picture ID 1101000060 be split as 7+3, that is, 1101000+060? Would it be okay to split it as 5+5, that is, 11010+00060?

Do you have to store keys in 7+3?

The answer is yes.

The Redis Hash type has two underlying data structures: one is the compressed list and the other is the hash table. The Hash type sets thresholds for storing data in a compressed list. Once a threshold is exceeded, the Hash type switches to a hash table to store the data.

If the number of elements written to a Hash exceeds hash-max-ziplist-entries (default 512), or the size of a single element written exceeds hash-max-ziplist-value (default 64 bytes), Redis automatically converts the underlying structure of the Hash from a compressed list to a hash table. In terms of saving memory, a hash table is not as efficient as a compressed list.

To keep using the compressed list and save memory, we generally need to control the number of elements stored in a Hash. That is why we use only the last 3 digits of the picture ID as the field of the Hash: this guarantees that a Hash never holds more than 1000 elements. At the same time, we set hash-max-ziplist-entries to 1000, so that the Hash can always use the compressed list to save memory.
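The threshold rule can be modelled in a few lines to compare the 7+3 and 5+5 splits. This is only a sketch of the rule as described above, not Redis code; the function names stays_compressed and max_fields are hypothetical, and the longest element is taken to be a 10-character storage object ID.

```python
DEFAULT_MAX_ENTRIES = 512  # hash-max-ziplist-entries default quoted above
DEFAULT_MAX_VALUE = 64     # hash-max-ziplist-value default quoted above, in bytes


def stays_compressed(num_fields: int, longest_element: int,
                     max_entries: int = DEFAULT_MAX_ENTRIES,
                     max_value: int = DEFAULT_MAX_VALUE) -> bool:
    """True while neither threshold is exceeded, so the compressed list is kept."""
    return num_fields <= max_entries and longest_element <= max_value


def max_fields(field_digits: int) -> int:
    """Upper bound on fields per Hash when the field is the last N decimal digits."""
    return 10 ** field_digits


ID_LENGTH = len("3302000080")  # longest element stored in these examples

# 7+3 split: up to 1000 fields per Hash.
print(stays_compressed(max_fields(3), ID_LENGTH))                    # False with the default 512
print(stays_compressed(max_fields(3), ID_LENGTH, max_entries=1000))  # True once raised to 1000

# 5+5 split: up to 100000 fields per Hash, far above either limit.
print(stays_compressed(max_fields(5), ID_LENGTH, max_entries=1000))  # False
```

Under this model a 5+5 split would only stay in a compressed list if hash-max-ziplist-entries were raised to 100000, which is why the field is kept to 3 digits here.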

References

  • For some commands in this article, refer to the Runoob tutorial: https://www.runoob.com/redis/redis-tutorial.html
  • Geek Time "Redis Core Technology and Practical Combat"
  • Book "Redis Design and Implementation"
  • Compressed list: https://redisbook.readthedocs.io/en/latest/compress-datastruct/ziplist.html
  • Hash table: http://redisbook.com/preview/object/hash.html

Origin blog.csdn.net/yang237061644/article/details/128911854