A Comprehensive Explanation of the LRU Algorithm

Conceptual understanding

1. LRU is the acronym for Least Recently Used, the least-recently-used page-replacement algorithm. It serves virtual-memory page management by making eviction decisions based on how pages have been used. Since the future use of each page cannot be predicted, the "recent past" can only be used as an approximation of the "near future"; therefore, the LRU algorithm evicts the page that has not been used for the longest time.

2. As operating-systems courses teach, LRU is a strategy for evicting old content when memory is insufficient: Least Recently Used, i.e. eliminate what has gone unused the longest. Two more points can be added. In a computer's architecture, the largest and most reliable storage is the hard disk: its capacity is huge and its contents are persistent, but access is slow, so frequently used content must be loaded into memory. Memory is fast, but its capacity is limited and its contents are lost on power failure. To improve performance further, there are also the L1 and L2 caches inside the CPU. The faster a storage level is, the higher its cost per unit and the smaller its capacity; as new content is constantly loaded in, old content must inevitably be evicted. This is the background in which LRU is used.

LRU principle

You can use a special stack to hold the page numbers of the pages currently in use. When a process accesses a page, its page number is pushed onto the top of the stack and the other page numbers shift toward the bottom; if memory is insufficient, the page number at the bottom of the stack is removed. This way, the top of the stack is always the most recently accessed page number, and the bottom of the stack is the page number of the least recently accessed page.

Standard operating-systems textbooks usually demonstrate the LRU principle as follows. Assume memory can hold only three pages, and the pages are accessed in the order 7, 0, 1, 2, 0, 3, 0, 4. If the pages in memory are ordered by access time as described above, with the most recently accessed page on top and the least recently accessed page at the bottom, LRU works like this.
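This walkthrough can be checked with a short simulation (a sketch only; the class name LruPageDemo and the LinkedList-based bookkeeping are illustrative choices, not from the original text):

```java
import java.util.LinkedList;

public class LruPageDemo {
    // Simulate LRU page replacement; returns the number of page faults.
    // The most recently used page is kept at the tail of `frames`.
    static int simulate(int capacity, int[] refs, LinkedList<Integer> frames) {
        int faults = 0;
        for (int page : refs) {
            if (frames.remove(Integer.valueOf(page))) {
                frames.addLast(page);            // hit: refresh recency
            } else {
                faults++;                        // miss: page fault
                if (frames.size() == capacity) {
                    frames.removeFirst();        // evict the LRU page
                }
                frames.addLast(page);
            }
        }
        return faults;
    }

    public static void main(String[] args) {
        LinkedList<Integer> frames = new LinkedList<>();
        int faults = simulate(3, new int[]{7, 0, 1, 2, 0, 3, 0, 4}, frames);
        System.out.println("faults=" + faults + " frames=" + frames);
        // prints: faults=6 frames=[3, 0, 4]
    }
}
```

Running it for the textbook sequence gives 6 page faults, with pages 3, 0 and 4 left in memory at the end.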

But if we design an LRU cache ourselves this way, it has a serious problem: keeping memory sorted by access time requires a great deal of memory copying, so the performance is certainly unacceptable.

So how do we design an LRU cache in which both insertion and removal are O(1)? We need to maintain the access order, but without physically sorting entries in memory. One solution is a doubly linked list.

Implementing LRU with a HashMap and a doubly linked list

Java's LinkedHashMap already implements this hash linked list well. Note that it is not thread-safe; to achieve thread safety, we need to add the synchronized modifier (or wrap it, e.g. with Collections.synchronizedMap).
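For example, a minimal LRU cache on top of LinkedHashMap might look like this (a sketch; the class name LinkedHashMapLRU is made up for illustration). Passing accessOrder = true to the constructor and overriding removeEldestEntry is all that is needed:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LinkedHashMapLRU<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LinkedHashMapLRU(int capacity) {
        // accessOrder = true keeps entries ordered by access, not insertion,
        // so the eldest entry is always the least recently used one.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every put; returning true evicts the LRU entry.
        return size() > capacity;
    }
}
```

Because LinkedHashMap re-links a node on every access when accessOrder is true, both get and put stay O(1).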

The overall design idea is to use a HashMap to store the keys, so that both save and get run in O(1) time, while each HashMap value points to a node (Node) of the doubly linked list that implements the LRU order, as shown in the figure.
The LRU storage itself is implemented with a doubly linked list; the figure below illustrates how it works, where h represents the head of the list and t the tail. First, the LRU capacity is fixed in advance; when it is full, the node at the tail of the doubly linked list can be evicted in O(1) time. Each time data is added or accessed, a new node can be attached at the head, or an existing node moved to the head, also in O(1) time.

The following shows how an LRU cache with capacity 3 changes as data is stored and accessed. To keep the figure simple, the HashMap part is not drawn; only the changes to the LRU doubly linked list are illustrated. The sequence of operations on this LRU cache is as follows:

save("key1", 7)

save("key2", 0)

save("key3", 1)

save("key4", 2)

get("key2")

save("key5", 3)

get("key2")

save("key6", 4)

The corresponding changes to the LRU doubly linked list are as follows:

To sum up, the core operations are:

  1. save(key, value): first look up the node for Key in the HashMap. If the node exists, update its value and move the node to the head of the queue. If it does not exist, construct a new node and insert it at the head of the queue; if the LRU cache is out of space, evict the node at the tail and remove its Key from the HashMap as well.
  2. get(key): find the LRU linked-list node via the HashMap, move the node to the head of the queue, and return the cached value.

The complete Java reference code is as follows.

class DLinkedNode {
	String key;
	int value;
	DLinkedNode pre;
	DLinkedNode post;
}

LRU Cache

import java.util.Hashtable;

public class LRUCache {
   
    private Hashtable<String, DLinkedNode>
            cache = new Hashtable<String, DLinkedNode>();
    private int count;
    private int capacity;
    private DLinkedNode head, tail;
 
    public LRUCache(int capacity) {
        this.count = 0;
        this.capacity = capacity;
 
        head = new DLinkedNode();
        head.pre = null;
 
        tail = new DLinkedNode();
        tail.post = null;
 
        head.post = tail;
        tail.pre = head;
    }
 
    public int get(String key) {
 
        DLinkedNode node = cache.get(key);
        if(node == null){
            return -1; // should raise exception here.
        }
 
        // move the accessed node to the head;
        this.moveToHead(node);
 
        return node.value;
    }
 
 
    public void set(String key, int value) {
        DLinkedNode node = cache.get(key);
 
        if(node == null){
 
            DLinkedNode newNode = new DLinkedNode();
            newNode.key = key;
            newNode.value = value;
 
            this.cache.put(key, newNode);
            this.addNode(newNode);
 
            ++count;
 
            if(count > capacity){
                // pop the tail
                DLinkedNode tail = this.popTail();
                this.cache.remove(tail.key);
                --count;
            }
        }else{
            // update the value.
            node.value = value;
            this.moveToHead(node);
        }
    }
    /**
     * Always add the new node right after head;
     */
    private void addNode(DLinkedNode node){
        node.pre = head;
        node.post = head.post;
 
        head.post.pre = node;
        head.post = node;
    }
 
    /**
     * Remove an existing node from the linked list.
     */
    private void removeNode(DLinkedNode node){
        DLinkedNode pre = node.pre;
        DLinkedNode post = node.post;
 
        pre.post = post;
        post.pre = pre;
    }
 
    /**
     * Move certain node in between to the head.
     */
    private void moveToHead(DLinkedNode node){
        this.removeNode(node);
        this.addNode(node);
    }
 
    // pop the current tail.
    private DLinkedNode popTail(){
        DLinkedNode res = tail.pre;
        this.removeNode(res);
        return res;
    }
}

Redis's LRU implementation

An implementation based on a HashMap and doubly linked list needs extra storage for the next and prev pointers, sacrificing a fair amount of memory, which is clearly not worthwhile at Redis's scale. So Redis uses an approximation: it takes a random sample of keys, orders them by access time, and evicts the least recently used key of the sample. The details are as follows.

To support LRU, Redis 2.8.19 uses a global LRU clock, server.lruclock, defined as follows:

#define REDIS_LRU_BITS 24
unsigned lruclock:REDIS_LRU_BITS; /* Clock for LRU eviction */

The default resolution of the LRU clock is 1 second; it can be changed via the REDIS_LRU_CLOCK_RESOLUTION macro. Redis calls updateLRUClock in serverCron() to update the LRU clock periodically; the update frequency depends on the hz parameter and defaults to once every 100 ms, as follows:

#define REDIS_LRU_CLOCK_MAX ((1<<REDIS_LRU_BITS)-1) /* Max value of obj->lru */
#define REDIS_LRU_CLOCK_RESOLUTION 1 /* LRU clock resolution in seconds */
 
void updateLRUClock(void) {
    server.lruclock = (server.unixtime / REDIS_LRU_CLOCK_RESOLUTION) &
                                                REDIS_LRU_CLOCK_MAX;
}

server.unixtime is the current Unix timestamp of the system. When lruclock exceeds REDIS_LRU_CLOCK_MAX it wraps around and starts from zero, so when computing how long a key has gone unaccessed, the lru time stored in the key itself may be larger than the current lruclock; in that case the wrap-around must be accounted for, as follows:

/* Given an object returns the min number of seconds the object was never
 * requested, using an approximated LRU algorithm. */
unsigned long estimateObjectIdleTime(robj *o) {
    if (server.lruclock >= o->lru) {
        return (server.lruclock - o->lru) * REDIS_LRU_CLOCK_RESOLUTION;
    } else {
        return ((REDIS_LRU_CLOCK_MAX - o->lru) + server.lruclock) *
                    REDIS_LRU_CLOCK_RESOLUTION;
    }
}

The Redis eviction policies related to LRU are:

  • volatile-lru: only keys with an expiration time set take part in the approximate LRU eviction
  • allkeys-lru: all keys take part in the approximate LRU eviction

When performing LRU eviction, Redis proceeds as follows:

......
            /* volatile-lru and allkeys-lru policy */
            else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
            {
                for (k = 0; k < server.maxmemory_samples; k++) {
                    sds thiskey;
                    long thisval;
                    robj *o;
 
                    de = dictGetRandomKey(dict);
                    thiskey = dictGetKey(de);
                    /* When policy is volatile-lru we need an additional lookup
                     * to locate the real key, as dict is set to db->expires. */
                    if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                        de = dictFind(db->dict, thiskey);
                    o = dictGetVal(de);
                    thisval = estimateObjectIdleTime(o);
 
                    /* Higher idle time is better candidate for deletion */
                    if (bestkey == NULL || thisval > bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }
            }
            ......

Redis selects a fixed number of keys based on the server.maxmemory_samples configuration, compares their lru access times, and evicts the key that has gone unaccessed the longest. The larger maxmemory_samples is, the closer Redis's approximate LRU comes to strict LRU, but the cost rises accordingly and performance suffers somewhat; the default sample size is 5.
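The sampling idea can be sketched in Java as a simplified model (this is not Redis's actual code; the class SampledLRU, the logical clock, and the shuffle-based sampling are illustrative assumptions): keep a last-access time per key, sample up to a fixed number of keys, and evict the one idle the longest.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class SampledLRU {
    private final Map<String, Integer> values = new HashMap<>();
    private final Map<String, Long> lastAccess = new HashMap<>();
    private final int capacity;
    private final int samples;
    private long clock = 0;                  // logical clock, like server.lruclock
    private final Random rnd = new Random(42);

    SampledLRU(int capacity, int samples) {
        this.capacity = capacity;
        this.samples = samples;
    }

    Integer get(String key) {
        if (!values.containsKey(key)) return null;
        lastAccess.put(key, ++clock);        // record the access time
        return values.get(key);
    }

    void put(String key, int value) {
        if (!values.containsKey(key) && values.size() >= capacity) evictOne();
        values.put(key, value);
        lastAccess.put(key, ++clock);
    }

    // Sample up to `samples` keys and evict the one idle the longest,
    // mimicking Redis's maxmemory_samples behaviour.
    private void evictOne() {
        List<String> keys = new ArrayList<>(values.keySet());
        Collections.shuffle(keys, rnd);
        int n = Math.min(samples, keys.size());
        String victim = null;
        long oldest = Long.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            String k = keys.get(i);
            long t = lastAccess.get(k);
            if (t < oldest) { oldest = t; victim = k; }
        }
        values.remove(victim);
        lastAccess.remove(victim);
    }
}
```

When samples is at least as large as the number of keys, this degenerates to exact LRU; with a smaller sample it trades accuracy for speed, which is the same trade-off maxmemory_samples controls in Redis.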

Practical applications

1. The LRU algorithm can also be used in real applications: if you build a browser, or a client application like Taobao's, you will use this principle. As everyone knows, while browsing, a browser temporarily saves downloaded images in a local folder, and on the next visit it reads them directly from that folder. But the temporary image folder has a capacity limit: if you browse too many pages, the images you use least will be deleted, and only the most recently used ones are kept. This is where the LRU algorithm applies. In that case, the special stack in the algorithm above no longer stores page numbers but each image's identifier or size; if the stack's elements are represented with an Object class, the stack can hold any of these objects.

2. Understanding via a comic strip

User information is of course stored in the database, but because the performance requirements on the user system are high, we obviously cannot query the database on every request.

So a hash table is created in memory as a cache: every lookup of a user first queries the hash table, improving access performance.
A problem arises (shown in comic panels not reproduced here): the hash table alone keeps no record of which entries were least recently used.

The solution: a hash linked list.
So what is a hash linked list?

As we all know, a hash table is made up of key-value pairs. Logically, these key-value pairs have no particular order; which comes first makes no difference.

In a hash linked list, the key-value pairs are no longer unrelated to one another; they are strung together by a chain. Each key-value pair has a predecessor and a successor, just like the nodes of a doubly linked list.

In this way, the originally unordered hash table gains a fixed ordering.
Let us take the user-information requirement as an example and demonstrate the basic idea of the LRU algorithm:

1. Suppose we use a hash linked list to cache user information, and 4 users are currently cached, inserted one after another from the right end of the list in chronological order.

2. Now the business side accesses user 5. Since the hash linked list contains no data for user 5, we read it from the database and insert it into the cache. The rightmost node of the list is now the newly accessed user 5, and the leftmost is the least recently accessed user 1.

3. Next, the business side accesses user 2, whose data does exist in the hash linked list. What do we do? We remove user 2's node from between its predecessor and successor and re-insert it at the far right of the list. Now the rightmost node is the most recently accessed user 2, and the leftmost is still the least recently accessed user 1.
4. Next, the business side requests a modification of user 4's information. In the same way, we move user 4's node from its original position to the far right of the list and update the user-information value. Now the rightmost node is the most recently accessed user 4, and the leftmost is still the least recently accessed user 1.

5. Later, the business side switches tack and accesses user 6. User 6 is not in the cache and must be inserted into the hash linked list. Suppose the cache capacity has reached its limit at this point: the least recently accessed data must be deleted first, so user 1 at the far left of the list is removed, and then user 6 is inserted at the far right.
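The five steps above can be replayed with Java's LinkedHashMap acting as the hash linked list (a sketch; the class UserCacheDemo and the userN/infoN values are invented for the demonstration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UserCacheDemo {
    // Build a 5-slot LRU user cache and replay the comic's access sequence.
    static LinkedHashMap<String, String> run() {
        final int capacity = 5;
        // accessOrder = true: iteration runs from least to most recently used.
        LinkedHashMap<String, String> cache =
                new LinkedHashMap<String, String>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, String> e) {
                        return size() > capacity;  // evict the leftmost (LRU) user
                    }
                };
        for (int i = 1; i <= 4; i++) cache.put("user" + i, "info" + i); // step 1
        cache.put("user5", "info5");  // step 2: miss -> load from DB, insert
        cache.get("user2");           // step 3: hit -> user2 moves to the MRU end
        cache.put("user4", "info4b"); // step 4: an update also refreshes recency
        cache.put("user6", "info6");  // step 5: cache full -> user1 is evicted
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(run().keySet());
        // prints: [user3, user5, user2, user4, user6]
    }
}
```

After the sequence, user 1 has been evicted and user 6 sits at the most recently used end, exactly as in the walkthrough.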
Practical note: in production a Redis cache is used; the point here is to understand the principle!


Origin blog.csdn.net/belongtocode/article/details/102989685