Introduction and implementation of distributed locks (with source code)

In many scenarios, eventual consistency of data has to be backed by supporting techniques such as distributed transactions and distributed locks. So what exactly is a distributed lock, which business scenarios call for one, and how can it be implemented?

1. Why use distributed locks

When developing an application, if multiple threads need synchronized access to a shared variable, we can use the ordinary in-process locks we already know.

However, ordinary locks only work within a single machine. As the business grows, the number of users and visits increases significantly, so we need to build a cluster: the application is deployed to several machines behind a load balancer, as shown in the figure below:

[Figure: Simple Cluster Diagram]

As the figure shows, variable A exists in the memory of three servers (variable A here is typically a member variable of a class, that is, a stateful object). Without any control, each server allocates its own copy of variable A. If three requests arrive at the same time and operate on this variable, the result is obviously wrong. Even if they do not arrive at the same time, the three requests operate on data in three separate memory areas; the copies of variable A are not shared and changes are not visible across servers, so the result is still wrong.
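The effect can be reproduced in a single process. The following is a minimal sketch, assuming three hypothetical `AppServer` objects stand in for the three machines and a round-robin loop stands in for the load balancer; none of these names come from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one deployed application instance. */
class AppServer {
    // "Variable A": a member variable that lives only in this instance's memory.
    private int a = 0;

    // synchronized protects 'a' against threads inside this instance only.
    synchronized int increment() {
        return ++a;
    }

    synchronized int current() {
        return a;
    }
}

/** Hypothetical round-robin load balancer in front of three instances. */
class ClusterDemo {
    /** Spreads the requests over three servers and returns each server's view of A. */
    static List<Integer> sendRequests(int requestCount) {
        List<AppServer> servers =
                Arrays.asList(new AppServer(), new AppServer(), new AppServer());
        for (int i = 0; i < requestCount; i++) {
            servers.get(i % servers.size()).increment();
        }
        List<Integer> counts = new ArrayList<>();
        for (AppServer server : servers) {
            counts.add(server.current());
        }
        return counts;
    }
}
```

After three requests, each instance holds its own value of 1 and none of them sees 3, even though every individual increment was correctly synchronized inside its own instance.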

If this scenario does exist in our business, we need a way to solve this problem.

To ensure that a method or property is executed by only one thread at a time under high concurrency, a traditional monolithic application deployed on a single machine can rely on the language's concurrency facilities for mutual exclusion. However, as the business grows and the single-machine system evolves into a distributed cluster, the system becomes multi-threaded, multi-process, and spread across different machines. The single-machine locking strategy then no longer works, because in-process locks cannot coordinate across machines. Solving this requires a cross-machine mutual exclusion mechanism to control access to shared resources, which is exactly the problem distributed locks address.

2. What conditions should a distributed lock have?

Before analyzing the three implementations of distributed locks, let us first look at the conditions a distributed lock should meet:

1. In a distributed environment, a method can be executed by only one thread on one machine at a time;

2. Lock acquisition and release are highly available;

3. Lock acquisition and release are high-performance;

4. It is reentrant;

5. It has a lock expiration mechanism to prevent deadlock;

6. It supports non-blocking acquisition: if the lock cannot be acquired, failure is returned immediately.
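Conditions 1, 4, and 6 can be expressed as a small contract. The following is a minimal sketch only: the `SimpleLock` interface and the `InMemoryLock` class are hypothetical names, and the state is kept in local memory purely for illustration, whereas a real distributed lock would keep it in a shared store.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical contract covering mutual exclusion, reentrancy, and non-blocking acquisition. */
interface SimpleLock {
    /** Non-blocking: returns false immediately if another owner holds the lock. */
    boolean tryLock(String resource, String ownerId);

    /** Releases one hold; the lock is freed when the hold count reaches zero. */
    void unlock(String resource, String ownerId);
}

/** In-memory stand-in (assumption): a real implementation would use a shared store. */
class InMemoryLock implements SimpleLock {
    private static final class Holder {
        final String ownerId;
        int holdCount = 1;

        Holder(String ownerId) {
            this.ownerId = ownerId;
        }
    }

    private final Map<String, Holder> holders = new HashMap<>();

    @Override
    public synchronized boolean tryLock(String resource, String ownerId) {
        Holder holder = holders.get(resource);
        if (holder == null) {
            holders.put(resource, new Holder(ownerId)); // mutual exclusion: first caller wins
            return true;
        }
        if (holder.ownerId.equals(ownerId)) {
            holder.holdCount++; // reentrancy: the same owner may acquire again
            return true;
        }
        return false; // non-blocking: report failure instead of waiting
    }

    @Override
    public synchronized void unlock(String resource, String ownerId) {
        Holder holder = holders.get(resource);
        if (holder == null || !holder.ownerId.equals(ownerId)) {
            return; // only the current owner may release
        }
        if (--holder.holdCount == 0) {
            holders.remove(resource);
        }
    }
}
```

An owner identifier such as `"machine1:thread1"` combines machine and thread, since a thread id alone is not unique across machines.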

3. Common implementations of distributed locks

Most large-scale websites and applications today are deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that no distributed system can satisfy Consistency, Availability, and Partition tolerance all at once; at most two can be satisfied at the same time. Many systems therefore have to trade off among the three at design time. In most Internet scenarios, strong consistency is sacrificed in exchange for high availability, and the system only needs to guarantee eventual consistency, as long as the time to converge is acceptable to users.

In many scenarios, eventual consistency has to be backed by supporting techniques such as distributed transactions and distributed locks. Sometimes we need to ensure that a method is executed by only one thread at a time. There are three common ways to implement this:

1. Distributed locks based on a database;

2. Distributed locks based on a cache (Redis, etc.);

3. Distributed locks based on ZooKeeper.

4. Database-based implementation

The core idea of the database-based implementation is to create a table containing fields such as the method name, with a unique index on the method name field. To execute a method, insert a row with that method name into the table. If the insert succeeds, the lock is acquired; after execution completes, delete the row to release the lock.

1. Implementation steps:

(1) Create a table

CREATE TABLE `method_lock` (
  `id` int unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` varchar(64) NOT NULL COMMENT 'name of the locked method',
  `desc` varchar(255) NOT NULL COMMENT 'remarks',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb3 COMMENT='methods currently locked';

(2) To execute a method, insert a row into the table using the method name

INSERT INTO method_lock (method_name, `desc`) VALUES ('methodName', 'test methodName');

Because method_name has a unique constraint, if multiple requests are submitted to the database at the same time, the database guarantees that only one insert succeeds. The thread whose insert succeeded is considered to hold the lock for that method and may execute the method body.

(3) If the insert succeeds, the lock is acquired; after execution completes, delete the corresponding row to release the lock

delete from method_lock where method_name ='methodName';

Note: This is only one database-based approach; there are many other ways to implement distributed locks with a database.
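The insert-then-delete protocol can be modelled without a database. The sketch below is an in-memory simulation, not JDBC code: the hypothetical `MethodLockTable` uses a concurrent set to stand in for the unique index on method_name, so that `insert` succeeds for exactly one of several competing callers.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for the method_lock table and its unique index. */
class MethodLockTable {
    private final Set<String> rows = ConcurrentHashMap.newKeySet();

    /** Models the INSERT: succeeds only if no row with this method_name exists. */
    boolean insert(String methodName) {
        return rows.add(methodName);
    }

    /** Models DELETE ... WHERE method_name = ?. */
    void delete(String methodName) {
        rows.remove(methodName);
    }
}

class MethodLockDemo {
    /** Starts competing callers at the same moment and returns how many acquired the lock. */
    static int countWinners(int callers) {
        MethodLockTable table = new MethodLockTable();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        for (int i = 0; i < callers; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // hold every caller until all are ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (table.insert("methodName")) {
                    winners.incrementAndGet(); // this caller holds the lock
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return winners.get();
    }
}
```

Because no caller deletes its row in this demo, `countWinners` reports a single winner however many callers compete. With a real database the same outcome would show up as one successful INSERT and a duplicate-key error for every other caller.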

2. Disadvantages of using database to implement distributed lock

This database-based implementation is very simple to use, but measured against the conditions a distributed lock should meet, several problems still need to be solved:

1. Because it is built on the database, the database's availability and performance directly determine those of the distributed lock. The database therefore needs a two-node deployment with data synchronization and active-standby failover;

2. It is not reentrant: until the thread releases the lock, its row still exists, so the same thread cannot insert again. A new column is needed to record the machine and thread that currently hold the lock. When acquiring the lock again, first check whether the machine and thread recorded in the table match the current ones; if they do, the lock is granted directly;

3. There is no lock expiration mechanism: the server may go down after a successful insert, leaving the row undeleted, so the lock cannot be acquired even after the service recovers. A new column is needed to record the expiration time, along with a scheduled task that clears expired rows;

4. It is not a blocking lock: if the lock cannot be acquired, failure is returned immediately. To wait for the lock, the acquisition logic has to retry in a loop;

5. Other problems will surface during implementation, and solving them makes the implementation increasingly complicated. Relying on the database also has a resource cost, so performance must be considered.
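The expiration column and cleanup task from point 3 might look roughly like this. It is a sketch under assumptions: `ExpiringLockTable`, the `expire_time` column name, and the explicit `now` parameter (used so the behaviour is deterministic) are all hypothetical, and the map again stands in for the real table.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of the lock table with an added expire_time column (epoch millis). */
class ExpiringLockTable {
    // method_name -> expire_time
    private final Map<String, Long> rows = new ConcurrentHashMap<>();

    /** Models the INSERT, recording when the lock should be considered stale. */
    boolean insert(String methodName, long now, long ttlMillis) {
        return rows.putIfAbsent(methodName, now + ttlMillis) == null;
    }

    /** Models the normal release: DELETE ... WHERE method_name = ?. */
    void delete(String methodName) {
        rows.remove(methodName);
    }

    /** Models the scheduled task: DELETE ... WHERE expire_time <= now. Returns rows removed. */
    int purgeExpired(long now) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = rows.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= now) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

The time-to-live has to be comfortably longer than the longest expected execution time; otherwise the cleanup task could remove the row of a slow holder that is still working, and a second caller would acquire the lock concurrently.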

5. Redis-based implementation

1. Reasons for using Redis to implement distributed locks:

(1) Redis has high performance;

(2) Redis commands support this use case well, so it is convenient to implement;

2. Commands used:

(1) SETNX
SETNX key val: if and only if the key does not exist, set the key to the string val and return 1; if the key already exists, do nothing and return 0.

(2) EXPIRE
EXPIRE key timeout: set a timeout on the key, in seconds. After this time the key is deleted automatically, which releases the lock and avoids deadlock.

(3) DEL
DEL key: delete the key.

When using Redis to implement distributed locks, these three commands are mainly used.

3. Implementation ideas

(1) When acquiring the lock, use SETNX to set the lock key and EXPIRE to give it a timeout, after which the lock is released automatically. Because these are two separate commands, a crash between them would leave a lock with no timeout; setting the value and the timeout in a single SET command with the NX option avoids this. The lock's value is a randomly generated UUID, which is checked when releasing the lock.

(2) When acquiring the lock, an acquisition timeout is also set. If this time is exceeded, the attempt to acquire the lock is abandoned.

(3) When releasing the lock, use the UUID to check whether the lock is still the one this client acquired. If it is, execute DEL to release it.
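Why the UUID check matters can be shown with a small simulation. The sketch below does not talk to Redis: `FakeRedis` is a hypothetical in-memory stand-in whose two methods model a set-if-absent with expiry and a compare-then-delete, with the current time passed in explicitly so the sequence of events is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the Redis key space; not a Redis client. */
class FakeRedis {
    private static final class Entry {
        final String value;
        final long expiresAt;

        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();

    /** Models set-if-absent with a timeout: succeeds if the key is missing or expired. */
    synchronized boolean setIfAbsent(String key, String token, long now, long ttlMillis) {
        Entry entry = data.get(key);
        if (entry != null && entry.expiresAt > now) {
            return false; // someone else still holds a live lock
        }
        data.put(key, new Entry(token, now + ttlMillis));
        return true;
    }

    /** Models compare-then-delete: removes the key only if it still holds the caller's token. */
    synchronized boolean deleteIfValueMatches(String key, String token, long now) {
        Entry entry = data.get(key);
        if (entry == null || entry.expiresAt <= now || !entry.value.equals(token)) {
            return false; // lock expired or now belongs to another client
        }
        data.remove(key);
        return true;
    }
}
```

Suppose client A acquires the lock with a one-second timeout but runs long, and client B acquires it after it expires. When A finally calls `deleteIfValueMatches` with its own token, the call returns false and B's lock survives. A plain unconditional delete would have released B's lock instead.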

4. Simple implementation code of distributed lock

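import java.util.Collections;
import java.util.List;
import java.util.UUID;

import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.connection.RedisStringCommands;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.script.RedisScript;
import org.springframework.data.redis.core.types.Expiration;

/**
 * A distributed lock implemented with Redis; for demonstration only.
 */
@Slf4j
public class RedisLock implements AutoCloseable {

    /**
     * Note: Jedis, Redisson, and similar clients can also be used to implement distributed locks.
     */
    private final RedisTemplate redisTemplate;
    private final String key;
    private final String value;
    /**
     * Unit: seconds
     */
    private final int expireTime;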

    public RedisLock(RedisTemplate redisTemplate, String key, int expireTime) {
        this.redisTemplate = redisTemplate;
        this.key = key;
        this.expireTime = expireTime;
        this.value = UUID.randomUUID().toString();
    }

    /**
     * Acquire the distributed lock.
     */
    public Boolean getLock() {
        RedisCallback<Boolean> redisCallback = connection -> {
            // Set the NX option (only set if the key is absent)
            RedisStringCommands.SetOption setOption = RedisStringCommands.SetOption.ifAbsent();
            // Set the expiration time
            Expiration expiration = Expiration.seconds(expireTime);
            // Serialize the key
            byte[] redisKey = redisTemplate.getKeySerializer().serialize(key);
            // Serialize the value
            byte[] redisValue = redisTemplate.getValueSerializer().serialize(value);
            // Set the value and the expiration in a single SET command with NX
            return connection.set(redisKey, redisValue, expiration, setOption);
        };

        // Try to acquire the distributed lock
        return (Boolean) redisTemplate.execute(redisCallback);
    }

    /**
     * Release the distributed lock.
     * What the Lua script does:
     * GET the value stored under the lock key; if it equals the value passed in, delete the key.
     * Here the stored value is the UUID, so the key is deleted, and the lock released,
     * only if the stored UUID matches the UUID of this lock instance.
     */
    public Boolean unLock() {
        String script = "if redis.call(\"get\",KEYS[1]) == ARGV[1] then\n" +
                "    return redis.call(\"del\",KEYS[1])\n" +
                "else\n" +
                "    return 0\n" +
                "end";
        RedisScript<Boolean> redisScript = RedisScript.of(script, Boolean.class);
        List<String> keys = Collections.singletonList(key);

        Boolean result = (Boolean) redisTemplate.execute(redisScript, keys, value);
        log.info("Lock release result: {}", result);
        return result;
    }

    /**
     * Release the lock when used in a try-with-resources statement.
     */
    @Override
    public void close() {
        unLock();
    }
}
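The class above makes a single attempt in getLock and does not implement the acquisition timeout mentioned in the implementation ideas. One way to layer it on top is a generic retry loop. The following is a sketch only: `LockRetry` and `acquireWithin` are hypothetical names, and the helper accepts any non-blocking attempt rather than depending on Redis.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: retries a non-blocking lock attempt until a deadline passes. */
class LockRetry {
    /**
     * @param tryAcquire          one non-blocking attempt; returns true once the lock is held
     * @param timeoutMillis       how long to keep trying before giving up
     * @param retryIntervalMillis pause between attempts
     * @return true if the lock was acquired before the deadline
     */
    static boolean acquireWithin(BooleanSupplier tryAcquire,
                                 long timeoutMillis,
                                 long retryIntervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (tryAcquire.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // acquisition timeout exceeded: abandon the attempt
            }
            try {
                Thread.sleep(retryIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

With the RedisLock class this could be called as `LockRetry.acquireWithin(() -> Boolean.TRUE.equals(lock.getLock()), 3000, 50)`. The interval and timeout values here are illustrative; shorter intervals reduce waiting time but put more load on Redis.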

6. Implementation based on ZooKeeper

ZooKeeper is an open source component that provides coordination services for distributed applications. Internally it is a hierarchical tree of nodes resembling a file system directory structure, and node names must be unique within the same directory.

1. The steps to implement distributed locks based on ZooKeeper are as follows:

(1) Create a directory mylock;

(2) To acquire the lock, thread A creates an ephemeral sequential node under the mylock directory;

(3) Get all child nodes under mylock and look for sibling nodes with a smaller sequence number than your own. If none exists, the current thread's sequence number is the smallest and it holds the lock;

(4) Thread B gets all child nodes, finds that its node is not the smallest, and sets a watch on the node immediately preceding its own;

(5) When thread A finishes, it deletes its node. Thread B receives the change event, checks whether its node is now the smallest, and acquires the lock if so.
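The decision made in steps (3) and (4) is a pure function of the child list, so it can be sketched without a ZooKeeper connection. In the sketch below, `SequentialNodeLock` and the `lock-` name prefix are hypothetical; the zero-padded numeric suffix follows the format ZooKeeper appends to sequential nodes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper modelling steps (3) and (4); it does not connect to ZooKeeper. */
class SequentialNodeLock {
    /** Extracts the numeric sequence suffix, e.g. "lock-0000000003" -> 3. */
    static int sequenceOf(String nodeName) {
        return Integer.parseInt(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    /**
     * Returns empty if 'self' has the smallest sequence number (the lock is held),
     * otherwise the node immediately before it, which is the one to watch.
     */
    static Optional<String> nodeToWatch(List<String> children, String self) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(SequentialNodeLock::sequenceOf));
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("own node not found: " + self);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

Watching only the immediate predecessor, rather than the whole mylock directory, means a release wakes up one waiter instead of all of them.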

Here we recommend Curator, an Apache open source ZooKeeper client library. Curator's InterProcessMutex is a distributed lock implementation: the acquire method acquires the lock and the release method releases it.

Advantages: It offers high availability, reentrancy, and blocking locks, and it avoids deadlocks from locks that are never released, because an ephemeral node is removed automatically when its client session ends.

Disadvantages: Because nodes are created and deleted frequently, performance is not as good as the Redis approach.

7. Summary

None of the three implementations above is perfect for every situation, so the most suitable one should be chosen for each application scenario.

In a distributed environment, locking a resource is sometimes essential, for example in flash sales where many users compete for a limited resource. Distributed locks control access well in such cases.

In practice many factors need to be considered, such as the choice of lock timeout and of acquisition timeout, both of which strongly affect concurrency. The distributed lock implemented above is only a simple implementation meant to convey the idea; a real implementation needs to be refined according to the business scenario and actual requirements.

8. Implementation code

The linked source code is more complete than the implementation described in this article and is a useful reference for readers who want a deeper understanding of distributed locks.

If this helps you, please give the project a star. Thank you.

=> Distributed lock implementation source link

Origin blog.csdn.net/qq_41378597/article/details/123384266