Implementing distributed locks with a database, Redis, or Zookeeper

   This article is excerpted from the following two blog posts:

            http://www.hollischuang.com/archives/1716

            https://blog.csdn.net/u010963948/article/details/79006572

    Distributed system: Taking a data-centric distributed system as an example, as the amount of data grows, a single or master-slave data storage server can no longer meet production requirements, so the data is split according to some algorithm and stored on different servers.

    Distributed lock: In a distributed database, when a shared resource is accessed, only one thread in the whole system may execute a given method on that data at any one time, in order to guarantee data consistency and avoid concurrency problems. In this scenario the table-level and row-level locks of a single database are no longer enough, and a cross-database lock, that is, a distributed lock, has to be designed. The core idea of a distributed lock is to use one globally visible object to control access to a resource in the distributed system; that object can be a row in a database table, a file, an object in a cache, an object in memory, and so on, hosted on a database, Redis, Memcached, Zookeeper, etc.

    Data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that "no distributed system can satisfy Consistency, Availability and Partition tolerance at the same time; at most two of them can be satisfied simultaneously." Therefore, many systems must make a trade-off among these three at design time. In the vast majority of Internet scenarios, strong consistency is sacrificed in exchange for high system availability, and the system often only needs to guarantee "eventual consistency", as long as the time to convergence stays within a range acceptable to users.

    In many scenarios, various techniques are needed to guarantee eventual consistency of data, such as distributed transactions and distributed locks. Sometimes we need to ensure that a method can only be executed by one thread at a time. In a stand-alone environment Java provides many concurrency APIs for this, but they are powerless in distributed scenarios: the pure Java API cannot provide distributed locking. Therefore, a variety of schemes exist for implementing distributed locks.

    For the implementation of distributed locks, the following schemes are commonly used:

Implement distributed locks based on a database

Implement distributed locks based on a cache (Redis, Memcached)

Implement distributed locks based on Zookeeper

    Before analyzing these implementation schemes, let's first think about what the distributed lock we need should look like. (A method lock is used as the example here; the same applies to resource locks.)

It can guarantee that, in a cluster of distributed application instances, the same method can only be executed by one thread on one machine at any given time.

The lock should be reentrant (to avoid deadlock).

The lock should preferably be a blocking lock (decide based on business requirements whether this is needed).

Acquiring and releasing the lock must be highly available.

Acquiring and releasing the lock must perform well.

Distributed lock based on database

Based on database table

The easiest way to implement a distributed lock is probably to create a lock table directly and implement locking by manipulating the data in that table.

When we want to lock a method or resource, we add a record to the table, and delete this record when we want to release the lock.

Create a database table like this:

CREATE TABLE `methodLock` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'name of the locked method',
  `desc` varchar(1024) NOT NULL DEFAULT '' COMMENT 'remark',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'time the row was saved, generated automatically',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='methods currently locked';

When we want to lock a method, execute the following SQL:

insert into methodLock(method_name, `desc`) values ('method_name', 'desc')

Because we have put a unique constraint on method_name, if multiple requests are submitted to the database at the same time, the database guarantees that only one insert can succeed. We can then consider that the thread whose insert succeeded has obtained the lock for that method and may execute the method body.

After the method has finished executing, release the lock by executing the following SQL:

delete from methodLock where method_name ='method_name'
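To make this record-based scheme concrete, here is a minimal sketch in Java. It assumes a JDBC connection to the database holding the methodLock table defined above; the class and method names are illustrative only.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class TableLockSketch {
    private final Connection connection;

    public TableLockSketch(Connection connection) {
        this.connection = connection;
    }

    // The insert succeeds for exactly one caller thanks to the unique index on method_name.
    public boolean tryLock(String methodName) {
        try (PreparedStatement ps = connection.prepareStatement(
                "insert into methodLock(method_name, `desc`) values (?, ?)")) {
            ps.setString(1, methodName);
            ps.setString(2, "locked by " + Thread.currentThread().getName());
            return ps.executeUpdate() == 1;
        } catch (SQLException duplicateKey) {
            // A duplicate-key violation means another thread already holds the lock.
            return false;
        }
    }

    // Deleting the record releases the lock.
    public void unlock(String methodName) {
        try (PreparedStatement ps = connection.prepareStatement(
                "delete from methodLock where method_name = ?")) {
            ps.setString(1, methodName);
            ps.executeUpdate();
        } catch (SQLException ignored) {
        }
    }
}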

The above simple implementation has the following problems:

1. The lock strongly depends on the availability of the database. The database is a single point; once it goes down, the business system becomes unavailable.

2. The lock has no expiration time. Once an unlock operation fails, the lock record stays in the database forever and no other thread can acquire the lock again.

3. The lock can only be non-blocking, because a failed insert reports an error immediately. A thread that fails to acquire the lock is not queued; to acquire it again it must re-trigger the acquire operation.

4. The lock is non-reentrant: the same thread cannot acquire it again before releasing it, because the record already exists in the table.

Of course, we can also have other ways to solve the above problems.

  • Is the database a single point? Use two databases with bidirectional data synchronization; once one goes down, quickly switch to the standby database.
  • No expiration time? Run a scheduled task that periodically cleans up timed-out lock records in the database.
  • Non-blocking? Use a while loop that keeps retrying until the insert succeeds, then report success.
  • Non-reentrant? Add fields to the table recording the host and thread that currently hold the lock. On the next acquisition, query the table first; if the stored host and thread information match the current machine and thread, grant the lock directly (a hedged sketch of this follows below).
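A hedged sketch of that last idea, building on the hypothetical TableLockSketch above and assuming two extra columns host_info and thread_info have been added to the methodLock table (they are not part of the table definition shown earlier):

// Reentrant variant: grant the lock immediately if this host/thread already owns it.
public boolean tryLockReentrant(String methodName, String hostInfo, String threadInfo) {
    try {
        try (PreparedStatement check = connection.prepareStatement(
                "select id from methodLock where method_name = ? and host_info = ? and thread_info = ?")) {
            check.setString(1, methodName);
            check.setString(2, hostInfo);
            check.setString(3, threadInfo);
            if (check.executeQuery().next()) {
                return true;            // we already hold the lock: reentrant acquisition
            }
        }
        try (PreparedStatement insert = connection.prepareStatement(
                "insert into methodLock(method_name, `desc`, host_info, thread_info) values (?, '', ?, ?)")) {
            insert.setString(1, methodName);
            insert.setString(2, hostInfo);
            insert.setString(3, threadInfo);
            return insert.executeUpdate() == 1;
        }
    } catch (SQLException duplicateKeyOrError) {
        // A duplicate-key violation means another owner holds the lock.
        return false;
    }
}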

Database exclusive lock

In addition to adding and deleting records in the table, distributed locks can also be implemented with the locks that come with the database itself.

We still use the database table we just created. Distributed locks can be implemented through database exclusive locks. Based on MySQL's InnoDB engine, the following method can be used to implement the locking operation:

public boolean lock() {
    try {
        // "connection" and "methodName" are assumed to be fields of the surrounding class
        connection.setAutoCommit(false);
        while (true) {
            try (PreparedStatement ps = connection.prepareStatement(
                    "select id from methodLock where method_name = ? for update")) {
                ps.setString(1, methodName);
                try (ResultSet rs = ps.executeQuery()) {
                    // If the row is returned, this transaction now holds the row's exclusive lock;
                    // concurrent "for update" queries on the same row block until we commit.
                    if (rs.next()) {
                        return true;
                    }
                }
            } catch (SQLException e) {
                // e.g. lock wait timeout -- fall through and retry
            }
            Thread.sleep(1000);
        }
    } catch (Exception e) {
        return false;
    }
}

Adding for update after the query statement makes the database place an exclusive lock on the data during the query. (When InnoDB takes locks, it only uses row-level locks if the retrieval goes through an index; otherwise it uses a table-level lock. Since we want a row-level lock here, an index must be added on method_name. Note that this index must be created as a unique index, otherwise the problem arises that multiple overloaded methods cannot be accessed at the same time; for overloaded methods it is recommended to include the parameter types as well.) Once a record carries an exclusive lock, no other thread can place an exclusive lock on that row.

We can consider that the thread holding the exclusive lock holds the distributed lock. Once the lock is obtained, the business logic of the method can be executed; after the method finishes, the lock is released as follows:

public void unlock(){
    connection.commit();
}

The lock is released through the connection.commit() operation.

This approach effectively solves the problems mentioned above of locks that cannot be released and of the lock being non-blocking.

  • Blocking lock? The for update statement returns immediately when it acquires the lock and blocks while it cannot, until it succeeds.
  • The service goes down after taking the lock and cannot release it? With this approach the database releases the lock itself once the service (and its connection) goes down.

However, it still cannot directly solve the problems of the database single point and of reentrancy.

There may be another problem here: although we use a unique index on method_name and explicitly use for update to take a row-level lock, MySQL may optimize the query. Even when the condition uses an indexed field, whether the index is actually used is decided by MySQL by comparing the cost of different execution plans. If MySQL thinks a full table scan is more efficient, for example on very small tables, it will not use the index, in which case InnoDB takes a table lock instead of a row lock. If this happens, use force index to force MySQL to use the index.
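For example (assuming the unique index is named uidx_method_name, as in the table definition above):

select * from methodLock force index(uidx_method_name) where method_name = 'method_name' for update;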

Another problem is that the exclusive lock ties up a database connection: if the locking transaction is not committed for a long time, it occupies the connection, and once there are too many such connections the database connection pool may be exhausted. If this happens, you can set a timeout for SQL execution, for example <setting name="defaultStatementTimeout" value="25"/> in the global settings of MyBatis.

Summary

To summarize the ways of using a database to implement a distributed lock: both rely on a table in the database. One determines whether a lock currently exists by whether a record exists in the table; the other implements the distributed lock through the database's exclusive locks.

The advantages of implementing distributed locks in databases

Straightforward and easy to understand, using nothing but the database.

Disadvantages of implementing distributed locks in databases

All kinds of problems arise, and solving them makes the whole scheme increasingly complicated.

Operating the database incurs a certain overhead, so performance has to be considered.

Using database row-level locks is not necessarily reliable, especially when the lock table is small.

Distributed lock based on cache

Compared with the database-based schemes, a cache-based implementation performs better.

There are many mature caching products, including Redis, Memcached, etc. Here we take Redis as an example to analyze how to implement a distributed lock with a cache.

There are many articles online about implementing distributed locks with Redis. The main approach is to use the setNX (SET if Not eXists) command, for example via Jedis:

public boolean tryLock(String key) {
    // setnx returns 1 when the key was set (lock acquired) and 0 when it already exists
    Long result = jedis.setnx(key, "This is a Lock.");
    return result != null && result == 1L;
}

public boolean unlock(String key) {
    // releasing the lock is simply deleting the key
    return jedis.del(key) > 0;
}

The above implementation also has several problems:

  • 1. Single point problem.

  • 2. This lock has no expiration time. Once the unlock operation fails, the lock record will always be in redis, and other threads can no longer obtain the lock.

  • 3. This lock can only be non-blocking, and returns directly regardless of success or failure.

  • 4. This lock is non-reentrant. After a thread acquires the lock, it cannot acquire the lock again before releasing the lock, because the key used already exists in redis. The setNX operation can no longer be performed.

  • 5. This lock is unfair. All waiting threads initiate the setNX operation at the same time, and the lucky thread can acquire the lock.

Of course, there are also ways to solve it.

  • Mainstream cache services now all support cluster deployment, which solves the single-point problem.

  • No expiration time? Redis supports setting an expiration time on a key (the expire command), after which the data is deleted automatically; the SET command can even set the value and the expiration atomically together with NX (a sketch combining the two appears after this list).

  • Non-blocking? Retry in a while loop.

  • Non-reentrant? After acquiring the lock, the client saves its host and thread information, and before the next acquisition it checks whether it is already the owner of the lock.

  • Unfair? Put all waiting threads into a queue before they try to acquire the lock, and grant the lock in first-in, first-out order.
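As a hedged sketch of how points 2 and 4 can be addressed together (assuming Jedis 3.x; the key naming and the ownerId convention, e.g. host name plus thread id, are illustrative):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLockSketch {
    private final Jedis jedis;

    public RedisLockSketch(Jedis jedis) {
        this.jedis = jedis;
    }

    public boolean tryLock(String key, String ownerId, long ttlMillis) {
        // Reentrant case: this owner already holds the lock.
        if (ownerId.equals(jedis.get(key))) {
            return true;
        }
        // SET key value NX PX ttl: "set if not exists" plus expiration in one atomic command,
        // so a crashed client can never leave the lock behind forever.
        return "OK".equals(jedis.set(key, ownerId, SetParams.setParams().nx().px(ttlMillis)));
    }

    public boolean unlock(String key, String ownerId) {
        // Compare-and-delete in a Lua script so we never delete a lock we no longer own.
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] then "
                + "return redis.call('del', KEYS[1]) else return 0 end";
        Object result = jedis.eval(script, 1, key, ownerId);
        return Long.valueOf(1L).equals(result);
    }
}

Note that this simplified reentrant check does not refresh the TTL or track a re-entry count; it only illustrates the idea.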

Using a Redis cluster solves the single-point problem, but the cluster's replication takes time: it is possible that thread A acquires the lock after a successful setNX, yet the value has not been propagated to the node on which thread B executes its setNX, in which case B also acquires the lock and a concurrency problem arises.

Salvatore Sanfilippo, the author of Redis, proposed the Redlock algorithm, which implements a distributed lock manager (DLM) that is safer and more reliable than a single-node lock.

The Redlock algorithm assumes N Redis nodes that are independent of each other, with N usually set to 5, running on different machines to keep them physically independent.

The steps of the algorithm are as follows:

  • 1. The client obtains the current time in milliseconds.
  • 2. The client tries to acquire the lock on each of the N nodes in turn (each node is locked in the same way as the single-node cache lock above), using the same key and value on all N nodes. The client must set a per-node request timeout that is much smaller than the lock's expiration time; for example, if the lock auto-releases after 10 s, the request timeout should be around 5-50 ms. That way, when a Redis node is down, the request to it times out as quickly as possible and the impact on normal lock usage is minimized.
  • 3. The client computes how long acquiring the lock took by subtracting the time recorded in step 1 from the current time. Only if the client acquired the lock on a majority of the nodes (at least 3 out of 5) and the time spent is less than the lock's expiration time is the distributed lock considered acquired.
  • 4. The validity time of the lock the client holds is the configured lock expiration time minus the time spent acquiring it, as computed in step 3.
  • 5. If the client failed to acquire the lock, it deletes the lock on all of the nodes in turn.
    Using the Redlock algorithm, the distributed lock service keeps working even when up to 2 nodes are down, which greatly improves availability compared with the database and single-node cache locks above; and because Redis is efficient, the performance of the distributed cache lock is no worse than that of the database lock. (A hedged sketch of these steps follows.)
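The following is a hedged sketch of those five steps, not the official Redlock client: the node list, timings and drift allowance are assumed values, and the short per-node request timeouts would be configured on each Jedis connection.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;
import java.util.List;
import java.util.UUID;

public class RedlockSketch {
    private static final int QUORUM = 3;               // majority of 5 nodes
    private static final long LOCK_TTL_MS = 10_000;    // lock auto-release time
    private static final long CLOCK_DRIFT_MS = 10;     // small allowance for clock drift

    private final List<Jedis> nodes;                   // one connection per independent Redis node

    public RedlockSketch(List<Jedis> nodes) {
        this.nodes = nodes;
    }

    // Returns the owner token if the lock was acquired on a majority of nodes, otherwise null.
    public String tryLock(String key) {
        String token = UUID.randomUUID().toString();    // same key and value on every node
        long start = System.currentTimeMillis();        // step 1: current time in milliseconds
        int acquired = 0;
        for (Jedis node : nodes) {                      // step 2: try to lock each node
            try {
                if ("OK".equals(node.set(key, token, SetParams.setParams().nx().px(LOCK_TTL_MS)))) {
                    acquired++;
                }
            } catch (Exception downOrTimedOut) {
                // a down or unreachable node simply counts as a failed acquisition
            }
        }
        long elapsed = System.currentTimeMillis() - start;      // step 3: time spent acquiring
        long validity = LOCK_TTL_MS - elapsed - CLOCK_DRIFT_MS; // step 4: remaining validity time
        if (acquired >= QUORUM && validity > 0) {
            return token;
        }
        unlock(key, token);                                     // step 5: failed, release everywhere
        return null;
    }

    public void unlock(String key, String token) {
        // Delete on every node, but only if the value still matches our token.
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] then "
                + "return redis.call('del', KEYS[1]) else return 0 end";
        for (Jedis node : nodes) {
            try {
                node.eval(script, 1, key, token);
            } catch (Exception ignored) {
            }
        }
    }
}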

However, a distributed-systems expert, Martin Kleppmann, wrote an article, "How to do distributed locking", questioning the correctness of Redlock.

The expert points out that there are two aspects to weigh with distributed locks: performance and correctness.

If you need a high-performance distributed lock and do not require strict correctness, a cache-based lock is sufficient.

If you need a highly reliable distributed lock, then strict correctness has to be considered, and Redlock does not meet the correctness requirement. Why not? The expert lists several reasons.

The virtual machines used by many programming languages have garbage collection. During a full GC the program pauses to process the GC, sometimes for a long time, even freezing for minutes; the article cites HBase as an example, where GC occasionally takes minutes and causes leases to time out. A full GC can arrive at any moment, and the program cannot control when it pauses. In the figure below, for example, client 1 acquires the lock and, just before processing the shared resource, hits a full GC that lasts until the lock expires. Client 2 then acquires the lock and starts processing the shared resource; while client 2 is still processing, client 1 finishes its full GC and also starts processing the shared resource, so both clients end up processing it at the same time.

(Figure: unsafe-lock — client 1's GC pause outlives the lock, so client 1 and client 2 both touch the shared resource.)

The expert's proposed remedy, shown in the figure below, looks like MVCC: attach a fencing token to the lock. The token is essentially a version number that is incremented by 1 on every lock acquisition and is carried along when operating on the shared resource; only requests carrying the expected (latest) token are allowed to touch the shared resource.

(Figure: fencing-token — every lock grant carries an increasing token, and the storage rejects writes with an older token.)
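A tiny sketch of the fencing-token idea on the storage side (names are illustrative; the point is only that the storage remembers the highest token it has seen and rejects anything older):

public class FencedStorage {
    private long highestTokenSeen = -1;
    private String data;

    // Accept a write only if its fencing token is not older than the newest one seen so far.
    public synchronized boolean write(long fencingToken, String value) {
        if (fencingToken < highestTokenSeen) {
            return false;       // stale client, e.g. its lock expired during a long GC pause
        }
        highestTokenSeen = fencingToken;
        data = value;
        return true;
    }

    public synchronized String read() {
        return data;
    }
}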

The expert then adds that the algorithm relies on local time: when Redis handles key expiration it relies on gettimeofday rather than a monotonic clock, which introduces timing inaccuracy. For example, consider two clients, client 1 and client 2, and five Redis nodes (A, B, C, D and E).

  • 1. Client 1 successfully acquires the lock on A, B and C; its requests to D and E time out on the network.
  • 2. Node C's clock is inaccurate, causing the lock on C to expire early.
  • 3. Client 2 successfully acquires the lock on C, D and E; its requests to A and B time out on the network.
  • 4. As a result, both client 1 and client 2 hold the lock.

To summarize, the expert's two arguments for why Redlock is unsafe are:

  • 1. Pauses such as GC can occur at any time; a client that has acquired the lock can overrun the lock's lifetime while processing, letting another client acquire the lock. The expert's remedy for this is the incrementing fencing token.
  • 2. The algorithm relies on local time, and clocks may be inaccurate, leading to situations where two clients hold the lock at the same time.
    The expert's conclusion is therefore that Redlock only works correctly with bounded network delay, bounded process pauses and a bounded clock error range; since none of these bounds can be guaranteed, the expert does not recommend Redlock. For scenarios with high correctness requirements the expert recommends Zookeeper; using Zookeeper as a distributed lock is discussed later.

Response from the Redis author

After seeing the expert's article, the Redis author wrote a blog post in response: he politely thanked the expert and then explained why he disagreed.

I asked for an analysis in the original Redlock specification here: http://redis.io/topics/distlock. So thank you Martin. However I don’t agree with the analysis.

The Redis author's response to the expert's fencing-token approach to the lock timeout problem can be summarized in the following five points:

  • Opinion 1: distributed locks are generally used precisely because there is no other way to control the shared resource. The expert uses a fencing token to guard processing of the shared resource, and in that case a distributed lock is not needed at all.
  • Opinion 2: to generate tokens whose order different clients can rely on, the token-generating service itself needs a distributed lock to guarantee its reliability.
  • Opinion 3: the Redis author thinks the expert's auto-incrementing token is completely unnecessary. Each client can generate a unique UUID as its token and set the shared resource into a state that only the client holding that UUID can process; other clients then cannot touch the shared resource until the lock-holding client releases the lock.
  • Opinion 4: the Redis author believes the ordering of tokens cannot solve the GC problem the expert raised. As in the figure above, if the client holding token 34 hits a GC pause while writing and its lock times out, another client may obtain the lock with token 35 and start writing, again producing conflicting writes; so the ordering of tokens cannot really be combined with the shared resource.
  • Opinion 5: the Redis author believes that in most scenarios distributed locks are used for updates in non-transactional settings, meaning some shared resources are hard to protect with tokens, so one has to rely on a lock to lock the resource before processing it.

The Redis author also addressed the clock problem the expert raised: the time a client may actually hold the lock is the configured expiration time minus the time it took to acquire the lock. If acquiring the lock takes too long and exceeds the lock's expiration time, the client simply does not get the lock, so the scenario the expert describes does not arise.

Personal thoughts

The first problem, as I see it, is that after a client acquires a distributed lock, the lock may be released by timeout while the client is still processing. Besides uncontrollable pauses such as GC, it may simply be that the program flow has not finished. I mentioned earlier that our database lock's timeout was set to 2 minutes: if a task holds an order lock for more than 2 minutes, another trading center can acquire the same order lock, and the two trading centers then process the same order at the same time. Normally a task finishes in seconds, but sometimes the timeout configured for an RPC call is too long, and if a task contains several such timed-out calls it can easily exceed the automatic unlock time. Our transaction module was originally written in C++ and had no GC; had it been written in Java, a full GC could also strike in the middle. After the lock is released by timeout the client cannot perceive it, which is a very serious matter. I do not think this is a problem of the lock itself: any of the distributed locks above has this problem as long as it releases on timeout. If the timeout feature of a lock is used, the client must track the lock's deadline and react accordingly instead of blindly continuing to process the shared resource. In Redlock's algorithm, after the client acquires the lock it is told how long it may hold it; the client must respect that time and stop the task once it is exceeded.

The second problem is, naturally, that the distributed expert misreads Redlock. A key feature of Redlock is that the time a client may hold the lock is the lock's configured expiration time minus the time spent acquiring it, so the client's processing window is a relative duration and does not depend on local wall-clock time.

From this point of view, Redlock's correctness can be reasonably well guaranteed. Looking at Redlock carefully, compared with single-node Redis the most important thing Redlock provides is higher reliability, which matters in some scenarios. But I feel that Redlock pays too high a price for that reliability.

  • First, 5 nodes have to be deployed to make Redlock more reliable.
  • Then, acquiring the lock requires requesting all 5 nodes. Sending the requests concurrently (for example via Futures) and collecting the responses together shortens the response time, but it still costs more than a single-node Redis lock.
  • Then, because the lock must be acquired on more than 3 of the 5 nodes, acquisition conflicts can occur, i.e. every client grabs 1-2 locks and nobody reaches a majority. For this the Redis author borrows the essence of the Raft algorithm: retrying after a random delay greatly shortens the conflict window, but the problem cannot be avoided entirely, especially on the first acquisition, so the time cost of acquiring the lock increases.
  • If 2 of the 5 nodes are down, the availability of the lock drops sharply: first, the requests to the two dead nodes have to time out before the results come back; second, only 3 nodes remain and the client must acquire the lock on all 3 of them to hold the lock, which is also harder.
  • If there is a network partition, the client may never be able to acquire the lock.

After listing all these reasons, I think the most critical point about Redlock is that it requires the clients themselves to ensure consistent writes: the five backend nodes are completely independent, and every client has to operate all five of them. If the five nodes had a leader, the client would only need to acquire the lock from the leader and the other nodes would replicate the leader's data, and problems such as partitions, timeouts and conflicts would not exist. So to guarantee the correctness of a distributed lock, I think a strongly consistent distributed coordination service is the better solution.

Advantages of using a cache to implement distributed locks

It has good performance and is more convenient to implement.

Disadvantages of using a cache to implement distributed locks

It is not very reliable to control the expiration time of the lock through the timeout period.

Distributed lock based on Zookeeper

Distributed locks can be implemented based on Zookeeper's ephemeral sequential nodes.

The general idea is: when a client wants to lock a method, it creates a unique ephemeral sequential node under the directory node designated for that method on Zookeeper. Determining whether the lock has been acquired is simple: the client only needs to check whether its node has the smallest sequence number among the ordered nodes. To release the lock, it simply deletes its ephemeral node. This also avoids the deadlock problem of a lock that can never be released because the service holding it went down.

Let's see if Zookeeper can solve the problems mentioned above.

  • The lock cannot be released? Zookeeper solves this effectively: when taking the lock the client creates an ephemeral node in ZK, and if the client that acquired the lock suddenly crashes (its session disconnects), the ephemeral node is deleted automatically and other clients can acquire the lock again.

  • Non-blocking lock? Zookeeper can provide a blocking lock: clients create sequential nodes in ZK and register watchers on them. Whenever the nodes change, Zookeeper notifies the client, which checks whether its own node now has the smallest sequence number of all the nodes; if so, it holds the lock and executes the business logic (see the sketch after this list).

  • Not reentrant? Zookeeper also solves non-reentrancy effectively: when creating its node, the client writes its host and thread information directly into the node. On the next acquisition it compares this with the data of the current smallest node; if the information matches its own, it takes the lock directly, otherwise it creates another ephemeral sequential node and joins the queue.

  • Single point? Zookeeper solves the single-point problem effectively: ZK is deployed as a cluster, and as long as more than half of the machines in the cluster are alive it can serve requests.

  • Fairness? Zookeeper solves the fair-lock problem: the ephemeral nodes the clients create in ZK are ordered, and each time the lock is released ZK can notify the client with the smallest node to take the lock, which guarantees fairness.
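Below is a hedged sketch of the ephemeral-sequential-node approach using the raw ZooKeeper client. The lock directory path and the "watch the predecessor" detail are choices of this sketch; in practice you would normally use Curator, shown further below.

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ZkLockSketch {
    private final ZooKeeper zk;
    private final String lockDir;   // e.g. "/locks/methodName", assumed to exist already
    private String ourNode;

    public ZkLockSketch(ZooKeeper zk, String lockDir) {
        this.zk = zk;
        this.lockDir = lockDir;
    }

    public void lock() throws Exception {
        // 1. Create an ephemeral sequential node; it disappears automatically if our session dies.
        ourNode = zk.create(lockDir + "/lock-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        while (true) {
            List<String> children = zk.getChildren(lockDir, false);
            Collections.sort(children);
            // 2. The node with the smallest sequence number owns the lock.
            if (ourNode.equals(lockDir + "/" + children.get(0))) {
                return;
            }
            // 3. Otherwise watch the node just in front of us and wait until it goes away.
            int ourIndex = children.indexOf(ourNode.substring(lockDir.length() + 1));
            String previous = lockDir + "/" + children.get(ourIndex - 1);
            CountDownLatch latch = new CountDownLatch(1);
            if (zk.exists(previous, event -> latch.countDown()) != null) {
                latch.await();
            }
        }
    }

    public void unlock() throws Exception {
        // Deleting our node wakes up the next waiter in line.
        zk.delete(ourNode, -1);
    }
}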

Here a new question arises: we know that Zookeeper needs to be deployed as a cluster, so will it have data synchronization problems like a Redis cluster?

Zookeeper is a distributed component that guarantees weak consistency or eventual consistency.

Zookeeper uses a data synchronization protocol called the Quorum Based Protocol. Suppose the Zookeeper cluster has N servers (N is usually odd; 3 already gives good data reliability with high read/write performance, and 5 gives the best balance between reliability and performance). A user's write operation is first synchronized to N/2 + 1 servers and only then acknowledged to the user as successful. This quorum-based protocol determines what strength of consistency Zookeeper can support.

In a distributed environment, data stores that satisfy strong consistency barely exist: strong consistency requires that whenever one node's data is updated, all nodes are updated synchronously. This strategy appears in synchronously replicated master-slave databases, but it hurts write performance so much that it is rarely seen in practice. Because Zookeeper only writes N/2 + 1 nodes synchronously and the remaining N/2 nodes are not updated synchronously, Zookeeper is not strongly consistent.

A user's update is not guaranteed to be visible to an immediately following read, but consistency is eventually reached. Sacrificing consistency does not mean ignoring it completely; otherwise the data would be chaotic and the system worthless no matter how available or distributed it is. It only means dropping the strong consistency of relational databases and requiring the system to reach eventual consistency.

Whether Zookeeper satisfies causal consistency depends on the programming method of the client.

Practices that do not satisfy causal consistency

    1. Process A writes a piece of data to Zookeeper's /z and returns successfully.
    2. Process A tells process B that it has modified the data at /z.
    3. B reads the data at Zookeeper's /z.
    4. Since the Zookeeper server B is connected to may not yet have received the data written by A, B may not be able to read A's data.

Approaches to Satisfying Causal Consistency

    1. Process B registers a watch on the data of /z on Zookeeper.
    2. Process A writes a piece of data to Zookeeper's /z. Before returning success, Zookeeper invokes the watchers registered on /z, and the Leader sends B the data-change notification.
    3. After B's event handler receives the notification, it fetches the changed data, so B is guaranteed to see the changed value.
    4. Causal consistency here refers to the causal consistency between the Leader and B, i.e. the Leader notifying B that the data has changed.

This second, watch-based mechanism is also the way Zookeeper should be programmed, so Zookeeper can be considered to satisfy causal consistency.
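A minimal sketch of that watch-based pattern with the raw ZooKeeper client (connection setup is omitted and the path /z follows the example above):

import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class CausalReadSketch {
    // Process B: register a watch on /z, then re-read when the leader notifies us of a change.
    public static void readWithWatch(ZooKeeper zk) throws Exception {
        byte[] current = zk.getData("/z", event -> {
            try {
                // The notification means the update has been committed by the leader,
                // so this read observes the new value.
                byte[] updated = zk.getData("/z", false, new Stat());
                System.out.println("changed: " + new String(updated));
            } catch (Exception ignored) {
            }
        }, new Stat());
        System.out.println("current: " + new String(current));
    }
}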

Therefore, when implementing a distributed lock on Zookeeper, we should use the approach that satisfies causal consistency: the threads waiting for the lock watch the lock node on Zookeeper, and when the lock is released Zookeeper sends the change notification to the waiting thread, which satisfies the conditions of a fair lock.

You can directly use Curator, a third-party Zookeeper client library, which encapsulates a reentrant lock service.
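For reference, a minimal sketch of how such a lock object might be constructed (the connection string, retry policy and lock path are assumed values; the interProcessMutex created here is then used as in the snippet below):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public static InterProcessMutex createLock() {
    CuratorFramework client = CuratorFrameworkFactory.newClient(
            "zk1:2181,zk2:2181,zk3:2181",           // assumed ZK connection string
            new ExponentialBackoffRetry(1000, 3));  // retry policy: base sleep 1 s, up to 3 retries
    client.start();
    // One mutex per lock path; reentrant within the same thread.
    return new InterProcessMutex(client, "/locks/methodName");
}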

public boolean tryLock(long timeout, TimeUnit unit) throws InterruptedException {
    try {
        return interProcessMutex.acquire(timeout, unit);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return false;
}

public boolean unlock() {
    try {
        interProcessMutex.release();
    } catch (Throwable e) {
        log.error(e.getMessage(), e);
    } finally {
        executorService.schedule(new Cleaner(client, path), delayTimeForClean, TimeUnit.MILLISECONDS);
    }
    return true;
}

The InterProcessMutex provided by Curator is one implementation of a distributed lock; the acquire method is used to acquire the lock and the release method to release it.

The distributed lock implemented with ZK seems to fully satisfy all the expectations we listed for a distributed lock at the beginning of this article. However, it is not perfect: the Zookeeper-based lock has a drawback, namely that its performance may not be as high as a cache-based one, because every acquisition and release of the lock dynamically creates and destroys ephemeral nodes, and in ZK nodes can only be created and deleted through the Leader server, with the data then synchronized to all the Follower machines.

In fact, Zookeeper can also give rise to concurrency problems, though they are not common. Consider this situation: because of network jitter, the client's session with the ZK cluster is broken; ZK then believes the client has died and deletes the ephemeral node, at which point another client can obtain the distributed lock, and a concurrency problem may arise. This problem is not common because ZK has a retry mechanism: once the ZK cluster stops receiving the client's heartbeat it retries, and the Curator client supports several retry policies; the ephemeral node is deleted only after the retries fail. (So it is also important to choose an appropriate retry policy and to find the balance between lock granularity and concurrency.)

Summary

Advantages of using Zookeeper to implement distributed locks

It effectively solves the single-point, non-reentrancy and non-blocking problems as well as the problem of locks that cannot be released, and it is relatively simple to implement.

Disadvantages of using Zookeeper to implement distributed locks

The performance is not as good as using a cache to implement distributed locks. Requires some understanding of the principles of ZK.

Comparison of the three schemes

None of the above schemes is perfect. Just like CAP, complexity, reliability and performance cannot all be satisfied at once, so the right approach is to choose the one that best fits the application scenario.

From the perspective of difficulty of understanding (low to high)

Database > Cache > Zookeeper

From an implementation complexity perspective (low to high)

Zookeeper >= cache > database

From a performance perspective (high to low)

Cache > Zookeeper >= Database

From a reliability perspective (high to low)

Zookeeper > Cache > Database
