How to achieve smooth expansion of a horizontally sharded database

In the previous post, "Application of Consistent Hash on DynamoDB", a special highlight was that DynamoDB can expand dynamically without downtime.

This is a great advantage over our commonly used sharding (sub-database, sub-table) solutions, where expansion is a real headache. Introducing consistent hashing at the db layer, or adding middleware to support it, is too costly; without it, expansion can only be handled through downtime maintenance, which hurts high availability.

Is there a way to scale quickly without reducing availability? In this article we walk through the expansion schemes for sharded databases and tables, for everyone to discuss together.

 

1. The horizontal sharding expansion problem

To increase the concurrency a db can handle, a common solution is to shard the data, commonly called splitting into sub-databases and sub-tables. This requires estimating the data volume in the planning stage, so that enough databases are allocated in advance.

For example, suppose three databases (A, B, and C) are planned, and data is sharded by the remainder of uid. The routing rule is then: uid%3=0 goes to library A, uid%3=1 to library B, and uid%3=2 to library C.

As we can see, the data is distributed evenly across the three databases.
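As a minimal sketch (the list of library names and the `get_database` helper are illustrative assumptions, not from the original article), the routing logic looks like this in Python:

```python
# Modulo-based shard routing: uid's remainder picks the library.
DATABASES = ["A", "B", "C"]  # three pre-allocated databases

def get_database(uid: int) -> str:
    """Route a uid to its library by taking the remainder."""
    return DATABASES[uid % len(DATABASES)]

print(get_database(7))  # 7 % 3 = 1 -> library "B"
```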

However, what do we do if the business later grows rapidly, the number of users increases substantially, and the current capacity can no longer support it?

We have to expand the database horizontally, adding new libraries to share the load. Once a new library is added, data that was previously sharded across three libraries can be sharded across four.

However, because the sharding rule changes (uid%3 becomes uid%4), most of the data no longer lands on its original database and has to be redistributed, which means migrating a large amount of data.

For example, uid1 was previously routed to library A via uid1%3. After library D is added and the rule becomes uid1%4, it may now be routed to library B. If you have read the earlier "Principle and Practice of Consistent Hashing", you will recognize the problem: with plain modulo sharding, adding a node forces roughly n/(n+1) of the data to move, about 75% when going from 3 libraries to 4. That is serious pressure on the DBAs, so how should we deal with it?
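That ratio is easy to check empirically; here is a quick sketch (the uid range is an arbitrary choice for illustration):

```python
# Count how many uids change libraries when moving from uid%3 to uid%4.
OLD_LIBS = ["A", "B", "C"]
NEW_LIBS = ["A", "B", "C", "D"]

total = 1_000_000
moved = sum(1 for uid in range(total)
            if OLD_LIBS[uid % 3] != NEW_LIBS[uid % 4])
print(f"{moved / total:.0%} of rows must migrate")  # prints "75%"
```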

There are generally the following ways.

 

2. Stop-the-server migration

The most common solution is to stop the service and migrate. The general process is as follows:

  1. Estimate the downtime and publish a maintenance announcement

  2. Stop the service and run the data migration tool prepared in advance, migrating data according to the new sharding rules (see the sketch after this list)

  3. Modify the sharding rules

  4. Start the service
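A minimal sketch of what the migration tool in step 2 does; the in-memory dicts stand in for real databases, and the row format is an assumption for illustration:

```python
# Offline re-sharding: the service is stopped, so no concurrent writes.
old_dbs = {name: {} for name in ("A", "B", "C")}
new_dbs = {name: {} for name in ("A", "B", "C", "D")}
OLD = ["A", "B", "C"]
NEW = ["A", "B", "C", "D"]

# pretend the old cluster holds some rows keyed by uid
for uid in range(12):
    old_dbs[OLD[uid % 3]][uid] = {"uid": uid}

# copy every row to wherever the new rule (uid % 4) says it belongs
for db in old_dbs.values():
    for uid, row in db.items():
        new_dbs[NEW[uid % 4]][uid] = row

print({name: sorted(db) for name, db in new_dbs.items()})
```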

We can see that this method is relatively safe: no data is written while the service is stopped, so the migration can proceed normally and there are no consistency problems. The only problems are the downtime and the time pressure:

  1. A service outage hurts the user experience and reduces availability

  2. The migration must be completed within the announced window; if it fails, it has to be redone on another day. This puts extra pressure on developers and makes major accidents more likely.

  3. When the data volume is huge, the migration takes a long time

Is there any other way to improve it? Let's take a look at the following two solutions.

 

3. Promote the slave libraries

For online databases, to maintain high availability, we generally configure a slave library for each master: reads and writes go to the master, and changes are replicated to the slave. For example, A and B are master libraries, and A0 and B0 are their slaves.

When capacity needs to be expanded, we promote A0 and B0 to be new master nodes, going from 2 sub-libraries to 4. At the same time, we set up the mapping in the upper-layer sharding configuration as follows:

uid%4=0 and uid%4=2 point to A and A0 respectively; that is, the data that previously matched uid%2=0 is split between uid%4=0 and uid%4=2

uid%4=1 and uid%4=3 point to B and B0 respectively; that is, the data that previously matched uid%2=1 is split between uid%4=1 and uid%4=3

Because A and A0 hold identical data (as do B and B0), no data migration is needed at this point. Only the sharding configuration has to change, and it can be pushed through the configuration center without a restart, as sketched below.
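A minimal sketch of that configuration change (the dict-based routing table and the `route` helper are assumptions for illustration):

```python
# Before promotion: 2-way sharding keyed by uid % 2.
SHARD_MAP_OLD = {0: "A", 1: "B"}
# After promoting A0 and B0: 4-way sharding keyed by uid % 4.
SHARD_MAP_NEW = {0: "A", 1: "B", 2: "A0", 3: "B0"}

def route(uid: int, shard_map: dict) -> str:
    return shard_map[uid % len(shard_map)]

uid = 6
assert route(uid, SHARD_MAP_OLD) == "A"   # 6 % 2 = 0 -> A
assert route(uid, SHARD_MAP_NEW) == "A0"  # 6 % 4 = 2 -> A0 (replica of A)
print("every uid still finds its data after the switch")
```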

Since data that was previously spread over 2 libraries is now spread over 4, each library still holds old rows that no longer route to it (library A, for example, now serves only uid%4=0 but still holds its uid%4=2 half), so the redundant data must be cleaned up once.

This cleanup does not affect the consistency of online data, so it can be done at any time.
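A hedged sketch of that cleanup for library A, assuming a MySQL-style `user` table with a `uid` column and a hypothetical `execute` helper that runs a statement and returns the affected row count; deleting in small batches keeps the impact low:

```python
# Remove rows that no longer route to library A (it now serves uid%4=0 only).
BATCH = 1000

def cleanup_library_a(execute) -> None:
    while True:
        # uid%4 here is 0 or 2, so "!= 0" matches exactly the redundant half
        deleted = execute(
            "DELETE FROM user WHERE uid %% 4 != 0 LIMIT %s", (BATCH,)
        )
        if deleted == 0:
            break  # nothing redundant left
```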

After the cleanup is complete, to keep the cluster highly available and ready for further expansion, each of the existing masters can be given a new slave library.

To summarize, the steps of this scheme are:

  1. Modify the sharding configuration, mapping the new libraries onto the old ones

  2. Push out the configuration and promote the slave libraries to masters

  3. Break the master-slave replication

  4. Clean up the redundant data

  5. Attach a new slave library to each new data node

 

4. Double-write migration

The double-write solution is mostly used for online database migration. Since expanding a sharded database also requires migrating data, double writing can help with the expansion problem as well.

The splitting principle is the same as above; only the data synchronization method differs.

 

1. Add a write link to the new library

The core of double writing is to add a new library alongside the ones that need to be expanded, add a write link in the existing sharding layer, and write every change to both libraries at the same time.

Because the new library starts out empty, writing to it has no side effects, and the upper logic layer continues to treat the old library's data as authoritative.
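A minimal double-write sketch (the class and method names are illustrative assumptions, not from the article); reads stay on the old library while every write goes to both:

```python
# Double-write wrapper: the old library remains the source of truth.
class DoubleWriteStore:
    def __init__(self, old_db: dict, new_db: dict):
        self.old_db = old_db  # authoritative during migration
        self.new_db = new_db  # starts empty, shadows every write

    def write(self, uid: int, row: dict) -> None:
        self.old_db[uid] = row   # primary write
        self.new_db[uid] = row   # shadow write to the new library

    def delete(self, uid: int) -> None:
        self.old_db.pop(uid, None)
        self.new_db.pop(uid, None)

    def read(self, uid: int) -> dict:
        return self.old_db[uid]  # reads served from the old library only
```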

2. Migrate data from the old library to the new one

Use a tool to migrate the old library's data into the new one. You can either synchronize only the half that will belong to the new library (1/2 of the data) or synchronize everything. Full synchronization is generally recommended, since it makes the final data verification easier.

3. Data verification

In an ideal environment, the data on both sides is consistent after the migration because of the double writes; inserts and updates in particular converge very well. In a real environment, however, network delays make deletes problematic. For example:

Library A deletes row a while row a is still in flight to library C and has not been written there yet. The delete against library C has already executed and removes nothing, so once the migrated copy lands, library C ends up with one extra row.

So data verification is essential. Run the verification (and repair) several times until the two sides match, always treating the old library's data as the source of truth.
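A simple verification-and-repair sketch under the same in-memory assumptions as the earlier snippets; because the old library wins every conflict, repeated passes also fix the delete race described above:

```python
# One verification pass: the old library is the source of truth.
def verify_and_repair(old_db: dict, new_db: dict) -> int:
    fixed = 0
    # drop rows the old library no longer has (e.g. the delete race)
    for uid in list(new_db):
        if uid not in old_db:
            del new_db[uid]
            fixed += 1
    # copy over anything missing or different
    for uid, row in old_db.items():
        if new_db.get(uid) != row:
            new_db[uid] = row
            fixed += 1
    return fixed  # rerun until this returns 0
```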

 

4. Switch the sharding configuration

After the data synchronization is complete, the sharding mapping can be switched to include the new library, again by splitting the old library's ranges:

uid%2=0 becomes uid%4=0 and uid%4=2

uid%2=1 becomes uid%4=1 and uid%4=3

 

Reference:

https://mp.weixin.qq.com/s/BLOneOs-cPxP_9b5eH8oQA

 

