A must-read for database selection: how do you choose the database that best suits your business from a dazzling array of products?

Abstract: It is worth spending extra time to choose the database that best fits your own circumstances.

In the Internet + AI era, business scenarios are becoming more and more complex, and the variety of open source and commercial databases keeps growing, leaving many developers dazzled and unsure where to start.

There is a saying in the industry that any discussion of architecture divorced from the business is nonsense. Therefore, when selecting a database, users need to weigh their own business architecture, data volume, data types, and even the skills of their team members before deciding which database to choose.

So, how do we choose the database most suitable for our business according to the characteristics of each database?

When transactions matter, choose a relational database

Let us start with the most widely used type, the relational database. The most important feature of a relational database is the transaction, which keeps data consistent. Here we have to mention the ACID properties of transactions:

Atomicity: A transaction is an indivisible unit of work. The operations in the transaction either all take effect or none do; it is never the case that only one or a few of the steps are executed.

Consistency: After the transaction is executed, the state of the database still satisfies its business rules. In a transfer, for example, the sum of the balances of the two accounts involved should be unchanged, regardless of whether the transaction succeeds or fails.

Isolation: When multiple transactions run concurrently, they are isolated from one another, and one transaction should not affect the outcome of the others.

Durability: After the transaction is committed, its changes are persisted in the database and will not be corrupted or lost because of faults or downtime.
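The transfer example above can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the accounts table, the account names, and the transfer helper are assumptions made for the example, not part of any product mentioned in this article.

```python
import sqlite3

# Hypothetical schema and account names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


def total(conn):
    """Sum of all balances; a transfer must never change it."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]


print(transfer(conn, "alice", "bob", 30))   # a valid transfer
print(transfer(conn, "alice", "bob", 500))  # an overdraft attempt
print(total(conn))
```

In the failed transfer, the credit to the destination account has already been applied when the debit violates the constraint, so the rollback is what keeps the total unchanged.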

In addition, the general-purpose SQL language makes relational databases very convenient to operate and supports complex queries such as joins.

According to its characteristics, the applicable scenarios of relational databases can usually be divided into OLTP and OLAP.

Among them, online transaction processing (OLTP) stores and queries the live data of business applications to support daily business activities. It is an indispensable production system for every enterprise. The data volume of this type of application is generally small, and the emphasis is on real-time processing, fast response, and strong data consistency, as in bank deposits and withdrawals or shopping payments.

Online analytical processing (OLAP) stores historical data to support complex analytical operations, focusing on decision support. This type of application has a large data volume, high parallelism, low concurrency, and relatively relaxed availability requirements, such as customer service systems and sales systems.

Take MySQL as an example. It is currently one of the most popular open source relational databases and a preferred database for OLTP scenarios.

Huawei's GaussDB (for MySQL) and GaussDB (openGauss) also have OLTP processing as their core competence, along with a certain degree of OLAP processing capability.

However, it is generally not recommended to fully mix OLTP and OLAP services. In a typical OLAP scenario you should use a database designed for OLAP, and in a typical OLTP scenario a database designed for OLTP; otherwise the system can neither scale out for OLAP nor meet the real-time, high-performance requirements of OLTP.

It should be emphasized that adding TP capability on top of an OLAP database is particularly discouraged, because TP is real-time and online, while AP is analytical. OLAP databases often cannot meet TP performance requirements: even if they seem adequate in the initial stage after launch, problems become prominent as business volume grows.

In addition, other dimensions need to be evaluated, such as whether the database supports the JSON format and which storage modes it supports. There are many such details, so when choosing a database for a business scenario, you must be both cautious and flexible.

Finally, here are some recommended Huawei Cloud database products for specific business scenarios.

Website business: Website requests are read-heavy and write-light. You can use read-only instances of the cloud database for MySQL to scale read capacity horizontally, and combine them with the distributed database middleware DDM to achieve automatic read-write separation and read load balancing.

Mobile applications: Mobile applications that include positioning functions can use the cloud database for PostgreSQL to obtain location computing capabilities; mobile applications with huge data volumes can use the Huawei Cloud RDS MySQL database with DDM to handle sharding across databases and tables with ease.

Game business: To cope with explosive growth in player data storage and read-write requests, you can use Huawei Cloud RDS to quickly expand storage, change instance specifications, or deploy a database for a new game partition; for game data archiving or rollback, you can use the automatic backup and point-in-time recovery (PITR) features of Huawei Cloud RDS to restore to any point in time.

E-commerce business: For high-concurrency database requests such as e-commerce flash sales and rush purchases, you can use high-spec Huawei Cloud RDS instances; deploying across availability zones (AZs) provides higher availability.

Financial business: To meet financial-grade business continuity and data reliability requirements, you can use Huawei Cloud RDS with dual-machine hot backup and cross-AZ deployment, or Huawei's self-developed distributed database GaussDB, to ensure high service availability, multi-replica data storage, and strong consistency. To meet financial-grade security compliance requirements, you can add the database security service DBSS, which monitors and intercepts SQL injection in real time, masks sensitive data to prevent leaks, and audits database logs.
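To make the read-write separation mentioned in the website scenario more concrete, here is a toy sketch of the routing idea: writes go to the primary instance and reads rotate across read-only instances. The ReadWriteRouter class and the instance names are hypothetical, and this is not how DDM is implemented; real middleware also has to deal with transactions, replication lag, and failover.

```python
import itertools


class ReadWriteRouter:
    """Toy router: writes go to the primary, reads rotate across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin iterator over the read-only instances.
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        """Return the name of the instance that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


# Hypothetical instance names, for illustration only.
router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
```

Adding more read-only instances to the replica list is what scales read capacity horizontally in this sketch.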

For specific scenarios with highly concurrent reads and writes, choose a non-relational database

Relational databases are good, but they fall short in many respects. For example, under highly concurrent reads and writes, multi-table join queries, and complex data analysis, the performance sacrificed to guarantee ACID properties becomes a prominent problem.

This is where non-relational databases come in.

NoSQL databases no longer put data consistency first. Instead, they target specific scenarios with high performance and ease of use. The non-relational databases in wide use today include the following types: document databases, key-value databases, column store databases, time series databases, and graph databases.

Document database:

A document database stores data in the form of documents. Each document is a self-contained data unit made up of a series of data items, and each data item has a name and a corresponding value. The value can be a simple data type, such as a string, number, or date, or a complex type, such as an ordered list or an associated object. The smallest unit of data storage is the document. Documents stored in the same table can have different attributes, and the data can be stored in multiple formats such as XML, JSON, or JSONB.
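As a rough illustration of this model, the sketch below represents documents as plain Python dictionaries. The logs collection, its field names, and the find helper are made up for the example and do not correspond to the API of any particular document database.

```python
import json

# A hypothetical "logs" collection: each document is self-contained, and
# documents in the same collection need not share the same fields.
logs = [
    {"app": "web", "level": "error", "message": "timeout",
     "tags": ["http", "retry"]},                      # value is an ordered list
    {"app": "billing", "level": "info",
     "order": {"id": 42, "amount": 9.5}},             # value is a nested object
]


def find(collection, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]


errors = find(logs, level="error")

# Documents map naturally onto JSON for storage or transport.
serialized = json.dumps(logs[1])
```

Note that the two documents have different attributes, yet both live in the same collection and can be queried with the same helper.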

Representative products: MongoDB, CouchDB, RavenDB

Applicable scenarios:

1. Logs: In an enterprise environment, each application produces different log information. A document database has no fixed schema, so it can store all of these different kinds of information.

2. Analytics: Because of its weak schema, different metrics can be stored, and new metrics can be added without changing the schema.

Remarks: Take MongoDB as an example. Its usage scenarios largely overlap with those of relational databases, but it is better suited to data that needs no joins, has no strong consistency requirements, and whose table schema changes frequently.

Key-value database:

Key-value databases are like the hash tables found in traditional programming languages. You can add, query, or delete data by key. Because access always goes through the primary key, they achieve good performance and scalability. Their advantages are simplicity, easy deployment, and high concurrency.
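The hash table analogy can be made concrete with a minimal in-memory sketch. The KeyValueStore class and the session key below are hypothetical; real key-value databases add persistence, expiry, and networking on top of this basic interface.

```python
class KeyValueStore:
    """Minimal in-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Add or overwrite the value stored under key."""
        self._data[key] = value

    def get(self, key, default=None):
        """Look up a value by key; return default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove the key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
# Hypothetical session record keyed by an ID.
store.put("session:1001", {"user": "alice", "cart": ["sku-1", "sku-2"]})
```

Every operation goes through the key, which is why this access pattern is simple to distribute and fast to serve.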

Mainstream representative products: Riak, Redis, Memcached

Applicable scenarios: Storing user information such as sessions, configuration files, parameters, and shopping carts. This also covers character information, experience and item information, and friend rankings in games, as well as massive product display information and shopping reviews in e-commerce. This information is generally tied to an ID (the key), so a key-value database is a good choice.

Column store database:

The column store database stores data in column families, and a column family stores related data that are frequently queried together. For example, if there is a Person class, we usually query their name and age together instead of salary. In this case, the name and age will be put into one column family, and the salary will be in another column family.
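The Person example can be sketched as follows. The ColumnFamilyTable class and the family names profile and payroll are assumptions for illustration; they only mimic the idea that each column family is kept in its own structure, so reading one family never touches the other.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy table that keeps each column family in a separate structure."""

    def __init__(self, families):
        # family name -> row key -> {column: value}
        self._families = {name: defaultdict(dict) for name in families}

    def put(self, row_key, family, column, value):
        """Write one cell into the given column family."""
        self._families[family][row_key][column] = value

    def read_family(self, family):
        """Read every row of one column family, leaving the others untouched."""
        return {row: dict(cols) for row, cols in self._families[family].items()}


# Hypothetical families: name and age are queried together, salary separately.
people = ColumnFamilyTable(["profile", "payroll"])
people.put("p1", "profile", "name", "Ann")
people.put("p1", "profile", "age", 30)
people.put("p1", "payroll", "salary", 5000)

profiles = people.read_family("profile")
```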

Representative products: Cassandra, HBase

Applicable scenarios:

1. Log: Data can be stored in different columns, and each application can write information into its own column family.

2. Blog platform: Store each kind of information in its own column family. For example, tags can be stored in one column family, and categories and articles in others.

Time series database:

A time series database stores time series data. It needs to support basic functions such as fast writes, persistence, and multi-dimensional aggregation queries over time series data. In contrast to traditional databases, which typically record only the current value of the data, time series databases record all historical data. Queries almost always include time as a filter condition. The data exists as a stream over time, and each record includes a timestamp. This lets large volumes of time series data be stored efficiently and analyzed in real time.
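The typical query shape described here, a time filter combined with an aggregation, can be sketched in plain Python. The sensor names, the sample values, and the average helper are invented for the example; a real time series database would index and compress the data rather than scan a list.

```python
from datetime import datetime, timedelta
from statistics import mean

start = datetime(2020, 1, 1, 0, 0, 0)

# Hypothetical sensor readings: (timestamp, series name, value).
points = [
    (start, "sensor-a", 20.0),
    (start + timedelta(minutes=1), "sensor-a", 22.0),
    (start + timedelta(minutes=1), "sensor-b", 30.0),
    (start + timedelta(minutes=10), "sensor-a", 40.0),
]


def average(points, series, begin, end):
    """Average of one series over the half-open time window [begin, end)."""
    values = [value for ts, name, value in points
              if name == series and begin <= ts < end]
    return mean(values) if values else None


# Average of sensor-a over the first five minutes.
window_avg = average(points, "sensor-a", start, start + timedelta(minutes=5))
```

All historical points are kept, and the time window decides which of them take part in the aggregation.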

Representative products: Prometheus, InfluxDB and OpenTSDB

Applicable scenarios: IoT sensor time series data analysis; securities and cryptocurrency transaction data; real-time monitoring of software and hardware equipment; urban environmental protection data collection, etc.

Graph database:

The data is stored in the form of graphs. Entities are treated as vertices, and the relationships between entities are treated as edges. For example, if we have three entities, Steve Jobs, Apple and NeXT, there will be two "Founded by" edges connecting Apple and NeXT to Steve Jobs.
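The example above can be written down as a small edge list. This is only a sketch: the founded_by label and the two helper functions are hypothetical names, and graph databases provide their own query languages for this kind of traversal.

```python
# Each edge is (source vertex, relationship, target vertex).
edges = [
    ("Apple", "founded_by", "Steve Jobs"),
    ("NeXT", "founded_by", "Steve Jobs"),
]


def targets(edges, source, relation):
    """Vertices reached from source by following edges with this relation."""
    return sorted(dst for src, rel, dst in edges
                  if src == source and rel == relation)


def sources(edges, relation, target):
    """Vertices that point at target through edges with this relation."""
    return sorted(src for src, rel, dst in edges
                  if rel == relation and dst == target)


founders_of_apple = targets(edges, "Apple", "founded_by")
founded_by_jobs = sources(edges, "founded_by", "Steve Jobs")
```

Following edges in either direction is the basic operation behind the social network and recommendation use cases.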

Representative products: Neo4j, InfiniteGraph, OrientDB

Applicable scenarios:

1. Building relationship graphs from highly interconnected data, such as social networks.

2. Recommendation engines. Representing the data as a graph makes it much easier to generate recommendations.

In the field of non-relational database services, Huawei Cloud has also launched GaussDB (for Mongo), GaussDB (for Redis), GaussDB (for Influx), and GaussDB (for Cassandra). At present, GaussDB (for Mongo), GaussDB (for Cassandra), and GaussDB (for Redis) have been officially commercialized. Developers can choose a database that matches their business according to different data models and processing logic.

In conclusion:

As a simple example, if your business has a lot of structured data and many transactional operations, then the first choice is a database like MySQL.

If your business often has sudden traffic peaks, give priority to non-relational databases such as MongoDB, which have better scalability.

The choice of database determines the sustainable development of the business, so we must take the time to choose the most suitable database according to the actual situation of the business. (Note: Part of the conceptual content in the article comes from the Internet.)

 


Origin blog.csdn.net/devcloud/article/details/108881350