Database access optimization funnel rule - 5. Use more resources

Database access optimization funnel rule
This optimization rule can be summarized into 5 levels:
1. Reduce the number of data accesses (reduce disk access)
2. Return less data (reduce network transfer or disk access)
3. Reduce the number of interactions (reduce network transfer)
4. Reduce server CPU overhead (reduce CPU and memory overhead)
5. Use more resources (increase resources)




5. Use more resources
1. Client multi-process parallel access
Multi-process parallel access means creating multiple processes (or threads) on the client, each of which establishes its own connection to the database and submits requests at the same time. When the database host has idle resources, concurrent access from multiple client processes can improve performance. If the database host is already busy, parallel access will not improve performance and may even make it slower. It is therefore best to consult the DBA or system administrator before deciding to use this method.
For example:
Suppose we have 10,000 product IDs and need to retrieve the product details for each ID. With single-threaded access, assuming each I/O takes 5 ms and ignoring host CPU time and network transmission time, the task takes 50 s. With 5 parallel processes, each handling 2,000 IDs, the task can finish in about 10 s.
Is more parallelism always better? Would 1,000 parallel processes finish in only 50 ms? Definitely not. Once the degree of parallelism exceeds the limit of the server host's resources, performance stops improving; adding more only increases the host's inter-process scheduling cost and the probability of contention between processes.
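The worked example above can be sketched in Python. This is a minimal illustration, not a definitive implementation: fetch_details is a hypothetical stand-in that fabricates rows instead of querying a real database, and estimated_seconds only reproduces the ideal-case arithmetic (5 ms per I/O, no CPU, network, or contention cost).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(ids, n_workers):
    """Split ids into n_workers contiguous chunks of near-equal size."""
    size, extra = divmod(len(ids), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        end = start + size + (1 if i < extra else 0)
        chunks.append(ids[start:end])
        start = end
    return chunks


def fetch_details(id_chunk):
    # Hypothetical stand-in: a real worker would open its own database
    # connection here and query the details for the IDs in its chunk.
    return [{"id": pid, "name": f"product-{pid}"} for pid in id_chunk]


def fetch_parallel(ids, n_workers):
    """Fetch details for all ids using n_workers concurrent workers."""
    chunks = split_into_chunks(ids, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(fetch_details, chunks)
    return [row for chunk_rows in results for row in chunk_rows]


def estimated_seconds(n_ids, io_ms, n_workers):
    """Ideal-case estimate that ignores CPU, network, and contention."""
    ids_per_worker = -(-n_ids // n_workers)  # ceiling division
    return ids_per_worker * io_ms / 1000


if __name__ == "__main__":
    ids = list(range(10_000))
    rows = fetch_parallel(ids, n_workers=5)
    print(len(rows), estimated_seconds(10_000, 5, 1), estimated_seconds(10_000, 5, 5))
```

The estimate keeps shrinking as n_workers grows, which is exactly why it cannot be trusted beyond the point where the server host runs out of resources.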

The following are some basic suggestions on how to set the degree of parallelism:
If the bottleneck is on the server host but the host still has idle resources, the maximum degree of parallelism is the smaller of two values: the number of CPU cores on the host and the number of disks from which the host serves data. This leaves the host with resources for other tasks.
If the bottleneck is client-side processing but the client still has idle resources, it is recommended not to increase SQL parallelism, but to use one process to retrieve the data and then start multiple processes on the client to process it. The number of processes should be based on the number of client CPU cores.
If the bottleneck is the client network, it is recommended to compress the data, or to add more clients and process the data with a MapReduce architecture.
If the bottleneck is the server network, increase the server's network bandwidth, or compress the data on the server before it is sent.
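The suggestions above can be condensed into a small helper. This is only a sketch: the function name, the bottleneck labels, and the choice to return (1, 1) for network bottlenecks (where the text recommends compression or more bandwidth rather than more parallelism) are assumptions made for illustration.

```python
def suggested_parallelism(bottleneck, server_cpu_cores=0,
                          server_data_disks=0, client_cpu_cores=0):
    """Return (database_connections, client_workers) for a given bottleneck."""
    if bottleneck == "server_host":
        # Smaller of CPU cores and data disks, with a floor of one.
        n = max(1, min(server_cpu_cores, server_data_disks))
        return n, n
    if bottleneck == "client_cpu":
        # One process retrieves the data; client processes scale with cores.
        return 1, max(1, client_cpu_cores)
    if bottleneck in ("client_network", "server_network"):
        # Assumption: extra parallelism does not help a saturated network.
        return 1, 1
    raise ValueError(f"unknown bottleneck: {bottleneck}")


if __name__ == "__main__":
    print(suggested_parallelism("server_host", server_cpu_cores=16, server_data_disks=4))
    print(suggested_parallelism("client_cpu", client_cpu_cores=8))
```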

 
2. Database parallel processing
Database parallel processing means that the client submits a single SQL request and the database automatically splits it across multiple processes that run in parallel.
Not all SQL can be processed in parallel; generally only operations that scan an entire table or index can. Tables do not have parallel access enabled by default, so you need to specify a parallel hint in the SQL, as shown below:
select /*+ parallel(a,4) */ * from employee a;

 
Advantages of parallelism:
Multiple processes make full use of the database host's resources (CPU, I/O) to improve performance.

Disadvantages of parallelism:
1. A single session consumes a large amount of resources and affects other sessions, so it is only suitable when the host load is low;
2. Only direct I/O can be used and cached data cannot, so dirty cached data must first be written to disk before execution.

 
Note:
1. Parallel processing should be used with caution in OLTP systems. Used improperly, a single session can occupy all host resources so that normal transactions do not get a timely response; it is therefore generally used only on data warehouse platforms.
2. For small tables with fewer than one million records, parallel access generally does not improve performance and may make it worse.
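As a rough illustration of these notes, the sketch below decides whether to attach a parallel hint when building a query. The helper names are hypothetical, and treating "use with caution in OLTP" as "never" is a deliberate simplification; the one-million-row threshold comes from note 2.

```python
SMALL_TABLE_ROWS = 1_000_000  # threshold taken from note 2


def parallel_hint(alias, degree, row_count, is_oltp, host_load_low):
    """Return a parallel hint string, or '' when the notes advise against it."""
    if is_oltp or not host_load_low or row_count < SMALL_TABLE_ROWS:
        return ""
    return f"/*+ parallel({alias},{degree}) */"


def build_query(table, alias, hint):
    """Build a full-table select, inserting the hint only when one is given."""
    select = f"select {hint} *" if hint else "select *"
    return f"{select} from {table} {alias}"


if __name__ == "__main__":
    hint = parallel_hint("a", 4, row_count=5_000_000, is_oltp=False, host_load_low=True)
    print(build_query("employee", "a", hint))
```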


 


