High-concurrency Java web projects are usually consumer-facing (ToC) scenarios, such as flash sales and ticket grabbing.
From development to launch, hot data needs attention throughout the project cycle, and accurately predicting and monitoring hot data helps in building a better system.
This article starts from this one small point and analyzes architecture design for high concurrency.
1. What is a hot spot
Hotspots are divided into hotspot operations and hotspot data.
1.1 Hotspot operation
So-called "hotspot operations" include the flood of page refreshes, shopping cart additions, and orders placed at midnight on Double Eleven. For the system, these operations can be abstracted into "read requests" and "write requests", and the two kinds of hotspot request are handled quite differently. Read requests leave more room for optimization, whereas the bottleneck for write requests is generally at the storage layer, where optimization means making trade-offs according to the CAP theorem.
1.2 Hot data
"Hot data" is easier to understand, that is, the data corresponding to the user's hot request. The hotspot data is divided into "static hotspot data" and "dynamic hotspot data".
Static hot data
So-called "static hotspot data" is hotspot data that can be predicted in advance. For example, sellers can register their promoted products ahead of time, and the registration system marks these products as hot. Hot products can also be discovered in advance through big data analysis: by analyzing historical transaction records and users' shopping cart records, we can find out which products are likely to be popular and sell well. These are all hotspots that can be identified in advance.
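The historical-analysis idea above can be sketched in a few lines. This is only a minimal sketch, assuming order and cart records are available as simple (user, item) pairs; the function name `predict_hot_items`, the weights, and the sample data are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def predict_hot_items(order_records, cart_records, top_n=3,
                      order_weight=2, cart_weight=1):
    """Rank items by a weighted count of past orders and cart additions.

    The weights are illustrative assumptions: a completed order is
    treated as a stronger signal than a cart addition.
    """
    scores = Counter()
    for _user, item_id in order_records:
        scores[item_id] += order_weight
    for _user, item_id in cart_records:
        scores[item_id] += cart_weight
    # most_common() sorts by score, highest first
    return [item_id for item_id, _score in scores.most_common(top_n)]


# Hypothetical sample data: (user_id, item_id) pairs
orders = [("u1", "phone"), ("u2", "phone"), ("u3", "laptop")]
carts = [("u1", "laptop"), ("u4", "phone"), ("u5", "headset")]
hot_items = predict_hot_items(orders, carts, top_n=2)
print(hot_items)
```

In a real system the records would come from a data warehouse rather than in-memory lists, and the resulting item list could be used to mark products as hot before the event starts.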
Dynamic hotspot data
So-called "dynamic hotspot data" refers to hotspots that cannot be predicted in advance and arise temporarily while the system is running. For example, a seller advertises on Douyin and the product immediately becomes popular, so it is bought in large quantities within a short time.
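Because dynamic hotspots only appear at runtime, they have to be detected from live traffic. One possible approach is a sliding-window counter per key. The sketch below is an assumption-laden illustration, not a production design: the class name `SlidingWindowHotspotDetector`, the window size, and the threshold are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowHotspotDetector:
    """Flag a key as hot when it is hit more than `threshold` times
    within the last `window_seconds` seconds."""

    def __init__(self, window_seconds=1.0, threshold=100):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._hits = defaultdict(deque)  # key -> timestamps of recent hits

    def record(self, key, now=None):
        """Record one access; return True if the key is currently hot."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        hits.append(now)
        # Drop timestamps that have fallen out of the window
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.threshold


# Hypothetical usage with explicit timestamps so the result is repeatable
detector = SlidingWindowHotspotDetector(window_seconds=1.0, threshold=3)
results = [detector.record("item:42", now=t) for t in (0.0, 0.1, 0.2, 0.3, 5.0)]
print(results)
```

Storing every timestamp is simple but memory-hungry; under real flash-sale traffic an approximate counter or sampling would likely be a better fit.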
2. Discover hot spots
The architecture is usually layered and heterogeneous. We can use Linux, JVM, or MySQL monitoring tools at each layer to accurately locate hotspot operations.
2.1 nginx log analysis
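The nginx access log contains the request path. You can find the hottest requests and the busiest clients with the commands below. Field numbers depend on your log_format: in nginx's default combined format the request path is field $7, while $11 is the referer, so adjust the field to match your own logs.
# Top 20 values of field 11
awk '{print $11}' log_file | sort | uniq -c | sort -nr | head -20
# Top 20 client IPs (field 1)
awk '{print $1}' log_file | sort | uniq -c | sort -nr | head -20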
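The same counting can be done in a script when the log format makes field numbers awkward. The following is a minimal sketch, assuming log lines contain a quoted request such as "GET /path HTTP/1.1"; the function name `top_paths` and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches the quoted request part of a log line, e.g. "GET /item/1?x=1 HTTP/1.1"
REQUEST_RE = re.compile(
    r'"(?:GET|POST|PUT|DELETE|HEAD|PATCH|OPTIONS) (\S+) [^"]*"'
)


def top_paths(lines, top_n=10):
    """Count request paths (query string stripped) and return the top N."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts.most_common(top_n)


# Hypothetical sample lines in a combined-style format
sample_log = [
    '10.0.0.1 - - [11/Nov/2023:00:00:01 +0800] "GET /item/1001?from=home HTTP/1.1" 200 512 "-" "curl/8.0"',
    '10.0.0.2 - - [11/Nov/2023:00:00:01 +0800] "GET /item/1001 HTTP/1.1" 200 512 "-" "curl/8.0"',
    '10.0.0.1 - - [11/Nov/2023:00:00:02 +0800] "POST /cart/add HTTP/1.1" 200 64 "-" "curl/8.0"',
]
hot_paths = top_paths(sample_log, top_n=2)
print(hot_paths)
```

Stripping the query string groups requests for the same resource together, which is usually what matters when looking for hot paths.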
If you use Alibaba Cloud servers, the access entry point comes with a hotspot analysis function.
Alternatively, you can use a tool to build your own log analysis service.
GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
It provides fast and valuable HTTP statistics for system administrators that require a visual server report on the fly.
2.2 Business log analysis
Business logs can be sent to the Alibaba Cloud log analysis system, where business hotspots can be searched and analyzed by business keyword.
2.3 Redis hotspot data analysis
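Redis provides the MONITOR command, which streams every command the server processes in real time. Because MONITOR adds overhead to the server, capture only a limited number of lines at a time.
redis-faina is an open source tool that analyzes MONITOR output. It can be used in two ways: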
1. You can read N commands from stdin through a pipe
redis-cli -p port MONITOR | head -n <NUMBER OF LINES TO ANALYZE> | ./redis-faina.py [options]
2. You can also read N commands from a file
redis-cli -p port MONITOR | head -n <NUMBER OF LINES TO ANALYZE> > /tmp/outfile.txt
./redis-faina.py /tmp/outfile.txt
# redis-cli -p 6381 MONITOR | head -n 10000 | ./redis-faina.py
Overall Stats
========================================
Lines Processed 10000
Commands/Sec 274.73
Top Prefixes # statistics grouped by key prefix
========================================
testcache-rendsord-lang 1684(16.84%)
testcache-inuanGoods-id 1090(10.90%)
testcache-riceroup-cat_id 307 (3.07%)
testcache-ategorynfo-id 190 (1.90%)
testcache-ategoryey-lang 189 (1.89%)
testcache-earchtrremplate-id 61 (0.61%)
testcache-riceroup-id 15 (0.15%)
testcache-otordata-lang 9 (0.09%)
Top Keys # most frequently requested keys
========================================
testcache-acebookhareandsave 2373(23.73%)
testcache-hippingFee 2198(21.98%)
testcache-rendsord-lang:en 1464(14.64%)
testcache-ountryurrency 1181(11.81%)
testcache-inuanoods 442 (4.42%)
testcache-ategoryey-lang: 183 (1.83%)
testcache-rendsord-lang:es 124 (1.24%)
testcache-inuanoods-id:68 114 (1.14%)
Top Commands # most frequently executed commands
========================================
GET 9957(99.57%)
AUTH 13 (0.13%)
COMMAND 13 (0.13%)
SADD 10 (0.10%)
info 5 (0.05%)
SET 1 (0.01%)
Command Time (microsecs) # command execution time
========================================
Median 2309.0
75% 4959.75
90% 8447.0
99% 18482.0
Heaviest Commands (microsecs) # commands with the highest total time
========================================
GET 36281717.75
COMMAND 85269.25
SADD 17985.75
info 10698.5
SET 3228.0
AUTH 625.5
Slowest Calls # slowest individual calls
========================================
179962.0 "GET" "testcache-hippingee"
62659.0 "GET" "testcache-romotionullateeduce"
44902.0 "GET" "testcache-hippingee"
40305.25 "GET" "testcache-hippingee"
39559.0 "GET" "testcache-hippingee"
36831.25 "GET" "testcache-hippingee"
33852.0 "GET" "testcache-hippingee"
33501.0 "GET" "testcache-hippingee"
Reference: https://blog.csdn.net/fuqianming/article/details/99682764
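To show what such an analysis does under the hood, here is a minimal sketch that counts commands and keys from MONITOR output. It is not redis-faina's implementation; the function name `summarize_monitor` and the sample lines are hypothetical, and it assumes the first argument of each command is the key, which holds for GET and SET but not for every command.

```python
import re
from collections import Counter

# A MONITOR line looks like:
# 1339518083.107412 [0 127.0.0.1:60866] "get" "some-key"
# Simplification: escaped quotes inside arguments are not handled.
MONITOR_RE = re.compile(r'^\d+\.\d+ \[\d+ \S+\] "([^"]+)"(?: "([^"]*)")?')


def summarize_monitor(lines, top_n=3):
    """Return (top commands, top keys) as lists of (name, count) pairs."""
    commands = Counter()
    keys = Counter()
    for line in lines:
        match = MONITOR_RE.match(line)
        if not match:
            continue  # skip the initial "OK" and anything unparseable
        command, first_arg = match.group(1).upper(), match.group(2)
        commands[command] += 1
        if first_arg is not None:
            # Assumption: the first argument is the key
            keys[first_arg] += 1
    return commands.most_common(top_n), keys.most_common(top_n)


# Hypothetical captured lines
sample = [
    "OK",
    '1700000000.000001 [0 127.0.0.1:50000] "GET" "shipping-fee"',
    '1700000000.000002 [0 127.0.0.1:50000] "get" "shipping-fee"',
    '1700000000.000003 [0 127.0.0.1:50001] "SET" "stock:1" "9"',
    '1700000000.000004 [0 127.0.0.1:50001] "PING"',
]
top_commands, top_keys = summarize_monitor(sample)
print(top_commands, top_keys)
```

A key that dominates the key counter, as in the "Top Keys" section of the report above, is a candidate for local caching or for splitting across multiple keys.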
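2.4 MySQL hot data analysis
Alibaba Cloud's yunsql comes with a SQL analysis function.
MySQL itself does not offer an equivalent function out of the box, so other approaches are needed:
1. Parse the binlog: write a parser by hand to analyze the executed statements. Note that the binlog only records operations that change data.
2. Put a MySQL proxy in front of the database and collect statement statistics there.
3. Use a connection pool with built-in statistics: Alibaba's open source database connection pool Druid has a SQL statement statistics function.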
https://github.com/alibaba/druid
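Whichever source the statements come from (a binlog parser, a proxy, or a connection pool), hot SQL analysis comes down to grouping statements that differ only in their literal values. The following is a minimal sketch, assuming the statements have already been extracted as strings; the helper `normalize_sql` is hypothetical and uses simplified regular expressions rather than a real SQL parser, so it is not how Druid or any other tool actually does it.

```python
import re
from collections import Counter


def normalize_sql(sql):
    """Collapse a statement into a template by replacing literals with '?'."""
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)    # numeric literals
    sql = re.sub(r"\s+", " ", sql).strip()          # collapse whitespace
    return sql.lower()


# Hypothetical captured statements
statements = [
    "SELECT * FROM item WHERE id = 68",
    "select * from item where id = 1001",
    "UPDATE stock SET count = count - 1 WHERE item_id = 68",
    "SELECT * FROM item  WHERE id = 7",
]
hot_statements = Counter(normalize_sql(s) for s in statements).most_common(1)
print(hot_statements)
```

Counting by template shows which query shapes are hot; counting the literal values per template would then show which rows are hot.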