Background

In a Kubernetes cluster, the Horizontal Pod Autoscaler (HPA) automatically scales the number of workload replicas up or down within a configured replica range, based on container resource usage. HPA scales against metric thresholds. Common metrics are CPU and memory, and custom metrics such as QPS or connection count can also be used. Metric-based scaling has a drawback, however: it lags. The lag consists mainly of collection delay, decision delay, and scaling delay, each on the order of minutes. With minute-level lag, HPA cannot keep up with a rapid short-term surge in business traffic, so application CPU usage may spike and response times may grow.

Scheduled horizontal autoscaling (CronHPA) complements HPA. For services whose peaks fall in fixed time windows, it scales out the container instances ahead of time so that a sudden traffic surge does not cause insufficient performance and service delays. When traffic is in a trough, it triggers scheduled scale-in to reclaim resources.

Some business scenarios have bursty traffic with pronounced peaks and troughs. If CronHPA and HPA policies are both configured, the following can happen: before the peak arrives, the CronHPA scheduled task scales out the service's container replicas in advance, but HPA then sees that resource usage is still very low and scales the instances back in, which defeats the pre-scaling policy.

Huawei Cloud CCE supports configuring CronHPA and HPA policies jointly: CronHPA dynamically sets the upper and lower bounds of the HPA replica range, and the number of service container instances is adjusted within that range.

Usage example

Many real-world business scenarios have bursty traffic with pronounced peaks and troughs and are highly sensitive to response latency, for example:

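1. Online games: Customer X operates a large online game. During peak periods such as evenings, weekends, and holidays, the number of players rises sharply and the load on the game servers spikes almost instantly. If the workload replicas scale out too slowly, the network may lag and the gaming experience degrades noticeably;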

2. Live video: For live-streaming app X, when a large event or competition goes live, the audience grows rapidly, so server load rises sharply and network latency increases, which degrades the viewing experience;

3. E-commerce promotions: Promotional campaigns on e-commerce platform X typically draw heavy user interest, so visits rise significantly and server load increases sharply. If the service containers are not scaled out in time, the user experience is likely to degrade, and in severe cases the service may be interrupted;

4. Financial trading: Financial trading platform X handles a variety of financial products, all of which require real-time responses. Network latency strongly affects transaction efficiency and accuracy. During peak periods, transaction volume rises dramatically and network latency rises with it.

All of these scenarios need efficient, stable network support and are very sensitive to network latency. If the service containers are not scaled out in time, network latency becomes too high, the user experience degrades, and normal business operation may even be affected.

The following uses a live video service as an example to show how to configure elastic scaling. Suppose a popular live stream runs from 20:30 to 22:00 every evening. During this window user visits rise sharply, after which traffic slowly declines to a trough. To save costs, you can use the following configuration to scale out the service container instances before the traffic peak arrives and, once the peak subsides, scale them in gradually according to traffic:

1. In the CCE console, click the cluster name to enter the cluster.

2. Click "Workload" in the left navigation bar, and click "More > Auto Scaling" in the operation column of the target workload.

3. Select "HPA+CronHPA policy" as the policy type and enable both the HPA policy and the CronHPA policy. CronHPA will then adjust the maximum and minimum number of instances of the HPA policy on a schedule.

4. Set the HPA policy

Set the instance range and system policy, as shown below. HPA dynamically adjusts the number of container instances within the range 1-10 based on the CPU utilization of the service containers. When CPU utilization is above 80%, it scales out automatically; when CPU utilization is below 60%, it automatically scales in the service container instances.

5. Set the CronHPA policy

Set up the scheduled tasks, as shown in the figure below:

Strategy 1: Adjust the HPA policy instance range from 1-10 to 8-10 at 20:00;

Strategy 2: Adjust the HPA policy instance range from 8-10 to 1-10 at 22:30.

6. Repeat the above steps to add more policy rules. No two rules may have the same trigger time.

7. After completing the settings, click "Create".

Once this configuration is in place, CronHPA changes the HPA instance range from 1-10 to 8-10 at 20:00, before the traffic peak. HPA then scales the service out to at least 8 instances in preparation for the upcoming peak. At 22:30, after the peak has passed, CronHPA changes the HPA instance range from 8-10 back to 1-10. HPA then scales the service container instances in to a suitable number based on actual traffic, which reduces the user's costs.
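The schedule above can be sketched in a few lines of Python. This is only an illustration of the two rules from the example, not CCE's implementation: the tuple layout and the function names `validate_rules` and `hpa_range_at` are hypothetical, and the rules are assumed to repeat every day.

```python
from datetime import time

# Hypothetical representation of the two CronHPA rules from the example:
# (trigger time, minimum instances, maximum instances).
RULES = [
    (time(20, 0), 8, 10),   # Strategy 1: raise the lower bound before the peak
    (time(22, 30), 1, 10),  # Strategy 2: restore the range after the peak
]


def validate_rules(rules):
    """Reject rule sets in which two rules share a trigger time."""
    triggers = [trigger for trigger, _, _ in rules]
    if len(triggers) != len(set(triggers)):
        raise ValueError("two CronHPA rules share the same trigger time")


def hpa_range_at(now, rules):
    """Return the (min, max) HPA range in force at `now` for daily rules."""
    validate_rules(rules)
    ordered = sorted(rules)
    # Before the first trigger of the day, the last rule of the
    # previous day is still in force.
    active = ordered[-1]
    for rule in ordered:
        if rule[0] <= now:
            active = rule
    return active[1], active[2]


for t in (time(19, 59), time(20, 0), time(21, 15), time(22, 30)):
    print(t.strftime("%H:%M"), hpa_range_at(t, RULES))
# 19:59 (1, 10)
# 20:00 (8, 10)
# 21:15 (8, 10)
# 22:30 (1, 10)
```

The uniqueness check mirrors the constraint from step 6 that no two rules may fire at the same time; without it, the range in force at that moment would be ambiguous.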

CronHPA and HPA linkage analysis

HPA is a controller used to control the horizontal scaling of Pods. HPA periodically checks the resource usage data of Pods, calculates the number of replicas required to meet the target value configured for HPA resources, and then adjusts the replicas field of the target resource (such as Deployment).
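The replica calculation can be sketched as follows. The formula is the one given in the upstream Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with a default tolerance of 0.1 around the target. The function name and parameters are illustrative, and the real controller handles further cases (missing metrics, pods that are not ready, stabilization windows) that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA replica calculation for a single utilization metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured replica range.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(4, 90, 60, 1, 10))  # 6: usage is 1.5x the target
print(desired_replicas(8, 20, 60, 1, 10))  # 3: low usage scales in
print(desired_replicas(8, 20, 60, 8, 10))  # 8: the lower bound holds
```

The last two calls differ only in the lower bound of the range, which is exactly the lever that CronHPA pulls in the linkage described here.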

CronHPA supports adjusting the maximum and minimum number of instances of an HPA policy on a schedule, which links it with HPA to handle workload scaling in complex scenarios.

If HPA and CronHPA each act independently on the same Deployment, they conflict. The two scaling policies know nothing about each other, so whichever operation runs later overwrites the earlier one, and the scaling result does not match expectations. This situation must therefore be avoided.
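A toy simulation makes the overwrite concrete. It assumes two independent controllers that both write the replica count of the same workload. All names and numbers here (15% CPU, a 60% target, a starting count of 2) are hypothetical, and the HPA step uses the simplified upstream formula without tolerance.

```python
import math

# Hypothetical stand-in for the workload that both controllers write to.
deployment = {"replicas": 2}


def cronhpa_set_replicas(deployment, replicas):
    """Standalone CronHPA: writes the replica count directly."""
    deployment["replicas"] = replicas


def hpa_reconcile(deployment, cpu_utilization, target, min_replicas, max_replicas):
    """Standalone HPA: recomputes the replica count from current usage."""
    wanted = math.ceil(deployment["replicas"] * cpu_utilization / target)
    deployment["replicas"] = max(min_replicas, min(max_replicas, wanted))


cronhpa_set_replicas(deployment, 8)        # 20:00: scale out ahead of the peak
hpa_reconcile(deployment, 15, 60, 1, 10)   # traffic has not arrived, CPU is low
print(deployment["replicas"])  # 2: the pre-scaling has been undone
```

Because the HPA pass runs after the scheduled write and its range still starts at 1, the eight pre-scaled instances are scaled back in before the peak arrives.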

To avoid this problem, we have enhanced CronHPA so that CronHPA rules can be applied to an HPA policy. CronHPA adjusts only the HPA policy's configuration, and only HPA changes the number of service container instances, so the two elastic policies work together.
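The linked behavior can be sketched in the same toy style. Here CronHPA writes only the HPA range, and HPA remains the only writer of the replica count. The dictionaries, function names, CPU figures, and the 60% target are assumptions for illustration rather than CCE's actual data model.

```python
import math

# Hypothetical HPA policy and workload state.
hpa = {"min": 1, "max": 10, "target_cpu": 60}
deployment = {"replicas": 2}


def cronhpa_set_range(hpa, min_replicas, max_replicas):
    """Linked CronHPA: touches only the HPA range, never the replica count."""
    hpa["min"], hpa["max"] = min_replicas, max_replicas


def hpa_reconcile(hpa, deployment, cpu_utilization):
    """HPA: the only writer of the replica count, clamped to its range."""
    wanted = math.ceil(deployment["replicas"] * cpu_utilization / hpa["target_cpu"])
    deployment["replicas"] = max(hpa["min"], min(hpa["max"], wanted))


# (label, new range set by CronHPA or None, observed CPU utilization in %)
timeline = [
    ("20:00", (8, 10), 15),  # before the peak: CPU is still low
    ("20:30", None, 90),     # peak traffic
    ("22:30", (1, 10), 20),  # after the peak
]

history = []
for label, new_range, cpu in timeline:
    if new_range is not None:
        cronhpa_set_range(hpa, *new_range)
    hpa_reconcile(hpa, deployment, cpu)
    history.append(deployment["replicas"])
    print(label, deployment["replicas"])
# 20:00 8
# 20:30 10
# 22:30 4
```

At 20:00 the low CPU reading would on its own suggest a single instance, but the raised lower bound keeps eight running, so the pre-scaling survives until traffic arrives.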

Summary

The HPA policy provided by the Kubernetes community automatically scales out and in within the configured instance range, based on the service containers' usage of CPU, memory, and other resources. When the scheduled scaling policy CronHPA is layered on top, the intent is for the CronHPA scheduled task to scale out the service container replicas before the business peak arrives. At that moment, however, HPA may detect that resource usage is very low and scale the instances in, which invalidates the pre-scaling policy. Huawei Cloud CCE combines HPA and CronHPA so that metric-based and schedule-based elastic policies work in concert, which meets the complex elastic scaling scenarios of customer businesses.

Reference documentation:

https://kubernetes.io/zh-cn/docs/tasks/run-application/horizontal-pod-autoscale/

https://support.huaweicloud.com/usermanual-cce/cce_10_0415.html

