ACK One x OpenKruiseGame: best practices for consistent multi-region delivery of global game servers

Author: Liu Qiuyang, Cai Jing

Preface

In today's globally integrated economy, the digital entertainment industry has become an increasingly important channel for cultural and commercial exchange. Many game companies have taken their games overseas with remarkable results, and many titles have attracted players around the world with a global-server architecture. Deploying a game globally expands the market for an individual product and increases the company's international influence, but it also brings technical challenges.

Game services depend on high-frequency interaction and low latency, so under a global-server architecture the game servers must be deployed in multiple regions. In practice, we usually plan server locations in advance based on where the target players are and how much latency they can tolerate. The following regions are typical first choices: the eastern United States is densely populated and can serve a large number of North American players; Frankfurt is a hub of the European Internet and can serve players throughout Europe; Singapore broadly covers players in Southeast Asia; and Tokyo mainly serves players in Japan and South Korea.

Configuration differences, game version updates, and differing server counts across regions make consistent delivery of game servers on a global scale a core challenge in designing a global-server architecture. This article uses an example to explain best practices for consistent multi-region delivery of global game servers.

Deployment architecture

In this example, we plan to open servers in Shanghai, Tokyo, and Frankfurt, so we need infrastructure resources in these three regions. When the infrastructure is heterogeneous and complex, the declarative API and consistent delivery model of cloud native technology hide the differences in the underlying resources, so game operations engineers can focus on the application itself and deliver game servers far more efficiently. Considering both the stability that regional autonomy provides and the complexity of scheduling users, we believe the best way to deliver game servers consistently is to deploy an independent Kubernetes cluster in each server region and manage all of them together through multi-cluster capabilities.

In this practice, we chose ACK One, Alibaba Cloud's distributed cloud container platform, to manage the Kubernetes clusters in multiple regions. As Alibaba Cloud's enterprise-class cloud native platform for hybrid cloud, multi-cluster, disaster recovery, and similar scenarios, ACK One can connect to and manage Kubernetes clusters in any region and on any infrastructure, and it provides consistent management so that applications, traffic, security, storage, observability, and more are under unified control.

The deployment architecture of this example is shown in the figure: three production environments in different regions and one development and test environment. A version is verified as stable in the development and test environment before it is deployed to production, which helps keep the project stable and reduces risk.

The example uses a multi-cluster hybrid cloud architecture. The Shanghai cluster, the Frankfurt cluster, and the Shanghai Dev cluster are Alibaba Cloud ACK clusters, while the Japan cluster is a non-Alibaba Cloud Kubernetes cluster that is brought under management as a registered cluster. Within each cluster, we use GameServerSet to deploy game servers. GameServerSet is a game-specific workload provided by OpenKruiseGame (OKG), a subproject of OpenKruise, the open source project incubated by the Cloud Native Computing Foundation (CNCF). Compared with the native Deployment and StatefulSet workloads, GameServerSet carries game semantics and fits game scenarios more closely, which makes operating and maintaining game servers more convenient and efficient.

Cluster management

After the Kubernetes clusters are ready, we use an ACK One fleet to manage the clusters on and off the cloud in one place:

First, register the IDC or third-party public cloud cluster with Alibaba Cloud as an ACK One registered cluster [1]:

  1. Create a registered cluster [2], then click Details in the operation column to the right of the created cluster.

  2. On the cluster information page, click the Connection Information tab.

  3. In the cluster import agent configuration area, select public network or private network as needed, click Copy on the right to copy the contents of the corresponding tab into a file, and run kubectl in the target cluster to register it with the newly created registered cluster. For example, create a file named agent.yaml, paste the copied content into it, and run kubectl apply -f agent.yaml in the target cluster.

Through this step, the Japan cluster has been registered with Alibaba Cloud.

Second, enable the ACK One multi-cluster fleet [3] and associate the registered cluster and the cloud clusters with the fleet in the ACK One console [4]. Because the clusters span multiple regions, the fleet connects to them over the public network, so the VPC where the fleet resides must be configured with a public NAT gateway.

At this point, a multi-cluster fleet is ready. The schematic diagram corresponding to the example is as follows:

Game server release

Before carrying out the release itself, let's look at the cloud native idea of delivery: it is declarative rather than procedural. Cloud native application delivery focuses not on the deployment process but on the definition of the application. That definition is YAML, which describes what the application should look like through configuration parameters. Every change or release of the application is therefore a change to its YAML description.

This means we need a repository that stores the YAML, records the current description of the application, and lets us trace and audit past changes. A Git repository naturally meets these requirements. Operations engineers persist YAML changes by submitting commits and merge requests, and permission management and auditing follow Git conventions. Ideally, we maintain a single set of YAML that describes the game servers and trigger the release to multiple regions around the world with one action, with no need to operate on the clusters one by one. This is the idea of GitOps.

The hardest part of implementing GitOps is actually the abstract description of the game server application. As mentioned at the beginning of the article, game servers differ somewhat from region to region, so it seems difficult to describe all of them with one YAML file. For example, suppose we want to release 10 game regional servers in Shanghai but only 3 in Frankfurt. Because the replicas field differs, a single YAML file cannot describe both regions. Should we maintain one YAML file per region instead? That is also unreasonable: a change to a field that does not differ between regions (for example, adding a label to the game servers in every region) must be repeated in every file, and with many clusters it is easy to miss one or make a mistake. That runs counter to the idea of cloud native delivery.

In fact, we can abstract the game server application further with a Helm chart and extract the parts that differ as values. In this example (GitHub game server Helm Chart example [5]), we extracted the following fields:

  • Main image - the main container image may differ per region/cluster
  • Sidecar image - the sidecar image may differ per region/cluster
  • Number of replicas - the number of game servers released may differ per region/cluster
  • Autoscaling enabled or not - each region/cluster may have different autoscaling requirements

Apart from these, all other fields are the same across regions.
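As a rough illustration of this abstraction, the chart's values.yaml might expose the four differing fields as shown below. Only replicas and scaled.enabled are names that appear in this article's ApplicationSet examples; the image keys and registry addresses are hypothetical, so check the chart in [5] for the actual keys.

```yaml
# values.yaml (illustrative sketch, not the chart's actual file)
replicas: 3              # number of game servers; overridden per cluster
scaled:
  enabled: false         # whether autoscaling is turned on for this cluster
image:                   # hypothetical keys for the two images
  main: registry.example.com/game/minecraft:1.0.0
  sidecar: registry.example.com/game/sidecar:1.0.0
```

Each region then only overrides the values that differ, while the templates in the chart stay identical for every cluster.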

With GitOps understood and the game server application abstracted, we use ACK One GitOps to deliver the application. The specific steps follow:

Connect to Git repository

Log in to the ArgoCD UI: in the left navigation bar of the ACK One console, go to Fleet -> GitOps -> GitOps Console, then click LOG IN VIA ALIYUN on the login page. If you need access over the public network, enable public network access to GitOps [6].

  1. Select Settings in the left navigation bar of the ArgoCD UI, then select Repositories > + Connect Repo.

  2. Configure the following information in the pop-up panel, and then click CONNECT to add a connection.

Publish PvE type games

PvE games usually have the concept of regional servers, and in most cases operations engineers manually control the number of servers opened in each region. For cloud native best practices for PvE games, see the OKG PvE game best practices document [7].

Managing applications through the graphical console

When trying ArgoCD for the first time, we can use the graphical console to create an Application for the cluster in each region:

  1. Select Applications in the left navigation bar of the ArgoCD UI, then click + NEW APP.

  2. Configure the following information in the pop-up panel, then click CREATE (opengame-demo-shanghai-dev is used as the example).

  3. After creation, the status of opengame-demo-shanghai-dev appears on the Applications page. If SYNC POLICY is set to Manual, click SYNC manually to deploy the application to the target cluster. A status of Healthy and Synced indicates that synchronization succeeded.

  4. Click the application name opengame-demo-shanghai-dev to view the application details, which show the topology and status of the Kubernetes resources that belong to the application.
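The same Application can also be described declaratively. The manifest below is a sketch of roughly what the console creates for opengame-demo-shanghai-dev. The field values are assumptions borrowed from the pve.yaml example in this article, because the console form itself is not reproduced here, and the argocd namespace is assumed from the kubectl command used for pve.yaml.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: opengame-demo-shanghai-dev
  namespace: argocd                      # assumed namespace of the ArgoCD instance
spec:
  project: default
  source:
    repoURL: https://github.com/AliyunContainerService/gitops-demo.git
    targetRevision: HEAD
    path: manifests/helm/open-game
    helm:
      valueFiles:
      - values.yaml
  destination:
    server: https://47.100.237.xxx:6443  # placeholder address of the Shanghai Dev cluster
    namespace: game-server
  syncPolicy:                            # no automated block, so syncing stays manual
    syncOptions:
    - CreateNamespace=true
```

Keeping such a manifest in Git makes the console-created Application reproducible and leads naturally to the ApplicationSet approach.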

Publish with one click through ApplicationSet

Once familiar with ArgoCD, we can use an ApplicationSet object to release game servers with one action. The differences between clusters are expressed as elements of a list generator. In the YAML below, three fields are defined per cluster: cluster is the cluster name and distinguishes the Application names; url is the address of the target cluster; replicas is the number of game servers released in that cluster.

After writing the ApplicationSet YAML, apply it to the ACK One fleet cluster to create the four Applications automatically:

kubectl apply -f pve.yaml -n argocd

# Contents of pve.yaml:
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: minecraft
spec:
  generators:
  - list:
      elements:
      - cluster: shanghai-dev
        url: https://47.100.237.xxx:6443
        replicas: '1'
      - cluster: shanghai-prod
        url: https://47.101.214.xxx:6443
        replicas: '3'
      - cluster: frankfurt-prod
        url: https://8.209.103.xxx:6443
        replicas: '2'
      - cluster: japan-prod
        url: https://10.0.0.xxx:6443
        replicas: '2'
  template:
    metadata:
      name: '{{cluster}}-minecraft'
    spec:
      project: default
      source:
        repoURL: 'https://github.com/AliyunContainerService/gitops-demo.git'
        targetRevision: HEAD
        path: manifests/helm/open-game
        helm:
          valueFiles:
          - values.yaml
          parameters: # correspond to the values extracted in the Helm chart
          - name: replicas
            value: '{{replicas}}'
          - name: scaled.enabled
            value: 'false'
      destination:
        server: '{{url}}'
        namespace: game-server # deploy to the game-server namespace of the target cluster
      syncPolicy:
        syncOptions:
          - CreateNamespace=true # create the namespace automatically if it does not exist

In this YAML, all clusters use the same image versions. To use different image versions per cluster, add new parameters in the same way as replicas.
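A per-cluster image could be wired in roughly as follows. This is a fragment under stated assumptions: the element key image, the chart value name image.main, and the registry addresses are all hypothetical, and the real chart key should be taken from [5].

```yaml
# Fragment of pve.yaml; fields that do not change are omitted.
spec:
  generators:
  - list:
      elements:
      - cluster: shanghai-prod
        url: https://47.101.214.xxx:6443
        replicas: '3'
        image: registry.example.com/game/minecraft:1.0.1   # hypothetical image
      - cluster: frankfurt-prod
        url: https://8.209.103.xxx:6443
        replicas: '2'
        image: registry.example.com/game/minecraft:1.0.0   # hypothetical image
  template:
    spec:
      source:
        helm:
          parameters:
          - name: replicas
            value: '{{replicas}}'
          - name: image.main      # hypothetical chart value; use the chart's real key
            value: '{{image}}'
```

With this shape, rolling a new version out to a single region is a one-line change to that region's element, which can be reviewed and audited like any other commit.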

Publish PvP type games

For PvP games, the number of room servers is determined by an autoscaler rather than specified manually by operations engineers. For cloud native best practices for PvP games, see the OKG PvP game best practices document [8].

In OKG, elastic scaling of room servers is implemented by configuring a ScaledObject for the GameServerSet, so scaled.enabled must be turned on in this scenario. In addition, two controllers, ArgoCD and OKG, would both try to control the replica count of the room servers and conflict with each other. The conflict is resolved by having ArgoCD ignore changes to the replica count of the GameServerSet resource, which is done by setting the corresponding fields in spec.ignoreDifferences. With this in mind, pvp.yaml looks like this:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: pvp
spec:
  generators:
  - list:
      elements:
      - cluster: shanghai-dev
        url: https://47.100.237.xxx:6443
      - cluster: shanghai-prod
        url: https://47.101.214.xxx:6443
      - cluster: frankfurt-prod
        url: https://8.209.103.xxx:6443
      - cluster: japan-prod
        url: https://10.0.0.xxx:6443
  template:
    metadata:
      name: '{{cluster}}-pvp'
    spec:
      project: default
      ignoreDifferences: # let each cluster control the replica count of the GameServerSet named minecraft
      - group: game.kruise.io
        kind: GameServerSet
        name: minecraft
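        namespace: pvp-server # must match the namespace the GameServerSet is deployed to (the destination namespace below)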
        jsonPointers:
        - /spec/replicas
      source:
        repoURL: 'https://github.com/AliyunContainerService/gitops-demo.git'
        targetRevision: HEAD
        path: manifests/helm/open-game
        helm:
          valueFiles:
          - values.yaml
      destination:
        server: '{{url}}'
        namespace: pvp-server
      syncPolicy:
        syncOptions:
          - CreateNamespace=true
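
For reference, the ScaledObject that scaled.enabled turns on could look roughly like the sketch below. It follows the KEDA ScaledObject format with an external trigger. The scaler address assumes a default OKG installation in the kruise-game-system namespace, and the replica bounds and polling interval are illustrative values rather than settings taken from the chart.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: minecraft
spec:
  scaleTargetRef:
    apiVersion: game.kruise.io/v1alpha1
    kind: GameServerSet
    name: minecraft            # must match the GameServerSet name
  pollingInterval: 30          # seconds between scaler checks (illustrative)
  minReplicaCount: 0           # illustrative lower bound
  maxReplicaCount: 10          # illustrative upper bound
  triggers:
  - type: external
    metricType: Value
    metadata:
      # assumed address of the OKG external scaler in a default installation
      scalerAddress: kruise-game-external-scaler.kruise-game-system:6000
```

Because this object changes spec.replicas on the GameServerSet at runtime, the ignoreDifferences entry in pvp.yaml keeps ArgoCD from reporting that field as out of sync.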

Summary

This article used an example to introduce best practices for consistent multi-region delivery of global game servers with ACK One. The example involves four Kubernetes clusters and a simple game server YAML description. In a real production environment there will likely be more clusters and a more complex application description, and the key then is to abstract the application well.

You are welcome to join the cloud native game DingTalk group (group number: 44862615) to talk with OpenKruiseGame developers and with R&D and operations engineers in the game industry. For questions about ACK One, you can also join the DingTalk group (group number: 35688562).

Related Links:

[1] ACK One registration cluster

https://help.aliyun.com/zh/ack/distributed-cloud-container-platform-for-kubernetes/user-guide/overview-9?spm=a2c4g.11186623.0.0.3e4157eb3o9J3v

[2] Create a registration cluster

https://help.aliyun.com/zh/ack/distributed-cloud-container-platform-for-kubernetes/user-guide/create-a-cluster-registration-proxy-and-register-a-kubernetes-cluster-deployed-in-a-data-center?spm=a2c4g.11186623.0.0.2f833eb6R1YTOq

[3] Enable ACK One multi-cluster fleet

https://help.aliyun.com/zh/ack/distributed-cloud-container-platform-for-kubernetes/user-guide/enable-fleet-management?spm=a2c4g.11186623.0.0.8cc462853sti0H

[4] ACK One console

https://account.aliyun.com/login/login.htm?oauth_callback=https%3A%2F%2Fcs.console.aliyun.com%2Fone

[5] GitHub game server Helm Chart example

https://github.com/AliyunContainerService/gitops-demo/tree/main/manifests/helm/open-game

[6] Enable public network access to GitOps

https://help.aliyun.com/zh/ack/distributed-cloud-container-platform-for-kubernetes/user-guide/enable-gitops-public-network-access?spm=a2c4g.11186623.0.0.e7db48aeenz8AX

[7] OKG PvE Game Best Practices Document

https://openkruise.io/zh/kruisegame/best-practices/pve-game

[8] OKG PvP gaming best practices document

https://openkruise.io/zh/kruisegame/best-practices/session-based-game/


Origin my.oschina.net/u/3874284/blog/11067468