K8s Series: Best Practices for Kubernetes Log Collection

Source: http://bjbsair.com/2020-04-03/tech-info/29912.html

Why log collection is hard in Kubernetes:

Compared with traditional virtual machines and physical machines, log collection in Kubernetes is considerably more complex. The fundamental reason is that Kubernetes abstracts away the underlying infrastructure and provides fine-grained resource scheduling in a dynamically changing environment. Log collection therefore faces a richer and more dynamic environment, with many more factors to consider.

For example:

  • For a Job-type application with a very short lifetime, running only a few seconds from start to finish, how do you guarantee that log collection keeps up in real time and nothing is lost?
  • Kubernetes encourages large nodes, each running 10 to 100+ containers. How do you collect from 100+ containers while keeping resource consumption as low as possible?
  • Applications in Kubernetes are deployed declaratively via YAML, while log collection is still dominated by hand-written configuration files. How can log collection itself be deployed the Kubernetes way?


Collection: active or passive

Log collection methods divide into passive collection and active push. In Kubernetes, passive collection generally comes in two forms, DaemonSet and Sidecar, while active push splits into DockerEngine push and direct writes from the application.

  • DockerEngine itself provides a LogDriver feature: by configuring different LogDrivers, the container's stdout can be written by DockerEngine to remote storage, achieving log collection. This approach offers very little customizability, flexibility, or resource isolation, and is generally not recommended for production use;
  • With direct writes, a log-collection SDK is integrated into the application, and the SDK sends logs straight to the server. This eliminates the collect-from-disk logic and requires no additional agent deployment, so system resource consumption is minimal, but the application becomes strongly coupled to the log SDK and overall flexibility is very low. It is generally used only in scenarios with extremely high log volume;
  • The DaemonSet approach runs only one log agent per node, which collects all logs on that node. A DaemonSet consumes comparatively few resources, but its scalability and tenant isolation are limited, making it better suited to clusters with a single purpose or a small number of businesses;
  • The Sidecar approach deploys a separate log agent for each Pod, and that agent is responsible only for one application's logs. Sidecar consumes comparatively more resources, but its flexibility and multi-tenant isolation are strong; it is recommended for large clusters, or for clusters that serve multiple business parties as a Kubernetes PaaS platform.
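As a minimal sketch of the Sidecar pattern (the image names and mount paths here are illustrative assumptions, not from the original article), the business container and a log agent run in the same Pod and share the log directory through an emptyDir volume:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
    - name: app
      image: my-app:latest          # illustrative image name
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app   # the application writes its log files here
    - name: log-agent               # sidecar: collects only this Pod's logs
      image: log-agent:latest       # e.g. a Logtail or Fluentd image (assumed)
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: app-logs
      emptyDir: {}                  # shared between the two containers
```

Because every Pod carries its own agent, this layout pays a per-Pod resource cost, which is exactly why Sidecar consumes more resources but isolates tenants so well.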


To conclude:

  • DockerEngine direct writing is generally not recommended;
  • Direct writes from the application are recommended only for scenarios with extremely high log volume;
  • DaemonSet is generally used in small and medium clusters;
  • Sidecar is recommended for very large clusters.

In short: direct writes cost the least resources but bind the application to the SDK; DaemonSet costs little and suits shared clusters with modest isolation needs; Sidecar costs the most but offers the strongest flexibility and multi-tenant isolation; DockerEngine LogDriver scores lowest on both flexibility and isolation and is best avoided in production.

Log Output: stdout or file

Unlike VMs and physical machines, containers in Kubernetes offer two output channels: standard output and files. For standard output, the container writes directly to stdout or stderr, DockerEngine takes over the corresponding file descriptors, and on receiving a log line DockerEngine processes it according to the configured LogDriver rules. Printing to a log file is essentially the same as on a VM or physical machine, except that the logs may land on different kinds of storage, such as the container's default storage, an emptyDir, a host volume, or NFS.
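As an illustrative sketch (the names, server address, and paths below are assumptions, not from the article), the storage options just mentioned map to standard Kubernetes volume types that can back a container's log directory:

```yaml
# Illustrative Pod volume choices for file-based container logs.
volumes:
  - name: logs-emptydir          # lives and dies with the Pod
    emptyDir: {}
  - name: logs-hostpath          # persists on the node's filesystem
    hostPath:
      path: /var/log/my-app      # assumed path
  - name: logs-nfs               # shared remote storage
    nfs:
      server: nfs.example.com    # assumed server
      path: /exports/logs
```

Which volume type to choose depends on whether the logs must outlive the Pod and whether a node-level agent needs to reach them.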

Although printing to stdout is the way Docker officially recommends, note that this recommendation assumes containers are used only as simple applications. In real business scenarios we recommend using file output as much as possible, for the following main reasons:

  • Stdout has performance problems. From the application's stdout to the log server, each line passes through several stages (for example, with the widely used JSON LogDriver): application stdout -> DockerEngine -> LogDriver -> serialize to JSON -> save to file -> agent collects the file -> parse JSON -> upload to server. This adds considerable overhead compared with the file path; in our load tests, outputting 100,000 log lines per second cost DockerEngine one extra CPU core;
  • Stdout does not support classification: all output is mixed into a single stream and cannot be separated into categories the way files can. An application typically has an AccessLog, ErrorLog, InterfaceLog (for calls to external interfaces), TraceLog, and so on, each with a different format and purpose; mixing them into one stream makes collection and analysis difficult;
  • Stdout only captures the container's main process; if the program runs as a daemon or forks, its output cannot be collected via stdout;
  • File output supports a variety of strategies, such as synchronous/asynchronous writes, cache sizes, file rotation policies, compression policies, and cleanup policies, and is considerably more flexible.

We therefore recommend that online applications output logs to files; stdout should be reserved for single-purpose applications and some Kubernetes system or operations components.

CICD Integration: Logging Operator


Kubernetes standardizes service deployment: routing rules, service exposure, storage mounts, workloads, and autoscaling rules can all be declared through YAML (the Kubernetes API), which makes Kubernetes easy to integrate with CICD systems. Log collection is an important part of operations and monitoring, and every business's logs should be collected in real time as soon as it goes live.

The original approach was to manually deploy the log-collection logic after each release, which requires human intervention and defeats the purpose of CICD automation. To automate it, some teams wrap the log-collection API/SDK in an automated service and trigger it with a webhook call from CICD after each release, but this approach carries a high development cost.

In Kubernetes, the most standard way to integrate log collection is to register a new resource type with Kubernetes and manage it in the Operator (CRD) style. That way, the CICD system needs no extra development: it only has to attach the log-related configuration when deploying to Kubernetes.

Our Kubernetes log collection solution


Long before Kubernetes appeared, we had been building log-collection solutions for container environments. As Kubernetes gradually stabilized and we began migrating many of our businesses onto it, we built on that earlier work to develop a collection solution specifically for Kubernetes. Its main features are:

  • Supports real-time collection of many data types, including container files, container stdout, host files, journal logs, events, and more;
  • Supports multiple deployment modes for collection, including DaemonSet, Sidecar, DockerEngine LogDriver, and so on;
  • Supports log data enrichment, attaching Namespace, Pod, Container, Image, and Node information;
  • Stable and highly reliable: it is built on Logtail, Alibaba's self-developed collection agent, which currently has several million deployed instances across the network;
  • Extensible via CRDs: log collection rules can be deployed through standard Kubernetes deployment and release workflows, integrating seamlessly with CICD.

Installing the log collection components:

This collection solution is already open to the public. We provide a Helm installation package that includes the Logtail DaemonSet, the AliyunlogConfig CRD declaration, and the CRD Controller; after installation you can directly use DaemonSet collection and CRD-based configuration. Installation works as follows:

  1. For Alibaba Cloud Kubernetes clusters, tick the install option when creating the cluster and the components above are installed automatically. If they were not installed at creation time, they can be installed manually;
  2. For self-managed Kubernetes, whether on-premises or on any cloud, the same collection solution can be used; for specifics, refer to the instructions for installing it on self-managed Kubernetes on Alibaba Cloud.

Once these components are installed, Logtail and the corresponding Controller run in the cluster, but by default they collect nothing; you must configure collection rules to collect the logs of the Pods you specify.

Collection rule configuration: environment variables or CRD

Besides manually configuring collection in the log service console, Kubernetes additionally supports two configuration methods: environment variables and CRDs.

  • Environment-variable configuration has been in use since the swarm era: you only need to declare, in an environment variable on the container, the address of the data you want collected, and Logtail automatically collects that data to the server.

This method is simple to deploy, cheap to learn, and very easy to adopt, but it supports very few configuration rules; many advanced options (such as parsing methods, filtering, and allow/deny lists) are unavailable. Moreover, this declarative style does not support modify/delete: every change actually creates a new collection configuration, and historical configurations must be cleaned up by hand, otherwise resources are wasted.
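A minimal sketch of the environment-variable style, assuming the `aliyun_logs_` variable-name convention from Alibaba Cloud's Logtail documentation (the image and logstore names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: env-config-demo
spec:
  containers:
    - name: app
      image: my-app:latest         # illustrative image name
      env:
        # Declaring this variable is the entire configuration: Logtail
        # discovers it and collects the container's stdout into a
        # logstore named "app-stdout" (assumed name).
        - name: aliyun_logs_app-stdout
          value: stdout
```

Note that changing the variable later effectively creates a new collection configuration rather than updating the old one, which is the cleanup caveat described above.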


CRD configuration closely follows the standard extension mechanism officially recommended by Kubernetes: collection configuration is managed as a Kubernetes resource, and you declare the data to be collected by deploying a dedicated CRD resource, AliyunLogConfig, to Kubernetes.

For example, the following configuration deploys collection of container standard output, defining that both stdout and stderr are collected while excluding containers whose environment variables include COLLEXT_STDOUT_FLAG: false.
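A hedged reconstruction of such a configuration (field names follow Alibaba Cloud's published AliyunLogConfig schema as we understand it; the resource and logstore names are assumptions, and the schema may differ across versions):

```yaml
apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  name: stdout-collection          # illustrative name
spec:
  logstore: app-stdout             # destination logstore (assumed name)
  logtailConfig:
    inputType: plugin
    configName: stdout-collection
    inputDetail:
      plugin:
        inputs:
          - type: service_docker_stdout
            detail:
              Stdout: true         # collect standard output
              Stderr: true         # collect standard error
              # skip containers that declare this environment variable
              ExcludeEnv:
                COLLEXT_STDOUT_FLAG: "false"
```

Deploying this resource with kubectl (or through CICD) is all that is needed; the CRD Controller turns it into a live Logtail collection configuration.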

Configuring via CRDs manages collection configuration as standard Kubernetes extension resources, supports complete create/read/update/delete semantics, and also supports a variety of advanced configuration; it is the collection-configuration method we recommend most strongly.


Recommended collection rule configuration


In practice, deployments generally use DaemonSet alone or a mix of DaemonSet and Sidecar. The advantage of DaemonSet is high resource utilization, but all the DaemonSet Logtails share one global configuration, and a single Logtail supports only a bounded number of configurations, so DaemonSet alone cannot serve clusters with a large number of applications.

Our recommended configuration is built on these core ideas:

  • Collect similar data with one shared configuration wherever possible, reducing the number of configurations and the pressure on the DaemonSet;
  • Core applications should be given adequate collection resources, which the Sidecar approach can provide;
  • Use CRD-based configuration wherever possible;
  • Since each Sidecar Logtail is configured independently, there is no limit on the number of configurations, which makes Sidecar better suited to very large clusters.

Practice 1 - small and medium clusters


The vast majority of Kubernetes clusters are small or medium sized. There is no strict definition, but typically they run fewer than 500 applications on fewer than 1,000 nodes, with no dedicated Kubernetes platform operations team. Since the number of applications is not especially large, a DaemonSet can support all of the collection configuration:

  • Most business applications' data is collected via the DaemonSet;
  • Core applications (those with high collection-reliability requirements, such as order/trading systems) use the Sidecar approach separately.
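A minimal sketch of the DaemonSet half of this layout (the agent image, namespace, and host path are illustrative assumptions; the Docker containers directory is the conventional location for container stdout logs):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: logging               # assumed namespace
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
        - name: log-agent
          image: log-agent:latest  # e.g. a Logtail image (assumed)
          resources:
            limits:                # cap the per-node collection cost
              cpu: "1"
              memory: 512Mi
          volumeMounts:
            - name: container-logs
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: container-logs
          hostPath:
            path: /var/lib/docker/containers
```

One such Pod per node reads every container's logs on that node, which is what gives DaemonSet its high resource efficiency and its limited tenant isolation.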

Practice 2 - large clusters


Some large or very large clusters serve as PaaS platforms, generally running more than 1,000 businesses on more than 1,000 nodes, with dedicated Kubernetes platform operations staff. In this scenario the number of applications is unbounded and a DaemonSet cannot support them all, so Sidecar is mandatory. The overall plan is as follows:

  • The platform's own Kubernetes system-component logs and kernel logs are relatively fixed in kind; this part is collected with a DaemonSet, mainly to serve the platform's operations staff;
  • Each business uses Sidecar collection for its own logs, and each can independently set its Sidecar's collection destination, giving the business's DevOps staff sufficient flexibility.


Origin: www.cnblogs.com/lihanlin/p/12657680.html