A First Look at Elastic APM

Foreword

Starting from the basic idea of observability, this article introduces APM (Application Performance Management) as a framework for realizing it, then walks through the core components and data model of Elastic APM to deepen that understanding. Finally, a hands-on exercise uses Elastic APM to trace the performance of an application.

Observability

"Observability" is not a feature that a vendor can deliver on its own from outside the system; it is a property built into the system as it is constructed, like usability, high availability, and stability. The goal of designing and building an "observable" system is to ensure that, when it runs in production, the people responsible for operating it can detect undesirable behavior (for example, service downtime, errors, or slow responses) and have actionable information to determine the root cause effectively (for example, detailed event logs, fine-grained resource usage information, and application traces). This may sound unremarkable, but organizations run into many challenges in achieving it. Common ones include not collecting enough information; collecting too much information without extracting anything instructive from it; and storing the information in many different places.

As the quotation above shows, observability is about locating and resolving problems in a running system when they occur. This is especially hard in large-scale distributed systems, where a business process depends on many downstream systems, which makes problems even more complex to pin down. A system with good observability lets you get to the root of a problem at very low cost. As shown below, observability rests on three pillars:

  • Logs: events produced while a program runs, which can describe its operating state in detail.
  • Metrics: sets of aggregated values, mainly used to monitor infrastructure (machines, containers, networks, and so on), though applications also use them for application-level monitoring. For example, the open source search engine Elasticsearch exposes application-level metrics such as query and write volume, latency, and rejection rate.
  • Application Performance Monitoring (APM): note that the M here stands for Monitoring. It refers to tracing (or monitoring) down to the code level, including a program's internal execution and the call chain between services, which makes it easy to find out why a program is "slow". The most common use of APM is tracing how a web server handles a request, including its internal logic, its calls to external services, and the time each one takes.

The three pillars of observability
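To make the difference between the three signals concrete, here is a minimal sketch of a single request handler that emits one of each. It uses only the JDK, and every class, field, and method name in it is hypothetical; it does not reflect how Elastic APM or any logging library actually stores its data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only: none of these names come from Elastic APM.
class ThreePillarsSketch {
    // Metric: an aggregated value (here, a simple request counter).
    static final LongAdder requestCount = new LongAdder();
    // Log: discrete events describing what happened.
    static final List<String> logLines = new ArrayList<>();
    // Trace: timed records of individual operations within a request.
    static final List<String> traceRecords = new ArrayList<>();

    static String handle(String name) {
        long start = System.nanoTime();
        requestCount.increment();                          // metric
        logLines.add("handling request for " + name);      // log
        String result = "hello " + name;
        long micros = (System.nanoTime() - start) / 1_000;
        traceRecords.add("handle took " + micros + " us"); // trace
        return result;
    }

    public static void main(String[] args) {
        System.out.println(handle("apm"));
        System.out.println("requests=" + requestCount.sum());
        System.out.println(logLines);
        System.out.println(traceRecords);
    }
}
```

The point of the sketch is that the counter only answers "how many", the log only answers "what happened", and the timed record is the only one that answers "where did the time go" for one specific request.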

That was a lot of theory, so here is a plain example. Suppose you are responsible for a system with complex business logic that depends on several third-party services or pieces of middleware. One day you receive a string of performance alerts, and your boss asks, "Why is this interface so slow?" If the system's observability is poor, you frantically add log statements to time various lines of code, deploy, and then dig through a vast ocean of logs to work out which line is actually slow. If you find it, congratulations, you are lucky: all you have to do now is remove the log statements or change the log level, and deploy one more time!

More likely, though, you will not find it. Was a GC happening at that moment? Was bandwidth saturated? Was the server load high? Because nothing was fully recorded, the moment has passed and these possibilities can no longer be checked. You cannot guarantee that the interface will not trigger performance alerts again, and your boss starts to doubt your abilities. Frightening, isn't it?

Now suppose heaven sends you a fully functional APM system. You take the trace ID from the alert and search the APM platform for the corresponding call chain, and find that an interface you depend on is very slow. Or your metrics show that GC was very frequent during that period, and the logs give you the specific execution details. Bingo! The cause is now close at hand, and in under 10 minutes you reply to your boss: "I found the problem; it was because of such-and-such." Ask me why I have APM, and the answer is: because everything is under control.

APM in brief

The previous section explained what observability is. A good APM product exists to improve the observability of applications.

Wikipedia defines APM as Application Performance Management, while most APM products on the market define it as Application Performance Monitoring. The article "What Is Application Performance Monitoring and Why It Is Not Application Performance Management" treats application performance monitoring as a part of application performance management: the former helps you find problems, while the latter helps you analyze and solve them. In practice, however, most APM products include some problem analysis, and the industry does not draw a clear line between the two definitions, so the two can basically be treated as the same thing.

Dimensions of APM functionality

  • End-user experience monitoring. Monitoring user behavior in order to optimize the user experience. For example: monitoring how users interact with the web or client interface, and recording the timing of interaction events.

  • Runtime application architecture. Understanding the dependencies between services and the network topology through which the application's parts interact.

  • Business transactions. Generating meaningful SLA reports and providing trend information about application performance from a business perspective.

  • Deep-dive component monitoring. This usually requires installing an agent and mainly targets middleware, including web servers, application servers, and messaging servers. Robust monitoring should be able to show a clear path of code execution. Because this dimension is closely related to the second one, APM products usually merge the two into a single feature.

  • Analytics and reporting. Turning the metrics collected from the application into a common, standardized view of application performance.
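As a small illustration of the kind of figures an SLA report or an analytics view is built from, the sketch below computes latency percentiles and the share of requests that met a latency target. The sample durations, the 200 ms target, and all names are made up for the example, and the nearest-rank percentile used here is just one common definition, not necessarily the one any particular APM product uses.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of any APM product.
class LatencyReportSketch {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samplesMillis, double p) {
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Fraction of requests whose duration met the latency target.
    static double compliance(long[] samplesMillis, long targetMillis) {
        long ok = Arrays.stream(samplesMillis)
                .filter(d -> d <= targetMillis)
                .count();
        return (double) ok / samplesMillis.length;
    }

    public static void main(String[] args) {
        // Made-up request durations in milliseconds.
        long[] durations = {120, 80, 95, 400, 110, 105, 90, 1500, 100, 85};
        System.out.println("p50 = " + percentile(durations, 50) + " ms");
        System.out.println("p95 = " + percentile(durations, 95) + " ms");
        System.out.println("within 200 ms: " + compliance(durations, 200));
    }
}
```

Percentiles are used here instead of an average because a single very slow request would dominate a mean while barely moving the median.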

Elastic APM overview

For details, see the official documentation. In summary, Elastic APM is an application performance monitoring component built on ELK. It adds APM tracing data to the logging and metrics data that ELK already handles, integrating them into a one-stop application observability platform.

Schematic of Elastic APM performance monitoring

Components

Elastic APM consists of four components:

  • APM agents: provided as libraries for the application, they collect performance monitoring data from the program and report it to APM Server.
  • APM Server: receives data from the APM agents, validates and processes it, and writes it to dedicated APM indices in Elasticsearch. An agent could in principle collect and process the data and report directly to Elasticsearch, but the official reasons for not doing so are to keep agents lightweight, to prevent certain security risks, and to improve compatibility across Elastic components.
  • Elasticsearch: stores the performance data and provides aggregation.
  • Kibana: visualizes the performance data and helps locate performance bottlenecks.

ElasticAPM components

Data Model

Elastic APM agents capture different types of data from the applications they instrument. These are called events, and there are four types: spans, transactions, errors, and metrics.

  • Span: contains information about a specific code path that has been executed. A span measures an activity from start to end, and can have a parent/child relationship with other spans.
  • Transaction: a special kind of span (it has no parent span and can only have child spans, so it can be understood as the root node of a tree) with additional attributes associated with it. A transaction can be thought of as the highest-level unit of work in a service, such as serving a request.
  • Error: an error event contains information about the original exception that occurred, or about a log created when the exception occurred.
  • Metrics: APM agents automatically collect basic host-level metrics, including system-level and process-level CPU and memory metrics. Agent-specific metrics are also available, for example JVM metrics in the Java agent and Go runtime metrics in the Go agent.
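The tree relationship between a transaction and its spans can be sketched with a small data structure. The classes below are a hypothetical model written for this illustration, not the agent's real classes; the names reuse those from the demo later in this article, while the durations and the nesting shown are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration; these are not the agent's real classes.
class SpanTreeSketch {
    static class Node {
        final String name;
        final Node parent;                 // null for a transaction (the root)
        final List<Node> children = new ArrayList<>();
        final long durationMicros;

        Node(String name, Node parent, long durationMicros) {
            this.name = name;
            this.parent = parent;
            this.durationMicros = durationMicros;
            if (parent != null) {
                parent.children.add(this);
            }
        }

        // A transaction is simply the span that has no parent.
        boolean isTransaction() {
            return parent == null;
        }

        // Time spent in this node that is not covered by its direct children
        // (assumes children run one after another and do not overlap).
        long selfTimeMicros() {
            long childTotal = 0;
            for (Node c : children) {
                childTotal += c.durationMicros;
            }
            return durationMicros - childTotal;
        }
    }

    // Renders the tree with two spaces of indentation per level.
    static void print(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.name).append(" (").append(node.durationMicros).append(" us)\n");
        for (Node c : node.children) {
            print(c, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node tx = new Node("WsyController#test", null, 1200); // invented durations
        Node outer = new Node("test0-wsy", tx, 900);
        new Node("test1-wsy", outer, 600);
        StringBuilder out = new StringBuilder();
        print(tx, 0, out);
        System.out.print(out);
    }
}
```

Self time, as computed by `selfTimeMicros`, is what a trace view lets you read off at a glance: the part of a span's duration that none of its children account for.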

Elastic APM in practice

Installing the environment

Preparing the installation packages

This walkthrough uses the latest versions of ELK and the APM agent at the time of writing:

Java agent: elastic-apm-agent-1.12.0.jar

Apm Server: apm-server-7.5.1-darwin-x86_64

ElasticSearch: elasticsearch-7.5.1

Kibana: kibana-7.5.1-darwin-x86_64

Setting up the environment

Start the components in the following order:

  • Start Elasticsearch; the default port is 9200.

  • Start Kibana; the default port is 5601.

    Kibana after installation

  • Start APM Server with ./apm-server -e; the default port is 8200. APM Server is written in Go, but the prebuilt binary used here runs without a local Go environment; Go is only needed to build APM Server from source.

  • Start the application with the Java agent attached: java -javaagent:/Tools/apm/elastic-apm-agent-1.12.0.jar -Delastic.apm.secret_token= -jar apm-0.0.1-SNAPSHOT.jar. If you launch the application from IntelliJ IDEA, you can configure it as follows:

    IDEA run configuration carrying the APM startup parameters

    After startup, log output like the following indicates that the APM Java agent has been woven in. The Java agent uses Byte Buddy to weave bytecode, which is what makes the agent non-invasive.

    2020-01-08 15:09:32.603 [apm-server-healthcheck] INFO co.elastic.apm.agent.report.ApmServerHealthChecker - Elastic APM server is available: {  "build_date": "2019-12-16T20:57:12Z",  "build_sha": "348d8d83c3c823b64fc0692be607b1a5a8fac775",  "version": "7.5.1"}
    2020-01-08 15:09:32.779 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - Starting Elastic APM 1.12.0 as sample_wsy on Java 1.8.0_201 (Oracle Corporation) Mac OS X 10.14.6
    
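The idea of timing code without editing it can be shown on a small scale. The sketch below uses a JDK dynamic proxy rather than Byte Buddy bytecode weaving, so it is only an analogy for the effect and not how the agent works internally. The `Greeter` interface and all other names are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Illustration only: a JDK dynamic proxy, NOT Byte Buddy bytecode weaving.
class TimingProxySketch {
    interface Greeter {
        String greet(String name);
    }

    // Business code: contains no timing or tracing logic at all.
    static class PlainGreeter implements Greeter {
        public String greet(String name) {
            return "hello " + name;
        }
    }

    // Collected timing records, one per intercepted call.
    static final List<String> recorded = new ArrayList<>();

    // Wraps any Greeter so that every call is timed and recorded,
    // without changing the wrapped class.
    static Greeter instrument(Greeter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                recorded.add(method.getName() + " took " + micros + " us");
            }
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = instrument(new PlainGreeter());
        System.out.println(greeter.greet("apm"));
        System.out.println(recorded);
    }
}
```

A proxy only works for calls made through an interface and requires the caller to use the wrapped object. Bytecode weaving through a -javaagent does not have that restriction, which is why the application code needs no changes at all.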

Demo Development

In pom.xml, the project needs to declare the apm-agent-api dependency. By default the APM agent collects events such as HTTP requests and database queries. To avoid intruding on the application, the Java agent is implemented with bytecode weaving, so the agent is transparent to the business system. Transparent also means uncontrollable, however, so the agent provides a public API as well, which lets you tell it manually what information to collect.

The Java agent has many more features that this article does not go into; please refer to the official documentation.

<dependency>
	<groupId>co.elastic.apm</groupId>	
	<artifactId>apm-agent-api</artifactId>
	<version>1.12.0</version>
</dependency>

Controller Code

    // Requires imports: co.elastic.apm.api.ElasticApm and co.elastic.apm.api.Transaction
    @GetMapping("/test/{name}")
    public String test(@PathVariable String name) {
        String result = null;
        // The transaction was already started by the agent's automatic HTTP instrumentation.
        Transaction transaction = ElasticApm.currentTransaction();
        System.out.println(transaction.getTraceId());
        try {
            transaction.setName("WsyController#test");
            transaction.setType("CUSTOM");
            result = testService.test(name);
        } catch (Exception e) {
            // Report the exception to APM; the method then returns null.
            transaction.captureException(e);
        }
        // Note: a transaction obtained via ElasticApm.currentTransaction() must not be
        // ended with transaction.end() here. Calling it produces the warning:
        // WARN co.elastic.apm.agent.impl.transaction.AbstractSpan - End has already been called: ''
        return result;
    }

Service Code

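import co.elastic.apm.api.ElasticApm;
import co.elastic.apm.api.Span;
import org.springframework.stereotype.Service;

@Service
public class TestService {
    public String test(String name) {
        // Manually start a span under the current transaction.
        Span span = ElasticApm.currentTransaction().startSpan();
        try {
            span.setName("test0-wsy");
            test1(name);
        } catch (Exception e) {
            // Attach the exception to this span and report it to APM.
            span.captureException(e);
        } finally {
            // A span started manually must always be ended.
            span.end();
        }
        return "hello " + name;
    }

    public String test1(String name) {
        // Also started from the current transaction, like the span in test().
        Span span = ElasticApm.currentTransaction().startSpan();
        try {
            span.setName("test1-wsy");
            // Business logic would go here.
        } catch (Exception e) {
            span.captureException(e);
        } finally {
            span.end();
        }
        return name;
    }
}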

After sending a request to the controller, refresh Kibana; the trace for that request is already there.

APM tracing schematic

Open questions

1. This single-node environment passed testing. In a distributed environment, however, a request passes through many applications in sequence. Can tracing still be achieved across services, and on what principle does it work?

2. Elastic APM captures HTTP requests automatically. Does it also work properly in a distributed environment built on RPC, or do you have to use the public API to achieve that?
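On the first question, tracing systems in the Dapper lineage generally work by sending a trace ID and the caller's span ID along with every outgoing request, so that the downstream service can attach its own spans to the same trace. The sketch below illustrates that principle using the header layout defined by the W3C Trace Context specification (`traceparent`). Whether a given Elastic APM agent version uses exactly this header is an assumption that this article has not verified, and the class and method names are hypothetical.

```java
import java.security.SecureRandom;

// Hypothetical sketch of trace context propagation; not the agent's real code.
class TraceContextSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    final String traceId;   // 32 hex chars, shared by every span in the trace
    final String parentId;  // 16 hex chars, or null for the root of the trace
    final String spanId;    // 16 hex chars, identifies the current span

    TraceContextSketch(String traceId, String parentId, String spanId) {
        this.traceId = traceId;
        this.parentId = parentId;
        this.spanId = spanId;
    }

    static String randomHex(int bytes) {
        byte[] buf = new byte[bytes];
        RANDOM.nextBytes(buf);
        StringBuilder sb = new StringBuilder(bytes * 2);
        for (byte b : buf) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Upstream service: start a new trace.
    static TraceContextSketch newRoot() {
        return new TraceContextSketch(randomHex(16), null, randomHex(8));
    }

    // Serialize for an outgoing request: version "00", trace id,
    // the caller's span id, and flags "01" (sampled).
    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-01";
    }

    // Downstream service: keep the trace id, remember the caller's span id
    // as the parent, and create a new span id for the local work.
    static TraceContextSketch continueFrom(String header) {
        String[] parts = header.split("-");
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
            throw new IllegalArgumentException("malformed traceparent: " + header);
        }
        return new TraceContextSketch(parts[1], parts[2], randomHex(8));
    }

    public static void main(String[] args) {
        TraceContextSketch upstream = newRoot();
        String header = upstream.toHeader();
        TraceContextSketch downstream = continueFrom(header);
        System.out.println("traceparent: " + header);
        System.out.println("same trace: " + upstream.traceId.equals(downstream.traceId));
        System.out.println("parent link: " + upstream.spanId.equals(downstream.parentId));
    }
}
```

Because the only thing passed between services is a small header, the same idea carries over to RPC frameworks as long as the framework offers somewhere to put request metadata.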

Summary

1. APM improves the observability of a system.

2. Together with Elasticsearch and Kibana, Elastic APM offers a one-stop solution for tracing.

Reference Documents

Dapper, a tracing system for large-scale distributed systems

Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

Official APM Java Agent Reference 1.x

Official APM Overview

Full-link distributed tracing systems and APM

Using Elastic APM for application performance monitoring

Achieving observability with the Elastic Stack

Distributed Tracing, OpenTracing and Elastic APM


Origin juejin.im/post/5e15a07f6fb9a0484d690b5d