Querying and aggregating Elasticsearch data with Spring Data Jest

Command Query Responsibility Segregation (CQRS) splits a system's behavior into commands (create, update, and delete operations, which change the system state) and queries (reads, which do not change the system state). This makes the logic clearer and lets each side be optimized separately.

CQRS has the following advantages:

1. Responsibilities are clearly divided, so the command side and the query side can be owned and maintained separately;
2. Separating commands from queries can improve the performance, scalability, and security of the system. It also keeps the system flexible as it evolves and avoids a common problem of the CRUD style, where a change to either the query side or the modification side breaks the other;
3. The logic is clear: it is easy to see which behaviors or operations change the state of the system;
4. The design can move from data-driven to task-driven and event-driven.

In this article, the command side therefore uses a conventional database (relational or non-relational), and the query side uses Elasticsearch, which is more efficient for queries.

How do we keep the data in the database and in Elasticsearch consistent?

We can take an event-driven approach and use Spring Data's domain events to synchronize the data; see the blog post: http://www.wisely.top/2017/06/20/spring-data-domain-event-usage/ .
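
The flow described so far (commands write to the database, an event carries each change to the search index, queries read only from the index) can be sketched in plain Java. This is a minimal illustration and not Spring Data's domain event API: all class names are hypothetical, the two maps stand in for the database and for Elasticsearch, and a simple listener list replaces Spring's event publishing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class CqrsSyncSketch {

    /** Event raised by the command side after a person is saved (hypothetical name). */
    public static final class PersonSavedEvent {
        public final long id;
        public final String name;

        public PersonSavedEvent(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Command side: the only place that changes state. The map stands in for the database. */
    public static final class PersonCommandService {
        private final Map<Long, String> database = new HashMap<>();
        private final List<Consumer<PersonSavedEvent>> listeners = new ArrayList<>();

        public void subscribe(Consumer<PersonSavedEvent> listener) {
            listeners.add(listener);
        }

        public void save(long id, String name) {
            database.put(id, name);
            // Publish only after the write, so the index never runs ahead of the database.
            PersonSavedEvent event = new PersonSavedEvent(id, name);
            for (Consumer<PersonSavedEvent> listener : listeners) {
                listener.accept(event);
            }
        }
    }

    /** Query side: read-only for callers. The map stands in for the Elasticsearch index. */
    public static final class PersonQueryService {
        private final Map<Long, String> index = new HashMap<>();

        /** Event handler that copies the change into the read model. */
        public void onPersonSaved(PersonSavedEvent event) {
            index.put(event.id, event.name);
        }

        public String findName(long id) {
            return index.get(id);
        }
    }

    public static void main(String[] args) {
        PersonCommandService commands = new PersonCommandService();
        PersonQueryService queries = new PersonQueryService();
        commands.subscribe(queries::onPersonSaved);

        commands.save(1L, "Alice");
        System.out.println(queries.findName(1L)); // prints Alice
    }
}
```

In a real application the handler would typically save the document through an Elasticsearch repository instead of a map; the blog post referenced above describes the Spring Data mechanism.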

When an existing database holds a large amount of data that must be imported into Elasticsearch, see the blog post: http://www.wisely.top/2018/02/24/spring-batch-elasticsearch/

Spring Data Elasticsearch uses the transport client, whereas the official Elasticsearch documentation recommends the REST client. The transport client also has problems with Alibaba Cloud's Elasticsearch service, and Alibaba Cloud likewise recommends the REST client.

This example uses Spring Data Jest to connect to Elasticsearch. The Elasticsearch version is 5.5.3 (currently supported only with Spring Boot 2.0 and above).

1. Project setup

1. The pom dependencies are as follows:

<dependency>
    <groupId>com.github.vanroy</groupId>
    <artifactId>spring-boot-starter-data-jest</artifactId>
    <version>3.0.0.RELEASE</version>
</dependency>

<dependency>
    <groupId>io.searchbox</groupId>
    <artifactId>jest</artifactId>
    <version>5.3.2</version>
</dependency>

2. Configuration file

spring:
  data:
    jest:
      uri: http://127.0.0.1:9200
      username: elastic
      password: changeme

2. Building query conditions

The examples below use the following simple entity classes:

package com.hfcsbc.esetl.domain;

import lombok.Data;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import java.util.Date;
import java.util.List;

/**
 * Created by pengchao on 2018/2/23
 */
@Document(indexName = "person", type = "person", shards = 1, replicas = 0, refreshInterval = "-1")
@Entity
@Data
public class Person {
    @Id
    private Long id;
    private String name;
    // A person can have several addresses, so the JPA association is one-to-many;
    // FieldType.Nested indexes each address as a separate nested document.
    @OneToMany
    @Field(type = FieldType.Nested)
    private List<Address> address;
    private Integer number;
    private Integer status;
    private Date birthDay;
}

package com.hfcsbc.esetl.domain;

import lombok.Data;

import javax.persistence.Entity;
import javax.persistence.Id;

/**
 * Created by pengchao on 2018/2/23
 */
@Entity
@Data
public class Address {
    @Id
    private Long id;
    private String name;
    private Integer number;
}

1. Query by multiple statuses (similar to SQL's IN)

// With only should clauses (no must or filter), a document matches if at least one clause matches
BoolQueryBuilder orderStatusCondition = QueryBuilders.boolQuery()
        .should(QueryBuilders.termQuery("status", 1))
        .should(QueryBuilders.termQuery("status", 2))
        .should(QueryBuilders.termQuery("status", 3))
        .should(QueryBuilders.termQuery("status", 4))
        .should(QueryBuilders.termQuery("status", 5));
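
For comparison, the query DSL also has a terms clause that matches any value in a list, which is closer to how SQL's IN is written; the Java API exposes it as QueryBuilders.termsQuery. The sketch below only builds the JSON request body with plain Java to show its shape. The class and helper names are hypothetical, and the helper assumes integer values and a field name that needs no JSON escaping.

```java
import java.util.StringJoiner;

public class TermsQueryJson {

    /** Builds {"query":{"terms":{"<field>":[v1,v2,...]}}} for integer values. */
    public static String termsQuery(String field, int... values) {
        StringJoiner list = new StringJoiner(",", "[", "]");
        for (int value : values) {
            list.add(Integer.toString(value));
        }
        // Assumption: the field name contains no characters that need JSON escaping.
        return "{\"query\":{\"terms\":{\"" + field + "\":" + list + "}}}";
    }

    public static void main(String[] args) {
        // prints {"query":{"terms":{"status":[1,2,3,4,5]}}}
        System.out.println(termsQuery("status", 1, 2, 3, 4, 5));
    }
}
```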

2. AND query (similar to SQL's AND)

// queryBuilder1..3 are previously built conditions; every must clause has to match
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder
        .must(queryBuilder1)
        .must(queryBuilder2)
        .must(queryBuilder3);

3. Range query (similar to SQL's BETWEEN ... AND ...)

QueryBuilder rangeQuery = QueryBuilders.rangeQuery("birthDay").from(yesterday).to(today);
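
The snippet above leaves yesterday and today undefined. One possible way to compute them, assuming they are java.util.Date values marking the start of each day, is sketched below; the class and method names are hypothetical. With these bounds the range covers all of yesterday plus the first instant of today, because from and to include both ends by default.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;

public class DayBounds {

    /** Returns the instant at which the given calendar day starts in the given zone. */
    public static Date startOfDay(LocalDate day, ZoneId zone) {
        return Date.from(day.atStartOfDay(zone).toInstant());
    }

    public static void main(String[] args) {
        // Assumption: day boundaries follow the JVM's default time zone.
        ZoneId zone = ZoneId.systemDefault();
        LocalDate now = LocalDate.now(zone);

        Date yesterday = startOfDay(now.minusDays(1), zone);
        Date today = startOfDay(now, zone);
        System.out.println(yesterday + " .. " + today);
    }
}
```

If the first instant of today should be excluded, the range builder's gte and lt methods can express a half-open range instead.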

4. Nested object query

// The first argument is the path of the nested field, here "address"
QueryBuilder queryBuilder = QueryBuilders.nestedQuery("address", QueryBuilders.termQuery("address.id", 100001), ScoreMode.None);

ScoreMode defines how the scores of the matching nested documents contribute to the score of the parent document. If scoring does not matter, set it to ScoreMode.None: scores are then ignored, which is more efficient and saves memory.

3. Computing statistics

1. Summing a non-nested field

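// Sum the top-level "number" field; "sum" is the name used to read the result back
SumAggregationBuilder sumBuilder = AggregationBuilders.sum("sum").field("number");
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withIndices(QUERY_INDEX)
        .withTypes(QUERY_TYPE)
        .withQuery(boolQueryBuilder)
        .addAggregation(sumBuilder).build();

// ParkingOrder and esParkingOrderRepository come from the author's project, where the query above
// is built by EsQueryBuilders.buildYesterdayArrearsSumQuery(employeeId)
AggregatedPage<ParkingOrder> account = (AggregatedPage<ParkingOrder>) esParkingOrderRepository.search(searchQuery);

// Read the aggregation result by the name given above
int sum = account.getAggregation("sum", SumAggregation.class).getSum().intValue();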

2. Summing a nested field

// The field path and the nested path must match the mapping: Person.address is nested, Address.number is summed
SumAggregationBuilder sumBuilder = AggregationBuilders.sum("sum").field("address.number");
AggregationBuilder aggregationBuilder = AggregationBuilders.nested("nested", "address").subAggregation(sumBuilder);
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withIndices(QUERY_INDEX)
        .withTypes(QUERY_TYPE)
        .withQuery(boolQueryBuilder)
        .addAggregation((AbstractAggregationBuilder) aggregationBuilder).build();
// ParkingOrder and esParkingOrderRepository come from the author's project, where the query above
// is built by EsQueryBuilders.buildYesterdayArrearsSumQuery(employeeId)
AggregatedPage<ParkingOrder> account = (AggregatedPage<ParkingOrder>) esParkingOrderRepository.search(searchQuery);
// Read the "nested" aggregation first, then the "sum" sub-aggregation inside it
int sum = account.getAggregation("nested", SumAggregation.class).getAggregation("sum", SumAggregation.class).getSum().intValue();
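
To make the result of the nested aggregation concrete, the following plain-Java sketch computes the same figure in memory: it adds up number across every address of every matching person. It only illustrates what the aggregation returns, not how Elasticsearch executes it; the trimmed-down Person and Address classes and the method name are hypothetical stand-ins for the entities above.

```java
import java.util.Arrays;
import java.util.List;

public class NestedSumSketch {

    /** Stand-in for the Address entity, reduced to the summed field. */
    public static final class Address {
        public final int number;

        public Address(int number) {
            this.number = number;
        }
    }

    /** Stand-in for the Person entity, reduced to its nested addresses. */
    public static final class Person {
        public final List<Address> address;

        public Person(Address... address) {
            this.address = Arrays.asList(address);
        }
    }

    /** Sums address.number over every nested address of every matching person. */
    public static int sumAddressNumbers(List<Person> matches) {
        return matches.stream()
                .flatMap(person -> person.address.stream())
                .mapToInt(address -> address.number)
                .sum();
    }

    public static void main(String[] args) {
        List<Person> matches = Arrays.asList(
                new Person(new Address(1), new Address(2)),
                new Person(new Address(3)));
        System.out.println(sumAddressNumbers(matches)); // prints 6
    }
}
```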