Advanced Elasticsearch Query Techniques

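1. Introduction

1.1 About Elasticsearch

Elasticsearch is an open-source, distributed search and analytics engine built on Apache Lucene. It provides near-real-time search and analytics across a cluster, scales to very large data sets, and typically returns query results within milliseconds. Documents are stored, searched, and analyzed as JSON.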

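2. Basic Query Types

2.1 Match Query

Match Query is one of the most basic and commonly used query types. It analyzes the search text and finds documents whose specified field matches the resulting terms.

// Build a match query
MatchQueryBuilder matchQuery = QueryBuilders.matchQuery("title", "Elasticsearch")
                                 .operator(Operator.AND)  // every analyzed term must match
                                 .fuzziness(Fuzziness.AUTO)  // tolerate typos; allowed edits depend on term length
                                 .prefixLength(3)  // the first 3 characters must match exactly
                                 .maxExpansions(10);  // maximum number of terms each fuzzy term may expand to

// Execute the query (TransportClient-style API; setTypes only applies to versions with mapping types)
SearchResponse searchResponse = client.prepareSearch("index")
                                     .setTypes("type")
                                     .setQuery(matchQuery)
                                     .get();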

2.2 Bool Query

Bool Query combines multiple query clauses. It supports four clause types: must, must_not, should, and filter. Clauses under must and should contribute to the relevance score, while filter and must_not clauses only include or exclude documents.

// Build a bool query
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
                               .must(QueryBuilders.matchQuery("title", "Elasticsearch"))
                               .mustNot(QueryBuilders.termQuery("category", "deleted"))
                               .should(QueryBuilders.rangeQuery("price").lte(100))
                               .filter(QueryBuilders.termQuery("availability", "instock"));

// Execute the query
SearchResponse searchResponse = client.prepareSearch("index")
                                     .setTypes("type")
                                     .setQuery(boolQuery)
                                     .get();
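
How the four clause types interact is easier to see on plain data. The following is a minimal in-memory sketch, not Elasticsearch code: `BoolSketch`, its `doc` helper, and the sample documents are hypothetical, and the match clause is approximated by a case-insensitive substring test. It mirrors the documented default that should clauses are optional once a must or filter clause is present, and that at least one should clause has to match when they stand alone. Scoring is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** In-memory illustration of bool-query matching; not Elasticsearch code. */
class BoolSketch {
    final List<Predicate<Map<String, Object>>> must = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> mustNot = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> should = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> filter = new ArrayList<>();

    /** Returns true when the document satisfies the combined clauses. */
    boolean matches(Map<String, Object> doc) {
        for (Predicate<Map<String, Object>> p : must) {
            if (!p.test(doc)) return false;
        }
        for (Predicate<Map<String, Object>> p : filter) {
            if (!p.test(doc)) return false;
        }
        for (Predicate<Map<String, Object>> p : mustNot) {
            if (p.test(doc)) return false;
        }
        // With a must or filter clause present, should clauses are optional;
        // on their own, at least one of them has to match.
        boolean shouldIsRequired = must.isEmpty() && filter.isEmpty() && !should.isEmpty();
        if (!shouldIsRequired) return true;
        for (Predicate<Map<String, Object>> p : should) {
            if (p.test(doc)) return true;
        }
        return false;
    }

    /** Hypothetical helper that builds a sample document. */
    static Map<String, Object> doc(String title, String category, int price, String availability) {
        Map<String, Object> d = new HashMap<>();
        d.put("title", title);
        d.put("category", category);
        d.put("price", price);
        d.put("availability", availability);
        return d;
    }

    /** Same clause set as the bool query above; the match clause is only approximated. */
    static BoolSketch example() {
        BoolSketch q = new BoolSketch();
        q.must.add(d -> ((String) d.get("title")).toLowerCase().contains("elasticsearch"));
        q.mustNot.add(d -> "deleted".equals(d.get("category")));
        q.should.add(d -> ((Integer) d.get("price")) <= 100);
        q.filter.add(d -> "instock".equals(d.get("availability")));
        return q;
    }

    public static void main(String[] args) {
        BoolSketch q = example();
        // Price 150 misses the should clause, yet the document still matches
        System.out.println(q.matches(doc("Elasticsearch in Action", "books", 150, "instock")));
        // Excluded by must_not
        System.out.println(q.matches(doc("Elasticsearch in Action", "deleted", 50, "instock")));
    }
}
```

In the real query, a document that also satisfies the should clause would rank higher; the sketch only answers whether a document matches at all.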

2.3 Range Query

Range Query finds documents whose value in the specified field falls within a given range.

// Build a range query
RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("price")
                               .gte(50)  // greater than or equal to 50
                               .lte(100);  // less than or equal to 100

// Execute the query
SearchResponse searchResponse = client.prepareSearch("index")
                                     .setTypes("type")
                                     .setQuery(rangeQuery)
                                     .get();

2.4 Terms Query

Terms Query finds documents whose specified field contains at least one of several exact terms.

// Build a terms query
TermsQueryBuilder termsQuery = QueryBuilders.termsQuery("category", "electronics", "books", "clothing");

// Execute the query
SearchResponse searchResponse = client.prepareSearch("index")
                                     .setTypes("type")
                                     .setQuery(termsQuery)
                                     .get();

2.5 Aggregation Query

Aggregations analyze the documents that match a search: they can group them into buckets, count them, and compute statistics over them.

// Build the search request
SearchRequestBuilder searchRequest = client.prepareSearch("index")
                                           .setTypes("type");

AggregationBuilder aggregation = AggregationBuilders.terms("by_category")
                                     .field("category.keyword")
                                     .size(5)  // number of buckets to return
                                     .order(BucketOrder.count(false));  // sort buckets by document count, descending (Terms.Order before 6.0)

searchRequest.addAggregation(aggregation);

// Execute the query
SearchResponse searchResponse = searchRequest.get();

// Read the aggregation result
Terms byCategoryAggregation = searchResponse.getAggregations().get("by_category");
List<? extends Terms.Bucket> buckets = byCategoryAggregation.getBuckets();
for (Terms.Bucket bucket : buckets) {
    String key = bucket.getKeyAsString();
    long count = bucket.getDocCount();
    System.out.println(key + ": " + count);
}
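
To make the bucket logic concrete, here is a small in-memory sketch of what a terms aggregation with `size` and descending count order computes. It is an illustration under assumptions, not how Elasticsearch implements it: `TermsAggSketch` and the sample values are hypothetical, ties are broken by key in ascending order for determinism, and the per-shard approximation that a distributed terms aggregation involves is ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of a terms aggregation; not Elasticsearch code. */
class TermsAggSketch {
    /** Counts each value and returns up to `size` "key: count" lines, highest count first. */
    static List<String> topTerms(List<String> values, int size) {
        Map<String, Long> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> buckets = new ArrayList<>(counts.entrySet());
        // Count descending; ties broken by key ascending (an assumption of this sketch)
        buckets.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < Math.min(size, buckets.size()); i++) {
            lines.add(buckets.get(i).getKey() + ": " + buckets.get(i).getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> categories = Arrays.asList(
                "books", "electronics", "books", "clothing", "books", "electronics");
        for (String line : topTerms(categories, 2)) {
            System.out.println(line);
        }
    }
}
```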

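3. Advanced Queries

3.1 Fuzzy Query

// Build a fuzzy query
FuzzyQueryBuilder fuzzyQuery = QueryBuilders.fuzzyQuery("fieldName", "searchTerm");
// Maximum edit distance; AUTO derives it from the length of the term
fuzzyQuery.fuzziness(Fuzziness.AUTO);
// Number of leading characters that must match exactly
fuzzyQuery.prefixLength(2);
// Maximum number of term variations the query may expand to
fuzzyQuery.maxExpansions(10);
// Count a swap of two adjacent characters (ab -> ba) as a single edit
fuzzyQuery.transpositions(true);

// Attach the fuzzy query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(fuzzyQuery);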
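
The fuzziness, prefix length, and transposition settings can be illustrated without a cluster. The sketch below is stdlib-only and is not Lucene's implementation (Lucene compiles the allowed edits into an automaton); `FuzzySketch` and its method names are hypothetical. It assumes the documented default thresholds for AUTO (terms of up to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two) and uses an optimal-string-alignment edit distance. `maxExpansions` is not modeled.

```java
/** Stdlib-only illustration of fuzzy matching; not Lucene's actual implementation. */
class FuzzySketch {
    /** Edits allowed by AUTO fuzziness with its default length thresholds. */
    static int autoFuzziness(String term) {
        int length = term.length();
        if (length <= 2) return 0;
        if (length <= 5) return 1;
        return 2;
    }

    /**
     * Edit distance counting insertions, deletions, and substitutions,
     * plus swaps of adjacent characters when transpositions is true.
     */
    static int editDistance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** True when indexedTerm shares the required prefix and lies within the AUTO edit distance. */
    static boolean fuzzyMatches(String query, String indexedTerm,
                                int prefixLength, boolean transpositions) {
        int prefix = Math.min(prefixLength, query.length());
        if (!indexedTerm.startsWith(query.substring(0, prefix))) return false;
        return editDistance(query, indexedTerm, transpositions) <= autoFuzziness(query);
    }

    public static void main(String[] args) {
        System.out.println(editDistance("serach", "search", true));   // swap counts as one edit
        System.out.println(editDistance("serach", "search", false));  // swap counts as two edits
        System.out.println(fuzzyMatches("serach", "search", 2, true));
    }
}
```

A longer prefix length shrinks the set of candidate terms, which is why raising it is a common way to make fuzzy queries cheaper.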

3.2 Prefix Query

// Build a prefix query: matches terms in the field that start with the given prefix
PrefixQueryBuilder prefixQuery = QueryBuilders.prefixQuery("fieldName", "searchTerm");

// Attach the prefix query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(prefixQuery);

3.3 Wildcard Query

// Build a wildcard query: in the pattern, * matches any character sequence and ? matches a single character
WildcardQueryBuilder wildcardQuery = QueryBuilders.wildcardQuery("fieldName", "searchTerm");

// Attach the wildcard query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(wildcardQuery);
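
As a rough illustration of the pattern semantics, the following stdlib-only sketch translates a wildcard pattern into a `java.util.regex.Pattern` and tests it against a single term. `WildcardSketch` is a hypothetical name, backslash escaping is not handled, and Elasticsearch itself evaluates wildcards against the indexed terms rather than through Java regular expressions. A prefix query behaves like the wildcard pattern `prefix*`.

```java
import java.util.regex.Pattern;

/** Stdlib-only illustration of wildcard semantics; not how Elasticsearch evaluates them. */
class WildcardSketch {
    /** Translates * to "any sequence" and ? to "any single character"; everything else is literal. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** The pattern has to cover the whole term, not just part of it. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("elast*", "elasticsearch"));
        System.out.println(matches("el?stic", "elastic"));
        System.out.println(matches("el?stic", "elstic"));
    }
}
```

Patterns that begin with `*` or `?` force a scan over many terms, so they are best avoided on large indices.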

3.4 Regexp Query

// Build a regexp query: the pattern must match the entire term
RegexpQueryBuilder regexpQuery = QueryBuilders.regexpQuery("fieldName", "regexpPattern");

// Attach the regexp query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(regexpQuery);
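
One property that often surprises newcomers is that a regexp query is anchored: the pattern has to match the whole term. The sketch below shows only that anchoring behavior, using `java.util.regex` as a stand-in. This is an assumption made for illustration: Elasticsearch uses Lucene's regular-expression syntax, which differs from Java's in several operators, and `RegexpSketch` is a hypothetical name.

```java
import java.util.regex.Pattern;

/** Illustrates whole-term (anchored) matching; Java regex stands in for Lucene's syntax. */
class RegexpSketch {
    /** Anchored: true only when the pattern covers the entire term. */
    static boolean termMatches(String regex, String term) {
        return Pattern.compile(regex).matcher(term).matches();
    }

    /** Unanchored, for contrast: true when the pattern occurs anywhere in the term. */
    static boolean occursIn(String regex, String term) {
        return Pattern.compile(regex).matcher(term).find();
    }

    public static void main(String[] args) {
        System.out.println(termMatches("elast.*", "elasticsearch"));
        System.out.println(termMatches("search", "elasticsearch"));
        System.out.println(occursIn("search", "elasticsearch"));
    }
}
```

To match a substring with a regexp query, the pattern therefore needs explicit `.*` on both sides.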

3.5 Geo Queries

3.5.1 Geo-bounding Box Query

// Build a geo-bounding box query; setCorners expects (top, left, bottom, right)
GeoBoundingBoxQueryBuilder geoBoundingBoxQuery = QueryBuilders.geoBoundingBoxQuery("fieldName")
    .setCorners(topRightLat, bottomLeftLon, bottomLeftLat, topRightLon);

// Attach the geo-bounding box query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(geoBoundingBoxQuery);
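
The containment test behind a bounding box is simple enough to sketch directly. The following stdlib-only example is an illustration rather than Elasticsearch's implementation; `BoundingBoxSketch` and the coordinates are hypothetical. It takes the corners in the same (top, left, bottom, right) order and treats a box whose left edge lies east of its right edge as crossing the antimeridian.

```java
/** Stdlib-only illustration of a bounding-box containment test; not Elasticsearch code. */
class BoundingBoxSketch {
    /** Returns true when the point (lat, lon) lies inside the box, edges included. */
    static boolean contains(double top, double left, double bottom, double right,
                            double lat, double lon) {
        if (lat > top || lat < bottom) {
            return false;
        }
        if (left <= right) {
            return lon >= left && lon <= right;
        }
        // The box wraps around the antimeridian (for example from 170 to -170)
        return lon >= left || lon <= right;
    }

    public static void main(String[] args) {
        // A box around lower Manhattan and a point inside it
        System.out.println(contains(40.8, -74.1, 40.6, -73.9, 40.7, -74.0));
        // The same box and a point north of it
        System.out.println(contains(40.8, -74.1, 40.6, -73.9, 41.0, -74.0));
    }
}
```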

3.5.2 Geo-distance Query

// Build a geo-distance query: matches points within the given distance of the center point
GeoDistanceQueryBuilder geoDistanceQuery = QueryBuilders.geoDistanceQuery("fieldName")
    .point(centerLat, centerLon)
    .distance(distance, DistanceUnit.KILOMETERS);

// Attach the geo-distance query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(geoDistanceQuery);
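
For intuition about what "within a distance" means on a sphere, here is a stdlib-only haversine sketch. It is an approximation for illustration, not the exact routine Elasticsearch runs; `GeoDistanceSketch` and the mean Earth radius constant are assumptions of the example, as are the sample coordinates for Paris and London.

```java
/** Stdlib-only haversine sketch; illustrative, not Elasticsearch's own distance code. */
class GeoDistanceSketch {
    /** Mean Earth radius in kilometers (an assumption of this sketch). */
    static final double EARTH_RADIUS_KM = 6371.0088;

    /** Great-circle distance between two points given in decimal degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** True when the point lies no farther than distanceKm from the center. */
    static boolean within(double centerLat, double centerLon, double distanceKm,
                          double lat, double lon) {
        return haversineKm(centerLat, centerLon, lat, lon) <= distanceKm;
    }

    public static void main(String[] args) {
        // Paris to London, roughly
        System.out.println(haversineKm(48.8566, 2.3522, 51.5074, -0.1278));
        System.out.println(within(48.8566, 2.3522, 400, 51.5074, -0.1278));
    }
}
```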

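3.5.3 Geo-shape Query

// Build a geo-shape query against a pre-indexed shape: the string arguments are the field name,
// the ID of the document holding the shape, and its mapping type (7.x also has a variant without the type).
// To query with an inline shape, pass a shape object instead of the ID strings.
GeoShapeQueryBuilder geoShapeQuery = QueryBuilders.geoShapeQuery("fieldName", "indexedShapeId", "indexedShapeType");

// Attach the geo-shape query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(geoShapeQuery);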

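3.6 Script Query

// Build a script query. "script" stands for a Painless expression that returns a boolean,
// such as doc['price'].value > 50; the last argument holds the script parameters.
ScriptQueryBuilder scriptQuery = QueryBuilders.scriptQuery(
    new Script(ScriptType.INLINE, "painless", "script", Collections.emptyMap()));

// Attach the script query to the search source
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(scriptQuery);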

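4. Complex Query Scenarios

4.1 Sorting on Multiple Fields

Elasticsearch can sort on several fields at once. Sort criteria apply in the order they are added, so a later field only breaks ties left by the earlier ones. The following example sets up a multi-field sort with the Java API:

// Create the SearchRequest
SearchRequest searchRequest = new SearchRequest("index_name");

// Create the SearchSourceBuilder that holds the search options
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Sort by field1 ascending, then by field2 descending
searchSourceBuilder.sort(new FieldSortBuilder("field1").order(SortOrder.ASC));
searchSourceBuilder.sort(new FieldSortBuilder("field2").order(SortOrder.DESC));

// Attach the search source to the request
searchRequest.source(searchSourceBuilder);

// Send the search request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);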
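
The ordering that two sort criteria produce can be reproduced on an in-memory list with a chained `Comparator`. This is only a sketch of the resulting order, not of how Elasticsearch sorts across shards; `MultiSortSketch`, its `Doc` class, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** In-memory illustration of a two-level sort; not Elasticsearch code. */
class MultiSortSketch {
    /** Hypothetical document with the two sort fields used above. */
    static final class Doc {
        final String field1;
        final int field2;

        Doc(String field1, int field2) {
            this.field1 = field1;
            this.field2 = field2;
        }

        @Override
        public String toString() {
            return field1 + ":" + field2;
        }
    }

    /** Sorts by field1 ascending; documents with equal field1 are ordered by field2 descending. */
    static List<Doc> sort(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        Comparator<Doc> byField1 = Comparator.comparing((Doc d) -> d.field1);
        Comparator<Doc> byField2Desc = Comparator.comparingInt((Doc d) -> d.field2).reversed();
        sorted.sort(byField1.thenComparing(byField2Desc));
        return sorted;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(new Doc("b", 1), new Doc("a", 1), new Doc("a", 5));
        System.out.println(sort(docs)); // prints [a:5, a:1, b:1]
    }
}
```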

4.2 Fuzzy Matching and Typo Tolerance

Elasticsearch supports fuzzy matching, which tolerates typos in the search term. Fuzzy Query is one way to use it. The following example runs a fuzzy search with the Java API:

// Create the SearchRequest
SearchRequest searchRequest = new SearchRequest("index_name");

// Create the SearchSourceBuilder that holds the search options
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Build the fuzzy query
FuzzyQueryBuilder fuzzyQuery = QueryBuilders.fuzzyQuery("field", "keyword").fuzziness(Fuzziness.AUTO);

// Attach the fuzzy query to the search source
searchSourceBuilder.query(fuzzyQuery);
// Attach the search source to the request
searchRequest.source(searchSourceBuilder);

// Send the search request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

4.3 Searching Multiple Fields

To search several fields at once, use Multi Match Query. The following example runs a multi-field search with the Java API:

// Create the SearchRequest
SearchRequest searchRequest = new SearchRequest("index_name");

// Create the SearchSourceBuilder that holds the search options
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Build the multi match query: the search text comes first, followed by the fields to search
MultiMatchQueryBuilder multiMatchQuery = QueryBuilders.multiMatchQuery("keyword", "field1", "field2");

// Attach the multi match query to the search source
searchSourceBuilder.query(multiMatchQuery);

// Attach the search source to the request
searchRequest.source(searchSourceBuilder);

// Send the search request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
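
By default a multi match query uses the best_fields type, which scores a document by its best-matching field and optionally adds a fraction of the other fields' scores. The sketch below shows that combination rule on made-up per-field scores. `BestFieldsSketch` and the numbers are hypothetical, and the per-field scores themselves would come from Elasticsearch's relevance scoring, which is not modeled here.

```java
/** Illustrates how best_fields combines per-field scores; the input scores are made up. */
class BestFieldsSketch {
    /** Best field score plus tieBreaker times the scores of all other matching fields. */
    static double combine(double[] fieldScores, double tieBreaker) {
        double best = 0.0;
        double sum = 0.0;
        for (double score : fieldScores) {
            best = Math.max(best, score);
            sum += score;
        }
        return best + tieBreaker * (sum - best);
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 1.0};
        System.out.println(combine(scores, 0.0)); // only the best field counts
        System.out.println(combine(scores, 0.5)); // the other field contributes half its score
    }
}
```

With the default tie breaker of 0.0, a document matching the text in two fields scores no higher than one matching equally well in a single field; raising the tie breaker rewards matches in additional fields.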

4.4 Filtering by Time Range

To restrict results to a time range, use Range Query on a date field. The following example filters by time range with the Java API:

// Create the SearchRequest
SearchRequest searchRequest = new SearchRequest("index_name");

// Create the SearchSourceBuilder that holds the search options
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Build the range query; "start_date" and "end_date" are placeholders for values in the field's date format
RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("date_field").gte("start_date").lte("end_date");

// Attach the range query to the search source
searchSourceBuilder.query(rangeQuery);

// Attach the search source to the request
searchRequest.source(searchSourceBuilder);

// Send the search request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

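4.5 Paging Through Large Result Sets with Scroll and Search After

Elasticsearch offers two ways to page through large result sets: Scroll and Search After. They are separate mechanisms and are not combined in one request. Scroll keeps a search context alive on the server and returns a scroll ID that each follow-up request passes back. Search After is stateless: the search must be sorted, and each request passes the sort values of the previous page's last hit (SearchHit.getSortValues()) to SearchSourceBuilder.searchAfter(...). The following example pages through an index with Scroll using the Java API:

// Create the SearchRequest
SearchRequest searchRequest = new SearchRequest("index_name");

// Create the SearchSourceBuilder that holds the search options
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Number of hits per page
int pageSize = 10;
searchSourceBuilder.size(pageSize);

// How long Elasticsearch keeps the scroll context alive between requests
TimeValue scrollTime = TimeValue.timeValueMinutes(1);
searchRequest.scroll(scrollTime);

// Attach the search source to the request
searchRequest.source(searchSourceBuilder);

// Run the initial search; it returns the first page together with a scroll ID
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
String scrollId = searchResponse.getScrollId();
SearchHit[] hits = searchResponse.getHits().getHits();

// Keep fetching pages until one comes back empty
while (hits.length != 0) {
    for (SearchHit hit : hits) {
        // Process the hit
    }

    // Request the next page with the latest scroll ID
    SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
    scrollRequest.scroll(scrollTime);
    searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);

    // The scroll ID can change between requests, so always keep the newest one
    scrollId = searchResponse.getScrollId();
    hits = searchResponse.getHits().getHits();
}

// Release the scroll context once paging is finished
ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
clearScrollRequest.addScrollId(scrollId);
client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);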
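
Search After follows a different idea from Scroll: instead of a server-side cursor, each request carries the sort value of the last hit already seen. The stdlib-only sketch below shows that cursor logic on an in-memory list. It is an illustration under assumptions: `SearchAfterSketch` is a hypothetical name, and the already sorted, unique IDs stand in for a sorted search with a unique tiebreaker field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** In-memory illustration of search_after style paging; not Elasticsearch code. */
class SearchAfterSketch {
    /** Returns up to `size` IDs that sort strictly after `after`; null requests the first page. */
    static List<Integer> page(List<Integer> sortedIds, Integer after, int size) {
        List<Integer> result = new ArrayList<>();
        for (Integer id : sortedIds) {
            if (after != null && id <= after) {
                continue; // already returned on an earlier page
            }
            result.add(id);
            if (result.size() == size) {
                break;
            }
        }
        return result;
    }

    /** Walks through all pages, feeding the last ID of each page into the next request. */
    static List<List<Integer>> allPages(List<Integer> sortedIds, int size) {
        List<List<Integer>> pages = new ArrayList<>();
        Integer after = null;
        while (true) {
            List<Integer> current = page(sortedIds, after, size);
            if (current.isEmpty()) {
                break;
            }
            pages.add(current);
            after = current.get(current.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(allPages(Arrays.asList(1, 2, 3, 4, 5), 2)); // prints [[1, 2], [3, 4], [5]]
    }
}
```

Because no state is held between requests, this style suits deep pagination driven by user navigation, while Scroll fits one-off exports of a full result set.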

5. Debugging and Performance Tuning

5.1 Inspecting Match Details with the Explain API

The Explain API shows in detail how a specific document matches (or fails to match) a query and how its score was computed. The following example uses the Explain API:

// Identify the index and the document to explain (6.x also expects the mapping type between the two)
ExplainRequest explainRequest = new ExplainRequest("index_name", "document_id");
// The query whose match against that document should be explained
explainRequest.query(QueryBuilders.termQuery("field_name", "search_term"));

ExplainResponse explainResponse = client.explain(explainRequest, RequestOptions.DEFAULT);

// isMatch() tells whether the document matched; the explanation breaks down the score
System.out.println("match: " + explainResponse.isMatch());
if (explainResponse.hasExplanation()) {
    Explanation explanation = explainResponse.getExplanation();
    System.out.println(explanation.toString());
}

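5.2 Profiling Searches to Find Bottlenecks

Search profiling shows how a query was executed and which parts of it took the most time. The following example enables profiling and prints the result:

SearchRequest searchRequest = new SearchRequest("index_name");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.termQuery("field_name", "search_term"));
searchSourceBuilder.profile(true);  // enable profiling for this request
searchRequest.source(searchSourceBuilder);

SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

// Profile results are keyed by shard (newer releases name the value type SearchProfileShardResult)
Map<String, ProfileShardResult> profileResults = searchResponse.getProfileResults();
for (Map.Entry<String, ProfileShardResult> entry : profileResults.entrySet()) {
    for (QueryProfileShardResult queryProfile : entry.getValue().getQueryProfileResults()) {
        for (ProfileResult result : queryProfile.getQueryResults()) {
            // Print each top-level query component with the time it took, in nanoseconds
            System.out.println(entry.getKey() + " " + result.getQueryName() + ": " + result.getTime() + " ns");
        }
    }
}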

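5.3 Reducing Cold-Query Latency with Warmers

The first searches against newly created segments tend to be slow, because caches and other lazily loaded data structures are still cold. Older Elasticsearch releases let you register a warmer, a search that ran automatically against new segments before they became searchable. Index warmers were removed in Elasticsearch 5.0, so no warmer request exists in the high-level REST client. A rough substitute on current versions is to run a representative search yourself, for example right after the application starts, as in the following example:

// Run a representative query once so that later user-facing searches hit warm caches
SearchRequest warmUpRequest = new SearchRequest("index_name");
SearchSourceBuilder warmUpSource = new SearchSourceBuilder();
warmUpSource.query(QueryBuilders.matchQuery("field_name", "search_term"));
warmUpSource.size(0);  // the hits themselves are not needed
warmUpRequest.source(warmUpSource);

SearchResponse warmUpResponse = client.search(warmUpRequest, RequestOptions.DEFAULT);

if (warmUpResponse.status() == RestStatus.OK) {
    System.out.println("Warm-up query completed in " + warmUpResponse.getTook());
}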

Reposted from blog.csdn.net/u010349629/article/details/131697266