Common Aggregations
1. Missing Aggregation
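Returns the number of documents in which a field is missing.
The following example returns the number of documents that have no price.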
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"products_without_a_price" : {
"missing" : { "field" : "price" }
}
}
}
Result
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"products_without_a_price": {
"doc_count": 4
}
}
}
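To make the counting rule concrete, here is a minimal Python sketch that mimics the missing aggregation on an in-memory list. The sample documents and the helper name `missing_count` are hypothetical; the sketch assumes that a field counts as missing when it is absent or explicitly null.

```python
def missing_count(docs, field):
    """Count documents where `field` is absent or explicitly null."""
    return sum(1 for doc in docs if doc.get(field) is None)


# Hypothetical sample: six documents, only two of which carry a price.
docs = [
    {"name": "kimchy1", "price": 2.2},
    {"name": "kimchy1", "price": 2.5},
    {"name": "SIMA"},
    {"name": "SIMA1", "price": None},
    {"name": "SIMA5"},
    {"name": "SIMA6"},
]

print(missing_count(docs, "price"))
```

With this sample, four of the six documents have no price, which is the same shape as the doc_count returned above.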
2. Range Aggregation
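The range aggregation groups documents into buckets by value range.
Note that each range includes its from value and excludes its to value.
The following example returns three buckets: *-2.2, 2.2-2.5, and 2.5-*.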
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"price_ranges" : {
"range" : {
"field" : "price",
"ranges" : [
{ "to" : 2.2 },
{ "from" : 2.2, "to" : 2.5 },
{ "from" : 2.5 }
]
}
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"price_ranges": {
"buckets": [
{
"key": "*-2.2",
"to": 2.2,
"doc_count": 0
},
{
"key": "2.2-2.5",
"from": 2.2,
"to": 2.5,
"doc_count": 1
},
{
"key": "2.5-*",
"from": 2.5,
"doc_count": 1
}
]
}
}
}
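The half-open rule (from included, to excluded) explains why the price 2.2 lands in the 2.2-2.5 bucket rather than in *-2.2. A small Python sketch of that rule follows; `range_buckets` is a hypothetical helper and the two prices are assumed to be the only priced documents, as the result above suggests.

```python
def range_buckets(values, ranges):
    """Bucket values into [from, to) intervals; a missing bound is open-ended."""
    buckets = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        count = sum(
            1 for v in values
            if (lo is None or v >= lo) and (hi is None or v < hi)
        )
        key = f"{'*' if lo is None else lo}-{'*' if hi is None else hi}"
        buckets.append({"key": key, "doc_count": count})
    return buckets


prices = [2.2, 2.5]  # assumed sample values
ranges = [{"to": 2.2}, {"from": 2.2, "to": 2.5}, {"from": 2.5}]

for bucket in range_buckets(prices, ranges):
    print(bucket)
```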
3. Histogram Aggregation
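The histogram aggregation splits values into buckets of a fixed interval, much like date_histogram.
- interval: the width of each bucket
- min_doc_count: the minimum document count; when set to 1, buckets with 0 documents are not returned
- extended_bounds: sets the range of buckets to return. By default Elasticsearch only returns buckets between the minimum and maximum values in your data. If the 2.2 bucket has no data but the 2.3 bucket does, then without extended_bounds the returned buckets start at 2.3. Note that extended_bounds can only add empty buckets when min_doc_count is 0.
The following example builds price buckets with an interval of 0.1.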
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"prices" : {
"histogram" : {
"field" : "price",
"interval" : 0.1,
"min_doc_count" : 1, // optional
"extended_bounds": { // optional
"min": 2.2,
"max": 2.5
}
}
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"prices": {
"buckets": [
{
"key": 2.2,
"doc_count": 1
},
{
"key": 2.5,
"doc_count": 1
}
]
}
}
}
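The interplay between interval, min_doc_count, and extended_bounds is easier to see in a small simulation. The Python sketch below is an illustration, not Elasticsearch's implementation: `bucket_key` and `histogram` are hypothetical helpers, keys are computed by rounding each value down to a multiple of the interval, and Decimal is used only to avoid floating-point noise in the keys.

```python
from decimal import Decimal, ROUND_FLOOR


def bucket_key(value, interval):
    """Round `value` down to the nearest multiple of `interval`."""
    v, step = Decimal(str(value)), Decimal(str(interval))
    return (v / step).to_integral_value(rounding=ROUND_FLOOR) * step


def histogram(values, interval, min_doc_count=0, extended_bounds=None):
    """Return {bucket key: doc count} ordered by key."""
    step = Decimal(str(interval))
    counts = {}
    for v in values:
        key = bucket_key(v, interval)
        counts[key] = counts.get(key, 0) + 1

    # The returned range spans the data, widened by extended_bounds if given.
    edges = list(counts)
    if extended_bounds is not None:
        edges += [bucket_key(b, interval) for b in extended_bounds]
    if edges:
        key, last = min(edges), max(edges)
        while key <= last:
            counts.setdefault(key, 0)  # fill gaps with empty buckets
            key += step

    # Empty buckets survive only when min_doc_count is 0.
    return {float(k): c for k, c in sorted(counts.items()) if c >= min_doc_count}


print(histogram([2.2, 2.5], 0.1))
print(histogram([2.2, 2.5], 0.1, min_doc_count=1, extended_bounds=(2.2, 2.5)))
```

With min_doc_count set to 1, only the 2.2 and 2.5 buckets remain, which matches the result shown above.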
3.1 Sorting
Sort by a sub-aggregation
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"prices" : {
"histogram" : {
"field" : "price",
"interval" : 0.1,
"order" : { "avg1" : "asc" }
},
"aggs" : {
"avg1" : {
"avg" : {
"field": "price"
}
}
}
}
}
}
Sort by key
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"prices" : {
"histogram" : {
"field" : "price",
"interval" : 0.1,
"order" : { "_key" : "desc" }
}
}
}
}
To sort by document count, use _count in place of _key.
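The three ordering options can be pictured as sorting a list of buckets by different fields. In the Python sketch below, the bucket values are hypothetical and the sub-aggregation result is flattened to a plain `avg1` number for brevity (a real response nests it as an object with a value field); `order_buckets` is an illustrative helper, not an Elasticsearch API.

```python
def order_buckets(buckets, field, direction="asc"):
    """Sort buckets by _key, _count, or the name of a sub-aggregation."""
    name = {"_key": "key", "_count": "doc_count"}.get(field, field)
    return sorted(buckets, key=lambda b: b[name], reverse=(direction == "desc"))


# Hypothetical buckets with a flattened avg1 sub-aggregation value.
buckets = [
    {"key": 2.2, "doc_count": 1, "avg1": 2.2},
    {"key": 2.5, "doc_count": 3, "avg1": 2.5},
    {"key": 2.3, "doc_count": 2, "avg1": 2.3},
]

print([b["key"] for b in order_buckets(buckets, "_key", "desc")])
print([b["key"] for b in order_buckets(buckets, "_count", "asc")])
print([b["key"] for b in order_buckets(buckets, "avg1", "asc")])
```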
4. Terms Aggregation
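A multi-bucket aggregation based on a value source, where buckets are built dynamically: one bucket per unique value.
The following example aggregates by name.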
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"nameAgg" : {
"terms" : { "field" : "name" }
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"nameAgg": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "kimchy1",
"doc_count": 2
},
{
"key": "SIMA",
"doc_count": 1
},
{
"key": "SIMA1",
"doc_count": 1
},
{
"key": "SIMA5",
"doc_count": 1
},
{
"key": "SIMA6",
"doc_count": 1
}
]
}
}
}
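Conceptually, the terms aggregation is a group-and-count over the field values. The Python sketch below reproduces the bucket list above from an assumed list of names; `terms_buckets` is a hypothetical helper, and the ordering (document count descending, ties broken by key ascending) is an assumption inferred from the result rather than something the request states.

```python
from collections import Counter


def terms_buckets(values, size=10):
    """One bucket per unique value, largest doc_count first, ties by key."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": k, "doc_count": c} for k, c in ordered[:size]]


# Assumed name values for the six documents.
names = ["SIMA", "kimchy1", "SIMA1", "SIMA5", "kimchy1", "SIMA6"]

for bucket in terms_buckets(names):
    print(bucket)
```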
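5. Avg, Min, Max, Stats, Sum Aggregation
- stats: returns count, max, min, avg, and sum in one aggregation
- extended_stats: adds four results on top of stats: sum of squares, variance, standard deviation, and the interval of the mean plus or minus two standard deviations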
POST twitter/tweet/_search
{
"size": 0,
"aggs" : {
"avg_price" : {
"avg" : {
"field" : "price"
}
},
"sum_price" : {
"sum" : {
"field" : "price"
}
},
"min_price" : {
"min" : {
"field" : "price"
}
},
"max_price" : {
"max" : {
"field" : "price"
}
},
"stats_price" : {
"stats" : {
"field" : "price"
}
},
"extended_stats_price" : {
"extended_stats" : {
"field" : "price"
}
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"max_price": {
"value": 2.5
},
"min_price": {
"value": 2.2
},
"extended_stats_price": {
"count": 2,
"min": 2.2,
"max": 2.5,
"avg": 2.35,
"sum": 4.7,
"sum_of_squares": 11.09,
"variance": 0.022499999999999076,
"std_deviation": 0.1499999999999969,
"std_deviation_bounds": {
"upper": 2.649999999999994,
"lower": 2.050000000000006
}
},
"avg_price": {
"value": 2.35
},
"stats_price": {
"count": 2,
"min": 2.2,
"max": 2.5,
"avg": 2.35,
"sum": 4.7
},
"sum_price": {
"value": 4.7
}
}
}
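The extended_stats numbers above can be checked by hand. The Python sketch below recomputes them for the two prices 2.2 and 2.5, assuming the variance is the population variance (mean of squares minus square of mean) and the bounds are the mean plus or minus two standard deviations; `extended_stats` here is an illustrative function, not a client call.

```python
import math


def extended_stats(values, sigma=2.0):
    """Recompute stats and extended_stats fields for a list of numbers."""
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    # Population variance: E[x^2] - (E[x])^2, clamped against rounding noise.
    variance = max(sum_of_squares / count - avg * avg, 0.0)
    std = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std,
        "std_deviation_bounds": {
            "upper": avg + sigma * std,
            "lower": avg - sigma * std,
        },
    }


print(extended_stats([2.2, 2.5]))
```

The variance works out to 11.09 / 2 - 2.35² = 0.0225, so the standard deviation is 0.15 and the bounds are 2.05 and 2.65, which agrees with the response up to floating-point rounding.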
6. Percentiles Aggregation
6.1 Basic aggregation
The following example returns the percentile distribution of price.
POST schools/classes/_search
{
"size": 0,
"aggs" : {
"per_price" : {
"percentiles" : {
"field" : "price"
}
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"per_price": {
"values": {
"1.0": 1.2,
"5.0": 1.2,
"25.0": 1.45,
"50.0": 2.2,
"75.0": 2.2,
"95.0": 2.3,
"99.0": 2.4
}
}
}
}
Most prices fall between 1.2 and 2.2; occasionally they reach 2.3 or 2.4.
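For intuition about what a percentile means here, the Python sketch below computes exact percentiles by linear interpolation over a sorted list. This is deliberately not the algorithm Elasticsearch uses (its percentiles are approximate), so the numbers will not necessarily match a real response; the sample prices and the `percentile` helper are hypothetical.

```python
def percentile(values, p):
    """Exact percentile `p` (0-100) by linear interpolation between neighbours."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (p / 100) * (len(ordered) - 1)  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical prices, not the data behind the result above.
prices = [1.2, 1.2, 1.7, 2.2, 2.2, 2.3]

for p in (1, 5, 25, 50, 75, 95, 99):
    print(p, percentile(prices, p))
```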
6.2 Specifying which percentiles to return
POST schools/classes/_search
{
"size": 0,
"aggs" : {
"per_price" : {
"percentiles" : {
"field" : "price",
"percents" : [95, 99, 99.9]
}
}
}
}
Result
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 7,
"max_score": 0,
"hits": []
},
"aggregations": {
"per_price": {
"values": {
"95.0": 2.27,
"99.0": 2.2939999999999996,
"99.9": 2.2994
}
}
}
}
6.3 Script
POST schools/classes/_search
{
"size": 0,
"aggs" : {
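"load_time_outlier" : {
"percentiles" : {
"script" : { // assumed from the result below: price divided by 1000
"lang" : "painless",
"source" : "doc['price'].value / params.timeUnit",
"params" : { "timeUnit" : 1000 }
}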
}
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 7,
"max_score": 0,
"hits": []
},
"aggregations": {
"load_time_outlier": {
"values": {
"1.0": 0.0012,
"5.0": 0.0012,
"25.0": 0.0017000000000000001,
"50.0": 0.0022,
"75.0": 0.0022,
"95.0": 0.00227,
"99.0": 0.002294
}
}
}
}
7. Global Aggregation and Filter Aggregation
8. Top Hits Aggregation
9. Percentile_ranks aggregation
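Computes the percentile rank of each given value.
The following example computes the percentage of documents whose postDate is less than or equal to 1574874999000 and to 1574928999000; it is the inverse of section 6.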
POST twitter/tweet/_search
{
"size" : 0,
"aggs": {
"agg_rank": {
"percentile_ranks": {
"field": "postDate",
"values": [
1574874999000,
1574928999000
]
}
}
}
}
Result
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0,
"hits": []
},
"aggregations": {
"agg_rank": {
"values": {
"1.574874999E12": 49.99942403091834,
"1.574928999E12": 100
}
}
}
}
Interpreting the result: roughly 50% (49.9994%) of documents have a postDate at or below 1574874999000, and 100% have a postDate at or below 1574928999000.
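A percentile rank is the reverse lookup of a percentile: given a value, what share of the data lies at or below it. The Python sketch below computes that share by exact counting; Elasticsearch approximates it, which is why the response shows 49.9994 rather than exactly 50. The timestamps and the `percentile_rank` helper are hypothetical.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to `threshold`."""
    if not values:
        raise ValueError("percentile_rank() needs at least one value")
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


# Hypothetical postDate values in epoch milliseconds for six documents.
post_dates = [
    1574870000000, 1574872000000, 1574874999000,
    1574900000000, 1574920000000, 1574928999000,
]

for threshold in (1574874999000, 1574928999000):
    print(threshold, percentile_rank(post_dates, threshold))
```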
10. Nested aggregation