Elasticsearch notes summary

# Initialize an index
curl -XPUT 'http://192.168.17.128:9200/library/' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  }
}'
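
For readers who prefer a script to curl, the same index-creation request can be prepared with Python's standard library. This is only a sketch: the helper name build_create_index_request is made up for illustration, the host is the one from the curl command, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, shards=5, replicas=1):
    """Build (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        # Elasticsearch 6+ rejects request bodies without a Content-Type header.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_create_index_request("http://192.168.17.128:9200", "library")
print(req.get_method(), req.full_url)
# To actually send it against a running cluster: urllib.request.urlopen(req)
```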

# Get the index settings
curl -XGET 'http://192.168.17.128:9200/library/_settings'

# Get multiple documents
curl 'localhost:9200/_mget' -H 'Content-Type: application/json' -d '{
  "docs": [
    {
      "_index": "shakespeare",
      "_type": "line",
      "_id": 6,
      "_source": "play_name"
    },
    {
      "_index": "shakespeare",
      "_type": "line",
      "_id": 28,
      "_source": "play_name"
    }
  ]
}'


Built-in fields	`_uid, _id, _type, _source, _all, _analyzer, _boost, _parent, _routing, _index, _size, _timestamp, _ttl`
Field types	`String, Integer/long, Float/double, Boolean, Null, Date`

GET _search
{
  "query": {
    "match_all": {}
  }
}

# Delete an index
DELETE /kibana_sample_data_logs

DELETE /.kibana

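# Initialize an index
# Other index settings that can be used in place of number_of_replicas:
# blocks.read_only: true makes the index read-only
# blocks.read: true disables read operations
# blocks.write: true disables write operations
# blocks.metadata: true disables metadata operations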
PUT /library
{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  }
}

PUT /library2
{
  "settings": {
    "index": {
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  }
}

# Get index settings
GET /_settings

GET /_all/_settings

GET /library/_settings

GET /library2/_settings

GET /.kibana/_settings

GET /library,library2/_settings


# Create a document
#! Deprecation: [types removal] Specifying types in document index requests is
# deprecated, use the typeless endpoints instead (/{index}/_doc/{id}, /{index}/_doc, or /{index}/_create/{id}).
PUT /library/books/1
{
  "title": "Elasticsearch: The Definitive Guide",
  "name": {
    "first": "Zachary",
    "last": "Tang"
  },
  "publish_date": "2015-02-06",
  "price": "49.99"
}


PUT /library/_create/2
{
  "title": "Elasticsearch Blueprints",
  "name": {
    "first": "Vineeth",
    "last": "Mohan"
  },
  "publish_date": "2015-06-06",
  "price": "35.99"
}


POST /library/_doc
{
  "title": "Elasticsearch Test Book",
  "name": {
    "first": "Vineeth",
    "last": "Mohan"
  },
  "publish_date": "2015-06-06",
  "price": "77.99"
}

GET /library/_search

GET _search
{
  "query": {
    "match_all": {}
  }
}

# Get a document by ID
GET /library/_doc/1

GET /library/_doc/2

GET /library/_doc/PF1tVW4BpUJa22bdv_cr

# Get selected fields of a document by ID
GET /library/_doc/1?_source=price

GET /library/_doc/1?_source=price,title

GET /library/_doc/1?_source=price,name.last

# Update
#! Deprecation: [types removal] Specifying types in document update requests is
# deprecated, use the endpoint /{index}/_update/{id} instead.
POST /library/_doc/1/_update
{
  "doc": {
    "price": "199.99"
  }
}

POST /library/_update/1
{
  "doc": {
    "price": "49.98"
  }
}


# Delete
DELETE /library/_doc/PF1tVW4BpUJa22bdv_cr

GET /library2/_search

DELETE /library2


# Get multiple documents at once
GET /_mget
{
  "docs": [
    {
      "_id": 3,
      "_index": "library",
      "_source": ["price", "title"]
    },
    {
      "_id": "Pl2AVW4BpUJa22bd_vd-",
      "_index": "library",
      "_source": "name.first"
    },
    {
      "_id": "Pl2AVW4BpUJa22bd_vd-",
      "_index": "library"
    }
  ]
}

GET /library/_mget
{
  "ids": [2, 3]
}

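Bulk operations (bulk)

Action	Description
create	Create the document only if it does not already exist
index	Create a new document or replace an existing one
update	Partially update a document
delete	Delete a document

# Request body format
{action: {metadata}}\n
{request body}\n
{action: {metadata}}\n
{request body}\n

# Example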
{"delete": {"_index": "library","_type":"books","_id": "1"}}


POST /library2/_bulk
{"create":{"_index":"music", "_id": "1"}}
{"title": "Ave Verum Corpus"}
{"create":{"_index":"music", "_id": "2"}}
{"title": "Ave Verum Corpus2"}
{"create":{"_index":"music", "_id": "3"}}
{"title": "Ave Verum Corpus3"}
{"create":{"_index":"music", "_id": "4"}}
{"title": "Ave Verum Corpus4"}
{"create":{"_index":"music", "_id": "5"}}
{"title": "Ave Verum Corpus5"}
{"create":{"_index":"music", "_id": "6"}}
{"title": "Ave Verum Corpus6"}
{"create":{"_index":"music", "_id": "7"}}
{"title": "Ave Verum Corpus7"}
{"create":{"_index":"music", "_id": "8"}}
{"title": "Ave Verum Corpus8"}

GET /music/_search

GET /music/_mget
{
  "ids": ["2","3","4"]
}

POST /music/_bulk
{"update":{"_index":"music", "_id": "1"}}
{"doc": {"title": "update Ave Verum Corpus"}}
{"create":{"_index":"music", "_id": "9"}}
{"title": "Ave Verum Corpus9"}
{"delete":{"_index":"music", "_id": "3"}}
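
The bulk request body format described above is newline-delimited JSON, which is easy to get wrong by hand. The following is a minimal sketch of generating it; build_bulk_body is a hypothetical helper name, and the operations simply mirror the last request above.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into a bulk NDJSON body.

    source is None for actions such as delete that carry no body line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            lines.append(json.dumps(source))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("update", {"_index": "music", "_id": "1"}, {"doc": {"title": "update Ave Verum Corpus"}}),
    ("create", {"_index": "music", "_id": "9"}, {"title": "Ave Verum Corpus9"}),
    ("delete", {"_index": "music", "_id": "3"}, None),
])
print(body, end="")
```

Keeping each action and its source on exactly one line matters because the bulk endpoint splits the body on newlines rather than parsing it as a single JSON document.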

Version control

GET /library/_doc/1

# Note: the update API supports only internal versioning, so version_type=external is not accepted here
POST /library/_update/1
{
  "doc": {
    "name": {
      "last": "Gui2"
    }
  }
}

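# External versioning needs an explicit version parameter and the index endpoint rather than _create (5 is an arbitrary example value)
PUT /library/_doc/abcde?version=5&version_type=external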
{
  "title": "Elasticsearch Test Book",
  "name": {
    "first": "Vineeth",
    "last": "Mohan"
  },
  "publish_date": "2015-06-06",
  "price": "77.99"
}
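
The rule behind external versioning is that a write is accepted only when its version number is greater than the one already stored. The sketch below is a toy in-memory model of that rule, not Elasticsearch code; the store dictionary, the function name, and the version numbers are all made up for illustration.

```python
class VersionConflict(Exception):
    """Raised when a write carries a version that is not newer than the stored one."""


def index_with_external_version(store, doc_id, source, version):
    """Toy model: accept the write only if version is greater than the stored version."""
    current = store.get(doc_id)
    if current is not None and version <= current["_version"]:
        raise VersionConflict(
            f"version {version} is not greater than current version {current['_version']}"
        )
    store[doc_id] = {"_version": version, "_source": source}
    return store[doc_id]


store = {}
index_with_external_version(store, "abcde", {"price": "77.99"}, version=5)
index_with_external_version(store, "abcde", {"price": "79.99"}, version=6)
try:
    # Re-sending version 6 is a stale write and must be rejected.
    index_with_external_version(store, "abcde", {"price": "1.00"}, version=6)
    stale_write_rejected = False
except VersionConflict:
    stale_write_rejected = True
print(store["abcde"]["_version"], stale_write_rejected)
```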

Mapping

Type	ES type	Description
String, Varchar, Text	string (text or keyword since 5.0)	A text field: such as a nice text and CODE0011
Integer	integer	An integer (32 bit): such as 1, 2, 3, 4
Long	long	A long value (64 bit)
Float	float	A floating-point number (32 bit): such as 1.2, 4.5
Double	double	A floating-point number (64 bit)
Boolean	boolean	A boolean value: such as true, false
Date/Datetime	date	A date or datetime value: such as 2013-12-25, 2013-12-25T22:21:20
Bytes/Binary	binary	This is used for binary data such as a file or stream of bytes

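Field attributes (legacy syntax from before 5.0; current versions use "index": true/false and the text/keyword types)

Attribute	Description	Applies to
store	yes or no; yes stores the field value, no does not; default no	all
index	analyzed, not_analyzed, or no; analyzed: indexed and analyzed; not_analyzed: indexed but not analyzed; no: not indexed, so the field cannot be searched; default analyzed	string; other types accept only no or not_analyzed
null_value	Default value to use when the field is null, e.g. "null_value": "NA"	all
boost	Weight of the field; default 1.0	all
index_analyzer	Analyzer used at index time	all
search_analyzer	Analyzer used at search time	all
analyzer	Analyzer used at both index and search time; Elasticsearch uses the standard analyzer by default, and whitespace, simple, and english are three other built-in analyzers	all
include_in_all	By default Elasticsearch defines a special _all field for every document so that every field can be searched through it; set include_in_all to false on a field to keep it out of _all; default true	all
index_name	Name of the field in the index; defaults to the field's own name	all
norms	Norms hold the normalization factors used to compute scores at query time; true for analyzed fields, false for not_analyzed fields	all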

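Static mapping
Dynamic mapping

When a document contains a field that has not been seen before, dynamic mapping automatically determines the field's type and adds a mapping for it.

How to configure dynamic mapping:

dynamic: true (default) adds new fields dynamically; false ignores new fields; strict throws an exception when an unknown field is encountered

Scope: can be set on the root object or on any field of type object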
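
The three dynamic settings can be pictured with a small toy model. This is not how Elasticsearch is implemented; apply_dynamic, the type labels, and the sample document are assumptions made up for illustration, and only top-level fields are handled.

```python
class StrictDynamicMappingError(Exception):
    """Raised when dynamic is "strict" and an unmapped field appears."""


def apply_dynamic(mapping, document, dynamic=True):
    """Toy model of dynamic: true / false / "strict" for top-level fields.

    mapping maps field names to a type label and is updated in place.
    Returns the field names that end up mapped (searchable) for this document.
    """
    indexed = []
    for field, value in document.items():
        if field in mapping:
            indexed.append(field)
        elif dynamic == "strict":
            raise StrictDynamicMappingError(f"unknown field [{field}]")
        elif dynamic is True:
            mapping[field] = type(value).__name__  # crude stand-in for type detection
            indexed.append(field)
        # dynamic is False: the field stays in _source but is not added to the mapping
    return indexed


doc = {"title": "Elasticsearch Blueprints", "pages": 192}  # made-up sample document

mapping_true = {"title": "text"}
fields_true = apply_dynamic(mapping_true, doc, dynamic=True)

mapping_false = {"title": "text"}
fields_false = apply_dynamic(mapping_false, doc, dynamic=False)

try:
    apply_dynamic({"title": "text"}, doc, dynamic="strict")
    strict_rejected = False
except StrictDynamicMappingError:
    strict_rejected = True

print(fields_true, fields_false, strict_rejected)
```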


GET /library/_mapping

# Create a mapping
PUT /library3
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
         "name": {"type":"text", "index": false},
         "title": {"type": "text"},
         "publish_date": {"type": "date", "index": false},
         "price": {"type": "double"},
         "number": {"type": "integer"}
    }
  }
}

GET /library3/_mapping

PUT /library4
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
         "name": {"type":"text", "index": false},
         "title": {"type": "text"},
         "publish_date": {"type": "date", "index": false},
         "price": {"type": "double"},
         "number": {"type": "object", "dynamic": true}
    }
  }
}

GET /library4/_mapping

GET /_all/_mapping

GET /library,library4/_mapping

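# Updating a mapping
# Unfortunately, once a mapping exists, the mappings of existing fields cannot be changed
# To replace the current mapping you have to create a new index and define the mapping again,
# then import the data from the old index into the new one
# ----- Procedure -----
# 1. Define an alias for the existing index and point it at that index
# 2. Run PUT /existing_index/_alias/alias_a
# 3. Create a new index with the updated mapping and copy the data into it (for example with POST /_reindex)
# 4. Point the alias at the new index and remove it from the old index
# 5. POST /_aliases
# {
#  "actions": [
#    {"remove": {"index": "existing_index", "alias": "alias_a"}},
#    {"add": {"index": "new_index","alias":"alias_a"}}
#  ]
# }
# Note: these steps give a smooth index transition with zero downtime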
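
The request bodies used in this procedure can be generated rather than typed by hand. This is a sketch only: the helper names and the index and alias names (library3, library3_v2, library_alias) are hypothetical, and nothing is sent to a cluster.

```python
import json


def reindex_body(old_index, new_index):
    """Body for POST /_reindex: copy documents from old_index into new_index."""
    return {"source": {"index": old_index}, "dest": {"index": new_index}}


def alias_swap_actions(old_index, new_index, alias):
    """Body for POST /_aliases: move the alias from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


copy = reindex_body("library3", "library3_v2")
swap = alias_swap_actions("library3", "library3_v2", "library_alias")
print(json.dumps(copy))
print(json.dumps(swap, indent=2))
```

Sending the remove and add actions in one _aliases request is what makes the switch appear instantaneous to clients that query through the alias.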

# Delete a mapping (by deleting the index)
DELETE /library4

Queries

Basic queries
Compound queries
Filtering


GET /library/_search

GET /library/_search?q=title:guide
GET /library/_search?q=name.last:mohan
GET /library/_search?q=name.last:*a*

GET /_search?q=title:guide

# term query
GET /library/_search
{
  "query": {
    "term": {
      "title": "guide"
    }
  }
}

GET /library/_search
{
  "query": {
    "term": {
      "title": {
        "value": "definitive"
      }
    }
  }
}

# terms query
GET /library/_search
{
  "query": {
    "terms": {
      "title": ["elasticsearch", "kibana"]
    }
  }
}


# from, size
GET /library/_search

GET /library/_search
{
  "from": 0,
  "size": 2
}
GET /library/_search
{
  "from": 2,
  "size": 2
}
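
The from and size values in the two requests above follow directly from a page number. A minimal sketch, assuming 1-based page numbers; page_params is a hypothetical helper name.

```python
def page_params(page, size):
    """Return the from/size pair for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}


first_page = page_params(1, 2)
second_page = page_params(2, 2)
print(first_page, second_page)
```

By default, a request whose from + size exceeds the index.max_result_window setting (10000) is rejected, so this style of paging suits shallow result sets.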

# Return the version number (_version)
GET /library/_search
{
  "from": 0,
  "size": 2,
  "version": true
}


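# Difference between match and term: a match query analyzes the query text with the analyzer appropriate for the given field,
# whereas a term query skips analysis and looks up the term exactly as given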

GET /library/_search
{
  "query": {
    "match": {
      "title": "and"
    }
  }
}

GET /library/_search
{
  "query": {
    "match": {
      "price": "55.70"
    }
  }
}
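
The difference between term and match noted earlier comes down to whether the query text is analyzed. The toy model below makes that visible; simple_analyze is only a rough stand-in for the standard analyzer (lowercasing and splitting on non-alphanumeric characters), and all function names are made up for illustration.

```python
import re


def simple_analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def term_matches(indexed_text, term):
    """term query: the term is compared as-is with the indexed tokens."""
    return term in simple_analyze(indexed_text)


def match_matches(indexed_text, query_text):
    """match query: the query text is analyzed first; any shared token is a hit."""
    return bool(set(simple_analyze(indexed_text)) & set(simple_analyze(query_text)))


title = "Elasticsearch: The Definitive Guide"
print(term_matches(title, "Guide"))   # False: the index holds the lowercased token
print(term_matches(title, "guide"))   # True
print(match_matches(title, "Guide"))  # True: the query text is lowercased as well
```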

# match_all returns every document in the given index
GET /library/_search
{
  "query": {
    "match_all": {}
  }
}

# match_phrase
# Phrase query; slop defines how many unknown words may sit between the keywords
GET /library/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "Elasticsearch guide",
        "slop": 2
      }
    }
  }
}

# multi_match
# Several fields can be specified
# For example, find documents whose title or preview field contains the keyword elasticsearch
GET /library/_search
{
  "query": {
    "multi_match": {
      "query": "elasticsearch",
      "fields": ["title"]
    }
  }
}

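# Specify which fields are returned
# Note: _source filtering works for any field kept in _source; the "store: yes" requirement applies only to stored_fields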
GET /library/_search
{
  "_source": ["title"],
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

# Control which fields are loaded
# The legacy partial_fields option is no longer accepted ("Unknown key for a START_OBJECT in [partial_fields]."); _source includes/excludes replaces it
# Wildcards are also allowed, e.g. "includes": ["pr*"]
GET /library/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": ["price"],
    "excludes": ["title"]
  }
}


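# Sorting
# price was indexed as a string, so it is mapped as text and sorting on it directly fails with:
# Set fielddata=true on [price] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
# Sorting on the dynamically created price.keyword sub-field works, but the order is lexicographic, not numeric
GET /library/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price.keyword": {
        "order": "desc"
      }
    }
  ]
}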

# prefix query: match by prefix
GET /library/_search
{
  "query": {
    "prefix": {
      "title": {
        "value": "ela"
      }
    }
  }
}

# range query: restrict values to a range
# legacy parameters: from, to, include_lower, include_upper
# (whether the lower/upper bound is included)
GET /library/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 20,
        "lte": 60
      }
    }
  }
}


GET /library/_search
{
  "query": {
    "range": {
      "price": {
        "from": 50,
        "to": 60
      }
    }
  }
}

# wildcard query: allows the wildcards * and ?
# Note: this kind of query hurts performance
GET /library/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "*o?*"
      }
    }
  }
}



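# fuzzy query
# value: the search keyword
# boost: weight of the query, default 1.0
# min_similarity: minimum similarity required for a match (replaced by fuzziness in current versions)
# default 0.5; for strings the value ranges from 0 to 1 inclusive, for numbers it may exceed 1, and for dates it takes values such as 1d, 2d, 1m (1d means one day)
# prefix_length: length of the common prefix that must match exactly, default 0
# max_expansions: maximum number of terms the query may expand to; unbounded in old versions, 50 in current ones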
GET /library/_search
{
  "query": {
    "fuzzy": {
      "title": "ank"
    }
  }
}
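
Fuzzy matching is based on edit distance between the query term and indexed terms. The sketch below computes plain Levenshtein distance to show the idea; it is an illustration only, and Elasticsearch's own implementation differs in details (for example, it can count a transposition of two adjacent characters as a single edit).

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("ank", "and"))
print(levenshtein("kitten", "sitting"))
```

A term within the allowed distance of the query term (one edit for "ank" and "and") is a candidate match.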


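# more_like_this query
# fields: the fields to use; default _all
# like_text: the text to find similar documents for (named like in current versions, as in the request below)
# percent_terms_to_match: fraction of terms a document must match to count as similar; default 0.3, i.e. 30%
# min_term_freq: minimum term frequency below which a term in the input is ignored; default 2
# max_query_terms: maximum number of terms in the generated query; default 25
# stop_words: set of words to ignore
# min_doc_freq: minimum number of documents a term must appear in to be kept; default 5
# max_doc_freq: maximum number of documents a term may appear in before it is ignored; default unbounded
# min_word_len: minimum word length; shorter words are ignored; default 0
# max_word_len: maximum word length; longer words are ignored; default unbounded
# boost_terms: boost factor applied to each term; default 1
# boost: boost applied to the whole query; default 1.0
# analyzer: analyzer used to analyze the text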
GET /library/_search
{
  "query": {
    "more_like_this": {
      "fields": [
        "title"
      ],
      "like": "elasticsearch",
      "min_term_freq": 1,
      "max_query_terms": 12
    }
  }
}


# Create test data
POST /store/_bulk
{"index":{"_id":1}}
{"price":10,"productID":"SD1002136"}
{"index":{"_id":2}}
{"price":10,"productID":"SD2678421"}
{"index":{"_id":3}}
{"price":10,"productID":"SD8897573"}
{"index":{"_id":4}}
{"price":10,"productID":"SD4535233"}
{"index":{"_id":5}}
{"price":10,"productID":"SD4310944"}


GET /store/_mget
{
  "ids": [1, 2, 3, 4]
}

GET /store/_mapping

# Simplest filter query
# The legacy filtered query fails with "no [query] registered for [filtered]"; a bool query with a filter clause replaces it
GET /store/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "price": 20
        }
      }
    }
  }
}

# Filter on several values: use terms (not term) for a list
# The legacy filtered form fails with "no [query] registered for [filtered]"
GET /store/_search
{
  "query": {
    "bool": {
      "filter": {
        "terms": {
          "price": [10, 20]
        }
      }
    }
  }
}

# Filter on productID
# The legacy filtered form fails with "no [query] registered for [filtered]"
GET /store/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "productID": "SD2678421"
        }
      }
    }
  }
}

# Inspect how the analyzer tokenizes the text
GET /_analyze
{
  "text": "SD2678421"
}

GET /_analyze
{
  "text": ["SD2678421", "SD1002136"]
}


GET /store/_mapping

DELETE /store

# Recreate the mapping so that productID is not analyzed
# (the keyword type replaces the legacy "index": "not_analyzed" string setting)
PUT /store
{
  "mappings": {
    "properties": {
      "productID": {
        "type": "keyword"
      }
    }
  }
}

Boolean filter queries
Format
{
  "bool": {
    "must": [],
    "should": [],
    "must_not": []
  }
}
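
How the three clause types in this format combine can be shown with a toy evaluator. This is a sketch, not Elasticsearch's implementation: it ignores scoring and the filter clause, models each clause as a Python predicate, and runs on made-up sample documents.

```python
def bool_matches(doc, must=(), should=(), must_not=()):
    """Toy evaluation of a bool query; each clause is a predicate on the document."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # With no must clause, at least one should clause has to match.
    if should and not must:
        return any(clause(doc) for clause in should)
    return True


def term(field, value):
    """Predicate that mimics an exact-value term clause."""
    return lambda doc: doc.get(field) == value


# Made-up sample documents (prices differ so each clause has a visible effect).
docs = [
    {"price": 10, "productID": "SD1002136"},
    {"price": 20, "productID": "SD2678421"},
    {"price": 30, "productID": "SD4535233"},
]
hits = [
    d["productID"]
    for d in docs
    if bool_matches(
        d,
        should=[term("price", 20), term("productID", "SD4535233")],
        must_not=[term("price", 30)],
    )
]
print(hits)
```

The third document satisfies a should clause but is still excluded, because must_not always wins.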


GET /store/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "price": {
              "value": "20"
            }
          }
        },
        {
          "term": {
            "productID": "sd4535233"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "price": {
              "value": "30"
            }
          }
        }
      ]
    }
  }
}

# Nested bool query
GET /store/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "price": {
              "value": "10"
            }
          }
        },
        {
          "bool": {
            "must": [
              {
                "term": {
                  "productID": {
                    "value": "SD4535233"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

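# and / or / not
# The legacy filtered query and its and/or/not filters fail with "no [query] registered for [filtered]";
# in a bool query, filter or must acts as and, should as or, and must_not as not
GET /store/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "price": "10"
          }
        },
        {
          "term": {
            "productID": "SD2678421"
          }
        }
      ]
    }
  }
}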



# range
# gt/gte/lt/lte
GET /store/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}

#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
POST /test_index/test/_bulk
{"index":{"_id":"1"}}
{"tags":["search"]}
{"index":{"_id":"2"}}
{"tags":["search", "open_source"]}
{"index":{"_id":"3"}}
{"other_field":"some data"}
{"index":{"_id":"4"}}
{"tags":null}
{"index":{"_id":"5"}}
{"tags":["search",null]}



# Documents in which tags has a non-null value
GET /test_index/_search
{
  "query": {
    "exists": {
      "field": "tags"
    }
  }
}

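# Null or missing values
# The legacy missing query fails with "no [query] registered for [missing]"; must_not with exists replaces it
GET /test_index/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "tags"
        }
      }
    }
  }
}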
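
Which of the five test documents count as having a value for tags can be checked with a toy model. This is a sketch of the rule "a field exists if it holds at least one non-null value", not Elasticsearch code; has_value is a hypothetical helper name.

```python
def has_value(doc, field):
    """Toy version of the exists check: true if the field holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(item is not None for item in values)


# The same five documents as in the bulk request for test_index.
docs = {
    "1": {"tags": ["search"]},
    "2": {"tags": ["search", "open_source"]},
    "3": {"other_field": "some data"},
    "4": {"tags": None},
    "5": {"tags": ["search", None]},
}
exists_ids = [doc_id for doc_id, doc in docs.items() if has_value(doc, "tags")]
missing_ids = [doc_id for doc_id, doc in docs.items() if not has_value(doc, "tags")]
print(exists_ids, missing_ids)
```

Document 5 counts as existing because one of its array entries is non-null, while documents 3 and 4 fall on the missing side.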

# _cache: filter caching

bool query: must/should/must_not
boosting query: positive/negative
constant_score query
indices query

xchenhao
