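The last few times I needed to tune the relevance scoring algorithm, I patched the Lucene source package directly, which meant restarting ES every time. As we came to rely on ES more heavily, restarting the cluster became a real hassle. After reading the official documentation I finally found a better solution: define a custom similarity in the index settings and reference it from the field when creating the mapping.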
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/index-modules-similarity.html#bm25
{
"settings": {
"analysis": {
"analyzer": {
"comma": {
"type": "pattern",
"pattern": ","
},
"char": {
"type": "pattern",
"pattern": ""
}
}
},
"index" : {
"similarity" : {
"my_similarity" : {
// name of the custom similarity
"type" : "BM25", // similarity algorithm
"b" : "0.75",
"k1" : "1"
}
}
},
"number_of_replicas": 1,
"number_of_shards": 1
},
"mappings": {
"info": {
"properties": {
"message": {
"type": "text",
"analyzer": "ik_max_word"
, "similarity": "my_similarity" // use the custom similarity
}
}
}
}
}
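The request body above carries inline comments, which a strict JSON client library will generally refuse to serialize or parse. As a minimal sketch, the same settings and mapping can be built as a plain Python dict and dumped to comment-free JSON. The helper name `build_index_body` is hypothetical, the `analysis` section is left out for brevity, and the `ik_max_word` analyzer is assumed to be installed as in the original mapping.

```python
import json


def build_index_body(k1=1.0, b=0.75):
    """Build a create-index body that defines a custom BM25 similarity.

    Hypothetical helper: it mirrors the settings and mapping shown above,
    minus the "analysis" section, with k1 and b passed in as parameters.
    """
    return {
        "settings": {
            "index": {
                "similarity": {
                    # "my_similarity" is the custom name referenced by the field below
                    "my_similarity": {"type": "BM25", "b": str(b), "k1": str(k1)}
                }
            },
            "number_of_replicas": 1,
            "number_of_shards": 1,
        },
        "mappings": {
            "info": {
                "properties": {
                    "message": {
                        "type": "text",
                        "analyzer": "ik_max_word",
                        "similarity": "my_similarity",
                    }
                }
            }
        },
    }


body = build_index_body()
# Comment-free JSON, suitable as the body of a create-index (PUT) request.
print(json.dumps(body, indent=2))
```

Because the similarity lives in the index settings, trying a different `k1` or `b` only means creating an index with a different body, with no Lucene rebuild involved.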
Result (excerpt from the explain output)
{
"value": 1.7618352,
"description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
"details": [
{
"value": 19,
"description": "termFreq=19.0",
"details": []
},
{
"value": 1,
"description": "parameter k1", // the custom value
"details": []
},
{
"value": 0.75,
"description": "parameter b", // the custom value
"details": []
},
{
"value": 432.66666,
"description": "avgFieldLength",
"details": []
},
{
"value": 1337.4694,
"description": "fieldLength",
"details": []
}
]
}
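To double-check that the custom parameters really drive the score, the tfNorm formula from the description string can be recomputed by hand. The sketch below plugs in the values reported in the explain output above. The function name `bm25_tf_norm` is my own, and the result should match the reported 1.7618352 only up to floating-point rounding, since Lucene computes it in single precision.

```python
def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    """tfNorm as described in the explain output:
    (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))
    """
    length_norm = 1 - b + b * field_length / avg_field_length
    return (freq * (k1 + 1)) / (freq + k1 * length_norm)


# Values copied from the explain output, with the custom k1 = 1 and b = 0.75.
tf_norm = bm25_tf_norm(
    freq=19.0,
    k1=1.0,
    b=0.75,
    field_length=1337.4694,
    avg_field_length=432.66666,
)
print(f"custom k1=1.0: {tf_norm:.4f}")

# For comparison, the same document with Elasticsearch's default k1 = 1.2.
default_tf_norm = bm25_tf_norm(
    freq=19.0,
    k1=1.2,
    b=0.75,
    field_length=1337.4694,
    avg_field_length=432.66666,
)
print(f"default k1=1.2: {default_tf_norm:.4f}")
```

With a smaller k1 the term-frequency contribution saturates sooner, which is why the custom setting gives a lower tfNorm than the default for a term that occurs 19 times.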