Elasticsearch IK analyzer

1. IK analyzer

The IK Analysis plugin integrates the Lucene IK analyzer (http://code.google.com/p/ik-analyzer/) into Elasticsearch and supports custom dictionaries.

It provides two analyzers, ik_smart and ik_max_word, and two tokenizers with the same names, ik_smart and ik_max_word.

 

Documentation: https://github.com/medcl/elasticsearch-analysis-ik
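As a rough illustration of how the two analyzers are typically combined, the sketch below builds an index body in which a text field is indexed with ik_max_word and searched with ik_smart. The field name "content" and the index name "ik_demo" are assumptions made up for this example, and the mapping uses the typeless format of Elasticsearch 7.x.

```python
import json


def build_ik_mapping(field="content"):
    """Return an index body whose text field uses the IK analyzers.

    The field name is a placeholder chosen for this sketch.
    """
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "analyzer": "ik_max_word",      # analyzer applied at index time
                    "search_analyzer": "ik_smart",  # analyzer applied at query time
                }
            }
        }
    }


body = build_ik_mapping()
print(json.dumps(body, indent=2))

# With a running cluster and the plugin installed, the body could be used as:
# es.indices.create(index="ik_demo", body=body)  # "ik_demo" is a made-up index name
```

Building the body as a plain dict keeps the example runnable without a cluster; only the commented-out call needs a live node.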

 

1.1. Download and installation

 

Releases page: https://github.com/medcl/elasticsearch-analysis-ik/releases

Find the release that matches your Elasticsearch version (7.3.1 here) and download it.

cd your-es-root/plugins/ && mkdir ik    # create the ik directory

unzip the plugin archive into your-es-root/plugins/ik    # extract into ik

Installation: extract the archive into the ik directory, then restart Elasticsearch so that the plugin is loaded.
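The two manual steps above (create plugins/ik, then unzip into it) could also be scripted. The following is a minimal sketch using only the Python standard library; the function name extract_plugin and the archive file name in the usage comment are assumptions for illustration, not part of the plugin's tooling.

```python
import zipfile
from pathlib import Path


def extract_plugin(zip_path, es_root, plugin_dir="ik"):
    """Extract a plugin archive into <es_root>/plugins/<plugin_dir>."""
    target = Path(es_root) / "plugins" / plugin_dir
    target.mkdir(parents=True, exist_ok=True)  # equivalent of: mkdir ik
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)             # equivalent of: unzip into ik
    return target


# Hypothetical usage, assuming the downloaded archive is in the current directory:
# extract_plugin("elasticsearch-analysis-ik-7.3.1.zip", "your-es-root")
```

Elasticsearch still has to be restarted afterwards; the script only places the files.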

 

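Test

from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumption: a node is running on localhost:9200
rv = es.cat.plugins(v=True)
print(rv)

name component   version

**   analysis-ik 7.3.1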
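Instead of reading the table by eye, the text returned by es.cat.plugins(v=True) can be checked programmatically. The sketch below assumes the three-column layout shown above (name, component, version) and node names without spaces; the helper names and the node name "node-1" are made up for the example.

```python
def installed_plugins(cat_output):
    """Parse cat-plugins text output into (name, component, version) tuples."""
    rows = [line.split() for line in cat_output.strip().splitlines() if line.strip()]
    # Drop the header row ("name component version") when it is present
    if rows and rows[0][:2] == ["name", "component"]:
        rows = rows[1:]
    return [tuple(row[:3]) for row in rows if len(row) >= 3]


def has_plugin(cat_output, component="analysis-ik"):
    """Return True if any node reports the given plugin component."""
    return any(row[1] == component for row in installed_plugins(cat_output))


# Sample text in the same shape as the output above; "node-1" is a made-up node name
sample = "name   component   version\nnode-1 analysis-ik 7.3.1\n"
print(has_plugin(sample))
```

With a live client, the same check would be has_plugin(es.cat.plugins(v=True)).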

 

2. Testing the segmentation results

Code

# Tokenization test
def test1():
    # Compare the standard analyzer with the two IK analyzers
    d3 = {
        # Sample sentence (the source text was Chinese; shown here in English translation)
        "text": "The world can be known, and knowledge is a dialectical process of development.",
        "analyzer": "standard",
    }
    # Analyzers to compare
    ana = ["standard", "ik_smart", "ik_max_word"]
    for name in ana:
        d3["analyzer"] = name
        rv = es.indices.analyze(body=d3, format="text")
        # Print only the token strings from the response
        print(name + " segmentation results:", [x["token"] for x in rv["tokens"]])

test1()

Result:

standard segmentation results: ['World', 'sector', 'a', 'available', 'with', 'by', 'recognize', 'know', 'a', 'recognize', 'know', 'a', 'a', 'a', 'debate', 'card', 'send', 'development', 'a', 'over', 'away']

ik_smart segmentation results: ['world', 'yes', 'can', 'is', 'understanding', 'of', 'understanding', 'yes', 'a', 'dialectical', 'development', 'the process of']

ik_max_word segmentation results: ['world', 'yes', 'can', 'is', 'understanding', 'of', 'understanding', 'yes', 'a', 'a', 'a', 'dialectical', 'development', 'a', 'process']

The tokens are shown in English translation. The standard analyzer splits the sentence character by character, ik_smart gives the coarsest word-level split, and ik_max_word the most fine-grained one.
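To make the difference in granularity easier to see, the token lists can be compared by length. The sketch below simply copies the three lists from the results above into a dict and counts them; the helper name token_counts is made up for this example, and no cluster is needed to run it.

```python
# Token lists copied from the results above
results = {
    "standard": [
        "World", "sector", "a", "available", "with", "by", "recognize",
        "know", "a", "recognize", "know", "a", "a", "a", "debate", "card",
        "send", "development", "a", "over", "away",
    ],
    "ik_smart": [
        "world", "yes", "can", "is", "understanding", "of", "understanding",
        "yes", "a", "dialectical", "development", "the process of",
    ],
    "ik_max_word": [
        "world", "yes", "can", "is", "understanding", "of", "understanding",
        "yes", "a", "a", "a", "dialectical", "development", "a", "process",
    ],
}


def token_counts(results):
    """Map each analyzer name to the number of tokens it produced."""
    return {name: len(tokens) for name, tokens in results.items()}


for name, count in token_counts(results).items():
    print(name, count)
```

A lower count means a coarser split, which is one simple way to compare analyzers on the same input.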

 


Origin www.cnblogs.com/wodeboke-y/p/11562837.html