How to Add the IK Analyzer (Word Segmenter) to Elasticsearch


Updated 2025-04-16. Source: SLTechnology News&Howtos (Shulou)


Many newcomers are unsure how to add the IK analyzer (a Chinese word segmenter) to Elasticsearch (ES). This article covers the installation, a quick test of the analyzer, and a set of basic document and search requests to try afterwards.

1. Download the IK analyzer. Its version must match your ES version exactly.

2. Unpack the download into the plugins directory of your ES installation, then restart the ES service.
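After the restart it can be worth confirming that the node reports the plugin and that both versions agree. The following is a minimal Python sketch of such a check, not an official tool. It assumes a node at http://localhost:9200 without authentication, and it assumes the plugin is listed under the component name analysis-ik. The helper names fetch_json, ik_version_status and check_local_node are made up for this illustration.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def ik_version_status(root_info, plugins):
    """Compare the ES version with the installed IK plugin version.

    root_info: parsed body of GET /
    plugins:   parsed body of GET /_cat/plugins?format=json
    Returns "missing", "mismatch" or "ok".
    """
    es_version = root_info["version"]["number"]
    # "analysis-ik" is the assumed component name of the IK plugin.
    ik_versions = [p["version"] for p in plugins if p["component"] == "analysis-ik"]
    if not ik_versions:
        return "missing"
    if any(version != es_version for version in ik_versions):
        return "mismatch"
    return "ok"


def check_local_node(base="http://localhost:9200"):
    """Query a running node (assumed address) and report the plugin status."""
    return ik_version_status(
        fetch_json(base + "/"),
        fetch_json(base + "/_cat/plugins?format=json"),
    )
```

Calling check_local_node() against a running node returns one of the three status strings. The comparison itself lives in ik_version_status so that it can be tried on saved responses without a node.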

Test: http://localhost:9200/blog1/_analyze

{"text": "people's Republic of China MN", "tokenizer": "ik_max_word"}

Results:

{"tokens": [{"token": "people's Republic of China", "start_offset": 0, "end_offset": 7, "type": "CN_WORD", "position": 0}, {"token": "Chinese people" "start_offset": 0, "end_offset": 4, "type": "CN_WORD", "position": 1}, {"token": "Zhonghua", "start_offset": 0, "end_offset": 2, "type": "CN_WORD" "position": 2}, {"token": "Chinese", "start_offset": 1, "end_offset": 3, "type": "CN_WORD", "position": 3}, {"token": "people's Republic" "start_offset": 2, "end_offset": 7, "type": "CN_WORD", "position": 4}, {"token": "people", "start_offset": 2, "end_offset": 4, "type": "CN_WORD" "position": 5}, {"token": "Republic", "start_offset": 4, "end_offset": 7, "type": "CN_WORD", "position": 6}, {"token": "Republic" "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 7}, {"token": "country", "start_offset": 6, "end_offset": 7, "type": "CN_CHAR" "position": 8}, {"token": "mn", "start_offset": 7, "end_offset": 9, "type": "ENGLISH", "position": 9}]}

What's the difference between ik_max_word and ik_smart?

ik_max_word: splits the text at the finest granularity. For example, "中华人民共和国国歌" ("National Anthem of the People's Republic of China") is split into tokens such as 中华人民共和国, 中华人民, 中华, 人民共和国, 人民, 共和国, 共和 and 国歌, exhausting every possible combination. It is suitable for term queries.

ik_smart: performs the coarsest split. For example, "中华人民共和国国歌" ("National Anthem of the People's Republic of China") is split into just 中华人民共和国 and 国歌. It is suitable for phrase queries.
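To make the difference in granularity concrete, here is a toy Python sketch. It is not IK's actual algorithm or dictionary: the word list is a small hand-picked assumption, finest_split simply emits every dictionary word found at any position, and coarsest_split uses greedy longest match. Both function names are invented for this illustration.

```python
# A tiny hand-picked dictionary (assumption, not IK's real word list).
WORDS = {
    "中华人民共和国", "中华人民", "中华", "华人",
    "人民共和国", "人民", "共和国", "共和", "国歌",
}


def finest_split(text, dictionary):
    """Emit every dictionary word found at any position (longest first per start)."""
    tokens = []
    for start in range(len(text)):
        for end in range(len(text), start, -1):
            if text[start:end] in dictionary:
                tokens.append(text[start:end])
    return tokens


def coarsest_split(text, dictionary):
    """Greedy longest match from left to right; unknown characters stand alone."""
    tokens, start = [], 0
    while start < len(text):
        for end in range(len(text), start, -1):
            if text[start:end] in dictionary:
                break
        else:
            end = start + 1  # no dictionary word starts at this position
        tokens.append(text[start:end])
        start = end
    return tokens


print(finest_split("中华人民共和国国歌", WORDS))
print(coarsest_split("中华人民共和国国歌", WORDS))
```

The first call prints nine overlapping tokens and the second prints only 中华人民共和国 and 国歌. That mirrors the trade-off described above: overlapping tokens give more ways to match a term, while non-overlapping tokens keep positions clean for phrase matching.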

# Test the analyzers
GET _analyze
{"analyzer": "ik_smart", "text": "I love you China"}

GET _analyze
{"analyzer": "ik_max_word", "text": "I love you China"}

# Store a document
PUT /test3/_doc/1
{"name": "Shi Ye", "age": 13, "birth": "2020-07-05"}

# Modify the document with a full overwrite (PUT replaces the whole document, so the omitted birth field is removed)
PUT /test3/_doc/1
{"name": "Shi Ye222", "age": 13}

# Modify the document with a partial update (only name changes, everything else is kept)
POST /test3/_doc/1/_update
{"doc": {"name": "I modified it in post"}}

# Get the index structure
GET /test3

# Get the document
GET /test3/_doc/1

# List all indices with their statistics
GET _cat/indices?v

# Delete the document
DELETE /test3/_doc/1

# Store a new document
PUT /shiye/user/6
{"name": "shiye Shi achieved good results", "age": 30, "desc": "the operation is as fierce as a tiger, but the record is 0-5", "tags": ["pretty boy", "travel", "mountain climbing"]}

# Get a document by id
GET /shiye/user/A001

# Search by name with a URI query
GET /shiye/user/_search?q=name:shiye

# Search with a request body and return only the fields listed in _source
GET /shiye/user/_search
{"query": {"match": {"name": "shiye"}}, "_source": ["name", "desc", "age"]}

# must: every clause has to match (equivalent to AND)
GET /shiye/user/_search
{"query": {"bool": {"must": [{"match": {"name": "shiye"}}, {"match": {"name": "Master Shi"}}]}}}

# should: at least one clause has to match (equivalent to OR)
GET /shiye/user/_search
{"query": {"bool": {"should": [{"match": {"name": "shiye"}}, {"match": {"name": "Master Shi"}}]}}}

# must + filter: restrict the matches to an age range
GET /shiye/user/_search
{"query": {"bool": {"must": [{"match": {"name": "shiye"}}], "filter": {"range": {"age": {"gte": 20, "lte": 40}}}}}}

# Match query on the tags array
GET /shiye/user/_search
{"query": {"match": {"tags": "Mountain"}}}

# text vs. keyword: a text field is analyzed (split into tokens), a keyword field is not
# Specify the type of each field when creating the index
PUT testdb
{"mappings": {"properties": {"name": {"type": "text"}, "desc": {"type": "keyword"}}}}

# Add data
PUT testdb/_doc/2
{"name": "Wusong", "desc": "fighting Tiger"}

# Query the keyword field (only the complete, exact value matches)
GET testdb/_search
{"query": {"match": {"desc": "military"}}}

# Query with highlighting; pre_tags and post_tags hold the markup placed before and after each match (the tag values are elided here)
GET testdb/_search
{"query": {"match": {"name": "Shi"}}, "highlight": {"pre_tags": "...", "post_tags": "...", "fields": {"name": {}}}}

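The bool queries above share one shape, so when they are issued from a script the request body can be assembled from small parts. The following Python sketch shows one possible way to do that. The helper names match, age_range and bool_search are invented for this illustration and do not belong to any Elasticsearch client; the resulting dictionary would still have to be serialized as JSON and sent to a _search endpoint.

```python
def match(field, text):
    """A single match clause, e.g. {"match": {"name": "shiye"}}."""
    return {"match": {field: text}}


def age_range(gte, lte):
    """A range clause on the age field, bounds inclusive."""
    return {"range": {"age": {"gte": gte, "lte": lte}}}


def bool_search(must=(), should=(), filters=(), source=None):
    """Assemble a search body with a bool query.

    must    -> all clauses have to match (AND)
    should  -> at least one clause has to match when used alone (OR)
    filters -> clauses that restrict results without affecting the score
    source  -> optional list of fields to return in _source
    """
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if should:
        clauses["should"] = list(should)
    if filters:
        clauses["filter"] = list(filters)
    body = {"query": {"bool": clauses}}
    if source is not None:
        body["_source"] = list(source)
    return body


# The must + filter request, rebuilt from the helpers.
body = bool_search(must=[match("name", "shiye")], filters=[age_range(20, 40)])
print(body)
```

Here the filter is emitted as a one-element list rather than a single object; Elasticsearch accepts either form for bool clauses.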