How to Add a Remote Dictionary to the ES IK Analyzer

2025-01-18 Update From: SLTechnology News&Howtos

This article explains how to add a remote dictionary to the ES (Elasticsearch) IK analyzer, so that new words can be hot-updated without restarting the ES instance.

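Modify the remote dictionary entries in the IKAnalyzer.cfg.xml configuration file. In the IK plugin these are the entries with the keys remote_ext_dict (remote extension dictionary) and remote_ext_stopwords (remote stop-word dictionary), shown here with the placeholder value URLS.

Replace URLS with the remote dictionary addresses. Multiple addresses are separated by semicolons (;).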
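
To make the semicolon-separated format concrete, here is a minimal sketch that reads such a configuration and splits an entry into individual URLs. It assumes the file follows the standard Java XML properties format and uses the entry keys remote_ext_dict and remote_ext_stopwords; the class name, the sample XML, and the extra.txt and stop.txt URLs are hypothetical, and this is not the plugin's own parsing code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, for illustration only.
class RemoteDictConfig {

    // Sample configuration; all URLs are placeholders, not real endpoints.
    static final String SAMPLE_CONFIG =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
      + "<properties>\n"
      + "  <entry key=\"remote_ext_dict\">"
      + "http://127.0.0.1/dict/my.txt;http://127.0.0.1/dict/extra.txt</entry>\n"
      + "  <entry key=\"remote_ext_stopwords\">http://127.0.0.1/dict/stop.txt</entry>\n"
      + "</properties>\n";

    /** Returns the URLs stored under one entry, split on semicolons. */
    static List<String> remoteUrls(String xml, String key) {
        Properties props = new Properties();
        try {
            props.loadFromXML(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> urls = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(";")) {
            String url = part.trim();
            if (!url.isEmpty()) {
                urls.add(url); // skip empty segments such as a trailing ';'
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(remoteUrls(SAMPLE_CONFIG, "remote_ext_dict"));
        System.out.println(remoteUrls(SAMPLE_CONFIG, "remote_ext_stopwords"));
    }
}
```

A missing key yields an empty list in this sketch, which corresponds to having no remote dictionary configured.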

Each URL, for example http://127.0.0.1/dict/my.txt, only needs to satisfy the following two requirements for hot updates of the dictionary to work:

1) The response must return two headers, Last-Modified and ETag, both of which are strings. Whenever either one changes, the plugin fetches the new words and updates its dictionary.

2) The returned content must contain one word per line, with lines separated by a newline character (\n).

Meeting these two requirements is enough to hot-update the dictionary; there is no need to restart the ES instance.
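
The two requirements can be sketched from the client's point of view as follows. This is an illustration of the idea only, not the plugin's actual code: the class and method names are hypothetical, and the header values in main are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration of the change check and body format.
class HotUpdateCheck {

    /** True when either header differs from the value seen on the previous poll. */
    static boolean hasChanged(String prevLastModified, String prevEtag,
                              String lastModified, String etag) {
        return !Objects.equals(prevLastModified, lastModified)
            || !Objects.equals(prevEtag, etag);
    }

    /** Splits a response body into words: one per line, blank lines skipped. */
    static List<String> parseWords(String body) {
        List<String> words = new ArrayList<>();
        for (String line : body.split("\n")) {
            String word = line.trim(); // also drops a stray '\r'
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Made-up header values from two consecutive polls.
        boolean reload = hasChanged("1024", "1024", "1031", "1031");
        System.out.println("reload needed: " + reload);
        System.out.println(parseWords("first word\nsecond word\n"));
    }
}
```

Because only equality is checked in this sketch, the header values do not have to be real dates or hashes; they only have to change when the dictionary content changes.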

You can put the hot words that need to be updated automatically in a UTF-8-encoded .txt file served by nginx or another simple HTTP server. When the .txt file is modified, the HTTP server automatically returns updated Last-Modified and ETag headers the next time a client requests the file. You can also build a separate tool that extracts the relevant words from the business system and updates the .txt file.
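
A tool of that kind could be as small as the sketch below, which writes a list of words to a UTF-8 text file, one per line. The class name, method names, and file path are hypothetical; how the words are extracted from the business system is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that regenerates the dictionary file.
class DictFileWriter {

    /** Writes one word per line, UTF-8 encoded, using '\n' as the line separator. */
    static void writeWords(Path target, List<String> words) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            sb.append(word.trim()).append('\n');
        }
        try {
            Files.write(target, sb.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the file back as a list of lines, mainly for verification. */
    static List<String> readWords(Path source) {
        try {
            return Files.readAllLines(source, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder location; in practice this would be the file the HTTP server serves.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"), "my.txt");
        writeWords(target, Arrays.asList("hot word one", "hot word two"));
        System.out.println(readWords(target));
    }
}
```

Once the file is rewritten, a static file server picks up the new modification time on its own, so no further step should be needed on the serving side.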

Serving the file from a plain HTTP server is the simpler approach. Alternatively, the application itself can produce the response. Here is an implementation using Spring MVC.

    // Assumes imports for java.io.*, the servlet API (HttpServletRequest,
    // HttpServletResponse), Spring's @RequestMapping, and a logger field
    // on the enclosing controller class.

    // File path of the dictionary file
    private static final String EXT_DICT_PATH = "/data/soft/mydic";

    @RequestMapping(value = "/getCustomDict.htm")
    public void getCustomDict(HttpServletRequest request, HttpServletResponse response) {
        try {
            // Read the dictionary file
            File file = new File(EXT_DICT_PATH);
            String content = "";
            if (file.exists()) {
                // Read the whole file into a buffer
                byte[] buffer = new byte[(int) file.length()];
                try (FileInputStream fi = new FileInputStream(file)) {
                    int offset = 0, numRead = 0;
                    while (offset < buffer.length
                            && (numRead = fi.read(buffer, offset, buffer.length - offset)) >= 0) {
                        offset += numRead;
                    }
                }
                content = new String(buffer, "UTF-8");
            }
            // Return the data
            OutputStream out = response.getOutputStream();
            // The response must carry the Last-Modified and ETag headers.
            // The content length is used here, but any value works as long as
            // it changes whenever the file changes; the file's MD5 is another option.
            response.setHeader("Last-Modified", String.valueOf(content.length()));
            response.setHeader("ETag", String.valueOf(content.length()));
            response.setContentType("text/plain; charset=utf-8");
            out.write(content.getBytes("utf-8"));
            out.flush();
            logger.info(content + " this is the read data value");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
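
The content length stays the same when a word is replaced by another of equal length, so such an edit would go unnoticed. The MD5 alternative mentioned in the code comment could look like the following minimal sketch; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper that derives a header value from the dictionary content.
class DictEtag {

    /** Returns the MD5 digest of the UTF-8 content as a lowercase hex string. */
    static String md5Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm on the Java platform, so this is unexpected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Same length, different content: the two values differ.
        System.out.println(md5Hex("word one\n"));
        System.out.println(md5Hex("word two\n"));
    }
}
```

In the controller above, the result of md5Hex(content) could be passed to response.setHeader for both ETag and Last-Modified in place of the content length. MD5 serves here only as a change detector, not as a security measure.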
