
How to add Solr Chinese word segmentation to CDH

Updated 2025-04-04, from SLTechnology News&Howtos. Shulou (Shulou.com), 05/31.

This article explains how to add Chinese word segmentation to Solr on CDH. Many people are unfamiliar with the procedure, so it is shared here for reference.

The hardest part on CDH is locating Solr's WEB-INF/lib directory. Since I did not install CDH myself and SOLR_HOME was not configured, it took me a long time to find it. You can use the find command to locate it.

Solr does not handle Chinese word segmentation well out of the box, so Chinese-language applications usually add a dedicated Chinese analyzer. IK Analyzer (ik-analyzer) is one of the better ones.

I. Version information

Solr version: 4.10.0

IK Analyzer version required: IK Analyzer 2012FF_hf1

II. Configuration steps

Download the archive and extract it.

Copy IKAnalyzer2012FF_u1.jar into the WEB-INF/lib directory of the Solr web application. Note that the path varies between CDH installations; mine is:

/opt/cloudera/parcels/CDH-5.4.4-1.cdh6.4.4.pp0.4/lib/solr/webapps/solr/WEB-INF/lib

In a higher version of CDH, the location is: /usr/lib/solr/webapps/solr/WEB-INF/lib

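If you don't know where this directory is, you can search for it like this: find / -name admin.html. The admin.html file sits in the root of the Solr web application, so WEB-INF/lib is in the same directory.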
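As an alternative to searching for admin.html, the lookup can target the lib directory itself. The following is a minimal sketch, not part of CDH or Solr: find_solr_lib_dirs is a hypothetical helper name, and it assumes the directory layout ends in webapps/solr/WEB-INF/lib as in the paths above.

```shell
# Hypothetical helper: list Solr WEB-INF/lib directories under a search root.
# Assumes the layout ".../webapps/solr/WEB-INF/lib" seen in the paths above.
find_solr_lib_dirs() {
  root="${1:-/}"
  # Exclude the Tomcat deployment copy, which is rebuilt when Solr restarts.
  find "$root" -type d -path '*/webapps/solr/WEB-INF/lib' \
    ! -path '*/tomcat-deployment/*' 2>/dev/null
}

# Usage (searching the whole filesystem may take a while):
#   find_solr_lib_dirs /
```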

Note: do not upload the jar here: /var/lib/solr/tomcat-deployment/webapps/solr/WEB-INF/lib

This is the directory Tomcat deploys to. On restart, the jars from the two paths above are copied into /var/lib/solr/tomcat-deployment/webapps/solr/WEB-INF/lib, so a jar uploaded directly here disappears when Solr restarts.
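The copy step, including the warning about the Tomcat deployment directory, could be sketched as follows. install_ik_jar is a hypothetical helper, not a CDH tool; the caller supplies both the jar path and the target lib directory.

```shell
# Hypothetical helper: copy the IK Analyzer jar into a Solr WEB-INF/lib directory.
# Refuses the Tomcat deployment copy, where the jar would be lost on restart.
install_ik_jar() {
  jar="$1"
  lib_dir="$2"
  case "$lib_dir" in
    */tomcat-deployment/*)
      echo "refusing: $lib_dir is the Tomcat deployment copy" >&2
      return 1
      ;;
  esac
  [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "lib dir not found: $lib_dir" >&2; return 1; }
  cp "$jar" "$lib_dir/"
}

# Usage (adjust the target to the path on your cluster):
#   install_ik_jar ./IKAnalyzer2012FF_u1.jar /usr/lib/solr/webapps/solr/WEB-INF/lib
```

Solr has to be restarted afterwards before it loads the new jar.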

Copy IKAnalyzer.cfg.xml and stopword.dic into the conf directory of the core that will use the analyzer, alongside the core's schema.xml.
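A small sketch of this step, assuming the two files were extracted into one directory. copy_ik_config is a hypothetical helper; it checks for schema.xml first so that the files land in a real core conf directory.

```shell
# Hypothetical helper: copy the IK Analyzer config files next to a core's schema.xml.
#   $1 = directory holding the extracted IK Analyzer files
#   $2 = conf directory of the target core
copy_ik_config() {
  src_dir="$1"
  conf_dir="$2"
  # The conf directory is recognised by the presence of schema.xml.
  [ -f "$conf_dir/schema.xml" ] || { echo "no schema.xml in $conf_dir" >&2; return 1; }
  for f in IKAnalyzer.cfg.xml stopword.dic; do
    cp "$src_dir/$f" "$conf_dir/" || return 1
  done
}

# Usage (both paths are placeholders):
#   copy_ik_config ./ik-extracted /path/to/core/conf
```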

Modify the schema.xml of core:

Configure test fields:
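The schema entries for these two steps are not reproduced here. As a sketch of what such entries commonly look like with IK Analyzer 2012FF, the helper below prints a fragment to paste into schema.xml. The type name text_ik and the field name content_ik are assumptions chosen for illustration; org.wltea.analyzer.lucene.IKAnalyzer is the analyzer class shipped in the IK Analyzer jar.

```shell
# Hypothetical helper: print a schema.xml fragment that wires in IK Analyzer.
# "text_ik" and "content_ik" are illustrative names, not taken from the article.
print_ik_schema_fragment() {
  cat <<'EOF'
<!-- Field type backed by IK Analyzer (type name is an assumption) -->
<fieldType name="text_ik" class="solr.TextField">
  <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>

<!-- Test field using that type (field name is an assumption) -->
<field name="content_ik" type="text_ik" indexed="true" stored="true"/>
EOF
}

# Usage: review the output, then paste it into the core's schema.xml by hand.
#   print_ik_schema_fragment
```

Place the fieldType entry with the other field types and the field entry with the other fields, then reload the core or restart Solr so the schema change takes effect.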

III. Test the configuration
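One way to check the result is Solr's field analysis handler, which shows how a field type tokenizes a sample string. This is a sketch under several assumptions: the core's solrconfig.xml registers the handler at /analysis/field, Solr listens on localhost:8983, the core is named collection1, and the field type is called text_ik. None of these values come from the article. ik_analysis_url is a hypothetical helper that only builds the URL.

```shell
# Hypothetical helper: build the field analysis URL for a core.
#   $1 = Solr base URL, $2 = core name (both supplied by the caller)
ik_analysis_url() {
  base="$1"
  core="$2"
  printf '%s/%s/analysis/field\n' "$base" "$core"
}

# Usage (host, port, core and field type are assumptions):
#   curl -G "$(ik_analysis_url http://localhost:8983/solr collection1)" \
#     --data-urlencode 'analysis.fieldtype=text_ik' \
#     --data-urlencode 'analysis.fieldvalue=中文分词测试' \
#     --data-urlencode 'wt=json'
```

If the analyzer is wired in correctly, the response lists the sample text split into multi-character Chinese words rather than single characters. The Analysis screen of the Solr admin UI gives the same information interactively.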
