In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/03 Report--
The process record of how a rookie uses hanlp as a participle
Recently, I have been learning the contents of hanlp, and I am going to see if I have time to organize a wave of hanlp sharing after the holiday. It should still share DKHadoop in the same way as before. Take a screenshot of the whole learning process and match it with words.
In the past two days, I am also reading some articles about learning and using hanlp shared by others, and the sharing I see later will also be reproduced and shared with you. This article shared today is also an article on how to use hanlp to do word segmentation shared by others a long time ago. Beginners can take a look at it!
Boss was given the task of doing word segmentation, and the first thing he wanted to use was the regular expression of the stuttering participle and. Later, it was found that the result was not good and needed to be screened over and over again [the first standard screened out 80% of the data, then developed the second standard, continued the screening, and then developed the third standard screening, etc.]
I used the stuttering participle and felt that it was only a general use of the names of people, places and institutions. In the actual separation, it is not good to separate the name of the organization. So hanlp participle is used instead.
But the disadvantage of hanlp participle is that it can only be used on java, but java has always been my weakness. So write a blog post here to describe how to use hanlp from beginning to end.
Besides, Pangpang locked my computer in the cabinet of the library in Beijing normal University. I don't have a computer to use at work, so I use Xiao Pang's computer, that is to say, I need to match all the basic variables myself, so it's the process from a blank piece of paper to using hanlp.
Step 1: download a jdk, go to the openjdk official website and install the next one.
After installation, you need to configure three environment variables, namely
1. JAVA_HOME:C:\ Program Files\ Java\ jdk1.8.0_73
2. CLASSPATH: this is the directory of the lib in the jdk after it is opened.
3. PATH: the bin directory behind jdk
After the configuration is complete, on the cmd under Windows, type java-version to see if there is any response to determine whether jdk is installed correctly.
I have a small problem here. In the fat computer, I don't know what she installed before, but I installed a jre1.6, but I installed jre1.8 and reported an error in cmd, saying that I couldn't find jre1.6. Later, I read the saying on the Internet, saying that maybe other software will also download the java environment, so you may have many different packages, when the system is looking for a path. By default, it will be found in the environment variables you configured above. Therefore, we need to put our latest environment variable in front of a large number of environment variables and try it. ]
After downloading jdk and installing it successfully, the second step is to download eclipse
Go to the official website and remember that x86 is 32-bit and x64 is 64-bit. Set the location of project after download. [for example, I set it in the root directory of D disk, but it's not good, but it can't be changed. Lesson]
After the installation is successful, the third step is to download various things from hanlp.
Method 1.maven method, download a 0 configuration. [but I can't play]
Method 2: first download the jar package hanlp-1.2.8.jar [remarks, the hanlp version has been released to portable-1.6.8]
Http://hanlp.com/
Then download the data.zip package, you can choose to download the standard data or mini data or all the data. Different sizes. What I play is the standard version. 40M
Download hanlp.properties again. This is a file that ends with properties. I've never seen it before, but I can open it with txt.
Step 4: import the downloaded things into eclipse and build the path.
1. Import the jar package into the lib directory of eclipse
Http://jingyan.baidu.com/article/ca41422fc76c4a1eae99ed9f.html
2. Create a package and a class in src. The package will be under the root directory DVR / set by me, and the first letter of the class name must be capitalized? [it seems that if you don't write much, it will be rejected]
3. Extract the data package and put it in a path you like [my path is D://py/]. Then, in the hanlp.properties file, change root to the directory at the next level where data is stored.
4. Drag hanlp.properties to the src directory
Then I tested a demo test, found an error, then clicked import import com.hankcs.hanlp.HanLP; and run the program.
Still reported an error and found that the properties file was not imported into the bin directory. Open test0320 again, copy the properties file in that bin directory and run it successfully.
Reproduced from tianbwin2995's blog
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.