2025-03-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces how to install and use the HanLP natural language processing package. It is meant as a practical reference for readers who are getting started with the library.
HanLP is a Java toolkit made up of a series of models and algorithms, intended to promote the use of natural language processing in production environments. It is characterized by complete functionality, efficient performance, a clear architecture, up-to-date corpora, and customizability.
HanLP provides the following functions: keyword extraction, phrase extraction, Traditional-to-Simplified and Simplified-to-Traditional Chinese conversion, word segmentation, part-of-speech tagging, pinyin conversion, automatic summarization, named entity recognition (place names, organization names, etc.), and text recommendation. For more information, see: http://www.hankcs.com/nlp/hanlp.html
HanLP download address: https://github.com/hankcs/HanLP/releases
HanLP project home page: https://github.com/hankcs/HanLP
1. HanLP installation
HanLP consists of three parts: the jar package, the properties file, and the data folder that holds the dictionaries and models. All three must be present at installation time. Once they are in place, HanLP can be run from an ordinary Java project.
The hanlp.properties file records the root directory and the paths of the individual dictionaries relative to it, so these paths can be changed by editing this file.
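As a minimal sketch of how a root directory and a relative dictionary path combine, the following uses only the Java standard library. The key names root and CoreDictionaryPath and the sample values are assumptions made for illustration; check them against the hanlp.properties file that ships with your release.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

class HanlpPathResolver {

    /** Parses properties text (the same syntax as a .properties file). */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Joins the "root" entry with the relative path stored under the given key. */
    static Path resolve(Properties props, String key) {
        String root = props.getProperty("root", "");
        String relative = props.getProperty(key);
        if (relative == null) {
            throw new IllegalArgumentException("Missing key: " + key);
        }
        return Paths.get(root).resolve(relative).normalize();
    }

    public static void main(String[] args) {
        // Illustrative content only: the directory and key values are assumptions.
        String sample = "root=/opt/hanlp/\n"
                + "CoreDictionaryPath=data/dictionary/CoreNatureDictionary.txt\n";
        Properties props = parse(sample);
        System.out.println(resolve(props, "CoreDictionaryPath"));
    }
}
```

Forward slashes are used in the sample because a backslash is an escape character in properties files.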
The hanlp-1.3.4.jar package contains the APIs for the various algorithms and extraction methods. Most of them are static methods that can be called directly through the HanLP class, which makes them convenient to use.
The data folder contains the dictionary and model folders. The dictionary folder holds the various dictionaries, and the model folder holds the analysis models that the algorithms in the HanLP API rely on.
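Before running anything, it can help to confirm that the folders described above are actually in place. The sketch below is a hypothetical helper, not part of HanLP. It assumes the layout is root/data/dictionary and root/data/model and reports whichever folder is missing.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class HanlpLayoutCheck {

    /** Returns the expected sub-folders that are missing under the given root. */
    static List<String> missingFolders(Path root) {
        // Assumed layout, taken from the description of the data folder.
        String[] expected = {"data/dictionary", "data/model"};
        List<String> missing = new ArrayList<>();
        for (String folder : expected) {
            if (!Files.isDirectory(root.resolve(folder))) {
                missing.add(folder);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass the HanLP root directory as the first argument; defaults to the working directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingFolders(root);
        System.out.println(missing.isEmpty() ? "Layout looks complete" : "Missing: " + missing);
    }
}
```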
2. Using HanLP
The general Java project directory is as follows:
3. A concrete use of HanLP
For example, the following code extracts hot words from the chat-record field of an Excel file, counts them, and sorts them. The code is as follows:
package com.run.hanlp.demo;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.log4j.Logger;

import com.hankcs.hanlp.HanLP;
import com.hankcs.hanlp.seg.common.Term;
import com.hankcs.hanlp.suggest.Suggester;
import com.hankcs.hanlp.summary.TextRankKeyword;
import com.hankcs.hanlp.tokenizer.NLPTokenizer;
import com.hankcs.hanlp.tokenizer.StandardTokenizer;
import com.run.util.ExcelUtil;

public class HanlpTest {

    public static final Logger log = Logger.getLogger(HanlpTest.class);

    public static void main(String[] args) {
        log.info("keyword extraction:");
        HanlpTest.getWordAndFrequency();
    }

    /**
     * Get all keywords and their frequencies.
     */
    public static void getWordAndFrequency() {
        // String content =
        //     "Programmers (English: Programmer) are professionals engaged in program development and maintenance. Programmers are generally divided into program designers and program coders, but the distinction between them is not very clear, especially in China. Software practitioners are divided into four categories: junior programmers, senior programmers, system analysts and project managers.";

        // ExcelUtil is the author's own helper (com.run.util). The type parameters
        // below are assumed; the source listing shows a raw List and a raw Map.
        List<String> content = ExcelUtil.readExcelByField("i:/rundata/excelinput", 5000, 5);
        Map<String, Integer> allKeyWords = new HashMap<>();
        // The source listing breaks off at the start of this loop:
        // for (int i = 0; ...
    }
}
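Since the listing is cut off before the counting and sorting step, here is a minimal sketch of how that step might look. It is not the author's code. The class name, method names, and sample words are hypothetical, and it assumes the keywords of each record have already been extracted (for instance with HanLP's keyword extraction), so only the Java standard library is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HotWordCounter {

    /** Counts how often each keyword occurs across all records. */
    static Map<String, Integer> countKeywords(List<List<String>> keywordsPerRecord) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> keywords : keywordsPerRecord) {
            for (String keyword : keywords) {
                counts.merge(keyword, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the entries ordered by descending count; ties are ordered alphabetically. */
    static List<Map.Entry<String, Integer>> sortByFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        // Hypothetical keywords, one inner list per chat record.
        List<List<String>> records = Arrays.asList(
                Arrays.asList("refund", "order"),
                Arrays.asList("order", "delivery"),
                Arrays.asList("order", "refund"));
        for (Map.Entry<String, Integer> entry : sortByFrequency(countKeywords(records))) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Sorting a copy of the entry set leaves the original map untouched, so the raw counts can still be looked up by keyword afterwards.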