
How to install and use the HanLP Natural language processing package

Updated 2025-03-17. From: SLTechnology News&Howtos (shulou)


Shulou (Shulou.com), 05/31

This article explains how to install and use the HanLP natural language processing package. It is meant as a practical reference for readers who want to get started with the toolkit.

HanLP is a Java toolkit made up of a series of models and algorithms, intended to promote the use of natural language processing in production environments. It is characterized by complete functionality, efficient performance, a clear architecture, an up-to-date corpus, and customizability.

HanLP provides the following functions: keyword extraction, phrase extraction, traditional-to-simplified and simplified-to-traditional Chinese conversion, word segmentation, part-of-speech tagging, pinyin conversion, automatic summarization, named entity recognition (place names, organization names, etc.), and text recommendation. For more information, see: http://www.hankcs.com/nlp/hanlp.html

HanLP download address: https://github.com/hankcs/HanLP/releases

HanLP project home page: https://github.com/hankcs/HanLP

1. HanLP installation

HanLP consists of three parts: the jar package, the properties file, and the data files (dictionaries and models), so all three must be in place at installation time. It can then be run from an ordinary Java project.

The hanlp.properties file records the root directory and the relative paths of the various dictionaries, so you can change those paths by editing this file.
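To make the relationship between the root directory and the relative dictionary paths concrete, here is a minimal sketch that uses only the JDK. The keys `root` and `CoreDictionaryPath` follow the names used in the hanlp.properties file shipped with HanLP 1.x; the directory `/opt/hanlp/` and the class `HanlpPathDemo` are assumptions made up for illustration. This is not HanLP's own loading code, only a picture of how a root plus a relative entry yields a full path.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of HanLP.
class HanlpPathDemo {

    // Parses properties text; in a real project this would be the
    // contents of hanlp.properties.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Joins the configured root with a path that is relative to it.
    static Path resolve(Properties props, String key) {
        String relative = props.getProperty(key);
        if (relative == null) {
            throw new IllegalArgumentException("No such entry: " + key);
        }
        String root = props.getProperty("root", "");
        return Paths.get(root).resolve(relative).normalize();
    }

    public static void main(String[] args) {
        // Sample values; the root directory is an assumed location.
        String sample = "root=/opt/hanlp/\n"
                + "CoreDictionaryPath=data/dictionary/CoreNatureDictionary.txt\n";
        System.out.println(resolve(parse(sample), "CoreDictionaryPath"));
    }
}
```

Moving the data folder then only requires changing the `root` value; the relative entries can stay as they are. Forward slashes are used in the sample because a backslash is an escape character in properties files.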

The hanlp-1.3.4.jar package contains the APIs for the various algorithms and extraction methods. Most of them are static methods that can be called directly through the HanLP class, which makes them convenient to use.

The data folder contains two subfolders, dictionary and model. The dictionary folder holds the various dictionaries and the model folder holds the analysis models; the algorithms in the HanLP API depend on the models in the model folder.
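Because the API depends on both subfolders, it can help to check that they exist before running anything. The following is a minimal sketch using only the JDK; the class `DataDirCheck` and its method are hypothetical names, and it assumes the data folder sits directly under the configured root directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical installation check, not part of HanLP.
class DataDirCheck {

    // Returns the expected subfolders that are absent under the given root.
    static List<String> missing(Path root) {
        List<String> missing = new ArrayList<>();
        for (String sub : new String[] {"data/dictionary", "data/model"}) {
            if (!Files.isDirectory(root.resolve(sub))) {
                missing.add(sub);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass the root directory as the first argument; defaults to the
        // current working directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missing(root);
        System.out.println(missing.isEmpty()
                ? "data folder looks complete"
                : "missing under " + root + ": " + missing);
    }
}
```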

2. Using HanLP

The general Java project directory is as follows:

3. A concrete HanLP example

For example, extract hot words from the chat-record field of an Excel file, count them, and sort them by frequency. The code is as follows:

package com.run.hanlp.demo;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.log4j.Logger;

import com.hankcs.hanlp.HanLP;
import com.hankcs.hanlp.seg.common.Term;
import com.hankcs.hanlp.suggest.Suggester;
import com.hankcs.hanlp.summary.TextRankKeyword;
import com.hankcs.hanlp.tokenizer.NLPTokenizer;
import com.hankcs.hanlp.tokenizer.StandardTokenizer;
import com.run.util.ExcelUtil;

public class HanlpTest {

    public static final Logger log = Logger.getLogger(HanlpTest.class);

    public static void main(String[] args) {
        log.info("keyword extraction:");
        HanlpTest.getWordAndFrequency();
    }

    /**
     * Get all keywords and their frequencies.
     */
    public static void getWordAndFrequency() {
        // Sample input kept from the original, commented out:
        // String content =
        //     "Programmers (English: Programmer) are professionals engaged in program development and maintenance. Programmers are generally divided into program designers and program coders, but the distinction between the two is not very clear, especially in China. Software practitioners are divided into four categories: junior programmers, senior programmers, system analysts and project managers.";

        // ExcelUtil is the author's own helper class (com.run.util); it is not part of HanLP.
        List<String> content = ExcelUtil.readExcelByField("i:/rundata/excelinput", 5000, 5);
        Map<String, Integer> allKeyWords = new HashMap<String, Integer>();

        // The source listing breaks off here, at the head of a for loop ("for (int i ...").
    }
}
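The listing above is cut off before the counting and sorting step. As a rough illustration of that step only, here is a minimal sketch that uses the JDK alone. It assumes the keywords for each chat record have already been extracted into a list (for instance with HanLP's static `HanLP.extractKeyword(String, int)`); the class `HotWordCounter`, its method names, and the sample data are hypothetical and are not the author's original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the count-and-sort step, not the original code.
class HotWordCounter {

    // Tallies how often each keyword occurs across all records.
    static Map<String, Integer> count(List<List<String>> keywordsPerRecord) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> keywords : keywordsPerRecord) {
            for (String word : keywords) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Orders entries by frequency, highest first; ties are broken
    // alphabetically so the result is deterministic.
    static List<Map.Entry<String, Integer>> rank(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries;
    }

    public static void main(String[] args) {
        // Made-up keyword lists standing in for three chat records.
        List<List<String>> records = Arrays.asList(
                Arrays.asList("weather", "ticket"),
                Arrays.asList("ticket", "refund"),
                Arrays.asList("ticket", "weather"));
        for (Map.Entry<String, Integer> e : rank(count(records))) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

With the sample data, the ranking is ticket (3), weather (2), refund (1). A HashMap keeps the tally simple, and the sort happens once at the end rather than on every insertion.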
