How to install and use HanLP

Updated 2025-01-14. From: SLTechnology News&Howtos (shulou)


This article explains how to install and use HanLP: which files to download, how to configure hanlp.properties, and how to run a small test program.

Download HanLP-1.3.4.zip.

Download hanlp-1.3.4-release.

Download hanlp.properties.

Download the data package from https://github.com/hankcs/HanLP/releases and use it to overwrite the data directory extracted from HanLP-1.3.4.zip.
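Before moving on to the configuration, it can help to confirm that the data package really ended up where expected. The following is a minimal sketch, not part of HanLP: the class InstallCheck and the method missingDirs are hypothetical names, and the two subdirectories it checks are simply the ones that the configuration in this article points to.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a HanLP class): checks that the extracted
// data package contains the subdirectories used by hanlp.properties.
public class InstallCheck {

    // Assumption: these are the two subdirectories referenced by the
    // dictionary and model paths in this article's configuration.
    static final String[] EXPECTED = {"data/dictionary", "data/model"};

    /** Returns the expected subdirectories that are missing under root. */
    public static List<String> missingDirs(Path root) {
        List<String> missing = new ArrayList<>();
        for (String dir : EXPECTED) {
            if (!Files.isDirectory(root.resolve(dir))) {
                missing.add(dir);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass the HanLP root directory as the first argument; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingDirs(root);
        System.out.println(missing.isEmpty()
                ? "data directory looks complete"
                : "missing under " + root + ": " + missing);
    }
}
```

Run it with the directory that will be used as root in hanlp.properties. An empty result only means the folders exist, not that every dictionary file is present.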

Then edit hanlp.properties. The settings used here are:

# Root directory for the paths in this configuration file: root + relative path = absolute path
# Windows users: always use / as the path separator
root=E:/hannlp/HanLP-1.3.4/HanLP-1.3.4/

# Core dictionary path
CoreDictionaryPath=data/dictionary/CoreNatureDictionary.txt

# Bigram dictionary path
BiGramDictionaryPath=data/dictionary/CoreNatureDictionary.ngram.txt

# Stop word dictionary path
CoreStopWordDictionaryPath=data/dictionary/stopwords.txt

# Synonym dictionary path
CoreSynonymDictionaryDictionaryPath=data/dictionary/synonym/CoreSynonym.txt

# Person name dictionary path
PersonDictionaryPath=data/dictionary/person/nr.txt

# Person name dictionary transition matrix path
PersonDictionaryTrPath=data/dictionary/person/nr.tr.txt

# Traditional-to-Simplified Chinese dictionary path
TraditionalChineseDictionaryPath=data/dictionary/tc/TraditionalChinese.txt

# Custom dictionary paths. Separate multiple dictionaries with ; and list them in decreasing priority.
# An entry that starts with a space is in the same directory as the previous entry.
# The form "filename nature" sets the default part of speech for every word in that dictionary.
# data/dictionary/custom/CustomDictionary.txt is a high-quality dictionary; please do not delete it.

CustomDictionaryPath=data/dictionary/custom/CustomDictionary.txt; 现代汉语补充词库.txt; 全国地名大全.txt ns; 人名词典.txt; 机构名词典.txt; 上海地名.txt ns;data/dictionary/person/nrf.txt nrf

# CRF segmentation model path
CRFSegmentModelPath=data/model/segment/CRFSegmentModel.txt

# HMM segmentation model path
HMMSegmentModelPath=data/model/segment/HMMSegmentModel.bin

# Whether segmentation results show the part of speech
ShowTermNature=true
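The two path rules in the configuration comments (root + relative path, and the CustomDictionaryPath conventions) are easy to get wrong, so here is a rough sketch of how they can be read. It uses only java.util.Properties and illustrates the rules as the comments describe them; it is not HanLP's own loading code. The class ConfigSketch, its methods, and the file name places.txt are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch (not HanLP code) of the path rules stated in hanlp.properties.
public class ConfigSketch {

    /** One custom dictionary entry: resolved path plus optional default part of speech. */
    public static final class DictEntry {
        public final String path;
        public final String nature; // null when the entry names no part of speech

        public DictEntry(String path, String nature) {
            this.path = path;
            this.nature = nature;
        }
    }

    /** root + relative path = absolute path. */
    public static String resolve(String root, String relativePath) {
        return root.endsWith("/") ? root + relativePath : root + "/" + relativePath;
    }

    /** Splits a CustomDictionaryPath value on ';' and applies the two entry rules. */
    public static List<DictEntry> parseCustomDictionaryPath(String root, String value) {
        List<DictEntry> entries = new ArrayList<>();
        String previousDir = root.endsWith("/") ? root : root + "/";
        for (String raw : value.split(";")) {
            if (raw.trim().isEmpty()) {
                continue;
            }
            // Rule 1: a leading space means "same directory as the previous entry".
            boolean sameDir = raw.startsWith(" ");
            String item = raw.trim();
            // Rule 2: "filename nature" carries a default part of speech.
            String nature = null;
            int space = item.indexOf(' ');
            if (space > 0) {
                nature = item.substring(space + 1).trim();
                item = item.substring(0, space);
            }
            String path;
            if (sameDir) {
                path = previousDir + item;
            } else {
                path = resolve(root, item);
                previousDir = path.substring(0, path.lastIndexOf('/') + 1);
            }
            entries.add(new DictEntry(path, nature));
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample config; places.txt is a made-up file name for illustration.
        String config = "root=E:/hannlp/HanLP-1.3.4/HanLP-1.3.4/\n"
                + "CoreDictionaryPath=data/dictionary/CoreNatureDictionary.txt\n"
                + "CustomDictionaryPath=data/dictionary/custom/CustomDictionary.txt;"
                + " places.txt ns;data/dictionary/person/nrf.txt nrf\n";
        Properties props = new Properties();
        props.load(new StringReader(config));

        String root = props.getProperty("root");
        System.out.println(resolve(root, props.getProperty("CoreDictionaryPath")));
        for (DictEntry e : parseCustomDictionaryPath(root, props.getProperty("CustomDictionaryPath"))) {
            System.out.println(e.path + (e.nature == null ? "" : " [" + e.nature + "]"));
        }
    }
}
```

Printing the resolved paths this way is a quick check that root ends up pointing at the directory that actually contains data/.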

[Figure: project directory structure]

Test Code:

package hanlp;

import java.util.List;

import com.hankcs.hanlp.HanLP;
import com.hankcs.hanlp.seg.common.Term;
import com.hankcs.hanlp.suggest.Suggester;
import com.hankcs.hanlp.tokenizer.NLPTokenizer;

public class Test {

    public static void main(String[] args) {
        // Note: HanLP processes Chinese text; the English sample strings in this
        // listing stand in for the article's original Chinese input.
        System.out.println("HanLP builds its dictionary cache automatically on the first run, please wait...\n");
        // The first run reports a file-not-found error, but it does not affect execution.
        // Once the cache has been built, the error no longer appears.

        System.out.println("Standard segmentation: ");
        System.out.println(HanLP.segment("Hello, welcome to HanLP! "));
        System.out.println("\n");

        List<Term> termList = NLPTokenizer.segment("The patient accidentally found right popliteal cyst without obvious cause before 1 month, the cyst was tough, no obvious activity, tenderness, no redness, no abnormal secretion");
        System.out.println("NLP segmentation: ");
        System.out.println(termList);
        System.out.println("\n");

        System.out.println("Smart suggestion: ");
        getSegement();
        System.out.println("\n");

        System.out.println("Keyword extraction: ");
        getMainIdea();
        System.out.println("\n");

        System.out.println("Automatic summary: ");
        getZhaiYao();
        System.out.println("\n");

        System.out.println("Phrase extraction: ");
        getDuanYu();
        System.out.println("\n");
    }

    /**
     * Smart suggestion: index a few titles, then query them.
     */
    public static void getSegement() {
        Suggester suggester = new Suggester();
        String[] titleArray = ("Prince William delivers speech calling for wildlife protection\n"
                + "Time Person of the Year finalists released, Putin and Ma Yun selected\n"
                + "\"Hagupit\" sweeps Philippines: Philippines draws lessons from \"Haiyan\" with early evacuation\n"
                + "Japan's secrecy law will come into effect. Japanese media say it undermines citizens' right to know\n"
                + "British report says air pollution brings \"public health crisis\"").split("\\n");
        for (String title : titleArray) {
            suggester.addSentence(title);
        }
        // suggest() is an instance method; call it on the suggester that holds the titles.
        System.out.println(suggester.suggest("speak", 1)); // semantic match
        System.out.println(suggester.suggest("Crisis Public", 1)); // character match
        System.out.println(suggester.suggest("mayun", 1)); // pinyin match
    }

    /**
     * Keyword extraction: print the top 5 keywords of the text.
     */
    public static void getMainIdea() {
        String content = "Programmer is a professional engaged in program development and maintenance. Programmers are generally divided into programmers and programmers, but the boundaries between the two are not very clear, especially in China. Software practitioners fall into four categories: junior programmers, senior programmers, systems analysts, and project managers. ";
        List<String> keywordList = HanLP.extractKeyword(content, 5);
        System.out.println(keywordList);
    }

    /**
     * Automatic summarization: print the 3 most representative sentences.
     */
    public static void getZhaiYao() {
        String document = "Algorithms can be broadly divided into basic algorithms, algorithms for data structures, number theory algorithms, algorithms for computational geometry, graph algorithms, dynamic programming and numerical analysis, encryption algorithms, sorting algorithms, retrieval algorithms, randomization algorithms, parallel algorithms, Hermitian deformation models, random forest algorithms.\n"
                + "Algorithms can be broadly divided into three categories,\n"
                + "one, finite deterministic algorithms, which terminate in a finite period of time. They may take a long time to perform assigned tasks, but they will still terminate within a certain time. The results of such algorithms often depend on the input values.\n"
                + "Two, finite nondeterministic algorithms, which terminate in finite time. However, for a given value (or values), the result of the algorithm is not unique or deterministic.\n"
                + "Three, infinite algorithms are those algorithms that do not terminate because no termination definition condition is defined, or because the defined condition cannot be satisfied by the input data. Generally, infinite algorithms arise from the failure to define termination conditions. ";
        List<String> sentenceList = HanLP.extractSummary(document, 3);
        System.out.println(sentenceList);
    }

    /**
     * Phrase extraction: print the top 10 phrases of the text.
     */
    public static void getDuanYu() {
        String text = "Algorithm Engineer\n"
                + "An algorithm is a set of clear instructions for solving a problem, that is, for a specified input, to obtain the desired output in a finite time. If an algorithm is flawed or unsuitable for a problem, executing the algorithm will not solve the problem. Different algorithms may perform the same task in different time, space, or efficiency. The quality of an algorithm can be measured by space complexity and time complexity. An algorithmic engineer is someone who processes things using algorithms.\n"
                + "\n" + "1 Job Description\n" + "Algorithm Engineer is a very high-end position;\n" + "Professional requirements: computer, electronics, communications, mathematics and other related majors;\n"
                + "Education requirements: Bachelor degree or above, most of them are master degree or above;\n" + "Language requirements: English requirements are proficient, basically able to read foreign professional books;\n"
                + "Must master computer-related knowledge, skilled use of simulation tools MATLAB, etc., must know a programming language.\n" + "\n" + "2 Research direction\n"
                + "Video algorithm engineer, image processing algorithm engineer, audio algorithm engineer, communication baseband algorithm engineer\n" + "\n" + "3 Current situation at home and abroad\n"
                + "At present, there are many engineers engaged in algorithm research in China, but there are very few senior algorithm engineers. They are a very scarce professional engineer. Algorithm engineers are divided into audio/video algorithm processing, two-dimensional information algorithm processing in image technology and one-dimensional information algorithm processing in communication physical layer, radar signal processing, biomedical signal processing and other fields according to their research fields.\n"
                + "Media Processing Service: Machine vision becomes the core of this kind of algorithm research; There is also a 2D to 3D algorithm (2D-to-3D conversion), de-interlacing algorithm (de-interlacing), motion estimation motion compensation algorithm (Motion estimation/Motion Compensation), denoising algorithm (Noise Reduction), scaling algorithm, Sharpening algorithm, Super Resolution algorithm, Gesture Recognition (gesture recognition), Face Recognition (face recognition).\n"
                + "Algorithms commonly used in one-dimensional information fields such as communication physical layer: RRM and RTT in wireless field, modulation and demodulation, channel equalization, signal detection, network optimization, signal decomposition in transmission field, etc.\n" + "In addition, data mining and Internet search algorithms have also become popular directions today.\n"
                + "algorithmic engineers are moving toward artificial intelligence. ";
        List<String> phraseList = HanLP.extractPhrase(text, 10);
        System.out.println(phraseList);
    }
}

HanLP's features include Chinese word segmentation, syntactic analysis, and named entity recognition.
