Configuring the spaCy Chinese Language Model for Rasa

2025-01-16 Update From: SLTechnology News&Howtos

This article describes how to configure the spaCy Chinese language model for Rasa: downloading the model, updating config.yml, writing Chinese training data, and choosing an intent classification component.

The latest version of spaCy now supports Chinese.

1. Download the Chinese language model zh_core_web_md-2.3.1.tar.gz

Download address: https://spacy.io/models/zh
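After downloading, the archive can usually be installed like any other Python package, for example with pip install zh_core_web_md-2.3.1.tar.gz. The sketch below is one way to confirm that the package is importable before training. The helper names is_model_installed and describe_model and the sample sentence are illustrative assumptions, not part of Rasa or spaCy; only importlib.util.find_spec and spacy.load are real library calls.

```python
import importlib.util

# Package name referenced by the SpacyNLP component in config.yml.
MODEL_NAME = "zh_core_web_md"


def is_model_installed(package_name: str) -> bool:
    """Return True if a top-level Python package with this name can be found."""
    return importlib.util.find_spec(package_name) is not None


def describe_model(package_name: str = MODEL_NAME) -> str:
    """Report whether the model is installed and, if so, tokenize a sample."""
    if not is_model_installed(package_name):
        return f"{package_name} is not installed; pip install the downloaded .tar.gz first"
    # Imported lazily so the check above works even without spaCy installed.
    import spacy

    nlp = spacy.load(package_name)
    tokens = [token.text for token in nlp("今天天气很好")]  # hypothetical sample sentence
    return f"{package_name} loaded; sample tokens: {tokens}"


if __name__ == "__main__":
    print(describe_model())
```

If the model is missing, the function only reports that fact instead of raising, so it can be run safely before the install step.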

2. Update the chatbot's config.yml:

language: zh

pipeline:
  - name: SpacyNLP  # loads the pretrained spaCy model (word vectors)
    model: "zh_core_web_md"
  - name: SpacyTokenizer  # tokenizer
  - name: SpacyEntityExtractor  # entity extraction
  - name: SpacyFeaturizer  # feature extractor: turns a sentence into a vector
    pooling: mean
  - name: CountVectorsFeaturizer  # bag-of-words representations of user messages and labels (intents and responses), used as features for intent classification and response selection
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier  # intent classification
    epochs: 100
  - name: EntitySynonymMapper  # maps synonymous entity values
  - name: ResponseSelector
    epochs: 100

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
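Two settings in this configuration are easier to see with a small example: the char_wb analyzer with min_ngram 1 and max_ngram 4, and pooling: mean. The sketch below is a simplified, pure-Python approximation of both ideas, not the code Rasa or scikit-learn actually runs; the function names char_wb_ngrams and mean_pool and the toy vectors are assumptions made for illustration.

```python
from typing import List


def char_wb_ngrams(text: str, min_n: int = 1, max_n: int = 4) -> List[str]:
    """Character n-grams taken inside word boundaries.

    Each whitespace-separated word is padded with one space on both sides,
    then every substring of length min_n..max_n is emitted.
    """
    ngrams = []
    for word in text.split():
        padded = f" {word} "
        for n in range(min_n, max_n + 1):
            # When n exceeds the padded length, the range is empty and
            # nothing is emitted for that n.
            for start in range(len(padded) - n + 1):
                ngrams.append(padded[start:start + n])
    return ngrams


def mean_pool(token_vectors: List[List[float]]) -> List[float]:
    """Average token vectors element-wise into one sentence vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]


if __name__ == "__main__":
    print(char_wb_ngrams("你好", min_n=2, max_n=2))
    print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```

Character n-grams are a natural fit for Chinese because they do not depend on the tokenizer splitting words perfectly, while mean pooling collapses the per-token spaCy vectors into a single fixed-length sentence vector.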

Write the training examples directly in Chinese in nlu.md (the examples below are shown in English translation):

## intent:greet
- Hello.
- Hello.
- hi
- good morning.
- good afternoon.
- good evening.

## intent:goodbye
- Bye.
- see you later.
- Bye.

## intent:affirm
- Okay.
- Okay.
- OK.

## intent:deny
- No.
- No.
- disagree.
- No way.

## intent:bot_challenge
- are you human?
- are you a robot?
- am I talking to a robot?
- am I talking to someone?
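To check that a hand-written nlu.md has the expected shape before training, the Markdown format can be read with a few lines of Python. This is a minimal sketch that handles only "## intent:" headers and "- " example lines; it is not Rasa's own loader, and the function name parse_nlu_markdown and the Chinese sample utterances are hypothetical.

```python
from typing import Dict, List

# Hypothetical sample data in the Rasa 1.x Markdown training format.
SAMPLE_NLU = """
## intent:greet
- 你好
- 早上好

## intent:goodbye
- 再见
"""


def parse_nlu_markdown(text: str) -> Dict[str, List[str]]:
    """Map each intent name to its list of example utterances."""
    intents: Dict[str, List[str]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## intent:"):
            current = line[len("## intent:"):].strip()
            intents.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            intents[current].append(line[2:].strip())
    return intents


if __name__ == "__main__":
    for intent, examples in parse_nlu_markdown(SAMPLE_NLU).items():
        print(intent, len(examples), examples)
```

Printing the number of examples per intent is a quick way to spot an intent that was mistyped or left with too few utterances.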

3. Which intent classification component should you choose?

There are two types: pretrained embeddings and supervised embeddings.

The first type, pretrained embeddings: SklearnIntentClassifier

It uses the spaCy library to load a pretrained language model; Chinese is among the supported languages.

When should you use this component? When pretrained word embeddings that fit your project's domain exist and can be applied to the project.

The second type, supervised embeddings: EmbeddingIntentClassifier

It trains word embeddings from scratch and is usually used together with the CountVectorsFeaturizer component.

Characteristics: it needs enough training data, it is language-independent, it supports multi-intent messages, and it is very flexible.
