Principle and Worked Example of a CRF-Based Named Entity Recognition System

Updated 2025-02-24. From: SLTechnology News&Howtos (shulou)

This article explains the principle of a named entity recognition system based on CRF and walks through a worked example of the prediction step.

I often hear people say they use CRF (conditional random fields) for named entity recognition, but most of them simply call the CRF++ package: they construct some features and run a few commands. Recently several people have asked how CRF actually performs named entity recognition, so today I will walk through the CRF prediction process with an example. If anything here is wrong, corrections are welcome; treat this as a starting point for discussion.

This walkthrough assumes a CRF model that has already been trained. If there is interest, a follow-up can cover the principle and process of training.

A CRF sequence labeling task for named entities usually uses four tags: B, E, M and S (begin, end, middle, single). The feature template for this project is:

U0:%x[-1,0]
U1:%x[0,0]
U2:%x[1,0]
U3:%x[-1,0]%x[0,0]
U4:%x[0,0]%x[1,0]
U5:%x[-1,0]%x[1,0]
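To make the template concrete, here is a minimal sketch of how each template line could be expanded into a feature string at one position. It reads the template as CRF++ unigram macros of the form %x[row offset, column]. The function name `expand`, the placeholder tokens, and the padding markers for positions outside the sentence are assumptions for illustration, not code from the project.

```python
import re

# Unigram template from the article, read as CRF++ macros (an assumption).
TEMPLATE = [
    "U0:%x[-1,0]",
    "U1:%x[0,0]",
    "U2:%x[1,0]",
    "U3:%x[-1,0]%x[0,0]",
    "U4:%x[0,0]%x[1,0]",
    "U5:%x[-1,0]%x[1,0]",
]

MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand(template, rows, pos):
    """Expand every template line at position `pos` into a feature string.

    `rows` is a list of token rows, each a list of columns. Offsets that
    fall outside the sentence are replaced by padding markers (assumed
    here to look like "_B-1" and "_B+1").
    """
    def lookup(match):
        offset, col = int(match.group(1)), int(match.group(2))
        idx = pos + offset
        if idx < 0:
            return f"_B{idx}"
        if idx >= len(rows):
            return f"_B+{idx - len(rows) + 1}"
        return rows[idx][col]

    return [MACRO.sub(lookup, line) for line in template]


# Hypothetical three-token input with a single column per token.
rows = [["w0"], ["w1"], ["w2"]]
for feature in expand(TEMPLATE, rows, 0):
    print(feature)
```

For the first token, the features that look one position to the left pick up the padding marker, which is why the start of a sentence gets its own learned weights.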

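The example sentence is "the Vestas windmill is on fire" (translated from the original Chinese sentence, which the model labels one character at a time).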

First, the feature functions are evaluated for the first character of the sentence, rendered here as "dimension".

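Here the current token is "dimension". Each template feature function that fires contributes one row of learned weights, one weight per tag, and stacking these rows gives a matrix. Summing each column of that matrix gives the score of each tag for this character. The column sums are as follows: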
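The column summation can be sketched as follows. This is an illustration under assumptions: the weight table is a hypothetical dictionary from feature string to four weights in the order B, E, M, S, and the numbers are made up, not taken from the trained model.

```python
LABELS = ["B", "E", "M", "S"]


def emission_scores(features, weights):
    """Sum, per tag, the learned weights of every feature that fired.

    `weights` maps a feature string to a list of four weights, one per
    tag in LABELS order. Features unseen in training contribute nothing.
    """
    totals = [0.0] * len(LABELS)
    for feat in features:
        row = weights.get(feat)
        if row is None:
            continue
        for k, w in enumerate(row):
            totals[k] += w
    return totals


# Made-up weights for two features; the third feature is unseen.
weights = {
    "U1:w0": [1.0, -0.5, 0.0, 0.25],
    "U4:w0w1": [0.5, 0.5, -1.0, 0.0],
}
features = ["U0:_B-1", "U1:w0", "U4:w0w1"]
print(dict(zip(LABELS, emission_scores(features, weights))))
```

Repeating this for every character yields one row of tag scores per character.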

The feature computation for the other characters is the same and is omitted here. The resulting matrix DotMatrix is as follows (because "dimension" is the first character, it cannot be tagged E or M, so those two entries are set to the minimum weight):
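The start constraint can be sketched like this. The article only says the E and M entries are "set to the minimum weight"; using negative infinity for that value, and the helper name `mask_start`, are assumptions of this sketch.

```python
NEG_INF = float("-inf")  # assumed stand-in for "the minimum weight"
LABELS = ["B", "E", "M", "S"]


def mask_start(dot_matrix):
    """Return a copy of dot_matrix whose first row forbids tags E and M."""
    masked = [row[:] for row in dot_matrix]
    for label in ("E", "M"):
        masked[0][LABELS.index(label)] = NEG_INF
    return masked


# Made-up scores for a two-character input.
dot_matrix = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
print(mask_start(dot_matrix)[0])
```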

In the CRF computation, the matrix above is the node (state feature) score matrix. We also need the tag transition matrix TransMatrix learned during training, namely:

Combining DotMatrix and TransMatrix gives, for each character and each tag, the maximum score over all transitions from the tags of the previous character. The formula is as follows:
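The article's own formula is not reproduced in this text. For reference, the standard Viterbi recurrence, written with the matrix names used in this article, would look as follows. Treating the entries as additive scores, and indexing characters by i and tags by s and t, are assumptions of this sketch.

```latex
\begin{aligned}
\mathrm{net}[0][t] &= \mathrm{DotMatrix}[0][t] \\
\mathrm{net}[i][t] &= \mathrm{DotMatrix}[i][t]
  + \max_{s}\bigl(\mathrm{net}[i-1][s] + \mathrm{TransMatrix}[s][t]\bigr),
  \qquad i \ge 1 \\
\mathrm{from}[i][t] &= \arg\max_{s}\bigl(\mathrm{net}[i-1][s] + \mathrm{TransMatrix}[s][t]\bigr),
  \qquad i \ge 1
\end{aligned}
```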

Computing the score values amounts to computing the transition scores between adjacent characters. The from matrix records, for each tag of the current character, the tag of the previous character that gives the maximum score, so it serves as the record of the optimal path. The net matrix holds the accumulated score of each character under each of the B, E, M and S tags, obtained through this transition computation, as follows:

The resulting from matrix is as follows:
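The construction of the net and from matrices can be sketched as a Viterbi forward pass. This is a minimal illustration assuming additive scores; the function name `viterbi_forward` and the small two-tag input are hypothetical and do not reproduce the article's numbers.

```python
def viterbi_forward(dot, trans):
    """Build the accumulated score matrix and the back-pointer matrix.

    dot[i][t]   : node score of tag t at character i (DotMatrix)
    trans[s][t] : transition score from tag s to tag t (TransMatrix)
    Returns (net, frm), where frm[i][t] is the best previous tag index.
    """
    n, k = len(dot), len(dot[0])
    net = [dot[0][:]]      # the first character has no predecessor
    frm = [[-1] * k]       # -1 marks "no previous tag"
    for i in range(1, n):
        net_row, frm_row = [], []
        for t in range(k):
            # Pick the previous tag that maximizes the accumulated score.
            best_s = max(range(k), key=lambda s: net[i - 1][s] + trans[s][t])
            net_row.append(dot[i][t] + net[i - 1][best_s] + trans[best_s][t])
            frm_row.append(best_s)
        net.append(net_row)
        frm.append(frm_row)
    return net, frm


# Made-up two-tag example.
dot = [[1.0, 0.0], [0.0, 2.0]]
trans = [[0.0, 1.0], [0.5, 0.0]]
net, frm = viterbi_forward(dot, trans)
print(net)
print(frm)
```

Entries set to negative infinity in the first row simply never win the maximum, which is how a start constraint propagates through the pass.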

Now we trace back the optimal path, starting from the last character, "le". As the final character it can only be tagged E or S. Comparing net[le][E] and net[le][S], the S score is larger, so "le" is tagged S. Looking up the from matrix, from[le][S] = 1, and index 1 corresponds to E when the tags are indexed in the order B, E, M, S, so the preceding character "fire" is tagged E. Continuing in the same way gives the following result:
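The backtracking step can be sketched as follows. The tag order B, E, M, S for the indices, the function name `backtrack`, and the small matrices are assumptions for illustration; the numbers are not the article's.

```python
LABELS = ["B", "E", "M", "S"]


def backtrack(net, frm, allowed_last=("E", "S")):
    """Recover the best tag sequence from the net and from matrices.

    The last character may only take a tag in `allowed_last`; every
    earlier tag is read from the back-pointer matrix `frm`.
    """
    last = len(net) - 1
    candidates = [LABELS.index(label) for label in allowed_last]
    tag = max(candidates, key=lambda t: net[last][t])
    path = [tag]
    for i in range(last, 0, -1):
        tag = frm[i][tag]   # best previous tag for the current choice
        path.append(tag)
    path.reverse()
    return [LABELS[t] for t in path]


# Made-up matrices for a three-character input.
NEG_INF = float("-inf")
net = [
    [1.0, NEG_INF, NEG_INF, 0.5],
    [0.2, 2.0, 0.1, 0.3],
    [0.0, 1.0, 0.0, 3.0],
]
frm = [
    [-1, -1, -1, -1],
    [0, 0, 0, 0],
    [1, 2, 1, 1],
]
print(backtrack(net, frm))
```

In this made-up input the last character scores higher under S than under E, and its back-pointer is 1, so the character before it is tagged E, mirroring the reasoning in the paragraph above.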
