How to implement Jaccard distance and the month-on-month (ring ratio) comparison algorithm in Python

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains in detail how to implement Jaccard distance and the month-on-month (ring ratio) comparison algorithm in Python. The editor finds it very practical and shares it here for reference. I hope you get something out of it.

Preface

NLP: string similarity calculation and set similarity measurement

What is Jaccard distance?

Jaccard distance is an index used to measure the difference between two sets. It is the complement of the Jaccard similarity coefficient, that is, 1 minus the Jaccard similarity coefficient. The Jaccard similarity coefficient, also known as the Jaccard index, is an index used to measure the similarity between two sets.

Definition

The Jaccard similarity coefficient measures the similarity between two sets. It is defined as the number of elements in the intersection of the two sets divided by the number of elements in their union.

Jaccard distance measures the difference between two sets. It is the complement of the Jaccard similarity coefficient and is defined as 1 minus the Jaccard similarity coefficient.
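The definitions above map directly onto Python's built-in set operations. The following is a minimal sketch rather than the article's own implementation: the names `jaccard_similarity` and `jaccard_distance` are illustrative, and returning 0.0 when both sets are empty is an assumption, since the ratio is undefined in that case.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: both sets are empty, the ratio is undefined, return 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


def jaccard_distance(a, b):
    """Return 1 minus the Jaccard similarity coefficient."""
    return 1.0 - jaccard_similarity(a, b)


# Two small word sets that share 2 of their 4 distinct elements.
left = {"python", "set", "distance"}
right = {"python", "set", "similarity"}
print(jaccard_similarity(left, right))  # 2 / 4 = 0.5
print(jaccard_distance(left, right))    # 1 - 0.5 = 0.5
```

Working on plain sets keeps the measure independent of how the text is tokenized, so any tokenizer can be placed in front of it.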

Python implementation

The code is as follows:

    # -*- encoding: utf-8 -*-
    import jieba


    def Jaccard(model, reference):
        # reference is the source sentence and model is the candidate sentence
        terms_reference = jieba.cut(reference)  # default precise mode
        terms_model = jieba.cut(model)
        grams_reference = set(terms_reference)  # de-duplicate; use list to keep repeats
        grams_model = set(terms_model)
        temp = 0  # size of the intersection
        for i in grams_reference:
            if i in grams_model:
                temp = temp + 1
        # fenmu is the denominator: the size of the union
        fenmu = len(grams_model) + len(grams_reference) - temp
        try:
            # similarity coefficient; the Jaccard distance is 1 minus this value
            jaccard_coefficient = temp / fenmu
        except ZeroDivisionError:
            print(model, reference)
            return 0
        else:
            return jaccard_coefficient

What is the month-on-month ratio?

Month-on-month development speed is the ratio of the level of the reporting period to the level of the previous period, indicating how the phenomenon develops period by period. Comparing each month of the year with the month before it (February to January, March to February, April to March, and so on up to December to November) shows the degree of development month by month. For example, when analyzing the trend of some economic phenomena during the fight against SARS, the month-on-month comparison is more illustrative than the year-on-year comparison.
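As a rough illustration of this month-by-month comparison, the sketch below divides each month's level by the level of the month before it. The monthly figures and the name `month_on_month_speed` are made up for the example, and the code assumes no level is zero. Subtracting 1 from each ratio gives the corresponding growth rate.

```python
def month_on_month_speed(levels):
    """Return current / previous for each consecutive pair of periods."""
    return [current / previous for previous, current in zip(levels, levels[1:])]


# Hypothetical monthly levels for January through April.
monthly_levels = [100, 110, 99, 132]

speeds = month_on_month_speed(monthly_levels)   # one ratio per month after the first
growth_rates = [speed - 1 for speed in speeds]  # relative change versus the previous month
print(speeds)
print(growth_rates)
```

A ratio above 1 means the level rose compared with the previous month, and a ratio below 1 means it fell.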

Anyone who has studied statistics or economics knows that statistical indicators can be divided into total indicators, relative indicators, and average indicators according to their content, function, and form of expression. Depending on the base period used, development speed can be divided into year-on-year, month-on-month, and fixed-base development speed. Put simply, all three can be expressed as percentages or multiples.

Fixed-base development speed, also called total speed, is the ratio of the level of the reporting period to the level of one fixed period, indicating the overall speed of development of the phenomenon over a longer span of time. Year-on-year development speed is the relative speed obtained by comparing the level of the current period with that of the same period last year. Month-on-month development speed is the ratio of the level of the reporting period to that of the previous period, indicating the speed of development period by period.
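The difference between these base periods can be sketched in a few lines. Everything here is illustrative: the function names, the sample figures, the choice of the first value as the fixed base, and the 12-period offset for year-on-year (which assumes monthly data) are assumptions, not something the article specifies.

```python
def fixed_base_speed(levels, base_index=0):
    """Ratio of every period's level to the level of one fixed base period."""
    base = levels[base_index]
    return [level / base for level in levels]


def year_on_year_speed(levels, periods_per_year=12):
    """Ratio of each period's level to the same period one year earlier."""
    return [
        current / earlier
        for earlier, current in zip(levels, levels[periods_per_year:])
    ]


# Hypothetical levels for 14 consecutive months (January of year 1 onward).
levels = [100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 125, 153]

fixed = fixed_base_speed(levels)       # every month against the first month
yearly = year_on_year_speed(levels)    # months 13 and 14 against months 1 and 2
print(fixed[-1])
print(yearly)
```

For the last month, the fixed-base ratio compares 153 with the first month's 100, while the year-on-year ratio compares it with 102, the level of the same month one year earlier.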

Although year-on-year and month-on-month ratios both reflect the speed of change, they use different base periods, so what they reflect is quite different. Generally speaking, a month-on-month ratio can be compared with another month-on-month ratio, but a year-on-year ratio should not be compared directly with a month-on-month ratio. For the same place, however, when looking at the development trend over time, the year-on-year and month-on-month figures are often placed side by side. [1]

Python implementation

The code is as follows:

    def month_on_month_ratio(data_list):
        mid = 0
        length = len(data_list)
        res = []
        while mid < length - 1:
            # a is the previous period, b is the current period
            a, b = data_list[mid:mid + 2]
            # growth rate relative to the previous period (raises ZeroDivisionError if a is 0)
            res.append((b - a) / a)
            mid += 1
        return res

That is all about how to implement Jaccard distance and the month-on-month comparison algorithm in Python. I hope the content above is helpful and lets you learn something new. If you think the article is good, please share it so more people can see it.
