Learning the basic operations of NLP (natural language processing) in Python: classifying domestic violence reports

2025-02-14 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

This article explains how to classify domestic violence reports in Python as a way of learning the basic operations of natural language processing (NLP). I hope you will have a working understanding of the topic after reading it.

Overview

Today we start a journey into natural language processing (NLP). NLP lets machines process, understand, and use human language, bridging the gap between machine language and human language.

Data introduction

The data set consists of judicial records on domestic violence. It is divided into four categories: the caller was beaten by their husband, by their wife, by their son, or by their daughter. Today we will use what we learned in the previous lessons to solve an NLP classification problem.

Word frequency statistics

CountVectorizer is a text feature extraction method. It counts how often each word occurs in the training text and produces a word frequency matrix.

Format:

vec = CountVectorizer(analyzer="word", max_features=4000)

Parameters:

analyzer: whether to analyze the corpus by "word" or by "char"

max_features: the maximum number of keywords to keep (the size of the vocabulary)

Methods:

fit(): fit the vectorizer to the corpus

transform(): return the word frequency matrix
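As a minimal sketch of fit() and transform() described above, the snippet below runs CountVectorizer on a tiny corpus. The three sentences are hypothetical and stand in for text that has already been segmented into space-separated words; they are not taken from the real data set.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus: words are already separated by spaces,
# in the same shape as the output of a word segmentation step.
docs = [
    "husband knife injured",
    "wife unarmed injured",
    "husband drunk knife knife",
]

vec = CountVectorizer(analyzer="word", max_features=4000)
vec.fit(docs)                    # learn the vocabulary from the corpus
matrix = vec.transform(docs)     # sparse matrix: one row per document

# vocabulary_ maps each term to its column index; sorting the terms
# gives them in column order.
vocab = sorted(vec.vocabulary_)
counts = matrix.toarray().tolist()

print(vocab)
print(counts)
```

Each row of the matrix corresponds to one document and each column to one vocabulary term, so the third row records that "knife" occurs twice in the third sentence.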

Naive Bayes

MultinomialNB is the multinomial naive Bayes classifier, a very common classification method.
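To give a feel for how the classifier is used, here is a small sketch that trains MultinomialNB on made-up word counts. The two feature columns and the labels are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count features: each row is a document, and the
# two columns are the counts of the words "husband" and "wife".
X = np.array([[2, 0], [3, 0], [0, 2], [0, 1]])
y = np.array([0, 0, 1, 1])   # 0 = husband category, 1 = wife category

clf = MultinomialNB()
clf.fit(X, y)

# Classify two new documents from their word counts.
pred = clf.predict(np.array([[1, 0], [0, 3]])).tolist()
print(pred)
```

A document that mentions only "husband" is assigned to class 0, and one that mentions only "wife" is assigned to class 1.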

Formula:

P(B | A) = P(B) * P(A | B) / P(A)

Example:

Suppose the probability of a traffic jam in Beijing in winter is 80%: P(B) = 0.8

Suppose the probability of snow in Beijing in winter is 10%: P(A) = 0.1

If there is a traffic jam on a given day, the probability of snow on that day is 10%: P(A | B) = 0.1

Then the probability of a traffic jam given that it snows is P(B | A) = P(B) * P(A | B) / P(A) = 0.8 * 0.1 / 0.1 = 0.8

In other words, if it snows on a certain day in Beijing in winter, there is an 80% chance that there will be a traffic jam on that day.
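The arithmetic of this example can be checked with a few lines of Python. The function and variable names are hypothetical and chosen only for this sketch.

```python
def bayes_posterior(prior_b, likelihood_a_given_b, evidence_a):
    """Return P(B | A) = P(B) * P(A | B) / P(A)."""
    return prior_b * likelihood_a_given_b / evidence_a


p_jam = 0.8              # P(B): traffic jam in winter
p_snow = 0.1             # P(A): snow in winter
p_snow_given_jam = 0.1   # P(A | B): snow on a day with a traffic jam

p_jam_given_snow = bayes_posterior(p_jam, p_snow_given_jam, p_snow)
print(round(p_jam_given_snow, 3))
```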

Code implementation

Pre-process

import jieba
import pandas as pd


def load_data():
    """
    Load the data and perform basic conversion
    :return: husband, wife, son, daughter (domestic violence data), stop words
    """

    # Load stop words
    stopwords = pd.read_csv('data/stopwords.txt', index_col=False, quoting=3, sep="\t", names=['stopword'], encoding='utf-8')
    stopwords = stopwords['stopword'].values
    print(stopwords, len(stopwords))

    # Load corpus
    laogong_df = pd.read_csv("data/beilaogongda.csv", encoding="utf-8", sep=",")
    laopo_df = pd.read_csv("data/beilaopoda.csv", encoding="utf-8", sep=",")
    erzi_df = pd.read_csv("data/beierzida.csv", encoding="utf-8", sep=",")
    nver_df = pd.read_csv("data/beinverda.csv", encoding="utf-8", sep=",")

    # Remove NaN rows
    laogong_df.dropna(inplace=True)
    laopo_df.dropna(inplace=True)
    erzi_df.dropna(inplace=True)
    nver_df.dropna(inplace=True)

    # Convert the "segment" column to Python lists
    laogong = laogong_df.segment.values.tolist()
    laopo = laopo_df.segment.values.tolist()
    erzi = erzi_df.segment.values.tolist()
    nver = nver_df.segment.values.tolist()

    # Debug output
    print(laogong[:5])
    print(laopo[:5])
    print(erzi[:5])
    print(nver[:5])

    return laogong, laopo, erzi, nver, stopwords


def pre_process_data(content_lines, category, stop_words):
    """
    Data preprocessing
    :param content_lines: corpus
    :param category: class label
    :param stop_words: stop words
    :return: preprocessed data
    """

    # Store the results
    sentences = []

    # Iterate over the corpus
    for line in content_lines:
        try:
            segs = jieba.lcut(line)  # word segmentation
            segs = [v for v in segs if not str(v).isdigit()]  # remove numbers
            segs = list(filter(lambda x: x.strip(), segs))  # remove whitespace-only tokens
            segs = list(filter(lambda x: len(x) > 1, segs))  # remove tokens of length 1
            segs = list(filter(lambda x: x not in stop_words, segs))  # remove stop words
            result = (" ".join(segs), category)  # join with spaces and attach the label
            sentences.append(result)
        except Exception:
            # Print the line that caused the error
            print(line)
            continue

    return sentences


def pre_process():
    """
    Main preprocessing function
    :return: preprocessed corpus (word segmentation + labels)
    """

    # Read the data
    laogong, laopo, erzi, nver, stop_words = load_data()

    # Preprocess
    laogong = pre_process_data(laogong, 0, stop_words)
    laopo = pre_process_data(laopo, 1, stop_words)
    erzi = pre_process_data(erzi, 2, stop_words)
    nver = pre_process_data(nver, 3, stop_words)

    # Debug output
    print(laogong[:2])
    print(laopo[:2])
    print(erzi[:2])
    print(nver[:2])

    # Concatenate
    result = laogong + laopo + erzi + nver

    return result


if __name__ == '__main__':
    pre_process()

Main function

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
# Module name as in the original; it must match the file name of the pre-process script above
from pre_peocessing import pre_process


def main(sentences):
    """Main function"""

    # Instantiate the vectorizer
    vec = CountVectorizer(analyzer="word", max_features=4000)

    # Separate corpus and labels
    x, y = zip(*sentences)

    # Split the data set
    X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

    # Convert to a bag-of-words model
    vec.fit(X_train)
    # On scikit-learn versions before 1.0, use vec.get_feature_names() instead
    print(vec.get_feature_names_out())

    # Instantiate and train the naive Bayes classifier
    classifier = MultinomialNB()
    classifier.fit(vec.transform(X_train), y_train)

    # Predict
    y_predict = classifier.predict(vec.transform(X_test))
    # print(y_predict)
    # print(y_test)

    # Compute the accuracy
    score = classifier.score(vec.transform(X_test), y_test)
    print(score)


if __name__ == '__main__':
    data = pre_process()
    main(data)
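The token-filtering steps in pre_process_data can be tried in isolation, without jieba or the data files. The sketch below is a simplified stand-in rather than the code above: the helper name clean_tokens, the token list, and the stop word set are all hypothetical, and the tokens are written by hand instead of coming from jieba.lcut.

```python
def clean_tokens(tokens, stop_words):
    """Apply the same four filters as the pre-process step, then join."""
    tokens = [t for t in tokens if not str(t).isdigit()]   # drop numbers
    tokens = [t for t in tokens if t.strip()]              # drop whitespace-only tokens
    tokens = [t for t in tokens if len(t) > 1]             # drop single characters
    tokens = [t for t in tokens if t not in stop_words]    # drop stop words
    return " ".join(tokens)


# Hypothetical tokens, as a segmenter might return them.
tokens = ["caller", "a", "husband", " ", "120", "knife", "the"]
stop_words = {"the", "caller"}

cleaned = clean_tokens(tokens, stop_words)
labelled = (cleaned, 0)   # pair the text with its class label

print(labelled)
```

The result has the same shape as one element of the preprocessed corpus: a space-joined string plus an integer label.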

Output result:

[!] 2627

[the policeman was beaten by her husband, please come to the scene to deal with it. Seeing that the above woman was beaten with a knife by her husband (120 has been notified, if the police are not required, please call 120 or 110) ask the police to bring the necessary protective equipment and pay attention to their own safety.' The policeman was beaten by her husband and drunk with a knife. (please bring the necessary personal protective equipment to the scene and pay attention to your own safety. The person who called the police was beaten by her husband, and the other person was there, so there was no need for medical help. Please come to the scene to deal with it.' The person who reported to the police said that he was beaten by her husband and injured one person without 120 asking the police to be present to deal with it.']

[The person who reported to the police said he was beaten by his wife, unarmed and injured, and there was no need for 120. The wife was present. Please pay attention to your own safety and ask the police to deal with it on the spot. Domestic violence, said he was beaten by his wife, unarmed, one person injured without medical help, please come to the scene to deal with it.' The policeman was beaten by his wife, armed and no one was injured. Please come to the scene to deal with it and pay attention to your own safety.' The person who reported the family dispute was beaten by his wife without medical help, please come to the scene to deal with it.' Divorce leads to being beaten by his wife, no armed, no injury, please come to the scene to deal with.']

[The policeman was beaten by his son and no one was hurt. Please come to the scene to deal with it.' The police reported that he was beaten by his son and asked the police to come to the scene to deal with it. (insider: 22649) the person who reported to the police was beaten by his son, uninjured and unarmed. Please bring the necessary protective equipment and pay attention to your own safety.' , 'the policeman was beaten by his son, please come to the scene to deal with it.' the policeman said that he was beaten by his son, slightly injured (one person was injured) and unarmed. The police are invited to bring the necessary protective equipment and pay attention to their own safety.']

[The police reported that he was beaten by his daughter, and one person was injured without 120. Please come to the scene to deal with it.' If the policeman is beaten by his daughter and the other party leaves because of a family dispute, please bring the necessary protective equipment and pay attention to your own safety.' The policeman was beaten by his daughter and was unarmed. Please come to the scene to deal with it.' The person who reported to the police said he was beaten by his daughter, unarmed, and nothing happened, please ask the police to come to the scene to deal with it. Please bring the necessary protective equipment and pay attention to your own safety.' The policeman said that his wife was beaten by his daughter, unarmed and uninjured, and asked the police to deal with her at the scene.']

[('report to the police', 0), ('the above husband armed with a knife informed the police to arrive at the scene and call the police protective equipment', 0)]

[('the policeman said that the wife's armed person was injured without the presence of the police at the scene', 1), (the domestic violence, the wife's armed person's injury did not need to be attended by the police at the scene) ]

[('report the son to the police without police presence', 2), ('the policeman said the son was present', 2)]

[('the policeman claimed that the daughter was injured without the police present', 3), ('report the daughter's family dispute to the police to carry protective equipment', 3)]

['aa67c3',' q5alarm, 'one person', 'one person injured' 'one', 'one punch', 'first floor', 'one', 'husband', 'address', 'not going up', 'not living','no tumbler', 'unknown', 'unclear','no need','no','no concession', 'unknown', 'impassability','no need', 'something', 'interruption', 'you have', 'Chinese name' 'Fenglu', 'ping-pong', 'Jiuting', 'Jiujing Road', 'quarrel', 'Armenian', 'Ren Daibao', 'injured', 'borrowed', 'head', 'hand', 'nothing happened','No injury','No injury', 'personal name', 'Human Department', 'acting', 'acting newspaper', 'Vacation', 'injured' 'Household', 'security', 'thermos', 'ready', 'urge', 'ask', 'son', 'charger', 'bus stop', 'centimeter', 'highway', 'closed', 'off', 'its name', 'its proximity', 'specific address', 'conflict', 'days', 'stool', 'bleeding' "derailment", "division", "score already", "location", "minute", "just arrived", "ex-wife", "scissors", "cut", "Canada", "zoning", "treatment", "hospital", "National Day", "bedroom", "Health Bureau", "yes", "anti-fight". 
'lock', 'happen', 'injury', 'change', 'quarrel', 'inarticulate','go back', 'tell', 'bite','we don't need', 'beer bottle', 'throat', 'cry for help', 'Xitai Road', 'drink', 'lips', 'back','go back','go home', 'come back','be present' 'at home','on the ground', 'address', 'sit', 'deal with', 'police officer', 'summer dream mist', 'trauma', 'foreigner', 'outside', 'how old', 'bridge', 'marble', 'big trouble', 'husband and wife', 'head injury', 'dizziness', 'headache', 'head', 'Audi' 'daughter', 'woman', 'mother', 'sister', 'wife', 'threat', 'extramarital affair', 'baby', 'wife', 'granddaughter', 'lonely old man', 'location', 'home', 'family', 'family member', 'domestic violence', 'domestic violence', 'family door', 'confrontation', 'object', 'general door' 'Community', 'sister-in-law', 'child', 'tail number', 'neighborhood committee', 'neighborhood committee', 'resident', 'mother-in-law', 'work', 'staff', 'tools', 'work number', 'already in place', 'market', 'and said', 'landline', 'open door', 'remote place', 'mouth', 'trigger', 'brother' 'client', 'know', 'necessary', 'pregnancy', 'ambulance', 'situation', 'unknown situation', 'emotion', 'plot', 'success', 'hand', 'hold', 'finger', 'mobile phone', 'cell phone number', 'hand pain', 'hand', 'hand', 'hit someone', 'hit', 'down' 'hit it', 'fight', 'fight', 'kill', 'call', 'break', 'slap in the face', 'hit please', 'broom', 'scratch', 'report', 'call the police', 'burden', 'slippers', 'unstoppable', 'take out', 'get it', 'take ruler', 'take length', 'get shoes', 'hold knife' 'armed', 'sustained', 'holding food', 'pinching', 'answering the phone', 'control', 'measures', 'carry', 'put down', 'put down', 'help', 'rescue', 'ambulance','no one','no harm','no need', 'morning', 'yesterday', 'temporary', 'injured', 'knife' 'wooden stick', 'cups', 'Songjiang', 'table', 'Meilong', 'stick', 'stick head', 'chair', 'upstairs', 'hammer', 'this policeman', 'weapons', 'weapons and equipment', 'disabled people', 'mother', 'mother and 
son', 'towels', 'policeman' Kettle, fruit knife, help, Shen Yang, nothing, Shanghai card, safety, activity room, police station, blood, Pudong, excitement, ashtray, paper burning, photo, unable to get up, father, teeth, property company 'articles', 'Toys', 'present people', 'present Dao', 'scene', 'now called', 'now to run', 'Glass', 'bottle', 'Appliances', 'feet', 'Telegraph', 'Wire','TV', 'phone', 'suspected', 'Disease', 'White', 'Belt', 'Box' 'related', 'invisible', 'eye', 'contradiction', 'machete', 'smashed', 'confirm', 'leave', 'divorce', 'say belong', 'say have', 'claim to show', 'say to want', 'later', 'window', 'bamboo stick', 'wait', 'equal' Chopsticks, spirit, mental illness, disputes, economic disputes, Zhai Lu, Zhai Luji, old man, uncle, wife, husband, old man, old man, old lady, old lady, old man, wife, old man, contact number, ribs 'KFC', 'neck', 'washbasin', 'automatic', 'self-harm', 'claim', 'self', 'call', 'English translation', 'kitchen knife', 'abuse', 'screwdriver', 'clothes', 'clothes', 'hangers', 'replenishment', 'equipment', 'installation', 'west gate', 'explanation', 'guard room', 'police' 'equipment', 'inquiry', 'household', 'gambling', 'after leaving', 'drive out', 'cause', 'passers-by', 'roadside', 'approach', 'body', 'transfer', 'minor injury', 'minor injury', 'car', 'Xinggou Road', 'passing police', 'looking over', 'also called' entering the house' 'chase him', 'escape', 'escape', 'notice', 'call', 'neighbor', 'wine bottle', 'drunk', 'key bridge', 'iron bar', 'iron', 'lock thing', 'spatula', 'hammer', 'doorman', 'doorman', 'doorway', 'outside', 'gate', 'Minhang', 'protection' 'Auntie', 'Chen Lu', 'next door', 'shoes', 'Korea', 'necklace', 'wound examination', 'fracture', 'black', 'nose', 'Longzhou', 'Dragon Boat']

C:\Users\Windows\Anaconda3\lib\site-packages\sklearn\feature_extraction\image.py:167: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype=np.int):
Building prefix dict from the default dictionary...
Loading model from cache C:\Users\Windows\AppData\Local\Temp\jieba.cache
Loading model cost 0.922 seconds.
Prefix dict has been built successfully.
1.0

Process finished with exit code 0

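The accuracy on the test split is 1.0, that is, 100%. This is less surprising than it sounds: each report names the relative involved, and words such as 'husband', 'wife', 'son', and 'daughter' appear in the extracted vocabulary. Mom no longer needs to worry about me being abused!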
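For a classifier, the score method in scikit-learn returns the mean accuracy, which is the fraction of test samples whose predicted label equals the true label. The following sketch computes that fraction by hand; the function name and the label lists are hypothetical and are not the predictions from the run above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: 0 = husband, 1 = wife, 2 = son, 3 = daughter.
y_true = [0, 1, 2, 3, 0, 1]
y_pred_perfect = [0, 1, 2, 3, 0, 1]
y_pred_one_miss = [0, 1, 2, 3, 0, 2]

perfect = accuracy(y_true, y_pred_perfect)
one_miss = accuracy(y_true, y_pred_one_miss)

print(perfect)
print(round(one_miss, 3))
```

A printed score of 1.0 therefore means that every sample in the test split was assigned to the correct category.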

That is all for this introduction to the basic operations of NLP in Python through domestic violence classification. I hope the content above is of some help and that you learned something from it. If you think the article is good, you can share it for more people to see.
