
What is the process of creating a rule-based chatbot with Python machine learning?

2025-02-22 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article explains the process of creating a rule-based chatbot with Python machine learning. The content is simple and clear, and it is easy to learn and understand. Follow along to see how a rule-based chatbot is built.

    while True:
        AI = input('I: ')
        print(AI.replace('?', '!').replace('？', '!'))

The code above is today's topic: the rule-based chatbot.

Chatbot

A chatbot is a machine or piece of software that mimics human interaction through text. In short, you chat with the software much as you would talk with a person.

Why try to create a chatbot? Maybe you are interested in a new project, or your company needs one, or you want to invest in it. Whatever the motivation, this article explains how to create a simple rule-based chatbot.

Rule-based chatbot

What is a rule-based chatbot? It is a chatbot that answers human text according to specific rules. Because the rules are imposed by us, the responses it generates are mostly accurate; however, if it receives a query that matches no rule, the chatbot cannot answer. The alternative is a model-based chatbot, which answers a given query through a machine learning model. (The difference between the two is that in the rule-based approach we specify every rule ourselves, while the model-based approach generates its rules automatically by training a model. Recall from our previous article, "Introduction to Machine Learning": "Machine learning provides the system with the ability to learn and improve automatically based on experience without explicit programming.")
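To make the idea concrete, here is a minimal sketch of a hand-written rule table. The keywords, the replies, and the `rule_based_reply` helper are all made up for illustration; they are not part of the chatbot built later in this article.

```python
# Hypothetical hand-written rules: keyword -> canned reply (illustrative only).
RULES = {
    "hello": "Hi there!",
    "name": "I am a rule-based chatbot.",
    "bye": "See you again.",
}


def rule_based_reply(query):
    """Return the reply of the first rule whose keyword occurs in the query.

    Returns None when no rule matches, which mirrors the limitation
    described above: outside its rules, the bot has no answer.
    """
    words = query.lower().split()
    for keyword, reply in RULES.items():
        if keyword in words:
            return reply
    return None


print(rule_based_reply("hello bot"))           # matches the "hello" rule
print(rule_based_reply("how is the weather"))  # no rule matches
```

A query that contains none of the keywords falls through every rule, so the function returns `None` instead of guessing.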

Rule-based chatbots rely on rules given by humans, but that doesn't mean we don't use datasets. The main goal of a chatbot is still to automate answering human questions, so we still need data to formulate specific rules.

In this article, we will use cosine similarity as the basis for a rule-based chatbot. Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space, and it is often used to measure the similarity between two texts.

We will use cosine similarity to create a chatbot that answers a query by comparing the similarity between the query and the corpus we develop. This is why we need to build the corpus first.
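As a rough illustration of the measure itself, the sketch below computes the cosine of the angle between two vectors: their dot product divided by the product of their lengths. The helper name `cosine_sim` and the toy vectors are assumptions made for this example only.

```python
import math


def cosine_sim(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (made up): think of them as word counts for two short texts.
same_direction = cosine_sim([1, 2, 0], [2, 4, 0])  # parallel vectors
orthogonal = cosine_sim([1, 0, 0], [0, 1, 0])      # nothing in common

print(round(same_direction, 6))
print(round(orthogonal, 6))
```

Vectors pointing the same way score close to 1 regardless of their length, while vectors with no component in common score 0, which is why the measure suits texts of different sizes.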

Create a corpus

For this example, I want to create a chatbot that answers questions about cats. To collect data about cats, I will scrape it from the Internet.

    import bs4 as bs
    import urllib.request

    # Open the cat web data page
    cat_data = urllib.request.urlopen('https://simple.wikipedia.org/wiki/Cat').read()

    # Find all the paragraph html from the web page
    cat_data_paragraphs = bs.BeautifulSoup(cat_data, 'lxml').find_all('p')

    # Creating the corpus of all the web page paragraphs
    cat_text = ''

    # Creating lower text corpus of cat paragraphs
    for p in cat_data_paragraphs:
        cat_text += p.text.lower()

    print(cat_text)

Using the above code, you get the collection of paragraphs from the Wikipedia page. Next, clean up the text to remove clutter such as bracketed reference numbers and extra whitespace.

    import re

    # Replace bracketed reference numbers with a space, then collapse repeated whitespace
    cat_text = re.sub(r'\s+', ' ', re.sub(r'\[[0-9]*\]', ' ', cat_text))

The above code removes the bracketed numbers from the corpus. I deliberately kept the other symbols and punctuation because they make the replies sound more natural when talking to the chatbot.
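To see what this cleanup does, the two nested substitutions can be tried on a short string. The sample text below is made up to resemble scraped Wikipedia paragraphs; it is not taken from the real page.

```python
import re

# Made-up sample resembling scraped Wikipedia text.
sample = "cats are   small animals.[1] they like\nto sleep.[23]"

# Inner call: replace bracketed citation numbers such as [1] with a space.
# Outer call: collapse every run of whitespace into a single space.
cleaned = re.sub(r'\s+', ' ', re.sub(r'\[[0-9]*\]', ' ', sample))

print(repr(cleaned))
```

Note that a citation at the very end of the text leaves one trailing space behind; calling `.strip()` on the result would remove it if that matters.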

Finally, I will create a list of sentences based on the corpus I created earlier.

    import nltk

    # Split the corpus into a list of sentences
    cat_sentences = nltk.sent_tokenize(cat_text)

Our rule is simple: measure the cosine similarity between the query text and each sentence in the sentence list. Whichever sentence has the highest cosine similarity becomes the chatbot's answer.
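The rule can be sketched without any libraries by using raw word counts as the vectors. This is a simplified stand-in, not the implementation used in this article; the helpers `count_cosine` and `best_sentence` and the toy sentences are assumptions made for illustration.

```python
import math
from collections import Counter


def count_cosine(text_a, text_b):
    """Cosine similarity between the raw word counts of two texts."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_sentence(query, sentences):
    """Return the sentence most similar to the query, or None if none overlaps."""
    scored = [(count_cosine(query, s), s) for s in sentences]
    score, sentence = max(scored, key=lambda pair: pair[0])
    return sentence if score > 0 else None


# Made-up stand-in for the real sentence list.
toy_sentences = [
    "cats are popular pets",
    "cats sleep most of the day",
    "dogs like long walks",
]

print(best_sentence("how long do cats sleep", toy_sentences))
print(best_sentence("quantum physics", toy_sentences))
```

The first query shares two words with the second sentence and only one with the others, so that sentence wins; the second query shares no words with anything, so no answer is returned.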

Create a chatbot

The corpus above is still text, and cosine similarity does not accept text data, so we need to convert the corpus into numerical vectors. The common practice is to convert the text into a bag of words (word counts) or to use the TF-IDF method (term frequency-inverse document frequency). In our example, we will use TF-IDF.
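The difference between the two representations can be sketched in a few lines. The example below uses the plain textbook weighting, count times log(N / document frequency), on a made-up mini corpus; scikit-learn's `TfidfVectorizer` applies a smoothed variant and normalises each vector by default, so its numbers will differ.

```python
import math
from collections import Counter

# Made-up mini corpus.
docs = [
    "cats are small pets",
    "cats sleep a lot",
    "cats hunt mice",
]
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents does each word appear?
doc_freq = Counter(word for tokens in tokenized for word in set(tokens))


def tf_idf(tokens):
    """Weight each word by its count times log(N / document frequency)."""
    counts = Counter(tokens)
    return {w: c * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}


bag_of_words = Counter(tokenized[2])
weights = tf_idf(tokenized[2])

print(bag_of_words)  # every word counts the same
print(weights)       # "cats" is in every document, so its weight drops to 0
```

In the bag of words, "cats" counts as much as "hunt" or "mice". Under TF-IDF, a word that appears in every document carries no weight, so the rarer and more informative words dominate the similarity score.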

I'll create a function that takes the query text and returns an answer based on cosine similarity. Let's look at the code.

    from sklearn.metrics.pairwise import cosine_similarity
    from sklearn.feature_extraction.text import TfidfVectorizer

    def chatbot_answer(user_query):
        # Append the query to the sentences list
        cat_sentences.append(user_query)

        # Create the sentences vector based on the list
        vectorizer = TfidfVectorizer()
        sentences_vectors = vectorizer.fit_transform(cat_sentences)

        # Measure the cosine similarity and take the second closest index,
        # because the closest one is the user query itself
        vector_values = cosine_similarity(sentences_vectors[-1], sentences_vectors)
        answer = cat_sentences[vector_values.argsort()[0][-2]]

        # Final check to make sure a result is present. If all results are 0,
        # the input text is not captured in the corpus
        input_check = vector_values.flatten()
        input_check.sort()

        if input_check[-2] == 0:
            return "Please Try again"
        else:
            return answer

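The flow of the function is as follows: the query is appended to the sentence list, the whole list is converted into TF-IDF vectors, and the cosine similarity between the query and every sentence is computed. The sentence with the second-highest score is chosen, because the highest score belongs to the query itself. If that score is 0, the function returns "Please Try again"; otherwise it returns the chosen sentence.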

Finally, create a simple question-and-answer interaction using the following code.

    print("Hello, I am the Cat Chatbot. What is your meow questions?:")
    while(True):
        query = input().lower()
        if query not in ['bye', 'good bye', 'take care']:
            print("Cat Chatbot: ", end="")
            print(chatbot_answer(query))
            # Remove the query again so it does not stay in the corpus
            cat_sentences.remove(query)
        else:
            print("See You Again")
            break

The above script receives queries and processes them through the chatbot function we developed earlier.
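One way to check this loop without typing into a console is to separate the flow from the input and output. The sketch below is only an illustration of that idea: `run_chat` and `echo_answer` are hypothetical helpers, and `echo_answer` stands in for `chatbot_answer` so the example runs without the corpus.

```python
EXIT_WORDS = ['bye', 'good bye', 'take care']  # same exit phrases as above


def run_chat(queries, answer_fn):
    """Feed queries to answer_fn until an exit phrase appears.

    Returns the list of lines the bot would print, so the flow can be
    checked without typing into a console.
    """
    lines = []
    for raw in queries:
        query = raw.lower()
        if query in EXIT_WORDS:
            lines.append("See You Again")
            break
        lines.append("Cat Chatbot: " + answer_fn(query))
    return lines


# Hypothetical stand-in for chatbot_answer, used only for this demo.
def echo_answer(query):
    return "you asked about " + query


print(run_chat(["Do cats purr", "Bye", "never reached"], echo_answer))
```

Because the query is lowercased before the comparison, "Bye" ends the conversation just like "bye", and anything after the exit phrase is ignored.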

In a test run, the results are acceptable, although some answers are strange. Keep in mind that the chatbot draws on only one data source so far and that no optimization has been done. If we improve it with additional datasets and rules, it will certainly answer questions better.

Thank you for reading. That covers the process of creating a rule-based chatbot with Python machine learning. After studying this article, you should have a deeper understanding of that process; how well it works for a specific use still needs to be verified in practice.
