
How Quora applies machine learning in practice

Updated 2025-01-18. From: SLTechnology News&Howtos


This article explains how Quora applies machine learning in the day-to-day operation of its product.

Ranking

Ranking is arguably one of the most important applications of machine learning on the web. Companies large and small build their business models around ranking, for example ranking the results returned for a query string. Quora uses different ranking algorithms in different contexts and for different purposes.

An interesting example is answer ranking. Given several answers to a question, we want to sort them so that the "best" answer comes first and the worst answer comes last (see screenshot below).

Determining the correct order of answers to a question involves a variety of features. To do so, we first need to define what Quora considers a "good answer". Our algorithm implements a specialized learning-to-rank method, which uses a variety of features that try to encode the multiple dimensions of that abstract concept. For example, we use features that describe the quality of the writing, as well as features that describe the interactions the answer has received (such as upvotes, downvotes, and the number of expands). We also use features related to the author of the answer, such as their expertise in the topic of the question.
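
As a rough illustration of the idea above, the following Python sketch scores each answer as a weighted sum of a few features and sorts by that score. The feature names, value ranges, and weights are all hypothetical; a real learning-to-rank system would learn the scoring function from data rather than use hand-set weights.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    answer_id: str
    writing_quality: float   # hypothetical quality score in [0, 1]
    upvotes: int
    downvotes: int
    author_expertise: float  # hypothetical expertise score in [0, 1]


# Hypothetical hand-set weights, standing in for learned parameters.
WEIGHTS = {"writing_quality": 2.0, "net_votes": 0.1, "author_expertise": 1.5}


def score(answer: Answer) -> float:
    """Combine writing, interaction, and author features into one score."""
    net_votes = answer.upvotes - answer.downvotes
    return (
        WEIGHTS["writing_quality"] * answer.writing_quality
        + WEIGHTS["net_votes"] * net_votes
        + WEIGHTS["author_expertise"] * answer.author_expertise
    )


def rank_answers(answers):
    """Return the answers sorted from highest to lowest score."""
    return sorted(answers, key=score, reverse=True)


answers = [
    Answer("a1", writing_quality=0.4, upvotes=10, downvotes=2, author_expertise=0.2),
    Answer("a2", writing_quality=0.9, upvotes=5, downvotes=0, author_expertise=0.8),
    Answer("a3", writing_quality=0.2, upvotes=1, downvotes=6, author_expertise=0.1),
]
ranked_ids = [a.answer_id for a in rank_answers(answers)]
print(ranked_ids)
```

Note that the answer with the most upvotes does not necessarily come first here, because the writing and author features also contribute to the score.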

There are many other ranking applications on Quora, some of which go unnoticed. For example, the names of the users who upvoted an answer are also sorted, so that the users we consider most knowledgeable about the question or answer appear at the top. Similarly, when potential answerers are suggested for a particular question, those recommended users are also ranked.

Let's take a closer look at two special cases of learning to rank: search and personalized ranking.

Search algorithm

For an application like Quora, search can be seen as another application of ranking. In fact, search can be broken down into two steps: text matching and ranking. The first step retrieves the documents (questions) that match the query string entered in the search box. In the second step, these documents are treated as candidates and ranked to optimize objectives such as click probability.

Many features can be used in the second step, which is itself another example of learning to rank. They include simple text features already used in the initial text-matching phase, as well as features related to user behavior and to object attributes such as popularity.
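
The two steps can be sketched roughly as follows. This is a minimal Python example under simplifying assumptions: matching is plain token overlap, the ranking score is a hand-set mix of a text feature and a `popularity` field, and all names and data are hypothetical rather than taken from Quora's system.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text analysis."""
    return set(text.lower().split())


def match(query, questions):
    """Step 1: keep questions that share at least one token with the query."""
    query_tokens = tokenize(query)
    return [q for q in questions if query_tokens & tokenize(q["text"])]


def rank(query, candidates):
    """Step 2: order candidates by a text feature plus a popularity feature."""
    query_tokens = tokenize(query)

    def score(question):
        overlap = len(query_tokens & tokenize(question["text"]))
        text_feature = overlap / max(len(query_tokens), 1)
        return text_feature + 0.5 * question["popularity"]  # hypothetical weight

    return sorted(candidates, key=score, reverse=True)


questions = [
    {"id": 1, "text": "How does machine learning work", "popularity": 0.9},
    {"id": 2, "text": "What is machine translation", "popularity": 0.2},
    {"id": 3, "text": "Best pizza in Rome", "popularity": 0.8},
]
query = "machine learning"
results = [q["id"] for q in rank(query, match(query, questions))]
print(results)
```

Question 3 is popular but never reaches the ranking step, because it fails text matching in the first step.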

Personalized ranking

In some of the scenarios described above, a single globally optimal ranking for all users may be sufficient. In other words, we can assume that for a given question, the ranking of the most "helpful" answers is independent of the user reading them. However, this assumption does not hold in many important cases. One of them is the Quora Feed, which is essentially the home page that any logged-in user sees. On this home page, we try to select and rank the stories that are most "interesting" to a specific user at a specific time (see example below). This is a typical personalized learning-to-rank problem, similar to ranking movies and TV shows on the Netflix home page.

Quora's use case is more challenging than ranking movies and TV shows on Netflix. In fact, it can be seen as a combination of the personalized ranking problems that Netflix, Facebook, and Google News optimize. First, we need to make sure that the top stories are topically relevant to the user. Second, Quora has explicit relationships between users, so your activity on this "social network" should also influence the ranking. Third, stories on Quora may sometimes be tied to ongoing trending events, so timeliness is another factor the model should weigh when deciding whether a story ranks higher or lower.

Because of this, Quora's personalized ranking involves many different features. Several are listed below:

Quality of the question / answer

Topics the user is interested in

Other users the user follows

Trending events

...

It is important to keep in mind that at Quora we are interested not only in getting users to read interesting content, but also in surfacing questions to users who can write interesting content. Therefore, we must include features that capture how interesting an answer is as well as features that capture how interesting a question is. To derive these features, we use information from the actions of users, authors, and objects (such as answers and questions). These actions are aggregated over different time windows and fed to the ranking algorithm. In fact, many different features can be derived and added to our personalized feed model, and we keep trying to add more.
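
To make the idea of aggregating actions over time windows concrete, here is a small Python sketch. It counts events (for example, upvotes on an answer) that fall inside several trailing windows; the window names, lengths, and timestamps are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def windowed_counts(event_times, now, windows):
    """Count events inside each trailing window that ends at `now`."""
    return {
        name: sum(1 for t in event_times if now - span <= t <= now)
        for name, span in windows.items()
    }


now = datetime(2024, 1, 31, 12, 0)

# Hypothetical timestamps of upvotes received by one answer.
upvote_times = [
    now - timedelta(hours=2),
    now - timedelta(days=3),
    now - timedelta(days=20),
]

# Hypothetical window definitions.
windows = {
    "1d": timedelta(days=1),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}

features = windowed_counts(upvote_times, now, windows)
print(features)
```

Each count can then be passed to a ranking model as a separate feature, so the model can distinguish recent activity from older activity.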

Another important consideration for Feed ranking is that we need to respond to users' actions, impressions, and even trending events. With millions of questions and answers and a corpus that is still growing, we cannot rank all of the content for every user in real time. To optimize the experience, we implemented a multi-stage ranking solution in which candidates are selected and ranked in advance, before the final ranking is performed.
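
A multi-stage ranking of this kind might look roughly like the Python sketch below. It assumes a cheap score computed ahead of time (`precomputed_score`) for candidate selection and a simple topic-match bonus for the final, per-user stage; both are hypothetical stand-ins for the real models.

```python
import heapq


def select_candidates(stories, k):
    """Stage 1: keep the k stories with the highest precomputed score."""
    return heapq.nlargest(k, stories, key=lambda s: s["precomputed_score"])


def final_rank(candidates, user_topics):
    """Stage 2: re-rank the small candidate set with a per-user signal."""

    def score(story):
        topic_bonus = 1.0 if story["topic"] in user_topics else 0.0  # hypothetical
        return story["precomputed_score"] + topic_bonus

    return sorted(candidates, key=score, reverse=True)


stories = [
    {"id": "s1", "topic": "cooking", "precomputed_score": 0.9},
    {"id": "s2", "topic": "ml", "precomputed_score": 0.7},
    {"id": "s3", "topic": "travel", "precomputed_score": 0.5},
    {"id": "s4", "topic": "ml", "precomputed_score": 0.1},
]

candidates = select_candidates(stories, k=3)
feed = [s["id"] for s in final_rank(candidates, user_topics={"ml"})]
print(feed)
```

The expensive per-user scoring only runs on the candidates that survive the first stage, which is what keeps the final ranking affordable at request time.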

Recommendations

The personalized ranking described above is already a form of recommendation. Similar methods are used in other cases. For example, the popular Quora email digest includes a series of stories selected and recommended for you. It uses a different learning-to-rank model, optimized for a different objective function. Besides ranking algorithms, we have other personalized recommendation algorithms in different parts of the product. For example, you can see recommendations for people or topics in several places (see figure below).

Related questions

Another source of recommendations is showing users other questions that are related to the current one.

Related questions are determined by another machine learning model, which takes into account a variety of features, such as text similarity, co-visit data, and shared attributes such as topics. Features related to the popularity or quality of the question are also considered. It is important to point out that a good "related question" recommendation depends not only on how similar an item is to the source question, but also on how "interesting" the target question is. In fact, for any "related item" machine learning model, the trickiest part is the tradeoff between similarity and other relevance factors.
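
The tradeoff between similarity and interest can be illustrated with a simple blended score. In this Python sketch, similarity is the Jaccard overlap of question tokens and `interest` is a hypothetical precomputed value in [0, 1]; the mixing parameter `alpha` is an assumption for illustration, not a parameter of Quora's model.

```python
def jaccard(text_a, text_b):
    """Token-set overlap between two texts, in [0, 1]."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_score(source, candidate, alpha=0.7):
    """Blend similarity to the source with the candidate's own interest."""
    similarity = jaccard(source, candidate["text"])
    return alpha * similarity + (1 - alpha) * candidate["interest"]


def best_related(source, candidates, alpha=0.7):
    return max(candidates, key=lambda c: related_score(source, c, alpha))


source = "how to learn python"
candidates = [
    {"text": "how to learn python fast", "interest": 0.2},
    {"text": "how to learn java", "interest": 0.9},
]

by_blend = best_related(source, candidates, alpha=0.7)["text"]
by_similarity_only = best_related(source, candidates, alpha=1.0)["text"]
print(by_blend)
print(by_similarity_only)
```

With similarity alone, the near-identical question wins; once interest is blended in, a less similar but more interesting question can rank higher.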

The related questions model is particularly effective at engaging logged-out users who reach a question page from an external search. This is one of the reasons why this recommendation model has not been personalized so far.

Duplicate questions

Duplicate questions are the extreme case of the related questions described above. They are a challenge for Quora because we want the effort users put into answering a particular question to be shared and concentrated in the right place. Likewise, users who are about to ask a question on the site should be pointed to existing answers. We therefore put a lot of effort into detecting duplicate questions, especially at the moment a question is being asked.

Our current solution is based on binary classifiers trained on duplicate / non-duplicate labels. We use a variety of signals, from text vector space models to usage-based features.
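
As a minimal sketch of one such signal, the Python example below computes the cosine similarity of two questions in a bag-of-words vector space and passes it through a logistic function to obtain a duplicate probability. The `weight` and `bias` values are hand-set placeholders for parameters that would normally be learned from labeled pairs, and a real classifier would combine many more features.

```python
import math
from collections import Counter


def cosine(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[token] * vb[token] for token in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def duplicate_probability(q1, q2, weight=8.0, bias=-4.0):
    """Logistic model on a single feature; weight and bias are placeholders."""
    z = weight * cosine(q1, q2) + bias
    return 1.0 / (1.0 + math.exp(-z))


def is_duplicate(q1, q2, threshold=0.5):
    return duplicate_probability(q1, q2) >= threshold


pair_similar = ("how do i learn python", "how do i learn python quickly")
pair_different = ("how do i learn python", "best pizza in rome")

print(is_duplicate(*pair_similar))
print(is_duplicate(*pair_different))
```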

User trust / expertise inference

In an application like Quora, it is very important to keep track of how trustworthy users are. In fact, we are interested not only in a user's trustworthiness in general, but also in how it relates to a given topic. A user may be knowledgeable about some topics but not about others. Quora uses machine learning techniques to infer users' expertise. We know not only which answers a user has written on a given topic, but also how many upvotes and downvotes, and what kind of comments, those answers received. We also know how many endorsements the user has received in that topic. An endorsement is a very explicit recognition of someone's expertise by other users.

Another important thing to keep in mind is that trust and expertise propagate through the network, and the algorithm needs to take this into account. For example, if a machine learning expert upvotes my answer on a machine learning topic, that upvote should count for more than one from a random user who is not an expert in the field. The same applies to endorsements and other user-to-user features.
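
One simple way to picture this weighting is to count each upvote in proportion to the voter's expertise in the topic. The Python sketch below does exactly that; the expertise values and the default for unknown users are hypothetical, and it shows only a single weighting step rather than full propagation through the network.

```python
def weighted_upvotes(voter_ids, expertise_by_user, default_expertise=0.1):
    """Sum upvotes, weighting each one by the voter's topic expertise."""
    return sum(expertise_by_user.get(voter, default_expertise) for voter in voter_ids)


# Hypothetical expertise scores for the machine learning topic.
ml_expertise = {"ml_expert": 0.9}

# Answer A has one upvote from an expert; answer B has three from unknown users.
score_a = weighted_upvotes(["ml_expert"], ml_expertise)
score_b = weighted_upvotes(["u1", "u2", "u3"], ml_expertise)

print(score_a > score_b)
```

Under these assumed weights, a single expert upvote outweighs three upvotes from users with no known expertise in the topic.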

Spam detection and moderation

Sites like Quora, which pride themselves on keeping content quality high, must be wary of attempts to game the system with spam, malicious, or very low-quality content. Purely manual review does not scale. The solution, as you might guess, is to use machine learning models to detect these issues.

Quora has several models that detect different kinds of content-quality issues. In most cases, the output of these classifiers is not used to make decisions directly; instead, the flagged questions and answers are sent to a moderation queue and then reviewed manually.
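
The pattern of routing classifier output to human review, rather than acting on it automatically, can be sketched as follows. The scoring function here is a toy heuristic (the share of tokens that look like links) standing in for a trained classifier, and the threshold is an arbitrary assumption.

```python
from collections import deque


def toy_spam_score(text):
    """Toy stand-in for a classifier: fraction of tokens that look like links."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(token.startswith("http") for token in tokens) / len(tokens)


def route_for_review(items, score_fn, threshold):
    """Queue suspicious items for manual review; nothing is removed automatically."""
    queue = deque()
    for item in items:
        if score_fn(item["text"]) >= threshold:
            queue.append(item["id"])
    return queue


items = [
    {"id": "c1", "text": "Buy now http://example.com/a http://example.com/b"},
    {"id": "c2", "text": "A thoughtful answer about ranking models"},
]

review_queue = route_for_review(items, toy_spam_score, threshold=0.4)
print(list(review_queue))
```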

Prediction of content creation

It is important to remember that Quora optimizes many parts of the system not only to engage readers, but also to generate the best-quality and most in-demand content. For this reason, we have a machine learning model that predicts the probability that a user will write an answer to a given question. This allows our system to prioritize those questions in several ways. One of them is the system's automatic A2A (Ask to Answer), which sends questions to potential answerers via notifications. Other ranking systems mentioned above also use the probabilities predicted by this model.
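
A minimal sketch of how such a predicted probability could drive A2A suggestions is shown below. It assumes a logistic model over two hypothetical features (`topic_expertise` and `recent_answers`) with hand-set weights, then picks the users most likely to answer; none of these names or values come from Quora's actual model.

```python
import math


def answer_probability(features, weights, bias):
    """Logistic model: probability that a user answers a given question."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def pick_a2a_candidates(user_features, weights, bias, k):
    """Return the k users with the highest predicted answer probability."""
    scored = {
        user: answer_probability(features, weights, bias)
        for user, features in user_features.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Hypothetical weights standing in for learned parameters.
weights = {"topic_expertise": 2.0, "recent_answers": 0.5}

# Hypothetical per-user features for one question.
users = {
    "alice": {"topic_expertise": 0.9, "recent_answers": 3},
    "bob": {"topic_expertise": 0.1, "recent_answers": 0},
    "carol": {"topic_expertise": 0.5, "recent_answers": 1},
}

a2a_candidates = pick_a2a_candidates(users, weights, bias=-2.0, k=2)
print(a2a_candidates)
```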

Models

Quora has tried many different models for the cases described above. Sometimes we use open source implementations, but more often we end up building our own, more efficient and flexible implementations. I won't discuss the details of the models, but here is a list of the models our systems use:

Logistic regression

Elastic net

Gradient boosted decision trees

Random forests

Neural networks

LambdaMART

Matrix factorization

Vector models and other natural language processing techniques

...

