
Analysis of basic concepts and code examples of Mahout, collaborative filtering and CF recommendation algorithms

2025-01-15 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article covers the basic concepts of Mahout, collaborative filtering and CF recommendation algorithms, together with an analysis of code examples. It is written for beginners.

Collaborative filtering

Collaborative filtering is a typical way of using collective intelligence. To understand what Collaborative Filtering (CF) is, start with a simple question: if you want to watch a movie but do not know which one, what do you do? Most people ask their friends whether they have seen anything good recently, and we generally prefer recommendations from friends whose tastes are similar to ours. This is the core idea of collaborative filtering.

Collaborative filtering generally finds a small number of users whose tastes are similar to yours. These users are called neighbors. The other items they like are then organized into a ranked list and recommended to you. This raises two core questions:

How do we determine whether a user has tastes similar to yours?

How do we organize the neighbors' preferences into a ranked list?

Compared with collective intelligence in general, collaborative filtering retains individual characteristics to some extent, namely your own taste preferences, so it can serve as the basis for personalized recommendation algorithms.

Algorithm evaluation criteria: recall and precision

Mahout provides two metrics for evaluating a recommender, precision and recall, both of which are classic metrics from search engines.

                 Relevant   Not relevant
Retrieved           A            C
Not retrieved       B            D

A: retrieved and relevant (found, and wanted)

B: not retrieved, but relevant (not found, but actually wanted)

C: retrieved, but irrelevant (found, but useless)

D: not retrieved and irrelevant (not found, and not wanted anyway)

Retrieving as many of the relevant items as possible is the goal of "recall", that is, A / (A + B). The larger, the better.

Among the items that are retrieved, the more relevant ones and the fewer irrelevant ones, the better. This is the goal of "precision", that is, A / (A + C). The larger, the better.
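As a minimal sketch, the two ratios recall = A / (A + B) and precision = A / (A + C) can be computed from a set of retrieved items and a set of relevant items. The function name `precision_recall` and the movie ids are made up for illustration and are not part of Mahout.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved items vs. relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    a = len(retrieved & relevant)   # A: retrieved and relevant
    b = len(relevant - retrieved)   # B: relevant but not retrieved
    c = len(retrieved - relevant)   # C: retrieved but irrelevant
    precision = a / (a + c) if (a + c) else 0.0
    recall = a / (a + b) if (a + b) else 0.0
    return precision, recall


# Hypothetical data: four movies were recommended, three were actually wanted.
retrieved = {"m1", "m2", "m3", "m4"}
relevant = {"m2", "m4", "m5"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here A = 2, B = 1 and C = 2, so precision is 2/4 and recall is 2/3.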

On large data sets the two metrics constrain each other. When you try to retrieve more items, precision drops, and when you try to make the results more precise, you retrieve fewer items and recall drops.
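The trade-off can be illustrated by cutting a ranked recommendation list at different lengths N. The sketch below uses a made-up ranking and a hypothetical helper `precision_recall_at_n`. In this particular example precision falls as N grows while recall rises, although real data need not be strictly monotonic.

```python
def precision_recall_at_n(ranked, relevant, n):
    """Precision and recall when only the top-n ranked items are returned."""
    top = ranked[:n]
    hits = sum(1 for item in top if item in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked output of a recommender and the items the user wanted.
ranked = ["m2", "m7", "m4", "m9", "m1", "m5"]
relevant = {"m2", "m4", "m5"}

cutoffs = (1, 3, 6)
curve = [precision_recall_at_n(ranked, relevant, n) for n in cutoffs]
for n, (p, r) in zip(cutoffs, curve):
    print(f"N={n}: precision={p:.2f}, recall={r:.2f}")
```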

The basic principles of user-based CF

User-based collaborative filtering measures the similarity between users from the ratings they have given to items, and makes recommendations based on that similarity. Put simply: recommend to a user the items liked by other users whose interests are similar.
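One common way to turn ratings into a user-to-user similarity is the Pearson correlation over the items both users have rated. The article does not say which measure it has in mind, so treat this as one possible choice. The function name `pearson_similarity` and the rating data are hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_u, ratings_v):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0  # not enough overlap to say anything
    xs = [ratings_u[i] for i in common]
    ys = [ratings_v[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user who rates everything the same carries no signal
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 4.0, "B": 2.0, "C": 3.0, "D": 5.0},
    "carol": {"A": 1.0, "B": 5.0, "C": 2.0},
}
sim_ab = pearson_similarity(ratings["alice"], ratings["bob"])
sim_ac = pearson_similarity(ratings["alice"], ratings["carol"])
print(f"alice~bob={sim_ab:.2f}, alice~carol={sim_ac:.2f}")
```

Bob rates every shared item exactly one point lower than Alice, so their correlation is 1.0 even though the absolute scores differ. Carol's ratings move in the opposite direction, so her correlation with Alice is negative.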

The basic idea of user-based CF is quite simple: use each user's preferences for items to find that user's neighbors, then recommend what the neighbors like to the current user. In the calculation, each user's preferences for all items are treated as a vector, and the similarity between users is computed from these vectors. After the K nearest neighbors have been found, the current user's preference for items they have not yet rated is predicted from the neighbors' similarity weights and their preferences for those items, which yields a ranked list of items as the recommendation. The original figure illustrates this with an example: for user A, the historical preferences yield only one neighbor, user C, and item D, which user C likes, is then recommended to user A.
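The steps just described (preference vectors, K nearest neighbors, similarity-weighted prediction, ranked list) can be sketched as follows. This is an illustrative sketch and not Mahout's implementation. It assumes cosine similarity and binary "likes", and the function names are hypothetical. The preference data is invented so that it reproduces the outcome described for the figure, since the figure itself is not reproduced here.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of two sparse preference vectors (missing items count as 0)."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(prefs, user, k=1, top_n=3):
    """Rank items the user has not rated, using the k most similar users."""
    sims = [(cosine_similarity(prefs[user], prefs[other]), other)
            for other in prefs if other != user]
    # Keep the k most similar users, ignoring those with no overlap at all.
    neighbors = [(s, o) for s, o in sorted(sims, reverse=True)[:k] if s > 0]
    scores, weights = {}, {}
    for sim, other in neighbors:
        for item, pref in prefs[other].items():
            if item in prefs[user]:
                continue  # only predict items the user has no preference for
            scores[item] = scores.get(item, 0.0) + sim * pref
            weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average of the neighbors' preferences.
    ranked = [(item, scores[item] / weights[item]) for item in scores]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)[:top_n]


# Assumed data: 1.0 means "likes the item".
prefs = {
    "userA": {"itemA": 1.0, "itemC": 1.0},
    "userB": {"itemB": 1.0},
    "userC": {"itemA": 1.0, "itemC": 1.0, "itemD": 1.0},
}
recs = recommend(prefs, "userA", k=1)
print(recs)
```

With K = 1 the only neighbor of userA is userC, because userB shares no items with userA, and the only item userC likes that userA has not seen is itemD.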
