How to Implement Cosine-Similarity-Based User Similarity Calculation for the Spark MLlib Collaborative Filtering Algorithm

2025-02-14 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article shows how to calculate user similarity based on cosine similarity for the Spark MLlib collaborative filtering algorithm. It is intended as a reference for interested readers; the walkthrough follows.
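For reference, the standard definition of cosine similarity between two users can be written as below. The symbols are notation introduced here, not taken from the original article: a and b are the rating vectors of two users over the same n films.

```latex
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}} \; \sqrt{\sum_{i=1}^{n} b_i^{2}}}
```

In the article's code, the numerator corresponds to `member` and the two square roots correspond to `temp1` and `temp2`. Because all ratings are non-negative, the result lies between 0 and 1.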

Run the code as follows:

    /*
     * Collaborative filtering: user similarity calculation based on cosine similarity.
     * Euclidean similarity is generally used to express the absolute difference between
     * targets, that is, how similar or different they are in magnitude.
     * Cosine similarity instead distinguishes targets by the direction (trend) of their vectors.
     */
    package spark.collaborativeFiltering

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.collection.mutable.Map

    object sparkCollaborativeFiltering {
      val conf = new SparkConf().setMaster("local").setAppName("CollaborativeFilteringSpark") // set the environment variables
      val sc = new SparkContext(conf) // instantiate the environment
      val users = sc.parallelize(Array("Zhang San", "Li Si", "Wang Wu", "Zhu Liu", "Zhuo Qi")) // set the users
      val films = sc.parallelize(Array("Gone with the Wind", "Dragon Inn", "Romeo and Juliet", "Macau Fengyun", "Wolf Totem")) // set the film names

      // Nested map used as storage: user name -> (film name -> score)
      val source = Map[String, Map[String, Int]]()
      val filmSource = Map[String, Int]() // a map for storing film scores

      def getSource(): Map[String, Map[String, Int]] = { // set the film ratings
        val user1FilmSource = Map("Gone with the Wind" -> 2, "Dragon Inn" -> 3, "Romeo and Juliet" -> 1, "Macau Fengyun" -> 0, "Wolf Totem" -> 1)
        val user2FilmSource = Map("Gone with the Wind" -> 1, "Dragon Inn" -> 2, "Romeo and Juliet" -> 2, "Macau Fengyun" -> 1, "Wolf Totem" -> 4)
        val user3FilmSource = Map("Gone with the Wind" -> 2, "Dragon Inn" -> 1, "Romeo and Juliet" -> 0, "Macau Fengyun" -> 1, "Wolf Totem" -> 4)
        val user4FilmSource = Map("Gone with the Wind" -> 3, "Dragon Inn" -> 2, "Romeo and Juliet" -> 0, "Macau Fengyun" -> 5, "Wolf Totem" -> 3)
        val user5FilmSource = Map("Gone with the Wind" -> 5, "Dragon Inn" -> 3, "Romeo and Juliet" -> 1, "Macau Fengyun" -> 1, "Wolf Totem" -> 2)
        source += ("Zhang San" -> user1FilmSource) // store each rating map under the user's name
        source += ("Li Si" -> user2FilmSource)
        source += ("Wang Wu" -> user3FilmSource)
        source += ("Zhu Liu" -> user4FilmSource)
        source += ("Zhuo Qi" -> user5FilmSource)
        source // return the nested map
      }

      // Compute the pairwise score using cosine similarity
      def getCollaborateSource(user1: String, user2: String): Double = {
        val user1Ratings = source(user1)
        val user2Ratings = source(user2)
        // Read both users' scores in the same film order; the iteration order of
        // a mutable Map's values is not guaranteed to match between two maps.
        val filmNames = user1Ratings.keys.toVector
        val user1FilmSource = filmNames.map(user1Ratings) // scores of the first user
        val user2FilmSource = filmNames.map(user2Ratings) // scores of the second user
        // Numerator of the formula: zip pairs up the two score vectors element by element
        val member = user1FilmSource.zip(user2FilmSource).map(d => d._1 * d._2).reduce(_ + _).toDouble
        // First factor of the denominator: square each score, sum, take the square root
        val temp1 = math.sqrt(user1FilmSource.map(num => math.pow(num, 2)).reduce(_ + _))
        // Second factor of the denominator
        val temp2 = math.sqrt(user2FilmSource.map(num => math.pow(num, 2)).reduce(_ + _))
        val denominator = temp1 * temp2 // the denominator
        member / denominator // the cosine similarity
      }

      def main(args: Array[String]): Unit = {
        getSource() // initialize the scores
        val name = "Li Si" // set the target user
        // Collect the user names to the driver so that the println output and the
        // lookup in `source` both happen in the driver JVM.
        users.collect().foreach(user => {
          println(name + " relative to " + user + " similarity score is: " + getCollaborateSource(name, user))
        })
      }
    }

The result is shown in the figure.
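The same calculation can be tried without a Spark installation. The following is a minimal sketch using plain Scala collections; the ratings are copied from the listing above, while the names `CosineSketch`, `cosine`, and `euclidean` are illustrative and not part of the original article. It also contrasts the two measures mentioned in the opening comment: doubling every rating leaves the cosine similarity at 1 but increases the Euclidean distance.

```scala
// Minimal sketch without Spark: plain Scala collections only.
// CosineSketch, cosine and euclidean are hypothetical names used for illustration.
object CosineSketch {
  // Ratings in a fixed film order:
  // Gone with the Wind, Dragon Inn, Romeo and Juliet, Macau Fengyun, Wolf Totem
  val ratings: Seq[(String, Vector[Double])] = Seq(
    "Zhang San" -> Vector(2.0, 3.0, 1.0, 0.0, 1.0),
    "Li Si"     -> Vector(1.0, 2.0, 2.0, 1.0, 4.0),
    "Wang Wu"   -> Vector(2.0, 1.0, 0.0, 1.0, 4.0),
    "Zhu Liu"   -> Vector(3.0, 2.0, 0.0, 5.0, 3.0),
    "Zhuo Qi"   -> Vector(5.0, 3.0, 1.0, 1.0, 2.0)
  )

  // cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 if either vector is all zeros.
  def cosine(a: Vector[Double], b: Vector[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // Euclidean distance, included only for contrast with cosine similarity.
  def euclidean(a: Vector[Double], b: Vector[Double]): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  def main(args: Array[String]): Unit = {
    val byName = ratings.toMap
    val target = "Li Si"
    // Similarity of the target user to every user, including itself.
    ratings.foreach { case (user, vec) =>
      println(f"$target relative to $user: ${cosine(byName(target), vec)}%.4f")
    }
    // Scaling a vector keeps its direction, so cosine stays at 1 while distance grows.
    val v = byName(target)
    val doubled = v.map(_ * 2)
    println(f"cosine(v, 2v) = ${cosine(v, doubled)}%.4f, euclidean(v, 2v) = ${euclidean(v, doubled)}%.4f")
  }
}
```

Storing each user's ratings as a vector in one fixed film order avoids relying on map iteration order when pairing up scores.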

Thank you for reading. I hope this article on calculating user similarity based on cosine similarity for the Spark MLlib collaborative filtering algorithm has been helpful.


© 2024 shulou.com SLNews company. All rights reserved.
