How to calculate the optimal Rank of query-word URLs in SogouQ

Updated 2025-01-20. From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)05/31 Report--

This article shows how to calculate, with Spark RDD operations, how often the URL clicked for a query word has the optimal Rank in the SogouQ search log.

PS1: the original log file is encoded in GB2312. Be sure to convert it to UTF-8 before processing it.
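The conversion in PS1 can be done with any transcoding tool. As a minimal sketch on the JVM, assuming the log fits the GB2312 charset exactly, the helper below streams a file into UTF-8. The object name LogTranscoder and the method toUtf8 are hypothetical names chosen for illustration, not part of the original article.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import java.nio.file.{Files, Path}

object LogTranscoder {
  // Streams `in` (decoded with `sourceCharset`) into `out` as UTF-8.
  // The reader is strict: bytes that are invalid in the source charset
  // raise an exception instead of being silently replaced.
  def toUtf8(in: Path, out: Path, sourceCharset: String = "GB2312"): Unit = {
    val reader = Files.newBufferedReader(in, Charset.forName(sourceCharset))
    try {
      val writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)
      try {
        val buf = new Array[Char](8192)
        var n = reader.read(buf)
        while (n != -1) {
          writer.write(buf, 0, n)
          n = reader.read(buf)
        }
      } finally writer.close()
    } finally reader.close()
  }
}
```

If strict decoding fails on some lines, passing a superset charset such as "GBK" as sourceCharset may help, since GBK extends GB2312.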

PS2: log format and field description:

Access time\tUser ID\t[Query word]\tRank of the URL in the returned results\tSequence number of the user's click\tURL clicked by the user

This format has a pitfall:

The separator between the fields "rank of the URL in the returned results" and "sequence number of the user's click" is not a tab (\t) but a space. Splitting a line on tabs therefore yields five fields, not six, and the fourth field holds both numbers.
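To make the pitfall concrete, here is a minimal parsing sketch: it splits on tabs first, then splits the fourth field on whitespace to recover the rank and the click sequence number. The SogouRecord case class, the SogouParser object, and the sample line in the usage note are hypothetical and made up for illustration.

```scala
import scala.util.Try

// Hypothetical record type for one SogouQ log line
final case class SogouRecord(
  time: String,
  userId: String,
  query: String,
  rank: Int,
  clickOrder: Int,
  url: String
)

object SogouParser {
  // Returns None for lines that do not have five tab-separated fields
  // or whose fourth field is not two integers separated by whitespace.
  def parse(line: String): Option[SogouRecord] =
    line.split('\t') match {
      case Array(time, userId, query, rankAndOrder, url) =>
        rankAndOrder.trim.split("\\s+") match {
          case Array(rank, order) =>
            Try(SogouRecord(time, userId, query, rank.toInt, order.toInt, url)).toOption
          case _ => None
        }
      case _ => None
    }
}
```

For example, SogouParser.parse("20111230000005\tu001\t[spark]\t1 1\texample.com/a") yields a record with rank 1 and clickOrder 1, while a line with the wrong number of fields yields None.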

val sogouQRdd = sc.textFile("hdfs://node1:9000/sogouQ/input")
sogouQRdd.cache() // marks the RDD for caching; the log is loaded into memory on the next action

Find the total number of log file entries

val itemCountRdd = sogouQRdd.count
itemCountRdd: Long = 1724264

Next, count the entries in which the clicked URL ranks 1 in the returned results and the click sequence number is also 1.

In such entries, the first result shown was also the first result the user clicked, so the URL of that search result has the optimal Rank.

val suitableRankRdd = sogouQRdd.filter(_.split('\t').length == 5).map(_.split('\t')).filter(_(3).split(' ')(0).toInt == 1).filter(_(3).split(' ')(1).toInt == 1).count
suitableRankRdd: Long = 279859
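The two filters can also be folded into a single predicate that tolerates malformed lines. The sketch below runs it over a small local collection so it works without a cluster. The TopHitCount object, the isTopHit method, and the sample lines are hypothetical and invented for illustration; none of them come from the real log.

```scala
import scala.util.Try

object TopHitCount {
  // True when the line has five tab-separated fields and the fourth
  // field is "1 1": rank 1 in the results and click sequence number 1.
  def isTopHit(line: String): Boolean = {
    val fields = line.split('\t')
    fields.length == 5 && {
      val pair = fields(3).trim.split("\\s+")
      pair.length == 2 &&
        Try(pair(0).toInt == 1 && pair(1).toInt == 1).getOrElse(false)
    }
  }

  // Made-up sample lines, not taken from the SogouQ data
  val sample: Seq[String] = Seq(
    "t1\tu1\t[a]\t1 1\texample.com/a",
    "t2\tu2\t[b]\t2 1\texample.com/b",
    "t3\tu3\t[c]\t1 1\texample.com/c",
    "malformed line"
  )

  // Number of sample lines that satisfy the predicate
  val topHits: Int = sample.count(isTopHit)
}
```

In a Spark shell, the same predicate could presumably be applied as sogouQRdd.filter(TopHitCount.isTopHit _).count, which avoids an exception when a field is not a number.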

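Calculate the frequency of the optimal Rank for query-word URLs:

Optimal Rank frequency = number of optimal Rank entries / total number of entries

suitableRankRdd.toDouble / itemCountRdd = 0.1623

Both counts are of type Long, so one operand must be converted to Double; otherwise the integer division returns 0.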
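As a small worked check of the arithmetic, the sketch below plugs in the two counts reported above and contrasts Long division with Double division. The object and value names are hypothetical.

```scala
object RankFrequency {
  // Counts reported in the article
  val optimalRankCount: Long = 279859L
  val totalCount: Long = 1724264L

  // Long / Long discards the fractional part
  val truncated: Long = optimalRankCount / totalCount

  // Converting one operand to Double keeps the fraction
  val frequency: Double = optimalRankCount.toDouble / totalCount

  def main(args: Array[String]): Unit = {
    println(s"Long division:   $truncated")
    println(s"Double division: $frequency")
  }
}
```

Rounded to four decimal places, the Double result matches the 0.1623 quoted in the text, while the Long result is 0.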

Therefore, the optimal Rank frequency for query-word URLs is about 16.23%.

