2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains the order in which a search engine carries out retrieval. The editor finds it practical and shares it here as a reference.
The order of search engine retrieval: 1. crawl web pages from the Internet; 2. build the index database; 3. search and sort within the index database; 4. process and rank the search results.
The order of search engine retrieval:
A search engine is a system that uses specific computer programs to collect information from the Internet according to a certain strategy and, after organizing and processing that information, provides retrieval services to users. A search engine does not search the live Internet; it searches a web index database that was prepared in advance. A search engine in the full sense usually means a full-text search engine that collects tens of millions to billions of web pages from the Internet and indexes every word (that is, every keyword) in them to build an index database.
Hyperlink analysis is now widely used in search engines. In addition to analyzing the content of the indexed page itself, the engine analyzes and indexes all the links that point to the page: the URL, the anchor text, and even the text surrounding the link. As a result, even if a phrase such as "information retrieval" does not appear anywhere in page A, users can still find page A when searching for "information retrieval", provided that some page B links to page A with the anchor text "information retrieval". Moreover, the more links with the text "information retrieval" point to page A, the more relevant page A is considered to be, and the higher it ranks when users search for that phrase.
The working principle of a search engine can be divided into four steps: crawling web pages from the Internet, building the index database, searching and sorting in the index database, and processing and ranking the search results.
(1) Crawling web pages from the Internet: a spider program automatically visits the Internet, follows every URL it finds on a page to other pages, repeats this process, and brings back all the pages it has crawled.
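As a rough sketch of the crawling step, the following breadth-first spider follows every link it finds and collects the pages. To keep it runnable without network access, pages are fetched from an in-memory dictionary; `FAKE_WEB`, `crawl`, `LinkExtractor`, and the `max_pages` limit are assumptions made for illustration. A real spider would also need politeness rules, error handling, and robots.txt support.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl starting at seed_url.

    fetch: callable returning the HTML for a URL, or None if unavailable.
    Returns a dict mapping each crawled URL to its HTML.
    """
    seen = {seed_url}
    queue = deque([seed_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable page: skip it
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Hypothetical three-page site standing in for the Internet.
FAKE_WEB = {
    "http://site.example/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://site.example/a.html": '<a href="/b.html">B</a>',
    "http://site.example/b.html": '<a href="/">home</a> <a href="/missing.html">x</a>',
}
crawled = crawl("http://site.example/", FAKE_WEB.get)
```

The `seen` set is what stops the spider from looping forever when pages link back to each other.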
(2) Building the index database: an analysis and indexing program analyzes the collected pages and extracts the relevant information for each one (URL, encoding type, keywords contained in the page content, keyword positions, generation time, size, link relationships with other pages, and so on). It then performs a large number of calculations according to a relevance algorithm to obtain the relevance (or importance) of each page for each keyword in its content and in its hyperlinks, and uses this information to build the web index database.
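The indexing step can be illustrated with a small inverted index that stores, for each keyword, the pages containing it, the keyword positions, and a precomputed relevance score. The article does not say which relevance algorithm is used, so this sketch substitutes a simple TF-IDF style score purely as an assumption; `tokenize` and `build_index` are hypothetical names.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build term -> {url: (score, positions)} from url -> plain text.

    The score is term frequency times a smoothed inverse document
    frequency; this is an assumed stand-in for the relevance algorithm.
    """
    postings = defaultdict(dict)
    lengths = {}
    for url, text in pages.items():
        tokens = tokenize(text)
        lengths[url] = len(tokens)
        for position, term in enumerate(tokens):
            postings[term].setdefault(url, []).append(position)

    n_docs = len(pages)
    index = {}
    for term, by_url in postings.items():
        idf = math.log(1 + n_docs / len(by_url))
        index[term] = {
            url: (len(positions) / lengths[url] * idf, positions)
            for url, positions in by_url.items()
        }
    return index


# Hypothetical page texts keyed by URL.
pages = {
    "u1": "search engine index",
    "u2": "search search results",
}
index = build_index(pages)
```

Because the scores are computed at indexing time, answering a query later only requires a lookup and a sort.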
(3) Searching and sorting in the index database: when the user enters a keyword, the search program finds all pages in the web index database that match it. Because each page's relevance for that keyword was calculated in advance, the program only needs to sort by the stored relevance values: the higher the relevance, the higher the ranking.
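A minimal sketch of the lookup step follows, assuming the index already maps each keyword to precomputed relevance values per page. The `INDEX` literal, the `search` function, the rule that every query keyword must match, and summing scores across keywords are all illustrative assumptions.

```python
def search(index, query, limit=10):
    """Return URLs matching every query keyword, highest relevance first.

    index: dict mapping keyword -> {url: precomputed relevance value}.
    """
    terms = query.lower().split()
    if not terms:
        return []
    posting_lists = [index.get(term, {}) for term in terms]

    # Keep only the pages that contain all of the keywords.
    candidates = set(posting_lists[0])
    for postings in posting_lists[1:]:
        candidates &= set(postings)

    # No relevance is computed here: the stored values are just summed.
    scored = [
        (sum(postings[url] for postings in posting_lists), url)
        for url in candidates
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _score, url in scored[:limit]]


# Hypothetical precomputed index.
INDEX = {
    "search": {"u1": 0.2, "u2": 0.5, "u3": 0.1},
    "engine": {"u1": 0.4, "u3": 0.3},
}
single = search(INDEX, "search")
double = search(INDEX, "search engine")
```

For the single keyword "search" the pages come back in order of their stored values; for "search engine" only the pages present under both keywords remain.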
(4) Processing and ranking the search results: the index database records the relevant information of every matching page for the keyword. This information is combined with the page level to form a relevance value, and the results are sorted by it: the higher the value, the higher the ranking. Finally, the page generation system assembles the link address and a content summary for each result and returns them to the user.
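The final step can be sketched as combining each page's keyword relevance with its page level and then assembling a link and summary for every result. The article does not say how the two values are combined, so the weighted sum, the default weight of 0.7, the summary length, and the name `rank_and_render` are assumptions for illustration only.

```python
def rank_and_render(hits, page_level, summaries,
                    relevance_weight=0.7, summary_chars=80):
    """Rank matching pages and build the entries returned to the user.

    hits: dict url -> keyword relevance value from the index.
    page_level: dict url -> page level (link-based importance).
    summaries: dict url -> page text used for the content summary.
    """
    level_weight = 1.0 - relevance_weight

    def combined(url):
        # Assumed combination rule: a simple weighted sum.
        return (relevance_weight * hits[url]
                + level_weight * page_level.get(url, 0.0))

    ranked = sorted(hits, key=lambda url: (-combined(url), url))
    return [
        {"url": url, "summary": summaries.get(url, "")[:summary_chars]}
        for url in ranked
    ]


# Hypothetical inputs: u1 is more relevant, but u2 has a higher page level.
hits = {"u1": 0.6, "u2": 0.5}
page_level = {"u1": 0.1, "u2": 0.9}
summaries = {"u1": "An introduction to search engines.", "u2": "How indexing works."}
results = rank_and_render(hits, page_level, summaries)
```

With these inputs the higher page level of u2 outweighs the slightly higher keyword relevance of u1, so u2 is listed first.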
© 2024 shulou.com SLNews company. All rights reserved.