2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article looks at why incomplete URLs, or URLs that seem to come from nowhere, show up as crawled in a website's logs. The approach is practical, so it is shared here in the hope that readers will find it useful.
When analyzing logs, you will often find a few, or sometimes many, crawl requests for URLs that are incomplete, or that carry extra segments the original URL does not have. Someone in the discussion group asked about a similar case, and it seems to be a problem that everyone runs into. One guess was that the site had been scraped by others and that this was the cause. I raised the question with Guoping in class. He said the crawler might be extracting URLs from a page that had not finished downloading, but there was no concrete data to back this up, so I was never quite sure.
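The two symptoms described above (a request that stops short of a real URL, and a request that has extra segments appended to a real URL) can be roughly told apart by comparing each requested path with the paths you know exist, for example those in the sitemap. The following is only a minimal sketch of that idea: the function name `classify_path` and the sample paths are made up for illustration, and the prefix test is a heuristic that will also flag legitimate directory URLs that are missing from the known set.

```python
def classify_path(path, known_paths):
    """Label a requested path relative to a set of known-good paths.

    Heuristic only: "truncated" means the request is a prefix of a real
    URL, "extended" means a real URL with extra characters appended.
    """
    if path in known_paths:
        return "known"
    # Truncated: some real URL starts with the requested path.
    if any(known.startswith(path) for known in known_paths):
        return "truncated"
    # Extended: the request starts with a real URL. The root "/" is
    # skipped because every path starts with it.
    if any(path.startswith(known) for known in known_paths if known != "/"):
        return "extended"
    return "unknown"


# Hypothetical paths, standing in for the entries of a real sitemap.
known = {"/", "/news/seo-log-analysis.html", "/about.html"}
for requested in ["/news/seo-log-ana", "/about.html/feed/rss", "/wp-login.php"]:
    print(requested, classify_path(requested, known))
```

A label from this sketch only describes the shape of the URL. It does not say where the crawler found it, which is the question the rest of the article answers.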
Now Google Webmaster Tools can answer the question with actual data. It is an SEO tool that Guoping rates very highly. Some people think that anyone optimizing for Baidu has no need for Google's tools, because the two search engines use different algorithms. The Webmaster Tools feature described below shows that this view is mistaken. In my view, Webmaster Tools is the most authoritative SEO tool available, because it is built around the criteria by which a search engine assesses a site, and most of the data needed for SEO work can be obtained from it. What follows is an introduction to this feature (it appears to have been updated recently; it did not look like this before).
First, take a look at the strange 404s that appear for Baidu's crawler in the log.
Where are the entry points to these URLs? Where do search engines find them?
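To see these 404s in your own log, one option is to filter the access log by status code and user agent. The sketch below assumes the server writes the common "combined" log format used by Apache and Nginx; the regular expression, the function name `spider_404s`, and the sample lines are illustrative assumptions, not something taken from the site discussed here.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# host ident user [time] "METHOD path protocol" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def spider_404s(lines, spider="Baiduspider"):
    """Count 404 responses per path for requests whose user agent names the spider."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if match.group("status") == "404" and spider in match.group("agent"):
            counts[match.group("path")] += 1
    return counts


# Made-up sample lines for illustration only.
sample = [
    '203.0.113.7 - - [10/Oct/2013:13:55:36 +0800] "GET /news/seo-log-ana HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; '
    '+http://www.baidu.com/search/spider.html)"',
    '203.0.113.9 - - [10/Oct/2013:13:56:02 +0800] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; '
    '+http://www.baidu.com/search/spider.html)"',
]
for path, hits in spider_404s(sample).most_common():
    print(hits, path)
```

Matching on the user agent string alone does not prove that a request really came from Baidu, since the string can be forged. The filter is only a first pass for finding the URLs worth investigating.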
Google can tell you.
To begin, here is an introduction to the crawl error reports in Google Webmaster Tools.
Below is another site that was redesigned without setting up redirects or blocking crawling of the old URLs, which produced "not found" crawl errors. The report includes a complete curve, so you can clearly see how this kind of problem changes over time on the site.
There are also crawl errors caused by server problems.
Continuing from the first screenshot (same website):
At first, these errors could only be seen in the log, and their source was unknown.
Now we can find out where these bad URLs come from.
Click on entry 102 and the dialog below pops up. The URL is not in the sitemap, but it is linked from elsewhere, which indicates that it once existed on the site itself and has since been deleted.
Click on entry 110 and the dialog below shows that the search engine reached the URL from another website (a scraper site, or some other source).
Click through to the specific source page to check.
At this point you can tell exactly what is behind the inexplicable URLs in the site log: whether they originate on your own site or come from errors introduced off-site.
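Once you have the list of pages that link to a bad URL, the on-site versus off-site distinction can be made by comparing host names. The sketch below is one simple way to do that, under stated assumptions: `classify_sources` is a hypothetical helper, the URLs are placeholders, the list of source pages is assumed to have been copied out by hand, and `www.example.com` is treated as the same site as `example.com`.

```python
from urllib.parse import urlsplit


def classify_sources(linked_from, own_host):
    """Split the pages linking to a bad URL into internal and external sources.

    own_host is the bare host name of your site, without "www.".
    """
    result = {"internal": [], "external": []}
    for page in linked_from:
        host = (urlsplit(page).hostname or "").lower()
        # Assumption: the www and non-www hosts are the same site.
        bare = host[4:] if host.startswith("www.") else host
        key = "internal" if bare == own_host else "external"
        result[key].append(page)
    return result


# Placeholder source pages for one bad URL.
sources = [
    "http://www.example.com/old-list.html",
    "http://scraper.example.net/copied-article.html",
]
print(classify_sources(sources, "example.com"))
```

Internal sources point to links on your own pages that need fixing or redirecting, while external sources match the scraper-site case described above.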
That concludes the analysis of incomplete or inexplicable URL crawls in website logs. Some of these points are likely to come up in day-to-day work, and I hope this article helps you learn more about them. For more details, please follow the industry information channel.
© 2024 shulou.com SLNews company. All rights reserved.