2025-03-26 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to keep Baidu from indexing duplicate URLs for the same page. The approach is practical, so it is shared here.
URL parameters
URL parameters, also known as the URL query, are the most complex source of duplicate URLs, the most easily overlooked, and the most likely to cause trouble. They are essential to running a website, and simply removing them would leave other departments unable to work. Making URLs static is a separate topic; URL parameters are typically used in the following situations:
Different states of the same entity, for example the same hotel with different room inventory on different dates: http://www.travel.com/hotel/123/?checkindate=2015-06-09&checkoutdate=2015-06-10
Tracking traffic from different channels: http://www.a.com/?tracking=website_a
Tracking clicks on specific modules from different channels: http://www.a.com/?tracking=website_a&click_spot=zone_abc
Debugging: http://www.a.com/product/item123/?debug=true
The oddest case is Amazon, which puts the tracking parameters in the path: http://www.amazon.cn/abc/dp/B005TZHJEQ/ref=lp_2130608051_1_1
This causes the following problems:
1. It wastes the crawl quota that search engines allot to your site, which affects your other, normal pages.
2. You lose a lot of link value that should have accumulated on one URL. Links from off-site channels are often the highest quality, yet the score for the same page may be split across dozens of URLs.
3. SEO traffic is attributed to other channels, because URLs carrying another channel's tracking field get indexed and then clicked in search results.
4. It often ends up that the product team uses one set of URLs, SEO uses another, and different channels use yet others, which makes later development and maintenance extremely costly.
To solve this problem, we must first be clear about what a URL is. As I understand it, each URL stands for a static, independent, non-repeating, meaningful entity that generally has search value (that is, someone will search for it), such as a person, a car, a road, or a part. It should not be mixed with that entity's various "states": is a person who is sick no longer the same person? Does a product become a different product when it goes on sale?
In theory, the canonical tag solves this problem, but in actual tests Baidu gives the tag so little weight that its effect is almost negligible. So my solution is as follows:
1. Draw up a mind map of the website and define its meta-information.
2. Put every parameter related to the SEO meta-information into the path.
3. Put every parameter unrelated to the page's meta-information after "#", because the part after "#" is not sent to the server and so does not affect the content the server returns. Put simply, replace "?" with "#".
4. On each page, use js to read the parameter pairs after "#" and send them to the statistics server in a second request.
5. If the parameters after "#" affect the page content, such as a hotel's check-in date, load that part of the content with ajax; it is unstable and is not part of the page's core content. (There are workarounds, which I won't go into here.)
6. This is bound to conflict with the original use of "#" as an anchor. Define a variable after "#" for the anchor and use js to control scrolling, so that the original anchor behavior is preserved.
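Steps 4 and 6 can be sketched as follows. On a real page this logic would run in js in the browser; it is shown here in Python only to illustrate the parsing. The key name `anchor` and the function name are assumptions made for this example, not something the article prescribes.

```python
from urllib.parse import urlsplit, parse_qsl


def split_fragment(url, anchor_key="anchor"):
    """Split the part after '#' into tracking parameters and an anchor name.

    anchor_key is a hypothetical parameter name that carries the original
    in-page anchor (step 6); everything else is treated as statistics data
    to be reported in a second request (step 4).
    """
    fragment = urlsplit(url).fragment
    params = dict(parse_qsl(fragment, keep_blank_values=True))
    anchor = params.pop(anchor_key, None)
    return params, anchor


params, anchor = split_fragment(
    "http://www.a.com/#tracking=website_a&click_spot=zone_abc&anchor=reviews"
)
# params holds the pairs to send to the statistics server;
# anchor names the element the page should scroll to.
```

Keeping the anchor as one named parameter among the others is one way to avoid the conflict described in step 6; other encodings would work as well.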
One might think of checking the ua and, if the visitor is a search engine crawler, redirecting to strip the URL parameters. But the most efficient approach is never to expose the wrong URL in the first place. The earlier examples then become:
http://www.travel.com/hotel/123/#checkindate=2015-06-09&checkoutdate=2015-06-10
http://www.a.com/#tracking=website_a
http://www.a.com/#tracking=website_a&click_spot=zone_abc
http://www.a.com/product/item123/#debug=true
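The rewrite shown in these examples could be generated with a small helper when links are built. This is a minimal sketch: the function name is hypothetical, and it assumes every query parameter is unrelated to SEO, since SEO-relevant values are supposed to live in the path.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def move_query_to_fragment(url):
    """Rewrite '?a=b' as '#a=b' so the parameters never reach the server."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    fragment = urlencode(pairs)
    # Keep any fragment that was already present, after the moved pairs.
    if parts.fragment:
        fragment = f"{fragment}&{parts.fragment}" if fragment else parts.fragment
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", fragment))


print(move_query_to_fragment("http://www.a.com/?tracking=website_a&click_spot=zone_abc"))
```

Applying the rewrite where links are generated, rather than redirecting afterwards, matches the point above that the wrong URL should never be exposed at all.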
In fact, many websites have used this approach for a long time, but many others cannot implement it promptly because of limited development capacity. Small websites in particular must weigh the development cost and not rush ahead. As long as the problem is avoided, there are many flexible ways to get there.
Unnecessary elements in the path
Many websites follow Amazon's example by displaying the product name in the URL and then determining the page content by id: http://www.amazon.cn/ Collection Collection 043 Earl of Monte Cristo-Alexander Dumas / dp/B005TZHJEQ/
This can improve relevance somewhat, but it is dangerous. Over the long term, or even the short term, many product names are likely to change, and the URL changes with them. The cost is also high: it complicates the technical implementation, and building links becomes troublesome every time, whether on-site or off-site.
Before I took over SEO at eLong, all URLs had been changed to this form, which placed a huge burden on my early work: http://www.a.com/Shangrila_International_Hotel-12345678-hotel/
Log analysis showed that almost every request from Baidu's spider was answered with a 301 redirect (for the log analysis method, see "SEO health"). Careful investigation showed that everything from the SEO concatenation rules to the Chinese names and translation data in the back end had been changing constantly. In other words, the elements that make up this URL are:
1. The Chinese name (non-essential)
2. The English translation of the Chinese name (non-essential)
3. The id (essential)
At the time, the colleague in charge of SEO concatenated the English name and the id into the URL, so the same URL successively became:
http://www.a.com/Shangrila_International_Hotel-12345678-hotel/
http://www.a.com/Xianggelila_International_Hotel-12345678-hotel/
http://www.a.com/XiangGeLiLa_International_Hotel-12345678-hotel/
http://www.a.com/Shangrila_guoji_Hotel-12345678-hotel/
The uniqueness and stability of a URL matter more than its "relevance". So the best URL strategy for this case is: http://www.a.com/hotel/12345678/
If the id belongs to a category, such as a city, then it can be: http://www.a.com/hotel/beijing/123/
From a technical point of view, the id is generally the primary key in the database, either a number or a string, and the URL is then one-dimensional. The id can also be a composite unique index, in which case the URL is two-dimensional: in the example above, both parts of (beijing, 123) are required. List pages on e-commerce websites often use three or more dimensions.
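A minimal sketch of mapping the unstable name-based URLs onto the id-only form, so that every variant redirects to a single target. The regular expression is an assumption based only on the example URLs above, and the function names are hypothetical.

```python
import re

# Assumed legacy pattern: /<any name>-<numeric id>-hotel/
LEGACY_HOTEL = re.compile(r"^/[^/]*-(\d+)-hotel/?$")


def canonical_hotel_path(hotel_id, city=None):
    """Build the stable path from the essential elements only."""
    if city:
        return f"/hotel/{city.lower()}/{hotel_id}/"
    return f"/hotel/{hotel_id}/"


def legacy_redirect_target(path):
    """Return the canonical path for a legacy URL, or None if it is not one."""
    match = LEGACY_HOTEL.match(path)
    return canonical_hotel_path(match.group(1)) if match else None


print(legacy_redirect_target("/Xianggelila_International_Hotel-12345678-hotel/"))
```

Because the target depends only on the id, later changes to the name or its translation no longer produce new URLs.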
Uppercase and lowercase
If the website is built on an open source stack, this problem generally does not arise. If it is built on Microsoft's stack, the problem is very common:
http://www.a.com/newyork/
http://www.a.com/Newyork/
http://www.a.com/NewYork/
My suggestion is to use lowercase everywhere and to redirect uppercase URLs to lowercase automatically (beware of 301 redirect loops!).
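One way to avoid the redirect loop is to issue the 301 only when lowercasing actually changes the path. A minimal sketch with a hypothetical function name; it lowercases only the path and leaves the query and fragment untouched, since their values may be case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect(url):
    """Return the lowercase URL to 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    lowered = parts.path.lower()
    if lowered == parts.path:
        return None  # already canonical: redirecting again would loop
    return urlunsplit(
        (parts.scheme, parts.netloc, lowered, parts.query, parts.fragment)
    )


print(lowercase_redirect("http://www.a.com/NewYork/"))
```

Feeding the returned URL back into the function gives None, which is the property that prevents the loop.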
Directory conventions
Many websites serve both of the following URLs at the same time, which effectively doubles the number of indexed URLs:
http://www.a.com/product/123
http://www.a.com/product/123/
The first path means there is a file named 123 in the product directory. The second means there is a directory named 123 under the product directory; it may contain many files, and the URL stands for the highest-priority one, such as index.html, index.php, or default.aspx. To avoid ambiguity, I define file URLs to end with ".html".
To reduce duplicate indexing, my habit is to redirect the ambiguous form to either the directory form or the file form:
http://www.a.com/product/123 => http://www.a.com/product/123/
http://www.a.com/product/123 => http://www.a.com/product/123.html
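The directory variant of this rule can be sketched as below. It assumes the site has chosen the trailing-slash form for directories and ".html" for files; the function name and the suffix tuple are illustrative only.

```python
from urllib.parse import urlsplit, urlunsplit

FILE_SUFFIXES = (".html",)  # assumption: file URLs always end with ".html"


def normalize_trailing_slash(url):
    """Append '/' to extensionless paths; leave files and directories alone."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if not path.endswith("/") and not path.endswith(FILE_SUFFIXES):
        path += "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


print(normalize_trailing_slash("http://www.a.com/product/123"))
```

A site that prefers the file form would append ".html" instead of "/"; what matters is that only one of the forms is ever served without a redirect.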
Summary
1. Have all departments use the URLs defined by SEO, and block non-SEO URL entry points.
2. Replace "?" with "#".
3. Use lowercase everywhere.
4. Keep directory URLs consistent.
5. Redirect non-standard URLs to the standard URL.
That is how to keep Baidu from indexing duplicate URLs from your site. Some of these points are likely to come up in day-to-day work, and I hope this article helps with them.