
Sample analysis of hiding specific WordPress articles from search engines, or allowing them to be viewed only by search engines


This article is a sample analysis of how specific WordPress articles can be hidden from search engines, or allowed to be viewed only by search engines. The editor finds it very practical and shares it here for reference; follow along and have a look.

Hide specific articles from search engines

The source of the problem is as follows:

As we all know, as search engines improve, scraped and pseudo-original content is being excluded more and more aggressively. Baidu in particular has rolled out its origin algorithm and measures such as de-indexing scraper sites. Once a site is labeled a scraper site, all its efforts may be in vain.

I believe many webmasters also want more original content and do not want to rely entirely on scraping other people's articles. But a new site, especially one run by an individual webmaster, can only grow its content slowly, and we have to please not only the search engines but also the readers. If readers cannot find rich enough information on your site, their experience will certainly be poor. In fact, even old, well-known sites carry a considerable proportion of collected or adapted content, and that is in keeping with the spirit of sharing on the Internet. Most major TV stations and newspapers live largely on reprints and digests; such content is valuable as long as it is well chosen to meet specific needs.

The key is: don't use scraped articles to cheat search traffic for your website. That should be the ethical consensus of the Internet. Let only the original content take part in the search engine game, and have the non-original part block the search engines. In this way the interests of search engines, site owners and users can all be served.

So the problem boils down to one point: how can we effectively and reliably have "some articles block search engines"?

I don't know whether this is a common problem, but if a site hopes to satisfy its audience with rich articles while also fearing being judged a scraper site by search engines, then this is a real, critical, core problem for the site's survival and development.

Recently I have been studying the relevant material. In my humble opinion, there are several ways to block search engines:

First, use robots.txt

Second, have the WP site detect the characteristics of the visitor (I thought of this after reading your blog post)

Third, wrap the links in JS

Fourth, use redirection, such as short links, PHP server-side redirection, etc.

Comparing the above methods

The first method: robots.txt is like putting a seal on the door: "Hey, spider, there is something in here you are not allowed to search." This is the so-called gentleman's agreement; the search engine certainly has the ability to look behind your sealed door, it just does not index what it finds. And in order to judge whether a site carries a large amount of scraped content, spiders may well have a motive to peek.

This method has the lowest cost and should cover most situations. Baidu seems trustworthy in this respect: it does not index Taobao's content, for example, and it in turn resents 360 indexing Baidu's own content.

A further problem with this approach is:

On a site built with WP, how can we efficiently make "some articles block search engines"?

1. Add a marker to the article title: for example, add a special character to the title of each such article. Is this feasible? Can robots.txt use something like Disallow: *special-marker*?

2. Identify the articles by tag: this seems the most convenient operationally, but tags appear to be dynamic and cannot be filtered in robots.txt?

3. Put such articles into a specific directory: this makes robots.txt easy to write, but how do you manage it conveniently in WP's content management? (A robots.txt sketch follows this list.)
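
For reference, here is a minimal robots.txt sketch for the ideas above. The /hidden/ prefix and the "special-marker" slug keyword are assumptions for illustration only, and wildcard support is not universal (Google and Baidu honor *, other crawlers may not):

User-agent: *
# Block every URL under an assumed /hidden/ directory prefix
Disallow: /hidden/
# Wildcard form for an assumed marker embedded in the post slug
Disallow: /*special-marker*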

The second method: it is like checking the ID card of everyone who comes in, and if the visitor turns out to be a search engine, entry is refused. This method is specific to WP, and its advantage is that visitors can be treated differently in great detail. For example, Baidu takes a hard line on scraped content while Google is different, so certain articles can close the door to Baidu yet keep it open to Google. Another big advantage is that the check can be integrated into the WP environment, for example automated through plugins or themes.
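
As a rough illustration of that per-engine differentiation (the implementation code later in this article blocks all spiders alike), here is a minimal sketch; the block_baidu custom field, the function name and the User-Agent keyword are assumptions, not part of the original code:

// Minimal sketch: deny only Baidu's spider access to a marked post, keep Google open.
// Assumes a hypothetical custom field "block_baidu" set to 1 on the posts concerned.
function maybe_block_baidu() {
    if (!is_singular())
        return;

    global $post;
    if (!get_post_meta($post->ID, 'block_baidu', true))
        return;

    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

    // Baiduspider identifies itself in the User-Agent string; Googlebot does not match this check
    if (stripos($ua, 'Baiduspider') !== false) {
        status_header(404);
        exit;
    }
}
add_action('wp', 'maybe_block_baidu');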

The third method: it is like swapping the house number on the door. The search engine only knows how to follow the number mechanically, while for the browser, JS points that number to the real entrance. However, search engines are getting better and better at analysing JS, and according to some statements from Google, search engines do not like it when the content shown to users differs from the content shown to search engines.

This method is widely used to hide Taobao affiliate links. Its effectiveness is unlikely to last long, and it is troublesome to operate; it suits static individual pages better and is not suitable for a database-driven article structure like WP's.

The fourth method: it is like encrypting the house number; only when someone knocks on the door (clicks) is it translated into the correct address. An ordinary visitor will click, while a search engine will not simulate the click.

This method is relatively thorough and "safe", with the following disadvantages:

1. Like the third method, it is somewhat complex to operate; it suits static individual pages, or individual links within a page, and is not suitable for the WP environment.

2. Too many redirects consume server computing resources; if a large number of articles have to be redirected, the server may be overwhelmed.

Implementation code

How do we make WordPress hide specific articles from search engines? Without further ado, here is the PHP code; put it in the functions.php of the current theme and it is ready to use (save the file with UTF-8 encoding):

// Note: if page caching is enabled on your WordPress site, this function will not work.
function ludouse_add_custom_box() {
    if (function_exists('add_meta_box')) {
        add_meta_box('ludou_allow_se', 'Search engine', 'ludou_allow_se', 'post', 'side', 'low');
        add_meta_box('ludou_allow_se', 'Search engine', 'ludou_allow_se', 'page', 'side', 'low');
    }
}
add_action('add_meta_boxes', 'ludouse_add_custom_box');

function ludou_allow_se() {
    global $post;

    // Add a verification field
    wp_nonce_field('ludou_allow_se', 'ludou_allow_se_nonce');

    $meta_value = get_post_meta($post->ID, 'ludou_allow_se', true);

    // Checkbox named ludou-allow-se, read back by the save handler below
    // (the original HTML markup was stripped from the post; reconstructed here)
    if ($meta_value)
        echo '<input name="ludou-allow-se" type="checkbox" checked="checked" value="1" /> Block search engines';
    else
        echo '<input name="ludou-allow-se" type="checkbox" value="1" /> Block search engines';
}

// Save the option setting
function ludouse_save_postdata($post_id) {
    // Verify the nonce is present
    if (!isset($_POST['ludou_allow_se_nonce']))
        return $post_id;

    $nonce = $_POST['ludou_allow_se_nonce'];

    // Verify the nonce is valid
    if (!wp_verify_nonce($nonce, 'ludou_allow_se'))
        return $post_id;

    // Skip autosaves
    if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE)
        return $post_id;

    // Verify user permissions
    if ('page' == $_POST['post_type']) {
        if (!current_user_can('edit_page', $post_id))
            return $post_id;
    } else {
        if (!current_user_can('edit_post', $post_id))
            return $post_id;
    }

    // Update the setting
    if (!empty($_POST['ludou-allow-se']))
        update_post_meta($post_id, 'ludou_allow_se', '1');
    else
        update_post_meta($post_id, 'ludou_allow_se', '0');
}
add_action('save_post', 'ludouse_save_postdata');

// For posts and pages marked as not to be crawled:
// block search engine crawlers by returning 404
function do_ludou_allow_se() {
    // This only applies to single posts and pages
    if (is_singular()) {
        global $post;
        $is_robots = 0;
        $ludou_allow_se = get_post_meta($post->ID, 'ludou_allow_se', true);

        if (!empty($ludou_allow_se)) {
            // Keyword array for crawler User-Agent detection
            // (a bit simplistic; optimize it yourself)
            $bots = array('spider', 'bot', 'crawl', 'Slurp', 'yahoo-blogs', 'Yandex', 'Yeti', 'blogsearch', 'ia_archive', 'Google', 'baidu');
            $useragent = $_SERVER['HTTP_USER_AGENT'];

            if (!empty($useragent)) {
                foreach ($bots as $lookfor) {
                    if (stristr($useragent, $lookfor) !== false) {
                        $is_robots = 1;
                        break;
                    }
                }
            }
        }

        // If the current post/page blocks search engine crawling;
        // you can of course change the 404 to 403
        if ($is_robots) {
            status_header(404);
            exit;
        }
    }
}
add_action('wp', 'do_ludou_allow_se');

Usage

After the above code has been added to the functions.php of the current theme, it works right away and is completely foolproof to use. On the edit screen for posts and pages in the WordPress admin, you will see the following box at the bottom of the right-hand column:

If the current post/page should be blocked from search engines, tick the box. Once ticked, the post/page will return a 404 status with no content whenever a search engine accesses it. If you don't like returning 404 to search engines and worry that too many dead links will hurt SEO, you can take the following lines in the code:

status_header(404); exit;

Change it to:

echo '<meta name="robots" content="noindex,noarchive" />' . "\n";

Then:

add_action('wp', 'do_ludou_allow_se');

Change it to:

add_action('wp_head', 'do_ludou_allow_se');

This outputs the meta declaration directly in the head section of the page, telling search engines not to index this page and not to display a snapshot. It is important to note that this only works if the following call is present in the header.php under your theme directory:

wp_head()
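
For reference, wp_head() normally sits inside the <head> element of the theme's header.php, roughly like this minimal sketch (not a complete template):

<!DOCTYPE html>
<html <?php language_attributes(); ?>>
<head>
<meta charset="<?php bloginfo('charset'); ?>" />
<title><?php wp_title(); ?></title>
<?php wp_head(); // hooks added in functions.php, including do_ludou_allow_se(), print their output here ?>
</head>
<body <?php body_class(); ?>>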

Set articles to be viewed only by search engines

Some articles are published purely for SEO: they should be crawlable by search engines but not viewable by ordinary visitors. How can this be done in WordPress?

Implementation code

If your WordPress site does not use page caching, this requirement is not hard to meet. We can take the code above for hiding specific articles from search engines and modify it slightly. Add the following PHP code to the functions.php of the current theme and save the file with UTF-8 encoding:

// Add the option box to the post and page edit screens
function ludouseo_add_custom_box() {
    add_meta_box('ludou_se_only', 'Search engine exclusive', 'ludou_se_only', 'post', 'side', 'low');
    add_meta_box('ludou_se_only', 'Search engine exclusive', 'ludou_se_only', 'page', 'side', 'low');
}
add_action('add_meta_boxes', 'ludouseo_add_custom_box');

function ludou_se_only() {
    global $post;

    // Add a verification field
    wp_nonce_field('ludou_se_only', 'ludou_se_only_nonce');

    $meta_value = get_post_meta($post->ID, 'ludou_se_only', true);

    // Checkbox named ludou-se-only, read back by the save handler below
    // (the original HTML markup was stripped from the post; reconstructed here)
    if ($meta_value)
        echo '<input name="ludou-se-only" type="checkbox" checked="checked" value="1" /> Only allow search engines to view';
    else
        echo '<input name="ludou-se-only" type="checkbox" value="1" /> Only allow search engines to view';
}

// Save the option setting
function ludouseo_save_postdata($post_id) {
    // Verify the nonce is present
    if (!isset($_POST['ludou_se_only_nonce']))
        return $post_id;

    $nonce = $_POST['ludou_se_only_nonce'];

    // Verify the nonce is valid
    if (!wp_verify_nonce($nonce, 'ludou_se_only'))
        return $post_id;

    // Skip autosaves
    if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE)
        return $post_id;

    // Verify user permissions
    if ('page' == $_POST['post_type']) {
        if (!current_user_can('edit_page', $post_id))
            return $post_id;
    } else {
        if (!current_user_can('edit_post', $post_id))
            return $post_id;
    }

    // Update the setting
    if (!empty($_POST['ludou-se-only']))
        update_post_meta($post_id, 'ludou_se_only', '1');
    else
        delete_post_meta($post_id, 'ludou_se_only');
}
add_action('save_post', 'ludouseo_save_postdata');

function do_ludou_se_only() {
    // This only applies to single posts and pages
    if (is_singular()) {
        global $post;
        $is_robots = 0;
        $ludou_se_only = get_post_meta($post->ID, 'ludou_se_only', true);

        if (!empty($ludou_se_only)) {
            // Keyword array for search engine User-Agent detection
            // (a bit simplistic; optimize it yourself)
            $bots = array('spider', 'bot', 'crawl', 'Slurp', 'yahoo-blogs', 'Yandex', 'Yeti', 'blogsearch', 'ia_archive', 'Google');
            $useragent = $_SERVER['HTTP_USER_AGENT'];

            if (!empty($useragent)) {
                foreach ($bots as $lookfor) {
                    if (stristr($useragent, $lookfor) !== false) {
                        $is_robots = 1;
                        break;
                    }
                }
            }

            // If the visitor is not a search engine, show an error message;
            // logged-in users are not affected
            if (!$is_robots && !is_user_logged_in()) {
                wp_die('You are not authorized to view this article!');
            }
        }
    }
}
add_action('wp', 'do_ludou_se_only');

Usage

After the above code has been added to the functions.php of the current theme, it works right away and is completely foolproof to use. On the edit screen for posts and pages in the WordPress admin, you will see the following box at the bottom of the right-hand column:

If the current post/page should be viewable only by search engines, tick the box. Once ticked, ordinary visitors who access the post/page will see the following error message (search engines and logged-in users are not affected):

Thank you for reading! This article on "sample analysis of hiding specific WordPress articles from search engines or allowing them to be viewed only by search engines" ends here. I hope the content above has been of some help and has taught you something new. If you think the article is good, please share it so that more people can see it!
