How to Record Search Engine Spider Visits to a Website with PHP

2025-04-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains how PHP can keep a record of search engine spiders visiting a website. The editor finds the approach quite practical and shares it here for reference.

The detailed analysis is as follows:

Search engine spiders visit a website by fetching its pages remotely, so JavaScript code cannot be used to obtain a spider's User-Agent information. An image tag can be used instead: when its source is served by a PHP script, the request carries the spider's User-Agent string. By analyzing that string we can determine which spider it is, along with other characteristics, and record the visit in a database or a text file for statistics.
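The agent analysis described above can be sketched as a simple case-insensitive substring match. This is a minimal sketch, not the article's original code: the function name `detect_bot_tag` is hypothetical, and the tag list is simply taken from the `bottag` values in the sample data of the `naps_stats_bot` table.

```php
<?php
// Hypothetical helper: map a User-Agent string to a known spider tag.
// The tags mirror the `bottag` column of the sample rows; extend as needed.
function detect_bot_tag(string $userAgent): ?string
{
    $tags = ['googlebot', 'msnbot', 'slurp', 'baiduspider',
             'sohu-search', 'lycos', 'robozilla'];

    foreach ($tags as $tag) {
        // stripos() is case-insensitive, so "MSNBOT/0.1" matches "msnbot".
        if (stripos($userAgent, $tag) !== false) {
            return $tag;
        }
    }

    return null; // not a known spider
}
```

In a tracking script, the input would typically be `$_SERVER['HTTP_USER_AGENT'] ?? ''`; a `null` result means the visitor is treated as a regular browser and nothing is recorded.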

Database structure:

#
# TABLE STRUCTURE `naps_stats_bot`
#
CREATE TABLE `naps_stats_bot` (
  `botid` int(10) unsigned NOT NULL auto_increment,
  `botname` varchar(100) NOT NULL default '',
  `botagent` varchar(200) NOT NULL default '',
  `bottag` varchar(100) NOT NULL default '',
  `botcount` int(11) NOT NULL default '0',
  `botlast` datetime NOT NULL default '0000-00-00 00:00:00',
  `botlasturl` varchar(250) NOT NULL default '',
  UNIQUE KEY `botid` (`botid`),
  KEY `botname` (`botname`)
) ENGINE=MyISAM AUTO_INCREMENT=9;

#
# Export data from tables `naps_stats_bot`
#
INSERT INTO `naps_stats_bot` VALUES (1, 'Googlebot', 'Googlebot/2.X (+http://www.googlebot.com/bot.html)', 'googlebot', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (2, 'MSNbot', 'MSNBOT/0.1 (http://search.msn.com/msnbot.htm)', 'msnbot', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (3, 'Inktomi Slurp', 'Slurp/2.0', 'slurp', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (4, 'Baiduspider', 'Baiduspider+(+http://www.baidu.com/search/spider.htm)', 'baiduspider', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (5, 'Yahoobot', 'Mozilla/5.0+(compatible;+Yahoo!+ Slurp;+http://help.yahoo.com/help/us/ysearch/slurp)', 'slurp', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (6, 'Sohubot', 'sohu-search', 'sohu-search', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (7, 'Lycos', 'Lycos/x.x', 'lycos', 0, '0000-00-00 00:00:00', '');
INSERT INTO `naps_stats_bot` VALUES (8, 'Robozilla', 'Robozilla/1.0', 'robozilla', 0, '0000-00-00 00:00:00', '');

The PHP program is as follows:
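A minimal sketch of such a recording step, assuming a PDO connection and the `naps_stats_bot` table defined above, might look like this. It is not the article's original listing: the function name `record_bot_hit`, its parameters, and the choice of PDO are assumptions made for illustration.

```php
<?php
// Hypothetical recorder: bump the visit counter for a detected spider.
// Table and column names come from the schema above; everything else
// (function name, PDO handle, parameters) is assumed.
function record_bot_hit(PDO $pdo, string $botTag, string $url): int
{
    $sql = 'UPDATE naps_stats_bot
               SET botcount   = botcount + 1,
                   botlast    = :seen,
                   botlasturl = :url
             WHERE bottag = :tag';

    $stmt = $pdo->prepare($sql);
    $stmt->execute([
        ':seen' => date('Y-m-d H:i:s'),   // matches the datetime column format
        ':url'  => substr($url, 0, 250),  // botlasturl is varchar(250)
        ':tag'  => $botTag,
    ]);

    // Number of rows updated; 0 means the tag is not in the table.
    // Note that the sample data has two rows tagged 'slurp', so both are updated.
    return $stmt->rowCount();
}
```

One way to wire this up would be to reference a small tracking script from each page through an image tag, for example `<img src="/bot_stats.php" width="1" height="1" alt="">` (the file name is hypothetical), and have that script call `record_bot_hit()` with the matched tag and the visited URL whenever the User-Agent belongs to a known spider.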
