Updated 2025-04-10. Source: SLTechnology News&Howtos (Shulou.com), Development section.
This article looks at the question "Is web crawling illegal?" Many people run into this dilemma in real-world work, so the sections below walk through how to handle these situations.
1. Is technology innocent?
Many readers have left me messages saying that technology is innocent: technology itself is neither right nor wrong, only the people who use it are. But if a company or programmer knows that a particular use of their technology is illegal, that company or person has to pay the price for it.
Since the Cybersecurity Law of the People's Republic of China was promulgated this year, many businesses that used to operate in gray areas can no longer be run.
Haven't you noticed that most of the once-popular social engineering database sites have disappeared? That is because the new security law stresses that selling more than 50 pieces of personal information counts as a "serious circumstance" that carries legal liability.
Many grassroots webmasters have shut down their sites on their own initiative. Many sites that carry copyrighted material, such as books, films and TV dramas, and courses, will also face increasingly strict scrutiny. This is the current overall trend.
On December 20, 2014, the Renren Film and Television subtitle site announced on Weibo that it was officially closing, and said it would go on to provide translation services for licensed publishers or possibly turn into a discussion community.
In June 2019, the Wuai Pojie ("I love cracking") site was taken offline for rectification over copyright issues.
As China's economy continues to develop, intellectual property rights will receive more and more attention, and illegal crawlers are now an important target of the crackdown. If you are a programmer working in this gray area, stop as soon as possible. Do not break the law for a small gain; the loss outweighs the gain.
Technology is innocent, but the cost of using it in the wrong place is huge.
2. Crawler jobs are at risk
I searched Lagou for "crawler engineer" and found 217 related job postings with salaries ranging from 10k to 60k, which shows that market demand for crawler developers is very high.
After the previous article was published the day before yesterday, many programmers left me messages:
My manager told me to crawl information inside the company. Is that a crime?
Is it a crime to crawl public information on the Internet?
If I write some code and upload it to GitHub, and someone else uses it, am I breaking the law?
Brief answers to these questions:
Crawling the company's internal information with the company's authorization is not a crime, though it is worth asking why the company uses a crawler instead of an internal interface.
Crawling public information on the Internet is not illegal, but it becomes illegal if heavy crawling crashes the other party's server, because that falls into the category of an attack.
If you upload code to GitHub and someone uses it to do something illegal, in most cases you are fine. It is harder to say if your software involves intrusion, brute-force cracking, viruses, and the like.
Some readers argue that responsibility for this should not rest with the programmer: in day-to-day work, a project's initial design and final launch have to be approved by the company's legal department, and all code has to be reviewed and approved by other programmers before it can be submitted.
That reader has a point. In principle, every company should have legal and risk-control teams that act first, followed by product design and development. But if a company is chasing profit, the boss can simply silence those two departments. Can the programmers downstream really refuse?
What's more, many companies do not have these two departments at all, or have them in name only. So as a programmer you need to look out for yourself. Do not work on any program that involves intrusion, because there is such a thing as unit crime.
Unit crime refers to a socially harmful act that a company, enterprise, public institution, state organ, or organization carries out to seek benefits for the unit, by decision of the unit's decision-making body or its person in charge.
China's criminal law in principle applies a dual punishment system to unit crime: when a unit commits a crime, the unit is fined, and the person in charge who is directly responsible, along with other directly responsible persons, is criminally punished.
3. What kinds of crawlers are illegal?
Crawlers must not touch personal privacy!
If a crawler collects citizens' personal information, such as names, identity document numbers, contact details, addresses, account passwords, property status, or whereabouts, and that information is used illegally, this clearly constitutes illegally obtaining citizens' personal information.
In other words, crawling information is fine in itself, but it must not involve personal privacy. If it does, and you profit from it by illegal means, it is definitely illegal.
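One way to keep fields of this kind out of stored data is to drop them before anything is saved. The following is a minimal sketch only: the field names in PERSONAL_FIELDS and the helper strip_personal_fields are hypothetical, chosen to mirror the categories listed above, and a real record schema will differ.

```python
# Hypothetical field names mirroring the personal-information categories
# listed above; adjust them to the records your crawler actually produces.
PERSONAL_FIELDS = {
    "name", "id_number", "phone", "email",
    "address", "password", "property", "location",
}


def strip_personal_fields(record: dict) -> dict:
    """Return a copy of record without any key listed in PERSONAL_FIELDS.

    Key matching is case-insensitive; values are not inspected.
    """
    return {
        key: value
        for key, value in record.items()
        if key.lower() not in PERSONAL_FIELDS
    }


# Example: only the non-personal fields survive.
scraped = {"title": "Post title", "Phone": "000-0000", "body": "Post text"}
print(strip_personal_fields(scraped))
```

A key-based filter like this does not catch personal details embedded in free text, so it is a first line of defense rather than a complete solution.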
In addition, there are three situations in which a crawler may break the law or even constitute a crime:
If the crawler circumvents or cracks the anti-crawling measures set up by the website operator and illegally obtains data, and the circumstances are serious, it may constitute the crime of illegally obtaining computer information system data.
If the crawler interferes with the normal operation of the website or system it visits, and the consequences are serious, it violates the criminal law and constitutes the crime of destroying a computer information system.
If the information collected by the crawler is citizens' personal information, it may constitute illegally obtaining citizens' personal information, and if the circumstances are serious, it may constitute the crime of infringing upon citizens' personal information.
There are now many paid courses online, from providers such as Geek Time, GitChat, Mutu.com, and Knowledge Planet. Crawling this paid content without authorization and selling it for profit is illegal.
I once came across a netizen who scraped the entire contents of various Knowledge Planet groups and sold them himself. He thought he had found a great business opportunity, not realizing that this was actually very dangerous and that the risk far outweighed the reward.
When I checked over the past couple of days, one of his official accounts had been blocked, and he had switched to a second account to carry on. Sooner or later that one will be blocked too, which is really not worth it. The ones who suffer most are the users who paid for his service, because his advertising promised that access would be permanent.
4. What kinds of crawlers are legal?
(1) Comply with the Robots protocol
The Robots protocol, implemented as robots.txt (always lowercase), is an ASCII-encoded text file stored in the root directory of a website. It tells web search engine robots (also known as web spiders) which content on the site they should not access and which they may.
In short, the Robots protocol tells crawlers what may and may not be crawled. Crawling a site strictly in line with its Robots protocol generally does not cause much trouble.
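As an illustration of checking the Robots protocol before fetching a page, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the "example-crawler" user agent, and the example.com URLs are all made up for the example, and the rules are parsed from a string so that it runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed inline so no network is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
# "example-crawler" is a placeholder user-agent name.
print(is_allowed(parser, "example-crawler", "https://example.com/articles/1"))
print(is_allowed(parser, "example-crawler", "https://example.com/private/data"))
```

Against a live site, the same parser can be pointed at the real file with set_url() followed by read() instead of parse(); the check with can_fetch() stays the same.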
(2) Do not paralyze the other party's server
Complying with the Robots protocol does not by itself mean there is no problem. Two further factors are involved. The first is that large-scale crawling must not paralyze the other party's server, which would be tantamount to a network attack.
The Measures for the Management of Data Security (Draft for Comment), released by the State Internet Information Office on May 28, 2019, propose restricting the use of crawlers through administrative regulation:
Where network operators use automated means to access and collect website data, they shall not hinder the normal operation of the website. Where such behavior seriously affects the operation of the website, for example where the traffic from automated access and collection exceeds one third of the website's average daily traffic, and the website requests that the automated access and collection stop, it shall stop.
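A crawler can limit its own load in a simple way: wait a minimum interval between requests, and compare its request volume with the one-third threshold quoted above. The sketch below is one possible approach under stated assumptions. The Throttle class and exceeds_one_third function are hypothetical names, traffic is counted in requests (the draft does not say how traffic is measured), and a site's average daily traffic is usually not known to an outside crawler, so the figure used here is a placeholder.

```python
import time


def exceeds_one_third(crawler_requests: int, site_daily_average: int) -> bool:
    """Return True if crawler traffic is more than 1/3 of the site's average.

    Assumption: both figures are request counts for the same one-day period.
    """
    return crawler_requests * 3 > site_daily_average


class Throttle:
    """Enforce a minimum interval, in seconds, between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested
        self._sleep = sleep    # without really sleeping
        self._last = None      # time of the previous request, if any

    def wait(self) -> float:
        """Sleep if the previous request was too recent; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


# Placeholder numbers: 400 crawler requests against a 1000-request daily average.
print(exceeds_one_third(400, 1000))
```

In a crawl loop, calling throttle.wait() immediately before each request keeps the request rate at or below one per min_interval seconds.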
(3) Do not make illegal profits
Maliciously using crawler technology to grab data, gain an unfair competitive advantage, or even pursue illegal profits may violate the law. In practice, many disputes arise from the illegal use of crawlers to grab data, and most of them are litigated on grounds of unfair competition.
For example, if you scrape all the public information on Dianping, build an identical copy of the site, and make large profits from it, that is also a problem.
In general, crawlers are built to make money for companies, so the self-discipline of crawler developers and the conscience of business operators are fundamental to staying clear of the legal bottom line.