Common mistakes in DIY Hadoop big data environments

2025-01-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/01

This article covers common mistakes made when building a do-it-yourself (DIY) Hadoop big data environment. Many teams run into these problems in real deployments, so it is worth understanding each one and how to deal with it.

Dijcks lists five common mistakes IT leaders make when building DIY Hadoop clusters:

1. They tried to build Hadoop on the cheap.

Many IT departments have no clear idea of what their Hadoop cluster should do (beyond analyzing certain types of data), so they buy the cheapest servers they can.

"Hadoop is considered to be self-healing, so when one node fails, it's not a big problem," Dijcks said. "But if you buy cheap servers and a lot of nodes fail, you have to spend more time repairing hardware, and having a lot of nodes out of service can cause a big problem."
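To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The node count, annual failure rates, and repair hours are hypothetical numbers chosen for illustration, not figures from Dijcks, and the model assumes that nodes fail independently of each other.

```python
def expected_repair_hours(nodes: int, annual_failure_rate: float, hours_per_repair: float) -> float:
    """Expected hands-on repair time per year, assuming independent node failures."""
    expected_failures = nodes * annual_failure_rate
    return expected_failures * hours_per_repair


# Hypothetical inputs: a 100-node cluster and 4 hours of work per failed node.
cheap = expected_repair_hours(nodes=100, annual_failure_rate=0.25, hours_per_repair=4.0)
sturdy = expected_repair_hours(nodes=100, annual_failure_rate=0.0625, hours_per_repair=4.0)

print(f"cheap servers:  {cheap:.0f} repair hours per year")
print(f"sturdy servers: {sturdy:.0f} repair hours per year")
```

Even this simple model shows that repair time grows in proportion to both cluster size and failure rate, so a lower purchase price can be offset by ongoing labor. It does not account for the capacity lost while failed nodes are offline.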

If your Hadoop cluster is just an experiment, this may not be a problem. However, many experimental projects end up in production. The IT department says, "We have invested a lot of time, we have done a lot of work, and now we need to put it into production," Dijcks said. "During the experiment, if something goes wrong with the environment, you just restart. But in production, the cluster needs to be able to withstand hardware failures, human errors, and anything else that may happen."

In its second-quarter 2016 report on big data Hadoop-optimized systems, Forrester pointed out that installing, configuring, tuning, upgrading, and monitoring the infrastructure for a general-purpose Hadoop platform takes a lot of time and effort, while preconfigured Hadoop-optimized systems offer faster time to value, lower costs, minimal administration, and modular expansion.

2. Too many "cooks"

Most IT departments are divided into software, hardware, and network groups. A Hadoop cluster spans all of these groups, so a DIY Hadoop cluster ends up as the product of many opinionated "cooks."

"In this case, you have a recipe to refer to, but the people in charge of different areas don't follow it completely, because each of them likes to deviate from it slightly," Dijcks said. "So in the end, the Hadoop cluster will not work as expected."
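One way to picture this "recipe drift" is to compare each node's settings against the agreed baseline. The Python sketch below is only an illustration: the node names and values are hypothetical, and the `find_drift` helper is invented for this example rather than taken from any Hadoop tool. The two keys shown, `dfs.replication` and `dfs.blocksize`, are standard HDFS properties.

```python
def find_drift(recipe: dict, node_config: dict) -> dict:
    """Return settings where a node deviates from the agreed recipe.

    Maps each drifting key to (expected, actual); actual is None if the key is missing.
    """
    drift = {}
    for key, expected in recipe.items():
        actual = node_config.get(key)
        if actual != expected:
            drift[key] = (expected, actual)
    return drift


# Hypothetical recipe and per-node settings, for illustration only.
recipe = {"dfs.replication": "3", "dfs.blocksize": "134217728"}
nodes = {
    "node-a": {"dfs.replication": "3", "dfs.blocksize": "134217728"},
    "node-b": {"dfs.replication": "2", "dfs.blocksize": "134217728"},
}

for name, config in nodes.items():
    print(name, find_drift(recipe, config) or "matches the recipe")
```

A check like this only catches deviations in settings that the recipe actually lists; differences in hardware, firmware, or network setup need their own comparison.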

After troubleshooting, the system should be up and ready to hand over to IT operators to run in production, but Dijcks said: "This is another place where the learning curve begins. They may not be familiar with Hadoop clusters, and you will see a lot of human errors, downtime and so on."

3. They didn't realize that the Hadoop DIY project was a Trojan horse.

After the Hadoop cluster moves to production, enterprises often find that they need dedicated staff to keep it running. "Of course, that staff spends most of its time on maintenance rather than innovation," Dijcks said. "In addition, this staff needs to know the Hadoop system."

"You can't expect people to become Hadoop experts in a very short period of time," he warned. Even if you hire experienced staff, IT environments vary widely, as do the components of DIY Hadoop clusters. It therefore takes time to understand all the configurations, connections, and relationships in your particular environment.

4. They underestimated the complexity and frequency of updates.

New versions of Hadoop distributions (for example, from Cloudera and Hortonworks) are released every three months, and these usually include new features, updates, bug fixes, and so on.

"In addition to all the human operations required to keep the Hadoop cluster running, there are new upgrades every three months," Dijcks said. "The moment you finish the upgrade, you have to start planning the next upgrade. It's quite complicated, so some people start to skip updates." Even if you skip a few updates, you will eventually need to update, for example, from 5.4 to 5.7.

Although Cloudera and Hortonworks try to test as many scenarios as possible, "they can't test your specific operating system version or the impact on your specific workloads," Dijcks said. "Your environment may have Cisco routers or Red Hat operating systems or IBM hardware, and if the cluster is being used for big data production projects and you need to update, it may cause significant downtime."

5. They are not ready for security challenges.

In the early days of Hadoop, security was not considered a big issue because the cluster sat behind the firewall. Now, security has become a real concern.

Kerberos authentication is now built into Hadoop to address these problems, but some IT organizations do not know how to deal with this protocol. "Integrating Kerberos into enterprise Active Directory is very complex," he said. "You need to do a lot of integration between Active Directory and a series of components. And there is very little documentation on this, and most crucially, it involves security administrators and other IT teams who speak almost completely different languages."
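As a small illustration of where this work starts, the Python sketch below checks whether a `core-site.xml` file switches Hadoop from its default `simple` authentication to Kerberos. The XML content is a hypothetical example inlined so the sketch runs on its own, and the helper functions are invented for this illustration. The property names `hadoop.security.authentication` and `hadoop.security.authorization` are standard Hadoop settings. This covers only one small piece; it says nothing about principals, keytabs, or the Active Directory integration Dijcks describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content, inlined so the sketch runs on its own.
CORE_SITE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>"""


def read_properties(xml_text: str) -> dict:
    """Parse Hadoop-style <property> name/value pairs into a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def kerberos_enabled(props: dict) -> bool:
    """True if authentication is set to kerberos; Hadoop defaults to 'simple'."""
    value = props.get("hadoop.security.authentication") or "simple"
    return value.strip().lower() == "kerberos"


props = read_properties(CORE_SITE)
print("Kerberos enabled:", kerberos_enabled(props))
```

In a real cluster the same check would need to run against the configuration actually deployed on every node, since a single node left on `simple` authentication is exactly the kind of inconsistency that is hard to spot by hand.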

Some IT departments eventually sign contracts with Cloudera, Hortonworks, or other third parties to secure their DIY Hadoop clusters. "It takes some time to complete setup, testing, etc.," Dijcks said. "And then every three months, you need to do it again to make sure everything, such as applications and configuration, works properly."
