2025-03-27 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
What are the two major misunderstandings in the big data industry? Many newcomers are at a loss on this question, so this article summarizes what the misunderstandings are and how to get past them. I hope it helps you sort the question out.
"Big data" is probably one of the hottest terms in IT circles over the past two years; no forum or conference goes by without it. In the IT world, "big data" has become a street-level household word, as common as the products of a certain fruit brand. It is embarrassing to tell people you work in IT without dropping "big data" into every other sentence. To some extent, the big data "circle" is a mess, no better than the entertainment circle.
Conceptually, what is big data? In fact, data processing has existed since the dawn of humanity. Ancient knotted-rope records were basic statistics: counting how many meals had been eaten, how many hunts had been made, and so on. Later, the emperor turning over a concubine's name tablet each night was also data processing: before turning a tablet, he had to analyze indicators such as "convenience", "heat" and "freshness" across a large number of tablets. More recently, the data warehouse matured and developed for decades before the term big data appeared. So big data is not new. What happened is that technologies such as Hadoop, MapReduce (MR), Storm, and Spark developed to a certain stage, and concepts were hyped around them. These technologies all rest on one basic idea, open source, which had not been seen before and which saves money and improves efficiency, so everyone has piled into the industry. Personally, I don't think that is a bad thing.
Misunderstanding one: only those who do big data technology development are real "insiders."
The author has attended several meetings, 70% of which were technical, and whose attendees were all data-related project managers and technical leads in China. The topics discussed were things like what problems arise when upgrading the CDH version, how best to handle Hive jobs, how to pair Storm and Kafka more efficiently, and how to release memory in Spark applications. The attendees all shared one attitude: those who do not understand big data technology are not qualified to comment on big data. If you do not understand resource allocation in Hadoop 2.0, do not understand tuning how long Spark keeps data resident in memory, or do not understand Kafka data collection, do not come to this meeting! By the way, Google recently dropped MR completely and now uses only Dataflow, did you know? You didn't? Get out!
Here I would like to say that technological progress is driven by business. Can a certain "treasure" only be called big data once it has gone through de-IOE (removing IBM, Oracle, and EMC)? If I, a deaf-mute teacher, use knotted-rope records to work out which method to use for the full course of treatment for people of different body shapes, does that not count as big data analysis? How far technology develops is driven only in small part by scientists' pursuit of perfection. Mostly, the business grows to a point where the technology must advance for the goal to be reached.
Therefore, the real big data "insider" should include at least the following kinds of people:
First, business operations staff. For example, an Internet product manager asks the technical staff to compute a user's mood index for the day at the moment the user arrives at the website; to achieve that kind of dynamic monitoring, only Storm or Spark will do. For example, a telecom operator wants real-time marketing: when a user walks into a service hall, a text message must be pushed to the user at once, reminding him that a suitable blind date is in this very hall (with indicators such as height, measurements, and weight), but that he should buy a 4G phone before the meeting. For example, when a customer comes to the bank to open an account, the bank learns that the customer has visited a hospital clinic twice in the past week, traveled abroad three times, and taken a child swimming twice, and the account manager immediately recommends relevant bancassurance and wealth management products. These business people are often the core drivers of technological progress.
Second, architects. How important is an architect? When a business person and an engineer sit down to discuss a problem, one speaking business language and the other technical jargon, the engineer is often thinking about what code would shut the other up right away. The architect is the one who jumps in and says, "No, you can't do it that way. That solves one problem and creates several more down the line. Follow my plan and the follow-up problems are solved too!" At the IT system level of a non-technology enterprise, more than 70% of the standards are often in the hands of the architects. That said, many excellent architects grew slowly out of engineering roles. Many enterprises have realized the importance of IT architecture, which is why many have both a CTO and a CIO, two equally important positions! The beauty of architecture is that when the IT system runs smoothly, no one notices it. But anyone who has worked in an environment full of chimney-style silos and chaotic architecture knows that in IT, architecture must come first and development second!
Third, investors, that is, the boss. Needless to say, the boss feeds and clothes you, you work hard for the boss, and you are a natural provider of basic data. The boss says let there be mountains, and there are mountains. The boss says he wants real-time data processing and analysis, and there is Storm; the boss says he wants open source, and there is Hadoop; the boss says he wants iterative mining, and there is Spark.
Fourth, scientists. In others' eyes they are geeks and high-end figures, mysterious men and women in the mold of Hawking, who leave early and return late, working day and night, and they are the core force driving the world's technological progress. Apart from the world's top IT companies (which often hold the world's technical direction in their hands), a company generally needs only one or two scientists who are truly devoted to science. Don't make them think about business scenarios, business processes, costs, or project schedules. The only thing they need to think about is how to beat their rivals on some metric; a 0.1% gain on some metric is enough to keep them fighting through sleepless nights. Let's all cheer for these scientists. In China, I think there are no more than 100 real big data scientists.
Fifth, engineers. Engineers are a lovable bunch: young, impulsive, idealistic, and known as "losers" and "keyboard warriors". They work tirelessly for their ideals, yet with every bit of progress they are wondering whether the egg pancakes at the subway entrance have gone up another 50 cents. They are sensitive and conceited, and never deign to argue with business people. The difference between engineers and scientists is that engineers have to change code frequently, test frequently, and release frequently, and the final system is made up of several engineers' code. Every conceited engineer, on seeing the system's legacy code, sneers "hmph, this junk code" and then throws himself into writing code that later generations will go on to despise.
Sixth, followers. Some are trainers, some are "shamate" hairdressers, some are coal bosses, and some are fallen girls. What characterizes them is hype; the only difference from real estate speculators is that they do not have to put up any money. They think anything that touches data counts as big data, and some have never even touched an IT system. They are experts at fishing in troubled waters and padding the numbers, and they are the invisible crowd despised by the preceding kinds of people. Still, I would say: hype is welcome. The fiercer the hype in an industry, the more the truly valuable people get to play their role.
Misunderstanding two: only big data can save the world.
Big data's current technology and applications lie in data analysis, data warehousing, and similar areas, mainly serving OLAP (Online Analytical Processing). From a technical point of view, it stands on two legs, as I summarize it: one is batch data processing (MR, MPP, and so on), and the other is real-time data stream processing (Storm, in-memory databases, and so on). On top of that, some scenarios found that neither the MR framework nor the real-time frameworks could meet the needs of near-line processing and iterative mining, hence the very popular in-memory data processing framework Spark. The current big data architecture at many enterprises looks like this: on one side, the Hive and Pig frameworks on top of Hadoop 2.0 handle the underlying data processing and send the data, processed according to business logic, directly to the application databases; on the other side, the Storm stream processing engine handles real-time data and triggers the corresponding marketing scenarios according to business marketing rules. At the same time, a cluster based on Spark is used to meet the needs of real-time data processing and mining.
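To make the "two legs" concrete, here is a minimal sketch in plain Python. It does not use Hadoop, Storm, or Spark; the event records, the function names, and the "enter_hall" rule are all hypothetical, chosen only to illustrate the difference between scanning a complete data set in one pass and reacting to events one at a time.

```python
from collections import defaultdict

# Hypothetical event records: (user_id, event_type, value)
EVENTS = [
    ("u1", "purchase", 30.0),
    ("u2", "enter_hall", 0.0),
    ("u1", "purchase", 12.5),
    ("u2", "purchase", 8.0),
]


def batch_total_spend(events):
    """Batch leg: scan the full data set once and aggregate per user."""
    totals = defaultdict(float)
    for user_id, event_type, value in events:
        if event_type == "purchase":
            totals[user_id] += value
    return dict(totals)


def stream_marketing_triggers(event_iter):
    """Streaming leg: handle events one at a time and fire a rule at once."""
    for user_id, event_type, _value in event_iter:
        # Assumed business rule: push an offer when a user enters the hall.
        if event_type == "enter_hall":
            yield f"send offer SMS to {user_id}"


print(batch_total_spend(EVENTS))
print(list(stream_marketing_triggers(iter(EVENTS))))
```

The batch function only produces an answer after it has seen everything, which suits reports loaded into an application database. The generator yields an action as soon as the matching event arrives, which is the shape of the real-time marketing trigger described above.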
As the description above shows, big data has not yet entered real transaction systems and has contributed little to OLTP (Online Transaction Processing). As for the many articles linking big data to the Internet of Things, ubiquitous networks, and smart cities, I think big data is only one of the necessary conditions. The rest, including whether the OLTP systems are in place, the physical network, and even the organizational structure, are all important factors.
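The difference between the two kinds of workload can be sketched with Python's built-in sqlite3 module. SQLite here is only a stand-in chosen for illustration, not a big data platform or a real trading system, and the accounts table and the transfer and total_balance functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "alice", 100.0), (2, "bob", 50.0)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """OLTP-style work: a short transaction on a few rows, all-or-nothing."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, src, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown source account")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )


def total_balance(conn):
    """OLAP-style work: a read-only aggregate over the whole table."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]


transfer(conn, 1, 2, 30.0)
print(total_balance(conn))
```

The transfer must be correct for every single row it touches and must fail cleanly, while the aggregate only reads and can tolerate running over a large data set in bulk. The batch and streaming tools named in this article address the second kind of work, which is the author's point about OLTP.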
Finally, I would also say that big data processing technologies, whether cutting-edge like Google Dataflow or mature like Hadoop 2.0, the data warehouse, and Storm, are essentially data processing tools. For many engineers, as long as the data processing flow is clear, it is enough to process data on these platforms with fixed templates and scripts. After all, more than 70% of the value of data lies in business applications, and a dazzling buzzword that does nothing for the business is just the art of slaying dragons: impressive, but with no dragon to slay. Any technology and IT architecture must serve business planning and business development; otherwise technology will only hinder the development of business and productivity.
That covers the two major misunderstandings in the big data industry. Thank you for reading!
© 2024 shulou.com SLNews company. All rights reserved.