This article introduces the basics of how to use Hadoop MapReduce. Many people run into questions like these in real projects, so let me walk you through how to handle them. I hope you read it carefully and come away with something useful!
Apache Hadoop:
It is an open-source distributed computing framework from the Apache open source organization. It provides a distributed file system subproject (HDFS) and a software framework that supports MapReduce distributed computing.
The core of Hadoop is HDFS and MapReduce, an idea often summed up in Chinese as "divide and conquer".
Divide and conquer
The phrase comes from the classical commentary Qunjing Pingyi, "Zhou Guan II": whenever there was sickness in the state, the physicians were divided up, each assigned a share of the patients to treat rather than every case. Simply put, it means splitting a problem into parts and handling each part separately.
This is similar to the idea of classification in design thinking, for example:
In UX research, building user personas and tagging users
In a UED design language, breaking the overall design goal into sub-goals and setting design rules for each; the same approach can decompose the design into elements and define a strategy for each element
In UI and graphic design, studying color palettes, composition, typography and so on separately
In UX design, optimizing functionality, layout, usage paths, information architecture and so on separately
In architecture and landscape design, treating space, materials, function, sight lines and so on as separate concerns
Hadoop is widely used in big data to process anything from hundreds of gigabytes to terabytes or petabytes of data. With HDFS, a cluster of N ordinary computers (say, 128 GB of disk and 4 GB of memory each) behaves like one "large" computer with N x 128 GB of disk and N x 4 GB of memory. Hadoop handles data distribution here: each piece of the original data can be sent to any machine in the cluster to be stored and computed on.
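To make that concrete, here is a minimal sketch (not from the original article) of how a client program might copy local files into HDFS using Hadoop's Java FileSystem API. The cluster configuration and all paths below are assumptions for illustration only.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sketch: copy local files into HDFS so a MapReduce job can read them later.
// Assumes fs.defaultFS (e.g. in core-site.xml) already points at a running cluster;
// all paths below are hypothetical.
public class HdfsPutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path inputDir = new Path("/user/demo/input");   // hypothetical HDFS directory
        fs.mkdirs(inputDir);

        // HDFS splits each file into blocks and replicates them across the cluster,
        // which is what lets N small machines act like one large one.
        fs.copyFromLocalFile(new Path("local-data.txt"), inputDir);

        fs.close();
    }
}

Once the data sits in HDFS, every machine that holds a block of it can take part in the computation described next.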
For the computation, the MapReduce model divides the work into a set of independent tasks so that large amounts of data can be processed in parallel.
In MapReduce, records are processed in isolation by tasks called Mappers. The output of the Mappers is then fed into a second set of tasks called Reducers, where results from different Mappers can be merged.
An example of MapReduce: word counting
Count how many times each word appears across a set of files. Suppose we have two files:
foo.txt: sweet, this is the foo file
bar.txt: this is the bar file
The output should be:
sweet 1
this 2
is 2
the 2
foo 1
bar 1
file 2
The MapReduce pseudo-code is as follows:

mapper(filename, file-contents):
    for each word in file-contents:
        emit(word, 1)

reducer(word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit(word, sum)
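For readers who want to run this, here is a minimal runnable version of the same word count as a standard Hadoop Java job. This sketch is not taken from the original article; the class names are illustrative, and the lowercasing and punctuation-stripping step is an added assumption so that "sweet," is counted as "sweet", matching the expected output above.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every word in its slice of the input.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Lowercase and strip punctuation (assumption for this example).
            String line = value.toString().toLowerCase().replaceAll("[^a-z\\s]", " ");
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);           // e.g. ("foo", 1)
            }
        }
    }

    // Reducer: sums the 1s for each word across all mappers.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);             // e.g. ("file", 2)
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // optional local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // must not exist yet
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged into a jar, a job like this is normally launched with the hadoop jar command, with the first argument pointing at the HDFS directory holding foo.txt and bar.txt and the second at an output directory that does not yet exist.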
Hadoop is not a replacement for a database; it is a computing framework, and you can think of it as a "calculator" for big data. Hadoop stores data in files and does not index them, so to find something you have to run a MapReduce job that scans all the data. That takes time and means Hadoop cannot simply be used in place of a database. Hadoop also does not support database-style updates or in-place changes to data.
That's all for "how to use Hadoop MapReduce". Thank you for reading!