This article walks through the Hadoop 0.20 update in some detail. I think it is quite practical, so I am sharing it for reference; I hope you gain something from reading it.
Hadoop 0.20 Update Notes
I have recently been learning Hadoop 0.20.1 and came across the article "What's New in Hadoop Core 0.20" online. I have translated it, incompletely, and am posting it here for ease of future reference. If you can read English, read the original rather than the translation below.
Hadoop Core 0.20.0 was released on April 22, 2009. This release differs considerably from 0.19, with many changes at the user level.
Core
The two main components of Hadoop, the Hadoop Distributed File System (HDFS) and MapReduce, have been moved into separate subprojects so that they can have their own release cycles and be easier to manage; in the 0.20 release, however, the two are still released together. In this release, hadoop-site.xml is split into three configuration files: core-site.xml, hdfs-site.xml, and mapred-site.xml (HADOOP-4631). You can also continue to use a single hadoop-site.xml; Hadoop will only raise a warning. The default configuration files have been moved out of the conf folder and into the .jar files; their contents can be viewed as HTML files in the docs folder. start-dfs.sh, start-mapred.sh, stop-dfs.sh, and stop-mapred.sh are now recommended in place of start-all.sh and stop-all.sh.
Those are some of the major changes, but the ability to put comments in the slaves file (HADOOP-4454) may be more useful in practice.
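To make the configuration split concrete, here is a minimal sketch of loading the new files with Hadoop's Configuration class. The /etc/hadoop/conf paths and the ConfDemo class name are illustrative only, not something the release mandates.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class ConfDemo {
    public static void main(String[] args) {
        // core-default.xml and core-site.xml are picked up automatically.
        Configuration conf = new Configuration();

        // Add the other two site files explicitly (paths are illustrative).
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));

        // Properties from all three files are now visible through one object.
        System.out.println(conf.get("fs.default.name", "file:///"));
    }
}

Resources added later override earlier ones, which is how site files can override the defaults packaged inside the .jar files.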
Hadoop configuration files also support the XInclude element for including additional configuration files (HADOOP-4944, https://issues.apache.org/jira/browse/HADOOP-4944). This mechanism makes configuration files more modular and reusable.
Hadoop has also made a series of moves around security. 0.20.0 adds service-level authorization (HADOOP-4348), which lets you restrict which clients may communicate with the Hadoop daemons.
The LZO compression libraries have been moved out of Hadoop Core for licensing reasons, but if your code is licensed under the GPL, you can still get LZO from the hadoop-gpl-compression project.
HDFS
HDFS append has been disabled by default since 0.19.1.
Hadoop adds a new admin command: hadoop dfsadmin -saveNamespace. With the cluster in safe mode, this command makes the namenode dump its namespace to disk.
MapReduce
The biggest change in the Hadoop 0.20 update is a new Java API built around "context objects". Mapper and Reducer were made abstract classes (not interfaces), introduced together with the context objects to make the API easier to evolve in the future. The main points (a sketch below illustrates them):
1. JobConf no longer exists; job configuration information is held by a Configuration.
2. Job configuration information is now easier to get at in the map() or reduce() methods: just call context.getConfiguration().
3. The new API supports a "pull" style of iteration. Previously, if you wanted to work through a batch of records in a mapper, you had to save them to instance variables of the Mapper class; in the new API, you just call nextKeyValue() to pull the next record.
4. You can also override the run() method to control how the mapper works.
5. The IdentityMapper and IdentityReducer classes are gone from the new API, because the default Mapper and Reducer already perform the identity function.
The new API is not backward compatible, so you have to rewrite your app. Note that the new API is in the org.apache.hadoop.mapreduce package and its subpackages, while the old API is in org.apache.hadoop.mapred.
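To illustrate points 1-5, here is a minimal sketch of a mapper written against the new org.apache.hadoop.mapreduce API. The TokenMapper class, the wordcount.lowercase property, and the word-count logic are invented for illustration; only the Mapper and Context API itself comes from the release.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    private boolean lowercase;

    @Override
    protected void setup(Context context) {
        // Point 2: configuration is reachable straight from the context.
        Configuration conf = context.getConfiguration();
        lowercase = conf.getBoolean("wordcount.lowercase", false); // made-up property
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = lowercase ? value.toString().toLowerCase() : value.toString();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }

    // Points 3 and 4: run() can be overridden, and records can be pulled
    // one at a time with nextKeyValue() instead of being pushed to map().
    @Override
    public void run(Context context) throws IOException, InterruptedException {
        setup(context);
        while (context.nextKeyValue()) {
            map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
        cleanup(context);
    }
}

Overriding run() is optional; the default implementation performs the same loop, so the override above only makes the pull-style iteration explicit.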
Multiple task assignment: this optimization allows the JobTracker to assign more than one task to a tasktracker per heartbeat, which increases utilization. A new configuration parameter, mapred.reduce.slowstart.completed.maps, is also introduced (default 0.05); it sets the fraction of map tasks that must complete before reduce tasks are scheduled.
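As a sketch, the parameter can be set on the job's Configuration before submission. The 0.80 value below is an arbitrary example meaning reduces are scheduled only once 80% of the maps have finished; SlowStartDemo is a made-up name.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SlowStartDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Schedule reduces only after 80% of maps complete (default 0.05).
        conf.setFloat("mapred.reduce.slowstart.completed.maps", 0.80f);
        Job job = new Job(conf, "slow-start-demo");
        // ... set mapper/reducer classes and input/output paths here,
        // then submit with job.waitForCompletion(true).
    }
}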
The input formats gain some interesting improvements. FileInputFormat now does a better job of selecting which hosts to use as the locations for each split. 0.20 also introduces the CombineFileInputFormat class, which can pack many small files into a single split.
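Here is a rough sketch of what using CombineFileInputFormat might look like with the old-style (org.apache.hadoop.mapred) API in 0.20. CombineFileInputFormat is abstract, so a subclass must supply a record reader; the names CombinedTextInputFormat and PerFileLineReader are invented for this example, and the exact wiring should be checked against the 0.20 javadoc.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.LineRecordReader;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.CombineFileInputFormat;
import org.apache.hadoop.mapred.lib.CombineFileRecordReader;
import org.apache.hadoop.mapred.lib.CombineFileSplit;

public class CombinedTextInputFormat extends CombineFileInputFormat<LongWritable, Text> {

    @Override
    public RecordReader<LongWritable, Text> getRecordReader(
            InputSplit split, JobConf conf, Reporter reporter) throws IOException {
        // CombineFileRecordReader walks the files packed into the combined
        // split, opening one PerFileLineReader per file (raw cast for generics).
        return new CombineFileRecordReader<LongWritable, Text>(
                conf, (CombineFileSplit) split, reporter, (Class) PerFileLineReader.class);
    }

    // Wrapper with the constructor signature CombineFileRecordReader expects:
    // (CombineFileSplit, Configuration, Reporter, Integer).
    public static class PerFileLineReader implements RecordReader<LongWritable, Text> {
        private final LineRecordReader delegate;

        public PerFileLineReader(CombineFileSplit split, Configuration conf,
                                 Reporter reporter, Integer idx) throws IOException {
            // Carve the idx-th file out of the combined split.
            FileSplit fileSplit = new FileSplit(split.getPath(idx),
                    split.getOffset(idx), split.getLength(idx), split.getLocations());
            delegate = new LineRecordReader(conf, fileSplit);
        }

        public boolean next(LongWritable key, Text value) throws IOException {
            return delegate.next(key, value);
        }
        public LongWritable createKey() { return delegate.createKey(); }
        public Text createValue() { return delegate.createValue(); }
        public long getPos() throws IOException { return delegate.getPos(); }
        public float getProgress() throws IOException { return delegate.getProgress(); }
        public void close() throws IOException { delegate.close(); }
    }
}

A job would then select it with conf.setInputFormat(CombinedTextInputFormat.class).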
Gridmix2 is the second generation of the Gridmix MapReduce workload benchmark suite.
Contrib
Two newly contributed modules appear in the 0.20 branch:
HDFS Proxy, which exposes HDFS over a read-only HSFTP interface, providing secure read-only access.
Vaidya, a tool for diagnosing MapReduce jobs after they run by examining the job history and configuration. It provides suggestions for fixing common problems.
About "Hadoop 0.20 update example analysis" This article is shared here, I hope the above content can be of some help to everyone, so that you can learn more knowledge, if you think the article is good, please share it to let more people see.