

Build a CDH version of the Hadoop environment

2025-01-17 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

Build a stand-alone Hadoop environment (CDH version)

1. Download the installation package

Download address: http://hadoop.apache.org/ (the Apache site hosts the upstream releases; CDH builds of Hadoop are distributed by Cloudera)

Download hadoop-2.6.0-cdh6.12.2

2. Install: tar -zxvf hadoop-2.6.0-cdh6.12.2.tar.gz

3. Create a folder to hold Hadoop and rename the extracted directory (e.g. to hadoop260, matching the HADOOP_HOME set below)

4. Go into the etc folder and then its hadoop subfolder: cd etc/hadoop

5. Configure hadoop-env.sh: set the JAVA_HOME path in it to your JDK installation
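The change to hadoop-env.sh is a single line; the JDK path below is an assumption — replace it with the actual location of your JDK:

```shell
# hadoop-env.sh — point Hadoop at the JDK (path is an example, adjust to your system)
export JAVA_HOME=/opt/bigdata/jdk180
```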

6. Configure core-site.xml

fs.defaultFS — the default file system URI; Hadoop uses it to determine the file system's host, port, etc.

hadoop.tmp.dir — the base directory for Hadoop's temporary files.

hadoop.proxyuser.root.users — the users that root may impersonate when logging in remotely (proxy-user setting).

hadoop.proxyuser.root.groups — the groups whose members root may impersonate when logging in remotely.
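Putting the four properties above together, a minimal core-site.xml sketch might look like this — the host, port, and tmp directory are assumptions to adjust for your machine:

```xml
<!-- core-site.xml: minimal single-node sketch; host/port/paths are examples -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.56.110:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/bigdata/hadoop260/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.users</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
```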

7. Modify hdfs-site.xml

dfs.replication — the number of copies kept of each file's blocks (default 3). The 128 MB figure belongs to a different setting, the HDFS block size (dfs.blocksize): a file smaller than one block is not split. The replication factor is fixed when a file is uploaded to HDFS; changing it later has no effect on files that have already been uploaded.
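Since this is a single machine, a replication factor of 1 is the usual choice; a minimal hdfs-site.xml sketch:

```xml
<!-- hdfs-site.xml: one copy of each block, suitable for a single node -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```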

8. Copy mapred-site.xml.template to mapred-site.xml and modify its content

mapreduce.framework.name — set to yarn so that MapReduce programs are executed on the YARN framework.
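The resulting mapred-site.xml needs only this one property:

```xml
<!-- mapred-site.xml: run MapReduce jobs on YARN -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```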

9. Configure yarn-site.xml

yarn.resourcemanager.address — the address clients use to reach the ResourceManager; through it clients submit applications to RM, kill applications, and so on.

yarn.nodemanager.aux-services — lets users define custom auxiliary services; for example, the shuffle function of MapReduce is implemented this way, so that the NodeManager can be extended with additional services.

The name comes from card shuffling — turning ordered data into disorder, the more random the better. In Hadoop, the whole process from Map output to Reduce input can be broadly called Shuffle.
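A minimal yarn-site.xml sketch covering the two properties above — the host is an assumption, and 8032 is the conventional default ResourceManager port:

```xml
<!-- yarn-site.xml: ResourceManager address and the MapReduce shuffle service -->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.56.110:8032</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```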

10. Modify /etc/profile to add the following (then run source /etc/profile to apply the changes)

export HADOOP_HOME=/opt/bigdata/hadoop260
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME

11. Format the NameNode

hdfs namenode -format

12. Start with start-all.sh

To shut down, use stop-all.sh

13. Run jps to check whether all processes are up (you should see NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager)

14. Visit the NameNode web UI

http://192.168.56.110:50070
