CDH5 offline installation (latest version 5.3.3, with built-in Hadoop 2.5.0)

2025-01-21 Update From: SLTechnology News&Howtos

First, here is the official offline installation guide: http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/installation_installation.html#../topics/cm_ig_install_path_c.html

Little information about the latest version is available online, so many problems had to be worked out on my own. If you follow the steps below exactly, the installation should succeed: every step was personally tested on my company's cluster. I also consulted a large amount of material online, and I would like to thank the people who selflessly shared it.

The two online installation methods are not covered here; see the documentation on the official website for those. I do not recommend installing online, because problems that come up during the installation are very troublesome to deal with.

Let's get straight to the point and download the required offline installation packages.

Cloudera Manager 5.3.3 download address

http://archive.cloudera.com/cm5/

Download the cloudera-manager-el6-cm5.3.3_x86_64.tar.gz file

CDH 5.3.3 download address

http://archive.cloudera.com/cdh5/

Download CDH-5.3.3-1.cdh5.3.3.p0.5-el6.parcel, CDH-5.3.3-1.cdh5.3.3.p0.5-el6.parcel.sha1, and manifest.json

Of these three files, CDH-5.3.3-1.cdh5.3.3.p0.5-el6.parcel.sha1 must be renamed to

CDH-5.3.3-1.cdh5.3.3.p0.5-el6.parcel.sha
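
The rename can be scripted, and the parcel can be checked against its published hash at the same time. This is a minimal sketch, assuming the downloaded files sit in one directory and that the .sha1 file holds the parcel's hex SHA-1 digest as its first field. The function name prepare_parcel is made up for illustration.

```shell
# prepare_parcel DIR
# Hypothetical helper: for every *.parcel.sha1 file in DIR, compare the
# hash it contains with the parcel's actual SHA-1, then rename the hash
# file to *.parcel.sha as the article requires.
prepare_parcel() {
    dir=$1
    for sha1_file in "$dir"/*.parcel.sha1; do
        [ -e "$sha1_file" ] || continue          # glob matched nothing
        parcel=${sha1_file%.sha1}                # strip the .sha1 suffix
        expected=$(awk '{print $1; exit}' "$sha1_file")
        actual=$(sha1sum "$parcel" | awk '{print $1}')
        if [ "$expected" != "$actual" ]; then
            echo "checksum mismatch: $parcel" >&2
            return 1
        fi
        mv "$sha1_file" "$parcel.sha"
    done
}
```

For example, prepare_parcel /opt/cloudera/parcel-repo would be run once the files have been copied there. A mismatch stops the function before that hash file is renamed.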

Separate download address for the tar packages and source code of each CDH 5.3.3 component

http://archive.cloudera.com/cdh5/cdh/5/

Environment preparation

1. CentOS release 6.5 (Final), as reported by cat /etc/issue

2. Three hosts: cdh2.hadoop.com, cdh3.hadoop.com, cdh4.hadoop.com

3. Each machine has 16 GB of memory (32 GB recommended) and a 1 TB hard disk.

4. Every machine must be able to reach the external network.

Turn off the firewall on all nodes

If the firewall is running, execute the following two commands:

Stop it for the current session: service iptables stop

Disable it permanently: chkconfig iptables off

Run both commands, then check the firewall status:

service iptables status

Turn off SELinux on all nodes

In /etc/selinux/config, change SELINUX=enforcing to SELINUX=disabled (takes effect after a reboot)

setenforce 0 takes effect immediately (it switches SELinux to permissive mode until the next reboot)

View the SELinux status:

1. /usr/sbin/sestatus -v ## SELinux is on if the SELinux status parameter reads enabled

SELinux status: enabled

2. getenforce ## this command can also be used to check
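
The edit to /etc/selinux/config can also be made non-interactively. The sketch below is one possible way to do it, not part of the original procedure. The function name is hypothetical, and it accepts the file path as an argument so that it can be tried on a copy first.

```shell
# disable_selinux_config [FILE]
# Rewrites the SELINUX= line of FILE (default: /etc/selinux/config) to
# SELINUX=disabled. SELINUXTYPE= lines and comments are left untouched.
disable_selinux_config() {
    file=${1:-/etc/selinux/config}
    tmp_file=$(mktemp) || return 1
    # Write the edited copy first, then overwrite the original in place
    # so that the file keeps its owner and permissions.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp_file" &&
        cat "$tmp_file" > "$file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

Run as root with no argument, it edits the real configuration file; the change still only takes effect after a reboot.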

Configure the hostname and IP address

Edit the /etc/hosts file and add:

192.168.1.105 cdh2.hadoop.com

192.168.1.106 cdh3.hadoop.com

192.168.1.107 cdh4.hadoop.com
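
Because the same three entries are needed on every node, it can help to add them with a small function that is safe to run more than once. A minimal sketch follows; the function names are hypothetical, and the addresses are the ones listed above.

```shell
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless that exact line is already present.
add_host_entry() {
    hosts_file=$1 ip=$2 name=$3
    if ! grep -qxF "$ip $name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# add_cluster_hosts [FILE]
# Adds the three cluster entries from this article (default: /etc/hosts).
add_cluster_hosts() {
    target=${1:-/etc/hosts}
    add_host_entry "$target" 192.168.1.105 cdh2.hadoop.com
    add_host_entry "$target" 192.168.1.106 cdh3.hadoop.com
    add_host_entry "$target" 192.168.1.107 cdh4.hadoop.com
}
```

Passing a scratch file as the argument shows the result without touching /etc/hosts.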

Modify the hostname

Edit the /etc/sysconfig/network file (on each node, using that node's own name):

HOSTNAME=cdh2.hadoop.com

Run hostname cdh2.hadoop.com to make the hostname take effect immediately

Then run: service network restart

Set up passwordless SSH login (all nodes)

(1) On all nodes (cdh2.hadoop.com, cdh3.hadoop.com, cdh4.hadoop.com):

Generate a key pair with an empty passphrase: run ssh-keygen -t rsa and press Enter at every prompt.

(2) On the master node (cdh2.hadoop.com), add the public key to the authentication file:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

(3) Copy the file with scp to the DataNode node (cdh3.hadoop.com):

scp ~/.ssh/authorized_keys root@cdh3.hadoop.com:~/.ssh/

(4) On cdh3.hadoop.com, add its public key to the authentication file:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

(5) Copy the authentication file from cdh3.hadoop.com to cdh4.hadoop.com:

scp ~/.ssh/authorized_keys root@cdh4.hadoop.com:~/.ssh/

(6) On cdh4.hadoop.com, add its public key to the authentication file:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

(7) Set the access permission on authorized_keys:

chmod 600 ~/.ssh/authorized_keys

(8) From cdh4.hadoop.com, copy the resulting authentication file to the other nodes:

scp ~/.ssh/authorized_keys root@cdh2.hadoop.com:~/.ssh/

scp ~/.ssh/authorized_keys root@cdh3.hadoop.com:~/.ssh/

(9) Test (each login should succeed without a password):

ssh cdh2.hadoop.com

ssh cdh3.hadoop.com

ssh cdh4.hadoop.com
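
Steps (2) to (8) pass one authorized_keys file from node to node so that it accumulates every public key. A different route to the same result, sketched here as an alternative and not as the procedure above, is to gather each node's id_rsa.pub in one place and build the file in a single pass. The function name is hypothetical.

```shell
# build_authorized_keys OUT PUBKEY...
# Concatenates the given public-key files into OUT (overwriting it),
# drops duplicate lines, and restricts OUT to mode 600 as in step (7).
build_authorized_keys() {
    out_file=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: build_authorized_keys OUT PUBKEY..." >&2
        return 1
    fi
    cat "$@" | sort -u > "$out_file" || return 1
    chmod 600 "$out_file"
}
```

The resulting file would then be copied to ~/.ssh/ on every node with scp, as in step (8).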

Install the NTP service and synchronize time

Install the package on all nodes: yum install ntp

After configuration is complete, enable it at boot: chkconfig ntpd on

Check that the setting took effect: chkconfig --list ntpd, where runlevels 2-5 showing on indicates success.

Master node configuration (cdh2.hadoop.com)

Before configuring, synchronize the time manually with ntpdate, so that the gap between the local clock and the time server is not too large for ntpd to synchronize normally. Using 202.120.2.101 (the NTP server of the Shanghai Jiao Tong University Network Center) as the time source, run:

ntpdate -u 202.120.2.101

Edit the configuration file /etc/ntp.conf and comment out the entries that are not needed:

driftfile /var/lib/ntp/drift

restrict 127.0.0.1

restrict -6 ::1

restrict default nomodify notrap

server 202.120.2.101 prefer # remote server address

includefile /etc/ntp/crypto/pw

keys /etc/ntp/keys

After modifying the configuration file, run the following commands:

1 service ntpd start

2 chkconfig ntpd on (enable at boot)

Use the ntpstat command to view the synchronization status

Configure the NTP clients (cdh3.hadoop.com, cdh4.hadoop.com)

Edit the configuration file /etc/ntp.conf:

driftfile /var/lib/ntp/drift

restrict 127.0.0.1

restrict -6 ::1

restrict default kod nomodify notrap nopeer noquery

restrict -6 default kod nomodify notrap nopeer noquery

server cdh2.hadoop.com # hostname or IP of the master node

includefile /etc/ntp/crypto/pw

keys /etc/ntp/keys

Synchronize the time manually: ntpdate -u cdh2.hadoop.com

Start the service: service ntpd start

Enable it at boot: chkconfig ntpd on
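
Since both client nodes use the same file, the client configuration can be produced by a function, not typed twice. This is a sketch under the assumption that the file should contain only the directives shown in the client listing; the function name is hypothetical.

```shell
# ntp_client_conf SERVER
# Prints an ntp.conf for a client node that synchronizes from SERVER,
# using the directives from the client configuration in this article.
ntp_client_conf() {
    if [ -z "$1" ]; then
        echo "usage: ntp_client_conf SERVER" >&2
        return 1
    fi
    cat <<EOF
driftfile /var/lib/ntp/drift
restrict 127.0.0.1
restrict -6 ::1
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
server $1
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
EOF
}
```

On a client, ntp_client_conf cdh2.hadoop.com > /etc/ntp.conf would replace the existing file, so keeping a backup of the original first is advisable.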

Install Oracle Java (all nodes)

CDH 5.3.3 requires Java 7. Use rpm -qa | grep java to list the installed Java packages; the results vary from system to system. These are the OpenJDK packages that had to be uninstalled on my machine:

rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64

rpm -e --nodeps java-1.7.0-openjdk-devel-1.7.0.45-2.4.3.3.el6.x86_64

rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64

rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64

rpm -e --nodeps java-1.6.0-openjdk-devel-1.6.0.0-1.66.1.13.0.el6.x86_64

Download the JDK rpm package from Oracle's official website and install it with rpm -ivh followed by the package name:

rpm -ivh jdk-7u79-linux-x64.rpm

Edit /etc/profile to set the environment variables:

export JAVA_HOME=/usr/java/jdk1.7.0_79

export JRE_HOME=$JAVA_HOME/jre

export PATH=$JAVA_HOME/bin:$PATH

Apply the change: source /etc/profile
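
Because the list of preinstalled Java packages differs between machines, a filter over the rpm -qa output can stand in for a hand-typed list. A minimal sketch, assuming the packages to remove are the ones whose names start with java-, a version number, and then openjdk or gcj, as in the list above. The function name is hypothetical.

```shell
# select_bundled_java
# Reads package names on stdin (one per line, e.g. from `rpm -qa`) and
# prints only those named java-<version>-openjdk... or java-<version>-gcj...
select_bundled_java() {
    grep -E '^java-[0-9.]+-(openjdk|gcj)'
}
```

Running rpm -qa | select_bundled_java first shows what would be removed; each printed name can then be passed to rpm -e --nodeps.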

Install and configure MySQL (master node)

Run: yum install mysql-server

Enable it at boot: chkconfig mysqld on

Start MySQL: service mysqld start

Set the initial password for root: mysqladmin -u root password 'root'

Run mysql -uroot -proot to enter the MySQL command line, then create the following databases (which ones you need depends on the services you install; I installed the core components):

# hive

create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

# activity monitor

create database amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

# hue

create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

# grant the root user access to all databases from the master node

grant all privileges on *.* to 'root'@'cdh2.hadoop.com' identified by 'root' with grant option;

flush privileges;
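
The set of databases depends on which services are installed, so the statements can be generated from a list of names. A minimal sketch with a hypothetical function name; IF NOT EXISTS is an addition of this sketch so that rerunning it does not fail on databases that already exist.

```shell
# create_db_sql NAME...
# Prints one CREATE DATABASE statement per name, with the character set
# and collation used in this article, each terminated by a semicolon.
create_db_sql() {
    for db_name in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db_name"
    done
}
```

For the three databases above, create_db_sql hive amon hue | mysql -uroot -proot sends the statements to the server.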

Install the Cloudera Manager Server and Agent

Unpack the installation package on the master node (cdh2.hadoop.com)

Copy the Cloudera Manager package cloudera-manager-el6-cm5.3.3_x86_64.tar.gz to the /opt directory and unpack it:

sudo tar -zxvf cloudera-manager*.tar.gz

Then copy the MySQL JDBC driver jar, mysql-connector-java-5.1.33-bin.jar, to the

/opt/cm-5.3.3/share/cmf/lib/ directory.

On the master node, initialize the Cloudera Manager 5 database by running the following command:

/opt/cm-5.3.3/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -proot --scm-host localhost scm scm scm

Modify the Agent configuration

Edit /opt/cm-5.3.3/etc/cloudera-scm-agent/config.ini:

server_host=cdh2.hadoop.com

Copy the files to the Agent nodes

scp -r /opt/cm-5.3.3 root@cdh3.hadoop.com:/opt/

scp -r /opt/cm-5.3.3 root@cdh4.hadoop.com:/opt/

Create the cloudera-scm user on all nodes, including the server:

sudo useradd --system --home=/opt/cm-5.3.3/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

Install CDH 5.3.3

Copy the three previously downloaded CDH 5.3.3 installation files to the /opt/cloudera/parcel-repo/ directory on the master node

Run the following command to give the cloudera-scm user ownership of the parcel-repo folder:

sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo

Create the parcels folder and set its ownership:

sudo mkdir -p /opt/cloudera/parcels

sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcels

Start the Server and Agent

On the master node, run /opt/cm-5.3.3/etc/init.d/cloudera-scm-server start to start the server.

On all nodes, run /opt/cm-5.3.3/etc/init.d/cloudera-scm-agent start to start the Agent service.

After startup, open http://cdh2.hadoop.com:7180 in a browser. The default user name and password are both admin.

Problems encountered while installing CDH 5.3.3

1. YARN error: the NodeManager cannot be started, with the message "Error found before invoking supervisord: dictionary update sequence element #78 has length 1; 2 is required"

This error is a bug in CM. The fix is to edit the

/opt/cm-5.3.3/lib64/cmf/agent/src/cmf/util.py file. Change this line:

pipe = subprocess.Popen(['/bin/bash', '-c', ". %s; %s; env" % (path, command)], stdout=subprocess.PIPE, env=caller_env)

to:

pipe = subprocess.Popen(['/bin/bash', '-c', ". %s; %s; env | grep -v { | grep -v }" % (path, command)], stdout=subprocess.PIPE, env=caller_env)
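
The two added grep stages drop every line of the env output that contains a brace. Such lines appear, for example, when a shell function has been exported into the environment, and they presumably cannot be split into a name and a value, which would match the error message. The filter is shown here in isolation with a hypothetical function name.

```shell
# drop_brace_lines
# Copies stdin to stdout, omitting every line that contains "{" or "}".
# Equivalent to the `grep -v { | grep -v }` stages in the patched line.
drop_brace_lines() {
    grep -v '[{}]'
}
```

Piping env through drop_brace_lines on an affected node shows what the agent would see after the patch.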

2. Hive error: unable to load the driver

On all nodes, copy the MySQL jar to the

/opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/hive/lib directory

3. Sqoop2 error: the message roughly says that the database cannot be created; the bundled Derby driver does not seem to work. Download the latest derby.jar from the official website. I downloaded db-derby-10.11.1.1-lib.tar.gz, and the unpacked archive contains derby.jar. Follow these steps to solve the problem:

(1) Delete the /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/sqoop2/webapps/sqoop/WEB-INF/lib/derby-{version}.jar symbolic link

(2) Copy derby.jar to the /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/jars directory

(3) Create a new link: ln -s /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/jars/derby.jar /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/sqoop2/webapps/sqoop/WEB-INF/lib/derby.jar

4. Modify swappiness

Add this line at the end of /etc/sysctl.conf: vm.swappiness=0

For a temporary change: echo 0 > /proc/sys/vm/swappiness
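
Appending the line by hand on every node can leave duplicate or conflicting vm.swappiness entries. The sketch below, with a hypothetical function name, removes any existing setting before appending the new one, so it can be run repeatedly.

```shell
# set_swappiness_conf VALUE [FILE]
# Ensures FILE (default: /etc/sysctl.conf) contains exactly one
# vm.swappiness line, set to VALUE: existing lines are removed and the
# new setting is appended at the end.
set_swappiness_conf() {
    swap_value=$1
    conf_file=${2:-/etc/sysctl.conf}
    scratch=$(mktemp) || return 1
    # grep exits non-zero when it prints nothing; that is not an error here.
    grep -v '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf_file" > "$scratch"
    printf 'vm.swappiness=%s\n' "$swap_value" >> "$scratch"
    cat "$scratch" > "$conf_file"
    status=$?
    rm -f "$scratch"
    return $status
}
```

After set_swappiness_conf 0, running sysctl -p loads the file without a reboot.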

5. Oozie fails on its second start: DB schema exists

rm -rf /var/lib/oozie/*

6. HDFS cannot be reformatted

Delete the /dfs folder on all nodes: rm -rf /dfs

Basic use of CDH 5.3.3

Start

After the installation completes, all services are already running. The case to consider here is a virtual environment: after you power the machines back on, start the services by hand:

Server node (cdh2.hadoop.com):

/opt/cm-5.3.3/etc/init.d/cloudera-scm-server start

Agent nodes (cdh2.hadoop.com, cdh3.hadoop.com, cdh4.hadoop.com):

/opt/cm-5.3.3/etc/init.d/cloudera-scm-agent start

Once these services are up, start all Cluster 1 services from the web page, and then start the Cloudera Management Service.

Stop

Shut down in the reverse of the startup order.
