CDH5 offline installation manual

Updated 2025-04-06. From: SLTechnology News&Howtos

Cloudera Manager (offline) installation manual

(follow the steps)

1. Preparatory work

1.1. System environment

Hardware: VM virtual machines

Network layout:

192.168.210.131 master

192.168.210.132 slave1

192.168.210.134 slave2

192.168.210.133 slave3

Operating system: Linux version 2.6.32-220.el6.x86_64

1.2. Install JDK

Download the rpm package from the official website. This installation uses version 1.7.0_79 (CDH5 may also support versions earlier than 1.7, but that has not been tested). Execute the command:

rpm -ivh jdk-7u79-linux-x64.rpm

The rpm package does not require any other environment variables, so we only need to configure a global JAVA_HOME variable. Execute the command:

echo "export JAVA_HOME=/usr/java/latest/" >> /etc/profile.d/java.sh

Check that the JDK is installed correctly:

java -version

javac -version

1.3. Modify hostname

Modify the /etc/sysconfig/network file:

NETWORKING=yes

HOSTNAME=master

NETWORKING_IPV6=no

where HOSTNAME is the hostname of this node.

If the hostname differs from the one set when the system was installed, run the hostname command so the change takes effect immediately; otherwise the nodes will have trouble reaching each other:

hostname master

Modify the /etc/hosts file and add one line per node (use the real addresses of your nodes; note that this example uses a different subnet from the list in 1.1):

192.168.1.101 master

192.168.1.102 slave1

192.168.1.103 slave2

192.168.1.104 slave3

Restart the network service:

service network restart
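
Because every node needs the same /etc/hosts entries, it can help to generate them from one list instead of typing them on each machine. The following is a minimal sketch, assuming the four example addresses above; the function name print_hosts_entries is hypothetical and not part of any tool used in this manual.

```shell
# Print /etc/hosts lines from "ip:hostname" pairs (hypothetical helper).
print_hosts_entries() {
    for pair in "$@"; do
        ip=${pair%%:*}      # text before the first colon
        name=${pair#*:}     # text after the first colon
        printf '%s %s\n' "$ip" "$name"
    done
}

# Example (run as root on every node):
# print_hosts_entries 192.168.1.101:master 192.168.1.102:slave1 \
#     192.168.1.103:slave2 192.168.1.104:slave3 >> /etc/hosts
```

The helper only prints text, so the output can be reviewed before it is appended to /etc/hosts.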

1.4. Turn off the firewall

A firewall can cause all kinds of communication failures between Hadoop components, so turn it off:

service iptables stop (temporary shutdown)

chkconfig iptables off (takes effect after restart)

Disable SELinux:

setenforce 0 (temporary)

Set SELINUX=disabled in /etc/selinux/config (takes effect after restart)

View firewall status: service iptables status
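
The SELinux change in /etc/selinux/config can be scripted so that it is applied identically on every node. This is a minimal sketch under the assumption that the file contains a line starting with SELINUX=; the function name disable_selinux_in_file is hypothetical.

```shell
# Set SELINUX=disabled in the given config file (defaults to /etc/selinux/config).
disable_selinux_in_file() {
    cfg=${1:-/etc/selinux/config}
    tmp=$(mktemp) || return 1
    # Rewrite only the SELINUX= line; SELINUXTYPE= and comments are left untouched.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Example (run as root): disable_selinux_in_file /etc/selinux/config
```

Writing the result back with cat keeps the original file's owner and permissions; as stated above, the change only takes effect after a restart.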

1.5. Modify swap space usage

Temporary effect:

sudo sysctl vm.swappiness=0

To keep the setting after a restart, add it to the configuration file /etc/sysctl.conf:

echo "vm.swappiness=0" >> /etc/sysctl.conf
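
Appending to /etc/sysctl.conf with >> adds a duplicate line every time the step is repeated. The sketch below appends the setting only when the key is not already present; the function name ensure_sysctl_line is hypothetical, and it deliberately leaves an existing line with a different value alone.

```shell
# Append "key=value" to a sysctl-style file only if the key is not already set there.
ensure_sysctl_line() {
    key="$1"; value="$2"; file=${3:-/etc/sysctl.conf}
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        return 0    # key already present; leave the existing line alone
    fi
    printf '%s=%s\n' "$key" "$value" >> "$file"
}

# Example (run as root): ensure_sysctl_line vm.swappiness 0
```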

1.6. Passwordless SSH (key authentication)

On all nodes, execute the following commands and press Enter at every prompt ($hostname stands for the node's own hostname, for example slave1):

ssh-keygen -t rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys_$hostname

Copy each data node's authorized_keys_$hostname file to the master node:

scp ~/.ssh/authorized_keys_$hostname hadoop@master:~/.ssh/

On the master, merge each node's authorized_keys_$hostname file into authorized_keys:

cat ~/.ssh/authorized_keys_$hostname >> ~/.ssh/authorized_keys

From the master, distribute the merged authorized_keys to every node:

scp ~/.ssh/authorized_keys hadoop@slave1:~/.ssh/

You will be asked for a password during these copies; after that, logging in to the other machines no longer requires one.

If ssh waits a long time before prompting for the password at login, change the following settings:

In /etc/ssh/sshd_config, set UseDNS to no, then restart the sshd service: service sshd restart

In /etc/ssh/ssh_config, set GSSAPIAuthentication to no

When many hosts need mutual SSH trust, the sshpass tool makes this more efficient. The steps are:

1. Install the sshpass tool

2. Generate a key pair on every host, for example: sshpass -p 123456 ssh root@bigdata-2 "ssh-keygen -t rsa"

3. Merge the public keys of all hosts:

sshpass -p 123456 ssh root@bigdata-2 "cat /root/.ssh/id_rsa.pub" >> /root/.ssh/authorized_keys

4. Distribute the public key file to each host

1.7. Check the time zone

View current time zone: date -R

Change the time zone to Shanghai: ln -sf /usr/share/zoneinfo/posix/Asia/Shanghai /etc/localtime

1.8. Disable the transparent huge page setting

Execute the command "echo never > /sys/kernel/mm/transparent_hugepage/defrag" to disable this setting, then add the same command to an init script such as /etc/rc.local so that it is applied again when the system restarts.

Run vi /etc/rc.local and add:

echo never > /sys/kernel/mm/transparent_hugepage/defrag

1.9. Modify the number of files allowed to be opened

Modify the file /etc/rc.local and add the following configuration:

ulimit -SHn 65535

1.10. Install the NTP service

On the client, modify the ntp configuration file /etc/ntp.conf and add the ntp server:

server 192.168.10.188

Start the local ntp service: service ntpd start

Manually synchronize: ntpdate -u 192.168.10.188

View ntp synchronization status: watch ntpq -p

1.11. Install the MySQL database

Check the installation environment

To find out whether mysql was installed previously, run:

rpm -qa | grep -i mysql

In this example, two mysql packages are listed:

mysql-4.1.12-3.RHEL4.1

mysqlclient10-3.23.58-4.RHEL4.1

Delete mysql

Delete command: rpm -e --nodeps package name

(rpm -ev mysql-4.1.12-3.RHEL4.1)

Delete the development header files and libraries of the old mysql version:

rm -fr /usr/lib/mysql

rm -fr /usr/include/mysql

Note: the data in /var/lib/mysql and the file /etc/my.cnf are not deleted by the uninstall. If you are sure they are no longer needed, delete them manually:

rm -f /etc/my.cnf

rm -fr /var/lib/mysql

Install the server side

Download the mysql installation package and execute the installation command:

rpm -ivh MySQL-server-5.6.24-1.el6.x86_64.rpm (installation package name)

The installer reports where the initial password of the root user is stored:

A RANDOM PASSWORD HAS BEEN SET FOR THE MySQL root USER!

You will find that password in '/root/.mysql_secret'.

You can log in with that password and change it to the desired password.

Install the client

Download the mysql client and execute the command to install:

rpm -ivh MySQL-client-5.6.24-1.el6.x86_64.rpm

1.12. Configure the mysql database

Start the mysql database:

service mysql start

Change the initial password of the root user in the MySQL database. First check the initial password:

cat /root/.mysql_secret

Log in to MySQL with the initial password:

mysql -uroot -p

Execute the following command to change the root user's password:

SET PASSWORD = PASSWORD('123456');

Refresh the privilege tables:

FLUSH PRIVILEGES;

Test logging in with the new password.

2. Install CM

2.1. Download the installation package

Download from http://archive-primary.cloudera.com/cm5/cm/5/ and choose the version that matches your system. This installation uses cloudera-manager-el6-cm5.7.1_x86_64.tar.gz. After the download completes, upload it to the master node only, then extract it into the /opt directory. Do not extract it anywhere else, because the CDH5 source is looked up in /opt/cloudera/parcel-repo by default. How to build the local CDH5 source is described later.

2.2. Install CM

Add the cloudera-scm user on all nodes:

useradd --system --home=/opt/cm-5.7.1/run/cloudera-scm-server/ --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

Set server_host in /opt/cm-5.7.1/etc/cloudera-scm-agent/config.ini:

# Hostname of the CM server.

server_host=master

Copy the mysql driver package mysql-connector-java-5.1.26-bin.jar to the /opt/cm-5.7.1/share/cmf/lib/ directory:

cp /root/data/mysql-connector-java-5.1.26-bin.jar /opt/cm-5.7.1/share/cmf/lib/

Set up a database for Cloudera Manager 5:

/opt/cm-5.7.1/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -p123456 --scm-host localhost scm scm scm

The format is: scm_prepare_database.sh, database type, database server, username, password, then --scm-host followed by the machine where Cloudera Manager Server runs. The author does not know what the last three arguments stand for; they were copied directly from the official website.

Start the Cloudera Manager 5 Server:

/opt/cm-5.7.1/etc/init.d/cloudera-scm-server restart

Note: do not shut down or restart the server right after its first startup, because the first startup automatically creates the related tables and data. If it was interrupted for some reason, delete all tables and data before starting it again; otherwise the startup will not succeed.

Start the Cloudera Manager 5 Agents:

First scp /opt/cm-5.7.1 to all datanode nodes, then start the Agent on each machine:

scp -r /opt/cm-5.7.1 root@slave1:/opt/cm-5.7.1

Wait for the copy to finish, then start the agent on all datanode nodes (it must be started with administrator privileges):

/opt/cm-5.7.1/etc/init.d/cloudera-scm-agent restart
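
With several datanodes, the copy and the agent start have to be repeated once per host. The following dry-run sketch only prints the commands for each host so they can be reviewed first. It assumes root login over the passwordless SSH set up in 1.6, and the function name print_agent_rollout is hypothetical.

```shell
# Print the scp and agent-start commands for each host (dry run; nothing is executed).
print_agent_rollout() {
    cm_dir="$1"; shift      # first argument: CM directory, remaining: host names
    for host in "$@"; do
        printf 'scp -r %s root@%s:%s\n' "$cm_dir" "$host" "$cm_dir"
        printf 'ssh root@%s %s/etc/init.d/cloudera-scm-agent restart\n' "$host" "$cm_dir"
    done
}

# Example: review the output, then run it with "| sh" if it looks right.
# print_agent_rollout /opt/cm-5.7.1 slave1 slave2 slave3
```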

Open the Cloudera Manager 5 console in a browser (the default port is 7180). If startup succeeded, you will see the login page.

2.3. Add Service Monitor service

You can add service components directly from the CM management page.

3. Install CDH

3.1. Download the CDH version

Download CDH from http://archive-primary.cloudera.com/cdh6/parcels/5.0.0/ to the local machine. Two things are needed:

1. The parcel package that matches the operating system version

2. The manifest.json file

3.2. Build the local CDH source

After the download completes, put these two files under /opt/cloudera/parcel-repo on the master node (the directory was created when Cloudera Manager 5 was installed). Open the manifest.json file, which contains the configuration in JSON format, and find the hash that corresponds to your system version. This installation uses redhat6.4, so locate the entry for that parcel.

Within that entry's curly braces, the "hash" value is at the bottom.

Copy the value of "hash" and create a file with the same name as your parcel package plus the .sha suffix.

The directory now contains 3 files. Paste the "hash" value into the newly created .sha file, save it, and then restart the CM server.
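
Creating the .sha file by hand is easy to get wrong (a mistyped file name or stray whitespace). A minimal sketch of the step follows; the function name write_parcel_sha is hypothetical, and the hash argument is whatever value you copied from manifest.json.

```shell
# Write the "hash" value copied from manifest.json into <parcel>.sha
# next to the parcel file (hypothetical helper).
write_parcel_sha() {
    parcel="$1"   # path to the .parcel file in /opt/cloudera/parcel-repo
    hash="$2"     # value of "hash" from the matching manifest.json entry
    printf '%s\n' "$hash" > "${parcel}.sha"
}

# Example (placeholders, not real names):
# write_parcel_sha /opt/cloudera/parcel-repo/<parcel file name> <hash value>
```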

CDH installation

Open http://192.168.1.101:7180 and log in to the console; the default account and password are both admin. Choose the free version when installing. Because CM5 has good Chinese-language support, you can simply follow the prompts. If problems with the system configuration come up during installation, install the required components on the system as prompted.

If you choose to install Hive, the installation may fail. The log shows that Hive needs the JDBC driver, so, as before, copy the MySQL driver package to the /opt/cloudera/parcels/CDH-5.7.1-1.cdh6.7.1.p0.6/lib/hive/lib directory; continuing the installation then succeeds.

3.3. Configure HA for HDFS and YARN

Go to the HDFS and YARN service management pages, select Actions, click Enable HA, and follow the steps.

4. Common problem handling

1. Hive, Impala and other components need MySQL to store their metadata. Create the databases and grant privileges as follows:

-- create the databases

CREATE DATABASE cm DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

CREATE DATABASE hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

CREATE DATABASE amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

CREATE DATABASE smon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

CREATE DATABASE hmon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

CREATE DATABASE hiverep DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

-- database authorization

GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY '123456';

2. Fix for garbled characters in Hive or Impala table comments:

ALTER TABLE COLUMNS_V2 MODIFY COLUMN COMMENT varchar(256) CHARACTER SET utf8;

ALTER TABLE TABLE_PARAMS MODIFY COLUMN PARAM_VALUE varchar(4000) CHARACTER SET utf8;
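
The database creation statements in item 1 differ only in the database name, so they can be generated from a list of names. This is a sketch rather than part of the original procedure; the function name print_create_db_sql is hypothetical, and the charset and collation are the ones used in item 1.

```shell
# Print one CREATE DATABASE statement per name given (hypothetical helper).
print_create_db_sql() {
    for db in "$@"; do
        printf 'CREATE DATABASE %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db"
    done
}

# Example: pipe the statements into the mysql client.
# print_create_db_sql cm hive amon smon hmon hiverep | mysql -uroot -p
```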

5. List of CDH service ports

Module | Port number | Port usage | Accessed from outside
CM | 7180 | CM management port | Yes
Cloudera Management Service | 8087 | Activity Monitor Web UI port | No
Cloudera Management Service | 9999 | Activity Monitor listening port | No
Cloudera Management Service | 9998 | Activity Monitor nozzle port | No
Cloudera Management Service | 10101 | Alerts: listening port | No
Cloudera Management Service | 7184 | Event publish port | No
Cloudera Management Service | 7185 | Event query port | No
Cloudera Management Service | 8084 | Event Server Web UI port | No
Cloudera Management Service | 8091 | Host Monitor Web UI port | No
Cloudera Management Service | 9995 | Host Monitor listening port | No
Cloudera Management Service | 9994 | Host Monitor nozzle port | No
Cloudera Management Service | 7186 | Navigator Audit Server port | No
Cloudera Management Service | 8089 | Navigator Audit Server Web UI port | No
Cloudera Management Service | 7187 | Navigator Metadata Server port | No
Cloudera Management Service | 5678 | Reports Manager server port | No
Cloudera Management Service | 8083 | Reports Manager Web UI port | No
Cloudera Management Service | 8086 | Service Monitor Web UI port | No
Cloudera Management Service | 9997 | Service Monitor listening port | No
Cloudera Management Service | 9996 | Service Monitor nozzle port | No
Zookeeper | 2181 | Client port | Yes
Zookeeper | 3181 | Quorum port | No
Zookeeper | 4181 | Election port | No
Zookeeper | 9010 | JMX remote port | Yes
Yarn | 10020 | MapReduce JobHistory Server port | No
Yarn | 19888 | MapReduce JobHistory Web Application HTTP port | Yes
Yarn | 19890 | MapReduce JobHistory Web Application HTTPS port (TLS/SSL) | No
Yarn | 10033 | MapReduce JobHistory Server management interface port | No
Yarn | 8042 | NodeManager Web Application HTTP port | No
Yarn | 8044 | NodeManager Web Application HTTPS port (TLS/SSL) | No
Yarn | 8041 | NodeManager IPC address | No
Yarn | 8040 | Localizer port | No
Yarn | 8032 | ResourceManager address | No
Yarn | 8030 | Scheduler address | No
Yarn | 8031 | Resource tracker address | No
Yarn | 8033 | Management address | No
Yarn | 8088 | ResourceManager Web Application HTTP port | Yes
Yarn | 8090 | ResourceManager Web Application HTTPS port (TLS/SSL) | No
Kafka | 9092 | TCP port | Yes
Kafka | 9393 | JMX port | Yes
Kafka | 9394 | | Yes
Kafka | 9093 | TLS/SSL port | No
Kafka | 24042 | HTTP metric report port | No
Hive | 9083 | Hive Metastore server port | No
Hive | 10000 | HiveServer2 port | No
Hive | 10002 | HiveServer2 WebUI port | Yes
Hive | 50111 | WebHCat Server port | No
HDFS | 50020 | DataNode protocol port | No
HDFS | 50010 | DataNode transceiver port | No
HDFS | 50075 | DataNode HTTP Web UI port | Yes
HDFS | 50475 | Secure DataNode Web UI port (TLS/SSL) | No
HDFS | 14000 | REST port | No
HDFS | 14001 | Management port | No
HDFS | 8485 | JournalNode RPC port | No
HDFS | 8480 | JournalNode HTTP port | No
HDFS | 8481 | Secure JournalNode Web UI port (TLS/SSL) | Yes
HDFS | 2049 | NFS Gateway server port | No
HDFS | 4242 | NFS Gateway MountD port | No
HDFS | 111 | Port mapping (or Rpcbind) port | No
HDFS | 8020 | NameNode port | No
HDFS | 8022 | NameNode Service RPC port | No
HDFS | 50070 | NameNode Web UI port | Yes
HDFS | 50470 | Secure NameNode Web UI port (TLS/SSL) | Yes
HDFS | 50090 | SecondaryNameNode Web UI port | Yes
HDFS | 50495 | Secure SecondaryNameNode Web UI port (TLS/SSL) | Yes
Hbase | 20550 | HBase REST server port | No
Hbase | 8085 | HBase REST Server Web UI port | No
Hbase | 9090 | HBase Thrift server port | No
Hbase | 9095 | HBase Thrift server Web UI port | No
Hbase | 60000 | HBase Master port | No
Hbase | 60010 | HBase Master Web UI port | Yes
Hbase | 60020 | HBase Region Server port | Yes
Hbase | 60030 | HBase Region Server Web UI port | Yes
Spark | 7337 | Spark Shuffle Service port | No
Spark | 18088 | History Server WebUI port | Yes
Oozie | 11000 | Oozie HTTP port | No
Oozie | 11001 | Oozie management port | Yes
Oozie | 25 | Oozie email operation SMTP port | No
Solr | 8983 | Solr HTTP port | Yes
Solr | 8984 | Solr management port | Yes

6. CDH enables firewall configuration

Overall plan:

Inside the cluster a whitelist is used, so the machines in the cluster can reach each other freely. For machines connected to the external network, port filtering is used: the host rejects all external access by default and opens only specific ports. The steps are as follows:

Modify the firewall configuration file /etc/sysconfig/iptables

Add the following firewall policy:

# Firewall configuration written by system-config-firewall

# Manual customization of this file is not recommended.

*filter

:INPUT DROP [0:0]

:FORWARD ACCEPT [0:0]

:OUTPUT ACCEPT [0:0]

# whitelist configuration

-N whitelist

-A whitelist -s 172.16.0.250 -j ACCEPT

-A whitelist -s 172.16.0.78 -j ACCEPT

-A whitelist -s 172.16.0.5 -j ACCEPT

-A whitelist -s 172.16.0.168 -j ACCEPT

-A whitelist -s 172.16.0.113 -j ACCEPT

-A whitelist -s 172.16.0.133 -j ACCEPT

-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

-A INPUT -p icmp -j ACCEPT

-A INPUT -i lo -j ACCEPT

# Open ssh port

-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT

# Open application service port

# cm

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7180 -j ACCEPT

# zookeeper

-A INPUT -m state --state NEW -m tcp -p tcp --dport 2181 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 9010 -j ACCEPT

# yarn

-A INPUT -m state --state NEW -m tcp -p tcp --dport 19888 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 8088 -j ACCEPT

# Kafka

-A INPUT -m state --state NEW -m tcp -p tcp --dport 9092 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 9393 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 9394 -j ACCEPT

# hive

-A INPUT -m state --state NEW -m tcp -p tcp --dport 10002 -j ACCEPT

# hdfs

-A INPUT -m state --state NEW -m tcp -p tcp --dport 50075 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 8481 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 50070 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 50470 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 50090 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 50495 -j ACCEPT

# hbase

-A INPUT -m state --state NEW -m tcp -p tcp --dport 60010 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 60020 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 60030 -j ACCEPT

# spark

-A INPUT -m state --state NEW -m tcp -p tcp --dport 18088 -j ACCEPT

# oozie

-A INPUT -m state --state NEW -m tcp -p tcp --dport 11001 -j ACCEPT

# Solr

-A INPUT -m state --state NEW -m tcp -p tcp --dport 8983 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 8984 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp -j whitelist

-A INPUT -j REJECT --reject-with icmp-host-prohibited

-A FORWARD -j REJECT --reject-with icmp-host-prohibited

COMMIT
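
The per-port ACCEPT lines in the policy above all follow one pattern, so they can be generated from the list of externally accessed ports instead of being typed by hand. A minimal sketch, with the hypothetical function name print_accept_rules; it only prints rule text in the same form as the file above and does not touch the running firewall.

```shell
# Print one "-A INPUT ... --dport <port> -j ACCEPT" line per port given.
print_accept_rules() {
    for port in "$@"; do
        printf '%s\n' "-A INPUT -m state --state NEW -m tcp -p tcp --dport ${port} -j ACCEPT"
    done
}

# Example: the CM and zookeeper ports from the policy above.
# print_accept_rules 7180 2181 9010
```

The printed lines can be pasted into /etc/sysconfig/iptables before the final whitelist and REJECT rules.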
