2025-03-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Catalogue
Host list
Basic environment
Basic configuration of cluster host
Configure the NTP service
Configure the MySQL server
Install Cloudera Manager Server and AgentServer
Configure the Server side
Configure the Agent side
Install CDH
Configure and assign CDH5 parcel packages
Install Hadoop cluster and related components
Browse related layouts on the CDH Web side
Install Kafka components
Configure and assign Kafka parcel packages
Install the Kafka service in the cluster
Configure HDFS LZO compression
Configure and assign LZO parcel packages
HDFS related LZO configuration
YARN related LZO configuration
Host list
| hostname | IP | Memory | CPU | Roles and services |
|---|---|---|---|---|
| test1.lan | 192.168.22.11 | 9G | 4 cores | cm-agent, NameNode, YARN |
| test2.lan | 192.168.22.12 | 9G | 4 cores | cm-agent, SecondaryNameNode, HBase-Master |
| test3.lan | 192.168.22.13 | 9G | 4 cores | cm-agent, DataNode, zk-server, kafka-broker, RegionServer |
| test4.lan | 192.168.22.14 | 9G | 4 cores | cm-agent, DataNode, zk-server, kafka-broker, RegionServer |
| test5.lan | 192.168.22.15 | 9G | 4 cores | cm-agent, DataNode, zk-server, kafka-broker, RegionServer |
| test6.lan | 192.168.22.16 | 9G | 4 cores | cm-server, MySQL-Server |
Basic environment
CentOS 6 x86_64
jdk-8u101-linux-x64.rpm
MySQL-5.6.x
NTPd => On
CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel (offline parcel)
cloudera-manager-el6-cm5.12.0_x86_64.tar.gz
KAFKA-2.2.0-1.2.2.0.p0.68-el6.parcel
Basic configuration of cluster host
Make sure the / filesystem has at least 100 GB of space.
Disable SELinux
Disable iptables
Disable Transparent Hugepage Compaction
Set vm.swappiness to 1
Enable the NTP service for time synchronization (relying on ntpdate alone is not recommended)
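The checklist above could be applied with a script along the following lines. This is only a minimal sketch, assuming root access on a stock CentOS 6 host. The helper names `set_kv` and `apply_baseline` are made up for illustration, and the Transparent Hugepage sysfs paths are the usual locations, not something stated in this article.

```shell
# Sketch only: run as root on each host. Nothing executes until
# apply_baseline is called explicitly.

# Set "key = value" in a sysctl-style file, replacing an existing entry
# or appending a new one (hypothetical helper).
set_kv() {
  local file="$1" key="$2" value="$3" pat
  # Escape dots so the key is matched literally in the regex
  pat=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -qE "^[[:space:]]*${pat}[[:space:]]*=" "$file" 2>/dev/null; then
    sed -i -E "s|^[[:space:]]*${pat}[[:space:]]*=.*|${key} = ${value}|" "$file"
  else
    echo "${key} = ${value}" >> "$file"
  fi
}

apply_baseline() {
  # SELinux: permissive immediately, disabled after the next reboot
  setenforce 0 || true
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

  # iptables: stop now and keep it off at boot
  service iptables stop
  chkconfig iptables off

  # Transparent Hugepage compaction; CentOS 6 normally uses the
  # redhat_ prefixed path (assumption), so try both locations
  for d in /sys/kernel/mm/redhat_transparent_hugepage \
           /sys/kernel/mm/transparent_hugepage; do
    if [ -w "$d/defrag" ]; then echo never > "$d/defrag"; fi
  done

  # Swappiness: apply immediately and persist across reboots
  sysctl -w vm.swappiness=1
  set_kv /etc/sysctl.conf vm.swappiness 1
}
```

Calling `apply_baseline` on a host applies all four settings. The hugepage setting written through sysfs does not survive a reboot, so it would also need to go into a boot script such as /etc/rc.local.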
Configure the NTP service
The following configuration must be done once on each host in the cluster.
```
vim /etc/sysconfig/ntpdate
SYNC_HWCLOCK=yes                          # set in /etc/sysconfig/ntpdate: also save the synchronized time to the hardware clock
ntpdate time.windows.com                  # synchronize manually once first, so ntpd does not fail to sync because of an excessive time offset
vim /etc/ntp.conf
server time.windows.com prefer            # set in /etc/ntp.conf: add the time synchronization server
service ntpd start && chkconfig ntpd on   # start the time synchronization service and enable it at boot
```
Configure the MySQL server for cm-server
The MySQL service can be installed on the cm-server host or shared with other services.
Install MySQL-Server:
> rpm -qa | grep -i -E "mysql-libs|mariadb-libs"
> yum remove -y mysql-libs mariadb-libs && yum install -y -q crontabs postfix
> tar xf MySQL-5.6.35-1.el6.x86_64.rpm-bundle.tar
> rpm -ivh MySQL-client-5.6.35-1.el6.x86_64.rpm \
>     MySQL-shared-* \
>     MySQL-server-5.6.35-1.el6.x86_64.rpm \
>     MySQL-devel-5.6.35-1.el6.x86_64.rpm
Download mysql-connector-java (to be installed on the cm-server host):
```
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.44.tar.gz
tar xf mysql-connector-java-5.1.44.tar.gz
```
Set the server character set in /etc/my.cnf, then change the initial root password (it is stored in the ~/.mysql-secret file):
> vim /etc/my.cnf
> [mysqld]
> character-set-server = utf8
> mysql -p`default_secret`
> sql_cli> SET PASSWORD = PASSWORD("new_secret");
> sql_cli> exit
Install Cloudera Manager Server and AgentServer
Cloudera Manager Server is installed on test6.lan; the Agent must be installed separately on each host in the cluster. Download address: http://archive-primary.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.12.0_x86_64.tar.gz
Configure the Server side
After downloading cloudera-manager, upload it to test6.lan and unpack it into the /opt directory (and only there), because by default the CDH5 parcels are looked up in /opt/cloudera/parcel-repo.
> tar xf cloudera-manager-el6-cm5.12.0_x86_64.tar.gz -C /opt/
Add the cloudera-scm user on all nodes in the cluster:
```
useradd --system --home=/opt/cm-5.12.0/run/cloudera-scm-server/ --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
```
Configure mysql-connector-java on the cm-server node:
> cp /path/to/mysql-connector-java-5.1.44-bin.jar /opt/cm-5.12.0/share/cmf/lib/
Then create the initial database for Cloudera Manager 5 (-psecret is the password of the corresponding database account):
```
/opt/cm-5.12.0/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -psecret --scm-host localhost scm scm scm
```
If you see "Successfully connected to database. All done, your SCM database is configured correctly!", the database and its table structure have been created successfully.
Start the Cloudera Manager 5 Server:
> /opt/cm-5.12.0/etc/init.d/cloudera-scm-server start
Note: the first time the Server runs, it takes about 5-10 minutes to initialize its data (the server process uses about 1.5 GB of memory). After initialization, the Java process listens on ports 7180 and 7182.
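Because the first start can take several minutes, it can help to poll the web port instead of guessing when the Server is ready. The following is a minimal sketch, assuming bash (it relies on bash's `/dev/tcp` redirection). The function name `wait_for_port` and its parameters are hypothetical, not part of Cloudera Manager.

```shell
# Poll a TCP port until it accepts connections or the timeout expires.
# Arguments: host, port, timeout in seconds (default 600),
#            polling interval in seconds (default 5).
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-600}" interval="${4:-5}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
    # the subshell closes the descriptor again when it exits
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}
```

For example, `wait_for_port test6.lan 7180 900 && echo "CM web UI is up"` waits up to 15 minutes for the web console port mentioned above.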
Configure the Agent side
On the Server side, set server_host in the Agent configuration file to the address of the Server host:
> vi /opt/cm-5.12.0/etc/cloudera-scm-agent/config.ini
> server_host=test6.lan
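If the change needs to be scripted rather than made in an editor, the same edit can be done with sed. This is a small sketch, assuming config.ini already contains a `server_host=` line; the function name `set_server_host` is made up for illustration.

```shell
# Rewrite the server_host line of an Agent config.ini in place and
# confirm that the new value is present.
set_server_host() {
  local conf="$1" host="$2"
  sed -i "s/^server_host=.*/server_host=${host}/" "$conf"
  # Succeeds only if the exact line now exists in the file
  grep -qxF "server_host=${host}" "$conf"
}
```

Usage on the Server node would be `set_server_host /opt/cm-5.12.0/etc/cloudera-scm-agent/config.ini test6.lan`. A non-zero exit status means the file had no `server_host=` line to replace.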
Copy the Agent program from the Server side to the /opt/ directory of every node in the cluster:
> for i in {1..5}; do
>     echo "- Start scp to test${i}.lan -"
>     scp -r -q /opt/cm-5.12.0/ test${i}.lan:/opt/
>     echo "# Done #"
> done
Once the copy has finished, start the Agent program on every Agent node:
> /opt/cm-5.12.0/etc/init.d/cloudera-scm-agent start
The Agent is a Python process that actively registers itself with the server_host node named in its configuration file. It also receives instructions sent by the Server and reports heartbeat information for monitoring.
Install CDH
Configure and assign CDH5 parcel packages
Go back to the test6.lan shell and set up the CDH5 parcel package (Cloudera uses these precompiled bundles to support offline Hadoop installation). The download address of the CDH parcel package is: http://archive-primary.cloudera.com/cdh5/parcels/5.12.0/
```
cd /opt/cloudera/parcel-repo
curl -O http://fileserver.lan/CDH5/CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel
curl -O http://fileserver.lan/CDH5/CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha1
mv CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha1 \
   CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha
```
The sha1 file of the parcel must be renamed to CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha; otherwise cm-server will not recognize the parcel package.
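Before restarting the Server, it can be worth checking that the downloaded parcel actually matches its .sha file, since a truncated download is otherwise only noticed later in the web UI. The sketch below assumes the .sha file holds the SHA-1 hex digest as its first field and that `sha1sum` is installed; `verify_parcel` is a hypothetical helper name.

```shell
# Compare the SHA-1 of a parcel with the digest stored in "<parcel>.sha".
# Returns 0 on match, 1 on mismatch, 2 if the .sha file cannot be read.
verify_parcel() {
  local parcel="$1" expected actual
  expected=$(awk '{print $1; exit}' "${parcel}.sha" 2>/dev/null) || return 2
  actual=$(sha1sum "$parcel" | awk '{print $1}')
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Run it from /opt/cloudera/parcel-repo as `verify_parcel <parcel file name> && echo OK`, passing the name of the parcel that was just downloaded.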
Restart the cloudera-scm-server service
> /opt/cm-5.12.0/etc/init.d/cloudera-scm-server restart
Open http://test6.lan:7180/ and start installing CDH. The default login user name and password are admin / admin.
Agree to the relevant terms
Select the relevant service version
Service packages and information related to this version
Add the cluster hosts. "Currently managed hosts (5)" means the Agents have registered with the Server normally. If the only option shown here is to add new hosts, Agent registration has failed; check that the network and the services are working. You can also connect to remote nodes by specifying their hostname or IP.
Select host
Select the parcel package of the relevant supporting components for the cluster installation.
Start parcel package deployment for nodes in the cluster
The deployment warning at the end of this screenshot says the cloudera-scm user has not been created. The user had indeed been forgotten on that node; create it and run the verification again.
Overview of deployment Information
Install Hadoop cluster and related components
CDH offers several prepackaged service combinations, and you can also pick the components yourself.
Here a custom set is selected: HBase, HDFS, YARN and ZooKeeper (Kafka ships in a separate parcel package and is installed later).
Configuration parameters of related components
Deployment in progress
The installation path of the relevant service components in the server file system is as follows
Installation completed
Browse related layouts on the CDH Web side
Change the NameNode's default heap size (1-4 GB is recommended). The service must be restarted after the configuration change; shortly after the restart, once the service's child processes have started, the alarm disappears. (If you change the NameNode heap size, change the SecondaryNameNode heap size as well.)
Install Kafka components
Configure and assign Kafka parcel packages
In the web UI, Hosts -> Parcel lists the parcel packages that are configured and assigned for the current cluster. At this point only CDH5 is configured. Kafka ships in a separate parcel package, so it has to be loaded separately and assigned to each node in the cluster.
The parcel package for Cloudera's official Kafka component is downloaded the same way as before: fetch the parcel file and its sha1 file, then rename *.sha1 to *.sha.
Put the two downloaded files in the /opt/cloudera/parcel-repo/ directory of the cm-server node. The server daemon does not need to be restarted; the parcel can be refreshed, distributed and activated online from the web page.
Install the Kafka service in the cluster
Confirm the following default settings here and modify them where needed:
Replication factor: the default is 1, changed to 3 (depending on business volume).
Number of partitions: the default is 50, kept for now.
Topic deletion: enabled by default, left unchanged.
The service port is 9092.
Configure HDFS LZO compression
Configure and assign LZO parcel packages
LZO support is also packaged as a separate parcel; choose the package that matches your platform. The download address is: http://archive-primary.cloudera.com/gplextras/parcels/latest/. No sha file is provided there directly, so look in the manifest.json file, find the hash value of the corresponding parcel package, and save it to a local file by hand.
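Copying the hash by hand can be replaced by a small lookup. The sketch below assumes manifest.json is pretty-printed with one key per line and that each parcel object lists "parcelName" before "hash". That layout is an assumption about the file, not something this article confirms, and `manifest_hash` is a hypothetical helper name.

```shell
# Print the "hash" value recorded in manifest.json for one parcel name.
# Prints nothing if the parcel is not listed.
manifest_hash() {
  local manifest="$1" parcel="$2"
  awk -v want="$parcel" '
    # Remember whether the most recent parcelName line is the one we want
    /"parcelName"[ \t]*:/ { found = (index($0, "\"" want "\"") > 0) }
    # The first hash line after a matching parcelName belongs to that parcel
    found && /"hash"[ \t]*:/ {
      line = $0
      sub(/.*"hash"[ \t]*:[ \t]*"/, "", line)
      sub(/".*/, "", line)
      print line
      exit
    }
  ' "$manifest"
}
```

With the parcel file name stored in a variable, `manifest_hash manifest.json "$PARCEL" > "$PARCEL.sha"` writes the sha file next to the parcel. If the manifest is minified into a single line, a real JSON parser would be needed instead.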
Download the parcel package, create its sha file, and store both in the /opt/cloudera/parcel-repo/ directory of cm-server. As with the Kafka parcel, the refresh, registration, distribution and activation steps can all be completed from the web page.
After LZO is activated, several dependent services prompt for a restart to load the new configuration. Do not restart yet; a few settings still have to be changed by hand.
HDFS related LZO configuration
Add a new line to io.compression.codecs, enter com.hadoop.compression.lzo.LzopCodec, and save the configuration.
YARN related LZO configuration
Add a new line to the value of mapreduce.application.classpath and enter /opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
Append /opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native to the value of mapreduce.admin.user.env
Save the changes and restart the dependent services.
Final overview of the related services