2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Our Cloudera Manager and CDH installation is at version 5.14, and the company now needs to upgrade to CDH 6.2.
Cloudera Manager must be upgraded first, and then CDH.
1.Cloudera Manager upgrade
(refer to
https://www.cloudera.com/documentation/enterprise/upgrade/topics/ug_cm_upgrade.html)
Before upgrading, make sure that the Linux version has been upgraded to a version supported by Cloudera Manager 6.2.
1.1 Backup
1.1.1 Back up Cloudera Manager Agent
# View database information
$ sudo cat /etc/cloudera-scm-server/db.properties
Get information similar to the following:
com.cloudera.cmf.db.type=...
com.cloudera.cmf.db.host=database_hostname:database_port
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=SOME_PASSWORD
Perform the following backup operations on each machine on which the Cloudera Manager Agent is installed:
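These values are needed again for the database backups below, so it can be handy to pull them into shell variables. A sketch using a sample file; for real use, point DB_PROPS at /etc/cloudera-scm-server/db.properties (the host/port split assumes the host:port form shown above):

```shell
# Sketch: parse Cloudera Manager's db.properties into shell variables.
# A sample file is created here; for real use, point DB_PROPS at
# /etc/cloudera-scm-server/db.properties instead.
DB_PROPS=$(mktemp)
cat > "$DB_PROPS" <<'EOF'
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=db01.example.com:3306
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=SOME_PASSWORD
EOF

get_prop() {
    # Print the value for a given key, taking everything after the first '='.
    grep "^$1=" "$DB_PROPS" | head -1 | cut -d= -f2-
}

DB_TYPE=$(get_prop com.cloudera.cmf.db.type)
DB_HOSTPORT=$(get_prop com.cloudera.cmf.db.host)
DB_HOST=${DB_HOSTPORT%%:*}   # assumes the host:port form shown above
DB_PORT=${DB_HOSTPORT##*:}
DB_NAME=$(get_prop com.cloudera.cmf.db.name)
DB_USER=$(get_prop com.cloudera.cmf.db.user)
echo "type=$DB_TYPE host=$DB_HOST port=$DB_PORT name=$DB_NAME user=$DB_USER"
```

The extracted variables can then be reused in the mysqldump backup commands later in this guide instead of retyping the connection details.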
Create a top level backup directory.
$ export CM_BACKUP_DIR="`date +%F`-CM5.14"
$ echo $CM_BACKUP_DIR
$ mkdir -p $CM_BACKUP_DIR
Back up the Agent directory and the runtime state.
$ sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-agent.tar --exclude=*.sock /etc/cloudera-scm-agent /etc/default/cloudera-scm-agent /var/run/cloudera-scm-agent /var/lib/cloudera-scm-agent
Back up the existing repository directory.
$ sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d
1.1.2 Back up Cloudera Manager Service
Execute on the machine where Service Monitor is installed:
$ sudo cp -rp /var/lib/cloudera-service-monitor /var/lib/cloudera-service-monitor-`date +%F`-CM5.14
Execute on the machine where Host Monitor is installed:
$ sudo cp -rp /var/lib/cloudera-host-monitor /var/lib/cloudera-host-monitor-`date +%F`-CM5.14
Execute on the machine where Event Server is installed:
$ sudo cp -rp /var/lib/cloudera-scm-eventserver /var/lib/cloudera-scm-eventserver-`date +%F`-CM5.14
1.1.3 Back up the Cloudera Manager databases
$ mysqldump --databases database_name --host=database_hostname --port=database_port -u user_name -p > $HOME/database_name-backup-`date +%F`-CM5.14.sql
1.1.4 Back up Cloudera Manager Server
Create a top-level backup directory.
$ export CM_BACKUP_DIR="`date +%F`-CM5.14"
$ echo $CM_BACKUP_DIR
$ mkdir -p $CM_BACKUP_DIR
Back up the Cloudera Manager Server directories:
$ sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-server.tar /etc/cloudera-scm-server /etc/default/cloudera-scm-server
Back up the existing repository directory.
$ sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d
1.2 Upgrade Cloudera Manager Server
1.2.1 Set up software (replace the yum repository)
Log in to the Cloudera Manager Server node and delete the original yum source
$ sudo rm /etc/yum.repos.d/cloudera*manager.repo*
Create a new yum source file
$ sudo vim /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
# Packages for Cloudera Manager
name=Cloudera Manager
baseurl=https://archive.cloudera.com/cm6/6.2.0/redhat6/yum/
gpgkey=https://archive.cloudera.com/cm6/6.2.0/redhat6/yum/RPM-GPG-KEY-cloudera
gpgcheck=1
1.2.2 Install or configure Java 8
Configure JAVA_HOME in the server configuration file /etc/default/cloudera-scm-server by adding:
export JAVA_HOME="/usr/java/jdk1.8.0_162"
1.2.3 upgrade Cloudera Manager Server
1. Log in to the Cloudera Manager Server host.
2. Stop the Cloudera Management Service. (Important: not stopping the Cloudera Management Service at this point may cause the management roles to crash, or Cloudera Manager Server may fail to restart.)
Steps:
a. Log in to the Cloudera Manager Admin Console.
b. Select Clusters > Cloudera Management Service.
c. Select Actions > Stop.
3. Stop Cloudera Manager Server.
$ sudo service cloudera-scm-server stop
4. Stop the Cloudera Manager Agent.
$ sudo service cloudera-scm-agent stop
5. Upgrade the Cloudera packages.
$ sudo yum clean all
$ sudo yum upgrade cloudera-manager-server cloudera-manager-daemons cloudera-manager-agent -y
6. Verify that the packages were installed.
$ rpm -qa 'cloudera-manager-*'
7. Start the Cloudera Manager Agent.
$ sudo service cloudera-scm-agent start
8. Start Cloudera Manager Server.
$ sudo service cloudera-scm-server start
If you run into problems during startup, check the log files:
$ tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
$ tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log
$ tail -f /var/log/messages
9. Normally, you can follow the upgrade progress by opening the CDH upgrade page:
http://cloudera_manager_server_hostname:7180/cmf/upgrade
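Instead of refreshing that page by hand, you can poll until the server answers. A minimal sketch (assumes curl is available; the hostname is the same placeholder used above, and the /cmf/login path is an assumption for a URL that responds once the server is up):

```shell
wait_for_cm() {
    # $1 = URL to probe, $2 = max attempts (default 30), $3 = seconds between tries (default 10)
    local url=$1 tries=${2:-30} pause=${3:-10} i
    for i in $(seq 1 "$tries"); do
        if curl -sf -o /dev/null "$url"; then
            echo "up after $i attempt(s)"
            return 0
        fi
        sleep "$pause"
    done
    echo "gave up after $tries attempts" >&2
    return 1
}

# wait_for_cm http://cloudera_manager_server_hostname:7180/cmf/login 30 10
```

The function returns non-zero when the server never comes up, so it can gate the next step of a scripted upgrade.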
1.2.4 upgrade Cloudera Manager Agent
Option 1: upgrade using the CDH interface
Click on the Cloudera Manager Agent package, then:
1. Select the agent repository. The public repository is sufficient here: choose Public Cloudera Repository.
2. Install the JDK. If it is already installed, you can skip this step.
3. Install the agents. Configure your root or sudo account; it needs access to all agent nodes.
Option 2: upgrade using the command
Clear old repo files
$ sudo rm /etc/yum.repos.d/cloudera*manager.repo*
Create a new repo file:
$ sudo vim /etc/yum.repos.d/cloudera-manager.repo
Contents of repo file:
[cloudera-manager]
# Packages for Cloudera Manager
name=Cloudera Manager
baseurl=https://archive.cloudera.com/cm6/6.2.0/redhat6/yum/
gpgkey=https://archive.cloudera.com/cm6/6.2.0/redhat6/yum/RPM-GPG-KEY-cloudera
gpgcheck=1
Stop the Cloudera Manager agent service
$ sudo service cloudera-scm-agent stop
Upgrade Cloudera Manager agent
$ sudo yum clean all
$ sudo yum repolist
$ sudo yum upgrade cloudera-manager-daemons cloudera-manager-agent -y
After all the machines are done, run the following on each agent node:
$sudo service cloudera-scm-agent start
Open http://192.168.0.254:7180/cmf/upgrade and verify that the agents on all machines have been upgraded and all have a heartbeat.
Click Host Inspector to check the status of the node
When it finishes, click to show the inspector results, review the flagged items, and fix them.
One notable finding: Hue on CDH 6 requires Python 2.7 on its nodes. Make a note of it for now; it is handled later (section 2.3).
Then start the Cloudera Management Service.
At this point, the Cloudera Manager upgrade is complete; next comes the CDH upgrade.
If the upgrade fails and needs to be rolled back, refer to the official steps:
https://www.cloudera.com/documentation/enterprise/upgrade/topics/ug_cm_downgrade.html
2.CDH upgrade
Before upgrading, make sure that the Linux version has been upgraded to a version supported by CDH 6.2, and that the Java version is 1.8.
2.1. Preparatory work
Log in to the CDH management page and start the hdfs service
Then run the following command to check the cluster
If there is a problem, fix it
Check hdfs:
$ sudo -u hdfs hdfs fsck / -includeSnapshots
$ sudo -u hdfs hdfs dfsadmin -report
Check the consistency in the hbase table:
$ sudo -u hdfs hbase hbck
If kudu is used, check kudu:
$ sudo -u kudu kudu cluster ksck
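The checks above can be wrapped into one script that stops at the first failure. A sketch; the prefix argument is our own dry-run convenience (pass nothing to actually execute, which requires the cluster to be up, and drop the kudu line if kudu is not deployed):

```shell
run_preupgrade_checks() {
    # $1 = optional command prefix; pass "echo" for a dry run that only prints the commands.
    local run=$1
    $run sudo -u hdfs hdfs fsck / -includeSnapshots &&
    $run sudo -u hdfs hdfs dfsadmin -report &&
    $run sudo -u hdfs hbase hbck &&
    $run sudo -u kudu kudu cluster ksck
}

run_preupgrade_checks echo   # dry run: prints the four commands without running them
```

Chaining with && means a failing check (non-zero exit) aborts the rest, so problems surface one at a time.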
The following services are no longer available in 6.0.0 and need to be stopped and deleted before upgrading
Accumulo
Sqoop 2
MapReduce 1
Spark 1.6
Record Service
2.2 backup cdh
The following CDH components do not require backup:
MapReduce
YARN
Spark
Pig
Impala
Complete the following backup steps before upgrading CDH
1.Back Up Databases
We use MySQL, so MySQL is used as the example.
1) If you have not already done so, stop the services. If Cloudera Manager indicates that there are dependent services, stop those as well.
2) Back up the databases of the various services (Sqoop, Oozie, Hue, Hive Metastore, Sentry). Replace the database name, hostname, port, user name, and backup directory path, and then run the following command:
$ mysqldump --databases database_name --host=database_hostname --port=database_port -u database_username -p > backup_directory_path/database_name-backup-`date +%F`-CDH5.14.sql
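Since the same mysqldump line repeats for each service database, a small loop keeps the backup file naming consistent. A sketch that only prints the commands for review; the database names, host, port, and user below are hypothetical placeholders:

```shell
# Placeholders; substitute your real connection details.
DB_HOST=database_hostname
DB_PORT=3306
DB_USER=database_username
BACKUP_DIR=$HOME

dump_cmds() {
    # Print one mysqldump backup command per database name argument.
    local suffix db
    suffix=$(date +%F)-CDH5.14
    for db in "$@"; do
        echo "mysqldump --databases $db --host=$DB_HOST --port=$DB_PORT -u $DB_USER -p > $BACKUP_DIR/$db-backup-$suffix.sql"
    done
}

dump_cmds sqoop oozie hue metastore sentry   # example names; review, then run each line
```

Printing the commands first lets you double-check names and paths before anything touches the databases.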
2.Back Up ZooKeeper
On each zookeeper node, back up the data storage directory of the zookeeper configured in cdh, as shown in
$ sudo cp -rp /var/lib/zookeeper/ /var/lib/zookeeper-backup-`date +%F`-CM-CDH5.14
3.Back Up HDFS
(the data path in the command is changed according to the actual configuration in cdh)
a. Back up the journal data; execute on each JournalNode:
$ sudo cp -rp /data/dfs/jn /data/dfs/jn-CM-CDH5.14
b. Back up the runtime directory of each NameNode:
$ mkdir -p /etc/hadoop/conf.rollback.namenode
$ cd /var/run/cloudera-scm-agent/process/ && cd `ls -t1 | grep -e "-NAMENODE\$" | head -1`
$ cp -rp * /etc/hadoop/conf.rollback.namenode/
$ rm -rf /etc/hadoop/conf.rollback.namenode/log4j.properties
$ cp -rp /etc/hadoop/conf.cloudera.hdfs/log4j.properties /etc/hadoop/conf.rollback.namenode/
These commands create temporary rollback directories. If you need to roll back to CDH 5.x later, the rollback process requires you to modify the files in this directory.
c. Back up the runtime directory of each DataNode:
$ mkdir -p /etc/hadoop/conf.rollback.datanode/
$ cd /var/run/cloudera-scm-agent/process/ && cd `ls -t1 | grep -e "-DATANODE\$" | head -1`
$ cp -rp * /etc/hadoop/conf.rollback.datanode/
$ rm -rf /etc/hadoop/conf.rollback.datanode/log4j.properties
$ cp -rp /etc/hadoop/conf.cloudera.hdfs/log4j.properties /etc/hadoop/conf.rollback.datanode/
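Steps b and c differ only in the role name and the target directory, so they can share one helper. A sketch; PROC_DIR is our own override hook for testing and defaults to the agent's process directory, and the log4j.properties source path follows the DataNode step above:

```shell
backup_role_conf() {
    # $1 = role suffix (NAMENODE or DATANODE), $2 = rollback target directory.
    local role=$1 target=$2 latest
    local proc=${PROC_DIR:-/var/run/cloudera-scm-agent/process}
    # Newest process directory for this role, mirroring: ls -t1 | grep -e "-ROLE$" | head -1
    latest=$(ls -t1 "$proc" | grep -e "-$role\$" | head -1)
    [ -n "$latest" ] || { echo "no $role process directory found" >&2; return 1; }
    mkdir -p "$target"
    cp -rp "$proc/$latest/." "$target/"
    rm -f "$target/log4j.properties"
    # Same log4j source as the steps above; skipped quietly if the path differs on your cluster.
    cp -rp /etc/hadoop/conf.cloudera.hdfs/log4j.properties "$target/" 2>/dev/null || true
}

# Run as root on each NameNode / DataNode host:
# backup_role_conf NAMENODE /etc/hadoop/conf.rollback.namenode
# backup_role_conf DATANODE /etc/hadoop/conf.rollback.datanode
```

Using one function avoids the easy mistake of editing the role name in one of the five copied lines but not the others.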
4.Back Up Key Trustee Server and Clients
The service is not used
5.Back Up HSM KMS
The service is not used
6.Back Up Navigator Encrypt
The service is not used
7.Back Up HBase
No separate backup is needed: because the rollback process also rolls back HDFS, the data in HBase is rolled back with it. In addition, HBase metadata stored in ZooKeeper is restored as part of the ZooKeeper rollback process.
8.Back Up Search
The service is not used
9.Back Up Sqoop 2
The service is not used
10.Back Up Hue
Back up the app registry file on all hosts running the Hue Server role:
$ mkdir -p /opt/cloudera/parcels_backup
$ cp -rp /opt/cloudera/parcels/CDH/lib/hue/app.reg /opt/cloudera/parcels_backup/app.reg-CM-CDH5.14
2.3 Service changes
hue:
For CentOS 6 systems:
Python 2.7 must be installed on the Hue nodes.
Enable the Software Collections Library:
$ sudo yum install centos-release-scl
Install the Software Collections utilities:
$ sudo yum install scl-utils
Install Python 2.7:
$ sudo yum install python27
Verify that Python 2.7 is installed:
$ source /opt/rh/python27/enable
$ python --version
hbase:
1. HBase 2.0 does not support PREFIX_TREE block encoding. You need to remove it before upgrading; otherwise HBase 2.0 cannot start.
If CDH 6 is already installed, you can verify that no tables or snapshots use PREFIX_TREE block encoding by running the following tools:
$ hbase pre-upgrade validate-dbe
$ hbase pre-upgrade validate-hfile
2. Upgrade coprocessor classes
External coprocessors are not automatically upgraded. There are two ways to handle coprocessor upgrades:
Before continuing with the upgrade, manually upgrade the coprocessor jars; or
Temporarily remove the coprocessor settings and continue the upgrade, then reset them after you have upgraded the coprocessors manually.
Attempting to upgrade without upgrading the coprocessor jar may result in unpredictable behavior such as HBase role startup failure, HBase role crash, and even data corruption.
If you already have CDH 6 installed, you can run the hbase pre-upgrade validate-cp tool to check that your coprocessors are compatible with the upgrade.
2.4 considerations for upgrading a cluster
When upgrading a cluster managed by Cloudera Manager 5.13 or earlier to CDH 6.0 or later, backups made with Cloudera Manager Backup and Disaster Recovery (BDR) will not work.
The minor version of the Cloudera Manager used to perform the upgrade must be equal to or greater than the minor version of CDH; we already upgraded Cloudera Manager in section 1.
Note:
When upgrading CDH using a rolling reboot (for upgrades only):
Automatic failover does not affect the rolling restart operation.
After the upgrade is complete, do not delete the old parcels if MapReduce or Spark jobs are currently running. These jobs still use the old parcels and must be restarted to use the newly upgraded parcels.
Make sure that Oozie jobs are idempotent.
Do not use Oozie Shell Actions to run Hadoop-related commands.
Rolling upgrade of Spark Streaming jobs is not supported. Restart the streaming jobs after the upgrade is complete to begin using the newly deployed version.
The runtime library must be packaged as part of the Spark application.
You must use distributed caching to propagate the job profile from the client gateway computer.
Do not build "super" or "fat" JAR files that contain third-party dependencies or CDH classes, as these files may conflict with classes that YARN, Oozie, and other services automatically add to the CLASSPATH.
Build Spark applications without bundling CDH JAR.
2.4.1 backup cloudera manager
We backed up once before the Cloudera Manager upgrade; we need to back up again after the upgrade.
1. View the database information:
$ cat /etc/cloudera-scm-server/db.properties
For example:
com.cloudera.cmf.db.type=...
com.cloudera.cmf.db.host=database_hostname:database_port
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=SOME_PASSWORD
2. Back up Cloudera Manager Agent
Execute on each agent node:
Create a backup directory:
$ export CM_BACKUP_DIR="`date +%F`-CM5.14"
$ echo $CM_BACKUP_DIR
$ mkdir -p $CM_BACKUP_DIR
Back up the agent directory and runtime state:
$ sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-agent.tar --exclude=*.sock /etc/cloudera-scm-agent /etc/default/cloudera-scm-agent /var/run/cloudera-scm-agent /var/lib/cloudera-scm-agent
Back up the current repo directory:
$ sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d
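Before moving on, it is worth confirming the tarballs are actually non-empty; tar -tf lists an archive's contents without extracting it. A small sketch:

```shell
verify_backup() {
    # $1 = path to a tar archive; fail if it is missing or empty, else report the entry count.
    [ -s "$1" ] || { echo "missing or empty: $1" >&2; return 1; }
    echo "$1: $(tar -tf "$1" | wc -l | tr -d ' ') entries"
}

# After the backups above:
# verify_backup "$CM_BACKUP_DIR/cloudera-scm-agent.tar"
# verify_backup "$CM_BACKUP_DIR/repository.tar"
```

A zero-byte or unreadable archive fails the check immediately, which is far cheaper to discover now than during a rollback.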
Backup Cloudera Management Service
Execute on the Service Monitor node
$ sudo cp -rp /var/lib/cloudera-service-monitor /var/lib/cloudera-service-monitor-`date +%F`-CM5.14
Execute on the Host Monitor node
$ sudo cp -rp /var/lib/cloudera-host-monitor /var/lib/cloudera-host-monitor-`date +%F`-CM5.14
Execute on the Event Server node
$ sudo cp -rp /var/lib/cloudera-scm-eventserver /var/lib/cloudera-scm-eventserver-`date +%F`-CM5.14
3. Stop Cloudera Manager Server & Cloudera Management Service
Stop Cloudera Management Service in the CDH management interface and select:
Clusters- > Cloudera Management Service.
Actions > Stop.
Stop Cloudera Manager Server:
$ sudo service cloudera-scm-server stop
4. Back up the Cloudera Manager database:
$ mysqldump --databases database_name --host=database_hostname --port=database_port -u user_name -p > $HOME/database_name-backup-`date +%F`-CM5.14.sql
The database information is the information obtained from viewing the file in the first step.
5. Backup Cloudera Manager Server
Execute on the Cloudera Manager Server node:
1. Create a backup directory:
$ export CM_BACKUP_DIR="`date +%F`-CM5.14"
$ echo $CM_BACKUP_DIR
$ mkdir -p $CM_BACKUP_DIR
2. Back up the Cloudera Manager Server directories:
$ sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-server.tar /etc/cloudera-scm-server /etc/default/cloudera-scm-server
3. Back up the current repo directory:
$ sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d
2.4.2 Enter maintenance mode
To avoid unnecessary alerts during the upgrade process, enter maintenance mode on the cluster before starting the upgrade. Entering maintenance mode stops sending email alerts and SNMP traps, but does not stop checking and configuring validation. After completing the upgrade, be sure to exit maintenance mode to re-enable Cloudera Manager alerts.
2.4.3 Complete the pre-upgrade migration steps
YARN
Decommission and recommission the YARN NodeManagers, but do not start the NodeManagers.
The decommission is required so that the NodeManagers stop accepting new containers, kill any running containers, and then shut down.
Procedure:
1. Ensure that no new applications, such as MapReduce or Spark applications, are submitted to the cluster until the upgrade is complete.
2. Open the CDH management interface and go to the YARN service to be upgraded.
3. On the instance tab, select all NodeManager roles. This can be done by filtering roles under role types.
4. Click Actions for Selected -> Decommission.
If the cluster runs CDH 5.9 or later and is managed by Cloudera Manager 5.9 or later, and you have configured graceful decommission, a timeout countdown starts.
A timeout is offered before graceful decommission begins the decommission process. The timeout creates a window in which already-running workloads can drain from the system and complete. To set one, search for the Node Manager Graceful Decommission Timeout field on the Configuration tab of the YARN service and set the property to a value greater than 0.
5. Wait until decommissioning is complete. When it finishes, the NodeManager state is Stopped and the commission state is Decommissioned.
6. Select all NodeManagers and click Actions for Selected -> Recommission.
(If you skip step 6, an error that is hard to diagnose will be reported later in the upgrade; for example, during the YARN upgrade:
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Requested replication factor of 0 is less than the required minimum of 1 for /user/yarn/mapreduce/mr-framework/3.0.0-cdh6.2.0-mr-framework.tar.gz)
Important: do not start the selected NodeManagers.
Hive
Query syntax, DDL syntax, and Hive API have all changed. Before upgrading, you may need to edit the HiveQL code in the application workload.
Sentry
If the cluster uses Sentry policy file authorization, you must migrate the policy file to a Sentry service supported by the database before upgrading to CDH 6.
Spark
If the cluster uses Spark or Spark Standalone, you must perform several steps to ensure that the correct version is installed.
Delete spark standalone
After the upgrade, if Spark 2 was installed, spark2-submit is replaced by spark-submit; update your job-submission commands accordingly before submitting jobs.
2.4.4 Run the Hue document cleanup
If the cluster uses Hue, perform the following steps (maintenance version is not required). These steps clean up the database tables used by Hue and can help improve performance after upgrade.
1. Back up the Hue database.
2. Connect to the Hue database.
3. Check the sizes of the desktop_document, desktop_document2, oozie_job, beeswax_session, beeswax_savedquery, and beeswax_queryhistory tables as reference points. If any of these tables has more than 100,000 rows, run the cleanup.
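The row-count check can be scripted by generating one SELECT per table. This sketch only prints the SQL; run the output against your Hue database. The table list mirrors the step above (beeswax_queryhistory is our reading of the truncated last table name), and the mysql credentials in the comment are placeholders:

```shell
hue_count_sql() {
    # Emit one row-count query per Hue table to compare against the 100,000-row threshold.
    local t
    for t in desktop_document desktop_document2 oozie_job beeswax_session \
             beeswax_savedquery beeswax_queryhistory; do
        echo "SELECT '$t' AS tbl, COUNT(*) AS cnt FROM $t;"
    done
}

hue_count_sql                          # review the generated SQL
# hue_count_sql | mysql -u hue -p hue  # placeholders: adjust the user and database name
```

Any table whose cnt exceeds 100,000 is a candidate for the cleanup described above.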
2.4.5 download and distribute packages
1. Open the CDH management interface and click Hosts -> Parcels -> Configuration.
2. Update the Parcel repository for CDH with the following remote parcel repository URL:
https://archive.cloudera.com/cdh6/6.2.0/parcels/
a. In the Remote Parcel Repository URLs section, click the "+" icon, add the URL above, and click Save Changes.
b. Locate the row in the table that contains the new CDH parcel, and then click the download button.
c. After the parcel has downloaded, click the Distribute button.
d. When the parcel has been distributed to all hosts, click the Upgrade button.
2.4.6 run the upgrade CDH Wizard
1. When you enter the upgrade wizard, it runs a number of cluster checks; any problems found will affect the subsequent upgrade, so resolve them first. There is also a prompt to back up the databases: if you have already done so, click "Yes, I have performed these steps", and then click Continue.
2. Click Full Cluster Restart (full cluster downtime) and click Continue. (This step restarts all services.)
Some problems were encountered during the upgrade:
Oozie exception prompt during upgrade:
1.E0103: Could not load service classes, Cannot create PoolableConnectionFactory (Table 'oozie.validate_conn' doesn't exist)
Solution:
2. "java.lang.ClassNotFoundException: org.cloudera.log4j.redactor.RedactorAppender": the class could not be found.
Referring to this article, create a symlink to the missing logredactor-2.0.7.jar from /opt/cloudera/parcels/CDH/lib/oozie/lib into the /opt/cloudera/parcels/CDH/lib/oozie/libtools directory.
3.ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
The cause is that the exception information cannot be displayed because no log4j.xml is configured; you can place a log4j.xml template in the /opt/cloudera/parcels/CDH/lib/oozie/libtools directory.
2.4.7 Migration after the upgrade
1. Spark
After upgrading to CDH 6, multiple Spark services may be configured, each with its own set of configurations, including the event log location. Determine which service to keep, and then manually merge the two services.
The command for submitting Spark 2 jobs in CDH 5 (spark2-submit) is removed in CDH 6 and replaced by spark-submit. In CDH 5 clusters with both the built-in Spark 1.6 service and a Spark 2 service, spark-submit works with the Spark 1.6 service and spark2-submit works with the Spark 2 service. After upgrading to CDH 6, spark-submit uses CDH's built-in Spark 2 service, and spark2-submit no longer works. Be sure to update any workflows that submit Spark jobs to use these commands accordingly.
Manually merge the Spark service by performing the following steps:
1. Copy all relevant configurations from the service you want to delete to the service you want to retain. To view and edit the configuration:
a. In Cloudera Manager Admin Console, go to the Spark service that you want to delete.
b. Click the configuration tab.
c. Record the configuration.
d. Go to the Spark service you want to keep and copy the configuration.
e. Click Save changes.
To keep a historical event log:
Determine the location of the event log for the service you want to delete:
In Cloudera Manager Admin Console, go to the Spark service that you want to delete.
Click the configuration tab.
Search: spark.eventLog.dir
Pay attention to the path.
Log in to the cluster host and run the following command:
$ hadoop fs -mv <old_event_log_dir>/* <new_event_log_dir>/
(replace the placeholders with the event log path of the service being deleted and of the service being kept)
Using Cloudera Manager, stop and delete the Spark service you chose to delete
Restart the remaining Spark service: click the drop-down arrow next to the Spark service, and then select Restart.
2. Impala
Impala is mainly used for interactive queries rather than scheduled production jobs, so it is less critical here; refer to the official documentation:
https://www.cloudera.com/documentation/enterprise/upgrade/topics/impala_upgrading.html
That's it.