What are the basic knowledge points of CDH5

2025-04-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains the basic knowledge points of CDH5: the Cloudera Manager (CM) system structure, the related directories, configuration and environment variables, and common CM and Hadoop shell commands.

0. System structure

Cloudera Manager (CM) consists of a Server, Agents, and a database (a modified embedded PostgreSQL). It does three main things:

1. Manage and monitor cluster hosts.

2. Manage configuration centrally.

3. Manage and maintain the Hadoop platform.

The Agent runs on each host and executes the commands sent by the Server, generally by using Python to invoke the corresponding service shell script. The Server is a Java REST service that exposes a REST API. The web management console calls Server functions through this REST API, and the web interface uses rich-client technology (Knockout).

1. The Server is implemented mainly in Java.

2. The Agent is implemented mainly in Python and starts a service by calling the corresponding shell script. If startup fails, the startup script is called again, up to 4 times.

3. The Agent and Server maintain a heartbeat using the Thrift RPC framework.
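The retry behavior in item 2 can be sketched roughly as follows. This is not the Agent's actual code: `start_with_retries` and its `retries` parameter are hypothetical names, and reading "4 times" as up to four additional calls after the first failure is an assumption.

```python
import subprocess
import sys


def start_with_retries(command, retries=4):
    """Run a start command; if it fails, call it again up to `retries` times.

    Returns (succeeded, total_calls). Hypothetical helper, not CM Agent code.
    """
    calls = 0
    for _ in range(1 + retries):
        calls += 1
        # A zero exit status is treated as a successful start.
        if subprocess.run(command).returncode == 0:
            return True, calls
    return False, calls


if __name__ == "__main__":
    # Stand-in for a service start script that always fails.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(start_with_retries(failing))
```

A real Agent would presumably also wait between attempts and report the failure to the Server; both are left out of this sketch.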

1. Related directories

/opt/cloudera/parcels/: installation directory for Hadoop-related services.

/opt/cloudera/parcel-repo/: downloaded service packages in parcel format.

/opt/cloudera/parcel-cache/: cached data for downloaded service packages.

/opt/cloudera/parcels/CDH/jars: directory where all jar packages are located

/var/log/cloudera-scm-installer: installation log directory

/var/log/cloudera-scm-*: related log files (related services and CM)

/usr/share/cmf/: program installation directory

/usr/lib64/cmf/: Agent program code

/var/lib/cloudera-scm-server-db/data: embedded PostgreSQL database directory

/var/lib/cloudera-scm-server: Server directory

/usr/bin/postgres: embedded database program

/etc/cloudera-scm-agent/: configuration directory for the CM Agent.

/etc/cloudera-scm-agent/config.ini: Agent settings for connecting to the Server, such as server_host

/etc/cloudera-scm-server/: configuration directory for the CM Server.

/etc/cloudera-scm-server/db.properties: database settings

/etc/hadoop/*: Hadoop client configuration directory

/etc/hive/: configuration directory for Hive
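As a small illustration of the Agent's config.ini listed above, the sketch below reads the Server address with Python's standard configparser. Only server_host comes from this article; the [General] section name, the server_port key, and the sample values are assumptions about a typical file and should be checked against a real installation.

```python
import configparser

# Illustrative content only; a real config.ini contains more settings.
SAMPLE_CONFIG = """\
[General]
server_host=cm-server.example.com
server_port=7182
"""


def read_server_address(config_text):
    """Return (server_host, server_port) from the text of an Agent config.ini."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    general = parser["General"]  # assumed section name
    return general["server_host"], general.getint("server_port")
```

On a cluster host, the text would come from reading /etc/cloudera-scm-agent/config.ini instead of the sample string.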

2. Configuration and environment variables

After CDH is installed, the Hadoop component configuration files are placed in the /var/run/cloudera-scm-agent/process/ directory when a service starts.

For example: /var/run/cloudera-scm-agent/process/193-hdfs-NAMENODE/core-site.xml. These configuration files are generated when the corresponding service (such as HDFS) is started through Cloudera Manager, and their contents come from the database (that is, the parameters configured through the interface).
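To inspect a generated file such as core-site.xml, the property entries can be read into a dictionary. The sketch below uses the standard Hadoop `<configuration>/<property>` layout; the sample property value is made up for illustration, and `parse_site_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Illustrative sample; the host name and port are invented.
SAMPLE_CORE_SITE = """\
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""


def parse_site_xml(xml_text):
    """Map each <property> name to its value in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.findall("property")
    }
```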

A configuration change made in the CM interface is not reflected in the configuration files immediately. The change is stored in the database and takes effect the next time the service is restarted, and a new set of configuration files is generated on every start.
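Because each start produces a new configuration directory, several directories for the same role can exist side by side. The sketch below picks the most recent one, assuming that the numeric prefix (193 in 193-hdfs-NAMENODE) grows with each start; that ordering is an assumption, and `latest_process_dir` is a hypothetical helper.

```python
import re


def latest_process_dir(dir_names, role_suffix):
    """Return the name with the highest numeric prefix for the given role.

    dir_names: directory names such as "193-hdfs-NAMENODE".
    role_suffix: the part after the number, such as "hdfs-NAMENODE".
    """
    pattern = re.compile(r"^(\d+)-" + re.escape(role_suffix) + r"$")
    candidates = []
    for name in dir_names:
        match = pattern.match(name)
        if match:
            # Compare prefixes as integers so that 193 sorts after 87.
            candidates.append((int(match.group(1)), name))
    return max(candidates)[1] if candidates else None
```

On a cluster host the names could come from `os.listdir("/var/run/cloudera-scm-agent/process")`.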

In the CM Server database (scm/cmf), configuration is stored in the configs table, which contains the configuration information of the services.

Each configuration change is recorded by adding all the configuration content of the current page to the database, which preserves the modification history.

Viewing the configuration

a. Query the configs table in the cmf database directly.

b. Call the REST API: http://172.16.101.66:7180/api/v4/cm/deployment returns the deployment configuration in JSON format.
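Option b can be scripted with the Python standard library. The sketch below builds the request for the deployment endpoint shown above and adds HTTP basic authentication; the user name and password are placeholders, and `build_deployment_request` and `fetch_deployment` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_deployment_request(host, user, password, port=7180, api_version="v4"):
    """Build a GET request for the CM deployment endpoint with basic auth."""
    url = f"http://{host}:{port}/api/{api_version}/cm/deployment"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_deployment(host, user, password):
    """Call the endpoint and return the parsed JSON document."""
    request = build_deployment_request(host, user, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `fetch_deployment("172.16.101.66", "admin", "admin")` would return a dictionary, provided the Server is reachable and those placeholder credentials are valid.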

How configuration files are generated

CM generates a separate configuration directory (and set of files) for each service process. All configuration files are generated on the Server by querying the database (the scm/cmf database can only be accessed from localhost). The Agent then downloads a zip package containing the configuration files over the network and unpacks it locally into the designated directory.
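The Agent-side step (receive a zip package, unpack it into the process directory) can be illustrated with the standard zipfile module. This is only a sketch of the idea, not the Agent's implementation; the helper names and the sample file contents are hypothetical.

```python
import io
import os
import tempfile
import zipfile


def unpack_config_zip(zip_bytes, target_dir):
    """Extract an in-memory zip of configuration files into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())


def make_sample_zip():
    """Build a small zip that stands in for the package sent by the Server."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("core-site.xml", "<configuration/>")
        archive.writestr("hdfs-site.xml", "<configuration/>")
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "193-hdfs-NAMENODE")
        print(unpack_config_zip(make_sample_zip(), target))
```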

Modifying the configuration

CM predefines the settings that commonly need to be modified. Settings that are not predefined are set with XML configuration fragments in the advanced configuration items. The configuration files under /etc/hadoop/ are the client configuration, which can be generated by deploying the client configuration in CM.
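An XML configuration fragment for the advanced configuration items is an ordinary Hadoop `<property>` element. The sketch below generates one with correct escaping; `property_fragment` is a hypothetical helper and the property name in the usage note is made up.

```python
import xml.etree.ElementTree as ET


def property_fragment(name, value):
    """Return a Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    # encoding="unicode" returns str and omits the XML declaration.
    return ET.tostring(prop, encoding="unicode")
```

Calling `property_fragment("example.custom.setting", 10)` produces a one-line element that can be pasted into the advanced configuration field.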

Environment variable script: /opt/cloudera/parcels/CDH/meta/cdh_env.sh

CDH_HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop

HADOOP_BIN=/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/bin/hadoop

# The configuration files of the CM client/server and the Hadoop components are all under the /etc directory

HDFS

Active NameNode data directory (dfs.name.dir): /dfs/nn

Standby NameNode data directory (dfs.name.dir): /dfs/nn

Secondary NameNode HDFS checkpoint directory (fs.checkpoint.dir): /dfs/nn

Log directory (hadoop.log.dir): /var/log/hadoop-hdfs

MapReduce

JobTracker local data directory (mapred.local.dir): /mapred/jt

TaskTracker local data directory list (mapred.local.dir): /mapred/local

Log directory (hadoop.log.dir): /var/log/hadoop-0.20-mapreduce

Hive

Warehouse directory (hive.metastore.warehouse.dir): /user/hive/warehouse

HiveServer2 log directory: /var/log/hive

ZooKeeper

Data directory (dataDir): /var/lib/zookeeper

Transaction log directory (dataLogDir): /var/lib/zookeeper

3. CM common commands

service cloudera-scm-server start|stop|restart|status

service cloudera-scm-server-db start|stop|restart|status

service cloudera-scm-agent start|stop|restart|status

View the processes: jps / jps -l
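When the service commands in this section are driven from a script, it helps to validate the service name and action before running anything. The sketch below does that; `service_command` and `run_service_command` are hypothetical helpers, and running them typically requires root privileges on the host.

```python
import subprocess

SERVICES = ("cloudera-scm-server", "cloudera-scm-server-db", "cloudera-scm-agent")
ACTIONS = ("start", "stop", "restart", "status")


def service_command(service, action):
    """Return the argument list for `service <name> <action>` after validation."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return ["service", service, action]


def run_service_command(service, action):
    """Run the command and return its exit status."""
    return subprocess.run(service_command(service, action)).returncode
```

For example, `run_service_command("cloudera-scm-agent", "status")` runs the same command as typing it in a shell.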

4. Hadoop Shell

hadoop fs -ls /: lists the directories and files under the root directory of the HDFS file system

hadoop fs -ls -R /: lists all directories and files in the HDFS file system, recursively

hadoop dfsadmin -report: shows basic information and statistics about the file system
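The listing commands can also be called from a script and their output parsed. The sketch below assumes the usual eight-column `hadoop fs -ls` output (permissions, replication, owner, group, size, date, time, path) and skips the "Found N items" header. The sample output is invented for illustration, and `parse_ls_output` and `list_hdfs` are hypothetical helpers.

```python
import subprocess

# Invented sample in the usual `hadoop fs -ls` layout.
SAMPLE_LS_OUTPUT = """\
Found 2 items
drwxrwxrwt   - hdfs supergroup          0 2014-05-01 10:00 /tmp
drwxr-xr-x   - hdfs supergroup          0 2014-05-01 10:00 /user
"""


def parse_ls_output(output):
    """Turn `hadoop fs -ls` output into a list of dictionaries."""
    entries = []
    for line in output.splitlines():
        # Split into at most 8 fields so that paths containing spaces survive.
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # header line or anything that is not an entry
        permissions, _replication, owner, group, size, _date, _time, path = fields
        entries.append({
            "path": path,
            "owner": owner,
            "group": group,
            "size": int(size),
            "is_dir": permissions.startswith("d"),
        })
    return entries


def list_hdfs(path="/"):
    """Run `hadoop fs -ls <path>` and parse the result."""
    result = subprocess.run(
        ["hadoop", "fs", "-ls", path],
        capture_output=True, text=True, check=True,
    )
    return parse_ls_output(result.stdout)
```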

These are the basic knowledge points of CDH5. The specifics still need to be verified in practice on your own cluster.
