This article records how to build an Ambari HDP cluster. Recently I needed to use Ambari again at work to build a Hadoop cluster, so I wrote down the process as I went. I hope you get something out of it.
Ambari version used: 2.2.1 (the latest for Ubuntu 14.04 at the time)
HDP version used: 2.4.3.0 (the latest for Ubuntu 14.04 at the time)
What is Ambari?
Apache Ambari is a Web-based tool that supports provisioning, management, and monitoring of Apache Hadoop clusters.
Ambari already supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, and HCatalog, and manages them all centrally. It is also one of the top five Hadoop management tools. (In short, it is an open-source one-click Hadoop installation service.)
What can we do with it? Why should we use it?
We can use Ambari to quickly build and manage Hadoop and its commonly used service components, such as HDFS, YARN, Hive, HBase, Oozie, Sqoop, Flume, ZooKeeper, Kafka, and so on. (To put it bluntly: it lets you be as lazy as possible.)
As for why we use it: first, Ambari is one of the earliest Hadoop cluster management tools; second, the Hadoop official website now also recommends Ambari. Its main advantages:
Cluster provisioning is simplified through a step-by-step installation wizard.
With key operation and maintenance indicators (metrics) pre-configured, you can directly check whether Hadoop Core (HDFS and MapReduce) and related projects (such as HBase, Hive and HCatalog) are healthy.
Supports visualization and analysis of job and task execution, making it easier to see dependencies and performance.
Monitoring information is exposed through a complete RESTful API, so it can be integrated with existing operations tools (a quick example follows this list).
The user interface is very intuitive, and users can easily and effectively view information and control the cluster.
Ambari uses Ganglia to collect metrics and Nagios for system alerting, sending email to administrators when something needs attention (such as a node going down or disk space running low).
In addition, Ambari can install secure (Kerberos-based) Hadoop clusters, provide role-based user authentication, authorization, and auditing, and integrate LDAP and Active Directory for user management.
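To illustrate the last two points, here is a minimal sketch. The curl calls query the Ambari REST API, assuming the default admin/admin credentials and the master address used later in this walkthrough; "mycluster" is a placeholder for whatever name you give the cluster. The setup subcommands are the standard interactive ambari-server wizards for LDAP and Kerberos; they only start the prompts, the real configuration happens interactively.

## list the clusters Ambari knows about via the REST API
curl -u admin:admin http://10.1.10.1:8080/api/v1/clusters
## fetch one service of a cluster (replace mycluster with your cluster name)
curl -u admin:admin http://10.1.10.1:8080/api/v1/clusters/mycluster/services/HDFS
## interactive wizards for LDAP integration and Kerberos security
ambari-server setup-ldap
ambari-server setup-security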
Cluster building
1. Let's do some preparatory work before installation.
## First, tell the servers who they are and what their nicknames are (edit the hosts file)
vim /etc/hosts
10.1.10.1 master
10.1.10.2 slave1
10.1.10.3 slave2

## Then let us come and go from each other's houses freely with a key card (configure passwordless login)
ssh-keygen -t rsa                                 # execute on all machines
cat ~/.ssh/id_rsa.pub                             # view the public key
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys   # append the public key to authorized_keys
# First, collect all the public keys into authorized_keys on master
# Then distribute master's authorized_keys to slave1 and slave2
# Finally, use scp to hand out the key card (I won't tell you my password is "old wolf, what time is it")
scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys

## Update the time zone and system locale configuration
apt-get install localepurge            # just keep hitting Enter (removes unused locales)
dpkg-reconfigure localepurge && locale-gen zh_CN.UTF-8 en_US.UTF-8
apt-get update && apt-get install -y tzdata
echo "Asia/Shanghai" > /etc/timezone   # change the time zone to Shanghai
rm /etc/localtime
dpkg-reconfigure -f noninteractive tzdata

## Point NTP at the master for time synchronization
vi /etc/ntp.conf
server 10.1.10.1
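Before moving on, it is worth sanity-checking the passwordless login. A minimal sketch of my own, assuming the three hostnames above are in /etc/hosts on every machine:

## run from master: each line should print the remote hostname without asking for a password
for h in master slave1 slave2; do
    ssh -o BatchMode=yes "$h" hostname
done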
2. Next, apply some Ubuntu system optimizations.
### 1.1 Turn off the swap partition
swapoff -a
vim /etc/fstab
# delete or comment out the swap line, similar to the following:
# swap was on /dev/sda2 during installation
# UUID=8aba5009-d557-4a4a-8fd6-8e6e8c687714 none swap sw 0 0

### 1.2 Raise the open file descriptor limits (ulimit)
vi /etc/profile
ulimit -SHn 512000
vim /etc/security/limits.conf
# increase the limits roughly tenfold
* soft nofile 600000
* hard nofile 655350
* soft nproc 600000
* hard nproc 655350
# make the change take effect
source /etc/profile

### 1.3 Tune the kernel configuration
vi /etc/sysctl.conf
# paste the following
fs.file-max = 65535000
net.core.somaxconn = 30000
vm.swappiness = 0
net.core.rmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 16384 16777216
net.core.netdev_max_backlog = 16384
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.ip_local_port_range = 1024 65000
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
# execute the command to make the configuration take effect
sysctl -p

### 1.4 Disable transparent huge pages (THP)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
# to disable permanently:
vi /etc/rc.local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
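A quick way to confirm the tuning took effect; this is my own sanity check, not part of the original steps:

free -m                                            # the Swap row should show 0 total / 0 used
ulimit -n                                          # a new shell should report the raised nofile limit
sysctl net.core.somaxconn vm.swappiness            # spot-check a couple of kernel parameters
cat /sys/kernel/mm/transparent_hugepage/enabled    # should show [never]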
3. Install and deploy ambari-server (environment: Ubuntu 14.04 + Ambari 2.2.1)
## add the apt source
wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu14/2.x/updates/2.2.1.0/ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update
## install ambari-server on the master node
apt-get install ambari-server -y
## install ambari-agent on all nodes
apt-get install ambari-agent -y
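If anything looks off, you can confirm the packages really came from the Hortonworks repo; a small check of my own, assuming the repo file was written to /etc/apt/sources.list.d/ambari.list as above:

apt-cache policy ambari-server   # candidate 2.2.1.0 should come from public-repo-1.hortonworks.com
dpkg -l | grep ambari            # lists the installed ambari packages and versions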
4. Modify the ambari-agent configuration to point to ambari-server
vi /etc/ambari-agent/conf/ambari-agent.ini
## modify the hostname under [server]
[server]
hostname=master
url_port=8440
secured_url_port=8441

## initialize the ambari-server configuration: service account, database, JDK (default 1.7), LDAP; the defaults are generally fine
ambari-server setup
## just keep hitting Enter

## start ambari
ambari-server start
ambari-agent start
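To confirm both daemons came up before touching the browser, a few checks I find useful (the log path is the Ambari default):

ambari-server status                               # should report the server as running
netstat -tlnp | grep -E '8080|8440|8441'           # web UI and agent registration ports listening
tail -n 20 /var/log/ambari-agent/ambari-agent.log  # agent should log a successful registration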
5. With those headache-inducing shell commands behind us, it is time to wire everything together in the browser.
Open http://10.1.10.1:8080/ in your browser. The default account and password are admin/admin. Click LAUNCH INSTALL WIZARD and let's get started.
6. Give the cluster a name
7. Pay attention to this step and double-check your HDP version, or there will be trouble later.
8. What I configure here is HDP 2.4.3.
Example base URL: http://public-repo-1.hortonworks.com/HDP/debian7/2.x/updates/2.4.3.0
Click Next to check whether the repository source is reachable. If an error is reported here, you can tick "Skip Repository Base URL validation (Advanced)" to skip the check.
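Rather than skipping validation blindly, you can first test from the shell whether the repo base URL is reachable at all; a minimal check using the example URL above:

curl -I http://public-repo-1.hortonworks.com/HDP/debian7/2.x/updates/2.4.3.0/
## an HTTP 200 (or a 3xx redirect) means the mirror is reachable from this host;
## a timeout points at network or proxy trouble rather than a wrong URL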
9. Enter the hostnames master, slave1, and slave2. Because ambari-agent is already installed on the slaves, choose manual registration instead of SSH.
10. Check the host registration status. You will need to wait here; if it takes too long, you can restart ambari-server.
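If registration stalls, restarting both sides usually clears it up (these are the standard Ambari service commands):

ambari-server restart   # on master
ambari-agent restart    # on every node that is stuck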
11. Choose the services we need: HDFS, YARN, and ZooKeeper.
12. Use Ambari's default component assignment and click Next to start the installation.
13. From here on it is mostly a matter of your Internet speed.
14. After the installation completes, click Next all the way through to the dashboard to see our Hadoop cluster, which is started by default.
15. Click Restart All under HDFS to restart all of its components.
16. Verify that the installation succeeded: click NameNode UI.
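If you prefer the command line to the UI, the same health check can be done with the standard Hadoop admin tools, run on any cluster node:

hdfs dfsadmin -report   # should list every live DataNode with its capacity
yarn node -list         # should show each NodeManager in RUNNING state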
17. Basic information page
18. Hadoop is now built and running. Don't you want to try a job?
## on the server
# create an hdfs directory (you can browse it through the http://master:50070/explorer.html#/ interface)
hdfs dfs -mkdir -p /data/input
# upload a file from the server to hdfs
hdfs dfs -put file /data/input/
# test with the example provided on the official website
hadoop jar hdfs://tesla-cluster/data/hadoop-mapreduce-examples-2.7.1.2.4.0.0-169.jar wordcount /data/input /data/output1
19. The output directory is generated as follows: a _SUCCESS marker and the result file.
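To actually look at the word counts, list the output directory and cat the part file (part-r-00000 is the default reducer output name, assuming a single reducer):

hdfs dfs -ls /data/output1                 # shows _SUCCESS plus part-r-00000
hdfs dfs -cat /data/output1/part-r-00000   # one "word<TAB>count" pair per line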
That is the whole walkthrough of building an Ambari HDP cluster. If you have run into similar questions, I hope the steps above help clear them up.