2025-02-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
I. What is CDH
CDH is Cloudera's 100% open source Hadoop distribution, built specifically to meet enterprise demands.
In other words, it is an open source platform for distributed storage and processing built around Hadoop.
II. What software and functions are included in cdh5
First of all, HBase, Hadoop, and ZooKeeper are essential.
Secondly, Hive, Oozie, and Map/Reduce can also be integrated.
HBase is a distributed, column-oriented open source database. It is modeled on the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Chang et al.
Hadoop is a distributed system infrastructure developed under the Apache Software Foundation. Users can develop distributed programs without knowing the underlying details of the distribution, and make full use of the power of a cluster for high-speed computing and storage.
ZooKeeper started as a Hadoop sub-project and is now a top-level Apache project. It is a reliable coordination system for large-scale distributed systems and provides functions such as configuration maintenance, naming, distributed synchronization, and group services.
Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables, provides SQL-like querying (HiveQL), and converts the statements into MapReduce jobs to run.
Oozie is a workflow framework that lets us combine multiple Map/Reduce jobs into a single logical unit of work.
MapReduce is a programming model for parallel operations on large datasets (larger than 1 TB). The concepts "Map" and "Reduce" and their main ideas are borrowed from functional programming languages, along with features from vector programming languages. It makes it much easier for programmers to run their programs on a distributed system without knowing how to do distributed parallel programming.
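To make the Map and Reduce idea concrete, here is a minimal single-process sketch of a word count in Python. It only imitates the programming model; it is not Hadoop's API, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all counts for one word into a single total."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

On a real cluster the map calls run on many nodes in parallel and the framework performs the shuffle over the network; here everything runs in one process so the data flow is easy to follow.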
III. Installation of cdh5
Generally speaking, the popular way to install cdh5 is to go to the official website, http://www.cloudera.com/blog/2012/02/introducing-cdh5/,
download the required rpm packages, install them one by one through yum according to the official documentation, and finally configure everything.
What I want to introduce here is installing cdh5 through cloudera-manager.
Cloudera-manager is a Cloudera product, not an Apache Foundation project. At the time of writing there are two versions, a free version and a commercial version. The free version supports only 50 nodes, and the commercial version has no limit.
In general, 50 nodes will be enough, so here we use the free version of cloudera-manager.
Official download address: https://ccp.cloudera.com/display/SUPPORT/Downloads
1. Installation environment
Node1: 192.168.1.124, CentOS 6.2
Node2: 192.168.1.163, CentOS 6.2
iptables: turned off
SELinux: disabled
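Before starting, it can be worth confirming that SELinux really is disabled on each node. The following is a small sketch, assuming the standard /etc/selinux/config file with a SELINUX= line; the helper name selinux_mode is hypothetical.

```python
def selinux_mode(config_text):
    """Return the value of the SELINUX= setting, or None if it is absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Match the key exactly so SELINUXTYPE= is not picked up by mistake.
        if sep and key.strip() == "SELINUX":
            return value.strip().lower()
    return None
```

A typical use would be to read the file with open("/etc/selinux/config") and check that the result is "disabled". Note that this reflects the configured mode for the next boot, not necessarily the mode currently in effect.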
2. Install cloudera-manager
Node1:
The official download gives you an executable file, cloudera-manager-installer.bin.
The X Window System package group must be installed in advance, for the simple reason that the installer uses a graphical interface.
During installation it automatically runs yum to install the packages it needs, which means more than 100 MB of automatic downloads. Because the source is overseas, and with the company's bandwidth limit, various network policies in China, and so on, the download often hangs and the installation may not finish in a day.
My approach is to interrupt the graphical installer directly, that is, to kill it. By that point the yum repo it needs has already been added to our system.
Following the URL in that yum repo, http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/4.0.4/,
download the packages manually.
After the download is complete, install them locally with yum:
yum localinstall --nogpgcheck *.rpm
After the yum installation is complete, rerun cloudera-manager-installer.bin to finish the installation (if the installation fails with a prompt, go to the /usr/share/cmf directory and delete the uninstall-cloudera-manager.sh file).
Note 1: the packages need to be installed on both hosts, but only one host runs the graphical installer and acts as the console. The other host needs nothing further. Here I use node1 as the console.
Note 2: the JDK should also be installed on both hosts in advance, otherwise it will be downloaded and installed automatically. Installing the JDK from an rpm package is recommended.
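The manual download in step 2 can be scripted. The sketch below assumes the repository URL serves an ordinary HTML directory listing with one link per package, which is an assumption about the server and not something confirmed here. It only extracts the .rpm links from the page text; the names RpmLinkParser and rpm_urls are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RpmLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at an .rpm file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.endswith(".rpm"):
            self.links.append(href)


def rpm_urls(index_html, base_url):
    """Return absolute URLs for all .rpm links found in a directory listing."""
    parser = RpmLinkParser()
    parser.feed(index_html)
    return [urljoin(base_url, href) for href in parser.links]
```

The listing itself could be fetched with urllib.request.urlopen and each returned URL saved to a local directory, after which yum localinstall --nogpgcheck *.rpm is run in that directory as described above.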
3. Install cdh5
① After the installation of cloudera-manager is completed, it starts automatically, and listening ports such as 7182 and 7180 can be seen with netstat -tnlp.
Open http://192.168.1.124:7180 in a browser to reach the web management entry of cloudera-manager. The default administrator user is admin and the password is admin.
After logging in, you are asked whether to use the free version or the commercial version. We choose the free version.
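As an alternative to reading the netstat output, a quick TCP connection test from another machine shows whether the console port is reachable. This is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the setup described here, port_is_open("192.168.1.124", 7180) should return True once cloudera-manager has started. A False result from a remote machine while netstat shows the port listening usually points at a firewall in between.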
② Next comes the installation driven entirely from the cloudera-manager web console, which is very simple.
First search for the hosts: fill in the IP addresses of the two hosts, run the search, and then select install.
Choose the cdh5 version and so on, and then an installation page with a progress bar appears. As with installing cloudera-manager, once the yum repo file has been written, interrupt the installation, go back to the system, kill the yum process, and close the page.
Check which packages are required through /etc/yum.repos.d/cloudera-cdh5.repo, then go to http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/4/ and download the rpm packages.
Then, as above, run yum localinstall --nogpgcheck *.rpm
Finally, reopen the http://192.168.1.124:7180 page and run the host installation again.
Note 1: the cloudera-manager console does not download and install again any packages that are already installed.
Note 2: if your network speed is good, you can wait for the installation to complete without interrupting it. If it fails, however, do not click retry, because that uninstalls everything already installed and starts from scratch. With an overseas source, everyone knows what the download speed is like.
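The download URL used in step ② comes from the repo file that the installer writes. Yum repo files use an INI-like layout, so the baseurl can be read with the standard configparser module. This is a sketch under that assumption; the helper name repo_baseurls and the sample values in the usage note are hypothetical.

```python
import configparser


def repo_baseurls(repo_text):
    """Map each repository section name to its baseurl, if it defines one."""
    # Only "=" is treated as a delimiter so that ":" inside URLs is left alone,
    # and interpolation is off so "%" characters in values cause no errors.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(repo_text)
    return {
        section: parser.get(section, "baseurl").strip()
        for section in parser.sections()
        if parser.has_option(section, "baseurl")
    }
```

Reading /etc/yum.repos.d/cloudera-cdh5.repo and passing its text to this function gives a dictionary of section name to URL, which can then be opened in a browser or fed to a download script.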
③ After the installation above is finished, there is a host inspection. If there are many hosts, it will be relatively slow. After the inspection you can choose the services, which depends on your needs. Here I choose HBase, Hadoop, and ZooKeeper, and then start the services.
Real-time monitoring of service status
Real-time monitoring of host status
Log in to a host and open the hbase shell to test
At this point, the cdh5 framework is ready to use.
Note: services that were not selected are not started by default. Don't worry about this. If you need Hive or other services, you can start them manually.