
Big Data Learning Series (4): Hadoop + Hive Environment Setup Explained in Detail (Stand-alone)


Copyright notice:

Author: nothingness

cnblogs: http://www.cnblogs.com/xuwujing

CSDN: http://blog.csdn.net/qazwsxpcm

Personal blog: http://www.panchengming.com

Original writing is not easy; if you repost, please credit the source. Thank you!

Introduction

In Big Data Learning Series Part 1: Hadoop Environment Setup (stand-alone) we successfully built a Hadoop environment, and in Part 2: HBase Environment Setup (stand-alone) we built an HBase environment and introduced its basic usage. This article explains how to set up a combined Hadoop + Hive environment.

I. Environment preparation

1. Server selection

Local virtual machine

Operating system: Linux CentOS 7

CPU: 2 cores

Memory: 2 GB

Hard disk: 40 GB

Note: reconfiguring an Aliyun server from scratch every time, plus the network-transfer overhead, was inconvenient, so I built a virtual machine locally to make file transfer and configuration easier. The downside is that the host computer becomes more sluggish. The detailed tutorial and usage are in an earlier post.

Address: http://blog.csdn.net/qazwsxpcm/article/details/78816230

2. Configuration selection

JDK:1.8 (jdk-8u144-linux-x64.tar.gz)

Hadoop:2.8.2 (hadoop-2.8.2.tar.gz)

Hive: 2.1 (apache-hive-2.1.1-bin.tar.gz)

3. Download addresses

JDK:

http://www.oracle.com/technetwork/java/javase/downloads

Hadoop:

http://www.apache.org/dyn/closer.cgi/hadoop/common

Hive:

http://mirror.bit.edu.cn/apache/hive/

Baidu cloud disk:

Link: https://pan.baidu.com/s/1slxBsHv password: x51i

II. Server configuration

Before setting up Hadoop + Hive, the server itself needs some basic configuration.

For convenience in doing these configurations, use root permissions.

1. Change the hostname

Change the hostname first in order to facilitate management.

Enter:

hostname

View the name of this machine

Then change the hostname to master

Enter:

hostnamectl set-hostname master

Note: the new hostname takes effect only after a restart; run reboot to restart the machine.
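A quick way to confirm the change after rebooting (a minimal check; hostnamectl is available on CentOS 7):

hostnamectl status | grep -i hostname   # should show: Static hostname: master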

2. Map IP to hostname

Modify the hosts file to map the IP address to the hostname.

Enter:

vim /etc/hosts

Add the IP and hostname of the machine:

192.168.238.128 master
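To verify that the mapping works (a quick check, assuming the IP above matches your virtual machine):

ping -c 3 master   # should resolve to 192.168.238.128 and get replies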

3. Turn off the firewall

Turn off the firewall for easy access.

For CentOS 6 and below, enter:

service iptables stop

For CentOS 7 and above, enter:

systemctl stop firewalld.service

4. Time setting

View current time

Enter:

date

Check whether the server time is correct; if not, change it.

Command to change the time:

date -s 'MMDDhhmmYYYY.ss'
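For example, GNU date also accepts an explicit timestamp, which can be easier to read than the packed format above (an illustrative value, not from the original post):

date -s '2017-11-30 15:40:00'   # set the clock to Nov 30 2017, 15:40:00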

III. Hadoop installation and configuration

Hadoop's specific configuration is described in detail in Big Data Learning Series Part 1: Hadoop Environment Setup (stand-alone), so this article gives only a brief walk-through.

Note: adapt the specific configuration to your own environment.

1. Environment variable settings

Edit the / etc/profile file:

vim /etc/profile

Add the following configuration:

export HADOOP_HOME=/opt/hadoop/hadoop2.8
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=.:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH
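After saving, a quick sanity check that the variables resolve (a minimal sketch; assumes JAVA_HOME was already set when the JDK was installed):

source /etc/profile
echo $HADOOP_HOME    # expect /opt/hadoop/hadoop2.8
hadoop version       # should print the Hadoop 2.8.2 banner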

2. Configuration file changes

First change to the /opt/hadoop/hadoop2.8/etc/hadoop/ directory (the path where Hadoop was installed).

3.2.1 Modify core-site.xml

Enter:

vim core-site.xml

Add inside the <configuration> node:

<property>
    <name>hadoop.tmp.dir</name>
    <value>/root/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
</property>
<property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
</property>

3.2.2 Modify hadoop-env.sh

Enter:

vim hadoop-env.sh

Modify ${JAVA_HOME} to your own JDK path

export JAVA_HOME=${JAVA_HOME}

Modified to:

export JAVA_HOME=/home/java/jdk1.8

3.2.3 Modify hdfs-site.xml

Enter:

vim hdfs-site.xml

Add inside the <configuration> node:

<property>
    <name>dfs.name.dir</name>
    <value>/root/hadoop/dfs/name</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/root/hadoop/dfs/data</value>
    <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>need not permissions</description>
</property>

3.2.4 Modify mapred-site.xml

If there is no mapred-site.xml file, copy the mapred-site.xml.template file and rename the copy to mapred-site.xml.

Enter:

vim mapred-site.xml

Modify the newly created mapred-site.xml file and add the following inside the <configuration> node:

<property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
</property>
<property>
    <name>mapred.local.dir</name>
    <value>/root/hadoop/var</value>
</property>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

3. Hadoop startup

You need to format before starting.

Change to the /opt/hadoop/hadoop2.8/bin directory

Enter:

./hadoop namenode -format

After the format succeeds, switch to the /opt/hadoop/hadoop2.8/sbin directory

Start hdfs and yarn

Enter:

start-dfs.sh
start-yarn.sh

After the startup is successful, enter jps to check whether the daemons are running.

Open ip:8088 and ip:50070 in a browser to see whether the web UIs are reachable.

If both pages load correctly, the startup succeeded.
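The same check can be scripted from the shell (a minimal sketch; assumes curl is installed and the hostname master resolves as configured earlier):

jps                                                           # expect NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager
curl -s -o /dev/null -w '%{http_code}\n' http://master:8088   # YARN web UI, expect 200
curl -s -o /dev/null -w '%{http_code}\n' http://master:50070  # HDFS web UI, expect 200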

IV. MySQL installation

Because Hive's metadata is stored in MySQL in this setup, MySQL needs to be installed first.

There are two ways to install MySQL; choose whichever you prefer.

1. Install via yum

First check to see if mysql has been installed

Enter:

rpm -qa | grep mysql

If it is already installed and you want to remove it, enter one of the following.

Normal delete command:

rpm -e mysql

Force delete command (dependent files are deleted as well):

rpm -e --nodeps mysql

Install mysql

Enter:

yum list mysql-server

If it is not available, download the repo package with the wget command.

Enter:

wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm

After the download is successful, enter the command to install

yum install mysql-server

Just enter y when you encounter a choice during installation

After successful installation, type service mysqld start to start the service

Enter:

mysqladmin -u root -p password '123456'

to set the password. When prompted for the current password, just press Enter (there is no password by default).

And then enter

mysql -u root -p

Grant remote connection permissions with an authorization statement.

Enter: grant all privileges on *.* to 'root'@'%' identified by '123456';

Note: the first 'root' is the username, '%' means any IP may connect remotely, and '123456' is that user's password. If remote access is not needed regularly, disable it.

Enter: flush privileges; to refresh the privileges.

With the firewall turned off, use a tool such as SQLyog to test whether you can connect correctly.
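Without a GUI tool, the grant can also be verified from another machine with the mysql client (a minimal check, assuming the VM's IP from earlier and the password set above):

mysql -h 192.168.238.128 -u root -p123456 -e 'select version();'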

2. Install MySQL from the compiled package

File preparation

Upload the downloaded mysql installation package to the linux server

Extract the mysql package, move it to the /usr/local directory, and rename it mysql.

Command:

tar -xvf mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz
mv mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local
cd /usr/local
mv mysql-5.6.21-linux-glibc2.5-x86_64 mysql

Note: the default path for mysql is /usr/local/mysql. If the installation location changes, you need to change the corresponding configuration files.

Install mysql

Change to mysql's directory /usr/local/mysql

Enter:

./scripts/mysql_install_db --user=mysql

After successfully installing mysql, enter

service mysql start or /etc/init.d/mysql start

Check to see if the startup is successful

Enter:

ps -ef | grep mysql

Change to the /usr/local/mysql/bin directory

Set the password:

mysqladmin -u root password '123456'

Then enter mysql

Enter:

mysql -u root -p

Set remote connection permissions.

Enter:

grant all privileges on *.* to 'root'@'%' identified by '123456';

Then enter:

flush privileges;

Note: the first 'root' is the username, '%' means any IP may connect remotely, and '123456' is that user's password. If remote access is not needed regularly, disable it.

Test the connection using a local connection tool.

V. Hive installation and configuration

1. File preparation

Extract the downloaded Hive package.

On linux, enter:

tar -xvf apache-hive-2.1.1-bin.tar.gz

Then move it to /opt/hive and rename the folder hive2.1.

Enter:

mv apache-hive-2.1.1-bin /opt/hive
cd /opt/hive
mv apache-hive-2.1.1-bin hive2.1

2. Environment configuration

Edit / etc/profile file

Enter:

vim /etc/profile

Add:

export HIVE_HOME=/opt/hive/hive2.1
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export PATH=.:${JAVA_HOME}/bin:${SCALA_HOME}/bin:${SPARK_HOME}/bin:${HADOOP_HOME}/bin:${ZK_HOME}/bin:${HBASE_HOME}/bin:${HIVE_HOME}/bin:$PATH

Note: adjust the PATH entries to match what is actually installed on your own machine!

Enter:

source /etc/profile

to make the configuration take effect.
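A quick check that the new variables resolve (a minimal sketch under the paths chosen above):

echo $HIVE_HOME          # expect /opt/hive/hive2.1
$HIVE_HOME/bin/hive --version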

3. Configuration changes

5.3.1 Create folders

Before you can modify the configuration file, you need to create some folders in the root directory.

mkdir /root/hive
mkdir /root/hive/warehouse

After creating the local folders, create the /root/hive/ and /root/hive/warehouse directories in HDFS as well.

Execute the command:

$HADOOP_HOME/bin/hadoop fs -mkdir -p /root/hive/
$HADOOP_HOME/bin/hadoop fs -mkdir -p /root/hive/warehouse

Give read and write access to the directory you just created, and execute the command:

$HADOOP_HOME/bin/hadoop fs -chmod 777 /root/hive/
$HADOOP_HOME/bin/hadoop fs -chmod 777 /root/hive/warehouse

Check whether the two directories are created successfully

Enter:

$HADOOP_HOME/bin/hadoop fs -ls /root/
$HADOOP_HOME/bin/hadoop fs -ls /root/hive/

You can see that it has been successfully created

5.3.2 Modify hive-site.xml

Change to the /opt/hive/hive2.1/conf directory

Make a copy of hive-default.xml.template and rename it hive-site.xml

Then edit the hive-site.xml file

cp hive-default.xml.template hive-site.xml
vim hive-site.xml

Edit the hive-site.xml file and add:

<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/root/hive/warehouse</value>
</property>
<property>
    <name>hive.exec.scratchdir</name>
    <value>/root/hive</value>
</property>
<property>
    <name>hive.metastore.uris</name>
    <value></value>
</property>
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
</property>
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
</property>

Then change every occurrence of

${system:java.io.tmpdir}

to /opt/hive/tmp (create this folder if it does not exist, and give it read/write permissions), and change every occurrence of

${system:user.name}

to root.

For example, a default value such as

${system:java.io.tmpdir}/${system:user.name}

becomes, after the change:

/opt/hive/tmp/root

Note: the hive-site.xml file contains a great many configuration entries, so it can be easier to download it over FTP and edit it locally. Alternatively, keep only the properties you need and delete the rest. The master in the MySQL connection URL is the host's alias and can be replaced with an IP address.
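If you would rather make the two bulk substitutions from the shell than by hand, a sed sketch like the following can do it (back up the file first; the paths follow the choices above):

cd /opt/hive/hive2.1/conf
cp hive-site.xml hive-site.xml.bak                    # keep a backup
sed -i 's#${system:java.io.tmpdir}#/opt/hive/tmp#g' hive-site.xml
sed -i 's#${system:user.name}#root#g' hive-site.xml
mkdir -p /opt/hive/tmp && chmod 777 /opt/hive/tmp     # create the tmp folder with read/write access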

5.3.3 Modify hive-env.sh

Modify the hive-env.sh file; if it does not exist, copy hive-env.sh.template and rename the copy to hive-env.sh.

Add the following to the configuration file:

export HADOOP_HOME=/opt/hadoop/hadoop2.8
export HIVE_CONF_DIR=/opt/hive/hive2.1/conf
export HIVE_AUX_JARS_PATH=/opt/hive/hive2.1/lib

5.3.4 Add the database driver package

Because Hive's metastore is configured to use MySQL here, the MySQL JDBC driver is needed.

Upload the MySQL driver jar to /opt/hive/hive2.1/lib.
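For example (the connector file name here is illustrative; use whichever mysql-connector-java jar you actually downloaded):

cp mysql-connector-java-5.1.44-bin.jar /opt/hive/hive2.1/lib/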

VI. Hive Shell test

After successfully starting Hadoop

Change to the Hive directory

Enter:

cd /opt/hive/hive2.1/bin

Initialize the metastore database first.

Make sure MySQL is running before initializing.

Enter:

schematool -initSchema -dbType mysql

After it runs successfully, you can see that the hive database and a batch of tables have been created.
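You can confirm this from MySQL (a minimal check, assuming the root password set earlier):

mysql -u root -p123456 -e 'use hive; show tables;'   # should list the metastore tables, e.g. TBLS, DBS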

Still in the /opt/hive/hive2.1/bin directory, enter hive (make sure Hadoop has been started successfully first).

Enter:

hive

After entering hive

Do some simple operations

Create a new library, and then build a table

The basic operations are similar to those of an ordinary relational database.

Create a library:

create database db_hiveTest;

Create a table:

create table db_hiveTest.student(id int, name string) row format delimited fields terminated by '\t';

Note: terminated by '\t' means the column delimiter in the text is a Tab; there must be no stray spaces around it.

Load data

Open a new window

Because Hive does not support writing rows directly here, data is added by loading a text file with LOAD DATA.

Create a new text

touch /opt/hive/student.txt

Edit the text to add data

Enter:

vim /opt/hive/student.txt

Add the data (the separator between the columns is a Tab character):

1001	zhangsan
1002	lisi
1003	wangwu

Note: the text file can be created on Windows and uploaded to Linux via ftp, but make sure the file uses Unix line endings.
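To confirm the file is in the right shape before loading (a quick check; cat -A prints a Tab as ^I, and a Windows line ending would show as ^M):

cat -A /opt/hive/student.txt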

Switch to hive shell

Load data

Enter:

load data local inpath '/opt/hive/student.txt' into table db_hiveTest.student;

Then query the data

Enter:

select * from db_hiveTest.student;
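The same statement can also be run non-interactively from the Linux shell (a minimal sketch using hive -e):

hive -e 'select * from db_hiveTest.student;'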

That concludes the Hadoop + Hive setup in this article. Thank you for reading!

Other

For more use of hive, please refer to the official documentation.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual

Reference for environment building:

http://blog.csdn.net/pucao_cug/article/details/71773665
