This article presents a sample walkthrough of common Hive command operations. It is written to be easy to understand and follow, and I hope it helps clear up any doubts you have about working with Hive.
1. Prepare the text file and start Hadoop [root@hadoop0 ~]# cat /opt/test.txt
JieJie
MengMeng
NingNing
JingJing
FengJie
[root@hadoop0 ~]# start-all.sh
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /opt/hadoop/libexec/../logs/hadoop-root-namenode-hadoop0.out
localhost: starting datanode, logging to /opt/hadoop/libexec/../logs/hadoop-root-datanode-hadoop0.out
localhost: starting secondarynamenode, logging to /opt/hadoop/libexec/../logs/hadoop-root-secondarynamenode-hadoop0.out
starting jobtracker, logging to /opt/hadoop/libexec/../logs/hadoop-root-jobtracker-hadoop0.out
localhost: starting tasktracker, logging to /opt/hadoop/libexec/../logs/hadoop-root-tasktracker-hadoop0.out
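Before moving on, it can be worth a quick sanity check that all five Hadoop 1.x daemons actually came up. This step is not part of the original transcript, just a common habit on a single-node setup:
[root@hadoop0 ~]# jps
The output should list NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker alongside the Jps process itself; if one is missing, check the corresponding log file shown above.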
2. Enter the command line [root@hadoop0 ~]# hive
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:.../hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201509252001_1674268419.txt
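Besides the interactive shell, Hive statements can also be run non-interactively. As a small aside (not part of the original walkthrough), the CLI accepts a query string with -e or a script file with -f:
[root@hadoop0 ~]# hive -e "show databases;"
[root@hadoop0 ~]# hive -f /opt/my_queries.q
Here /opt/my_queries.q is just a hypothetical file containing HiveQL statements; both forms exit once the statements finish, which makes them handy for shell scripts and cron jobs.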
3. Query the table created yesterday hive > select * from stu;
OK
JieJie 26 NULL
MM 24 NULL
Time taken: 17.05 seconds
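The table can of course also be queried with projections and filters. A minimal sketch, assuming stu was created with columns along the lines of name and age (the real column names depend on the earlier DDL, which is not shown here):
hive > select name from stu where age > 25;
Unlike the plain select * above, a query with a WHERE clause is typically compiled into a MapReduce job on this Hive version, so it takes noticeably longer even on a tiny table.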
4. Display the databases hive > show databases;
OK
default
Time taken: 0.237 seconds
5. Create a database hive > create database test;
OK
Time taken: 0.259 seconds
hive > show databases;
OK
default
test
Time taken: 0.119 seconds
6. Use the database hive > use test;
OK
Time taken: 0.03 seconds
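A small, optional variation on steps 5 and 6 (not in the original transcript): CREATE DATABASE accepts an IF NOT EXISTS clause, so the statement can be re-run safely, and once a database has been selected with USE, show tables lists what it contains:
hive > create database if not exists test;
hive > use test;
hive > show tables;
The tables created below (test1 through test4) all live in the test database and can also be referenced from another database as test.test1, and so on.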
7. Create tables with different storage formats
TEXTFILE is the default format. The data is stored uncompressed, which results in high disk overhead and high data-parsing overhead.
It can be combined with Gzip or Bzip2 compression (Hive detects the compression automatically and decompresses during queries), but files compressed this way cannot be split, so Hive cannot process the data in parallel.
SEQUENCEFILE is a binary file format provided by the Hadoop API. It is easy to use, splittable, and compressible. SequenceFile supports three compression options: NONE, RECORD, and BLOCK. RECORD compression gives a low compression ratio, so BLOCK compression is generally recommended.
RCFILE is a storage format that combines row and column organization. First, it partitions the data by rows, guaranteeing that a single record stays on one block, so reading a record never requires reading multiple blocks. Second, within each block the data is stored by column, which benefits compression and allows fast access to individual columns.
hive > create table test1 (str STRING) STORED AS TEXTFILE;
OK
Time taken: 0.598 seconds
-- load data
hive > LOAD DATA LOCAL INPATH '/opt/test.txt' INTO TABLE test1;
Copying data from file:/opt/test.txt
Copying file: file:/opt/test.txt
Loading data to table test.test1
OK
Time taken: 1.657 seconds
hive > select * from test1;
OK
JieJie
MengMeng
NingNing
JingJing
FengJie
Time taken: 0.388 seconds
hive > select count(*) from test1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201509252000_0001, Tracking URL = http://hadoop0:50030/jobdetails.jsp?jobid=job_201509252000_0001
Kill Command = /opt/hadoop/libexec/../bin/hadoop job -Dmapred.job.tracker=hadoop0:9001 -kill job_201509252000_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2015-09-25 20:09 Stage-1 map = 0%, reduce = 0%
2015-09-25 20:10 Stage-1 map = 100%, reduce = 0%
2015-09-25 20:10 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 6.95 sec
MapReduce Total cumulative CPU time: 6 seconds 950 msec
Ended Job = job_201509252000_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 6.95 sec HDFS Read: 258 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 950 msec
OK
5
Time taken: 77.515 seconds
-- For comparison: test1 above was created with an explicit STORED AS TEXTFILE clause; a table created without one, such as test2, uses TEXTFILE by default
hive > create table test2 (str STRING);
hive > create table test3 (str STRING) STORED AS SEQUENCEFILE;
OK
Time taken: 0.112 seconds
hive > create table test4 (str STRING) STORED AS RCFILE;
OK
Time taken: 0.502 seconds
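To double-check which storage format a table actually ended up with, its metadata can be inspected. A minimal sketch (the exact layout of the output differs between Hive versions):
hive > describe extended test3;
The detailed table information includes the inputFormat and outputFormat classes; for test3 they should be the SequenceFile variants, and for test4 the RCFile variants.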
8. Import the data from the old table into the new table hive > INSERT OVERWRITE TABLE test4 SELECT * FROM test1;
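The INSERT ... SELECT route matters here: LOAD DATA only moves files into the table's directory without converting them, so loading /opt/test.txt straight into an RCFile or SequenceFile table would leave plain text files behind a binary table definition. The result can be checked with an ordinary query (a small verification step, not in the original transcript):
hive > select * from test4;
This should return the same five names that were loaded into test1, now read back from RCFile storage.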
9. Set hive parameters hive > SET hive.exec.compress.output=true;
hive > SET io.seqfile.compression.type=BLOCK;
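With these two settings in place, the output of subsequent INSERT statements is compressed, and SequenceFile output uses BLOCK compression. A minimal sketch that populates the SequenceFile table test3 under these settings (the actual codec falls back to the cluster default unless something like mapred.output.compression.codec is also set, which this walkthrough does not do):
hive > INSERT OVERWRITE TABLE test3 SELECT * FROM test1;
hive > select * from test3;
Hive decompresses transparently on read, so the final select still shows the plain five names.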
10. Check the hive parameters hive > SET;
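Run on its own, SET prints the configuration variables that have been overridden in the current session. Two related forms of the same command are worth knowing (standard Hive CLI behaviour, not shown in the original transcript): SET followed by a single name prints just that value, and SET -v also lists the underlying Hadoop configuration:
hive > SET hive.exec.compress.output;
hive > SET -v;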
That is the whole of the "sample analysis of Hive command operation". I hope the content shared here has given you a solid feel for these commands and helps in your own work; if you want to learn more, you are welcome to keep following for related knowledge.