2025-04-03 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article describes how to merge small files in Hive.
Cause:
Recently, a new partitioned table was built in the warehouse, holding about 1.2 billion rows. It has many partitions: one per day since July 2008.
A task was configured to run a GROUP BY on this table, and it started more than 2,800 map tasks.
The execution time was also high, at about 10 minutes.
Looking at the table's files in HDFS, I found more than 20 small files in each partition, each only about 300 KB to 1 MB.
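A check like the one described above can be run from the Hive CLI with its `dfs` command. This is only a sketch: the warehouse path, the table name `events`, and the partition `dt=2008-07-01` are hypothetical placeholders, not names from the original setup.

```sql
-- Hypothetical path: substitute the real location of one partition of the table.
-- Lists every file in the partition directory together with its size in bytes.
dfs -ls /user/hive/warehouse/events/dt=2008-07-01;

-- Prints the directory count, file count, and total bytes for the same partition.
dfs -count /user/hive/warehouse/events/dt=2008-07-01;
```

A file count in the dozens combined with a total size of only a few megabytes points to the small-file problem.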
Hive parameters before the change:
hive.merge.mapfiles=true
hive.merge.mapredfiles=false
hive.merge.rcfile.block.level=true
hive.merge.size.per.task=256000000
hive.merge.smallfiles.avgsize=16000000
hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
mapred.max.split.size=256000000
mapred.min.split.size=1
mapred.min.split.size.per.node=1
mapred.min.split.size.per.rack=1
hive.merge.mapredfiles controls whether small files are merged at the end of a map-reduce job. With it set to false, the output of jobs that include a reduce phase was left unmerged.
Solution:
1. Set the parameter hive.merge.mapredfiles=true.
2. Generate a new table with a map-reduce job, so that each partition ends up as a single file.
Running the GROUP BY again showed a large improvement in efficiency.
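The two solution steps above might look roughly like this in HiveQL. The original does not show the statements that were actually run, so treat this as a sketch under assumptions: the table names `events` and `events_merged` and the partition column `dt` are hypothetical, and using dynamic partitioning with `DISTRIBUTE BY` is one possible way to rebuild the table, not necessarily the author's.

```sql
-- Hypothetical names: events (original table), events_merged (new table),
-- dt (daily partition column).

-- Step 1: merge small output files at the end of jobs that have a reduce phase.
SET hive.merge.mapredfiles=true;

-- Let the query create partitions from the dt values it reads.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Assumed limits: the defaults are lower than the number of daily partitions here.
SET hive.exec.max.dynamic.partitions=5000;
SET hive.exec.max.dynamic.partitions.pernode=5000;

-- Step 2: create a new table with the same schema and partitioning.
CREATE TABLE events_merged LIKE events;

-- DISTRIBUTE BY forces a reduce phase and routes all rows of one dt value
-- to the same reducer, so each partition is written by a single reducer.
INSERT OVERWRITE TABLE events_merged PARTITION (dt)
SELECT * FROM events
DISTRIBUTE BY dt;
```

`SELECT *` works here because Hive returns the partition column last, which is where a dynamic-partition insert expects it. If the table is too large to rewrite in one job, the same statement can be run over date ranges by adding a `WHERE` clause on `dt`.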