2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains the difference between static and dynamic partitions in Hive. Many people are not familiar with the topic, so it is shared here for reference.
Hive is a data warehouse tool built on Hadoop. It is easy to learn, and SQL-like statements are enough to run simple MapReduce statistics quickly, which makes it well suited to data warehouse statistics. Anyone learning Hive will come across partitions, which are a form of data storage in Hive. When a query filters on partition columns, Hive scans only the directories that match the column values and skips the partitions the query does not care about, which locates data quickly and improves query efficiency. Partitions come in two forms: static partitions and dynamic partitions.
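As a minimal sketch of the directory-based filtering described above: the table `page_views` and its columns are hypothetical and exist only for illustration, and the directory names in the comments show the usual `column=value` layout rather than output from a real cluster.

```sql
-- Hypothetical table, used only to illustrate partition filtering.
create table if not exists page_views (
  user_id int,
  url string
)
partitioned by (dt string)
row format delimited fields terminated by '\t';

-- Each partition lives in its own subdirectory of the table directory, e.g.
--   .../page_views/dt=2017-04-01/
--   .../page_views/dt=2017-04-02/

-- The filter on the partition column dt means only the
-- dt=2017-04-01 directory has to be read.
select user_id, url
from page_views
where dt = '2017-04-01';

-- Where the Hive version supports it, explain dependency reports
-- which tables and partitions a query would read.
explain dependency
select user_id, url
from page_views
where dt = '2017-04-01';
```

A filter on a non-partition column alone (for example `where user_id = 1`) gives Hive no way to skip directories, so every partition is scanned.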
Static partition
A partition is static when its value is known in advance: the partition name is specified explicitly when you add a partition or load data into it.
create table if not exists day_part1 (
  uid int,
  uname string
)
partitioned by (year int, month int)
row format delimited fields terminated by '\t';
-- Load data into an explicitly named partition
load data local inpath '/root/Desktop/student.txt' into table day_part1 partition (year=2017, month=04);
-- Add partitions by explicitly naming each one
alter table day_part1 add partition (year=2017, month=1) partition (year=2016, month=12);
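To check the result of the statements above, the partitions can be listed, written to, and dropped by name. This is a sketch that assumes the `day_part1` table from this section; the inserted rows are made-up sample values, and `insert ... values` needs Hive 0.14 or later.

```sql
-- List every partition that currently exists for the table.
show partitions day_part1;

-- Static partition insert: both partition values are written out
-- in the statement, so nothing is derived from the data.
insert into table day_part1 partition (year=2017, month=5)
values (1, 'alice'), (2, 'bob');

-- Drop a partition by name; if exists avoids an error
-- when the partition is not there.
alter table day_part1 drop if exists partition (year=2016, month=12);
```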
Dynamic partition
A partition is dynamic when its value is not known in advance and is determined by the input data.
1. Properties that control dynamic partitioning:
hive.exec.dynamic.partition=true: whether dynamic partitioning is allowed
hive.exec.dynamic.partition.mode=strict: the partition mode
strict: at least one partition column must be static
nonstrict: all partition columns may be dynamic
hive.exec.max.dynamic.partitions=1000: the maximum total number of dynamic partitions allowed
hive.exec.max.dynamic.partitions.pernode=100: the maximum number of dynamic partitions that each mapper or reducer node may create
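These properties can be set per session with `set`. The fragment below simply repeats the values listed above; suitable limits depend on the cluster, so treat the numbers as placeholders rather than recommendations.

```sql
-- Allow dynamic partitioning for this session.
set hive.exec.dynamic.partition=true;
-- Require at least one static partition column.
set hive.exec.dynamic.partition.mode=strict;
-- Limits on how many dynamic partitions a statement may create.
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;

-- set with a key and no value prints the current setting.
set hive.exec.dynamic.partition.mode;
```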
2. Using dynamic partitions
-- Create a staging table
create table if not exists tmp (
  uid int,
  commentid bigint,
  recommentid bigint,
  year int,
  month int,
  day int
)
row format delimited fields terminated by '\t';
-- Load data into the staging table
load data local inpath '/root/Desktop/comm' into table tmp;
-- Create the dynamically partitioned table
create table if not exists dyp1 (
  uid int,
  commentid bigint,
  recommentid bigint
)
partitioned by (year int, month int, day int)
row format delimited fields terminated by '\t';
-- Strict mode: year is static; month and day come from the last two columns of the select
insert into table dyp1 partition (year=2016, month, day)
select uid, commentid, recommentid, month, day from tmp;
-- Non-strict mode
-- Switch dynamic partitioning to non-strict mode
set hive.exec.dynamic.partition.mode=nonstrict;
-- Create the dynamically partitioned table
create table if not exists dyp2 (
  uid int,
  commentid bigint,
  recommentid bigint
)
partitioned by (year int, month int, day int)
row format delimited fields terminated by '\t';
-- Load data with every partition column dynamic
insert into table dyp2 partition (year, month, day)
select uid, commentid, recommentid, year, month, day from tmp;
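After a dynamic partition insert it is worth checking which partitions were actually created. The sketch below assumes the `dyp2` and `tmp` tables from this section. One point to keep in mind: Hive takes dynamic partition values from the trailing columns of the select list and matches them by position, not by name, so those columns must appear in the same order as in the partition clause.

```sql
-- One line per partition that the insert created.
show partitions dyp2;

-- Row count per partition; each combination should correspond
-- to a (year, month, day) combination present in tmp.
select year, month, day, count(*) as row_count
from dyp2
group by year, month, day;

-- Careful: columns are matched by position. Writing
--   select uid, commentid, recommentid, day, month, year from tmp
-- with partition (year, month, day) would store the day values
-- in the year partition column and the year values in day.
```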
Partition details to keep in mind
1. Use dynamic partitioning sparingly. Reducers are allocated for each partition, so the number of reducers grows with the number of partitions, and a large number of partitions can put a very heavy load on the server.
2. Static and dynamic partitions also differ in when a partition is created: a static partition is created whether or not it contains data, while a dynamic partition is created only when the result set contains rows for it.
3. Do not confuse the strict mode of dynamic partitioning with the strict mode that Hive provides through hive.mapred.mode.
Hive provides this strict mode to keep users from accidentally submitting expensive or harmful HQL.
hive.mapred.mode accepts two values: nonstrict and strict.
If the value is strict, the following three types of queries are blocked:
(1) Queries on a partitioned table whose where clause does not filter on a partition column.
(2) Cartesian product joins, that is, join queries with neither an on condition nor a where condition.
(3) order by queries that have no limit clause.
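The three restrictions can be sketched against the `day_part1` table from the static partition section. The blocked forms are left as comments so the block stays runnable; the filter values are arbitrary, and the exact error text depends on the Hive version.

```sql
set hive.mapred.mode=strict;

-- (1) Blocked: no partition column in the filter.
--   select uid, uname from day_part1 where uid = 1;
-- Allowed: the filter names a partition column.
select uid, uname
from day_part1
where year = 2017 and uid = 1;

-- (2) Blocked: join with neither an on nor a where condition.
--   select a.uid, b.uname from day_part1 a join day_part1 b;
-- Allowed: the join has an on condition (and each side is
-- filtered on a partition column, to satisfy rule 1 as well).
select a.uid, b.uname
from day_part1 a
join day_part1 b on a.uid = b.uid
where a.year = 2017 and b.year = 2017;

-- (3) Blocked: order by without limit.
--   select uid, uname from day_part1 where year = 2017 order by uid;
-- Allowed: limit bounds the globally sorted result.
select uid, uname
from day_part1
where year = 2017
order by uid
limit 10;
```

The order by rule exists because a global sort is funneled through a single reducer, so an unbounded sort over a large table can run for a very long time.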