Example Analysis of Static and Dynamic Partitions in Hive

2025-01-29 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01 Report

This article walks through examples of static and dynamic partitions in Hive, intended as a practical reference.

Partitioning is one of the ways Hive stores data: the values of a partition column are used as directory names, and the rows for each value are stored under the corresponding directory. When a query filters on the partition column, Hive scans only the directories that match the filter and skips every other partition, which speeds up the query. There are two kinds of partitions, static and dynamic:
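As a minimal sketch of the idea, assume a hypothetical table `page_views` partitioned by a `dt` column (neither name comes from this article). Each `dt` value would get its own subdirectory under the table's location, and a filter on `dt` restricts the scan to the matching directory:

```sql
-- Hypothetical table for illustration; not part of the article's examples.
CREATE TABLE IF NOT EXISTS page_views (
  uid INT,
  url STRING
)
PARTITIONED BY (dt STRING);

-- Data for each dt value is stored in its own directory, for example:
--   .../page_views/dt=2017-04-01/
--   .../page_views/dt=2017-04-02/

-- Filtering on the partition column lets Hive read only the matching directory.
SELECT uid, url
FROM page_views
WHERE dt = '2017-04-01';
```

Note that the partition column is not declared in the regular column list; it exists only in the PARTITIONED BY clause and in the directory names.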

1. Static partition: the partition value is fixed in advance. When you add a partition or load data into one, the partition name is specified explicitly.

-- The delimiter between the quotes did not survive in the source; use the actual field delimiter of your file.
CREATE TABLE IF NOT EXISTS day_part1 (
  uid INT,
  uname STRING
)
PARTITIONED BY (year INT, month INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '';

-- Load data into a specified partition
LOAD DATA LOCAL INPATH '/root/Desktop/student.txt' INTO TABLE day_part1 PARTITION (year=2017, month=04);

-- Add partitions by specifying the partition names
ALTER TABLE day_part1 ADD PARTITION (year=2017, month=1) PARTITION (year=2016, month=12);
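To check the result of the statements above, the partitions can be listed and queried. This is only a sketch and assumes `day_part1` was created and loaded as shown; no output is given because it depends on the contents of student.txt.

```sql
-- List every partition registered for the table, including ones that hold no data yet.
SHOW PARTITIONS day_part1;

-- Read only the partition that received student.txt.
SELECT uid, uname
FROM day_part1
WHERE year = 2017 AND month = 4;

-- A partition added with ALTER TABLE but never loaded exists, yet contains no rows,
-- so this count is 0 as long as nothing has been loaded into it.
SELECT COUNT(*)
FROM day_part1
WHERE year = 2016 AND month = 12;
```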

2. Dynamic partition: the partition value is not known in advance and is determined by the input data.

2.1 Properties related to dynamic partitions:

hive.exec.dynamic.partition=true: whether dynamic partitioning is allowed

hive.exec.dynamic.partition.mode=strict: the partition mode setting

strict: at least one static partition is required

nonstrict: all partitions may be dynamic

hive.exec.max.dynamic.partitions=1000: the maximum total number of dynamic partitions allowed

hive.exec.max.dynamic.partitions.pernode=100: the maximum number of dynamic partitions that a single mapper/reducer node may create
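These properties can be set per session with SET before running an insert. The sketch below simply applies the values listed above; whether they suit a given cluster is an assumption you should check against your own workload.

```sql
-- Allow dynamic partitioning in this session.
SET hive.exec.dynamic.partition=true;

-- strict: at least one partition column must be given a static value.
SET hive.exec.dynamic.partition.mode=strict;

-- Upper bounds on how many dynamic partitions one statement may create.
SET hive.exec.max.dynamic.partitions=1000;
SET hive.exec.max.dynamic.partitions.pernode=100;

-- SET with a property name and no value prints the current setting.
SET hive.exec.dynamic.partition.mode;
```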

2.2 Working with dynamic partitions

-- Create a staging table
CREATE TABLE IF NOT EXISTS tmp (
  uid INT,
  commentid BIGINT,
  recommentid BIGINT,
  year INT,
  month INT,
  day INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '';

-- Load data
LOAD DATA LOCAL INPATH '/root/Desktop/comm' INTO TABLE tmp;

-- Create the dynamic partition table
CREATE TABLE IF NOT EXISTS dyp1 (
  uid INT,
  commentid BIGINT,
  recommentid BIGINT
)
PARTITIONED BY (year INT, month INT, day INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '';

-- Strict mode: year is static, month and day are dynamic
INSERT INTO TABLE dyp1 PARTITION (year=2016, month, day)
SELECT uid, commentid, recommentid, month, day FROM tmp;
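One detail worth illustrating: Hive matches dynamic partition columns by position, not by name. The last columns of the SELECT list are assigned, in order, to the dynamic partition columns in the PARTITION clause. The sketch below reuses `tmp` and `dyp1` from above. The WHERE filter is an addition of this sketch, on the assumption that only rows from 2016 should land in the year=2016 partition.

```sql
-- year is static (2016); month and day are dynamic.
-- The last two SELECT columns feed month and day, in that order.
INSERT INTO TABLE dyp1 PARTITION (year=2016, month, day)
SELECT uid, commentid, recommentid, month, day
FROM tmp
WHERE year = 2016;  -- assumption: keep only rows that belong to the static year

-- In strict mode, a statement with no static partition value is rejected:
-- INSERT INTO TABLE dyp1 PARTITION (year, month, day)
-- SELECT uid, commentid, recommentid, year, month, day FROM tmp;
```

Swapping `month` and `day` in the SELECT list would not raise an error; the values would simply end up in the wrong partition columns, so the order deserves a second look.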

-- Nonstrict mode

-- Set the dynamic partition mode to nonstrict
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Create the dynamic partition table
CREATE TABLE IF NOT EXISTS dyp2 (
  uid INT,
  commentid BIGINT,
  recommentid BIGINT
)
PARTITIONED BY (year INT, month INT, day INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '';

-- Load data in nonstrict mode: all three partition columns are dynamic
INSERT INTO TABLE dyp2 PARTITION (year, month, day)
SELECT uid, commentid, recommentid, year, month, day FROM tmp;
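After the insert, it can be useful to confirm which partitions were actually created. A small sketch, assuming `dyp2` was populated as above (the alias `row_count` is only illustrative):

```sql
-- One line per (year, month, day) combination that appeared in tmp.
SHOW PARTITIONS dyp2;

-- Row count per partition. This scans every partition, so under
-- hive.mapred.mode=strict it would need a partition filter in WHERE.
SELECT year, month, day, COUNT(*) AS row_count
FROM dyp2
GROUP BY year, month, day;
```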

3. Notes on partitioning

(1) Avoid dynamic partitioning where you can. With dynamic partitioning, reducers are allocated for each partition, so a large number of partitions means a large number of reducers, which puts a heavy load on the server.

(2) The difference between static and dynamic partitions: a static partition is created whether or not it contains data, while a dynamic partition is created only if the result set contains rows for it.

(3) The strict mode of dynamic partitioning is not the same thing as the strict mode that Hive provides through hive.mapred.mode.

Hive provides a strict mode to prevent users from accidentally submitting harmful HQL:

hive.mapred.mode: nonstrict or strict

When the value is strict, the following three types of queries are blocked:

(1) Queries on a partitioned table whose WHERE clause does not filter on a partition column.

(2) Cartesian product joins, that is, join queries with neither an ON condition nor a WHERE condition.

(3) ORDER BY queries that do not have a LIMIT clause.
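The three blocked cases can be sketched with the tables used earlier (`tmp` and `dyp1`). The exact error messages and the way newer Hive releases expose these checks vary by version, so treat this as an illustration of the rule, not a transcript.

```sql
SET hive.mapred.mode=strict;

-- (1) Blocked: dyp1 is partitioned, but the filter is not on a partition column.
-- SELECT * FROM dyp1 WHERE uid = 1;
-- Allowed: the filter includes the partition column year.
SELECT * FROM dyp1 WHERE year = 2016 AND uid = 1;

-- (2) Blocked: join with no ON or WHERE condition (Cartesian product).
-- SELECT a.uid, b.commentid FROM tmp a JOIN tmp b;
-- Allowed: the join has an ON condition.
SELECT a.uid, b.commentid
FROM tmp a
JOIN tmp b ON a.uid = b.uid;

-- (3) Blocked: ORDER BY without LIMIT.
-- SELECT uid, commentid FROM tmp ORDER BY uid;
-- Allowed: the sorted result is bounded by LIMIT.
SELECT uid, commentid FROM tmp ORDER BY uid LIMIT 10;
```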
