What is the difference between static and dynamic partitions in Hive

2025-01-19 Update. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/02

This article explains the difference between static and dynamic partitions in Hive, a topic many people are not very familiar with. It is shared here for your reference.

Hive is a data warehouse tool built on Hadoop. It has a low learning cost and lets you run simple MapReduce statistics quickly through SQL-like statements, which makes it well suited to data warehouse analysis. Anyone learning Hive will run into partitions, which are one of the ways Hive organizes stored data. When a query filters on a partition column, Hive scans only the directories that match the column value and skips the other partitions, so data is located faster and the query is more efficient. Partitions come in two forms: static and dynamic.

Static partition

A partition whose value is fixed in advance is called a static partition: when you add the partition or load data into it, you specify the partition value explicitly.

create table if not exists day_part1 (
  uid int,
  uname string
)
partitioned by (year int, month int)
row format delimited fields terminated by '\t';

-- Load data into an explicitly specified partition
load data local inpath '/root/Desktop/student.txt' into table day_part1 partition (year=2017, month=4);

-- Add partitions by naming them explicitly
alter table day_part1 add partition (year=2017, month=1) partition (year=2016, month=12);
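
A minimal sketch of how the static partitions above can be inspected and queried, assuming the statements above have already run in a Hive session. The select query is illustrative and not part of the original example. Partitions added with alter table appear in the listing even though no data was loaded into them, and the where clause on year and month limits the scan to the matching partition directory.

```sql
-- List every partition of the table, including the empty ones added with alter table
show partitions day_part1;

-- Filter on the partition columns so that only the matching partition directory is scanned
select uid, uname
from day_part1
where year = 2017 and month = 4;
```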

Dynamic partition

The partition value is not fixed in advance; it is determined by the input data at insert time.

1. Properties related to dynamic partitions:

hive.exec.dynamic.partition=true: whether dynamic partitioning is allowed

hive.exec.dynamic.partition.mode=strict: the partition mode setting

strict: at least one static partition is required

nonstrict: all partitions may be dynamic

hive.exec.max.dynamic.partitions=1000: the maximum number of dynamic partitions allowed

hive.exec.max.dynamic.partitions.pernode=100: the maximum number of partitions that a mapper/reducer on a single node may create
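
The properties above can be set per session with set statements. The following is a sketch that uses the values listed above; suitable limits depend on the cluster and are an assumption here. Note that the value Hive accepts for the relaxed mode is spelled nonstrict.

```sql
-- Enable dynamic partitioning for the current session
set hive.exec.dynamic.partition=true;
-- strict requires at least one static partition column; nonstrict lifts that requirement
set hive.exec.dynamic.partition.mode=nonstrict;
-- Limits on how many dynamic partitions one statement may create (values taken from the list above)
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;

-- Naming a property without a value prints its current setting
set hive.exec.dynamic.partition.mode;
```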

2. Working with dynamic partitions

-- Create a temporary (staging) table
create table if not exists tmp (
  uid int,
  commentid bigint,
  recommentid bigint,
  year int,
  month int,
  day int
)
row format delimited fields terminated by '\t';

-- Load data
load data local inpath '/root/Desktop/comm' into table tmp;

-- Create the dynamically partitioned table
create table if not exists dyp1 (
  uid int,
  commentid bigint,
  recommentid bigint
)
partitioned by (year int, month int, day int)
row format delimited fields terminated by '\t';

-- Strict mode: year is static, month and day are dynamic
insert into table dyp1 partition (year=2016, month, day)
select uid, commentid, recommentid, month, day from tmp;
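
As written, the strict-mode insert places every row of tmp under year=2016, whatever its year column contains. A minimal sketch of a filtered variant follows, assuming that tmp holds rows for several years and that only the 2016 rows belong in that partition. The where clause is an addition and not part of the original example.

```sql
-- Assumption: only rows whose year is 2016 should land in the year=2016 partition
insert into table dyp1 partition (year=2016, month, day)
select uid, commentid, recommentid, month, day
from tmp
where year = 2016;

-- List the partitions that the insert created
show partitions dyp1;
```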

-- Non-strict mode

-- Switch dynamic partitioning to non-strict mode
set hive.exec.dynamic.partition.mode=nonstrict;

-- Create the dynamically partitioned table
create table if not exists dyp2 (
  uid int,
  commentid bigint,
  recommentid bigint
)
partitioned by (year int, month int, day int)
row format delimited fields terminated by '\t';

-- Load data with all partition columns dynamic
insert into table dyp2 partition (year, month, day)
select uid, commentid, recommentid, year, month, day from tmp;
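
When the source table spans many partitions, one commonly used way to keep each task from writing to a large number of partitions at once is to send all rows of a partition to the same reducer with distribute by. The following is a sketch of that variant, not part of the original example; whether it helps depends on data volume and skew.

```sql
set hive.exec.dynamic.partition.mode=nonstrict;

-- Rows with the same (year, month, day) go to the same reducer,
-- so each reducer writes to fewer partitions
insert into table dyp2 partition (year, month, day)
select uid, commentid, recommentid, year, month, day
from tmp
distribute by year, month, day;
```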

Partitioning notes

1. Use dynamic partitioning sparingly. Each mapper or reducer keeps a writer open for every partition it writes to, so when the number of partitions is large, the number of open files, the memory use, and the number of small output files all grow, which puts a heavy load on the cluster.

2. The difference between dynamic and static partitions: a static partition is created whether or not it contains data, while a dynamic partition is created only when the result set contains rows for it.

3. The strict mode of Hive dynamic partitioning is separate from the strict mode that Hive provides through hive.mapred.mode.

Hive provides a strict mode to prevent users from accidentally submitting harmful HQL.

hive.mapred.mode=nonstrict | strict

If the value is strict, the following three types of queries are blocked:

(1) Queries on a partitioned table whose where clause does not filter on a partition column.

(2) Cartesian-product joins, that is, join queries with neither an on condition nor a where condition.

(3) Queries that use order by without a limit clause.
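
The three rules can be illustrated with the dyp1 and dyp2 tables from earlier. This is a sketch: the queries are hypothetical examples, and the blocked forms are left as comments so that the script runs under strict mode.

```sql
set hive.mapred.mode=strict;

-- (1) Blocked: the where clause has no partition column
-- select uid from dyp2 where uid = 1;
-- Allowed: the filter includes the partition column year
select uid from dyp2 where year = 2016 and uid = 1;

-- (2) Blocked: join with no on condition (Cartesian product)
-- select a.uid from dyp1 a join dyp2 b;
-- Allowed: join with an on condition, plus partition filters on both tables
select a.uid
from dyp1 a
join dyp2 b on a.uid = b.uid
where a.year = 2016 and b.year = 2016;

-- (3) Blocked: order by without limit
-- select uid from dyp2 where year = 2016 order by uid;
-- Allowed: order by with limit
select uid from dyp2 where year = 2016 order by uid limit 10;
```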

That covers the difference between static and dynamic partitions in Hive. Thank you for reading!
