Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

The year, month and day are used for hive partition.

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/03 Report--

Create a tabl

Create [external] table log_app_web (log map) partitioned by (year int,month int,day int)

ROW FORMAT DELIMITED COLLECTION ITEMS TERMINATED BY'\ t'

MAP KEYS TERMINATED BY'= 'STORED AS ORC

Table name: log_app_web

Field: Map type field log

Partition: partition year, month, day according to year, month and day

Split using'\ t'

KV in MAP uses'='

The data file is stored in ORCFILE format

When querying according to the date range, use the string that converts the installment time to the date to query the range.

Concat_ws ('-', cast (year as string), lpad (cast (month as string), 2scoop 0'), lpad (cast (day as string), 2je Ji 0') > ='"+ format.format (StartDay) +"'

Similar to yyyy-MM-dd > = '2016-04-28'

The above is a more alternative way.

The following is the next section to reprint others'

I. background

1. In a Hive Select query, the contents of the entire table are generally scanned, which consumes a lot of time to do unnecessary work. Sometimes only a portion of the data you care about in the table needs to be scanned, so the concept of partition is introduced when creating the table.

2. The partition table refers to the partition space of the partition specified when the table is created.

3. If you need to create a partitioned table, you need to call the optional parameter partitioned by in the create table. For more information, please see the syntax structure created by the table.

Second, technical details

1. A table can have one or more partitions, each in the form of a folder under the directory of the table folder.

2. Table and column names are not case-sensitive.

3. The partition exists in the table structure in the form of a field. You can see the existence of the field through the describe table command, but the field does not store the actual data content, only the representation of the partition.

4. Syntax for creating a table (for more information on partition, please see PARTITIONED BY parameter):

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name [(col_name data_type [COMMENT col_comment],...)] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment],...)] [CLUSTERED BY (col_name, col_name,...) [SORTED BY (col_name [ASC | DESC],...)] INTO num_buckets BUCKETS] [ROW FORMAT row_format] [STORED AS file_format] [LOCATION hdfs_path]

5. There are two types of partition tables, one is a single partition, that is to say, there is only a first-level folder directory under the table folder directory. The other is multi-partition, where there is a multi-folder nesting pattern under the table folder.

A, single partition table statement: create table day_table (id int, content string) partitioned by (dt string); single partition table, partitioned by day, there are three id,content,dt columns in the table structure.

B, double-partition table statement: create table day_hour_table (id int, content string) partitioned by (dt string, hour string); double-partition table, partitioned by day and hour, adding dt and hour columns to the table structure.

Schematic diagram of table folder directory (multi-partition table):

6. Add partition table syntax (table has been created, add partition on this basis):

ALTER TABLE table_name ADD partition_spec [LOCATION 'location1'] partition_spec [LOCATION' location2']... Partition_spec:: PARTITION (partition_col = partition_col_value, partition_col = partiton_col_value,...)

Users can use ALTER TABLE ADD PARTITION to add partitions to a table. Put quotation marks when the partition name is a string. Example:

ALTER TABLE day_table ADD PARTITION (dt='2008-08-08, hour='08') location'/ path/pv1.txt' PARTITION (dt='2008-08-08, hour='09') location'/ path/pv2.txt'

7. Delete partition syntax:

ALTER TABLE table_name DROP partition_spec, partition_spec,...

Users can use ALTER TABLE DROP PARTITION to delete partitions. The metadata and data of the partition will be deleted together. Example:

ALTER TABLE day_hour_table DROP PARTITION (dt='2008-08-08, hour='09')

8. The data is loaded into the partition table syntax:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2...)]

Example:

LOAD DATA INPATH'/ user/pv.txt' INTO TABLE day_hour_table PARTITION (dt='2008-08-08, hour='08'); LOAD DATA local INPATH'/ user/hua/*' INTO TABLE day_hour partition (dt='2010-07-07')

When the data is loaded into the table, no conversion is made to the data. The Load operation simply copies the data to the location corresponding to the Hive table. When the data is loaded, a directory is automatically created under the table, and the files are stored under the partition.

9. Statements for partition-based queries:

SELECT day_table.* FROM day_table WHERE day_table.dt > = '2008-08-08'

10. View the partition statement:

Hive > show partitions day_hour_table; OK dt=2008-08-08/hour=08 dt=2008-08-08/hour=09 dt=2008-08-09/hour=09

III. Summary

1. In Hive, a Partition in a table corresponds to a directory under the table, and all Partition data is stored in the directory of the most word set.

2. Generally speaking, partition is an auxiliary query, narrowing the scope of query, speeding up the speed of data retrieval and managing the data according to certain specifications and conditions.

This is quite common.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report