Lu Chunli's work notes. Who says programmers can't have a literary flair?
Create a mywork database in Hive to hold the tests that follow, so we don't have to use the default database every time.
hive> create database mywork;
OK
Time taken: 0.487 seconds
hive> show databases;
OK
default
mywork
Time taken: 0.614 seconds, Fetched: 2 row(s)
hive> use mywork;
OK
Time taken: 0.064 seconds
hive> create table student(id int, name string);
OK
Time taken: 0.519 seconds
hive>
View the storage of Hive on HDFS
[hadoop@dnode1 ~]$ hdfs dfs -ls -R /user/hive
drwxrw-rw-   - hadoop hadoop     0 2015-12-08 21:37 /user/hive/warehouse
drwxrw-rw-   - hadoop hadoop     0 2015-12-08 21:36 /user/hive/warehouse/mywork.db
drwxrw-rw-   - hadoop hadoop     0 2015-12-08 21:36 /user/hive/warehouse/mywork.db/student
[hadoop@dnode1 ~]$
The data types supported by Hive are as follows:
Primitive types:
TINYINT    1 byte
SMALLINT   2 bytes
INT        4 bytes
BIGINT     8 bytes
BOOLEAN    true/false
FLOAT      4 bytes
DOUBLE     8 bytes
STRING     character string
BINARY     (available from Hive 0.8.0)
TIMESTAMP  (available from Hive 0.8.0)
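As a quick illustration, here is a minimal DDL sketch that exercises all of these primitive types in one table (the table and column names are hypothetical, chosen only for this example):

hive> create table type_demo (
    >   t tinyint, s smallint, i int, b bigint,
    >   f float, d double, flag boolean,
    >   name string, raw binary, ts timestamp
    > );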
Complex types:
ARRAY      ordered fields; all elements must be of the same type
MAP        unordered key/value pairs
STRUCT     a set of named fields
UNIONTYPE  a value that may hold any one of several declared types
Note: ARRAY values are accessed by subscript (e.g. arrays[0]), MAP values by key (e.g. maps['key']), and STRUCT fields by dot notation (e.g. structs.col_name).
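A minimal sketch of all three access patterns in one query, assuming a hypothetical table t with columns arr array<string>, m map<string,float>, and s struct<col_name:string>:

hive> select arr[0], m['key'], s.col_name from t;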
Create an example table:
hive> create table employee (
    > eno int comment 'the no of employee',
    > ename string comment 'name of employee',
    > salary float comment 'salary of employee',
    > subordinates array<string> comment 'employees managed by current employee',
    > deductions map<string,float> comment 'deductions',
    > address struct<province:string,city:string,street:string,zip:int> comment 'address'
    > ) comment 'This is table of employee info';
OK
Time taken: 0.33 seconds
hive>
Hive uses different delimiters between columns and between the elements inside a complex type; each line of the data file is one record.
Create a file named data_default.txt under the ${HIVE_HOME}/data directory, using the default delimiters, as follows:
Hive's default field separator is the ASCII control character \001, which is equivalent to creating the table with fields terminated by '\001'. To type the data in vi, press Ctrl+V and then Ctrl+A to enter the control character \001 (displayed as ^A). Likewise, \002 is entered as Ctrl+V, Ctrl+B, and so on.
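If typing control characters in vi is awkward, a hedged alternative sketch: the shell's printf can emit \001/\002/\003 as octal escapes directly. The command below reproduces the record described next, assuming the working directory is ${HIVE_HOME}:

[hadoop@dnode1 hive]$ printf '1000\001zhangsan\0015000.0\001lisi\002wangwu\001ptax\003200\002pension\003200\001shandong\002heze\002dingtao\002274106\n' > data/data_default.txt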
Description:
Field by field (^A separates fields, ^B separates collection items, ^C separates map keys from values):

1000                               employee number
zhangsan                           employee name
5000.0                             employee salary
lisi^Bwangwu                       subordinate employees
ptax^C200^Bpension^C200            wage deductions (e.g. tax, pension)
shandong^Bheze^Bdingtao^B274106    address (a struct only needs the values, in declaration order)
Load data
hive> load data local inpath 'data/data_default.txt' into table employee;
Loading data to table mywork.employee
Table mywork.employee stats: [numFiles=1, numRows=0, totalSize=83, rawDataSize=0]
OK
Time taken: 0.426 seconds
hive> select * from employee;
OK
1000    zhangsan    5000.0    ["lisi","wangwu"]    {"ptax":200.0,"pension":200.0}    {"province":"shandong","city":"heze","street":"dingtao","zip":274106}
Time taken: 0.114 seconds, Fetched: 1 row(s)

Complex-type columns are queried as follows:

hive> select eno, ename, salary, subordinates[0], deductions['ptax'], address.province from employee;
OK
1000    zhangsan    5000.0    lisi    200.0    shandong
Time taken: 0.129 seconds, Fetched: 1 row(s)
hive>
View HDFS data structures
[hadoop@dnode1 ~]$ hdfs dfs -ls -R /user/hive/warehouse/
drwxrw-rw-   - hadoop hadoop     0 2015-12-09 00:00 /user/hive/warehouse/mywork.db
drwxrw-rw-   - hadoop hadoop     0 2015-12-09 00:00 /user/hive/warehouse/mywork.db/employee
-rwxrw-rw-   2 hadoop hadoop    83 2015-12-09 00:00 /user/hive/warehouse/mywork.db/employee/data_default.txt
drwxrw-rw-   - hadoop hadoop     0 2015-12-08 23:03 /user/hive/warehouse/mywork.db/student
[hadoop@dnode1 ~]$ hdfs dfs -text /user/hive/warehouse/mywork.db/employee/data_default.txt
1000zhangsan5000.0lisiwangwuptax200pension200shandonghezedingtao274106
[hadoop@dnode1 ~]$
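The control characters are invisible in the -text output above. A hedged sketch to make them visible: pipe the file through cat -A, which marks non-printing characters (^A, ^B, ^C) and line ends ($). The expected output should look something like the second line:

[hadoop@dnode1 ~]$ hdfs dfs -cat /user/hive/warehouse/mywork.db/employee/data_default.txt | cat -A
1000^Azhangsan^A5000.0^Alisi^Bwangwu^Aptax^C200^Bpension^C200^Ashandong^Bheze^Bdingtao^B274106$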
Custom delimiter:
hive> create table employee_02 (
    > eno int comment 'the no of employee',
    > ename string comment 'name of employee',
    > salary float comment 'salary of employee',
    > subordinates array<string> comment 'employees managed by current employee',
    > deductions map<string,float> comment 'deductions',
    > address struct<province:string,city:string,street:string,zip:int> comment 'address'
    > ) comment 'This is table of employee info'
    > row format delimited fields terminated by '\t'
    > collection items terminated by ','
    > map keys terminated by ':'
    > lines terminated by '\n';
OK
Time taken: 0.228 seconds
hive> load data local inpath 'data/data_employee02.txt' into table employee_02;
Loading data to table mywork.employee_02
Table mywork.employee_02 stats: [numFiles=1, totalSize=99]
OK
Time taken: 0.371 seconds
hive> select * from employee_02;
OK
1000    'zhangsan'    5000.0    ["'lisi'","'wangwu'"]    {"'ptax'":200.0,"'pension'":200.0}    {"province":"'shandong'","city":"'heze'","street":"'dingtao'","zip":274106}
Time taken: 0.101 seconds, Fetched: 1 row(s)
hive>
The content of the data/data_employee02.txt file is:
[hadoop@nnode data]$ pwd
/usr/local/hive1.2.0/data
[hadoop@nnode data]$ cat data_employee02.txt
1000    'zhangsan'    5000.0    'lisi','wangwu'    'ptax':200,'pension':200    'shandong','heze','dingtao',274106
[hadoop@nnode data]$
Note: because the text file contains single quotation marks, after the load the values in the Hive table are wrapped in double quotation marks, and the single quotation marks are treated as part of the key or value. Keep this in mind.
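If the stray quotes are unwanted, one hedged workaround sketch is to strip them at query time with Hive's regexp_replace built-in (the column name is from employee_02 above; the alias is hypothetical):

hive> select regexp_replace(ename, "'", '') as ename_clean from employee_02;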
View the table definition
When the table was created with the defaults:

hive> describe formatted employee;
OK
# col_name              data_type                                                    comment
eno                     int                                                          the no of employee
ename                   string                                                       name of employee
salary                  float                                                        salary of employee
subordinates            array<string>                                                employees managed by current employee
deductions              map<string,float>                                            deductions
address                 struct<province:string,city:string,street:string,zip:int>   address

# Detailed Table Information
Database:               mywork
Owner:                  hadoop
CreateTime:             Tue Dec 08 23:10:07 CST 2015
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://cluster/user/hive/warehouse/mywork.db/employee
Table Type:             MANAGED_TABLE
Table Parameters:
        COLUMN_STATS_ACCURATE   true
        comment                 This is table of employee info
        numFiles                1
        numRows                 0
        rawDataSize             0
        totalSize               83
        transient_lastDdlTime   1449590423

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.098 seconds, Fetched: 37 row(s)

When the delimiters were defined at table creation:

hive> describe formatted employee_02;
OK
# col_name              data_type                                                    comment
eno                     int                                                          the no of employee
ename                   string                                                       name of employee
salary                  float                                                        salary of employee
subordinates            array<string>                                                employees managed by current employee
deductions              map<string,float>                                            deductions
address                 struct<province:string,city:string,street:string,zip:int>   address

# Detailed Table Information
Database:               mywork
Owner:                  hadoop
CreateTime:             Wed Dec 09 00:12:53 CST 2015
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://cluster/user/hive/warehouse/mywork.db/employee_02
Table Type:             MANAGED_TABLE
Table Parameters:
        COLUMN_STATS_ACCURATE   true
        comment                 This is table of employee info
        numFiles                1
        totalSize               99
        transient_lastDdlTime   1449591260

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        colelction.delim        ,
        field.delim             \t
        line.delim              \n
        mapkey.delim            :
        serialization.format    \t
Time taken: 0.116 seconds, Fetched: 39 row(s)
hive>
Remaining questions:
hive> delete from student;
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
hive>
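For context, a hedged sketch of what DELETE requires: since Hive 0.14, update/delete only work on ACID tables (stored as ORC, bucketed, with transactional=true) under a transaction manager such as DbTxnManager, and the plain-text student table satisfies none of this. A hypothetical transactional variant might be declared like so (table name and bucket count are illustrative only):

hive> create table student_txn (id int, name string)
    > clustered by (id) into 2 buckets
    > stored as orc
    > tblproperties ('transactional'='true');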
Note: if the SQL statement contains tab characters, problems also occur.