2025-04-02 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Source: https://www.cnblogs.com/qingyunzong/category/1191578.html
I. Data Types
1. Basic data types
Hive supports most of the basic data types found in relational databases.
Type        Description                                       Example
boolean     true/false                                        TRUE
tinyint     1-byte signed integer, -128 to 127                1Y
smallint    2-byte signed integer, -32,768 to 32,767          1S
int         4-byte signed integer                             1
bigint      8-byte signed integer                             1L
float       4-byte single-precision floating-point number     1.0
double      8-byte double-precision floating-point number     1.0
decimal     arbitrary-precision signed decimal                1.0
string      variable-length string                            'a', "b"
varchar     variable-length string with a maximum length      'a', "b"
char        fixed-length string                               'a', "b"
binary      byte array                                        (no literal form)
timestamp   timestamp with nanosecond precision               122327493795
date        date                                              '2018-04-07'
As in other SQL dialects, these type names are reserved words. Note that all of these data types are implementations of Java interfaces, so their behavior matches the corresponding Java types exactly: for example, string is backed by Java's String, float by Java's Float, and so on.
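A minimal sketch of a table definition using several of the primitive types above (the table and column names are hypothetical, for illustration only):

```sql
-- Hypothetical table exercising several Hive primitive types.
CREATE TABLE employee (
  id      INT,
  name    STRING,
  salary  DECIMAL(10,2),  -- precision and scale chosen for illustration
  active  BOOLEAN,
  hired   DATE
);
```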
2. Complex types
array: an ordered collection of elements of the same type, e.g. array(1, 2).
map: key-value pairs; the key must be a primitive type, while the value can be of any type, e.g. map('a', 1).
struct: a collection of named fields, which may be of different types, e.g. struct(1), named_struct('col1', '1', 'col2', 1, 'col3', 1.0).
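The complex types are declared in DDL with type parameters. A short sketch (the table and field names are hypothetical):

```sql
-- Hypothetical table using all three Hive complex types.
CREATE TABLE page_event (
  tags    ARRAY<STRING>,               -- ordered, same-typed elements
  props   MAP<STRING, INT>,            -- primitive keys, any-typed values
  address STRUCT<city: STRING, zip: INT>  -- named fields of mixed types
);

-- Literal values can be built with the constructor functions:
-- SELECT array(1, 2), map('a', 1), named_struct('city', 'NYC', 'zip', 10001);
```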
II. Storage Formats
Hive creates a directory on HDFS for each database it creates; the tables of a database are stored as subdirectories of it, and the data of each table is stored as files under the table's directory. The default database is the exception: it has no directory of its own, and its tables are stored directly under /user/hive/warehouse by default.
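This layout can be sketched as follows (paths assume the default warehouse location; the database and table names are hypothetical):

```sql
-- Creating a database and a table in it...
CREATE DATABASE sales;
CREATE TABLE sales.orders (id INT);

-- ...produces, by default, the HDFS directories:
--   /user/hive/warehouse/sales.db/orders/
-- whereas a table in the default database lives directly at:
--   /user/hive/warehouse/<table_name>/
```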
(1) TextFile
TextFile is the default format and uses row-oriented storage. The data is not compressed, so disk usage is high and so is the cost of parsing it during analysis.
(2) SequenceFile
SequenceFile is a binary file format provided by the Hadoop API; it is easy to use, splittable, and compressible.
SequenceFile supports three compression options: NONE, RECORD, and BLOCK. RECORD-level compression has a low compression ratio, so BLOCK compression is generally recommended.
(3) RCFile
RCFile is a storage format that combines row and column storage: data is horizontally partitioned into row groups, and within each row group the data is stored column by column.
(4) ORCFile
Data is divided into stripes by row; within each stripe the data is stored by column, and each stripe carries its own index. ORCFile is a newer format introduced by Hive, an upgraded version of RCFile with greatly improved performance; data can be stored compressed, with fast compression and fast column access.
(5) Parquet
Parquet is also a columnar storage format; it has good compression performance and can greatly reduce table-scan and deserialization time.
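The storage format is chosen per table with the STORED AS clause. A sketch covering the formats above (table names hypothetical; TEXTFILE is what you get when the clause is omitted):

```sql
CREATE TABLE logs_text (line STRING) STORED AS TEXTFILE;      -- the default
CREATE TABLE logs_seq  (line STRING) STORED AS SEQUENCEFILE;
CREATE TABLE logs_rc   (line STRING) STORED AS RCFILE;
CREATE TABLE logs_orc  (line STRING) STORED AS ORC;
CREATE TABLE logs_parq (line STRING) STORED AS PARQUET;
```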
III. Data Format
When data is stored in a text file, rows and columns must be distinguished by delimiters, and those delimiters must be declared to Hive. By default, Hive uses a few control characters that rarely appear as part of record content.
The default row and column delimiters of Hive are shown in the following table.

Delimiter      Description
\n             In a text file each line is one record, so \n separates records.
^A (Ctrl+A)    Separates fields (columns); can also be written as \001.
^B (Ctrl+B)    Separates elements of an array or struct, and separates key-value pairs in a map; can also be written as \002.
^C (Ctrl+C)    Separates a key from its value inside a map; can also be written as \003.
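These defaults correspond to the following explicit ROW FORMAT clause (table name hypothetical; writing the clause out like this is equivalent to omitting it entirely):

```sql
CREATE TABLE t (
  id    INT,
  tags  ARRAY<STRING>,
  props MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'           -- ^A between columns
  COLLECTION ITEMS TERMINATED BY '\002' -- ^B between array/struct elements
  MAP KEYS TERMINATED BY '\003'         -- ^C between a map key and its value
  LINES TERMINATED BY '\n';             -- one record per line
```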