This article walks through examples of Hive primitive and composite data types. It is fairly detailed and should be a useful reference, so interested readers are encouraged to follow along.
Primitive types
Primitive types include TINYINT, SMALLINT, INT, BIGINT, BOOLEAN, FLOAT, DOUBLE, STRING, BINARY (available from Hive 0.8.0) and TIMESTAMP (also available from Hive 0.8.0). Data of these types is easy to load: just set the column delimiter and write the data to a file with the columns separated by that delimiter.
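To make the primitive types concrete, here is a minimal sketch (not from the original article; the table and column names are invented for illustration) of a table that mixes several of them:

CREATE TABLE user_profile (
  uid BIGINT,            -- 8-byte integer
  age TINYINT,           -- 1-byte integer
  score DOUBLE,          -- double-precision floating point
  active BOOLEAN,        -- true/false
  name STRING,           -- variable-length text
  created TIMESTAMP      -- requires Hive 0.8.0 or later
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;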
Suppose there is a user login table:

CREATE TABLE login (uid BIGINT, ip STRING) PARTITIONED BY (dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

This means that the uid and ip fields of the login table are separated by the delimiter ','.
Generate data for the Hive table:

# printf "%s,%s\n" 3105007001 192.168.1.1 >> login.txt
# printf "%s,%s\n" 3105007002 192.168.1.2 >> login.txt

The contents of login.txt:

# cat login.txt
3105007001,192.168.1.1
3105007002,192.168.1.2
Load the data into the Hive table:

LOAD DATA LOCAL INPATH '/home/hadoop/login.txt' OVERWRITE INTO TABLE login PARTITION (dt='20130101');
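To confirm that the load created the expected partition, you can list the table's partitions (an optional check, not part of the original walkthrough):

SHOW PARTITIONS login;
-- dt=20130101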
View the data:

SELECT uid, ip FROM login WHERE dt='20130101';
3105007001    192.168.1.1
3105007002    192.168.1.2

Array
Suppose the login table is:

CREATE TABLE login_array (ip STRING, uid ARRAY<BIGINT>) PARTITIONED BY (dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|' STORED AS TEXTFILE;

This means that each ip in the login table has multiple users: the ip and uid fields are separated by ',', and the elements of the uid array are separated by '|'.
Generate data for the Hive table:

# printf "%s,%s|%s|%s\n" 192.168.1.1 3105007010 3105007011 3105007012 >> login_array.txt
# printf "%s,%s|%s|%s\n" 192.168.1.2 3105007020 3105007021 3105007022 >> login_array.txt

The contents of login_array.txt:

# cat login_array.txt
192.168.1.1,3105007010|3105007011|3105007012
192.168.1.2,3105007020|3105007021|3105007022
Load the data into the Hive table:

LOAD DATA LOCAL INPATH '/home/hadoop/login_array.txt' OVERWRITE INTO TABLE login_array PARTITION (dt='20130101');
View the data:

SELECT ip, uid FROM login_array WHERE dt='20130101';
192.168.1.1    [3105007010,3105007011,3105007012]
192.168.1.2    [3105007020,3105007021,3105007022]
Use array
SELECT ip, uid[0] FROM login_array WHERE dt='20130101';    -- access an array element by index
SELECT ip, size(uid) FROM login_array WHERE dt='20130101';    -- get the length of the array
SELECT ip FROM login_array WHERE dt='20130101' AND array_contains(uid, 3105007011);    -- search the array
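A related pattern worth knowing (a sketch using standard Hive syntax, not from the original article) is flattening the array into one row per element with LATERAL VIEW and explode():

SELECT ip, single_uid
FROM login_array LATERAL VIEW explode(uid) t AS single_uid
WHERE dt='20130101';
-- 192.168.1.1    3105007010
-- 192.168.1.1    3105007011
-- ...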
For more operations, see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-CollectionFunctions.
Map
Suppose the login table is:

CREATE TABLE login_map (ip STRING, uid STRING, gameinfo MAP<STRING,BIGINT>) PARTITIONED BY (dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

This means that each user in the login table has game information, and a user can have several games: the key is the game name and the value is the game score. Keys and values in the map are separated by ':', and the entries of the map are separated by '|'.
Generate data for the Hive table:

# printf "%s,%s,%s:%s|%s:%s|%s:%s\n" 192.168.1.1 3105007010 wow 10 cf 1 qqgame 2 >> login_map.txt
# printf "%s,%s,%s:%s|%s:%s|%s:%s\n" 192.168.1.2 3105007012 wow 20 cf 21 qqgame 22 >> login_map.txt

The contents of login_map.txt:

# cat login_map.txt
192.168.1.1,3105007010,wow:10|cf:1|qqgame:2
192.168.1.2,3105007012,wow:20|cf:21|qqgame:22
Load the data into the Hive table:

LOAD DATA LOCAL INPATH '/home/hadoop/login_map.txt' OVERWRITE INTO TABLE login_map PARTITION (dt='20130101');
View the data:

SELECT ip, uid, gameinfo FROM login_map WHERE dt='20130101';
192.168.1.1    3105007010    {"wow":10,"cf":1,"qqgame":2}
192.168.1.2    3105007012    {"wow":20,"cf":21,"qqgame":22}
Use map
SELECT ip, uid, gameinfo['wow'] FROM login_map WHERE dt='20130101';    -- access a map value by key
SELECT ip, uid, size(gameinfo) FROM login_map WHERE dt='20130101';    -- get the size of the map
SELECT ip, uid FROM login_map WHERE dt='20130101' AND array_contains(map_keys(gameinfo), 'wow');    -- check the map's keys to find records of users who played wow
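Maps can be flattened the same way; explode() on a map produces a key column and a value column (again a sketch using standard Hive syntax, not from the original article):

SELECT ip, uid, game, score
FROM login_map LATERAL VIEW explode(gameinfo) t AS game, score
WHERE dt='20130101';
-- 192.168.1.1    3105007010    wow    10
-- 192.168.1.1    3105007010    cf     1
-- ...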
For more operations, see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-CollectionFunctions.
Struct
Suppose the login table is:

CREATE TABLE login_struct (ip STRING, user STRUCT<uid:BIGINT, name:STRING>) PARTITIONED BY (dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|' STORED AS TEXTFILE;

user is a struct containing the user's uid and name.
Generate data for the Hive table:

# printf "%s,%s|%s\n" 192.168.1.1 3105007010 blue >> login_struct.txt
# printf "%s,%s|%s\n" 192.168.1.2 3105007012 ggjucheng >> login_struct.txt

The contents of login_struct.txt:

# cat login_struct.txt
192.168.1.1,3105007010|blue
192.168.1.2,3105007012|ggjucheng
Load the data into the Hive table:

LOAD DATA LOCAL INPATH '/home/hadoop/login_struct.txt' OVERWRITE INTO TABLE login_struct PARTITION (dt='20130101');
View the data:

SELECT ip, user FROM login_struct WHERE dt='20130101';
192.168.1.1    {"uid":3105007010,"name":"blue"}
192.168.1.2    {"uid":3105007012,"name":"ggjucheng"}
Use struct
SELECT ip, user.uid, user.name FROM login_struct WHERE dt='20130101';    -- access struct fields with dot notation
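Struct fields can also appear in the WHERE clause, for example to look up a user by name (a minimal sketch, not from the original article):

SELECT ip FROM login_struct WHERE dt='20130101' AND user.name = 'blue';
-- 192.168.1.1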
Union

Union types are rarely used, so they are not covered here.
That covers the examples of Hive primitive and composite data types. Thank you for reading, and I hope you found it helpful!