Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Typical Application scenarios of hive

2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/03 Report--

Case 1:

Demand: there is such a batch of data, now requires: each user up to the maximum number of visits in a single month and the total number of visits accumulated to that month.

Data:

User name, month, and number of visits to AMagol 2015-01recorder 5A recorder 2015-01Magol 2015-01Magol 2015-01Magol 2015-01Magol 8BMagol 2015-01Mere 25AMagol 2015-02Mere 4AMen2015-02MIT 6BJE 2015-02MIT 10BEng 2015-02MIT 2015-03MIT 16APower2015-03MIT 2015-03MIT 22BEng 2015-03MIT 23BLI 2015-03MIT 10BEne 2015-02MIT 11

End result:

Maximum number of visits per month Total visits A 2015-01 33 33 33A 2015-02 33 43 10A 2015-03 38 81 38B 2015-01 30 30 30B 2015-02 30 45 15B 2015-03 44 89 44

Resolve:

# step01 counts the total number of visits per user per month create view view_step01 as select name,month,sum (visitCount) total from t_user group by name,month # step02 (self-connection, connection condition is name) create view view_step02 as select t1.name aname,t1.month amonth,t1.total atotal,t2.name bname,t2.month bmonth,t2.total btotal from view_step01 T1 join view_step01 T2 on t1.name = t2.name # step03 remove useless data Each group finds data less than or equal to its own month select bname,bmonth,max (btotal), sum (btotal), btotalfrom view_step02where unix_timestamp (amonth,'yyyy-MM') > = unix_timestamp (bmoth,'yyyy-MM') group by aname,amonth,atotal Case 2:

# Table creation statement:

CREATE TABLE `room` (`id`int (11) NOT NULL AUTO_INCREMENT PRIMARY KEY, `sid`int (11) DEFAULT NULL, `score`varchar (25525) DEFAULT NULL, `score`DEFAULT NULL (11) ENGINE=InnoDB DEFAULT CHARSET=utf8;# insert data INSERT INTO `room`VALUES (1,1, 'yuwen', 43); INSERT INTO `room`VALUES (2,1,' shuxue', 55); INSERT INTO `room`VALUES (3,2, 'yuwen', 77); INSERT INTO `room`VALUES (4,2,' shuxue', 88) INSERT INTO `room`VALUES (5,3, 'yuwen', 98); INSERT INTO `room`VALUES (6,3,' shuxue', 65)

Demand: the student numbers of all students whose scores in math courses are higher than those in Chinese courses.

Solution: (row-column conversion)

SELECT t1.sid FROM (SELECT sid, max (CASE `room` WHEN "yuwen" THEN score ELSE 0 END) AS "yuwen", max (CASE `room`WHEN "shuxue" THEN score ELSE 0 END) AS "shuxue" FROM `room`GROUP BY sid) T1 WHERE t1.yuwen

< t1.shuxue;案例三: 需求:比如:2010012325表示在2010年01月23日的气温为25度。现在要求使用hive,计算每一年出现过的最大气温的日期+温度。 数据: 年 温度 20140101 14 20140102 16 20140103 17 20140104 10 20140105 06 20120106 09 20120107 32 20120108 12 20120109 19 20120110 23 20010101 16 20010102 12 20010103 10 20010104 11 20010105 29 20130106 19 20130107 22 20130108 12 20130109 29 20130110 23 20080101 05 现在需要根据年月进行group by 但是最终的结果需要是20080101 05,也就是说,分组字段和最后保留的字段不相同,这时怎么办? 解决: #Step1:CREATE VIEW view_step1 AS SELECTsubstr( tmp, 1, 4 ) AS YEAR,max( substr( tmp, 9, 2 ) ) AS tmp FROM tmp GROUP BY substr( tmp, 1, 4 );#Step2:SELECT b.tmp, a.tmp FROM view_step1 a JOIN tmp b ON a.YEAR = substr( b.tmp, 1, 4 ) AND a.tmp = substr( b.tmp, 9, 2 );案例四: 数据 #表示有id为1,2,3的学生选修了课程a,b,c,d,e,f中其中几门:id course 1,a 1,b 1,c 1,e 2,a 2,c 2,d 2,f 3,a 3,b 3,c 3,e 需求:编写Hive的HQL语句来实现以下结果:表中的1表示选修,表中的0表示未选修。

Solution (solution 1):

# Row conversion select id max (case when course='a' then 1 else 0 and) as a, max (case when course='b' then 1 else 0 and) as b, max (case when course='c' then 1 else 0 and) as c, max (case when course='d' then 1 else 0 and) as d, max (case when course='e' then 1 else 0 and) ase, max (case when course='f' then 1 else 0 and) as ffrom course group by id

Solution (solution 2):

# collect_set function # step01create view id_courses as select a.course acourse,b.course bcourse,b.id id (select collect_set (course) as course from course) a join (selecet id, colect_set (course) as course from course group by id) b#step02select id,case when array_contains (bcourse,acourse [0]) then 1 else 0 end as a, case when array_contains (bcourse,acourse [1]) then 1 else 0 end as b, case when array_contains (bcourse,acourse [2]) then 1 else 0 end as c Case when array_contains (bcourse,acourse [3]) then 1 else 0 end as d, case when array_contains (bcourse,acourse [4]) then 1 else 0 end ase, case when array_contains (bcourse,acourse [5]) then 1 else 0 end as ffrom id_courses

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report