Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How does HIVE count the most popular TOP3 products in each region

2025-01-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

Editor to share with you how HIVE counts the most popular TOP3 products in various regions. I hope you will get something after reading this article. Let's discuss it together.

Environment:

Hadoop-2.6.0-cdh6.7.0

Hive-1.1.0-cdh6.7.0

Sqoop-1.4.6-cdh6.7.0

MySQL5.6.39

Demand: HIVE counts the most popular TOP3 products in each region, and exports the statistical results to MySQL

There are city_info city information table and product_info commodity information table in MySQL.

There are user_click user behavior logs in HIVE, which are partitioned by date

I. MySQL database building, table building and initialization data

1.1 MySQL create ruozedata database:

CREATE DATABASE ruozedata

1.2 create a city_info table

DROP TABLE if exists city_info

CREATE TABLE `city_ info` (

`city_ id` int (11) DEFAULT NULL

`city_ name` varchar (255) DEFAULT NULL

`area` varchar (255) DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8

Initialize city_ info table data

Insert into 'city_ info` (' city_ id`, 'city_ name`, `area`) values (1), (2), (2), (3), (3), (3), (4), (5), (6), (6), (7), (7), (8), (9), (10, and 10) 'NE')

1.3Create product_info table

DROP TABLE if exists product_info

CREATE TABLE `product_ info` (

`product_ id` int (11) DEFAULT NULL

`product_ name` varchar (255) DEFAULT NULL

`extend_ info` varchar (255) DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8

Initialize product_ info table data

Insert into product_info (product_id,product_name,extend_info) values (1 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (2 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (3) "product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (4 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (5, product_status, 5, production, 5)

Insert into product_info (product_id,product_name,extend_info) values (6) "product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (7 recordings, product7 copies,'{product_status ": 1}')

Insert into product_info (product_id,product_name,extend_info) values (8, product_status, 8, product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (9 recordings production 9 minutes'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (10 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (11 recordings, product11 copies,'{product_status ": 0}')

Insert into product_info (product_id,product_name,extend_info) values (12 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (13dint product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (14, product_status, 14, product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (15, product_status, 15, production, 15: {"product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (16 recordings, product16 copies,'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (17 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (18maths, product18packs,'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (19recast, production, 19pm,'{product_status ": 1}')

Insert into product_info (product_id,product_name,extend_info) values (20 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (21 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (22 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (23mai product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (24recordings, product24copies,'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (25, "product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (26maths, product26th,'{"product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (27 recordings, product27 copies,'{product_status ": 0}')

Insert into product_info (product_id,product_name,extend_info) values (28 recordings, product28,'{"product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (29 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (30 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (31 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (32 product_status product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (33 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (34 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (35 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (36 cinematic products 36 minutes,'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (37 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (38 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (39 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (40 recordings, product40 copies,'{"product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (41 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (42 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (43 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (44 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (45 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (46 cinematic products 46 minutes'{"product_status": 1}')

Insert into product_info (product_id,product_name,extend_info) values (47 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (48 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (49 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (50, product_status, 50, production, 50: 1}')

Insert into product_info (product_id,product_name,extend_info) values (51 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (52 product_status product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (53 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (54 product_status product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (55 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (56 product_status product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (57 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (58 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (59 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (60 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (61 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (62 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (63 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (64 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (65 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (66 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (67 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (68mai / product68pm /'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (69 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (70 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (71 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (72 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (73 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (74 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (75 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (76 cinematic products 76 minutes'{"product_status": 0}')

Insert into product_info (product_id,product_name,extend_info) values (77 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (78 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (79 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (80 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (81 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (82 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (83 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (84 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (85 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (86 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (87 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (88 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (89 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (90th product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (91 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (92 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (93 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (94 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (95 product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (96 product_status product_status: 0}')

Insert into product_info (product_id,product_name,extend_info) values (97 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (98 product_status: 1}')

Insert into product_info (product_id,product_name,extend_info) values (99 product_status / product 99: 0}')

Insert into product_info (product_id,product_name,extend_info) values (100 product_status: 1}')

1.4 create MySQL database product statistical table product_stat

Drop table if exists `product_ stat`

CREATE TABLE `product_ stat` (

`product_ id` int (11) DEFAULT NULL

, `product_ name` varchar (255) DEFAULT NULL

, `area` varchar (255) DEFAULT NULL

, `click_ count` int (11)

, `rank` int

, `days` varchar (20)

) ENGINE=InnoDB DEFAULT CHARSET=utf8

Use sqoop to import data from MySQL's city_ info table and product_info table into hive (including creating tables)

2.1 check the tables in hive before you operate

Hive > show databases

OK

Default

Hive2

Hive2_ruozedata

Ruozedata

Sqoophive

Time taken: 0.06 seconds, Fetched: 5 row (s)

Hive > use ruozedata

OK

Time taken: 0.069 seconds

Hive > show tables

OK

Emp_hive

Gw_test

Hive_wc

Rating_json

Ruozedata_emp

Ruozedata_emp_managed

Ruozedata_emp_partition

Ruozedata_person

Time taken: 0.061 seconds, Fetched: 8 row (s)

Hive >

2.2 Import the city_ info table (import data from MySQL to hive's ruozedata.city_info table):

Sqoop import\

-- connect jdbc:mysql://localhost:33066/ruozedata\

-- username root\

-- password root\

-- table city_info-m 1\

-- mapreduce-job-name FromMySQLToHive\

-- delete-target-dir\

-- create-hive-table\

-- hive-table ruozedata.city_info\

-hive-import-hive-overwrite

2.3 Import the product_ info table (import data from MySQL to hive's ruozedata.product_info table):

Sqoop import\

-- connect jdbc:mysql://localhost:33066/ruozedata\

-- username root\

-- password root\

-- table product_info-m 1\

-- mapreduce-job-name FromMySQLToHive\

-- delete-target-dir\

-- create-hive-table\

-- hive-table ruozedata.product_info\

-hive-import-hive-overwrite

2.4After operation, look at the table in hive:

Hive > show tables

OK

City_info

Emp_hive

Gw_test

Hive_wc

Product_info

Rating_json

Ruozedata_emp

Ruozedata_emp_managed

Ruozedata_emp_partition

Ruozedata_person

Time taken: 0.061 seconds, Fetched: 10 row (s)

Hive > select * from city_info

OK

1 BEIJING NC

2 SHANGHAI EC

3 NANJING EC

4 GUANGZHOU SC

5 SANYA SC

6 WUHAN CC

7 CHANGSHA CC

8 XIAN NW

9 CHENGDU SW

10 HAERBIN NE

Time taken: 0.554 seconds, Fetched: 10 row (s)

Hive > select * from product_info

OK

1 product1 {"product_status": 1}

2 product2 {"product_status": 1}

3 product3 {"product_status": 1}

4 product4 {"product_status": 1}

5 product5 {"product_status": 1}

6 product6 {"product_status": 1}

7 product7 {"product_status": 1}

8 product8 {"product_status": 1}

9 product9 {"product_status": 0}

10 product10 {"product_status": 1}

11 product11 {"product_status": 0}

12 product12 {"product_status": 0}

13 product13 {"product_status": 0}

14 product14 {"product_status": 0}

15 product15 {"product_status": 1}

16 product16 {"product_status": 0}

17 product17 {"product_status": 1}

18 product18 {"product_status": 0}

19 product19 {"product_status": 1}

20 product20 {"product_status": 1}

21 product21 {"product_status": 0}

22 product22 {"product_status": 0}

23 product23 {"product_status": 0}

24 product24 {"product_status": 0}

25 product25 {"product_status": 1}

26 product26 {"product_status": 1}

27 product27 {"product_status": 0}

28 product28 {"product_status": 1}

29 product29 {"product_status": 0}

30 product30 {"product_status": 0}

31 product31 {"product_status": 0}

32 product32 {"product_status": 0}

33 product33 {"product_status": 1}

34 product34 {"product_status": 1}

35 product35 {"product_status": 0}

36 product36 {"product_status": 0}

37 product37 {"product_status": 1}

38 product38 {"product_status": 0}

39 product39 {"product_status": 0}

40 product40 {"product_status": 1}

41 product41 {"product_status": 1}

42 product42 {"product_status": 1}

43 product43 {"product_status": 1}

44 product44 {"product_status": 0}

45 product45 {"product_status": 1}

46 product46 {"product_status": 1}

47 product47 {"product_status": 0}

48 product48 {"product_status": 1}

49 product49 {"product_status": 1}

50 product50 {"product_status": 1}

51 product51 {"product_status": 1}

52 product52 {"product_status": 0}

53 product53 {"product_status": 0}

54 product54 {"product_status": 1}

55 product55 {"product_status": 0}

56 product56 {"product_status": 0}

57 product57 {"product_status": 1}

58 product58 {"product_status": 1}

59 product59 {"product_status": 1}

60 product60 {"product_status": 1}

61 product61 {"product_status": 0}

62 product62 {"product_status": 1}

63 product63 {"product_status": 1}

64 product64 {"product_status": 0}

65 product65 {"product_status": 0}

66 product66 {"product_status": 1}

67 product67 {"product_status": 1}

68 product68 {"product_status": 0}

69 product69 {"product_status": 1}

70 product70 {"product_status": 0}

71 product71 {"product_status": 0}

72 product72 {"product_status": 0}

73 product73 {"product_status": 1}

74 product74 {"product_status": 0}

75 product75 {"product_status": 1}

76 product76 {"product_status": 0}

77 product77 {"product_status": 0}

78 product78 {"product_status": 1}

79 product79 {"product_status": 0}

80 product80 {"product_status": 0}

81 product81 {"product_status": 0}

82 product82 {"product_status": 1}

83 product83 {"product_status": 1}

84 product84 {"product_status": 1}

85 product85 {"product_status": 0}

86 product86 {"product_status": 1}

87 product87 {"product_status": 1}

88 product88 {"product_status": 1}

89 product89 {"product_status": 1}

90 product90 {"product_status": 1}

91 product91 {"product_status": 1}

92 product92 {"product_status": 0}

93 product93 {"product_status": 0}

94 product94 {"product_status": 1}

95 product95 {"product_status": 0}

96 product96 {"product_status": 0}

97 product97 {"product_status": 1}

98 product98 {"product_status": 1}

99 product99 {"product_status": 0}

100 product100 {"product_status": 1}

Time taken: 0.141 seconds, Fetched: 100row (s)

Hive >

Sqoop successfully imported data from MySQL's city_ info table and product_info table into hive (including creating tables)

Third, operate the table and data in hive, and finally make statistics.

3.1Create user behavior log table user_click partition table

Create table user_click

(

User_id int

, session_id string

, action_time string

, city_id int

, product_id int

)

Partitioned by (date string)

ROW FORMAT DELIMITED

FIELDS TERMINATED BY','

Upload user_click.txt data file and import the data into the table.

Upload the user_click.txt data file to the / home/hadoop/data directory.

[hadoop@hadoop002 data] $ll

Total 62912

-rw-rw-r--. 1 hadoop hadoop 652 Jun 4 16:34 emp.txt

-rw-r--r--. 1 hadoop hadoop 84 Jun 7 09:53 hive_row_number.txt

-rw-rw-r--. 1 hadoop hadoop 34 Jun 11 15:17 hive_wc.txt

-rw-r--r--. 1 hadoop hadoop 63602280 Jun 7 09:54 rating.json

-rw-rw-r--. 1 hadoop hadoop 67 Jun 6 18:30 student.txt

-rwxrwxrwx. 1 hadoop hadoop 725264 Jun 9 21:28 user_click.txt

[hadoop@hadoop002 data] $pwd

/ home/hadoop/data

Import data into the user_ Click table:

LOAD DATA LOCAL INPATH'/ home/hadoop/data/user_click.txt' OVERWRITE INTO TABLE user_click PARTITION (date='2018-06-20')

Create a product statistics partition table product_stat in hive to write statistical results

Create table product_stat

(

Product_id int

, product_name string

, area string

, click_count int

, rank int

, days string

)

Partitioned by (date string)

ROW FORMAT DELIMITED

FIELDS TERMINATED BY'\ t'

3.4 count the most popular TOP3 products in each region and write them into the statistical table product_stat

Insert overwrite table product_stat partition (date='2018-06-20')

Select

T.product_id

, t.product_name

, t.area

, t.click_count

, t.rank

, '2018-06-20' as days

From

(

Select

Ci.area as area

, uc.product_id as product_id

, pd.product_name as product_name

, count (uc.product_id) as click_count

, (row_number () over (partition by ci.area order by count (uc.product_id) desc)) as rank

From city_info ci

Left join user_click uc on uc.city_id = ci.city_id and uc.date='2018-06-20'

Left join product_info pd on pd.product_id = uc.product_id

Group by ci.area,uc.product_id,pd.product_name

) t where t.rank select * from product_stat where date='2018-06-20'

OK

7 product7 CC 39 1 2018-06-20 2018-06-20

26 product26 CC 39 2 2018-06-20 2018-06-20

70 product70 CC 38 3 2018-06-20 2018-06-20

4 product4 EC 40 1 2018-06-20 2018-06-20

96 product96 EC 32 2 2018-06-20 2018-06-20

99 product99 EC 31 3 2018-06-20 2018-06-20

9 product9 NC 16 1 2018-06-20 2018-06-20

40 product40 NC 16 2 2018-06-20 2018-06-20

94 product94 NC 13 3 2018-06-20 2018-06-20

NULL NULL NE 01 2018-06-20 2018-06-20

67 product67 NW 201 2018-06-20 2018-06-20

56 product56 NW 20 2 2018-06-20 2018-06-20

48 product48 NW 19 3 2018-06-20 2018-06-20

38 product38 SC 35 1 2018-06-20 2018-06-20

88 product88 SC 34 2 2018-06-20 2018-06-20

33 product33 SC 34 3 2018-06-20 2018-06-20

16 product16 SW 2018-06-20 2018-06-20

95 product95 SW 19 2 2018-06-20 2018-06-20

60 product60 SW 19 3 2018-06-20 2018-06-20

Time taken: 0.345 seconds, Fetched: 19 row (s)

Hive >

4. Use sqoop to export the statistical table product_stat data of hive to the statistical table product_stat of MySQL:

. / sqoop export\

-- connect jdbc:mysql://localhost:33066/ruozedata\

-- username root\

-- password root\

-- table product_stat\

-export-dir / ruozedata_03/product_stat/date=2018-06-20\

-- input-fields-terminated-by'\ t'\

-- input-null-string'--input-null-non-string 0\

-columns "product_id,product_name,area,click_count,rank,days"\

-- update-key product_id,area-- update-mode allowinsert\

Go to MySQL to check whether the statistics have been successfully exported:

Mysql > select * from product_stat where days='2018-06-20 'order by area,rank

+-+ +

| | product_id | product_name | area | click_count | rank | days | |

+-+ +

| | 7 | product7 | CC | 39 | 1 | 2018-06-20 | |

| | 26 | product26 | CC | 39 | 2 | 2018-06-20 | |

| | 70 | product70 | CC | 38 | 3 | 2018-06-20 | |

| | 4 | product4 | EC | 40 | 1 | 2018-06-20 | |

| | 96 | product96 | EC | 32 | 2 | 2018-06-20 | |

| | 99 | product99 | EC | 31 | 3 | 2018-06-20 | |

| | 9 | product9 | NC | 16 | 1 | 2018-06-20 | |

| | 40 | product40 | NC | 16 | 2 | 2018-06-20 | |

| | 94 | product94 | NC | 13 | 3 | 2018-06-20 | |

| | 67 | product67 | NW | 20 | 1 | 2018-06-20 | |

| | 56 | product56 | NW | 20 | 2 | 2018-06-20 | |

| | 48 | product48 | NW | 19 | 3 | 2018-06-20 | |

| | 38 | product38 | SC | 35 | 1 | 2018-06-20 | |

| | 88 | product88 | SC | 34 | 2 | 2018-06-20 | |

| | 33 | product33 | SC | 34 | 3 | 2018-06-20 | |

| | 16 | product16 | SW | 20 | 1 | 2018-06-20 | |

| | 95 | product95 | SW | 19 | 2 | 2018-06-20 | |

| | 60 | product60 | SW | 19 | 3 | 2018-06-20 | |

+-+ +

18 rows in set (0.00 sec)

After reading this article, I believe you have a certain understanding of "how HIVE counts the most popular TOP3 products in various regions". If you want to know more about it, you are welcome to follow the industry information channel. Thank you for reading!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report