Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Hive on tez sql optimization

2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/03 Report--

Select t.id end_usr_id

T.end_user_name end_usr_name

T.end_user_real_name end_usr_nknm

T.end_user_password end_usr_pswd

T.end_user_birthday end_usr_bthd

T.mobile mobl_num

T.phone phone_num

T.valid_mobile_phone_num bindg_mobl_num

T.end_user_email end_usr_email

Substr (t.end_user_email

Instr (t.end_user_email,'@')

Instr (t.end_user_email,'.')-instr (t.end_user_email,'@') email_class

T.end_user_create_time creat_time

T.end_user_last_login_date end_usr_last_login_date

T.end_user_last_bought_date end_usr_last_buy_date

T.end_user_login_times end_usr_login_times

T.end_user_bought_times end_usr_buy_times

T.end_user_bought_amount end_usr_buy_amt

T.end_user_type end_usr_type

T.ip ip

T.end_user_points end_usr_intgl

T.id_card id_card

T.end_user_sex sex

T.mc_site_id mc_site_id

Case

When t.is_email_activate = 1 then

one

Else

0

End as email_verfd_flg

Case

When t.valid_mobile_phone_num is not null then

one

Else

0

End as mobl_bding_flg

Case

When t.end_user_name like'% @ sina%' then

one

When t.end_user_name like'% @ pingan%' then

two

When t.end_user_name like'% @ alipay%' and

Not (t2.user_grade in (1,2) and t2.user_grade_type = 1) then

three

When t.end_user_name like'% @ alipay%' and t2.user_grade in (1,2) and

T2.user_grade_type = 1 then

four

When t.end_user_name like'% @ 163% 'then

five

When t.end_user_name like'% @ kaixin001%' then

six

When t.end_user_name like'% @ 139% 'then

seven

When t.end_user_name like'% @ msn.com%' then

eight

When t.end_user_name like'% @ anyue%' then

nine

When t.end_user_name like'% @ qq%' then

ten

Else

0

End as commn_login_type

Case

When t.end_user_name like'% @ b2B%'or t.end_user_name = 'yuxiaolan' then

one

Else

0

End as b2b_flg

Case

When t3.send_to is not null then

one

Else

0

End as email_ret_flg

Case

When t4.email is not null then

one

Else

0

End as email_blacklist_flg

Case

When t5.car ='1' then

one

Else

0

End as car_flg

T5.shopping_habit shppg_prefr

T6.mail_level email_lvl

'' whsl_flg

'' xh_flg

'' exptn_flg

'' b2b_vip_flg

T.member_grade end_usr_grd_id

Case

When t.co_code is null then

Null

When t.co_code is not null and t7.usr_union_logon_id is null then

-999999

Else

T7.usr_union_logon_id

End usr_union_logon_id

T.co_code usr_union_logon_code

T8.exp end_usr_grow_val

From tandem.end_user t

Left outer join tandem.alipay_user t2

On t.id = t2.end_user_id

Left outer join (select distinct lower (send_to) as send_to

From tandem.yhd_send_message_blacklist) T3

On lower (t.end_user_email) = t3.send_to

Left outer join (select distinct lower (email) as email

From tandem.edm_error_email) T4

On lower (t.end_user_email) = t4.email

Left outer join tmp_end_usr_info t5

On t.id = t5.end_usr_id

Left outer join tandem.crm_user_quality t6

On t.end_user_email = t6.email

And t6.user_id is not null

Left outer join dw.usr_union_logon t7

On t.co_code = t7.usr_union_logon_code

And t7.usr_union_logon_code is not null

Left outer join tandem.member t8

On t.id = t8.end_user_id

And to_date (t8.update_time) = '2015-05-04'

Where t.update_time > = '2015-05-04'

Or length (t8.end_user_id) > 0

Slow running speed

The implementation plan is as follows:

STAGE DEPENDENCIES:

Stage-1 is a root stage

Stage-0 depends on stages: Stage-1

STAGE PLANS:

Stage: Stage-1

Tez

Edges:

Reducer 2 = '2015-05-04' or t8.end_user_id > 0; t.update_time cannot be filtered out in advance

< '2015-05-04'的数据,导致后续join产生的reduce数据过大(见上面执行计划蓝色部分),可能产生了数据倾斜,导致任务时长拉长。 去掉or t8.end_user_id>

At 0, end_user TS returns 36219736 rows

Map 7

MapOperator Tree:

TableScan

Alias: t

Statistics: Num rows: 108659208 Data size: 49700768672 Basic stats: COMPLETEColumn stats: NONE

Filter Operator

Predicate: (update_time > = '2015-05-04') (type: boolean)

Statistics: Num rows: 36219736 Data size: 16566922890 Basic stats: COMPLETEColumn stats: NONE

Reduce Output Operator

Key expressions: id (type: bigint)

Sort order: +

Map-reduce partition columns: id (type: bigint)

Statistics: Num rows: 36219736 Data size: 16566922890 Basic stats: COMPLETEColumn stats: NONE

Value expressions: end_user_name (type: string)

When or t8.end_user_id > 0 is not removed, end_user TS returns 108659208 rows

Map 7

MapOperator Tree:

TableScan

Alias: t

Statistics: Num rows: 108659208 Data size: 49700768672 Basic stats: COMPLETEColumn stats: NONE

Reduce Output Operator

Key expressions: id (type: bigint)

Sort order: +

Map-reduce partition columns: id (type: bigint)

Statistics: Num rows: 108659208 Data size: 49700768672 Basic stats: COMPLETEColumn stats: NONE

Value expressions: end_user_name (type: string), end_user_password (type: string), end_user_real_name (type: string), end_user_birthday (type: string), end_user_last_login_date (type: string), end_user_last_bought_date (type: string), end_user_login_times (type: double), end_user_bought_amount (type: double), end_user_bought_times (type: double), end_user_sex (type: double) End_user_create_time (type:string), end_user_type (type: double), ip (type:string), end_user_points (type: double), co_code (type:string), is_email_activate (type: double), mc_site_id (type: bigint), update_time (type:string), member_grade (type: double), end_user_email (type:string), mobile (type:string), phone (type:string), valid_mobile_phone_num (type: int) Id_card (type: string)

The business unit says that or t8.end_user_id > 0 can be removed, which greatly reduces the result set of reduce data.

Finally, an implementation plan for removing or t8.end_user_id > 0 is attached:

STAGE DEPENDENCIES:

Stage-1 is a root stage

Stage-0 depends on stages: Stage-1

STAGE PLANS:

Stage: Stage-1

Tez

Edges:

Reducer 2

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report