Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the practice of user behavior analysis based on ClickHouse

2025-02-22 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

ClickHouse-based user behavior analysis practice is what, many novices are not very clear about this, in order to help you solve this problem, the following editor will explain for you in detail, people with this need can come to learn, I hope you can gain something.

Preface

ClickHouse provides users with a wealth of multi-parameter aggregation functions (parametric aggregate function) and higher-order functions based on array + Lambda expressions (higher-order function), which can be used flexibly to achieve magical results. In our system, ClickHouse locates the number of clickstreams.

Path matching

By default, CK provides the sequenceMatch function to check whether any event chains satisfy the input pattern, and the sequenceCount function counts the number of event chains that satisfy the input pattern. Example:

SELECT site_id, sequenceMatch ('(? 1) (? t') AS result_chain FROM (SELECT ts_date,ts_date_time,event_type,column_type,user_idFROM ods.analytics_access_log_allWHERE ts_date > = '2020-06-30' AND ts_date 1) WHERE event_ chain.1 = 'shtOrderDone'AND has_midway_hit = 1GROUP BY result_chainORDER BY user_count DESC LIMIT 20

Briefly describe the train of thought:

Use the groupArray function to organize the tuples of the user's behavior, and use the arraySort function to sort them in ascending order of time.

Using arrayEnumerate function to obtain the subscript array of the original behavior chain

The subscript of the demarcation point in the original behavior chain is filtered by using arrayFilter and arrayDifference functions. The condition of the demarcation point is that the end of the path or the time difference is greater than the maximum interval.

The mask of the subscript array (a sequence of 0 and 1) is obtained by using the arrayMap and has functions, which is used for final segmentation. 1 represents the demarcation point.

Call the arraySplit function to split the original behavior chain into a single access behavior chain according to the demarcation point. Note that the function takes the demarcation point as the starting point of the new chain, so add 1 to the subscript of the demarcation point before.

Call the arrayJoin and arrayCompact functions to flatten the array of event chains into multiple rows and single columns, and remove adjacent duplicates.

Call the hasAll function to determine whether all the specified pass points exist. If any one of the pass points is required, use the hasAny function instead. Of course, you can also modify the WHERE predicate to exclude the specified pass point.

The final result is sorted into a readable string, and the user base is counted according to the behavior chain.

Ordered funnel transformation

CK provides the windowFunnel function to implement the funnel, which slides the window to match the chain of events sequentially for a specified length of time (in seconds) and returns the number of steps converted within the window. If there are multiple matches, the one with the largest number of steps (the deepest conversion) shall prevail.

By counting the number of steps, the conversion rate of each step in the funnel can be obtained. The SQL statement is as follows, and the query results are sensitive data and are no longer posted.

SELECT level,user_count,conv_rate_percentFROM (SELECT level, uniqCombined (user_id) AS user_count, neighbor (user_count,-1) AS prev_user_count,if (prev_user_count = 0,-1, round (user_count / prev_user_count * 100,3) AS conv_rate_percentFROM (SELECT user_id, windowFunnel (900) (ts_date_time, event_type = 'shtLogon' Event_type = 'shtKkClick' AND column_type =' homePage', event_type = 'shtOpenGoodsDetail', event_type =' shtAddCart', event_type = 'shtOrderDone') AS levelFROM (SELECT ts_date,ts_date_time,event_type,column_type,user_id FROM ods.analytics_access_log_all WHERE ts_date > =' 2020-06-30 'AND ts_date 0 GROUP BY level ORDER BY level ASC)

What if you want to be more accurate and implement the field correlation between the funnel steps (such as the item details → joining the item ID association in the three steps of placing an order in the shopping cart →)? You can take advantage of the xFunnel function proposed in the https://github.com/housepower/olap2018 project. It is the ancestor of the windowFunnel function, but you need to modify the ClickHouse source code and recompile it, and you will simply write the process if you have time in the future.

User retention

The retention function makes it easy to calculate retention. The function accepts multiple conditions, takes the result of the first condition as the basis, observes whether the following conditions are satisfied, sets 1 if it is satisfied, sets 0 if it is not satisfied, and finally returns an array of 0 and 1. By counting the number of 1, the retention rate can be calculated.

The following SQL statement calculates the repeat order rate for the next day and the seven-day repeat order rate (the semantics are the same as retention).

SELECT sum (ret [1]) AS original, sum (ret [2]) AS next_day_ret, round (next_day_ret / original * 100,3) AS next_day_ratio, sum (ret [3]) AS seven_day_ret, round (seven_day_ret / original * 100,3) AS seven_day_ratioFROM (WITH toDate ('2020-06-24') AS first_date SELECTuser_id, retention (ts_date = first_date) Ts_date = first_date + INTERVAL 1 DAY, ts_date = first_date + INTERVAL 7 DAY) AS ret FROM ods.ms_order_done_all WHERE ts_date > = first_date AND ts_date x > 1800, times_diff) AS session_gaps FROM ods.analytics_access_log_all WHERE ts_date > = '2020-06-30' GROUP BY ts_date,user_id) GROUP BY ts_date Is it helpful for you to read the above content? If you want to know more about the relevant knowledge or read more related articles, please follow the industry information channel, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report