2025-01-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article covers "the method and steps of calculating the maximum continuous active days", a problem many people run into in real-world work. Let's walk through how to handle it.
This time we first cover the second solution to the maximum number of consecutive active days, and then look at some other common SQL interview questions.
This method uses row_number to give each row a consecutive sequence number within each user, ordered by date. Subtracting the row number from the date yields a reference value: within a run of consecutive active days, the date and the row number both increase by one per row, so the difference stays the same. Grouping by user and by this difference and counting the days in each group gives the length of each run, and the maximum over a user's groups is that user's maximum number of consecutive active days. This works provided each user has at most one row per day.
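To see why the difference stays constant inside a run, here is a minimal Python sketch of the same idea. It is an illustration only, not the article's Hive query: the function name max_consecutive_days and the small sample list are made up for this example, and dates are deduplicated per user before numbering.

```python
from collections import defaultdict
from datetime import date, timedelta


def max_consecutive_days(rows):
    """rows: iterable of (uid, 'YYYY-MM-DD') pairs (hypothetical input shape)."""
    days_by_uid = defaultdict(set)
    for uid, dt in rows:
        days_by_uid[uid].add(date.fromisoformat(dt))  # a set removes duplicate days

    result = {}
    for uid, days in days_by_uid.items():
        run_lengths = defaultdict(int)
        # enumerate(..., start=1) plays the role of row_number() ordered by date
        for rn, day in enumerate(sorted(days), start=1):
            anchor = day - timedelta(days=rn)  # same anchor for every day of one run
            run_lengths[anchor] += 1
        result[uid] = max(run_lengths.values())
    return result


sample = [
    (100, "2021-04-01"), (100, "2021-04-02"), (100, "2021-04-03"),
    (101, "2021-04-01"), (101, "2021-04-02"), (101, "2021-04-04"),
]
print(max_consecutive_days(sample))  # {100: 3, 101: 2}
```

For user 101 the first two days share one anchor, while the day after the gap gets a different anchor, which is what splits the runs.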
The SQL is as follows:
WITH active AS (
  SELECT 100 UID, '2021-04-01' dt UNION ALL
  SELECT 101 UID, '2021-04-01' dt UNION ALL
  SELECT 102 UID, '2021-04-01' dt UNION ALL
  SELECT 103 UID, '2021-04-01' dt UNION ALL
  SELECT 100 UID, '2021-04-02' dt UNION ALL
  SELECT 101 UID, '2021-04-02' dt UNION ALL
  SELECT 102 UID, '2021-04-02' dt UNION ALL
  SELECT 103 UID, '2021-04-02' dt UNION ALL
  SELECT 104 UID, '2021-04-02' dt UNION ALL
  SELECT 100 UID, '2021-04-03' dt UNION ALL
  SELECT 104 UID, '2021-04-03' dt UNION ALL
  SELECT 101 UID, '2021-04-04' dt UNION ALL
  SELECT 102 UID, '2021-04-04' dt UNION ALL
  SELECT 103 UID, '2021-04-04' dt UNION ALL
  SELECT 104 UID, '2021-04-04' dt UNION ALL
  SELECT 105 UID, '2021-04-04' dt UNION ALL
  SELECT 102 UID, '2021-04-03' dt
)
SELECT UID, max(days) days
FROM (
  -- one group per run of consecutive days: date minus row number is constant within a run
  SELECT date_sub(dt, rn), UID, count(DISTINCT dt) days
  FROM (
    -- number each user's active dates in ascending order
    SELECT UID, dt, row_number() OVER (PARTITION BY UID ORDER BY dt) rn
    FROM active
  ) x
  GROUP BY date_sub(dt, rn), UID
) y
GROUP BY UID
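Run against the sample data above, the query returns the following (row order may vary):
UID  days
100  3
101  2
102  4
103  2
104  3
105  1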
That wraps up the consecutive active days problem. Let's look at another frequently asked question:
2. Row-column conversion
(1) A live broadcast product has a live broadcast record table lives with fields liveid (live broadcast id) and invite_list (invitation list, a string of user ids separated by ","), and a viewing record table record with fields uid (user id), liveid (live broadcast id) and duration (viewing time). Find the viewing time of every user on the invitation list.
Because some invited users never enter the room, the viewing records alone may not cover everyone on the list. We therefore take the invitation list in the lives table as the baseline, which requires explode and lateral view to turn the string into one row per invited user.
We again use WITH to construct two simple temporary tables, lives and record. The SQL is as follows:
with lives as (
  select 100 liveid, 'A01,A02,A03,A04' invite_list union all
  select 101 liveid, 'A05,A06,A07,A08' invite_list
),
record as (
  select 100 liveid, 'A01' uid, 30 duration union all
  select 100 liveid, 'A02' uid, 50 duration union all
  select 100 liveid, 'A03' uid, 15 duration union all
  select 101 liveid, 'A07' uid, 20 duration union all
  select 101 liveid, 'A08' uid, 60 duration
)
select list.liveid,
       list.ulist,
       -- invited users with no viewing record get a duration of 0
       if(record.duration is null, 0, record.duration) duration
from (
  -- split the invitation list and expand it into one row per invited user
  select liveid, ulist
  from lives
  lateral view explode(split(invite_list, ',')) uid as ulist
) list
left join record
  on list.liveid = record.liveid and list.ulist = record.uid
A brief explanation: lateral view applies the explode function to each row of the original table, splitting an array or map field into multiple rows that form a virtual table, and then joins those rows back to the other fields of the source row.
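The same expand-then-left-join logic can be sketched in plain Python. This is only an illustration under stated assumptions: the function name invited_durations is hypothetical, and the lookup assumes at most one viewing record per (liveid, uid) pair, whereas a SQL left join would emit one row per matching record.

```python
def invited_durations(lives, record):
    """lives: list of (liveid, invite_list); record: list of (liveid, uid, duration)."""
    # Index viewing records by (liveid, uid); assumes one record per pair.
    watched = {(liveid, uid): duration for liveid, uid, duration in record}

    rows = []
    for liveid, invite_list in lives:
        # split + explode: one output row per invited user
        for uid in invite_list.split(","):
            # left join: invited users without a record default to 0
            rows.append((liveid, uid, watched.get((liveid, uid), 0)))
    return rows


lives = [(100, "A01,A02,A03,A04")]
record = [(100, "A01", 30), (100, "A03", 15)]
for row in invited_durations(lives, record):
    print(row)
# (100, 'A01', 30)
# (100, 'A02', 0)
# (100, 'A03', 15)
# (100, 'A04', 0)
```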
(2) An app's page visit table record contains fields uid (user id) and pageid (page id). Output each user's access path in a single field.
The previous question converts a column into rows; this one converts rows into a column. With collect_list, you can aggregate all the values of a field within each group into an array (collect_set does the same but removes duplicates, which would drop repeated page visits), and then use the concat_ws function to turn the array into a string joined with ">".
WITH record AS (
  SELECT 100 UID, 'A01' pageid UNION ALL
  SELECT 100 UID, 'A02' pageid UNION ALL
  SELECT 100 UID, 'A03' pageid UNION ALL
  SELECT 101 UID, 'A07' pageid UNION ALL
  SELECT 101 UID, 'A08' pageid
)
SELECT UID, concat_ws('>', collect_list(pageid)) page
FROM record
GROUP BY UID
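As a rough illustration of what collect_list followed by concat_ws does, here is a small Python sketch. The function name access_paths is made up for this example. Note one difference: the Python version keeps pages in input order, while in Hive the order inside collect_list is generally not guaranteed unless the rows are sorted explicitly, so a real visit table would likely need a timestamp column.

```python
from collections import defaultdict


def access_paths(record, sep=">"):
    """record: iterable of (uid, pageid) pairs in visit order."""
    pages_by_uid = defaultdict(list)
    for uid, pageid in record:
        pages_by_uid[uid].append(pageid)  # like collect_list: duplicates are kept
    # like concat_ws: join each user's list into one string
    return {uid: sep.join(pages) for uid, pages in pages_by_uid.items()}


visits = [
    (100, "A01"), (100, "A02"), (100, "A03"),
    (101, "A07"), (101, "A08"),
]
print(access_paths(visits))  # {100: 'A01>A02>A03', 101: 'A07>A08'}
```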
3. Calculating retention
For most Internet companies, retention is a top priority at all times, so SQL questions about retention come up very often in data analysis interviews.
"Retention" is usually defined in terms of an initial behavior and a subsequent behavior: among the users who performed the initial behavior, the proportion who perform the subsequent behavior N days later is the retention rate. By extension, concepts such as return visits and repeat views work the same way. For example:
An app has a new-user registration table user with fields uid (user id) and dt (registration date), and an activity table active with fields uid (user id) and dt (active date). For each day's new users, calculate the next-day (1-day), 2-day, 3-day, and so on up to 7-day retention.
Here, registration is the initial behavior and being active is the subsequent behavior. We can left join the initial behavior to the subsequent behavior and then aggregate by registration date to get the number of new users and the number of users retained on day N. Dividing the latter by the former gives the retention rate.
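The join-and-aggregate logic described above can be sketched in Python as follows. This is a minimal illustration, not the article's query: the function name retention_by_reg_date, the max_day parameter, and the tiny sample data are all assumptions made for the example. Sets stand in for count(distinct ...).

```python
from collections import defaultdict
from datetime import date


def retention_by_reg_date(users, active, max_day=7):
    """users / active: iterables of (uid, 'YYYY-MM-DD') pairs."""
    active_days = defaultdict(set)
    for uid, dt in active:
        active_days[uid].add(date.fromisoformat(dt))

    new_users = defaultdict(set)                       # reg date -> distinct new uids
    retained = defaultdict(lambda: defaultdict(set))   # reg date -> N -> distinct uids
    for uid, dt in users:
        reg = date.fromisoformat(dt)
        new_users[dt].add(uid)
        for day in active_days.get(uid, ()):           # "left join" on uid
            n = (day - reg).days                       # like datediff(active.dt, user.dt)
            if 1 <= n <= max_day:
                retained[dt][n].add(uid)

    return {
        dt: {"new_user": len(uids),
             **{f"retain{n}": len(retained[dt][n]) for n in range(1, max_day + 1)}}
        for dt, uids in new_users.items()
    }


users = [(100, "2021-01-01"), (101, "2021-01-01")]
active = [(100, "2021-01-01"), (100, "2021-01-02"),
          (101, "2021-01-01"), (101, "2021-01-03")]
print(retention_by_reg_date(users, active, max_day=2))
# {'2021-01-01': {'new_user': 2, 'retain1': 1, 'retain2': 1}}
```

Dividing each retainN value by new_user would turn these counts into rates.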
The SQL is as follows:
-- if your Hive version treats user as a reserved word, quote it with backticks
WITH user AS (
  SELECT 100 uid, '2021-01-01' dt UNION ALL
  SELECT 101 uid, '2021-01-01' dt UNION ALL
  SELECT 102 uid, '2021-01-01' dt UNION ALL
  SELECT 104 uid, '2021-01-02' dt UNION ALL
  SELECT 104 uid, '2021-01-02' dt UNION ALL
  SELECT 105 uid, '2021-01-02' dt UNION ALL
  SELECT 106 uid, '2021-01-02' dt
),
active AS (
  SELECT 100 uid, '2021-01-01' dt UNION ALL
  SELECT 100 uid, '2021-01-02' dt UNION ALL
  SELECT 100 uid, '2021-01-03' dt UNION ALL
  SELECT 100 uid, '2021-01-05' dt UNION ALL
  SELECT 100 uid, '2021-01-07' dt UNION ALL
  SELECT 101 uid, '2021-01-01' dt UNION ALL
  SELECT 101 uid, '2021-01-07' dt UNION ALL
  SELECT 102 uid, '2021-01-01' dt UNION ALL
  SELECT 103 uid, '2021-01-01' dt UNION ALL
  SELECT 103 uid, '2021-01-02' dt UNION ALL
  SELECT 103 uid, '2021-01-03' dt UNION ALL
  SELECT 103 uid, '2021-01-05' dt UNION ALL
  SELECT 104 uid, '2021-01-02' dt UNION ALL
  SELECT 104 uid, '2021-01-03' dt UNION ALL
  SELECT 104 uid, '2021-01-04' dt UNION ALL
  SELECT 105 uid, '2021-01-02' dt UNION ALL
  SELECT 105 uid, '2021-01-03' dt UNION ALL
  SELECT 105 uid, '2021-01-04' dt UNION ALL
  SELECT 105 uid, '2021-01-05' dt UNION ALL
  SELECT 105 uid, '2021-01-07' dt UNION ALL
  SELECT 106 uid, '2021-01-02' dt UNION ALL
  SELECT 106 uid, '2021-01-04' dt
)
SELECT user.dt,
       count(DISTINCT user.uid) new_user,
       -- retainN: distinct users active exactly N days after registration
       count(DISTINCT if(datediff(active.dt, user.dt) = 1, active.uid, NULL)) retain1,
       count(DISTINCT if(datediff(active.dt, user.dt) = 2, active.uid, NULL)) retain2,
       count(DISTINCT if(datediff(active.dt, user.dt) = 3, active.uid, NULL)) retain3,
       count(DISTINCT if(datediff(active.dt, user.dt) = 4, active.uid, NULL)) retain4,
       count(DISTINCT if(datediff(active.dt, user.dt) = 5, active.uid, NULL)) retain5,
       count(DISTINCT if(datediff(active.dt, user.dt) = 6, active.uid, NULL)) retain6,
       count(DISTINCT if(datediff(active.dt, user.dt) = 7, active.uid, NULL)) retain7
FROM user
LEFT JOIN active ON user.uid = active.uid
GROUP BY user.dt
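Run against the sample data above, the query returns the following (row order may vary):
dt          new_user  retain1  retain2  retain3  retain4  retain5  retain6  retain7
2021-01-01  3         1        1        0        1        0        2        0
2021-01-02  3         2        3        1        0        1        0        0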
This approach can be tedious, since every retention metric needs its own column. To avoid that, first compute the number of new users for each registration date. Then join the new-user table with the active table to get, for each registration date, the number of active users on each subsequent day. Finally, join the new-user counts with these active counts to obtain retention for any period.
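The alternative just described produces one row per (registration date, day offset) instead of one column per metric. A minimal Python sketch of that shape is below. The function name retention_long and the output tuple layout are assumptions for illustration, and the sketch assumes each uid has a single registration date.

```python
from collections import Counter, defaultdict
from datetime import date


def retention_long(users, active):
    """Return sorted (reg_date, n, retained_users, retention_rate) tuples."""
    # Step 1: registration date per user, and new-user count per date.
    reg_date = {uid: date.fromisoformat(dt) for uid, dt in users}
    new_count = Counter(d.isoformat() for d in reg_date.values())

    # Step 2: join new users with activity; collect distinct uids per (date, offset).
    retained = defaultdict(set)
    for uid, dt in active:
        if uid in reg_date:                          # only users present in the user table
            n = (date.fromisoformat(dt) - reg_date[uid]).days
            if n >= 1:                               # skip the registration day itself
                retained[(reg_date[uid].isoformat(), n)].add(uid)

    # Step 3: join the counts back to the new-user totals to get a rate.
    return sorted(
        (reg, n, len(uids), len(uids) / new_count[reg])
        for (reg, n), uids in retained.items()
    )


users = [(100, "2021-01-01"), (101, "2021-01-01")]
active = [(100, "2021-01-01"), (100, "2021-01-02"),
          (101, "2021-01-01"), (101, "2021-01-03")]
for row in retention_long(users, active):
    print(row)
# ('2021-01-01', 1, 1, 0.5)
# ('2021-01-01', 2, 1, 0.5)
```

Any retention period can then be read off by filtering on the offset, with no need to add a column per period.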
This concludes "the method and steps of calculating the maximum continuous active days". Thank you for reading.