
How to Use LATERAL Joins in PostgreSQL


This article introduces how to use LATERAL joins in PostgreSQL. The walkthrough is quite detailed, and I hope interested readers will find it a helpful reference.

PostgreSQL 9.3 shipped a new join type! LATERAL joins arrived with little fanfare, but they make possible powerful new queries that previously required procedural programming. In this article, I will introduce a funnel analysis that could not be expressed in PostgreSQL 9.2.

What is a LATERAL join?

The best description of it is at the bottom of the documentation's list of optional FROM clause entries:

The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)

...

When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the LATERAL item is evaluated using that row or row set's values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).

This is a bit dense. Loosely, you can think of a LATERAL join as a SQL foreach loop, in which PostgreSQL iterates over each row in a result set and evaluates a subquery using that row as a parameter.
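For intuition, here is a minimal sketch (a toy example invented for this article, not part of the analysis below). The LATERAL subquery on the right is evaluated once for each of the three rows produced on the left, and it can reference t.x:

SELECT t.x, s.y
FROM (VALUES (1), (2), (3)) AS t(x),
LATERAL (SELECT t.x * 10 AS y) AS s;

This returns the pairs (1, 10), (2, 20), and (3, 30).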

What can we do with this?

Take a look at the following table structure used to record click events:

CREATE TABLE event (
  user_id BIGINT,
  event_id BIGINT,
  time BIGINT NOT NULL,
  data JSON NOT NULL,
  PRIMARY KEY (user_id, event_id)
)

Each event is associated with a user and has an ID, a timestamp, and a JSON blob of event properties. At Heap, these properties might include the DOM hierarchy of a click, the window title, the session referrer, and so forth.
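For concreteness, a couple of illustrative rows might look like the following (the IDs, timestamps, and minimal JSON payloads are invented for this example):

INSERT INTO event VALUES
  (567, 1, 5234567890, '{"type": "view_homepage", "title": "Home"}'),
  (567, 2, 5839367890, '{"type": "enter_credit_card"}');

Note that time is stored as epoch milliseconds, which is why the queries below add offsets like 1000*60*60*24*14 (two weeks).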

Suppose we want to optimize our signup flow to get more registrations. The first step is to figure out where in our funnel we are losing users.

Example: funnel conversion rates between steps of a signup flow.

Suppose we have instrumented our frontend to log events along this flow, and all the data is saved to the event table above. [1] The initial question is: how many people viewed our homepage, and what percentage of them entered credit card information within two weeks of that view? On an older version of PostgreSQL, we would have to write custom functions in PL/pgSQL, PostgreSQL's built-in procedural language. In 9.3, we can use a LATERAL join and compute the result in a single query, with no extensions or PL/pgSQL.

SELECT
  user_id,
  view_homepage,
  view_homepage_time,
  enter_credit_card,
  enter_credit_card_time
FROM (
  -- Get the first time each user viewed the homepage.
  SELECT
    user_id,
    1 AS view_homepage,
    min(time) AS view_homepage_time
  FROM event
  WHERE
    data->>'type' = 'view_homepage'
  GROUP BY user_id
) e1 LEFT JOIN LATERAL (
  -- For each row, get the first time the user_id did the enter_credit_card
  -- event, if one exists within two weeks of view_homepage_time.
  SELECT
    1 AS enter_credit_card,
    time AS enter_credit_card_time
  FROM event
  WHERE
    user_id = e1.user_id AND
    data->>'type' = 'enter_credit_card' AND
    time BETWEEN view_homepage_time AND (view_homepage_time + 1000*60*60*24*14)
  ORDER BY time
  LIMIT 1
) e2 ON true

Nobody likes a 30-line SQL query, so let's break it into pieces and analyze them. The first piece is plain vanilla SQL:

SELECT
  user_id,
  1 AS view_homepage,
  min(time) AS view_homepage_time
FROM event
WHERE data->>'type' = 'view_homepage'
GROUP BY user_id

That is, get the time at which each user first triggered the view_homepage event. Our LATERAL join then lets us iterate over each row of that result set and evaluate a parameterized subquery. It is equivalent to executing this query once for each row of the result set:

SELECT
  1 AS enter_credit_card,
  time AS enter_credit_card_time
FROM event
WHERE
  user_id = e1.user_id AND
  data->>'type' = 'enter_credit_card' AND
  time BETWEEN view_homepage_time AND (view_homepage_time + 1000*60*60*24*14)
ORDER BY time
LIMIT 1

That is, for each user, get the time at which they first triggered the enter_credit_card event within two weeks of their view_homepage_time. Because this is a LATERAL join, the subquery can reference the view_homepage_time column computed by the previous subquery. Otherwise, each subquery would be evaluated independently and could not access results computed by the other.
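To make the per-row behavior concrete, you can imagine the engine substituting one outer row's values into the subquery. With the hypothetical values user_id = 567 and view_homepage_time = 5234567890 from the sample output below, the iteration for that row is effectively:

SELECT
  1 AS enter_credit_card,
  time AS enter_credit_card_time
FROM event
WHERE
  user_id = 567 AND
  data->>'type' = 'enter_credit_card' AND
  time BETWEEN 5234567890 AND (5234567890 + 1000*60*60*24*14)
ORDER BY time
LIMIT 1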

We then wrap the whole thing in a SELECT, which returns something like this:

 user_id | view_homepage | view_homepage_time | enter_credit_card | enter_credit_card_time
---------+---------------+--------------------+-------------------+------------------------
     567 |             1 |         5234567890 |                 1 |             5839367890
     234 |             1 |         2234567890 |                   |
     345 |             1 |         3234567890 |                   |
     456 |             1 |         6234567890 |                   |
     678 |             1 |         1234567890 |                   |

Because this is a left join, rows with a view_homepage event but no matching enter_credit_card event still appear in the result set. If we sum up all of the numeric columns, we get a clean summary of the funnel:

SELECT
  sum(view_homepage) AS viewed_homepage,
  sum(enter_credit_card) AS entered_credit_card
FROM (
  -- Get the first time each user viewed the homepage.
  SELECT
    user_id,
    1 AS view_homepage,
    min(time) AS view_homepage_time
  FROM event
  WHERE
    data->>'type' = 'view_homepage'
  GROUP BY user_id
) e1 LEFT JOIN LATERAL (
  -- For each (user_id, view_homepage_time) tuple, get the first time that
  -- user did the enter_credit_card event, if one exists within two weeks.
  SELECT
    1 AS enter_credit_card,
    time AS enter_credit_card_time
  FROM event
  WHERE
    user_id = e1.user_id AND
    data->>'type' = 'enter_credit_card' AND
    time BETWEEN view_homepage_time AND (view_homepage_time + 1000*60*60*24*14)
  ORDER BY time
  LIMIT 1
) e2 ON true

...which outputs:

 viewed_homepage | entered_credit_card
-----------------+---------------------
             827 |                  10

We can fill in intermediate steps of this funnel with more LATERAL joins to see where in the flow we should focus our improvements. Let's add a use_demo step to the query, between viewing the homepage and entering credit card information.

SELECT
  sum(view_homepage) AS viewed_homepage,
  sum(use_demo) AS use_demo,
  sum(enter_credit_card) AS entered_credit_card
FROM (
  -- Get the first time each user viewed the homepage.
  SELECT
    user_id,
    1 AS view_homepage,
    min(time) AS view_homepage_time
  FROM event
  WHERE
    data->>'type' = 'view_homepage'
  GROUP BY user_id
) e1 LEFT JOIN LATERAL (
  -- For each row, get the first time the user_id did the use_demo
  -- event, if one exists within one week of view_homepage_time.
  SELECT
    user_id,
    1 AS use_demo,
    time AS use_demo_time
  FROM event
  WHERE
    user_id = e1.user_id AND
    data->>'type' = 'use_demo' AND
    time BETWEEN view_homepage_time AND (view_homepage_time + 1000*60*60*24*7)
  ORDER BY time
  LIMIT 1
) e2 ON true LEFT JOIN LATERAL (
  -- For each row, get the first time the user_id did the enter_credit_card
  -- event, if one exists within one week of use_demo_time.
  SELECT
    1 AS enter_credit_card,
    time AS enter_credit_card_time
  FROM event
  WHERE
    user_id = e2.user_id AND
    data->>'type' = 'enter_credit_card' AND
    time BETWEEN use_demo_time AND (use_demo_time + 1000*60*60*24*7)
  ORDER BY time
  LIMIT 1
) e3 ON true

This will output:

 viewed_homepage | use_demo | entered_credit_card
-----------------+----------+---------------------
             827 |      220 |                  86

This gives us a three-step funnel: from viewing the homepage, to using the demo within one week of that, to entering credit card information within one week of that. From here, the expressive power of PostgreSQL lets us drill deep into these results and analyze the performance of our website as a whole. We might follow up with questions like:

Does using the demo increase the likelihood of signing up?

Do users who discover our homepage via advertising convert at the same rate as users from other referrers?

How do conversion rates vary across different A/B test variants?

The answers to these questions directly inform product improvements, and they can now be found inside the PostgreSQL database, because it supports LATERAL joins.

Without LATERAL joins, we could only do these analyses with the help of PL/pgSQL. Or, if our dataset were small, we might get by with complex, inefficient queries. In an exploratory-analytics scenario, you might just pull the data out of PostgreSQL and analyze it with the scripting language of your choice. But there are compelling reasons to express these questions in SQL, especially if you want to wrap the whole thing in an understandable UI and ship the feature to non-technical users.

Note that these queries can be tuned to be more efficient. In this example, if we create a btree index on (user_id, (data->>'type'), time), we can evaluate each funnel step for each user with a single index lookup. If you are using SSDs, on which seeks are cheap, this may well be good enough.
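As a sketch, that index could be created as follows (the index name here is arbitrary):

CREATE INDEX event_user_type_time_idx
  ON event (user_id, (data->>'type'), time);

The extra parentheses around data->>'type' are required because it is an expression rather than a plain column.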

That's all for how to use LATERAL joins in PostgreSQL. I hope the content above has been helpful and taught you something new. If you found the article worthwhile, please share it so more people can see it.
