In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-05 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
Today, I will talk to you about the implementation of the pull list in Greenplum database. Many people may not know much about it. In order to let you know more, Xiaobian summarizes the following contents for you. I hope you can gain something according to this article.
History zipper table is a data model, mainly defined for the way data is stored in tables in data warehouse design; as the name implies, the so-called history zipper table is to record all the changes in a transaction from the beginning to the current state, zipper table can avoid the mass storage problem caused by storing all records every day, colleagues are also a common way to deal with slowly changing data.
I. Concept
In a zippered table, each piece of data has an effective date (sdate) and an expiration date (date). Suppose two new users are added in a user table on October 8, 2019, the effective time of these two records is the same day. Since these two records have not been modified until October 8, 2019, the expiration time is infinite, which is set to the maximum value in the database (2999-12-31), as shown in the figure:
The next day (2019-10-09), user 1001 was deleted and the phone number of user 1002 was modified to 165000006. In order to preserve the historical state, the expiration time of user 1001 is modified to 2019-10-09, and user 1002 becomes two records, as shown in the figure:
On the third day (October 10, 2019), user 1003 is added, and the user table data is as shown in the figure:
If you want to query the latest data, just query the data with the expiration time of 2999-12-31; if you want to query the historical data of October 8, filter the data with the effective time of 2019-10-08; if you query the data of October 9, filter the effective time of 2019-10-09; and so on.
II. Creation of tables
The temporary source table T_FIN_ACCTION_SRC receives data pushed from tables of other databases (such as oracle). The table structure is consistent with that of the source database.
--Source table
create table T_FIN_ACCTION_SRC(
eNo varchar(6),
eName varchar(10),
ePhone varchar(11),
eData_date date
);
The target table (i.e. zipper table) T_FIN_ACTION_TAR, here note that the zipper table changes the time field of the source table to the effective time and expiration time.
--Zipper watch
create table T_FIN_ACCTION_TAR(
eNo varchar(6),
eName varchar(10),
ePhone varchar(11),
sdate date,
edate date
);
3. Creation of stored procedures
Here, for ease of reading and code writing, first write the overall stored procedure architecture, and then we add code step by step:
--Pass in the current time (you can also pass in yesterday's time, oh, improvise, if the incoming time is today, then use the time minus one, because we have to process yesterday's data)
create or replace function My_FIN_GL_SUBJECT_PRO(IN P_TODAY VARCHAR)
returns void
as $$
declare
begin
--1. If there is no such primary key in the target table, it is determined as New-New
--2. If the ID is not found in the source table, link-delete
--3. Modification
--3.1 Closed chain: There are records with this primary key in the target table. The status values are different. The update end date is the current day.
--3.2 Open Chain: Add a modified data item in the target table, and the update end date is infinite
end;
$$
language plpgsql;
IV. Process realization of zipper
1. If there is no such primary key in the target table, it is determined as New-New
insert into gplcydb.public.T_FIN_ACCTION_TAR(eNo,eName,ePhone,sdate,edate)
select s.eNo,s.eName,s.ePhone,s.eData_date,to_date('2999-12-31','yyyy-mm-dd')
from gplcydb.public.T_FIN_ACCTION_SRC s
where s.eData_date=(to_date(P_TODAY,'yyyy-mm-dd') - 1)
and not exists(
select 1 from gplcydb.public.T_FIN_ACCTION_TAR t
where
s.eNo=t.eNo
and s.eName=t.eName
and s.ePhone=t.ePhone
);
2. If the ID is not found in the source table, link-delete
update gplcydb.public.T_FIN_ACCTION_TAR a set edate=(to_date(P_TODAY,'yyyy-mm-dd')-1)
where not exists(
select 1 from gplcydb.public.T_FIN_ACCTION_SRC s
where
s.eNo=a.eNo
and a.edate=to_date('2999-12-31', 'yyyy-mm-dd')
);
3. modify
3.1 Closed chain: there are records with this primary key in the target table. The status values are different. The update end date is the current day.
update gplcydb.public.T_FIN_ACCTION_TAR b set edate=(to_date(P_TODAY,'yyyy-mm-dd')-1)
where b.edate=to_date('2999-12-31','yyyy-mm-dd')
and exists(
select 1 from gplcydb.public.T_FIN_ACCTION_SRC s
where
s.eNo = b.eNo and b.sdate < (to_date(P_TODAY,'yyyy-mm-dd')-1)
and (
s.eName b.eName or s.ePhone b.ePhone
)
);
3.2 Open Chain: Add a new piece of modified data in the target table, and the update end date is infinite
insert into gplcydb.public.T_FIN_ACCTION_TAR(eNo,eName,ePhone,sdate,edate)
select s.eNo,s.eName,s.ePhone,(to_date(P_TODAY,'yyyy-mm-dd') - 1),to_date('2999-12-31','yyyy-mm-dd')
from gplcydb.public.T_FIN_ACCTION_SRC s
where
s.eData_date=(to_date(P_TODAY,'yyyy-mm-dd') - 1)
and exists( --Handle new cases of broken data chain
select 1 from (
select eNo,sdate,max(edate) end_date
from gplcydb.public.T_FIN_ACCTION_TAR group by eNo,sdate) t
where
t.eNo=s.eNo
and s.eData_date = t.sdate
and t.end_date
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 203
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.