Offline data warehouse: derivation process of user zipper table

1. Dimension tables
There are two types of dimension tables:
1. Full snapshot table
The calculation cycle of an offline data warehouse is usually once a day, so a full copy of the dimension data can be saved every day.
Advantages: simple and effective, low development and maintenance costs, and easy to understand and use.
Disadvantages: wastes storage space, especially when the data volume is large and the proportion of changed data is low.
2. Zipper table
The significance of the zipper table is that it can save the historical status of dimension information more efficiently. It is suitable for dimensions whose data changes, but not frequently (slowly changing dimensions).
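To make "historical status" concrete, here is a minimal Python sketch of how a zipper table is typically read: each row carries a validity range, and the state on a given day is the set of rows whose range covers that day. The function name snapshot_as_of and the sample user are hypothetical and only serve as illustration; in a real warehouse this would be a SQL filter on start_date and end_date.

```python
def snapshot_as_of(zip_rows, day):
    """Return the rows that were valid on `day`.

    Dates are ISO strings (YYYY-MM-DD), so plain string comparison
    orders them correctly. Both ends of the range are inclusive.
    """
    return [r for r in zip_rows if r["start_date"] <= day <= r["end_date"]]


# Hypothetical sample: one user whose name changed on 2020-06-15.
zip_rows = [
    {"id": 2, "name": "Li Si", "start_date": "2020-06-14", "end_date": "2020-06-14"},
    {"id": 2, "name": "Li Xiaosi", "start_date": "2020-06-15", "end_date": "9999-12-31"},
]

print(snapshot_as_of(zip_rows, "2020-06-14"))  # the old version of the user
print(snapshot_as_of(zip_rows, "2020-06-20"))  # the current version of the user
```

Only two rows are stored for this user, however many days pass, whereas a full snapshot table would store one row per day.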

2. Design process of zipper table
User dimension data: the first day of data is 2020-06-14
1. Load the ods_user_info_inc data directly into dwd_user_zip and add the start_date and end_date fields, where start_date is set to the first day's date (2020-06-14) and end_date is set to the maximum date (9999-12-31)
2. For each subsequent day's data, for example 2020-06-15, likewise add start_date (that day's date) and end_date (the maximum date)
3. Merge the results of 1 and 2.
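Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's code: the helper tag_validity and the sample users are hypothetical, and in the warehouse itself this would be a SQL load from ods_user_info_inc into dwd_user_zip.

```python
MAX_DATE = "9999-12-31"  # the "maximum time" used for records that are still current


def tag_validity(rows, start_date, end_date=MAX_DATE):
    """Return copies of `rows` with start_date and end_date fields added."""
    return [{**row, "start_date": start_date, "end_date": end_date} for row in rows]


# Hypothetical user rows standing in for ods_user_info_inc.
first_day = [{"id": 1, "name": "Zhang San"}, {"id": 2, "name": "Li Si"}]
daily_inc = [{"id": 2, "name": "Li Xiaosi"}, {"id": 3, "name": "Wang Wu"}]

# Step 1: the first-day load becomes the initial zipper table.
old_zip = tag_validity(first_day, "2020-06-14")
# Step 2: the 2020-06-15 data gets the same two fields.
new_rows = tag_validity(daily_inc, "2020-06-15")

for row in old_zip + new_rows:
    print(row)
```

User 2 now appears in both sets, which is exactly the case step 3 has to resolve.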
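The data we hope to get: the latest record of each user stays in the 9999-12-31 partition, and records that have been replaced by a newer version are closed out as expired data.
If a user has data in both 1 and 2, that user has expired data.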
Solution 1: Use full join
For the 9999-12-31 partition, we take the new data first. If there is no new data, we take the old data.

Expired data means that both the old and the new side have a record for the same user. We take the old record and set its end_date to the expiration date (2020-06-14 in this example, which here is also its start_date).
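The full join logic of Solution 1 can be sketched in Python as follows. The assumptions are mine and not from the original post: the function name merge_full_join and the sample users are hypothetical, each side holds at most one row per user id, and the expiration date is taken to be the day before the load date. In the warehouse this would be a SQL full outer join on the user id.

```python
from datetime import date, timedelta

MAX_DATE = "9999-12-31"


def merge_full_join(old_rows, new_rows, load_date):
    """Emulate `old FULL JOIN new ON id` and return {partition: rows}."""
    # Assumption: a replaced record expires the day before the new one starts.
    expire_date = (date.fromisoformat(load_date) - timedelta(days=1)).isoformat()
    old_by_id = {r["id"]: r for r in old_rows}
    new_by_id = {r["id"]: r for r in new_rows}

    current, expired = [], []
    for user_id in sorted(old_by_id.keys() | new_by_id.keys()):
        old, new = old_by_id.get(user_id), new_by_id.get(user_id)
        # 9999-12-31 partition: take the new row first, otherwise the old row.
        current.append(dict(new if new is not None else old))
        # Both sides present: the old row is expired data.
        if old is not None and new is not None:
            expired.append({**old, "end_date": expire_date})
    return {MAX_DATE: current, expire_date: expired}


# Hypothetical sample data: user 2 changed, user 3 is new, user 1 is unchanged.
old = [
    {"id": 1, "name": "Zhang San", "start_date": "2020-06-14", "end_date": MAX_DATE},
    {"id": 2, "name": "Li Si", "start_date": "2020-06-14", "end_date": MAX_DATE},
]
new = [
    {"id": 2, "name": "Li Xiaosi", "start_date": "2020-06-15", "end_date": MAX_DATE},
    {"id": 3, "name": "Wang Wu", "start_date": "2020-06-15", "end_date": MAX_DATE},
]

result = merge_full_join(old, new, "2020-06-15")
for partition, rows in result.items():
    print(partition, rows)
```

With this sample, the old version of user 2 lands in the 2020-06-14 partition and everything else stays in 9999-12-31.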
Solution 2: Use union and sort
(1) Union the old and new data
(2) Sort each user's records, newest first, and number them as rn
(3) The 9999-12-31 partition takes the first record (rn=1)
(4) The 2020-06-14 partition takes the record with rn=2
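The four steps of Solution 2 can be sketched in Python like this. As before, this is an illustration under my own assumptions rather than the author's code: merge_union_rank and the sample users are hypothetical, the sort key is assumed to be start_date in descending order within each user id, and each user has at most one old and one new row. In SQL the numbering would typically come from a row_number() window function.

```python
from collections import defaultdict
from datetime import date, timedelta

MAX_DATE = "9999-12-31"


def merge_union_rank(old_rows, new_rows, load_date):
    """Union old and new rows, rank per user, and split by rank."""
    # Assumption: a replaced record expires the day before the load date.
    expire_date = (date.fromisoformat(load_date) - timedelta(days=1)).isoformat()

    # (1) Union old and new, grouped by user id.
    by_id = defaultdict(list)
    for row in old_rows + new_rows:
        by_id[row["id"]].append(row)

    partitions = {MAX_DATE: [], expire_date: []}
    for user_id in sorted(by_id):
        # (2) Sort newest first; ISO date strings sort chronologically.
        ranked = sorted(by_id[user_id], key=lambda r: r["start_date"], reverse=True)
        for rn, row in enumerate(ranked, start=1):
            if rn == 1:
                # (3) rn=1 is the current record.
                partitions[MAX_DATE].append(dict(row))
            elif rn == 2:
                # (4) rn=2 is the expired record; close its validity range.
                partitions[expire_date].append({**row, "end_date": expire_date})
    return partitions


# Hypothetical sample data: user 2 changed, user 3 is new, user 1 is unchanged.
old = [
    {"id": 1, "name": "Zhang San", "start_date": "2020-06-14", "end_date": MAX_DATE},
    {"id": 2, "name": "Li Si", "start_date": "2020-06-14", "end_date": MAX_DATE},
]
new = [
    {"id": 2, "name": "Li Xiaosi", "start_date": "2020-06-15", "end_date": MAX_DATE},
    {"id": 3, "name": "Wang Wu", "start_date": "2020-06-15", "end_date": MAX_DATE},
]

result = merge_union_rank(old, new, "2020-06-15")
for partition, rows in result.items():
    print(partition, rows)
```

On this sample the output matches what the full join approach produces: three current rows in 9999-12-31 and the old version of user 2 in 2020-06-14.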


Origin blog.csdn.net/m0_37759590/article/details/132851942