GROUPING SETS, together with CUBE and ROLLUP, is an extension of the GROUP BY clause in PostgreSQL. With it, a single query can return several reports at different levels or along different dimensions. Let's take a look at how to use it.
1. Build test data
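To make the results easy to follow, we build a small test table, fruit_sale, for the demonstration. The product column holds 西瓜 (watermelon) and 苹果 (apple), and the region column holds 华南 (South China), 华中 (Central China) and 华东 (East China). The construction code is as follows: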
DROP TABLE IF EXISTS "fruit_sale";
CREATE TABLE "fruit_sale" (
"statistical_date" date,
"product" varchar(255) COLLATE "pg_catalog"."default",
"year" varchar(5) COLLATE "pg_catalog"."default",
"qty" numeric(8),
"amount" numeric(8),
"region" varchar(50) COLLATE "pg_catalog"."default"
)
;
INSERT INTO "fruit_sale" VALUES ('2018-01-01', '西瓜', '2018', 1721, 253541, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-03-01', '西瓜', '2019', 3437, 104221, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-05-01', '西瓜', '2019', 8963, 122630, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-06-01', '苹果', '2019', 1274, 150122, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-05-01', '苹果', '2019', 6319, 282352, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-11-01', '苹果', '2018', 8614, 170263, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-02-01', '西瓜', '2018', 5530, 129644, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-07-01', '西瓜', '2018', 4711, 129644, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-08-01', '西瓜', '2018', 9187, 220605, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-05-01', '西瓜', '2018', 5678, 129644, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-09-01', '西瓜', '2018', 4029, 119187, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-10-01', '西瓜', '2018', 3129, 137928, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-03-01', '西瓜', '2018', 4496, 203471, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-04-01', '西瓜', '2018', 7359, 206686, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-12-01', '西瓜', '2018', 8646, 267718, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-01-01', '苹果', '2018', 5559, 269419, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-04-01', '苹果', '2018', 5590, 182167, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-07-01', '苹果', '2018', 3852, 130764, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-06-01', '西瓜', '2018', 7434, 206686, '华中');
INSERT INTO "fruit_sale" VALUES ('2019-01-01', '苹果', '2019', 5558, 156995, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-08-01', '苹果', '2018', 8625, 235426, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-11-01', '西瓜', '2018', 2633, 175737, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-01-01', '西瓜', '2019', 1223, 113053, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-02-01', '西瓜', '2019', 9079, 200716, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-06-01', '西瓜', '2019', 1991, 167150, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-02-01', '苹果', '2018', 5832, 142631, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-05-01', '苹果', '2018', 1392, 249027, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-06-01', '苹果', '2018', 9694, 179832, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-09-01', '苹果', '2018', 7249, 286565, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-04-01', '西瓜', '2019', 6524, 206686, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-03-01', '苹果', '2019', 6545, 238608, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-12-01', '苹果', '2018', 2140, 139439, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-10-01', '苹果', '2018', 3490, 125275, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-04-01', '苹果', '2019', 9992, 157696, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-03-01', '苹果', '2018', 5276, 120441, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-02-01', '苹果', '2019', 2246, 216573, '华东');
A screenshot of part of the data is as follows:
2. Perform data aggregation
2.1 Normal Aggregation
-- Ordinary aggregation
SELECT
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
product,
YEAR;
The result is as follows:
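2.2 grouping sets
2.2.1 Multi-dimensional
To aggregate along several dimensions independently, list each dimension in GROUPING SETS. The query below returns one set of rows grouped by product and another set grouped by year:
-- grouping sets: multiple dimensions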
SELECT
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
GROUPING SETS ( product, YEAR );
The result is shown in the figure below:
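To see what GROUPING SETS ( product, YEAR ) computes, it can help to write the same report by hand. The following is a sketch of an equivalent query built from two ordinary GROUP BY queries combined with UNION ALL. The NULL::varchar casts are an addition made here so that the column types line up, and the row order is not guaranteed in either form.

```sql
-- Rows grouped by product only; year is not part of this grouping, so it is NULL.
SELECT product, NULL::varchar AS year, SUM(qty) AS qty
FROM fruit_sale
GROUP BY product

UNION ALL

-- Rows grouped by year only; product is not part of this grouping, so it is NULL.
SELECT NULL::varchar AS product, year, SUM(qty) AS qty
FROM fruit_sale
GROUP BY year;
```

The GROUPING SETS form states the same thing in a single GROUP BY clause, which is shorter to write and leaves the planner free to avoid reading the table once per grouping.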
2.2.2 Multi-dimensional, summary
By changing the list inside GROUPING SETS, you can control the dimensions and levels of aggregation. Adding the empty grouping set ( ) appends a grand-total row to the per-dimension results:
-- grouping sets: multiple dimensions + grand total
SELECT
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
GROUPING SETS ( product, YEAR, ( ) );
The result is shown in the figure below:
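In a result like this, the columns that are not part of a row's grouping set come back as NULL, so the per-product rows, the per-year rows and the grand total can be hard to tell apart. A minimal sketch of one way to label them uses the GROUPING function, which returns 0 when its argument is part of the grouping set that produced the row and 1 when it is not. The row_kind column and its label texts are hypothetical names chosen for this illustration.

```sql
SELECT
    CASE
        WHEN GROUPING(product) = 0 THEN 'by product'   -- product is a grouping column for this row
        WHEN GROUPING(year) = 0 THEN 'by year'         -- year is a grouping column for this row
        ELSE 'grand total'                             -- neither column is grouped: the ( ) set
    END AS row_kind,                                   -- hypothetical label column
    product,
    year,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS (product, year, ());
```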
2.2.3 Multi-dimensional, different levels
A grouping set can also contain several columns. The query below aggregates by product, by year, and by each ( product, YEAR ) combination, so that different levels appear in one result:
-- grouping sets: multiple dimensions + different levels
SELECT
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
GROUPING SETS ( product, YEAR, ( product, YEAR ) );
The result is shown in the figure below:
2.3 cube
CUBE generates every possible grouping set from the specified columns. If n columns are specified, there are 2^n grouping sets, including the empty set that produces the grand total.
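As an illustration of that expansion, CUBE ( product, year ) can be written out by hand as GROUPING SETS. The sketch below lists the 2^2 = 4 grouping sets explicitly and should return the same rows as a query using CUBE ( product, year ), although the row order is not guaranteed.

```sql
SELECT
    product,
    year,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS (
    (product, year),  -- every product/year combination
    (product),        -- subtotal per product
    (year),           -- subtotal per year
    ()                -- grand total
);
```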
2.3.1 Partial cube
The code is as follows:
-- Partial cube
SELECT
GROUPING ( product ) category_id,
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
YEAR,
CUBE ( product );
The result is as follows:
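When a plain column and CUBE appear in the same GROUP BY clause, PostgreSQL combines them as a cross product, so the plain column is part of every grouping set. A sketch of the explicit equivalent of GROUP BY YEAR, CUBE ( product ) is shown below. Note that there is no grand-total row, because year is always grouped.

```sql
SELECT
    GROUPING(product) AS category_id,  -- 0: product is grouped, 1: product is rolled up
    product,
    year,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS (
    (year, product),  -- category_id = 0
    (year)            -- category_id = 1: subtotal per year across all products
);
```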
2.3.2 Overall cube
Apply CUBE to both product and year. The result contains 2^2 = 4 grouping sets, which category_id distinguishes. The code is as follows:
-- Overall cube
SELECT
GROUPING ( product, year ) category_id,
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
CUBE ( product, year )
ORDER BY
GROUPING ( product, year );
A screenshot of part of the result is as follows:
Apply CUBE to product, year and region. The result contains 2^3 = 8 grouping sets. The code is as follows:
-- Three-column cube
SELECT
GROUPING ( product, year, region ) category_id,
product,
YEAR,
region,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
CUBE ( product, year,region )
ORDER BY
GROUPING ( product, year,region );
Part of the result is as follows:
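The category_id values come from how GROUPING treats several arguments: it returns a bit mask in which the rightmost argument is the least significant bit, and a bit is 1 when that column is not part of the row's grouping set. The sketch below recomputes the mask from the single-column calls so the two can be compared; recomputed_id is a hypothetical column name used only for this illustration.

```sql
SELECT
    GROUPING(product, year, region) AS category_id,
    -- Same value built bit by bit: product = 4, year = 2, region = 1.
    GROUPING(product) * 4 + GROUPING(year) * 2 + GROUPING(region) AS recomputed_id,
    product,
    year,
    region,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY CUBE (product, year, region)
ORDER BY category_id;
```

Read this way, 0 means all three columns are grouped, 1 means region is rolled up, 3 means only product is grouped, and 7 is the grand total.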
2.4 rollup
ROLLUP generates hierarchical grouping sets following the order of the specified columns. If columns A, B and C are specified, the grouping sets are ( A, B, C ), ( A, B ), ( A ) and the empty set ( ), which gives the grand total.
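To make the hierarchy concrete, here is a sketch of ROLLUP ( region, product, year ) written out as explicit GROUPING SETS on the test table. It should return the same rows as the ROLLUP form, with row order not guaranteed.

```sql
SELECT
    region,
    product,
    year,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS (
    (region, product, year),  -- most detailed level
    (region, product),        -- subtotal per region and product
    (region),                 -- subtotal per region
    ()                        -- grand total
);
```

Unlike CUBE, there are no ( product ), ( year ) or ( product, year ) sets here: each level only drops the rightmost remaining column.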
2.4.1 Partial rollup
ROLLUP can also be applied to only some of the grouping columns. The code is as follows:
-- Partial rollup
SELECT
GROUPING ( product, YEAR ) category_id,
region,
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
region,
ROLLUP ( product, YEAR );
The result is as follows:
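As with a partial cube, the plain column is combined with every grouping set that ROLLUP generates. A sketch of the explicit equivalent of GROUP BY region, ROLLUP ( product, YEAR ) follows, with the category_id that GROUPING ( product, year ) yields for each set noted in the comments.

```sql
SELECT
    GROUPING(product, year) AS category_id,
    region,
    product,
    year,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS (
    (region, product, year),  -- category_id = 0
    (region, product),        -- category_id = 1: year is rolled up
    (region)                  -- category_id = 3: product and year are rolled up
);
```

Because region appears in every set, this form never produces a grand-total row across regions.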
2.4.2 Overall rollup
Because ROLLUP is hierarchical, the order of the columns changes which subtotals are produced.
First, perform an overall rollup on ( region, product, YEAR ). The code is as follows:
SELECT
GROUPING ( region, product, YEAR ) category_id,
region,
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
ROLLUP ( region, product, YEAR )
ORDER BY
GROUPING ( region, product, YEAR );
A screenshot of part of the result is as follows. The rows in the red box differ from the result in 2.4.1:
Then perform an overall rollup on ( product, YEAR, region ). The code is as follows:
-- rollup on ( product, YEAR, region )
SELECT
GROUPING ( product, YEAR, region ) category_id,
region,
product,
YEAR,
SUM ( qty ) qty
FROM
fruit_sale
GROUP BY
ROLLUP ( product, YEAR, region )
ORDER BY
GROUPING ( product, YEAR, region );
A screenshot of the result is as follows. The rows in the red box show where rollup ( product, YEAR, region ) differs from rollup ( region, product, YEAR ):
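The two column orders share the most detailed level and the grand total, and differ only in their intermediate subtotals. One way to look at just those rows is to filter on the GROUPING value in a HAVING clause. The following is a minimal sketch for the ( product, year, region ) order, using the bit-mask values that GROUPING returns for that argument order.

```sql
-- ROLLUP ( region, product, year ) expands to:
--   (region, product, year), (region, product), (region), ()
-- ROLLUP ( product, year, region ) expands to:
--   (product, year, region), (product, year), (product), ()
-- Keep only the intermediate subtotals of the second form.
SELECT
    GROUPING(product, year, region) AS category_id,
    product,
    year,
    SUM(qty) AS qty
FROM fruit_sale
GROUP BY ROLLUP (product, year, region)
HAVING GROUPING(product, year, region) IN (1, 3)  -- 1 = (product, year), 3 = (product)
ORDER BY category_id, product, year;
```

Swapping the column order in ROLLUP and in GROUPING gives the corresponding ( region, product ) and ( region ) subtotals instead.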