Hql查询案例一 : 级联累计报表查询案例

数据

有如下数据:
用户,日期,月销售额
A,2015-01-08,5
A,2015-01-11,15
B,2015-01-12,5
A,2015-01-12,8
B,2015-01-13,25
A,2015-01-13,5
C,2015-01-09,10
C,2015-01-11,20
A,2015-02-10,4
A,2015-02-11,6
C,2015-01-12,30
C,2015-02-13,10…

需求

使用hql实现如下累积报表:

用户 月份 月总额 累计到当月的总额
A 2015-01 33 33
A 2015-02 10 43
A 2015-03 30 73
B 2015-01 30 30
B 2015-02 15 45

实现:普通方法

1.建表、导入数据

create table t_sale(username string,day string,msale int)
row format delimited fields terminated by ",";

load data local inpath '/root/sale.txt' into table t_sale;

2.按月份将数据分组聚合

select  username,substr(day,1,7) as month_sale,sum(msale)
from t_sale
group by username,substr(day,1,7);

结果:
+———–+————-+——+–+
| username | month_sale | _c2 |
+———–+————-+——+–+
| A      | 2015-01   | 33 |
| A      | 2015-02   | 10 |
| A      | 2015-03   | 20 |
| B      | 2015-01   | 30 |
| B      | 2015-02   | 15 |
| B      | 2015-03   | 45 |
| C      | 2015-01   | 60 |
| C      | 2015-02   | 10 |
| C      | 2015-03   | 30 |
+———–+————-+——+–+
3.将聚合后生成的临时表自关联

select t1.*,t2.* from
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t1
join
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t2
on t1.username=t2.username;

结果:
+————–+—————-+———+————–+—————-+———+–+
| t1.username | t1.month_sale | t1.cnt | t2.username | t2.month_sale | t2.cnt |
+————–+—————-+———+————–+—————-+———+–+
| A | 2015-01 | 33 | A | 2015-01 | 33 |
| A | 2015-01 | 33 | A | 2015-02 | 10 |
| A | 2015-01 | 33 | A | 2015-03 | 20 |
| A | 2015-02 | 10 | A | 2015-01 | 33 |
| A | 2015-02 | 10 | A | 2015-02 | 10 |
| A | 2015-02 | 10 | A | 2015-03 | 20 |
| A | 2015-03 | 20 | A | 2015-01 | 33 |
| A | 2015-03 | 20 | A | 2015-02 | 10 |
| A | 2015-03 | 20 | A | 2015-03 | 20 |
| B | 2015-01 | 30 | B | 2015-01 | 30 |
| B | 2015-01 | 30 | B | 2015-02 | 15 |
| B | 2015-01 | 30 | B | 2015-03 | 45 |
| B | 2015-02 | 15 | B | 2015-01 | 30 |
| B | 2015-02 | 15 | B | 2015-02 | 15 |
| B | 2015-02 | 15 | B | 2015-03 | 45 |
| B | 2015-03 | 45 | B | 2015-01 | 30 |
| B | 2015-03 | 45 | B | 2015-02 | 15 |
| B | 2015-03 | 45 | B | 2015-03 | 45 |
| C | 2015-01 | 60 | C | 2015-01 | 60 |
| C | 2015-01 | 60 | C | 2015-02 | 10 |
| C | 2015-01 | 60 | C | 2015-03 | 30 |
| C | 2015-02 | 10 | C | 2015-01 | 60 |
| C | 2015-02 | 10 | C | 2015-02 | 10 |
| C | 2015-02 | 10 | C | 2015-03 | 30 |
| C | 2015-03 | 30 | C | 2015-01 | 60 |
| C | 2015-03 | 30 | C | 2015-02 | 10 |
| C | 2015-03 | 30 | C | 2015-03 | 30 |
+————–+—————-+———+————–+—————-+———+–+
4.对自关联后的数据进行筛选,选出右表月份<=左表月份的数据,以便进行累加

select t1.*,t2.* from
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t1
join
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t2
on t1.username=t2.username
where t1.month_sale>=t2.month_sale;

结果:
+————–+—————-+———+————–+—————-+———+–+
| t1.username | t1.month_sale | t1.cnt | t2.username | t2.month_sale | t2.cnt |
+————–+—————-+———+————–+—————-+———+–+
| A | 2015-01 | 33 | A | 2015-01 | 33 |
| A | 2015-02 | 10 | A | 2015-01 | 33 |
| A | 2015-02 | 10 | A | 2015-02 | 10 |
| A | 2015-03 | 20 | A | 2015-01 | 33 |
| A | 2015-03 | 20 | A | 2015-02 | 10 |
| A | 2015-03 | 20 | A | 2015-03 | 20 |
| B | 2015-01 | 30 | B | 2015-01 | 30 |
| B | 2015-02 | 15 | B | 2015-01 | 30 |
| B | 2015-02 | 15 | B | 2015-02 | 15 |
| B | 2015-03 | 45 | B | 2015-01 | 30 |
| B | 2015-03 | 45 | B | 2015-02 | 15 |
| B | 2015-03 | 45 | B | 2015-03 | 45 |
| C | 2015-01 | 60 | C | 2015-01 | 60 |
| C | 2015-02 | 10 | C | 2015-01 | 60 |
| C | 2015-02 | 10 | C | 2015-02 | 10 |
| C | 2015-03 | 30 | C | 2015-01 | 60 |
| C | 2015-03 | 30 | C | 2015-02 | 10 |
| C | 2015-03 | 30 | C | 2015-03 | 30
5.将数据进行聚合统计,累加当月销售额
本案例完整hql语句

select t1.username,t1.month_sale,max(t1.cnt),sum(t2.cnt) from
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t1
join
(select  username,substr(day,1,7) as month_sale,sum(msale) as cnt
from t_sale
group by username,substr(day,1,7)) t2
on t1.username=t2.username
where t1.month_sale>=t2.month_sale
group by t1.username,t1.month_sale
;

结果:
+————–+—————-+——+——+–+
| t1.username | t1.month_sale | _c2 | _c3 |
+————–+—————-+——+——+–+
| A | 2015-01 | 33 | 33 |
| A | 2015-02 | 10 | 43 |
| A | 2015-03 | 20 | 63 |
| B | 2015-01 | 30 | 30 |
| B | 2015-02 | 15 | 45 |
| B | 2015-03 | 45 | 90 |
| C | 2015-01 | 60 | 60 |
| C | 2015-02 | 10 | 70 |
| C | 2015-03 | 30 | 100 |
+————–+—————-+——+——+–+

**:当字段对应数据相同时,实现去重可以使用聚合函数max()、min().

实现:窗口函数实现累加

select username,month,m_sale,
sum(m_sale) over (partition by username order by month rows between unbounded preceding and current row)
from
(select username,substr(day,1,7) as month ,sum(msale) as m_sale
from t_sale
group by username,substr(day,1,7))tmp;

猜你喜欢

转载自blog.csdn.net/amin_hui/article/details/82256278