MySQL: The GROUP BY Statement Explained

1. The GROUP BY statement

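The GROUP BY statement groups the result set by one or more columns; aggregate functions such as COUNT, SUM, and AVG can then be applied to each group. The general syntax is select column_name, function(column_name) from table_name where column_name operator value group by column_name;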

We use the employee_tbl table for the examples below. It can be created and populated as follows.

-- Example table: several rows may share the same name
create table employee_tbl (
    id int not null,
    name char(10) not null default '',
    date datetime not null,
    singin tinyint(4) not null default 0,
    primary key (id)
) engine=InnoDB default charset=utf8;

-- Integer columns take unquoted literals
insert into employee_tbl values
    (1, '小明', '2016-04-22 15:25:33', 1),
    (2, '小王', '2016-04-20 15:25:47', 3),
    (3, '小丽', '2016-04-19 15:26:02', 2),
    (4, '小王', '2016-04-07 15:26:14', 4),
    (5, '小明', '2016-04-11 15:26:40', 4),
    (6, '小明', '2016-04-04 15:26:54', 2);

The resulting employee_tbl table looks like this.

+----+------+---------------------+--------+
| id | name | date                | singin |
+----+------+---------------------+--------+
|  1 | 小明 | 2016-04-22 15:25:33 |      1 |
|  2 | 小王 | 2016-04-20 15:25:47 |      3 |
|  3 | 小丽 | 2016-04-19 15:26:02 |      2 |
|  4 | 小王 | 2016-04-07 15:26:14 |      4 |
|  5 | 小明 | 2016-04-11 15:26:40 |      4 |
|  6 | 小明 | 2016-04-04 15:26:54 |      2 |
+----+------+---------------------+--------+

We can use GROUP BY to group employee_tbl by name and count how many records each person has. The query and its result are shown below.

select name, COUNT(*) from employee_tbl group by name;
+------+----------+
| name | COUNT(*) |
+------+----------+
| 小明 |        3 |
| 小王 |        2 |
| 小丽 |        1 |
+------+----------+
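
The general syntax also allows a where clause and several aggregate functions in one query. As a minimal sketch against the same employee_tbl data (the cutoff date and the column aliases are illustrative choices, not part of the original example), rows are filtered first and then grouped:

```sql
-- where filters rows before grouping; each aggregate is then computed per name.
select name,
       COUNT(*)    as record_count,
       SUM(singin) as singin_total,
       MAX(date)   as latest_date
from employee_tbl
where date >= '2016-04-10'
group by name
order by name;
-- The rows with id 4 and id 6 are filtered out, which leaves:
--   小明: 2 records, singin_total 5
--   小王: 1 record,  singin_total 3
--   小丽: 1 record,  singin_total 2
```

The order by is included because group by on its own does not promise any particular row order.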

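Adding with rollup produces an extra summary row on top of the per-group results; in that row the grouping column is shown as NULL. The query and its result are shown below.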

select name, SUM(singin) as singin_count from employee_tbl group by name with rollup;
+------+--------------+
| name | singin_count |
+------+--------------+
| 小丽 |            2 |
| 小明 |            7 |
| 小王 |            7 |
| NULL |           16 |
+------+--------------+
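
To make the extra row less mysterious, the rollup result can be reproduced by hand. The sketch below is an illustration only, not a description of how MySQL implements with rollup: it appends a grand-total row, with NULL in the name column, to the ordinary grouped result.

```sql
-- Per-name totals, as in a plain group by ...
select name, SUM(singin) as singin_count
from employee_tbl
group by name
union all
-- ... plus one grand-total row over the whole table, labelled NULL.
select NULL, SUM(singin)
from employee_tbl;
```

This returns the same four rows as the with rollup query (totals 2, 7, 7, and 16), although the row order is not guaranteed.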

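The coalesce function can supply a label in place of that NULL. coalesce(a, b, c) returns its first non-NULL argument: a if a is not NULL, otherwise b if b is not NULL, otherwise c; if all three are NULL, it returns NULL. The query below uses '总数' ("total") as the label; the query and its result are shown below.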

select coalesce(name, '总数'), SUM(singin) as singin_count from employee_tbl group by name with rollup;
+------------------------+--------------+
| coalesce(name, '总数') | singin_count |
+------------------------+--------------+
| 小丽                   |            2 |
| 小明                   |            7 |
| 小王                   |            7 |
| 总数                   |           16 |
+------------------------+--------------+
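
Two small follow-ups, offered as a sketch. The first statement checks the coalesce rule on literal values. The second shows an alternative that assumes MySQL 8.0 or later, where the GROUPING() function reports whether a row is the rollup summary row; unlike coalesce, it would not relabel a genuine NULL in the grouping column (not an issue here, since name is declared not null). The alias name_or_total is an illustrative choice.

```sql
-- coalesce returns its first non-NULL argument.
select coalesce(NULL, NULL, 'c')  as r1,   -- 'c'
       coalesce('a', NULL, 'c')   as r2,   -- 'a'
       coalesce(NULL, NULL, NULL) as r3;   -- NULL

-- MySQL 8.0+ only: GROUPING(name) is 1 on the rollup row and 0 elsewhere.
select if(GROUPING(name), '总数', name) as name_or_total,
       SUM(singin) as singin_count
from employee_tbl
group by name with rollup;
```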

2. Using GROUP BY with aggregate functions

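Staying with employee_tbl, let us first run select * from employee_tbl group by name; and see what comes back. (With the ONLY_FULL_GROUP_BY SQL mode enabled, which is the default since MySQL 5.7.5, this query is rejected with an error because id, date, and singin are neither aggregated nor determined by name; the result below assumes that mode is disabled.)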

+----+------+---------------------+--------+
| id | name | date                | singin |
+----+------+---------------------+--------+
|  1 | 小明 | 2016-04-22 15:25:33 |      1 |
|  2 | 小王 | 2016-04-20 15:25:47 |      3 |
|  3 | 小丽 | 2016-04-19 15:26:02 |      2 |
+----+------+---------------------+--------+

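Compared with the full employee_tbl table, 小明 and 小王 originally had 3 and 2 records respectively, yet after group by each is left with only 1. Why? One way to picture it is that, after group by runs, an imaginary intermediate table like the following is produced.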

+----+------+---------------------+--------+
| id | name | date                | singin |
+----+------+---------------------+--------+
|  1 |     | 2016-04-22 15:25:33 |      1 |
|  5 | 小明 | 2016-04-11 15:26:40 |      4 |
|  6 |     | 2016-04-04 15:26:54 |      2 |
+----+------+---------------------+--------+
|  2 | 小王 | 2016-04-20 15:25:47 |      3 |
|  4 |     | 2016-04-07 15:26:14 |      4 |
+----+------+---------------------+--------+
|  3 | 小丽 | 2016-04-19 15:26:02 |      2 |
+----+------+---------------------+--------+

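In other words, records with the same name are merged into one row, and each of the other columns then holds several values. With select *, only one value is taken from each such multi-valued cell (here the first one, although MySQL does not guarantee which row's value is chosen), whereas an aggregate function processes all the values in the cell. With that in mind, let us look at the following problem.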
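
Before turning to that problem, here is a sketch of how the select * query can be made explicit about which value it takes from each multi-valued cell. It assumes MySQL 5.7 or later for ANY_VALUE(); the aliases are illustrative.

```sql
-- Every non-grouped column is wrapped in an aggregate, so the result is
-- well defined and is accepted even when ONLY_FULL_GROUP_BY is enabled.
select name,
       MIN(id)     as first_id,      -- smallest id in the group
       MAX(date)   as latest_date,   -- most recent date in the group
       SUM(singin) as singin_total   -- total over the group
from employee_tbl
group by name;

-- ANY_VALUE() states explicitly that any one value from the group will do.
select name, ANY_VALUE(id) as some_id
from employee_tbl
group by name;
```

For 小明, the first query returns first_id 1, latest_date 2016-04-22 15:25:33, and singin_total 7.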

[Two figures showing the problem statement; images not available]

First we create the department table, as follows.

create database leetcode;
use leetcode;
create table department_1179 (
    id int,
    revenue int,
    month varchar(11) not null,
    primary key (id, month)
) engine=InnoDB default charset=utf8;
insert into department_1179 values
    (1, 8000, 'Jan'),
    (2, 9000, 'Jan'),
    (3, 10000, 'Feb'),
    (1, 7000, 'Feb'),
    (1, 6000, 'Mar');

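To reformat the department table into the shape of the required result table, rows must be turned into columns. Let us first try the following query and see what it does.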

use leetcode;
select id,
(case when month='Jan' then revenue end) as Jan_Revenue,
(case when month='Feb' then revenue end) as Feb_Revenue,
(case when month='Mar' then revenue end) as Mar_Revenue,
(case when month='Apr' then revenue end) as Apr_Revenue,
(case when month='May' then revenue end) as May_Revenue,
(case when month='Jun' then revenue end) as Jun_Revenue,
(case when month='Jul' then revenue end) as Jul_Revenue,
(case when month='Aug' then revenue end) as Aug_Revenue,
(case when month='Sep' then revenue end) as Sep_Revenue,
(case when month='Oct' then revenue end) as Oct_Revenue,
(case when month='Nov' then revenue end) as Nov_Revenue,
(case when month='Dec' then revenue end) as Dec_Revenue 
from department_1179 group by id order by id;

[Figure: result of the query above; image not available]

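This result is wrong: for id=1, both Jan_Revenue and Mar_Revenue come out as NULL. The reason is that, without an aggregate function, each case when expression is evaluated against only one value from the multi-valued cell (for id=1 the month cell contains Feb, Jan, and Mar); if that value does not match the condition, the remaining values are never examined. We should therefore wrap each expression in an aggregate function, such as sum(case when month='Jan' then revenue end). For id=1 it goes through Feb, Jan, and Mar, finds the row matching Jan, and returns that row's revenue: a case without else yields NULL for the non-matching rows, and sum ignores NULLs. The query and its result are shown below.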

use leetcode;
select id,
sum(case when month='Jan' then revenue end) as Jan_Revenue,
sum(case when month='Feb' then revenue end) as Feb_Revenue,
sum(case when month='Mar' then revenue end) as Mar_Revenue,
sum(case when month='Apr' then revenue end) as Apr_Revenue,
sum(case when month='May' then revenue end) as May_Revenue,
sum(case when month='Jun' then revenue end) as Jun_Revenue,
sum(case when month='Jul' then revenue end) as Jul_Revenue,
sum(case when month='Aug' then revenue end) as Aug_Revenue,
sum(case when month='Sep' then revenue end) as Sep_Revenue,
sum(case when month='Oct' then revenue end) as Oct_Revenue,
sum(case when month='Nov' then revenue end) as Nov_Revenue,
sum(case when month='Dec' then revenue end) as Dec_Revenue 
from department_1179 group by id order by id;

[Figure: result of the query above; image not available]
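
To see why the aggregate is needed, it can help to look at what the case expressions produce for each row before any grouping. A minimal sketch, restricted to id=1 and to three of the twelve months for brevity:

```sql
-- No group by: one output row per input row of department 1.
select id, month,
       case when month='Jan' then revenue end as Jan_Revenue,
       case when month='Feb' then revenue end as Feb_Revenue,
       case when month='Mar' then revenue end as Mar_Revenue
from department_1179
where id = 1;
-- month  Jan_Revenue  Feb_Revenue  Mar_Revenue
-- Jan    8000         NULL         NULL
-- Feb    NULL         7000         NULL
-- Mar    NULL         NULL         6000
-- (row order may differ)
```

Grouping by id and applying sum to each column collapses these three rows into one, skipping the NULLs: 8000, 7000, and 6000. Because (id, month) is the primary key, each group has at most one row per month, so max or min would give the same result as sum here.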

Reposted from blog.csdn.net/NickHan_cs/article/details/107974107