Oracle查询优化改写技巧与案例总结二

一、简介

本文是总结Oracle查询优化方法与改写技巧的第二篇文章，接着第一篇文章，继续。。。

二、优化技巧

【1】新增插入注意的几点问题：

如果insert语句中没有包含默认值的列，则会添加默认值。
如果包含默认值得列，必须显式指定default,才会添加默认值，否则不会添加
如果已经显式指定了莫列的值为null或值，则不会再加上默认值。

【2】复制数据表的定义和数据

--复制表结构,包括数据
create table emp2222 as select * from emp;
select * from emp2222;

--复制表结构,不包括数据
create table emp222 as select * from emp where 1 = 0;
select * from emp222;

--插入某个表的数据
insert into emp2222 select * from emp

【3】遍历字符串

有时候需要将字符串拆分为单个字符进行一些校验等操作，比如校验下面字符串的拼音首字母是否正确，这时候就需要将字符串拆分为：

with temp as
 (select '天天向上' as l, 'TTXS' as v from dual)
select 
       substr(t.l, level, 1) as first_label,
       substr(t.v, level, 1) as first_value
  from temp t
connect by level <= length(t.l);

connect by 是oracle树查询的一个子句，看如下示例：

将level循环显示4行。基于以上思想，我们就可以将‘天天向上’进行拆分：

with temp as
 (select '天天向上' as l, 'TTXS' as v from dual)
select t.*,
       level,
       substr(t.l, level, 1) as first_label,
       substr(t.v, level, 1) as first_value
  from temp t
connect by level <= length(t.l);

【4】字符串中包含引号的使用方法

在字符串中使用引号的话，只需要把单引号变成双引号即可，如下示例：

select 'g''day mate'
  from dual
union all
select 'beavers''teeth'
  from dual
union all
select ''''
  from dual;

至于q-quote界定符，工作中暂时还没遇到过相关需求，用的相对少，此处不做研究，用到的时候再查阅资料即可。

【5】统计字符在字符串中出现的次数

假设，需要统计以上字符串中字符出现的次数。oracle 11g提供了一个regexp_count(str,'分隔符')用于统计各个子串的个数。

--方法1 : 使用regexp_count(),需要加1数量才对
with temp as
 (select 'CLARK,KING,MILLER' as names from dual)
select regexp_count(names, ',') + 1 as cnt from temp;

除了以上方法，还可以使用translate(str,fromstring,tostring)实现同样的功能：

--方法2 : 使用translate
with temp as
 (select 'CLARK,KING,MILLER' as names from dual)
select length(translate(names, ',' || names, ',')) + 1 as cnt from temp;

但是，如果分隔字符串有多个的话，就要做进一步处理了，要除以分隔符长度。

--如果分隔字符有多个的话,需要除以分隔符长度
--错误写法：
with temp as
 (select 'CLARK$#KING$#MILLER' as names from dual)
select length(translate(names, '$#' || names, '$#')) + 1 as cnt from temp;

--正确写法：
with temp as
 (select 'CLARK$#KING$#MILLER' as names from dual)
select length(translate(names, '$#' || names, '$#')) / length('$#') + 1 as cnt
  from temp;

如果是多个分隔字符，使用regexp_count(str,'分隔字符')就不需要考虑分隔符长度了。

--如果分隔字符串有多个的话,使用regexp_count就不用考虑分隔符长度(注意需要转义特殊字符)
with temp as
 (select 'CLARK$#KING$#MILLER' as names from dual)
select regexp_count(names, '\$#') + 1 as cnt from temp;

【6】删除字符串中不必要的字符

要求去掉员工姓名中含有的【AEIOU】元音字母：

(1) 方法一：使用translate结合replace

select r.ename,
       translate(r.ename, 'AEIOU', 'aaaaa') as n1,
       replace(translate(r.ename, 'AEIOU', 'aaaaa'), 'a', '') as n2
  from emp r;

(2) 方法二：直接使用translate

select r.ename, translate(r.ename, '1AEIOU', '1') as n1 from emp r;

(3) 方法三：使用regexp_replace正则函数

select e.ename, regexp_replace(e.ename, '[AEIOU]') as n1 from emp e;

【7】将字符与数字分隔开

假设需要将部门名称与部门编号分隔开：

(1)方法一：使用regexp_replace正则表达式分割

with temp as
 (select d.dname || d.deptno as val from dept d)
select t.val,
       regexp_replace(t.val, '[1234567890]', '') as dname,
       regexp_replace(t.val, '[^0123456789]', '') as deptno
  from temp t;

(2)方法二：使用translate进行分割

with temp as
 (select d.dname || d.deptno as val from dept d)
select t.val,
       translate(t.val, 'a0123456789', 'a') as dname,
       translate(t.val, '0123456789' || t.val, '0123456789') as deptno
  from temp t;

【8】查询只包含字母或数字的数据

使用正则替换进行过滤：

with temp as
 (select '123' as val
    from dual
  union all
  select ' 234sdf' as val
    from dual
  union all
  select '$#234' as val
    from dual
  union all
  select 'adg' as val
    from dual)
select t.val from temp t where regexp_like(t.val, '^[0-9a-zA-Z]+$');

下面对regexp_replace相关知识进行扩展：

regexp_like(t.val,'AB') 表示t.val like '%A%' or t.val like '%B%'
regexp_like(t.val,'[0-9a-zA-Z]') 表示t.val like '%数字%' or t.val like '%大写字母%' or t.val like '%小写字母%'
^：表示字符串开始
$：表示字符串结束

--对比regexp_like与like

select t.val from temp t where regexp_like(t.val, 'A');
--等价于前后模糊查询
select t.val from temp t where t.val like '%A%';

select t.val from temp t where regexp_like(t.val, '^A');
--等价于‘A’开头模糊查询
select t.val from temp t where t.val like '%A';

select t.val from temp t where regexp_like(t.val, 'A$');
--等价于 ‘A’结尾模糊查询
select t.val from temp t where t.val like 'A%';

select t.val from temp t where regexp_like(t.val, '^A$');
--等价于 ‘A’精确查询
select t.val from temp t where t.val like 'A';

+：表示匹配前面的子表达式一次或多次
*：表示匹配前面的子表达式零次或多次

-- +：表示匹配前面的子表达式一次或多次
-- *：表示匹配前面的子表达式零次或多次

with temp as
 (select '123' as val from dual union all select '234567' as val from dual)
select t.val from temp t;

with temp as
 (select '167' as val from dual union all select '1234567' as val from dual)
select t.val from temp t where regexp_like(t.val, '16+'); --至少匹配6一次

--等价于
--select t.val from temp t where t.val like '16%'

with temp as
 (select '167' as val from dual union all select '1234567' as val from dual)
--select t.val from temp t where regexp_like(t.val, '16*'); --匹配6零次或多次

--等价于
select t.val from temp t where t.val like '1%'

regexp_like(t.val,'^[12]+$')
--等价于
t.val like '1' or t.val like '2' or t.val like '11' or t.val like '22' or t.val like '12' or t.val like '21'

regexp_like(t.val,'^[12]*$')
--等价于
t.val like '1' or t.val like '2' or t.val like '11' or t.val like '22' or t.val like '12' or t.val like '21'or t.val like '

【9】按字符串中的数字进行排序

首先，我们可以将字符串中的除数字之外的都替换为空，可以使用translate或者regexp_replace进行替换,然后再进行排序即可。

替换非数字字符：

with temp as
 (select d.dname || d.deptno || d.loc as val from dept d)
select regexp_replace(t.val, '[^0-9]', '') as val from temp t order by 1 desc;

当然也可以使用translate进行替换：

with temp as
 (select d.dname || d.deptno || d.loc as val from dept d)
select to_number(translate(t.val, '0123456789' || t.val, '0123456789')) as val
  from temp t
 order by 1 desc;

【10】根据表中的行创建一个分隔列表

其实就是listagg聚合函数的使用，或者wm_concat()函数的使用。

--统计各个部门的员工以及总工资
select deptno,
       sum(e.sal) as totalsal,
       listagg(e.ename, ',') within group(order by e.ename) as names,
       wm_concat(e.ename) as names2
  from emp e
 group by e.deptno

【11】分解IP地址

--分解IP地址
select regexp_substr(t.ip, '[^.]+', 1, 1) as a,
       regexp_substr(t.ip, '[^.]+', 1, 2) as b,
       regexp_substr(t.ip, '[^.]+', 1, 3) as c,
       regexp_substr(t.ip, '[^.]+', 1, 4) as d
  from (select '192.168.6.67' as ip from dual) t

【12】将分隔数据转换为可以使用in语句列表

可见另外一篇博客：https://blog.csdn.net/Weixiaohuai/article/details/84789139

【13】常用聚合函数

常用聚合函数有: min() max() avg() count() sum()等，以下是基本使用方法：

select e.deptno,
       avg(e.sal) as avgsal,
       min(e.sal) as minsal,
       max(e.sal) as maxsal,
       sum(e.sal) as totalsal,
       count(*) as totalcount,
       count(e.comm) as commtotalcount,
       avg(e.comm) as error_avgcomm, --错误的平均提成算法
       avg(nvl(e.comm, 0)) as right_avgcomm, --需要先将comm为空的转化为0再计算平均值
       avg(coalesce(e.comm, 0)) as right_avgcomm2
  from emp e
 group by e.deptno

注意点：聚合函数会忽略空值，对sum()求总和没什么影响，但是对avg()和count()来说可能结果就不一致了，需要特别注意这一点，实际项目中根据具体需求来决定是否将空值转换为0之后再进行聚合操作。

【14】生成累计和

案例：按照进入公司的先后顺序(empno排序)来统计成本累计和：

--成本累计和
select e.empno,
       e.ename,
       e.sal,
       sum(e.sal) over(order by e.empno) as totallv --第一行到当前行所有工资的总和
  from emp e
 where e.deptno = 30
 order by e.empno

【15】计算累计差

首先构造测试数据：

--计算累计差
with temp as
 (select 1000 as bh, '预交金额' as xmm, 30000 as val
    from dual
  union all
  select 1001 as bh, '支出一' as xmm, 1000 as val
    from dual --支出1000元
  union all
  select 1002 as bh, '支出二' as xmm, 2000 as val
    from dual --支出2000元
  union all
  select 1003 as bh, '支出三' as xmm, 3000 as val
    from dual --支出3000元
  )
select t.bh, t.xmm, t.val from temp t;

案例：我们需要计算每一笔支出之后剩余的余额还有多少

思想：通过对编号进行排序，将支出的金额变为负数之后再进行累计和，就达到本次案例要求

【a】第一步: 根据编号进行排序

--计算累计差
with temp as
 (select 1000 as bh, '预交金额' as xmm, 30000 as val
    from dual
  union all
  select 1001 as bh, '支出一' as xmm, 1000 as val
    from dual --支出1000元
  union all
  select 1002 as bh, '支出二' as xmm, 2000 as val
    from dual --支出2000元
  union all
  select 1003 as bh, '支出三' as xmm, 3000 as val
    from dual --支出3000元
  )
select rownum, t.bh, t.xmm, t.val from temp t order by t.bh;

【b】使用case when将支出的金额变为负数

--计算累计差
with temp as
 (select 1000 as bh, '预交金额' as xmm, 30000 as val
    from dual
  union all
  select 1001 as bh, '支出一' as xmm, 1000 as val
    from dual --支出1000元
  union all
  select 1002 as bh, '支出二' as xmm, 2000 as val
    from dual --支出2000元
  union all
  select 1003 as bh, '支出三' as xmm, 3000 as val
    from dual --支出3000元
  )
select r.rm,
       r.bh,
       r.xmm,
       r.val,
       case
         when r.rm = 1 then
          r.val
         else
          -r.val
       end as newval
  from (select rownum as rm, t.bh, t.xmm, t.val from temp t order by t.bh) r;

【c】累计和计算

--计算累计差
with temp as
 (select 1000 as bh, '预交金额' as xmm, 30000 as val
    from dual
  union all
  select 1001 as bh, '支出一' as xmm, 1000 as val
    from dual --支出1000元
  union all
  select 1002 as bh, '支出二' as xmm, 2000 as val
    from dual --支出2000元
  union all
  select 1003 as bh, '支出三' as xmm, 3000 as val
    from dual --支出3000元
  )
select r.rm,
       r.bh,
       r.xmm,
       r.val,
       sum(case
             when r.rm = 1 then
              r.val
             else
              -r.val
           end) over(order by r.rm) as ee
  from (select rownum as rm, t.bh, t.xmm, t.val from temp t order by t.bh) r;

【16】更改累计和的数值

首先，构造测试数据：模拟存取款记录数据

with temp as
 (select 1 as id, 'cunkuan' as tye, 100 as val
    from dual
  union all
  select 2 as id, 'cunkuan' as tye, 200 as val
    from dual
  union all
  select 3 as id, 'qukuan' as tye, 50 as val
    from dual
  union all
  select 4 as id, 'cunkuan' as tye, 100 as val
    from dual
  union all
  select 5 as id, 'qukuan' as tye, 100 as val
    from dual)
select t.* from temp t;

案例：统计金额流水，每次存款、取款之后的剩余余额等

思想：同样根据tye是存款还是取款，将取款的金额变为负数之后再进行累加和计算即可

with temp as
 (select 1 as id, 'cunkuan' as tye, 100 as val --存款
    from dual
  union all
  select 2 as id, 'cunkuan' as tye, 200 as val --存款
    from dual
  union all
  select 3 as id, 'qukuan' as tye, 50 as val --取款
    from dual
  union all
  select 4 as id, 'cunkuan' as tye, 100 as val --存款
    from dual
  union all
  select 5 as id, 'qukuan' as tye, 100 as val --取款
    from dual)
select t.id,
       t.tye,
       t.val,
       case
         when t.tye = 'qukuan' then
          -t.val
         else
          t.val
       end as newval, --转换取款金额为负数
       sum(case
             when t.tye = 'qukuan' then
              -t.val
             else
              t.val
           end) over(order by t.id) as yue
  from temp t;

【17】返回各部门工资排名前三名的员工

一看到这个需求，就需要搞清楚前三名是要怎么算，并且还要考虑并列成绩相同的情况。

select e.deptno,
       e.deptno,
       e.sal,
       --row_number()遇到数值相同的还是会排 1 , 2 , 3..
       row_number() over(partition by e.deptno order by e.sal desc) as row_number,
       --rank()遇到数值相同的用相同的数字(下一个会跳跃),类似 1 , 1, 3..
       rank() over(partition by e.deptno order by e.sal desc) as rank,
       --dense_rank()遇到数值相同的用相同的数字(下一个不会跳跃,相当于有两个第一名的感觉),类似 1 , 1 , 2 .. 
       dense_rank() over(partition by e.deptno order by e.sal desc) as dense_rank
  from emp e
 where e.deptno in (20, 30)
 order by e.deptno, e.sal desc

基于row_number()/rank()/dense_rank()，在不同的场景下返回的结果也会有所差异：

如果使用row_number()返回工资第一名的员工，显然会漏掉一条数据，因为对应相同工资有一个序号排为‘2’了。
如果使用dense_rank()返回工资前两名的员工，显然会返回多出一条数据，因为是‘1 ，1 ，2...’。

以下分别使用row_number()/rank()/dense_rank()实现查询工资排名前三名的员工信息：

select r.*
  from (select e.deptno,
               e.sal,
               row_number() over(partition by e.deptno order by e.sal desc) as row_number
          from emp e
         where e.deptno in (20, 30)) r
 where r.row_number <= 3

select r.*
  from (select e.deptno,
               e.sal,
               rank() over(partition by e.deptno order by e.sal desc) as rank
          from emp e
         where e.deptno in (20, 30)) r
 where r.rank <= 3

select r.*
  from (select e.deptno,
               e.sal,
               dense_rank() over(partition by e.deptno order by e.sal desc) as dense_rank
          from emp e
         where e.deptno in (20, 30)) r
 where r.dense_rank <= 3

【18】计算出现次数最多的值

完成这个查询需要三步：

【a】查询各个工资出现的次数：

select e.sal, count(*) as cnt from emp e where e.deptno = 20 group by e.sal

【b】按工资出现的次数进行排序：

select r.sal, r.cnt, dense_rank() over(order by r.cnt desc) as ordernum  --排序号
  from (select e.sal, count(*) as cnt
          from emp e
         where e.deptno = 20
         group by e.sal) r

【c】取出排序为第一的数据：

select *
  from (select r.sal,
               r.cnt,
               dense_rank() over(order by r.cnt desc) as ordernum --排序号
          from (select e.sal, count(*) as cnt
                  from emp e
                 where e.deptno = 20
                 group by e.sal) r) rr
 where rr.ordernum = 1 --排第一名

基于上面的示例，如果加上partition by e.deptno按部门分区进行统计，那么就可以统计各个部门中工资排名第一名的工资信息：

select *
  from (select r.sal,
               r.deptno,
               r.cnt,
               dense_rank() over(partition by r.deptno order by r.cnt desc) as ordernum --排序号
          from (select e.sal, e.deptno, count(*) as cnt
                  from emp e
                 group by e.sal, e.deptno) r) rr
 where rr.ordernum = 1 --排第一名

【19】求总和的百分比

案例：计算各部门的总工资，并且计算部门总工资占总工资的比例

要完成这个查询，需要三步：

【a】统计各个部门的总工资

select e.deptno, sum(e.sal) as depttotalsal from emp e group by e.deptno

【b】统计总工资金额

select r.deptno, r.depttotalsal, sum(depttotalsal) over() as totalsal --累加和计算总工资
  from (select e.deptno, sum(e.sal) as depttotalsal
          from emp e
         group by e.deptno) r

【c】计算占比

select rr.deptno,
       rr.depttotalsal,
       round((rr.depttotalsal / rr.totalsal) * 100, 2) as lv --工资占比
  from (select r.deptno,
               r.depttotalsal,
               sum(depttotalsal) over() as totalsal --累加和计算总工资
          from (select e.deptno, sum(e.sal) as depttotalsal
                  from emp e
                 group by e.deptno) r) rr

除了使用上面的方法，还可以使用ratio_to_report()实现：

select r.deptno,
       round(ratio_to_report(r.depttotalsal) over() * 100, 2) as lv
  from (select e.deptno, sum(e.sal) as depttotalsal
          from emp e
         group by e.deptno) r

【20】查询各员工占本部门工资的占比

要实现这个查询，基于上一个示例，再按照e.deptno进行分区统计即可。

select e.deptno,
       e.sal,
       e.ename,
       round(ratio_to_report(e.sal) over(partition by e.deptno) * 100, 2) as lv
  from emp e

除了上面的方法，还可以使用下面的sql统计各员工工资占本部门总工资的占比：

select ee.sal,
       ee.ename,
       ee.deptno,
       r.depttotalsal,
       round((ee.sal / r.depttotalsal) * 100, 2) as lv --工资占比
  from emp ee
  left join (select sum(e.sal) as depttotalsal, e.deptno
               from emp e
              group by e.deptno) r
    on r.deptno = ee.deptno

三、总结

这是第二部分的总结以及一些案例，下一篇继续总结。积少成多，keeping....