PostgreSQL merge join: a scan drawback and how to work around it

PostgreSQL supports three join methods: nested loop, hash join, and merge join. A merge join works roughly as follows: both inputs are sorted on the join key (or read in key order from an index), then scanned in lockstep in that order, so each join input is scanned only once.
When one of the tables carries a filter condition, naturally only the rows satisfying that condition need to be sorted and scanned. But what about the other table? Since the join only returns matching rows, you might expect the other table to be restricted to the same matching range — but is that what actually happens?
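The lockstep scan described above can be sketched in a few lines of Python. This is a minimal illustration of the merge join algorithm, not PostgreSQL's implementation (which also handles mark/restore for rescanning duplicate groups on the inner side):

```python
def merge_join(left, right):
    """Minimal merge join: both inputs must already be sorted on the join key.

    Each side is scanned once; the cursors only ever move forward.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            # Equal keys: emit the cross product of the duplicate groups.
            key = left[i]
            li, rj = i, j
            while i < len(left) and left[i] == key:
                i += 1
            while j < len(right) and right[j] == key:
                j += 1
            for a in left[li:i]:
                for b in right[rj:j]:
                    out.append((a, b))
    return out
```

Because the cursors only advance, a merge join cannot "jump" past inner keys that precede the first outer key — which is exactly the behavior examined below.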

Example:
Create two test tables with 10 million rows each, and an index on the join column.
bill@bill=>create table tbl1 (id int, info text);

CREATE TABLE
bill@bill=>create table tbl2 (id int, info text);   
CREATE TABLE
bill@bill=>insert into tbl1 select generate_series(1,10000000),'test';  
INSERT 0 10000000
bill@bill=>insert into tbl2 select * from tbl1;  
INSERT 0 10000000
bill@bill=>create index idx_tbl1 on tbl1(id);
CREATE INDEX
bill@bill=>create index idx_tbl2 on tbl2(id);
CREATE INDEX

Run a join query with a filter condition on only one of the tables:

select count(*) from tbl1 join tbl2 on (tbl1.id=tbl2.id and tbl1.id between 2000000 and 2090000);  
  
select count(*) from tbl1 join tbl2 on (tbl1.id=tbl2.id) where tbl1.id between 2000000 and 2090000;  

Look at the execution plans:
tbl1 carries the filter, so its index scan applies it directly and reads only the 90001 rows it needs. tbl2, with no filter of its own, gets no bounded scan: in the plans below the planner chooses a nested loop that probes tbl2's index once per outer row (loops=90001), touching roughly 360,000 shared buffers for data that spans only about 737 index pages.

bill@bill=>explain (analyze,verbose,timing,costs,buffers) select count(*) from tbl1 join tbl2 on (tbl1.id=tbl2.id and tbl1.id between 2000000 and 2090000);  
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=62424.75..62424.76 rows=1 width=8) (actual time=109.961..109.962 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=359881 read=1103
   ->  Gather  (cost=62424.54..62424.75 rows=2 width=8) (actual time=109.743..110.954 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=359881 read=1103
         ->  Partial Aggregate  (cost=61424.54..61424.55 rows=1 width=8) (actual time=105.944..105.945 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=359881 read=1103
               Worker 0: actual time=103.675..103.675 rows=1 loops=1
                 Buffers: shared hit=114164 read=339
               Worker 1: actual time=104.715..104.716 rows=1 loops=1
                 Buffers: shared hit=136111 read=415
               ->  Nested Loop  (cost=0.87..61329.56 rows=37992 width=0) (actual time=0.128..102.750 rows=30000 loops=3)
                     Buffers: shared hit=359881 read=1103
                     Worker 0: actual time=0.204..100.589 rows=28548 loops=1
                       Buffers: shared hit=114164 read=339
                     Worker 1: actual time=0.121..101.730 rows=34038 loops=1
                       Buffers: shared hit=136111 read=415
                     ->  Parallel Index Only Scan using idx_tbl1 on bill.tbl1  (cost=0.43..2111.77 rows=37992 width=4) (actual time=0.068..11.897 rows=30000 loops=3)
                           Output: tbl1.id
                           Index Cond: ((tbl1.id >= 2000000) AND (tbl1.id <= 2090000))
                           Heap Fetches: 90001
                           Buffers: shared hit=241 read=737
                           Worker 0: actual time=0.110..12.097 rows=28548 loops=1
                             Buffers: shared hit=78 read=232
                           Worker 1: actual time=0.057..12.210 rows=34038 loops=1
                             Buffers: shared hit=92 read=281
                     ->  Index Only Scan using idx_tbl2 on bill.tbl2  (cost=0.43..1.55 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=90001)
                           Output: tbl2.id
                           Index Cond: (tbl2.id = tbl1.id)
                           Heap Fetches: 90001
                           Buffers: shared hit=359640 read=366
                           Worker 0: actual time=0.002..0.003 rows=1 loops=28548
                             Buffers: shared hit=114086 read=107
                           Worker 1: actual time=0.002..0.002 rows=1 loops=34038
                             Buffers: shared hit=136019 read=134
 Planning Time: 0.534 ms
 Execution Time: 111.019 ms
(41 rows)

bill@bill=>  explain (analyze,verbose,timing,costs,buffers) select count(*) from tbl1 join tbl2 on (tbl1.id=tbl2.id) where tbl1.id between 2000000 and 2090000;  
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=62424.75..62424.76 rows=1 width=8) (actual time=108.919..108.919 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=360977
   ->  Gather  (cost=62424.54..62424.75 rows=2 width=8) (actual time=107.567..109.730 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=360977
         ->  Partial Aggregate  (cost=61424.54..61424.55 rows=1 width=8) (actual time=104.436..104.437 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=360977
               Worker 0: actual time=103.046..103.046 rows=1 loops=1
                 Buffers: shared hit=79271
               Worker 1: actual time=102.980..102.980 rows=1 loops=1
                 Buffers: shared hit=128277
               ->  Nested Loop  (cost=0.87..61329.56 rows=37992 width=0) (actual time=0.081..101.200 rows=30000 loops=3)
                     Buffers: shared hit=360977
                     Worker 0: actual time=0.111..99.834 rows=19764 loops=1
                       Buffers: shared hit=79271
                     Worker 1: actual time=0.085..99.825 rows=31982 loops=1
                       Buffers: shared hit=128277
                     ->  Parallel Index Only Scan using idx_tbl1 on bill.tbl1  (cost=0.43..2111.77 rows=37992 width=4) (actual time=0.043..10.798 rows=30000 loops=3)
                           Output: tbl1.id
                           Index Cond: ((tbl1.id >= 2000000) AND (tbl1.id <= 2090000))
                           Heap Fetches: 90001
                           Buffers: shared hit=971
                           Worker 0: actual time=0.057..10.747 rows=19764 loops=1
                             Buffers: shared hit=214
                           Worker 1: actual time=0.039..11.111 rows=31982 loops=1
                             Buffers: shared hit=348
                     ->  Index Only Scan using idx_tbl2 on bill.tbl2  (cost=0.43..1.55 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=90001)
                           Output: tbl2.id
                           Index Cond: (tbl2.id = tbl1.id)
                           Heap Fetches: 90001
                           Buffers: shared hit=360006
                           Worker 0: actual time=0.004..0.004 rows=1 loops=19764
                             Buffers: shared hit=79057
                           Worker 1: actual time=0.002..0.002 rows=1 loops=31982
                             Buffers: shared hit=127929
 Planning Time: 0.269 ms
 Execution Time: 109.790 ms
(41 rows)
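The buffer counts in the plans above can be sanity-checked with back-of-the-envelope arithmetic. The B-tree depth and leaf fanout below are assumed round numbers, not measured values:

```python
# Rough model of the two access patterns for fetching the 90001 matching keys.
rows = 90001             # matching rows: 2090000 - 2000000 + 1
pages_per_probe = 4      # assumed B-tree root-to-leaf depth for 10M integer keys

# Nested loop: one separate index descent on tbl2 per outer row.
nested_loop_buffers = rows * pages_per_probe   # close to the plan's shared hit=360006

# Bounded range scan: the matching keys sit on contiguous leaf pages,
# so one descent plus a sequential walk along the leaves suffices.
keys_per_leaf = 128      # assumed leaf fanout; the real plans show ~737 pages
range_scan_buffers = rows // keys_per_leaf + pages_per_probe

print(nested_loop_buffers, range_scan_buffers)
```

Even with these crude assumptions, the per-row-probe pattern costs hundreds of times more buffer accesses than a single bounded range scan, which matches what the plans report.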

So when only one table in the join has a filter condition, PostgreSQL does amplified work on the other side. How can we optimize this? A simple fix is to add the equivalent filter to the other table as well — the planner derives equality conditions across a join, but it does not propagate range predicates on its own.

bill@bill=>explain (analyze,verbose,timing,costs,buffers) select count(*) from tbl1 join tbl2 on (tbl1.id=tbl2.id) where tbl1.id between 2000000 and 2090000 and tbl2.id between 2000000 and 2090000;  
                                                                   QUERY PLAN                                                                    
-------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5921.11..5921.12 rows=1 width=8) (actual time=87.742..87.742 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=1474
   ->  Merge Join  (cost=1.18..5918.91 rows=880 width=0) (actual time=0.055..79.410 rows=90001 loops=1)
         Merge Cond: (tbl1.id = tbl2.id)
         Buffers: shared hit=1474
         ->  Index Only Scan using idx_tbl1 on bill.tbl1  (cost=0.43..2643.66 rows=91181 width=4) (actual time=0.024..25.033 rows=90001 loops=1)
               Output: tbl1.id
               Index Cond: ((tbl1.id >= 2000000) AND (tbl1.id <= 2090000))
               Heap Fetches: 90001
               Buffers: shared hit=737
         ->  Index Only Scan using idx_tbl2 on bill.tbl2  (cost=0.43..2797.34 rows=96505 width=4) (actual time=0.019..26.880 rows=90001 loops=1)
               Output: tbl2.id
               Index Cond: ((tbl2.id >= 2000000) AND (tbl2.id <= 2090000))
               Heap Fetches: 90001
               Buffers: shared hit=737
 Planning Time: 0.269 ms
 Execution Time: 87.794 ms
(18 rows)

bill@bill=>explain (analyze,verbose,timing,costs,buffers) select * from tbl1 join tbl2 on (tbl1.id=tbl2.id and tbl1.id between 9000000 and 9090000 and tbl2.id between 9000000 and 9090000);  
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1.18..5369.87 rows=721 width=18) (actual time=0.048..72.408 rows=90001 loops=1)
   Output: tbl1.id, tbl1.info, tbl2.id, tbl2.info
   Merge Cond: (tbl1.id = tbl2.id)
   Buffers: shared hit=1474
   ->  Index Scan using idx_tbl1 on bill.tbl1  (cost=0.43..2652.22 rows=91509 width=9) (actual time=0.022..18.988 rows=90001 loops=1)
         Output: tbl1.id, tbl1.info
         Index Cond: ((tbl1.id >= 9000000) AND (tbl1.id <= 9090000))
         Buffers: shared hit=737
   ->  Index Scan using idx_tbl2 on bill.tbl2  (cost=0.43..2284.78 rows=78797 width=9) (actual time=0.020..19.185 rows=90001 loops=1)
         Output: tbl2.id, tbl2.info
         Index Cond: ((tbl2.id >= 9000000) AND (tbl2.id <= 9090000))
         Buffers: shared hit=737
 Planning Time: 0.261 ms
 Execution Time: 77.546 ms
(14 rows)

Summary:
For a PostgreSQL join where only one table has a filter condition:
1. The table with the filter and an index gets a precise, bounded index scan.
2. The table without a filter, even with an index, does not: under a merge join it is scanned from the head of the index until the key range is exceeded (and under a nested loop it is probed once per outer row). Either way the work is amplified, so push the filter down to both sides.
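Point 2 can be illustrated with a small simulation, using the same 10M-row / 90001-key numbers as the example above. The model is simplified (it counts index entries read, not pages, and assumes the scan stops once it passes the largest outer key):

```python
def entries_scanned(outer_keys, inner_index):
    """Count inner index entries a merge join reads.

    inner_index is sorted; with no filter pushed down, the scan starts at
    the head of the index and stops once it passes the largest outer key.
    """
    if not outer_keys:
        return 0
    last = max(outer_keys)
    n = 0
    for k in inner_index:
        n += 1
        if k > last:
            break
    return n

inner = list(range(1, 10_000_001))                  # tbl2: keys 1..10M
outer = list(range(2_000_000, 2_090_001))           # tbl1 after its filter

unfiltered = entries_scanned(outer, inner)          # scans ~2.09M entries
filtered = entries_scanned(outer, inner[1_999_999:2_090_000])  # bounded scan
print(unfiltered, filtered)
```

Without the pushed-down filter the inner scan wades through all 2M+ entries below the range before reaching the first match; with it, only the 90001 matching entries are read.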


Reprinted from blog.csdn.net/weixin_39540651/article/details/104690994