下面 SQL 有 OR 关联条件。
SELECT A.CONTRACT_ID, B.BORROWER_ID
FROM blfct.bl_rtl_con_overdue_fact A
LEFT JOIN BLpub.Bl_Contract_Dim B ON A.DEALER_ID = B.DEALER_ID
OR A.OVERDUE_DD = B.Overdue_Dd
WHERE A.ETL_DATE BETWEEN DATE '2016-12-19' AND DATE '2016-12-20';
执行计划如下。
Plan hash value: 121649910
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 163M| 5469M| 4421M (1)|
| 1 | NESTED LOOPS OUTER | | 163M| 5469M| 4421M (1)|
|* 2 | TABLE ACCESS FULL | BL_RTL_CON_OVERDUE_FACT | 181K| 3898K| 2192K (2)|
| 3 | VIEW | | 903 | 11739 | 24354 (1)|
|* 4 | TABLE ACCESS FULL| BL_CONTRACT_DIM | 903 | 12642 | 24354 (1)|
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"."ETL_DATE">=TO_DATE(' 2016-12-19 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "A"."ETL_DATE"<=TO_DATE(' 2016-12-20 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))
4 - filter("A"."OVERDUE_DD"="B"."OVERDUE_DD" OR "A"."DEALER_ID"="B"."DEALER_ID")
从执行计划中看到,两表走的是嵌套循环。当两表用外连接进行关联,关联条件中有 OR 关联条件,那么这时只能走嵌套循环,而且驱动表固定为主表,此时不能走 HASH 连接,即使通过 HINT:USE_HASH 也无法修改执行计划。如果主表数据量很大,那么这时就会出现严重性能问题。我们可以将外连接的 OR 关联/过滤条件放到查询中,用 case when 进行过滤,从而让 SQL 可以走 HASH 连接。
EXPLAIN PLAN FOR
SELECT A.CONTRACT_ID,
case
when A.DEALER_ID = B.DEALER_ID OR A.OVERDUE_DD = B.Overdue_Dd then
B.BORROWER_ID
end
FROM blfct.bl_rtl_con_overdue_fact A
LEFT JOIN BLpub.Bl_Contract_Dim B ON A.DEALER_ID = B.DEALER_ID
WHERE A.ETL_DATE BETWEEN DATE '2016-12-19' AND DATE '2016-12-20';
执行计划如下。
select * from table(dbms_xplan.display());
Plan hash value: 3927476067
-------------------------------------------------------------------------------------
| Id |Operation | Name |Rows | Bytes |TempSpc|Cost(%CPU)|
-------------------------------------------------------------------------------------
| 0 |SELECT STATEMENT | | 57M| 1965M| | 2218K (2)|
|* 1 | HASH JOIN OUTER | | 57M| 1965M| 6032K| 2218K (2)|
|* 2 | TABLE ACCESS FULL| BL_RTL_CON_OVERDUE_FACT | 181K| 3898K| | 2192K (2)|
| 3 | TABLE ACCESS FULL| BL_CONTRACT_DIM | 640K| 8763K| |24349 (1)|
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."DEALER_ID"="B"."DEALER_ID"(+))
2 - filter("A"."ETL_DATE">=TO_DATE(' 2016-12-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"A"."ETL_DATE"<=TO_DATE(' 2016-12-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
利用 case when 改写外连接 OR 连接条件有个限制:从表只能是 1 的关系,不能是 n 的关系,从表要展示多少个列,就要写多少个 case when。我们利用 EMP 与 DEPT 进行讲解。EMP 与 DEPT 是 n∶1 关系,现有如下 SQL。
select e.*, d.deptno deptno2, d.loc
from scott.emp e
left join scott.dept d on d.deptno = e.deptno
and (d.deptno >= e.sal and e.sal < 1000 or
e.ename like '%O%');
执行计划如下。
----------------------------------------------------------
Plan hash value: 2962868874
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost(%CPU)|Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 826 | 17 (0)|00:00:01|
| 1 | NESTED LOOPS OUTER | | 14 | 826 | 17 (0)|00:00:01|
| 2 | TABLE ACCESS FULL | EMP | 14 | 532 | 3 (0)|00:00:01|
| 3 | VIEW | | 1 | 21 | 1 (0)|00:00:01|
|* 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 11 | 1 (0)|00:00:01|
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)|00:00:01|
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("E"."ENAME" IS NOT NULL AND "E"."ENAME" IS NOT NULL AND
"E"."ENAME" LIKE '%O%' OR "D"."DEPTNO">="E"."SAL" AND "E"."SAL"<1000)
5 - access("D"."DEPTNO"="E"."DEPTNO")
执行计划中两表关联走的是嵌套循环,驱动表是主表 EMP。现在我们添加 HINT:USE_HASH 尝试改变表连接方式。
select /*+ use_hash(e,d) */
e.*, d.deptno deptno2, d.loc
from scott.emp e
left join scott.dept d on d.deptno = e.deptno
and (d.deptno >= e.sal and e.sal < 1000 or
e.ename like '%O%');
14 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2962868874
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost(%CPU)|Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 826 | 17 (0)|00:00:01|
| 1 | NESTED LOOPS OUTER | | 14 | 826 | 17 (0)|00:00:01|
| 2 | TABLE ACCESS FULL | EMP | 14 | 532 | 3 (0)|00:00:01|
| 3 | VIEW | | 1 | 21 | 1 (0)|00:00:01|
|* 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 11 | 1 (0)|00:00:01|
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)|00:00:01|
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("E"."ENAME" IS NOT NULL AND "E"."ENAME" IS NOT NULL AND
"E"."ENAME" LIKE '%O%' OR "D"."DEPTNO">="E"."SAL" AND "E"."SAL"<1000)
5 - access("D"."DEPTNO"="E"."DEPTNO")
添加 HINT 无法更改执行计划。因为 SQL 语句中从表 DEPT 属于 1 的关系,从表 DEPT 要展示两个列,需要对应写上两个 case when。改写的 SQL 如下。
select e.*,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
d.deptno
end deptno2,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
d.loc
end loc
from scott.emp e
left join scott.dept d on d.deptno = e.deptno;
改写后的执行计划如下。
select e.*,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
d.deptno
end deptno2,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
d.loc
end loc
from scott.emp e
left join scott.dept d on d.deptno = e.deptno;
14 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3387915970
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 686 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 14 | 686 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 532 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| DEPT | 4 | 44 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("D"."DEPTNO"(+)="E"."DEPTNO")
用 case when 改写之后,两表自动走了 HASH 连接。
如果主表属于 1 的关系,从表属于 n 的关系,我们就不能用 case when 进行等价改写,例子如下。
select d.*, e.deptno deptno2, e.ename, e.sal
from dept d
left join emp e on d.deptno = e.deptno
and (d.deptno >= e.sal and e.sal < 1000 or
e.ename like '%O%');
SQL 中 DEPT 是主表,EMP 是从表,DEPT 与 EMP 是 1∶n 的关系,此时不能将 SQL 改写为如下写法。
select d.*,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
e.deptno
end deptno2,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
e.ename
end ename,
case
when (d.deptno >= e.sal and e.sal < 1000 or e.ename like '%O%') then
e.sal
end sal
from dept d
left join emp e on d.deptno = e.deptno;
我们可以将 SQL 改写为如下写法。
select b.*, a.deptno, a.ename, a.sal
from dept b
left join (select d.deptno, e.ename, e.sal
from dept d, emp e
where d.deptno = e.deptno
and (d.deptno >= e.sal and e.sal < 1000 or
e.ename like '%O%')) a on b.deptno = a.deptno;
如果两表是 n∶n 关系,这时就无法对 SQL 进行改写了,在日常工作中一般也遇不到 n∶n 关系。