案例一
2011 年,一位朋友请求优化如下 SQL。
select tpc.policy_id,
tcm.policy_code,
tpf.organ_id,
to_char(tpf.insert_time, 'YYYY-MM-DD') As insert_time,
tpc.change_id,
d.policy_code,
e.company_name,
f.real_name,
tpf.fee_type,
sum(tpf.pay_balance) as pay_balance,
c.actual_type,
tpc.notice_code,
d.policy_type,
g.mode_name as pay_mode
from t_policy_change tpc,
t_contract_master tcm,
t_policy_fee tpf,
t_fee_type c,
t_contract_master d,
t_company_customer e,
t_customer f,
t_pay_mode g
where tpc.change_id = tpf.change_id
and tpf.policy_id = d.policy_id
and tcm.policy_id = tpc.policy_id
and tpf.receiv_status = 1
and tpf.fee_status = 1
and tpf.payment_id is null
and tpf.fee_type = c.type_id
and tpf.pay_mode = g.mode_id
and d.company_id = e.company_id(+)
and d.applicant_id = f.customer_id(+)
and tpf.organ_id in
(select
organ_id
from t_company_organ
start with organ_id = '101'
connect by prior organ_id = parent_id)
group by tpc.policy_id,
tpc.change_id,
tpf.fee_type,
to_char(tpf.insert_time, 'YYYY-MM-DD'),
c.actual_type,
d.policy_code,
g.mode_name,
e.company_name,
f.real_name,
tpc.notice_code,
d.policy_type,
tpf.organ_id,
tcm.policy_code
order by change_id, fee_type;
执行计划如下。
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
| Id|Operation | Name |Rows |Bytes|TempSpc| Cost (%CPU)|
-------------------------------------------------------------------------------------
| 0|SELECT STATEMENT | |45962| 11M| |45650 (0)|
| 1| SORT GROUP BY | |45962| 11M| 23M| 45650 (0)|
|* 2| HASH JOIN | |45962| 11M| | 43908 (0)|
| 3| INDEX FULL SCAN |T_FEE_TYPE_IDX_003 | 106| 636| | 1 (0)|
| 4| NESTED LOOPS OUTER | |45962| 11M| | 43906 (0)|
|* 5| HASH JOIN | |45962|7271K| 6824K| 43905 (0)|
| 6| NESTED LOOPS | |45961|6283K| | 42312 (0)|
|* 7| HASH JOIN SEMI | |45961|5655K| 50M| 33120 (1)|
|* 8| HASH JOIN OUTER | | 400K| 45M| 44M| 32315 (1)|
|* 9| HASH JOIN | | 400K| 39M| 27M| 26943 (0)|
|*10| HASH JOIN | | 400K| 23M| | 16111 (0)|
| 11| TABLE ACCESS FULL |T_PAY_MODE | 25| 525| | 2 (0)|
|*12| TABLE ACCESS FULL |T_POLICY_FEE | 400K| 15M| | 16107 (0)|
| 13| TABLE ACCESS FULL |T_CONTRACT_MASTER |1136K| 46M| | 9437 (0)|
| 14| VIEW |index_join_007 |2028K| 30M| | |
|*15| HASH JOIN | | 400K| 45M| 44M| 32315 (1)|
| 16| INDEX FAST FULL SCAN |PK_T_CUSTOMER |2028K| 30M| | 548 (0)|
| 17| INDEX FAST FULL SCAN |IDX_CUSTOMER__BIR_REAL_GEN|2028K| 30M| | 548 (0)|
| 18| VIEW |VW_NSO_1 | 7| 42| | |
|*19| CONNECT BY WITH FILTERING | | | | | |
| 20| NESTED LOOPS | | | | | |
|*21| INDEX UNIQUE SCAN |PK_T_COMPANY_ORGAN | 1| 6| | |
| 22| TABLE ACCESS BY USER ROWID|T_COMPANY_ORGAN | | | | |
| 23| NESTED LOOPS | | | | | |
| 24| BUFFER SORT | | 7| 70| | |
| 25| CONNECT BY PUMP | | | | | |
|*26| INDEX RANGE SCAN |T_COMPANY_ORGAN_IDX_002 | 7| 70| | 1 (0)|
| 27| TABLE ACCESS BY INDEX ROWID |T_POLICY_CHANGE | 1| 14| | 2 (50)|
|*28| INDEX UNIQUE SCAN |PK_T_POLICY_CHANGE | 1| | | 1 (0)|
| 29| INDEX FAST FULL SCAN |IDX1_ACCEPT_DATE |1136K| 23M| | 899 (0)|
| 30| TABLE ACCESS BY INDEX ROWID |T_COMPANY_CUSTOMER | 1| 90| | 2 (50)|
|*31| INDEX UNIQUE SCAN |PK_T_COMPANY_CUSTOMER | 1| | | |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TPF"."FEE_TYPE"="C"."TYPE_ID")
5 - access("TCM"."POLICY_ID"="TPC"."POLICY_ID")
7 - access("TPF"."ORGAN_ID"="VW_NSO_1"."$nso_col_1")
8 - access("D"."APPLICANT_ID"="F"."CUSTOMER_ID"(+))
9 - access("TPF"."POLICY_ID"="D"."POLICY_ID")
10 - access("TPF"."PAY_MODE"="G"."MODE_ID")
12 - filter("TPF"."CHANGE_ID" IS NOT NULL AND TO_NUMBER("TPF"."RECEIV_STATUS")=1 AND "TPF"."FEE_STATUS"=1 AND
"TPF"."PAYMENT_ID" IS NULL)
15 - access("indexjoin_alias_012".ROWID="indexjoin_alias_011".ROWID)
19 - filter("T_COMPANY_ORGAN"."ORGAN_ID"='101')
21 - access("T_COMPANY_ORGAN"."ORGAN_ID"='101')
26 - access("T_COMPANY_ORGAN"."PARENT_ID"=NULL)
28 - access("TPC"."CHANGE_ID"="TPF"."CHANGE_ID")
31 - access("D"."COMPANY_ID"="E"."COMPANY_ID"(+))
55 rows selected
Statistics
----------------------------------------------------------
21 recursive calls
0 db block gets
125082 consistent gets
21149 physical reads
0 redo size
2448 bytes sent via SQL*Net to client
656 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
11 rows processed
上述 SQL 要执行 12 秒左右,逻辑读 12 万。该 SQL 中,t_policy_fee tpf 有 400 万行,t_contract_master tcm 有 1000 万行。其余表都是小表。
根据 SQL 三段分拆方法首先检查了 SQL 写法,SQL 写法没有明显不妥之处。然后开始检查执行计划。我们注意观察执行计划的统计信息(Statistics),该 SQL 最终只返回 11 行数据(11 rows processed)。SQL中有13个GROUP BY字段,一般而言,GROUP BY字段越少,去重能力越强;GROUP BY字段越多,去重能力越弱。因此,我们判断该SQL在GROUP BY之前只返回少量数据,返回少量数据应该走嵌套循环,而不是走HASH连接。既然推断出该 SQL 最终返回数据量较少,那么 SQL 中的大表都应该走索引,但是 SQL 语句中的两个大表 t_policy_fee tpf 与 t_contract_master tcm 都是走的全表扫描,这显然不对。它们应该走索引,或者作为嵌套循环的被驱动表。
根据上面分析,我们将注意力集中在了大表(Id=12 和 Id=13)上,同时也将注意力集中在了 HASH 连接上。执行计划中 Id=12 有 TO_NUMBER(「TPF」.「RECEIV_STATUS」)=1,开发人员少写了引号,这可能导致 SQL 不走索引。Id=13 前面没有「*」号,这说明 T_CONTRACT_MASTER 没有过滤条件,如果走 HASH 连接,那么该表只能走全表扫描。但是该表有 1000 万条数据,所以只能让它作为嵌套循环被驱动表,然后走连接列的索引。
SQL 语句中有个 in 子查询,并且子查询中有固化子查询关键字 start with,在 7.1 节中讲到,in 子查询中有固化子查询关键字,子查询可以展开(unnest)。这个 in 子查询只返回 1 行数据,在执行计划中它属于 Id=18,然后它与 Id=8 进行的是 HASH 连接。Where 子查询 unnest 之后,一般都会打乱执行计划,也就是说 Id=8,Id=9,Id=10,Id=11,Id=12,Id=13,Id=14 的执行计划都会因为子查询被展开而在一起关联的。
我们再回去看原始 SQL,原始 SQL 中只有 tpf 表有过滤条件,其他表均无过滤条件。而 tpf 表的过滤条件要么是状态字段过滤(tpf.receiv_status = 1 and tpf.fee_status = 1),要么是组织编号过滤 tpf.organ_id in(子查询)。因此判断这些过滤条件并不能过滤掉大部分数据。SQL 中有两处外链接,d.company_id = e.company_id(+),d.applicant_id = f.customer_id(+),如果走嵌套循环,外连接无法更改驱动表。如果走 HASH 连接,外连接可以更改驱动表。
因为 SQL 最终只返回少量数据,我们判断执行计划应该走嵌套循环。走嵌套循环首先要确定好谁做驱动表。根据上面的分析 e,f 首先被排除掉做驱动表的可能性,因为它们是外连接的从表,tpf,tcm 也被排除掉作为驱动表的可能性,因为它们是大表。现在只剩下 tpc,c 和 g 可以作为驱动表候选,tpc,c,g 都是与 tpf 关联的,只需要看谁最小,谁就作为驱动表。而在原始执行计划中,因为 in 子查询被展开了,扰乱了执行计划,导致 Id=11,Id=12,Id=13 走了 HASH 连接,所以笔者对子查询添加了 HINT:NO_UNNEST,让子查询不展开,从而不去干扰执行计划,添加 HINT 后的 SQL 如下。
select tpc.policy_id,
tcm.policy_code,
tpf.organ_id,
to_char(tpf.insert_time, 'YYYY-MM-DD') As insert_time,
tpc.change_id,
d.policy_code,
e.company_name,
f.real_name,
tpf.fee_type,
sum(tpf.pay_balance) as pay_balance,
c.actual_type,
tpc.notice_code,
d.policy_type,
g.mode_name as pay_mode
from t_policy_change tpc,
t_contract_master tcm,
t_policy_fee tpf,
t_fee_type c,
t_contract_master d,
t_company_customer e,
t_customer f,
t_pay_mode g
where tpc.change_id = tpf.change_id
and tpf.policy_id = d.policy_id
and tcm.policy_id = tpc.policy_id
and tpf.receiv_status = '1' ---这里原来没引号,是开发搞忘了写''
and tpf.fee_status = 1
and tpf.payment_id is null
and tpf.fee_type = c.type_id
and tpf.pay_mode = g.mode_id
and d.company_id = e.company_id(+)
and d.applicant_id = f.customer_id(+)
and tpf.organ_id in
(select /*+ no_unnest */
organ_id
from t_company_organ
start with organ_id = '101'
connect by prior organ_id = parent_id)
group by tpc.policy_id,
tpc.change_id,
tpf.fee_type,
to_char(tpf.insert_time, 'YYYY-MM-DD'),
c.actual_type,
d.policy_code,
g.mode_name,
e.company_name,
f.real_name,
tpc.notice_code,
d.policy_type,
tpf.organ_id,
tcm.policy_code
order by change_id, fee_type
执行计划如下。
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
| Id|Operation | Name |Rows |Bytes| Cost (%CPU)|
-------------------------------------------------------------------------------------
| 0|SELECT STATEMENT | |20026|4928K| 68615 (30)|
| 1| SORT GROUP BY | |20026|4928K| 28563 (0)|
|* 2| FILTER | | | | |
| 3| NESTED LOOPS | |20026|4928K| 27812 (0)|
| 4| NESTED LOOPS | |20026|4498K| 23807 (0)|
| 5| NESTED LOOPS OUTER | |20026|4224K| 19802 (0)|
| 6| NESTED LOOPS OUTER | |20026|3911K| 15797 (0)|
| 7| NESTED LOOPS | |20026|2151K| 15796 (0)|
|* 8| HASH JOIN | |20026|1310K| 11791 (0)|
| 9| INDEX FULL SCAN |T_FEE_TYPE_IDX_003 | 106| 636 | 1 (0)|
|*10| HASH JOIN | |20026|1192K| 11789 (0)|
| 11| TABLE ACCESS FULL |T_PAY_MODE | 25| 525| 2 (0)|
|*12| TABLE ACCESS BY INDEX ROWID|T_POLICY_FEE |20026| 782K| 11786 (0)|
|*13| INDEX RANGE SCAN |IDX_POLICY_FEE__RECEIV_STATUS |1243K| | 10188 (0)|
| 14| TABLE ACCESS BY INDEX ROWID |T_CONTRACT_MASTER | 1| 43| 2 (50)|
|*15| INDEX UNIQUE SCAN |PK_T_CONTRACT_MASTER | 1| | 1 (0)|
| 16| TABLE ACCESS BY INDEX ROWID |T_COMPANY_CUSTOMER | 1| 90| 2 (50)|
|*17| INDEX UNIQUE SCAN |PK_T_COMPANY_CUSTOMER | 1| | |
| 18| TABLE ACCESS BY INDEX ROWID |T_CUSTOMER | 1| 16| 2 (50)|
|*19| INDEX UNIQUE SCAN |PK_T_CUSTOMER | 1| | 1 (0)|
| 20| TABLE ACCESS BY INDEX ROWID |T_POLICY_CHANGE | 1| 14| 2 (50)|
|*21| INDEX UNIQUE SCAN |PK_T_POLICY_CHANGE | 1| | 1 (0)|
| 22| TABLE ACCESS BY INDEX ROWID |T_CONTRACT_MASTER | 1| 22| 2 (50)|
|*23| INDEX UNIQUE SCAN |PK_T_CONTRACT_MASTER | 1| | 1 (0)|
|*24| FILTER | | | | |
|*25| CONNECT BY WITH FILTERING | | | | |
| 26| NESTED LOOPS | | | | |
|*27| INDEX UNIQUE SCAN |PK_T_COMPANY_ORGAN | 1| 6| |
| 28| TABLE ACCESS BY USER ROWID |T_COMPANY_ORGAN | | | |
| 29| NESTED LOOPS | | | | |
| 30| BUFFER SORT | | 7| 70| |
| 31| CONNECT BY PUMP | | | | |
|*32| INDEX RANGE SCAN |T_COMPANY_ORGAN_IDX_002 | 7| 70| 1 (0)|
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T_COMPANY_ORGAN" "T_COMPANY_ORGAN" WHERE
"T_COMPANY_ORGAN"."PARENT_ID"=NULL AND ("T_COMPANY_ORGAN"."ORGAN_ID"=:B1)))
8 - access("SYS_ALIAS_1"."FEE_TYPE"="C"."TYPE_ID")
10 - access("SYS_ALIAS_1"."PAY_MODE"="G"."MODE_ID")
12 - filter("SYS_ALIAS_1"."CHANGE_ID" IS NOT NULL AND "SYS_ALIAS_1"."FEE_STATUS"=1 AND
"SYS_ALIAS_1"."PAYMENT_ID" IS NULL)
13 - access("SYS_ALIAS_1"."RECEIV_STATUS"='1')
15 - access("SYS_ALIAS_1"."POLICY_ID"="D"."POLICY_ID")
17 - access("D"."COMPANY_ID"="E"."COMPANY_ID"(+))
19 - access("D"."APPLICANT_ID"="F"."CUSTOMER_ID"(+))
21 - access("TPC"."CHANGE_ID"="SYS_ALIAS_1"."CHANGE_ID")
23 - access("TCM"."POLICY_ID"="TPC"."POLICY_ID")
24 - filter("T_COMPANY_ORGAN"."ORGAN_ID"=:B1)
25 - filter("T_COMPANY_ORGAN"."ORGAN_ID"='101')
27 - access("T_COMPANY_ORGAN"."ORGAN_ID"='101')
32 - access("T_COMPANY_ORGAN"."PARENT_ID"=NULL)
58 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2817 consistent gets
0 physical reads
0 redo size
2268 bytes sent via SQL*Net to client
656 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
40 sorts (memory)
0 sorts (disk)
9 rows processed
添加完 HINT 之后,SQL 能在 1 秒内执行完毕,逻辑读也降低到 2 817。如果不想添加 HINT,我们可以将 in 改成 exists,因为子查询中有固化子查询关键字,这时 SQL 不能展开,会自动走 Filter,也能达到添加 HINT:NO_UNNEST 的效果,但是,这并不是说 exists 比 in 性能好!
我们推荐大家在 Oracle 中使用 in 而不是使用 exists。因为 exists 子查询中有固化子查询关键字会自动走 Filter,想要消除 Filter 只能改写 SQL。in 可以控制走 Filter 或者不走,in 执行计划可控,而 exists 执行计划不可控。
对于in子查询,我们一定要搞清楚in子查询返回多少数据,究竟能起到多大过滤作用。如果in子查询能过滤掉主表大量数据,这时我们一定要让in子查询展开并且作为NL驱动表反向驱动主表,主表作为NL被驱动表,走连接列索引。如果in子查询不能过滤掉主表大量数据,这时要检查in子查询返回数据量多少,如果返回数据量很少,in子查询即使不展开,走Filter也不大会影响SQL性能。如果in子查询返回数据量很多,但是并不能过滤掉主表大量数据,这时一定要让in子查询展开并且与主表走HASH连接。
本案例中,in 子查询返回数据量很少,只有 1 行数据,但是主表并不能用子查询过滤大量数据,因为过滤条件是 tpf.organ_id,组织关系 id 这种列一般基数很低。其实原始 SQL 相当于如下写法。
select tpc.policy_id,
tcm.policy_code,
tpf.organ_id,
to_char(tpf.insert_time, 'YYYY-MM-DD') As insert_time,
tpc.change_id,
d.policy_code,
e.company_name,
f.real_name,
tpf.fee_type,
sum(tpf.pay_balance) as pay_balance,
c.actual_type,
tpc.notice_code,
d.policy_type,
g.mode_name as pay_mode
from t_policy_change tpc,
t_contract_master tcm,
t_policy_fee tpf,
t_fee_type c,
t_contract_master d,
t_company_customer e,
t_customer f,
t_pay_mode g
where tpc.change_id = tpf.change_id
and tpf.policy_id = d.policy_id
and tcm.policy_id = tpc.policy_id
and tpf.receiv_status = 1
and tpf.fee_status = 1
and tpf.payment_id is null
and tpf.fee_type = c.type_id
and tpf.pay_mode = g.mode_id
and d.company_id = e.company_id(+)
and d.applicant_id = f.customer_id(+)
and tpf.organ_id in ('xxx') ---将子查询换成具体值,这样就不会干扰执行计划
group by tpc.policy_id,
tpc.change_id,
tpf.fee_type,
to_char(tpf.insert_time, 'YYYY-MM-DD'),
c.actual_type,
d.policy_code,
g.mode_name,
e.company_name,
f.real_name,
tpc.notice_code,
d.policy_type,
tpf.organ_id,
tcm.policy_code
order by change_id, fee_type;
因为原始 SQL 本意相当于以上 SQL,子查询只起过滤作用,所以使用 **HINT:NO_UNNEST**
,让子查询不去干扰正常执行计划,从而达到优化目的。
案例二
本案例与上一个案例是同一个人的优化请求,SQL 语句如下。
select distinct decode(length(a.category_id),
5,
decode(a.origin_type, 801, 888888, 999999),
a.category_id) category_id,
a.notice_code,
a.treat_status,
lr.real_name as receiver_name,
f.send_code,
f.policy_code,
g.real_name agent_name,
f.organ_id,
f.dept_id,
a.policy_id,
a.change_id,
a.case_id,
a.group_policy_id,
a.fee_id,
a.auth_id,
a.pay_id,
cancel_appoint.appoint_time cancel_appoint_time,
a.insert_time,
a.send_time,
a.end_time,
f.agency_code,
a.REPLY_TIME,
a.REPLY_EMP_ID,
a.FIRST_DUTY,
a.NEED_SEND_PRINT,
11 source
from t_policy_problem a,
t_policy f,
t_agent g,
t_letter_receiver lr,
t_problem_category pc,
t_policy_cancel_appoint cancel_appoint
where f.agent_id = g.agent_id(+)
and a.policy_id = f.policy_id(+)
and lr.main_receiver = 'Y'
and a.category_id = pc.category_id
and a.item_id = lr.item_id
and a.policy_id = cancel_appoint.policy_id(+)
And a.Item_Id = (Select Max(item_id)
From t_Policy_Problem
Where notice_code = a.notice_code)
and a.policy_id is not null
and a.notice_code is not null
and a.change_id is null
and a.case_id is null
and a.group_policy_id is null
and a.origin_type not in (801, 802)
and a.pay_id is null
and a.category_id not in (130103, 130104, 130102, 140102, 140101)
and f.policy_type = '1'
and (a.fee_id is null or (a.fee_id is not null and a.origin_type = 701))
and exists((select 1
from t_dept
where f.dept_id = dept_id
start with dept_id = '1020200028'
connect by parent_id = prior dept_id))
and exists (select 1
from T_COMPANY_ORGAN
where f.organ_id = organ_id
start with organ_id = '10202'
connect by parent_id = prior organ_id)
and pc.NEED_PRITN = 'Y';
朋友说这个 SQL 执行不出结果。执行计划如下。
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
| Id |Operation | Name |Rows |Bytes|Cost(%CPU)|
-------------------------------------------------------------------------------------
| 0 |SELECT STATEMENT | | 1| 236| 741 (1)|
| 1 | SORT UNIQUE | | 1| 236| 681 (0)|
|* 2 | FILTER | | | | |
| 3 | NESTED LOOPS | | 1| 236| 666 (1)|
| 4 | NESTED LOOPS OUTER | | 1| 219| 665 (1)|
| 5 | NESTED LOOPS | | 1| 203| 664 (1)|
| 6 | NESTED LOOPS OUTER | | 1| 196| 663 (1)|
| 7 | NESTED LOOPS | | 1| 182| 662 (1)|
|* 8 | TABLE ACCESS FULL |T_POLICY_PROBLEM | 1| 107| 660 (0)|
|* 9 | TABLE ACCESS BY INDEX ROWID|T_POLICY | 1| 75| 2 (50)|
|*10 | INDEX UNIQUE SCAN |PK_T_POLICY | 1| | 1 (0)|
| 11 | TABLE ACCESS BY INDEX ROWID|T_POLICY_CANCEL_APPOINT | 1| 14| 2 (50)|
|*12 | INDEX UNIQUE SCAN |UK1_POLICY_CANCEL_APPOINT | 1| | |
|*13 | TABLE ACCESS BY INDEX ROWID |T_PROBLEM_CATEGORY | 1| 7| 2 (50)|
|*14 | INDEX UNIQUE SCAN |PK_T_PROBLEM_CATEGORY | 1| | |
| 15 | TABLE ACCESS BY INDEX ROWID |T_AGENT | 1| 16| 2 (50)|
|*16 | INDEX UNIQUE SCAN |PK_T_AGENT | 1| | |
|*17 | INDEX RANGE SCAN |T_LETTER_RECEIVER_IDX_001 | 1| 17| 2 (0)|
| 18 | SORT AGGREGATE | | 1| 21| |
| 19 | TABLE ACCESS BY INDEX ROWID |T_POLICY_PROBLEM | 1| 21| 2 (50)|
|*20 | INDEX RANGE SCAN |IDX_POLICY_PROBLEM__N_CODE | 1| | 3 (0)|
|*21 | FILTER | | | | |
|*22 | CONNECT BY WITH FILTERING | | | | |
| 23 | NESTED LOOPS | | | | |
|*24 | INDEX UNIQUE SCAN |PK_T_DEPT | 1| 17| 1 (0)|
| 25 | TABLE ACCESS BY USER ROWID |T_DEPT | | | |
| 26 | HASH JOIN | | | | |
| 27 | CONNECT BY PUMP | | | | |
| 28 | TABLE ACCESS FULL |T_DEPT |30601| 896K| 56 (0)|
|*29 | FILTER | | | | |
|*30 | CONNECT BY WITH FILTERING | | | | |
| 31 | NESTED LOOPS | | | | |
|*32 | INDEX UNIQUE SCAN |PK_T_COMPANY_ORGAN | 1| 6| |
| 33 | TABLE ACCESS BY USER ROWID |T_COMPANY_ORGAN | | | |
| 34 | NESTED LOOPS | | | | |
| 35 | BUFFER SORT | | 7| 70| |
| 36 | CONNECT BY PUMP | | | | |
|*37 | INDEX RANGE SCAN |T_COMPANY_ORGAN_IDX_002 | 7| 70| 1 (0)|
-------------------------------------------------------------------------------------
2 - filter("SYS_ALIAS_1"."ITEM_ID"= (SELECT /*+ */ MAX("T_POLICY_PROBLEM"."ITEM_ID") FROM
"T_POLICY_PROBLEM" "T_POLICY_PROBLEM" WHERE "T_POLICY_PROBLEM"."NOTICE_CODE"=:B1) AND
EXISTS (SELECT/*+ */ 0 FROM "T_DEPT" "T_DEPT" AND ("T_DEPT"."DEPT_ID"=:B2)) AND EXISTS
(SELECT /*+ */ 0 FROM"T_COMPANY_ORGAN" "T_COMPANY_ORGAN" WHERE
"T_COMPANY_ORGAN"."PARENT_ID"=NULL AND ("T_COMPANY_ORGAN"."ORGAN_ID"=:B3)))
8 - filter("SYS_ALIAS_1"."POLICY_ID" IS NOT NULL AND "SYS_ALIAS_1"."NOTICE_CODE" IS NOT NULL AND
"SYS_ALIAS_1"."CHANGE_ID" IS NULL AND "SYS_ALIAS_1"."CASE_ID" IS NULL AND
"SYS_ALIAS_1"."GROUP_POLICY_ID" IS NULL AND TO_NUMBER("SYS_ALIAS_1"."ORIGIN_TYPE")<>801 AND
TO_NUMBER("SYS_ALIAS_1"."ORIGIN_TYPE")<>802 AND "SYS_ALIAS_1"."PAY_ID" IS NULL AND
"SYS_ALIAS_1"."CATEGORY_ID"<>130103 AND "SYS_ALIAS_1"."CATEGORY_ID"<>130104 AND
"SYS_ALIAS_1"."CATEGORY_ID"<>130102 AND "SYS_ALIAS_1"."CATEGORY_ID"<>140102 AND
"SYS_ALIAS_1"."CATEGORY_ID"<>140101 AND ("SYS_ALIAS_1"."FEE_ID" IS NULL OR
"SYS_ALIAS_1"."FEE_ID" IS NOT NULL AND TO_NUMBER("SYS_ALIAS_1"."ORIGIN_TYPE")=701))
9 - filter(TO_NUMBER("SYS_ALIAS_3"."POLICY_TYPE")=1)
10 - access("SYS_ALIAS_1"."POLICY_ID"="SYS_ALIAS_3"."POLICY_ID")
12 - access("SYS_ALIAS_1"."POLICY_ID"="CANCEL_APPOINT"."POLICY_ID"(+))
13 - filter("PC"."NEED_PRITN"='Y')
14 - access("SYS_ALIAS_1"."CATEGORY_ID"="PC"."CATEGORY_ID")
filter("PC"."CATEGORY_ID"<>130103 AND "PC"."CATEGORY_ID"<>130104 AND "PC"."CATEGORY_ID"<>130102
AND "PC"."CATEGORY_ID"<>140102 AND "PC"."CATEGORY_ID"<>140101)
16 - access("SYS_ALIAS_3"."AGENT_ID"="G"."AGENT_ID"(+))
17 - access("LR"."MAIN_RECEIVER"='Y' AND "SYS_ALIAS_1"."ITEM_ID"="LR"."ITEM_ID")
20 - access("T_POLICY_PROBLEM"."NOTICE_CODE"=:B1)
21 - filter("T_DEPT"."DEPT_ID"=:B1)
22 - filter("T_DEPT"."DEPT_ID"='1020200028')
24 - access("T_DEPT"."DEPT_ID"='1020200028')
29 - filter("T_COMPANY_ORGAN"."ORGAN_ID"=:B1)
30 - filter("T_COMPANY_ORGAN"."ORGAN_ID"='10202')
32 - access("T_COMPANY_ORGAN"."ORGAN_ID"='10202')
37 - access("T_COMPANY_ORGAN"."PARENT_ID"=NULL)
77 rows selected.
从执行计划中 Id=2 看到,该 SQL 走了 Filter,Id=3、Id=18、Id=21、Id=29 都是 Id=2 的儿子。因为 Filter 类似嵌套循环,如果 Id=3 返回大量数据,会导致 Id=18、Id=21、Id=29 被多次扫描,正是因为 SQL 走的是 Filter,才导致 SQL 执行不出结果。
为什么会走 Filter 呢?我们注意查看 SQL 写法,SQL 语句中有两个 exists(子查询),子查询中有固化子查询关键字 start with,正是因为 SQL 写成了 exists,才导致走了 Filter。于是我们用 in 改写 exists。
select distinct decode(length(a.category_id),
5,
decode(a.origin_type, 801, 888888, 999999),
a.category_id) category_id,
a.notice_code,
a.treat_status,
lr.real_name as receiver_name,
f.send_code,
f.policy_code,
g.real_name agent_name,
f.organ_id,
f.dept_id,
a.policy_id,
a.change_id,
a.case_id,
a.group_policy_id,
a.fee_id,
a.auth_id,
a.pay_id,
cancel_appoint.appoint_time cancel_appoint_time,
a.insert_time,
a.send_time,
a.end_time,
f.agency_code,
a.REPLY_TIME,
a.REPLY_EMP_ID,
a.FIRST_DUTY,
a.NEED_SEND_PRINT,
11 source
from t_policy_problem a,
t_policy f,
t_agent g,
t_letter_receiver lr,
t_problem_category pc,
t_policy_cancel_appoint cancel_appoint
where f.agent_id = g.agent_id(+)
and a.policy_id = f.policy_id(+)
and lr.main_receiver = 'Y'
and a.category_id = pc.category_id
and a.item_id = lr.item_id
and a.policy_id = cancel_appoint.policy_id(+)
And a.Item_Id = (Select Max(item_id)
From t_Policy_Problem
Where notice_code = a.notice_code)
and a.policy_id is not null
and a.notice_code is not null
and a.change_id is null
and a.case_id is null
and a.group_policy_id is null
and a.origin_type not in (801, 802)
and a.pay_id is null
and a.category_id not in (130103, 130104, 130102, 140102, 140101)
and f.policy_type = '1'
and (a.fee_id is null or (a.fee_id is not null and a.origin_type = 701))
and f.dept_id in (select dept_id
from t_dept
start with dept_id = '1020200028'
connect by parent_id = prior dept_id))
and f.organ_id in (select organ_id
from T_COMPANY_ORGAN
start with organ_id = '10202'
connect by parent_id = prior organ_id)
and pc.NEED_PRITN = 'Y';
改写后的执行计划如下。
-------------------------------------------------------------------------------------
| Id|Operation | Name |Rows |Bytes|Cost(%CPU)|
-------------------------------------------------------------------------------------
| 0|SELECT STATEMENT | | 1| 259| 742 (1)|
| 1| SORT UNIQUE | | 1| 259| 740 (0)|
|* 2| FILTER | | | | |
|* 3| HASH JOIN | | 1| 259| 725 (1)|
| 4| NESTED LOOPS | | 1| 253| 723 (1)|
| 5| NESTED LOOPS | | 1| 236| 722 (1)|
| 6| NESTED LOOPS OUTER | | 1| 229| 721 (1)|
| 7| NESTED LOOPS OUTER | | 1| 215| 720 (1)|
|* 8| HASH JOIN | | 1| 199| 719 (1)|
| 9| NESTED LOOPS | | 1| 182| 662 (1)|
|*10| TABLE ACCESS FULL |T_POLICY_PROBLEM | 1| 107| 660 (0)|
|*11| TABLE ACCESS BY INDEX ROWID|T_POLICY | 1| 75| 2 (50)|
|*12| INDEX UNIQUE SCAN |PK_T_POLICY | 1| | 1 (0)|
| 13| VIEW |VW_NSO_1 |30601| 508K| |
|*14| CONNECT BY WITH FILTERING | | | | |
| 15| NESTED LOOPS | | | | |
|*16| INDEX UNIQUE SCAN |PK_T_DEPT | 1| 17| 1 (0)|
| 17| TABLE ACCESS BY USER ROWID|T_DEPT | | | |
| 18| HASH JOIN | | | | |
| 19| CONNECT BY PUMP | | | | |
| 20| TABLE ACCESS FULL |T_DEPT |30601| 896K| 56 (0)|
| 21| TABLE ACCESS BY INDEX ROWID |T_AGENT | 1| 16| 2 (50)|
|*22| INDEX UNIQUE SCAN |PK_T_AGENT | 1| | |
| 23| TABLE ACCESS BY INDEX ROWID |T_POLICY_CANCEL_APPOINT | 1| 14| 2 (50)|
|*24| INDEX UNIQUE SCAN |UK1_POLICY_CANCEL_APPOINT | 1| | |
|*25| TABLE ACCESS BY INDEX ROWID |T_PROBLEM_CATEGORY | 1| 7| 2 (50)|
|*26| INDEX UNIQUE SCAN |PK_T_PROBLEM_CATEGORY | 1| | |
|*27| INDEX RANGE SCAN |T_LETTER_RECEIVER_IDX_001 | 1| 17| 2 (0)|
| 28| VIEW |VW_NSO_2 | 7| 42| |
|*29| CONNECT BY WITH FILTERING | | | | |
| 30| NESTED LOOPS | | | | |
|*31| INDEX UNIQUE SCAN |PK_T_COMPANY_ORGAN | 1| 6| |
| 32| TABLE ACCESS BY USER ROWID |T_COMPANY_ORGAN | | | |
| 33| NESTED LOOPS | | | | |
| 34| BUFFER SORT | | 7| 70| |
| 35| CONNECT BY PUMP | | | | |
|*36| INDEX RANGE SCAN |T_COMPANY_ORGAN_IDX_002 | 7| 70| 1 (0)|
| 37| SORT AGGREGATE | | 1| 21| |
| 38| TABLE ACCESS BY INDEX ROWID |T_POLICY_PROBLEM | 1| 21| 2 (50)|
|*39| INDEX RANGE SCAN |IDX_POLICY_PROBLEM__N_CODE | 1| | 3 (0)|
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("SYS_ALIAS_1"."ITEM_ID"= (SELECT /*+ */ MAX("T_POLICY_PROBLEM"."ITEM_ID") FROM
"T_POLICY_PROBLEM" "T_POLICY_PROBLEM" WHERE "T_POLICY_PROBLEM"."NOTICE_CODE"=:B1))
3 - access("F"."ORGAN_ID"="VW_NSO_2"."$nso_col_1")
8 - access("F"."DEPT_ID"="VW_NSO_1"."$nso_col_1")
10 - filter("SYS_ALIAS_1"."POLICY_ID" IS NOT NULL AND "SYS_ALIAS_1"."NOTICE_CODE" IS NOT NULL AND
"SYS_ALIAS_1"."CHANGE_ID" IS NULL AND "SYS_ALIAS_1"."CASE_ID" IS NULL AND
"SYS_ALIAS_1"."GROUP_POLICY_ID"
IS NULL AND TO_NUMBER("SYS_ALIAS_1"."ORIGIN_TYPE")<>801 AND
TO_NUMBER("SYS_ALIAS_1"."ORIGIN_TYPE")<>802
AND "SYS_ALIAS_1"."PAY_ID" IS NULL AND "SYS_ALIAS_1"."CATEGORY_ID"<>130103 AND
"SYS_ALIAS_1"."CATEGORY_ID"<>130104 AND "SYS_ALIAS_1"."CATEGORY_ID"<>130102 AND
"SYS_ALIAS_1"."CATEGORY_ID"<>140102 AND "SYS_ALIAS_1"."CATEGORY_ID"<>140101 AND
("SYS_ALIAS_1"."FEE_ID" IS NULL OR "SYS_ALIAS_1"."FEE_ID" IS NOT NULL AND
TO_NUMBER("SYS_ALIAS_1"."ORIGIN_TYPE")=701))
11 - filter("F"."POLICY_TYPE"='1')
12 - access("SYS_ALIAS_1"."POLICY_ID"="F"."POLICY_ID")
14 - filter("T_DEPT"."DEPT_ID"='1020200028')
16 - access("T_DEPT"."DEPT_ID"='1020200028')
22 - access("F"."AGENT_ID"="G"."AGENT_ID"(+))
24 - access("SYS_ALIAS_1"."POLICY_ID"="CANCEL_APPOINT"."POLICY_ID"(+))
25 - filter("PC"."NEED_PRITN"='Y')
26 - access("SYS_ALIAS_1"."CATEGORY_ID"="PC"."CATEGORY_ID")
filter("PC"."CATEGORY_ID"<>130103 AND "PC"."CATEGORY_ID"<>130104 AND "PC"."CATEGORY_ID"<>130102
AND "PC"."CATEGORY_ID"<>140102 AND "PC"."CATEGORY_ID"<>140101)
27 - access("LR"."MAIN_RECEIVER"='Y' AND "SYS_ALIAS_1"."ITEM_ID"="LR"."ITEM_ID")
29 - filter("T_COMPANY_ORGAN"."ORGAN_ID"='10202')
31 - access("T_COMPANY_ORGAN"."ORGAN_ID"='10202')
36 - access("T_COMPANY_ORGAN"."PARENT_ID"=NULL)
39 - access("T_POLICY_PROBLEM"."NOTICE_CODE"=:B1)
SQL 改写之后,可以在 35 秒左右出结果,而之前是很久跑不出结果。用 in 代替 exists 之后,两个 in 子查询因为进行了 Subquery Unnesting,消除了 Filter。从执行计划中我们可以看到,两个子查询都走的是 HASH 连接,这样两个 in 子查询都只会被扫描一次。用 in 代替 exists 之后,执行计划中还有 Filter,这时的 Filter 来自于 t_Policy_Problem 自关联。
And a.Item_Id = (Select Max(item_id)
From t_Policy_Problem
Where notice_code = a.notice_code)
在第 8 章中讲到,可以利用分析函数改写自关联。因为当时朋友对 35 秒出结果已经很满意,所以我们没有进一步改写 SQL。本以为能逃过帮忙改写 SQL「一劫」,但是 2012 年刚过完春节,就被朋友骚扰了,朋友要求继续优化,有兴趣的读者可以查看博客:利用分析函数优化自连接。
这里其实相当于 t_Policy_Problem
这个表做自连接,但是这个自连接很坑爹,会导致 t_Policy_Problem 表扫描2次,从执行计划上可以看出,第10步这里它做了一个全表扫描,然后在最后37、38、39 这3步走了索引,然后回表,最后它还要被FILTER过滤 ,恩 这个地方就是 这个SQL的性能瓶颈 那么SQL 改写如下:
WITH t_Policy_Problem_w AS
(SELECT tp.*,
max(item_id) OVER (PARTITION BY notice_code)
max_item_id
FROM t_Policy_Problem tp)
select distinct decode(length(a.category_id),
5,
decode(a.origin_type, 801, 888888, 999999),
a.category_id) category_id,
a.notice_code,
a.treat_status,
lr.real_name as receiver_name,
f.send_code,
f.policy_code,
g.real_name agent_name,
f.organ_id,
f.dept_id,
a.policy_id,
a.change_id,
a.case_id,
a.group_policy_id,
a.fee_id,
a.auth_id,
a.pay_id,
cancel_appoint.appoint_time cancel_appoint_time,
a.insert_time,
a.send_time,
a.end_time,
f.agency_code,
a.REPLY_TIME,
a.REPLY_EMP_ID,
a.FIRST_DUTY,
a.NEED_SEND_PRINT,
11 source
from t_Policy_Problem_w a,
t_policy f,
t_agent g,
t_letter_receiver lr,
t_problem_category pc,
t_policy_cancel_appoint cancel_appoint
where
a.item_id=a.max_item_id
and f.agent_id = g.agent_id(+)
and a.policy_id = f.policy_id(+)
and lr.main_receiver = 'Y'
and a.category_id = pc.category_id
and a.item_id = lr.item_id
and a.policy_id = cancel_appoint.policy_id(+)
and a.policy_id is not null
and a.notice_code is not null
and a.change_id is null
and a.case_id is null
and a.group_policy_id is null
and a.origin_type not in (801, 802)
and a.pay_id is null
and a.category_id not in
(130103, 130104, 130102, 140102, 140101)
and f.policy_type = 1
and (a.fee_id is null or
(a.fee_id is not null and a.origin_type = 701))
and f.organ_id in
(select distinct organ_id
from T_COMPANY_ORGAN
start with organ_id = '107'
connect by parent_id = prior organ_id)
and pc.NEED_PRITN = 'Y'
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 21241 | 5289K| | 17992 (0)|
| 1 | SORT UNIQUE | | 21241 | 5289K| 11M| 17992 (0)|
|* 2 | HASH JOIN | | 21241 | 5289K| 5192K| 17192 (1)|
|* 3 | HASH JOIN OUTER | | 21248 | 4938K| 4856K| 16727 (1)|
|* 4 | HASH JOIN OUTER | | 21248 | 4606K| 4568K| 15994 (1)|
|* 5 | HASH JOIN | | 21248 | 4316K| | 15920 (1)|
|* 6 | TABLE ACCESS FULL | T_PROBLEM_CATEGORY | 371 | 2597 | | 4 (0)|
|* 7 | HASH JOIN | | 29477 | 5786K| 5712K| 15915 (1)|
|* 8 | HASH JOIN | | 62888 | 4974K| | 9575 (1)|
| 9 | VIEW | VW_NSO_1 | 7 | 42 | | |
|* 10 | CONNECT BY WITH FILTERING | | | | | |
| 11 | NESTED LOOPS | | | | | |
|* 12 | INDEX UNIQUE SCAN | PK_T_COMPANY_ORGAN | 1 | 6 | | |
| 13 | TABLE ACCESS BY USER ROWID| T_COMPANY_ORGAN | | | | |
| 14 | NESTED LOOPS | | | | | |
| 15 | BUFFER SORT | | 7 | 70 | | |
| 16 | CONNECT BY PUMP | | | | | |
|* 17 | INDEX RANGE SCAN | T_COMPANY_ORGAN_IDX_002 | 7 | 70 | | 1 (0)|
|* 18 | TABLE ACCESS FULL | T_POLICY | 637K| 45M| | 9569 (0)|
|* 19 | VIEW | | 300K| 34M| | |
| 20 | WINDOW SORT | | 300K| 30M| 88M| 5648 (0)|
| 21 | TABLE ACCESS FULL | T_POLICY_PROBLEM | 300K| 30M| | 706 (0)|
| 22 | TABLE ACCESS FULL | T_POLICY_CANCEL_APPOINT | 86 | 1204 | | 2 (0)|
| 23 | TABLE ACCESS FULL | T_AGENT | 88982 | 1390K| | 619 (0)|
|* 24 | INDEX FAST FULL SCAN | T_LETTER_RECEIVER_IDX_001 | 300K| 4987K| | 251 (0)|
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."ITEM_ID"="LR"."ITEM_ID")
3 - access("F"."AGENT_ID"="G"."AGENT_ID"(+))
4 - access("A"."POLICY_ID"="CANCEL_APPOINT"."POLICY_ID"(+))
5 - access("A"."CATEGORY_ID"="PC"."CATEGORY_ID")
6 - filter("PC"."NEED_PRITN"='Y' AND "PC"."CATEGORY_ID"<>130103 AND "PC"."CATEGORY_ID"<>130104 AND
"PC"."CATEGORY_ID"<>130102 AND "PC"."CATEGORY_ID"<>140102 AND "PC"."CATEGORY_ID"<>140101)
7 - access("A"."POLICY_ID"="F"."POLICY_ID")
8 - access("F"."ORGAN_ID"="VW_NSO_1"."$nso_col_1")
10 - filter("T_COMPANY_ORGAN"."ORGAN_ID"='107')
12 - access("T_COMPANY_ORGAN"."ORGAN_ID"='107')
17 - access("T_COMPANY_ORGAN"."PARENT_ID"=NULL)
18 - filter(TO_NUMBER("F"."POLICY_TYPE")=1)
19 - filter("A"."ITEM_ID"="A"."MAX_ITEM_ID" AND "A"."POLICY_ID" IS NOT NULL AND "A"."NOTICE_CODE" IS NOT NULL
AND "A"."CHANGE_ID" IS NULL AND "A"."CASE_ID" IS NULL AND "A"."GROUP_POLICY_ID" IS NULL AND
TO_NUMBER("A"."ORIGIN_TYPE")<>801 AND TO_NUMBER("A"."ORIGIN_TYPE")<>802 AND "A"."PAY_ID" IS NULL AND
"A"."CATEGORY_ID"<>130103 AND "A"."CATEGORY_ID"<>130104 AND "A"."CATEGORY_ID"<>130102 AND
"A"."CATEGORY_ID"<>140102 AND "A"."CATEGORY_ID"<>140101 AND ("A"."FEE_ID" IS NULL OR "A"."FEE_ID" IS NOT NULL
AND TO_NUMBER("A"."ORIGIN_TYPE")=701))
24 - filter("LR"."MAIN_RECEIVER"='Y')
53 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
115995 consistent gets
42204 physical reads
0 redo size
2182416 bytes sent via SQL*Net to client
13289 bytes received via SQL*Net from client
1060 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
15879 rows processed
利用分析函数MAX OVER (PARTITION BY)
干掉 那个自连接,减少表访问次数,干掉那个FILTER ,现在逻辑读降低到了115995,性能提升了近15倍 。
通过阅读本案例,相信大家应该纠正了 exists 效率比 in 高这种错误认识。如果 where 子查询中没有固化子查询关键字,不管写成 in 还是写成 exists,效率都是一样的,因为 CBO 始终能将子查询展开(unnest)。如果 where 子查询中有固化子查询关键字,这时我们最好用 in 而不是 exists,因为 in 可以控制子查询是否展开,而 exists 无法展开。至于 where 子查询是展开性能好还是不展开性能好,我们要具体情况具体分析。