案例一
2015 年,网络优化班的学生问如何优化以下 SQL。
explain plan for
select gcode,name,idcode,address,noroom,etime from
LY_T_CHREC t where gcode in (
select gcode from LY_T_CHREC t where name='张三' and bdate ='19941109') a
;
Explained
select * from table(dbms_xplan.display(null,null,'ADVANCED -PROJECTION'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 953100977
---------------------------------------------------------------------------
| Id | Operation | Name | Rows |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 |
|* 1 | HASH JOIN RIGHT SEMI | | 2 |
|* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| LY_T_CHREC | 1 |
|* 3 | INDEX RANGE SCAN | IDX_LY_T_CHREC_NAME | 15 |
| 4 | PARTITION HASH ALL | | 200M|
| 5 | TABLE ACCESS FULL | LY_T_CHREC | 200M|
---------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$5DA710D3
2 - SEL$5DA710D3 / T@SEL$2
3 - SEL$5DA710D3 / T@SEL$2
5 - SEL$5DA710D3 / T@SEL$1
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
SWAP_JOIN_INPUTS(@「SEL$5DA710D3」 「T」@「SEL$2」)
USE_HASH(@「SEL$5DA710D3」 「T」@「SEL$2」)
LEADING(@「SEL$5DA710D3」 「T」@「SEL$1」 「T」@「SEL$2」)
INDEX_RS_ASC(@「SEL$5DA710D3」 「T」@「SEL$2」 (「LY_T_CHREC」.「NAME」))
FULL(@「SEL$5DA710D3」 「T」@「SEL$1」)
OUTLINE(@「SEL$2」)
OUTLINE(@「SEL$1」)
UNNEST(@「SEL$2」)
OUTLINE_LEAF(@「SEL$5DA710D3」)
ALL_ROWS
DB_VERSION(『11.2.0.3』)
OPTIMIZER_FEATURES_ENABLE(『11.2.0.3』)
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access(「GCODE」=「GCODE」)
2 - filter(「BDATE」='19941109')
3 - access(「NAME」='张三')
朋友提供的信息:子查询返回一个人开房的房号记录,共返回 63 行。该 SQL 就是查与某人相同的房间号的他人的记录。LY_T_CHREC 表有两亿条记录。整个 SQL 执行了 30 分钟还没出结果,子查询可以秒出结果,GCODE、NAME、IDCODE、ADDRESS、NOROOM、ETIME、BDATE 都有索引。
根据以上信息我们得出:该 SQL 主表 LY_T_CHREC 有两亿条数据,没有过滤条件,IN 子查询过滤之后返回 63 行数据,关联列是房间号(GCODE)。LY_T_CHREC 应该存放的是开房记录数据,GCODE 列基数应该比较高。在本书中我们反复强调:小表与大表关联,如果大表连接列基数比较高,可以走嵌套循环,让小表驱动大表,大表走连接列的索引。这里小表就是 IN 子查询,大表就是主表,我们让 IN 子查询作为 NL 驱动表。
select /*+ leading(t@a) use_nl(t@a,t) */
gcode, name, idcode, address, noroom, etime
from zhxx_lgy.LY_T_CHREC t
where gcode in (select /*+ qb_name(a) */
gcode
from zhxx_lgy.LY_T_CHREC t
where name = '张三'
and bdate = '19941109');
最终该 SQL 可以秒出。
案例二
2014 年,一位物流行业的朋友说以下 SQL 要执行 4 个多小时。
SELECT "VOUCHER".FID "ID",
"ENTRIES".FID "ENTRIES.ID",
"ENTRIES".FEntryDC "ENTRIES.ENTRYDC",
"ACCOUNT".FID "ACCOUNT.ID",
"ENTRIES".FCurrencyID "CURRENCY.ID",
"PERIOD".FNumber "PERIOD.NUMBER",
"ENTRIES".FSeq "ENTRIES.SEQ",
"ENTRIES".FLocalExchangeRate "LOCALEXCHANGERATE",
"ENTRIES".FReportingExchangeRate "REPORTINGEXCHANGERATE",
"ENTRIES".FMeasureUnitID "ENTRYMEASUREUNIT.ID",
"ASSISTRECORDS".FID "ASSISTRECORDS.ID",
"ASSISTRECORDS".FSeq "ASSISTRECORDS.SEQ",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FOriginalAmount
ELSE
"ASSISTRECORDS".FOriginalAmount
END "ASSISTRECORDS.ORIGINALAMOUNT",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FLocalAmount
ELSE
"ASSISTRECORDS".FLocalAmount
END "ASSISTRECORDS.LOCALAMOUNT",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FReportingAmount
ELSE
"ASSISTRECORDS".FReportingAmount
END "ASSISTRECORDS.REPORTINGAMOUNT",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FQuantity
ELSE
"ASSISTRECORDS".FQuantity
END "ASSISTRECORDS.QUANTITY",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FStandardQuantity
ELSE
"ASSISTRECORDS".FStandardQuantity
END "ASSISTRECORDS.STANDARDQTY",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FPrice
ELSE
"ASSISTRECORDS".FPrice
END "ASSISTRECORDS.PRICE",
CASE
WHEN ("ACCOUNT".FCAA IS NULL) THEN
NULL
ELSE
"ASSISTRECORDS".FAssGrpID
END "ASSGRP.ID"
FROM T_GL_Voucher "VOUCHER"
LEFT OUTER JOIN T_BD_Period "PERIOD" ON "VOUCHER".FPeriodID =
"PERIOD".FID
INNER JOIN T_GL_VoucherEntry "ENTRIES" ON "VOUCHER".FID =
"ENTRIES".FBillID
INNER JOIN T_BD_AccountView "ACCOUNT" ON "ENTRIES".FAccountID =
"ACCOUNT".FID
LEFT OUTER JOIN T_GL_VoucherAssistRecord "ASSISTRECORDS" ON "ENTRIES".FID =
"ASSISTRECORDS".FEntryID
WHERE "VOUCHER".FID IN
(SELECT "VOUCHER".FID "ID"
FROM T_GL_Voucher "VOUCHER"
INNER JOIN T_GL_VoucherEntry "ENTRIES" ON "VOUCHER".FID =
"ENTRIES".FBillID
INNER JOIN T_BD_AccountView "ACCOUNT" ON "ENTRIES".FAccountID =
"ACCOUNT".FID
INNER JOIN t_bd_accountview PAV ON ((INSTR("ACCOUNT".flongnumber,
pav.flongnumber) = 1 AND
pav.faccounttableid =
"ACCOUNT".faccounttableid) AND
pav.fcompanyid =
"ACCOUNT".fcompanyid)
WHERE (("VOUCHER".FCompanyID IN ('fSSF82rRSKexM3KKN1d0tMznrtQ=')) AND
(("VOUCHER".FBizStatus IN (5)) AND
((("VOUCHER".FPeriodID IN ('+wQxkBFVRiKnV7OniceMDoI4jEw=')) AND
"ENTRIES".FCurrencyID =
'dfd38d11-00fd-1000-e000-1ebdc0a8100dDEB58FDC') AND
(pav.FID IN ('vyPiKexLRXiyMb41VSVVzJ2pmCY='))))))
ORDER BY "ID" ASC, "ENTRIES.SEQ" ASC, "ASSISTRECORDS.SEQ" ASC;
执行计划如下。
-------------------------------------------------------------------------------------
| Id |Operation |Name |Rows|Bytes|Cost (%CPU)|
-------------------------------------------------------------------------------------
| 0 |SELECT STATEMENT | | 13| 5733| 486 (1)|
| 1 | SORT ORDER BY | | 13| 5733| 486 (1)|
| 2 | VIEW |VM_NWVW_2 | 13| 5733| 486 (1)|
| 3 | HASH UNIQUE | | 13|11115| 486 (1)|
| 4 | NESTED LOOPS OUTER | | 13|11115| 485 (1)|
| 5 | NESTED LOOPS | | 9| 6606| 471 (1)|
| 6 | NESTED LOOPS | | 9| 6057| 467 (1)|
| 7 | MERGE JOIN OUTER | | 1| 473| 459 (1)|
| 8 | HASH JOIN | | 1| 427| 458 (1)|
| 9 | NESTED LOOPS | | | | |
| 10 | NESTED LOOPS | | 258|83850| 390 (0)|
| 11 | NESTED LOOPS | | 6| 1332| 3 (0)|
| 12 | TABLE ACCESS BY INDEX ROWID|T_BD_ACCOUNTVIEW | 1| 111| 2 (0)|
| 13 | INDEX UNIQUE SCAN |PK_BD_ACCOUNTVIEW | 1| | 1 (0)|
| 14 | INDEX RANGE SCAN |IX_BD_ACTCOMLNUM | 6| 666| 1 (0)|
| 15 | INDEX RANGE SCAN |IX_GL_VCHAACCT | 489| | 1 (0)|
| 16 | TABLE ACCESS BY INDEX ROWID |T_GL_VOUCHERENTRY | 42| 4326| 65 (0)|
| 17 | INDEX RANGE SCAN |IX_GL_VCH_11 |7536| 750K| 68 (0)|
| 18 | BUFFER SORT | | 1| 46| 391 (0)|
| 19 | INDEX RANGE SCAN |IX_PERIOD_ENC | 1| 46| 1 (0)|
| 20 | TABLE ACCESS BY INDEX ROWID |T_GL_VOUCHERENTRY | 17| 3400| 8 (0)|
| 21 | INDEX RANGE SCAN |IX_GL_VCHENTRYFQ1 | 17| | 1 (0)|
| 22 | TABLE ACCESS BY INDEX ROWID |T_BD_ACCOUNTVIEW | 1| 61| 1 (0)|
| 23 | INDEX UNIQUE SCAN |PK_BD_ACCOUNTVIEW | 1| | 1 (0)|
| 24 | TABLE ACCESS BY INDEX ROWID |T_GL_VOUCHERASSISTRECORD| 1| 121| 2 (0)|
| 25 | INDEX RANGE SCAN |IX_GL_VCHASSREC_11 | 2| | 1 (0)|
-------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
执行计划中居然是’PLAN_TABLE’ is old version,无法看到谓词信息,这需要重建 PLAN_TABLE。因为没有谓词信息,所以就不打算从执行计划入手优化 SQL 了,而是选择直接分析 SQL,从 SQL 层面优化。
SQL 语句中,select 到 from 之间没有标量子查询,没有自定义函数,from 后面有 5 个表关联,where 条件中只有一个 in(子查询),没有其他过滤条件。SQL 语句中用到的表大小如图 9-8 所示。
图 9-8
SQL 语句中有 4 个表都是大表,只有一个表 T_BD_PERIOD 是小表,在 SQL 语句中与 T_GL_VOUCHER 外关联,是外连接的从表。如果走嵌套循环,T_BD_PERIOD 只能作为被驱动表,因此排除了让小表 T_BD_PERIOD 作为嵌套循环驱动表的可能性。如果该 SQL 没有过滤条件,以上 SQL 只能走 HASH 连接。
SQL 语句中唯一的过滤条件就是 in(子查询),因此只能把优化 SQL 的希望寄托在子查询身上。in(子查询)与表 T_GL_VOUCHER 进行关联,T_GL_VOUCHER 同时也是外连接的主表,如果 in(子查询)能过滤掉 T_GL_VOUCHER 大量数据,那么可以让 T_GL_VOUCHER 作为嵌套循环驱动表,一直与后面的表 NL 下去,这样或许能优化 SQL。如果 in(子查询)不能过滤掉大量数据,那么 SQL 就无法优化,最终只能全走 HASH。询问 in(子查询)返回多少行,运行多久,得到反馈:in(子查询)返回 16 880 条数据,耗时 23 秒。于是我们将 SQL 改写为 with as 子句,而且固化(/+ materialize /)with as 子查询,让 with as 子句作为嵌套循环驱动表。
with x as (
SELECT /*+ materialize */ "VOUCHER".FID "ID"
FROM T_GL_Voucher "VOUCHER"
INNER JOIN T_GL_VoucherEntry "ENTRIES" ON "VOUCHER".FID =
"ENTRIES".FBillID
INNER JOIN T_BD_AccountView "ACCOUNT" ON "ENTRIES".FAccountID =
"ACCOUNT".FID
INNER JOIN t_bd_accountview PAV ON ((INSTR("ACCOUNT".flongnumber,
pav.flongnumber) = 1 AND
pav.faccounttableid =
"ACCOUNT".faccounttableid) AND
pav.fcompanyid =
"ACCOUNT".fcompanyid)
WHERE (("VOUCHER".FCompanyID IN ('fSSF82rRSKexM3KKN1d0tMznrtQ=')) AND
(("VOUCHER".FBizStatus IN (5)) AND
((("VOUCHER".FPeriodID IN ('+wQxkBFVRiKnV7OniceMDoI4jEw=')) AND
"ENTRIES".FCurrencyID =
'dfd38d11-00fd-1000-e000-1ebdc0a8100dDEB58FDC') AND
(pav.FID IN ('vyPiKexLRXiyMb41VSVVzJ2pmCY=')))))
)
SELECT "VOUCHER".FID "ID",
"ENTRIES".FID "ENTRIES.ID",
"ENTRIES".FEntryDC "ENTRIES.ENTRYDC",
"ACCOUNT".FID "ACCOUNT.ID",
"ENTRIES".FCurrencyID "CURRENCY.ID",
"PERIOD".FNumber "PERIOD.NUMBER",
"ENTRIES".FSeq "ENTRIES.SEQ",
"ENTRIES".FLocalExchangeRate "LOCALEXCHANGERATE",
"ENTRIES".FReportingExchangeRate "REPORTINGEXCHANGERATE",
"ENTRIES".FMeasureUnitID "ENTRYMEASUREUNIT.ID",
"ASSISTRECORDS".FID "ASSISTRECORDS.ID",
"ASSISTRECORDS".FSeq "ASSISTRECORDS.SEQ",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FOriginalAmount
ELSE
"ASSISTRECORDS".FOriginalAmount
END "ASSISTRECORDS.ORIGINALAMOUNT",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FLocalAmount
ELSE
"ASSISTRECORDS".FLocalAmount
END "ASSISTRECORDS.LOCALAMOUNT",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FReportingAmount
ELSE
"ASSISTRECORDS".FReportingAmount
END "ASSISTRECORDS.REPORTINGAMOUNT",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FQuantity
ELSE
"ASSISTRECORDS".FQuantity
END "ASSISTRECORDS.QUANTITY",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FStandardQuantity
ELSE
"ASSISTRECORDS".FStandardQuantity
END "ASSISTRECORDS.STANDARDQTY",
CASE
WHEN (("ACCOUNT".FCAA IS NULL) AND
("ACCOUNT".FhasUserProperty <> 1)) THEN
"ENTRIES".FPrice
ELSE
"ASSISTRECORDS".FPrice
END "ASSISTRECORDS.PRICE",
CASE
WHEN ("ACCOUNT".FCAA IS NULL) THEN
NULL
ELSE
"ASSISTRECORDS".FAssGrpID
END "ASSGRP.ID"
FROM T_GL_Voucher "VOUCHER"
LEFT OUTER JOIN T_BD_Period "PERIOD" ON "VOUCHER".FPeriodID =
"PERIOD".FID
INNER JOIN T_GL_VoucherEntry "ENTRIES" ON "VOUCHER".FID =
"ENTRIES".FBillID
INNER JOIN T_BD_AccountView "ACCOUNT" ON "ENTRIES".FAccountID =
"ACCOUNT".FID
LEFT OUTER JOIN T_GL_VoucherAssistRecord "ASSISTRECORDS" ON "ENTRIES".FID =
"ASSISTRECORDS".FEntryID
WHERE "VOUCHER".FID IN
(select id from x)
ORDER BY "ID" ASC, "ENTRIES.SEQ" ASC, "ASSISTRECORDS.SEQ" ASC;
改写后的执行计划如下。
-------------------------------------------------------------------------------------
|Id |Operation |Name |Rows |Bytes|Cost(%CPU)|
--------------------------------------------------------------------------------------------
| 0|SELECT STATEMENT | | 24|11208| 506 (1)|
| 1| TEMP TABLE TRANSFORMATION | | | | |
| 2| LOAD AS SELECT |SYS_TEMP_0FD9D6853_1AD5C99D| | | |
| 3| HASH JOIN | | 1| 415| 458 (1)|
| 4| NESTED LOOPS | | | | |
| 5| NESTED LOOPS | | 258|83850| 390 (0)|
| 6| NESTED LOOPS | | 6| 1332| 3 (0)|
| 7| TABLE ACCESS BY INDEX ROWID|T_BD_ACCOUNTVIEW | 1| 111| 2 (0)|
| 8| INDEX UNIQUE SCAN |PK_BD_ACCOUNTVIEW | 1| | 1 (0)|
| 9| INDEX RANGE SCAN |IX_BD_ACTCOMLNUM | 6| 666| 1 (0)|
| 10| INDEX RANGE SCAN |IX_GL_VCHAACCT | 489| | 1 (0)|
| 11| TABLE ACCESS BY INDEX ROWID |T_GL_VOUCHERENTRY | 42| 4326| 65 (0)|
| 12| INDEX RANGE SCAN |IX_GL_VCH_11 | 7536| 662K| 68 (0)|
| 13| SORT ORDER BY | | 24|11208| 48 (5)|
| 14| NESTED LOOPS OUTER | | 24|11208| 47 (3)|
| 15| NESTED LOOPS | | 17| 6086| 21 (5)|
| 16| NESTED LOOPS | | 17| 5253| 13 (8)|
| 17| NESTED LOOPS OUTER | | 1| 121| 5 (20)|
| 18| NESTED LOOPS | | 1| 87| 4 (25)|
| 19| VIEW |VW_NSO_1 | 1| 29| 2 (0)|
| 20| HASH UNIQUE | | 1| 24| |
| 21| VIEW | | 1| 24| 2 (0)|
| 22| TABLE ACCESS FULL |SYS_TEMP_0FD9D6853_1AD5C99D| 1| 29| 2 (0)|
| 23| INDEX RANGE SCAN |IX_GL_VCH_FIDCMPNUM | 1| 58| 1 (0)|
| 24| INDEX RANGE SCAN |IX_PERIOD_ENC | 1| 34| 1 (0)|
| 25| TABLE ACCESS BY INDEX ROWID|T_GL_VOUCHERENTRY | 17| 3196| 8 (0)|
| 26| INDEX RANGE SCAN |IX_GL_VCHENTRYFQ1 | 17| | 1 (0)|
| 27| TABLE ACCESS BY INDEX ROWID |T_BD_ACCOUNTVIEW | 1| 49| 1 (0)|
| 28| INDEX UNIQUE SCAN | PK_BD_ACCOUNTVIEW | 1| | 1 (0)|
| 29| TABLE ACCESS BY INDEX ROWID |T_GL_VOUCHERASSISTRECORD | 1| 109| 2 (0)|
| 30| INDEX RANGE SCAN |IX_GL_VCHASSREC_11 | 2| | 1 (0)|
-------------------------------------------------------------------------------------
将 SQL 改写之后,能在 1 分钟内执行完毕,最终 SQL 返回 42 956 条数据。
为什么要将 in 子查询改写为 with as 呢?这是因为原始 SQL 中,in 子查询比较复杂,想直接使用 HINT 让 in 子查询作为嵌套循环驱动表反向驱动主表比较困难,所以将 in 子查询改写为 with as。需要注意的是 with as 子句中必须要添加 HINT:**/*+ materialize */**
,同时主表与子查询关联列必须有索引,如果不添加 HINT: /*+ materialize */
,如果主表与子查询关联列没有索引,优化器就不会自动将 with as 作为嵌套循环驱动表。with as 子句添加了**/*+ materialize */**
会生成一个临时表,这时,就将复杂的 in 子查询简单化了,之后优化器会将 with as 子句展开(unnesting),将子查询展开一般是子查询与主表进行 HASH 连接,或者是子查询作为嵌套循环驱动表与主表进行关联,一般不会是主表作为嵌套循环驱动表,因为主表作为嵌套循环驱动表可以直接走 Filter,不用展开。优化器发现 with as 子句数据量较小,而主表较大,而且主表连接列有索引,于是自动让 with as 子句固化的结果作为了嵌套循环驱动表。
注:with as 其实就是把一大堆重复用到的sql语句放在with as里面,取一个别名,后面的查询就可以用它,这样对于大批量的sql语句起到一个优化的作用,而且清楚明了。