本案例发生在 2011 年,当时作者罗老师在惠普担任开发 DBA,支撑宝洁公司的数据仓库项目。为了避免泄露信息,他对 SQL 语句做了适当修改。Obiee 终端用户发来邮件说某报表执行了 30 分钟还不出结果,请求协助。通过与 Obiee 开发人员合作,找到报表 SQL 语句如下。
select sum(T2083114.MANUL_COST_OVRRD_AMT) as c1,
sum(nvl(T2083114.REVSD_VAR_ESTMT_COST_AMT , 0)) as c2,
T2084525.ACCT_LONG_NAME as c3,
T2084525.NAME as c4,
T2083424.PRMTN_NAME as c5,
T2083424.PRMTN_ID as c6,
case when case when T2083424.CORP_PRMTN_TYPE_CODE = 'Target Account'
then 'Corporate' else T2083424.CORP_PRMTN_TYPE_CODE end is null
then 'Private' else case when T2083424.CORP_PRMTN_TYPE_CODE = 'Target Account'
then 'Corporate' else T2083424.CORP_PRMTN_TYPE_CODE end end as c7,
T2083424.PRMTN_STTUS_CODE as c8,
T2083424.APPRV_BY_DESC as c9,
T2083424.APPRV_STTUS_CODE as c10,
T2083424.AUTO_UPDT_GTIN_IND as c11,
T2083424.CREAT_DATE as c12,
T2083424.PGM_START_DATE as c13,
T2083424.PGM_END_DATE as c14,
nvl(case when T2083424.PRMTN_STTUS_CODE = 'Confirmed'
then cast(( TRUNC( TO_DATE('2011-06-07' , 'YYYY-MM-DD') ) - TRUNC( T2083424.PGM_END_DATE ) ) as VARCHAR ( 10 ) ) end , '') as c15,
T2083424.PRMTN_STOP_DATE as c16,
T2083424.SHPMT_START_DATE as c17,
T2083424.SHPMT_END_DATE as c18,
T2083424.CNBLN_WK_CNT as c19,
T2083424.ACTVY_DETL_POP as c20,
T2083424.CMMNT_DESC as c21,
T2083424.PRMTN_AVG_POP as c22,
T2084525.CHANL_TYPE_DESC as c23,
T2083424.PRMTN_SKID as c24
from
OPT_ACCT_FDIM T2084525 /* OPT_ACCT_PRMTN_FDIM */ ,
OPT_BUS_UNIT_FDIM T2083056,
OPT_CAL_MASTR_DIM T2083357 /* OPT_CAL_MASTR_DIM01 */ ,
OPT_PRMTN_FDIM T2083424,
OPT_ACTVY_FCT T2083114
where (T2083056.BUS_UNIT_SKID = T2083114.BUS_UNIT_SKID and T2083114.BUS_UNIT_SKID = T2084525.BUS_UNIT_SKID
and T2083114.DATE_SKID = T2083357.CAL_MASTR_SKID and T2083114.BUS_UNIT_SKID = T2083424.BUS_UNIT_SKID
and T2083114.PRMTN_SKID = T2083424.PRMTN_SKID and T2083056.BUS_UNIT_NAME = 'Chile'
and T2083114.ACCT_PRMTN_SKID = T2084525.ACCT_SKID and T2083357.FISC_YR_ABBR_NAME = 'FY10/11'
and T2084525.ACCT_LONG_NAME is not null and (case when T2083424.CORP_PRMTN_TYPE_CODE = 'Target Account'
then 'Corporate' else T2083424.CORP_PRMTN_TYPE_CODE end in ('Alternate BDF', 'Corporate', 'Private'))
and (T2084525.ACCT_LONG_NAME in ('ADELCO - CHILE - 0066009018', 'ALIMENTOS FRUNA - CHILE - 0066009049',
'CENCOSUD - CHILE - 0066009007', 'COMERCIAL ALVI - CHILE - 0066009070', 'D&S - CHILE - 0066009008',
'DIPAC - CHILE - 0066009024', 'DIST. COMERCIAL - CHILE - 0066009087', 'DISTRIBUCION LAGOS S.A. - CHILE - 2001146505',
'ECOMMERCE ESCALA 1 - 1900001746', 'EMILIO SANDOVAL - CHILE - 2000402293', 'F. AHUMADA - CHILE - 0066009023',
'FALABELLA - CHILE - 2000406971', 'FRANCISCO LEYTON - CHILE - 0066009142', 'MAICAO - CHILE - 0066009135',
'MARGARITA UAUY - CHILE - 0066009146', 'PREUNIC - CHILE - 0066009032', 'PRISA DISTRIBUCION - CHILE - 2001419970',
'RABIE - CHILE - 0066009015', 'S Y B FARMACEUTICA S.A. - CHILE - 2000432938',
'SOC. INV. LA MUNDIAL LTDA - CHILE - 2001270967', 'SOCOFAR - CHILE - 0066009028',
'SODIMAC - CHILE - 2000402358', 'SOUTHERN CROSS - CHILE - 2002135799',
'SUPERM. MONSERRAT - CHILE - 0066009120', 'TELEMERCADOS EUROPA - CHILE - 0066009044'))
and T2083424.PRMTN_LONG_NAME in (select distinct T2083424.PRMTN_LONG_NAME as c1
from
OPT_ACCT_FDIM T2084525 /* OPT_ACCT_PRMTN_FDIM */ ,
OPT_BUS_UNIT_FDIM T2083056,
OPT_CAL_MASTR_DIM T2083357 /* OPT_CAL_MASTR_DIM01 */ ,
OPT_PRMTN_FDIM T2083424,
OPT_PRMTN_PROD_FLTR_LKP T2083698
where ( T2083056.BUS_UNIT_SKID = T2083698.BUS_UNIT_SKID and T2083357.CAL_MASTR_SKID = T2083698.DATE_SKID
and T2083698.ACCT_PRMTN_SKID = T2084525.ACCT_SKID and T2083424.PRMTN_SKID = T2083698.PRMTN_SKID
and T2083424.BUS_UNIT_SKID = T2083698.BUS_UNIT_SKID and T2083056.BUS_UNIT_NAME = 'Chile'
and T2083357.FISC_YR_ABBR_NAME = 'FY10/11' and T2083698.BUS_UNIT_SKID = T2084525.BUS_UNIT_SKID
and (case when T2083424.CORP_PRMTN_TYPE_CODE = 'Target Account' then 'Corporate'
else T2083424.CORP_PRMTN_TYPE_CODE end in ('Alternate BDF', 'Corporate', 'Private'))
and (T2084525.ACCT_LONG_NAME in ('ADELCO - CHILE - 0066009018',
'ALIMENTOS FRUNA - CHILE - 0066009049', 'CENCOSUD - CHILE - 0066009007',
'COMERCIAL ALVI - CHILE - 0066009070', 'D&S - CHILE - 0066009008',
'DIPAC - CHILE - 0066009024', 'DIST. COMERCIAL - CHILE - 0066009087',
'DISTRIBUCION LAGOS S.A. - CHILE - 2001146505', 'ECOMMERCE ESCALA 1 - 1900001746',
'EMILIO SANDOVAL - CHILE - 2000402293', 'F. AHUMADA - CHILE - 0066009023',
'FALABELLA - CHILE - 2000406971', 'FRANCISCO LEYTON - CHILE - 0066009142',
'MAICAO - CHILE - 0066009135', 'MARGARITA UAUY - CHILE - 0066009146',
'PREUNIC - CHILE - 0066009032', 'PRISA DISTRIBUCION - CHILE - 2001419970',
'RABIE - CHILE - 0066009015', 'S Y B FARMACEUTICA S.A. - CHILE - 2000432938',
'SOC. INV. LA MUNDIAL LTDA - CHILE - 2001270967', 'SOCOFAR - CHILE - 0066009028',
'SODIMAC - CHILE - 2000402358', 'SOUTHERN CROSS - CHILE - 2002135799',
'SUPERM. MONSERRAT - CHILE - 0066009120', 'TELEMERCADOS EUROPA - CHILE - 0066009044')) ) ) )
group by T2083424.PRMTN_SKID, T2083424.PRMTN_ID, T2083424.PRMTN_NAME, T2083424.SHPMT_END_DATE,
T2083424.SHPMT_START_DATE, T2083424.PRMTN_STTUS_CODE, T2083424.APPRV_STTUS_CODE, T2083424.CMMNT_DESC,
T2083424.PGM_START_DATE, T2083424.PGM_END_DATE, T2083424.CREAT_DATE, T2083424.APPRV_BY_DESC,
T2083424.AUTO_UPDT_GTIN_IND, T2083424.PRMTN_STOP_DATE, T2083424.ACTVY_DETL_POP, T2083424.CNBLN_WK_CNT,
T2083424.PRMTN_AVG_POP, T2084525.NAME, T2084525.CHANL_TYPE_DESC, T2084525.ACCT_LONG_NAME,
case when case when T2083424.CORP_PRMTN_TYPE_CODE = 'Target Account' then 'Corporate'
else T2083424.CORP_PRMTN_TYPE_CODE end is null then 'Private' else case
when T2083424.CORP_PRMTN_TYPE_CODE = 'Target Account' then 'Corporate'
else T2083424.CORP_PRMTN_TYPE_CODE end end ,
nvl(case when T2083424.PRMTN_STTUS_CODE = 'Confirmed'
then cast(( TRUNC( TO_DATE('2011-06-07' , 'YYYY-MM-DD') ) - TRUNC( T2083424.PGM_END_DATE ) ) as VARCHAR ( 10 ) ) end , '')
order by c24, c3;
该 SQL 是 Obiee 报表工具自动生成的,所以看起来有些凌乱。对于很长的 SQL,我们可以运用 SQL 三段分拆方法,快速查看 SQL 写法有没有性能问题。经过检查,SQL 写法没有任何问题。检查完 SQL 写法之后,我们没有直接检查执行计划,因为执行计划也比较长,因此使用自己编写的脚本抓出该 SQL 要用到的表信息,如下所示。
TABLE_NAME Size(Mb) PARTITIONED DEGREE NUM_ROWS
--------------------- ---------- -------------------- ---------- -------------
*OPT_BUS_UNIT_FDIM .001037598 NO 1 16
*OPT_CAL_MASTR_DIM 38.1284523 NO 1 37435
OPT_CAL_MASTR_DIM 38.1284523 NO 1 37435
*OPT_PRMTN_FDIM 74.6365929 YES 1 52140
OPT_PRMTN_FDIM 74.6365929 YES 1 52140
OPT_ACTVY_FCT 19.3430614 YES 1 157230
*OPT_ACCT_FDIM 36.6709185 YES 2 95415
OPT_ACCT_FDIM 36.6709185 YES 2 95415
OPT_PRMTN_PROD_FLTR_LKP 1523.87207 YES 2 30148975
「」号表示该表在执行计划中使用到了索引。一般情况下,只有大表才会引发 SQL 性能问题,SQL 中 OPT_PRMTN_PROD_FLTR_LKP 表走的是全表扫描,有 3 000 万行数据,1.5GB,其他表都是小表。需要说明的是,表 OPT_PRMTN_PROD_FLTR_LKP 大小应该不止 1.5GB,因为当时没有通过 DBA_SEGMENTS 来获取表大小,而是通过 DBA_TABLES
中`NUM_ROWS`AVG_ROW_LEN*
估算得来,因为 OPT_PRMTN_PROD_FLTR_LKP 是一个分区表,DBA_TABLES 中的统计不是十分准确。找到大表之后,在我们查看执行计划的时候首先就应该关注大表,SQL 的执行计划如图 9-9 所示(因为执行计划比较长,所以采用截图方式并且省略了谓词)。
图 9-9 SQL 执行计划
Id=30 就是大表在执行计划中的位置,Id=29 是 Id=30 的父亲,它与 Id=8 对齐。Id=7 是嵌套循环,它是 Id=8 与 Id=29 的父亲。通过分析执行计划,我们发现 OPTPRMTN_PROD_FLTR LKP 做了嵌套循环(Id=7)的被驱动表,而且没有走索引,这就是为什么 Obiee 报表执行了 30 分钟还没执行完毕。我们查看 Id=30 的过滤条件如下。
30 - filter("T2083056"."BUS_UNIT_SKID"="T2083698"."BUS_UNIT_SKID" AND
"T2083424"."PRMTN_SKID"="T2083698"."PRMTN_SKID" AND
"T2083424"."BUS_UNIT_SKID"="T2083698"."BUS_UNIT_SKID")
我们根据过滤条件创建索引从而让 NL 被驱动表走索引。
create index OPT_PRMTN_PROD_FLTR_LKP_NX1 ON OPT_PRMTN_PROD_FLTR_LKP(BUS_UNIT_SKID,PRMTN_SKID) nologging parallel ;
Index created.
Elapsed: 00:00:33.04
创建索引花了 33 分钟,如图 9-10 所示,我们再来看一下 SQL 的执行计划,查看带有 A-TIME 的执行计划。
创建完索引之后,Obiee 报表能在 4 分钟内执行完所有数据。我们注意观察执行计划 Id=11,优化器评估返回 5 行数据,但是实际上返回了 11248 行数据,这导致后续表连接方式全采用了嵌套循环。Id=11 是两表 HASH 连接之后的结果集,如果能够纠正 Id=11 估算 Rows 的误差,那么优化器应该能够自我优化该报表。Id=11 是两个表中两个列关联的结果集,优化器一般对多个列进行 Rows 估算的时候通常容易算错,于是对 Id=11 中两个表的连接列收集了扩展统计信息。
图 9-10
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(USER, 'OPT_ACCT_FDIM', '(BUS_UNIT_SKID, ACCT_SKID)') FROM DUAL;
DBMS_STATS.CREATE_EXTENDED_STATS(USER,'OPT_ACCT_FDIM','(BUS_UNIT_SKID,ACCT_SKID)')
----------------------------------------------------------------------------------
SYS_STUJ8OD#X2IPA_B9_CH00B046T
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(USER, 'OPT_ACTVY_FCT', '(BUS_UNIT_SKID, ACCT_PRMTN_SKID)') FROM DUAL;
DBMS_STATS.CREATE_EXTENDED_STATS(USER,'OPT_ACTVY_FCT','(BUS_UNIT_SKID,ACCT_PRMTN_SKID)')
-----------------------------------------------------------------------------------
SYS_STU#CVQNKK5CCM0W2XEQWSRXSM
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'XXXXX', ---为了保密,用户名做了更改
tabname => 'OPT_ACCT_FDIM',
estimate_percent => 20,
method_opt => 'for all columns size auto',
degree => 6,
granularity => 'ALL',
cascade=>TRUE
);
END;
PL/SQL procedure successfully completed.
Elapsed: 00:00:57.76
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'XXXX', ---为了保密,用户名做了更改
tabname => 'OPT_ACTVY_FCT',
estimate_percent => 20,
method_opt => 'for all columns size auto',
degree => 6,
granularity => 'ALL',
cascade=>TRUE
);
END;
PL/SQL procedure successfully completed.
Elapsed: 00:01:15.10
收集完扩展统计信息之后,SQL 能在 1 秒左右执行完毕,带有 A-Time 的执行计划如图 9-11 所示。
图 9-11
大家在工作中如果遇到多列过滤或者多列关联 Rows 估算出现较大偏差的时候,不妨收集扩展统计信息试一试。
其实当时是项目经理找到作者罗老师来优化 SQL 的,当时他应该是被美国宝洁的客户批评了。客户的原话是说:「我已经抽完一支烟了,报表还没打开,我原本以为当我抽完第二支烟的时候报表能打开,谁知当我抽完第三支烟的时候报表还没打开!」罗老师优化完报表之后,幽默地说了句,现在客户可以在掏打火机、烟还没点燃之前就能打开报表了。