author:彭程
介绍
dble优化是在生成的语法抽象树的基础上,基于一些规则改写其逻辑关系表达式,并将一些能下推的关系表达式进行下推。
代码分析
优化过程主要在MyOptimizer#optimize方法中,代码结构如下,下面具体分析各个优化步骤
public static PlanNode optimize(PlanNode node) {TraceManager.TraceObject traceObject = TraceManager.threadTrace("optimize-for-sql");TraceManager.log(ImmutableMap.of("plan-node", node), traceObject);try {// PreProcessor SubQuery ,transform in sub query to join预处理子查询if (SystemConfig.getInstance().isInSubQueryTransformToJoin()) { //判断子查询是否转换为joinnode = SubQueryPreProcessor.optimize(node);} else {node = SubQueryPreNoTransformProcessor.optimize(node);}updateReferredTableNodes(node);int existGlobal = checkGlobalTable(node, new HashSet<>());if (node.type() == PlanNode.PlanNodeType.QUERY || node.isExistView() || existGlobal != 1 || node.isWithSubQuery() || node.isContainsSubQuery() || !PlanUtil.hasNoFakeNode(node)) {// optimizer sub query [Derived Tables (Subqueries in the FROM Clause)]//子查询优化器[派生表(FROM子句中的子查询)]// transform right join to left join 将right join转为left joinnode = JoinPreProcessor.optimize(node);//优化总是为真或总是为假的过滤条件// filter expr which is always true/falsenode = FilterPreProcessor.optimize(node);// push down the filter which may contains ER KEYnode = FilterJoinColumnPusher.optimize(node);node = JoinERProcessor.optimize(node);if (existGlobal >= 0) {node = GlobalTableProcessor.optimize(node); //存在全局表优化}// push down filter//将过滤条件下推node = FilterPusher.optimize(node);//将orderby下推node = OrderByPusher.optimize(node);//将limit下推,如存在union情况node = LimitPusher.optimize(node);node = SelectedProcessor.optimize(node);boolean useJoinStrategy = SystemConfig.getInstance().isUseJoinStrategy();if (useJoinStrategy) {node = JoinStrategyProcessor.optimize(node);}}return node;} catch (MySQLOutPutException e) {LoggerFactory.getLogger(MyOptimizer.class).error(node.toString(), e);throw e;} finally {TraceManager.finishSpan(traceObject);}}
预处理子查询SubQueryPreProcessor#optimize和SubQueryPreNoTransformProcessor#optimize
代码如下
// PreProcessor SubQuery ,transform in sub query to join预处理子查询if (SystemConfig.getInstance().isInSubQueryTransformToJoin()) { //判断子查询是否转换为joinnode = SubQueryPreProcessor.optimize(node);} else {node = SubQueryPreNoTransformProcessor.optimize(node);}
dble根据SystemConfig.isInSubQueryTransformToJoin参数进行判断,从SystemConfig文件可以看到inSubQueryTransformToJoin参数为只读参数,代表是否将insubquery转换为join,默认为false
readOnlyParams.add(new ParamInfo("inSubQueryTransformToJoin",sysConfig.isInSubQueryTransformToJoin() + "","The inSubQuery is transformed into the join ,the default value is false"));
判断为false会进入SubQueryPreNoTransformProcessor#optimize方法
public static PlanNode optimize(PlanNode qtn) {TraceManager.TraceObject traceObject = TraceManager.threadTrace("for-subquery");try {MergeHavingFilter.optimize(qtn); //对having进行优化,将非聚合函数条件转为where条件qtn = handlerComparisonsSubQuery(qtn);return qtn;} finally {TraceManager.log(ImmutableMap.of("plan-node", qtn), traceObject);TraceManager.finishSpan(traceObject);}}
首先执行MergeHavingFilter#optimize方法,对having条件进行优化,主要是判断having条件是否可以转为where条件,判断依据为条件中是否有聚合函数,代码分析如下
public static void optimize(PlanNode qtn) {if (qtn.getHavingFilter() != null) {List<Item> subFilters = FilterUtils.splitFilter(qtn.getHavingFilter()); //将每一个having条件放入列表List<Item> canMergeSubs = new ArrayList<>(); //可以转换的条件列表for (Item subFilter : subFilters) {if (!subFilter.isWithSumFunc()) { //判断是否为聚合条件canMergeSubs.add(subFilter);}}subFilters.removeAll(canMergeSubs);qtn.having(FilterUtils.and(subFilters)); //将不满足条件的having条件放回having条件qtn.setWhereFilter(FilterUtils.and(qtn.getWhereFilter(), FilterUtils.and(canMergeSubs))); //将满足条件的having条件转为where条件}for (PlanNode child : qtn.getChildren()) //如果有子查询递归子查询optimize(child);}}
如以下SQL语句,having条件有两个:id>1和max(id)>1,前一个条件就能优化为where条件,执行优化后结果如下:
select content,max(id) from tb_enum_sharding group by content having id>=1 and max(id)>1;

优化前 优化后
JoinPreProcessor#optimize
JoinPreProcessor#optimize会将SQL的right join都转为left join,如A right join B on A.id = B.id将转为B left join B on A.id = B.id,JoinPreProcessor#optimize是调用JoinPreProcessor#findAndChangeRightJoinToLeftJoin方法
// transform right join to left join 将right join转为left joinnode = JoinPreProcessor.optimize(node);
public static PlanNode optimize(PlanNode qtn) {TraceManager.TraceObject traceObject = TraceManager.threadTrace("optimize-for-join-order");try {qtn = findAndChangeRightJoinToLeftJoin(qtn);return qtn;} finally {TraceManager.log(ImmutableMap.of("plan-node", qtn), traceObject);TraceManager.finishSpan(traceObject);}}
JoinPreProcessor#findAndChangeRightJoinToLeftJoin会便利所有的子句,对每一个子句都递归调用该方法,如果不存在子句了会继续调用JoinNode#exchangeLeftAndRight方法
private static PlanNode findAndChangeRightJoinToLeftJoin(PlanNode qtn) {for (PlanNode child : qtn.getChildren()) {findAndChangeRightJoinToLeftJoin(child); //对每一个子句递归调用优化函数}if (qtn instanceof JoinNode && ((JoinNode) qtn).isRightOuterJoin()) {JoinNode jn = (JoinNode) qtn;jn.exchangeLeftAndRight(); //交换左右连接}return qtn;}
JoinNode#exchangeLeftAndRight方法中实现了左右node交换
public void exchangeLeftAndRight() { //交换左右连接PlanNode tmp = this.getLeftNode();this.setLeftNode(this.getRightNode());this.setRightNode(tmp);boolean tmpOuter = this.leftOuter;this.leftOuter = this.rightOuter;this.rightOuter = tmpOuter;this.buildJoinColumns(true);}
示例SQL如下,执行该优化后可以看到所有的左连接转化为了右连接
select * from tb_enum_sharding a right join tb_hash_sharding b on a.id = b.id where a.id >= 0;
LimitPusher#optimize
LimitPusher#optimize可以对limit进行下推,如存在union情况,可以将limit下推到每一个子句之中,LimitPusher#optimize是调用LimitPusher#findChild方法实现的
//将limit下推,如存在union情况node = LimitPusher.optimize(node);
public static PlanNode optimize(PlanNode qtn) {TraceManager.TraceObject traceObject = TraceManager.threadTrace("optimize-for-limit");try {qtn = findChild(qtn);return qtn;} finally {TraceManager.log(ImmutableMap.of("plan-node", qtn), traceObject);TraceManager.finishSpan(traceObject);}}
LimitPusher#findChild方法记录了有效的limit数据,并调用LimitPusher#pushLimit方法实现下推
private static PlanNode findChild(PlanNode qtn) { //limit优化器,对union进行优化if (qtn instanceof MergeNode) {// optimizer limit// union: push down limit to children// union all:push down limit to children and add distinctMergeNode node = (MergeNode) qtn;long limitFrom = node.getLimitFrom(); //记录limit fromlong limitTo = node.getLimitTo(); //记录limit toif (limitFrom != -1 && limitTo != -1) { //如果存在有效的limitfor (PlanNode child : node.getChildren()) { //下推到每一个子句pushLimit(child, limitFrom, limitTo, node.isUnion());}}} else if ((qtn instanceof JoinNode) || (qtn instanceof QueryNode)) {for (PlanNode child : qtn.getChildren()) {findChild(child);}}return qtn;}
LimitPusher#pushLimit方法对每一个子句下推了limit
private static void pushLimit(PlanNode node, long limitFrom, long limitTo, boolean isUnion) {if (isUnion) {node.setDistinct(true);}if (node.getLimitFrom() == -1 && node.getLimitTo() == -1) {node.setLimitFrom(0);node.setLimitTo(limitFrom + limitTo);}}
示例SQL如下,经过limit优化之后可以看到limit已经下推到了每个子句中
select * from tb_enum_sharding a union select * from tb_hash_sharding b limit 2;

优化前 优化后

