PlanOptimizer
// This method was introduced to separate logical planning from the query analysis stage
// and to allow plans to be overwritten by CachedSqlQueryExecution
protected Plan createPlan(Analysis analysis,
        Session session,
        List<PlanOptimizer> planOptimizers,
        PlanNodeIdAllocator idAllocator,
        Metadata metadata,
        TypeAnalyzer typeAnalyzer,
        StatsCalculator statsCalculator,
        CostCalculator costCalculator,
        WarningCollector warningCollector)
{
    // Create the logical planner
    LogicalPlanner logicalPlanner = new LogicalPlanner(
            session,
            planOptimizers,
            idAllocator,
            metadata,
            typeAnalyzer,
            statsCalculator,
            costCalculator,
            warningCollector);
    // Build the logical plan
    return logicalPlanner.plan(analysis);
}
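Because createPlan is protected, a subclass can replace the planning step. The sketch below shows one plausible shape of such an override using toy types. The classes QueryExecution and CachingQueryExecution, the String-based plan, and the cache keyed by query text are all assumptions made for illustration; this is not the actual CachedSqlQueryExecution logic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an overridable planning hook; none of these types are the engine's own.
class CreatePlanHookSketch {
    // Stand-in for the base query execution: plans are plain strings here.
    static class QueryExecution {
        int plansBuilt = 0;

        // The overridable hook, analogous in spirit to createPlan(...) above.
        protected String createPlan(String analyzedQuery) {
            plansBuilt++;
            return "Plan(" + analyzedQuery + ")";
        }

        // The fixed part of the workflow always goes through the hook.
        final String run(String analyzedQuery) {
            return createPlan(analyzedQuery);
        }
    }

    // Stand-in for a caching subclass: reuses a plan when the same query is seen again.
    static class CachingQueryExecution extends QueryExecution {
        private final Map<String, String> cache = new HashMap<>();

        @Override
        protected String createPlan(String analyzedQuery) {
            // Only fall back to the normal planner on a cache miss.
            return cache.computeIfAbsent(analyzedQuery, super::createPlan);
        }
    }

    public static void main(String[] args) {
        QueryExecution execution = new CachingQueryExecution();
        execution.run("SELECT 1");
        execution.run("SELECT 1");
        System.out.println("plans built: " + execution.plansBuilt);
    }
}
```

Keeping the hook as a single protected method means the rest of the query workflow does not need to know whether a plan was freshly built or substituted.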
Running the optimizers
public Plan plan(Analysis analysis)
{
    return plan(analysis, Stage.OPTIMIZED_AND_VALIDATED);
}

public Plan plan(Analysis analysis, Stage stage)
{
    // Build the initial plan from the statement and take its root node
    PlanNode root = planStatement(analysis, analysis.getStatement());
    // Check that the intermediate plan is valid
    planSanityChecker.validateIntermediatePlan(root, session, metadata, typeAnalyzer, planSymbolAllocator.getTypes(), warningCollector);
    if (stage.ordinal() >= Stage.OPTIMIZED.ordinal()) {
        for (PlanOptimizer optimizer : planOptimizers) {
            if (OptimizerUtils.isEnabledLegacy(optimizer, session, root)) {
                // Apply this optimizer to the logical plan; its output is the input of the next optimizer
                root = optimizer.optimize(root, session, planSymbolAllocator.getTypes(), planSymbolAllocator, idAllocator,
                        warningCollector);
                requireNonNull(root, format("%s returned a null plan", optimizer.getClass().getName()));
            }
        }
    }
    if (stage.ordinal() >= Stage.OPTIMIZED_AND_VALIDATED.ordinal()) {
        // make sure we produce a valid plan after optimizations run. This is mainly to catch programming errors
        planSanityChecker.validateFinalPlan(root, session, metadata, typeAnalyzer, planSymbolAllocator.getTypes(), warningCollector);
    }
    TypeProvider types = planSymbolAllocator.getTypes();
    StatsProvider statsProvider = new CachingStatsProvider(statsCalculator, session, types);
    CostProvider costProvider = new CachingCostProvider(costCalculator, statsProvider, Optional.empty(), session, types);
    return new Plan(root, types, StatsAndCosts.create(root, statsProvider, costProvider));
}
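The control flow of plan(analysis, stage) can be reduced to a small, self-contained sketch: optimizers run one after another, each receiving the previous optimizer's output, and the stage decides whether optimization and final validation happen at all. Everything here is a simplified assumption: plans are plain strings, optimizers are UnaryOperator<String>, validate is a toy check, and the Stage constant CREATED is assumed as the value below OPTIMIZED.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.UnaryOperator;

// Hypothetical, simplified model of the optimizer loop; not the engine's real types.
class OptimizerPipelineSketch {
    // Order matters: the loop compares ordinals, as in the excerpt above.
    enum Stage { CREATED, OPTIMIZED, OPTIMIZED_AND_VALIDATED }

    static String plan(String initialPlan, List<UnaryOperator<String>> optimizers, Stage stage) {
        String root = initialPlan;
        if (stage.ordinal() >= Stage.OPTIMIZED.ordinal()) {
            for (UnaryOperator<String> optimizer : optimizers) {
                // Each optimizer rewrites the plan produced by the previous one.
                root = Objects.requireNonNull(
                        optimizer.apply(root),
                        optimizer.getClass().getName() + " returned a null plan");
            }
        }
        if (stage.ordinal() >= Stage.OPTIMIZED_AND_VALIDATED.ordinal()) {
            validate(root);
        }
        return root;
    }

    // Toy stand-in for the final sanity check.
    static void validate(String plan) {
        if (plan.isEmpty()) {
            throw new IllegalStateException("empty plan");
        }
    }

    // Two illustrative rewrites, applied in list order.
    static List<UnaryOperator<String>> demoOptimizers() {
        UnaryOperator<String> pushFilterIntoScan = p -> p.replace("Filter(TableScan)", "TableScan[filtered]");
        UnaryOperator<String> addOutput = p -> "Output(" + p + ")";
        return List.of(pushFilterIntoScan, addOutput);
    }

    public static void main(String[] args) {
        System.out.println(plan("Filter(TableScan)", demoOptimizers(), Stage.CREATED));
        System.out.println(plan("Filter(TableScan)", demoOptimizers(), Stage.OPTIMIZED_AND_VALIDATED));
    }
}
```

With Stage.CREATED the initial plan comes back unchanged; with the later stages every optimizer in the list is applied in order, which is why the order of the optimizer list affects the final plan.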
PredicatePushDown as an example: how predicate pushdown optimizes a plan
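The PredicatePushDown optimizer pushes filter conditions down the plan tree, toward the nodes that read the data. This also reduces the amount of data the rest of the plan has to process and improves execution efficiency.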
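- When passing through a Project node, only deterministic filter conditions are pushed down. If the predicate contains non-deterministic conditions, such as a call to the rand function, a Filter node is added for those conditions instead.
- When passing through a Union node, the filter conditions are pushed down into every subquery of the Union.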
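- For a Join node, the join involves two tables. A filter condition that refers to one of these tables is pushed down to the read of that table. In addition, if the column in a filter condition is the same as a column in the join condition, the filter is pushed down to the tables on both the left and the right side of the Join. For example, when lineitem and orders are joined on the order key, the filter l_orderkey = 1 is pushed down to the read of table lineitem, and the filter o_orderkey = 1 is pushed down to the read of table orders.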
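The Join rule can be illustrated with a minimal sketch. It assumes an inner join on a single equality key and handles only filters of the form column = constant. The class and method names (PredicatePushDownSketch, pushThroughInnerJoin, Equals) are hypothetical, and the extra columns l_quantity and o_totalprice are only there to give each side more than one column. The real optimizer works on full expressions and also covers other join types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of pushing "column = value" filters through an inner equi-join.
class PredicatePushDownSketch {
    // Toy predicate of the form "column = value".
    static final class Equals {
        final String column;
        final long value;

        Equals(String column, long value) {
            this.column = column;
            this.value = value;
        }

        @Override
        public String toString() {
            return column + " = " + value;
        }
    }

    // Where each filter ends up after pushdown.
    static final class JoinPushDown {
        final List<Equals> left = new ArrayList<>();      // pushed to the left table
        final List<Equals> right = new ArrayList<>();     // pushed to the right table
        final List<Equals> remaining = new ArrayList<>(); // stays above the join
    }

    static JoinPushDown pushThroughInnerJoin(
            List<Equals> filters,
            Set<String> leftColumns,
            Set<String> rightColumns,
            String leftKey,
            String rightKey) {
        JoinPushDown result = new JoinPushDown();
        for (Equals filter : filters) {
            if (leftColumns.contains(filter.column)) {
                result.left.add(filter);
                // The join guarantees leftKey = rightKey, so the same constant applies on the right.
                if (filter.column.equals(leftKey)) {
                    result.right.add(new Equals(rightKey, filter.value));
                }
            } else if (rightColumns.contains(filter.column)) {
                result.right.add(filter);
                if (filter.column.equals(rightKey)) {
                    result.left.add(new Equals(leftKey, filter.value));
                }
            } else {
                result.remaining.add(filter);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        JoinPushDown pushed = pushThroughInnerJoin(
                List.of(new Equals("l_orderkey", 1)),
                Set.of("l_orderkey", "l_quantity"),
                Set.of("o_orderkey", "o_totalprice"),
                "l_orderkey",
                "o_orderkey");
        System.out.println("lineitem side: " + pushed.left);  // [l_orderkey = 1]
        System.out.println("orders side: " + pushed.right);   // [o_orderkey = 1]
    }
}
```

Mirroring the filter onto the other side is only safe here because an inner equi-join keeps just the rows whose keys match; for outer joins the same shortcut would not be valid for the preserved side.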
Order