PlanOptimizer

    // This method was introduced to separate logical planning from the query analysis stage
    // and to allow plans to be overwritten by CachedSqlQueryExecution
    protected Plan createPlan(Analysis analysis,
            Session session,
            List<PlanOptimizer> planOptimizers,
            PlanNodeIdAllocator idAllocator,
            Metadata metadata,
            TypeAnalyzer typeAnalyzer,
            StatsCalculator statsCalculator,
            CostCalculator costCalculator,
            WarningCollector warningCollector)
    {
        // Build the logical planner
        LogicalPlanner logicalPlanner = new LogicalPlanner(
                session,
                planOptimizers,
                idAllocator,
                metadata,
                typeAnalyzer,
                statsCalculator,
                costCalculator,
                warningCollector);
        // Produce the logical plan
        return logicalPlanner.plan(analysis);
    }
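
The comment on createPlan says the method exists so that a subclass can override how the plan is produced. A minimal sketch of that idea follows. All class names here (ToyQueryExecution, ToyCachedQueryExecution) are hypothetical stand-ins, plans are modeled as plain strings, and nothing below reflects how CachedSqlQueryExecution is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

public class CachedPlanSketch
{
    // Hypothetical base class: planning sits behind an overridable method.
    static class ToyQueryExecution
    {
        int planningRuns; // counts how often the (expensive) planner actually ran

        protected String createPlan(String analyzedQuery)
        {
            planningRuns++;
            return "Plan(" + analyzedQuery + ")";
        }

        final String start(String analyzedQuery)
        {
            return createPlan(analyzedQuery);
        }
    }

    // Hypothetical subclass: reuses a previously built plan for the same query text.
    static class ToyCachedQueryExecution
            extends ToyQueryExecution
    {
        private final Map<String, String> planCache = new HashMap<>();

        @Override
        protected String createPlan(String analyzedQuery)
        {
            // Only fall back to the base planner on a cache miss.
            return planCache.computeIfAbsent(analyzedQuery, super::createPlan);
        }
    }

    public static void main(String[] args)
    {
        ToyCachedQueryExecution execution = new ToyCachedQueryExecution();
        execution.start("SELECT 1");
        execution.start("SELECT 1");
        System.out.println(execution.planningRuns); // 1
    }
}
```

Because the planner is only reached through createPlan, the caching subclass does not need to touch the rest of the execution path.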

    The optimizers are then run over the logical plan:

    public Plan plan(Analysis analysis)
    {
        return plan(analysis, Stage.OPTIMIZED_AND_VALIDATED);
    }

    public Plan plan(Analysis analysis, Stage stage)
    {
        // Plan the statement to obtain the root node of the logical plan
        PlanNode root = planStatement(analysis, analysis.getStatement());
        // Check that the intermediate plan is valid
        planSanityChecker.validateIntermediatePlan(root, session, metadata, typeAnalyzer, planSymbolAllocator.getTypes(), warningCollector);
        if (stage.ordinal() >= Stage.OPTIMIZED.ordinal()) {
            for (PlanOptimizer optimizer : planOptimizers) {
                if (OptimizerUtils.isEnabledLegacy(optimizer, session, root)) {
                    // Run this optimizer over the logical plan produced so far
                    root = optimizer.optimize(root, session, planSymbolAllocator.getTypes(), planSymbolAllocator, idAllocator,
                            warningCollector);
                    requireNonNull(root, format("%s returned a null plan", optimizer.getClass().getName()));
                }
            }
        }
        if (stage.ordinal() >= Stage.OPTIMIZED_AND_VALIDATED.ordinal()) {
            // make sure we produce a valid plan after optimizations run. This is mainly to catch programming errors
            planSanityChecker.validateFinalPlan(root, session, metadata, typeAnalyzer, planSymbolAllocator.getTypes(), warningCollector);
        }
        TypeProvider types = planSymbolAllocator.getTypes();
        StatsProvider statsProvider = new CachingStatsProvider(statsCalculator, session, types);
        CostProvider costProvider = new CachingCostProvider(costCalculator, statsProvider, Optional.empty(), session, types);
        return new Plan(root, types, StatsAndCosts.create(root, statsProvider, costProvider));
    }
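
The core of plan(...) is a loop that feeds the output of one optimizer into the next. The sketch below isolates that loop in a self-contained form. ToyOptimizer and optimizeAll are hypothetical names, plans are modeled as plain strings rather than PlanNode trees, and isEnabled() only stands in for the OptimizerUtils.isEnabledLegacy check.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class OptimizerPipelineSketch
{
    // Hypothetical stand-in for PlanOptimizer; a plan is modeled as a plain string.
    interface ToyOptimizer
    {
        String optimize(String plan);

        // Plays the role of the "is this optimizer enabled" check in the real loop.
        default boolean isEnabled()
        {
            return true;
        }
    }

    // Applies every enabled optimizer in list order; each one sees the previous output.
    static String optimizeAll(String root, List<ToyOptimizer> optimizers)
    {
        for (ToyOptimizer optimizer : optimizers) {
            if (optimizer.isEnabled()) {
                root = Objects.requireNonNull(
                        optimizer.optimize(root),
                        optimizer.getClass().getName() + " returned a null plan");
            }
        }
        return root;
    }

    public static void main(String[] args)
    {
        ToyOptimizer pushFilterBelowProject =
                plan -> plan.replace("Filter(Project(Scan))", "Project(Filter(Scan))");
        ToyOptimizer disabledRule = new ToyOptimizer()
        {
            @Override
            public String optimize(String plan)
            {
                return "never applied";
            }

            @Override
            public boolean isEnabled()
            {
                return false;
            }
        };
        ToyOptimizer mergeFilterIntoScan =
                plan -> plan.replace("Filter(Scan)", "FilteredScan");

        String optimized = optimizeAll(
                "Output(Filter(Project(Scan)))",
                Arrays.asList(pushFilterBelowProject, disabledRule, mergeFilterIntoScan));
        System.out.println(optimized); // Output(Project(FilteredScan))
    }
}
```

In this toy setup the list order matters: if mergeFilterIntoScan ran before pushFilterBelowProject, it would find no Filter directly above the Scan and the final plan would keep a separate Filter node.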

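    Taking PredicatePushDown as an example, this section explains what predicate pushdown is and how it optimizes a plan.

    The PredicatePushDown optimizer pushes filter conditions down the plan tree so that rows are discarded as close to the data source as possible. This also reduces the amount of data that plan nodes have to process and improves execution efficiency.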

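    • When passing through a Project node, only deterministic filter conditions are pushed down. If the predicate contains non-deterministic conditions, such as a call to the rand function, a Filter node is added for those non-deterministic conditions instead.
    • When passing through a Union node, the filter conditions are pushed down into every subquery of the Union.
    • For a Join node, the Join operates on two tables, so a filter condition that targets one of these tables is pushed down to the read of that table. In addition, if the column in the filter condition is the same as a column in the join condition, the filter condition is pushed down to the tables on both the left and right sides of the Join. For example: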

    image.png
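    The filter condition l_orderkey=1 is pushed down to directly above the read of table lineitem, and the filter condition o_orderkey=1 is pushed down to directly above the read of table orders.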
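
The three rules above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: predicates are plain strings, the method names (splitForProject, pushThroughUnion, pushThroughJoin) are hypothetical, non-determinism is detected by simply looking for "rand(", and the join is assumed to be an inner equi-join of lineitem and orders on l_orderkey = o_orderkey. The real PredicatePushDown optimizer works on expression trees and handles many more cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PredicatePushDownSketch
{
    // Project rule: returns [pushed below the Project, kept in a Filter above it].
    static List<List<String>> splitForProject(List<String> conjuncts)
    {
        List<String> pushedDown = new ArrayList<>();
        List<String> keptAbove = new ArrayList<>();
        for (String conjunct : conjuncts) {
            // Assumption: "rand(" is the only non-deterministic function in this toy model.
            boolean deterministic = !conjunct.contains("rand(");
            (deterministic ? pushedDown : keptAbove).add(conjunct);
        }
        return Arrays.asList(pushedDown, keptAbove);
    }

    // Union rule: the same predicate is applied to every child of the Union.
    static List<String> pushThroughUnion(String predicate, List<String> children)
    {
        List<String> filteredChildren = new ArrayList<>();
        for (String child : children) {
            filteredChildren.add("Filter[" + predicate + "](" + child + ")");
        }
        return filteredChildren;
    }

    // Join rule (inner equi-join assumed): returns [filters for the left side, filters for the right side].
    static List<List<String>> pushThroughJoin(
            String filterColumn, String filterValue,
            List<String> leftColumns, String leftKey, String rightKey)
    {
        List<String> left = new ArrayList<>();
        List<String> right = new ArrayList<>();
        // Route the filter to the side that owns the column.
        (leftColumns.contains(filterColumn) ? left : right).add(filterColumn + " = " + filterValue);
        // If the filtered column is a join key, leftKey = rightKey implies the same value on the other side.
        if (filterColumn.equals(leftKey)) {
            right.add(rightKey + " = " + filterValue);
        }
        else if (filterColumn.equals(rightKey)) {
            left.add(leftKey + " = " + filterValue);
        }
        return Arrays.asList(left, right);
    }

    public static void main(String[] args)
    {
        System.out.println(splitForProject(Arrays.asList("a > 10", "rand() < 0.5")));
        // [[a > 10], [rand() < 0.5]]
        System.out.println(pushThroughUnion("a > 10", Arrays.asList("Scan(t1)", "Scan(t2)")));
        // [Filter[a > 10](Scan(t1)), Filter[a > 10](Scan(t2))]
        System.out.println(pushThroughJoin(
                "l_orderkey", "1", Arrays.asList("l_orderkey", "l_quantity"), "l_orderkey", "o_orderkey"));
        // [[l_orderkey = 1], [o_orderkey = 1]]
    }
}
```

The last call mirrors the lineitem and orders example: a single filter on the join key yields one filter per side of the Join.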

    Order
    image.png