When a pass is defined, a corresponding optimization level is usually assigned to it, for example:

1 FunctionPass

    Pass FoldConstant() {
      runtime::TypedPackedFunc<Function(Function, Module, PassContext)> pass_func =
          [=](Function f, Module m, PassContext pc) {
            return Downcast<Function>(FoldConstant(f, m));
          };
      return CreateFunctionPass(pass_func, 2, "FoldConstant", {});
    }

    TVM_REGISTER_API("relay._transform.FoldConstant")
    .set_body_typed(FoldConstant);
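
Once registered, this function is exposed on the Python side (in stock TVM, through the relay.transform.FoldConstant wrapper). The snippet below is only a minimal sketch under that assumption, also assuming the usual PassInfo fields (name, opt_level) on a pass object; it shows where the level given to CreateFunctionPass ends up and how the pass is applied to a module:

    from tvm import relay

    # A toy function containing a constant subexpression for FoldConstant to fold.
    x = relay.var("x", shape=(2, 2))
    body = relay.add(x, relay.add(relay.const(1.0), relay.const(2.0)))
    mod = relay.Module.from_expr(relay.Function([x], body))

    fold = relay.transform.FoldConstant()
    # The PassInfo attached to the pass records the level given to CreateFunctionPass.
    print(fold.info.name, fold.info.opt_level)   # expected: FoldConstant 2

    mod = fold(mod)   # apply the pass to the module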

Note: in CreateFunctionPass(pass_func, 2, "FoldConstant", {}), the second argument, 2, indicates that this pass has optimization level 2.
So what is this optimization level for? When compilation runs under a user-supplied optimization level, it lets the pass infrastructure decide whether a given pass needs to be executed. Suppose the user specifies optimization level 2: every pass whose level is less than or equal to 2 is executed, and every pass whose level is greater than 2 is skipped.

    # only the passes that have optimization level less or equal to 2 will be executed
    with _transform.PassContext(opt_level=2,
                                required_pass=["QuantizeAnnotate",
                                               "QuantizeCalibrate",
                                               "QuantizeRealize"]):
        with quantize_context():
            mod = quantize_seq(mod)

The example above runs a sequence of passes: those with optimization level less than or equal to 2 are executed and those above 2 are skipped, except that the passes listed in required_pass are run regardless of their level, as the sketch below illustrates.
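
As a rough sketch of both directions of that override (assuming _transform aliases tvm.relay.transform as in the snippet above, and taking the pass levels from the OPT_PASS_LEVEL table quoted later in this section):

    from tvm import relay
    from tvm.relay import transform as _transform

    # A tiny module with a constant subexpression, just as material for the passes.
    x = relay.var("x", shape=(4,))
    body = relay.add(x, relay.add(relay.const(1.0), relay.const(2.0)))
    mod = relay.Module.from_expr(relay.Function([x], body))

    seq = _transform.Sequential([_transform.FoldConstant(),             # level 2
                                 _transform.EliminateCommonSubexpr()])  # level 3

    # At opt_level=2, EliminateCommonSubexpr (level 3) would normally be skipped,
    # but listing it in required_pass forces it to run anyway.
    with _transform.PassContext(opt_level=2, required_pass=["EliminateCommonSubexpr"]):
        mod = seq(mod)

    # Conversely, at opt_level=3 both passes are within range, but disabled_pass
    # skips FoldConstant explicitly.
    with _transform.PassContext(opt_level=3, disabled_pass=["FoldConstant"]):
        mod = seq(mod)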

    opt_level = 3
    target = tvm.target.cuda()
    with relay.build_config(opt_level=opt_level):
        graph, lib, params = relay.build_module.build(
            mod, target, params=params)

The example above sets the level when calling relay.build: every pass with optimization level less than or equal to 3 is executed, and passes above level 3 are skipped.
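
The same context also accepts the required_pass and disabled_pass lists, so an individual pass can be switched off for a whole build without lowering the global level. A minimal sketch, reusing mod, target and params from the snippet above and the AlterOpLayout level from the OPT_PASS_LEVEL table quoted below:

    # Build at opt_level=3 but leave layouts untouched by disabling AlterOpLayout (level 3).
    with relay.build_config(opt_level=3, disabled_pass=["AlterOpLayout"]):
        graph, lib, params = relay.build_module.build(mod, target, params=params)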

Under the hood, build_config simply constructs a PassContext; the code is as follows:

    @register_relay_node
    class PassContext(RelayNode):
        """The basis where a Relay optimization/analysis runs on.
        Each pass context contains a number of auxiliary pieces of information
        used to help an optimization pass. Such information includes the error
        reporter to record the errors during the optimization, etc.

        opt_level : Optional[int]
            The optimization level of this pass.
        fallback_device : Optional[Union[int, str, TVMContext]]
            The fallback device type. It is also used as the default device for
            operators that are not annotated during heterogeneous execution.
        required_pass : Optional[Union[List[str], Set[str], Tuple[str]]]
            The list of passes that are required by a certain pass.
        disabled_pass : Optional[Union[List[str], Set[str], Tuple[str]]]
            The list of passes that are disabled.
        force_transform_avgpool2d : Optional[Bool]
            The flag indicating whether to force transforming avgpool2d to conv2d
            in the VaccTransformAvgPool pass.
        ksize_global_avgpool2d_transform : Optional[int]
            The kernel type used to decompose large global avgpool2d.
        svd_topn_percent : Optional[int]
            svd_topn_percent = sum(TopN singular values) / sum(All singular values).
        """
        def __init__(self,
                     opt_level=2,
                     fallback_device=_nd.cpu(),
                     required_pass=None,
                     disabled_pass=None,
                     force_transform_avgpool2d=True,
                     ksize_global_avgpool2d_transform=0,
                     svd_topn_percent=100):
            if isinstance(fallback_device, str):
                fallback_device = _nd.context(fallback_device).device_type
            elif isinstance(fallback_device, TVMContext):
                fallback_device = fallback_device.device_type
            if not isinstance(fallback_device, int):
                raise TypeError("fallback_device is expected to be the type of " +
                                "int/str/TVMContext.")

            required = list(required_pass) if required_pass else []
            if not isinstance(required, (list, tuple)):
                raise TypeError("required_pass is expected to be the type of " +
                                "list/tuple/set.")

            disabled = list(disabled_pass) if disabled_pass else []
            if not isinstance(disabled, (list, tuple)):
                raise TypeError("disabled_pass is expected to be the type of " +
                                "list/tuple/set.")

            if not (svd_topn_percent <= 100 and svd_topn_percent > 0):
                raise TypeError("svd_topn_percent is expected to be in (0, 100].")

            self.__init_handle_by_constructor__(_transform.PassContext, opt_level,
                                                fallback_device, required,
                                                disabled, force_transform_avgpool2d,
                                                ksize_global_avgpool2d_transform,
                                                svd_topn_percent)

        def __enter__(self):
            _transform.EnterPassContext(self)
            return self

        def __exit__(self, ptype, value, trace):
            _transform.ExitPassContext(self)

        @staticmethod
        def current():
            """Return the current pass context."""
            return _transform.GetCurrentPassContext()


    def build_config(opt_level=2,
                     fallback_device=_nd.cpu(),
                     required_pass=None,
                     disabled_pass=None,
                     force_transform_avgpool2d=True,
                     ksize_global_avgpool2d_transform=0,
                     svd_topn_percent=100):
        """Configure the build behavior by setting config variables.

        Parameters
        ----------
        opt_level: int, optional
            Optimization level. The optimization pass name and level are as the
            following:

            .. code-block:: python

                OPT_PASS_LEVEL = {
                    "SimplifyInference": 0,
                    "OpFusion": 1,
                    "FoldConstant": 2,
                    "FoldScaleAxis": 3,
                    "AlterOpLayout": 3,
                    "CanonicalizeOps": 3,
                    "CanonicalizeCast": 3,
                    "EliminateCommonSubexpr": 3,
                    "CombineParallelConv2D": 4,
                    "CombineParallelDense": 4
                }

        fallback_device : int, str, or tvm.TVMContext, optional
            The fallback device. It is also used as the default device for
            operators without specified device during heterogeneous execution.

        required_pass: set of str, optional
            Optimization passes that are required regardless of optimization level.

        disabled_pass: set of str, optional
            Optimization passes to be disabled during optimization.

        force_transform_avgpool2d : bool, optional
            The flag indicating whether to force transforming avgpool2d to conv2d
            in the VaccTransformAvgPool pass.
            The initial purpose of adding this flag is to support transforming
            avgpool2d in the inception_v3 model from Keras and ONNX.
            If enabled, the VaccTransformAvgPool pass will ignore the
            count_include_pad check and treat count_include_pad=False as True.
            According to Table 60 in the spec, the accuracy loss caused by this
            approximate conversion is small.
            If disabled, the VaccTransformAvgPool pass will perform the
            count_include_pad check and reject the op when the check fails.

        ksize_global_avgpool2d_transform : int, optional
            The kernel type used to decompose large global avgpool2d.
            Two options are provided: 0 represents 3x3 s2; 1 represents 5x5 s4.
            As the hardware better supports 3x3 kernels, 3x3 s2 is the default.

        svd_topn_percent : int, optional
            svd_topn_percent% = sum(TopN singular values) / sum(All singular values).

        Returns
        -------
        pass_context: PassContext
            The pass context for optimizations.
        """
        return PassContext(opt_level, fallback_device, required_pass,
                           disabled_pass, force_transform_avgpool2d,
                           ksize_global_avgpool2d_transform, svd_topn_percent)
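
To see how the two pieces fit together: entering a build_config block simply pushes the PassContext it constructs, and that context can be queried from anywhere inside the block via PassContext.current(). The following is only a short usage sketch, assuming the module above is importable as tvm.relay.transform and that the PassContext node exposes opt_level as a Python attribute:

    from tvm import relay
    from tvm.relay import transform as _transform

    # build_config(...) just constructs a PassContext; entering the `with` block
    # makes it the current context for every pass executed inside it.
    with relay.build_config(opt_level=3, disabled_pass=["FoldScaleAxis"]):
        ctx = _transform.PassContext.current()
        print(ctx.opt_level)   # -> 3

    # After the block exits, ExitPassContext restores the previous context.
    print(_transform.PassContext.current().opt_level)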