[Translation] How to Use the Relay Pass Infra — tvm 0.7.dev1 documentation

# How to Use the Relay Pass Infra

Author: Zhi Chen
As the number of optimization passes in Relay grows, executing them and maintaining their dependencies manually becomes tedious. Therefore, we have introduced an infrastructure to manage the optimization passes. Optimizations on a Relay program can be applied at various granularities: at the function level with [tvm.relay.transform.FunctionPass](https://tvm.apache.org/docs/api/python/relay/transform.html#tvm.relay.transform.FunctionPass), or at the module level with tvm.relay.transform.ModulePass. Alternatively, users can rely on tvm.transform.Sequential to apply a sequence of passes on a Relay program, where the dependencies between the passes can be resolved by the pass infrastructure. For more details about each type of pass, please refer to the Relay pass infra documentation. This tutorial demonstrates how developers can use the Relay pass infra to perform certain optimizations and create an optimization pipeline.

```python
import numpy as np
import tvm
import tvm.relay as relay
```
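To make the module-level granularity mentioned above concrete, here is a minimal sketch of a module-level pass, adapted from the Relay pass infra documentation. Treat the exact decorator location (`relay.transform.module_pass`) as an assumption about the 0.7.dev API. The decorated `transform` receives the whole module plus a pass context and returns a new module, here one extended with an extra global function:

```python
# A minimal sketch of a module-level pass (adapted from the pass infra docs;
# the decorator location is an assumption for this TVM version). The pass
# injects an extra global function "var" into the module it is applied to.
@relay.transform.module_pass(opt_level=2)
def transform(mod, ctx):
    tp = relay.TensorType((10,), "float32")
    x = relay.var("x", tp)
    gv = relay.GlobalVar("var")
    func = relay.Function([x], relay.abs(x))
    new_mod = relay.Module({gv: func})
    new_mod.update(mod)  # keep everything that was already in the module
    return new_mod
```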
## Create an Example Relay Program

First, we create a simple Relay program. This program will be used by the various optimizations shown in this tutorial.

```python
def example():
    shape = (1, 64, 54, 54)
    c_data = np.empty(shape).astype("float32")
    c = relay.const(c_data)
    weight = relay.var('weight', shape=(64, 64, 3, 3))
    x = relay.var("x", relay.TensorType((1, 64, 56, 56), "float32"))
    conv = relay.nn.conv2d(x, weight)
    y = relay.add(c, c)
    y = relay.multiply(y, relay.const(2, "float32"))
    y = relay.add(conv, y)
    z = relay.add(y, c)
    z1 = relay.add(y, c)
    z2 = relay.add(z, z1)
    return relay.Function([x], z2)
```

Let's register a layout alteration for the conv2d op so that we can apply the layout-alteration pass on the example. How the alter-layout pass works is out of the scope of this tutorial.

```python
@relay.op.register_alter_op_layout("nn.conv2d", level=101)
def alter_conv2d(attrs, inputs, tinfos):
    data, weight = inputs
    new_attrs = dict(attrs)
    new_attrs['data_layout'] = 'NCHW16c'
    return relay.nn.conv2d(data, weight, **new_attrs)
```
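Before applying any passes, it can help to print the unoptimized program as a baseline. This small snippet is an addition for illustration, not part of the original tutorial; it only uses the same module-construction API the tutorial itself relies on:

```python
# Build a module from the example function and print the unoptimized Relay IR
# as a baseline to compare against the outputs of the passes applied below.
f = example()
mod = relay.Module.from_expr(f)
print(mod)
```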
## Optimize the Program

Now we would like to optimize the program. Relay features a host of optimizations; we will select some of them to apply to this example program. There are multiple ways to optimize a Relay program. Below we provide an example for each of them.

### (1) Manually Apply Optimization Passes
```python
# Let's first create a relay Module which contains one or multiple Relay
# functions for optimization.
f = example()
mod = relay.Module.from_expr(f)

# Now we can apply constant folding on the module.
# fold_const here is a callback that doesn't take any parameters.
fold_const = relay.transform.FoldConstant()

# Then, we can invoke the pass on the given module. Note that the constant
# folding pass works at the function-level. That being said, each function in
# the module will be applied with the optimization. Users don't need to iterate
# through individual functions manually to apply this pass.
mod = fold_const(mod)

# We can see from the updated program that the constants are folded.
print(mod)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```
More optimizations can be applied in a similar manner. For instance, we can eliminate the common expressions used by z and z1.

```python
mod = relay.transform.EliminateCommonSubexpr()(mod)
print(mod)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```
Some optimizations, such as fusion, are parametric as well. For example, opt level 0 does not allow operators to be fused together. Users can control this through the fuse_opt_level parameter.

```python
mod = relay.transform.FuseOps(fuse_opt_level=0)(mod)
```
```python
# We can observe that the optimized module contains functions that only have
# a single primitive op.
print(mod)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %1 = %0(%x, %weight) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = fn (%p01: Tensor[(1, 64, 54, 54), float32], %p11: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p01, %p11) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3 = %2(%1, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %4 = fn (%p02: Tensor[(1, 64, 54, 54), float32], %p12: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p02, %p12) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %5 = %4(%3, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %6 = fn (%p03: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    add(%p03, %p03) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %6(%5) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```

### (2) Use Sequential to Apply a Sequence of Passes
Applying passes as described above is tedious, and it may require users to have a good understanding of the dependencies between them. For example, fusion currently does not work well with let bindings. Therefore, if relay.transform.ToANormalForm() were applied before fusion, we would not be able to fuse the operators (even though they were fusible before), because this pass generates let bindings for each expression to canonicalize a Relay program; a sketch of this ordering problem is shown below.
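The following snippet is an addition for illustration (not from the original tutorial): it runs ToANormalForm before FuseOps on the example program. The fused primitive functions seen in the earlier output should no longer appear, because every sub-expression has been bound with a let first:

```python
# Illustrative sketch: running ToANormalForm before FuseOps prevents operator
# fusion, since each sub-expression gets wrapped in a let binding first.
f = example()
mod = relay.Module.from_expr(f)
mod = relay.transform.ToANormalForm()(mod)
mod = relay.transform.FuseOps(fuse_opt_level=2)(mod)
print(mod)  # the printed IR keeps let bindings instead of fused functions
```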
Relay therefore provides [tvm.transform.Sequential](https://tvm.apache.org/docs/api/python/ir.html#tvm.transform.Sequential) to alleviate developers from handling these issues explicitly, by letting them specify the passes they need and packing them as a whole. For example, the same passes can now be applied using the sequential style below. [tvm.transform.Sequential](https://tvm.apache.org/docs/api/python/ir.html#tvm.transform.Sequential) is similar to torch.nn.Sequential and mxnet.gluon.Block. For example, torch.nn.Sequential contains a sequence of PyTorch modules that are chained together to build a network; it focuses on the network layers. In contrast, the tvm.transform.Sequential in our pass infra works on optimization passes.
```python
# Now let's execute some passes through Sequential.
f = example()
mod = relay.Module.from_expr(f)

# Glob the interested passes.
seq = relay.transform.Sequential([relay.transform.FoldConstant(),
                                  relay.transform.EliminateCommonSubexpr(),
                                  relay.transform.FuseOps(fuse_opt_level=2)])
mod1 = seq(mod)
print(mod1)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```
From the transformed Relay program, we can see that there are still two identical addition operations. This is because EliminateCommonSubexpr was not actually performed: under tvm.transform.Sequential, only the passes whose optimization level is less than or equal to 2 are executed by default. The pass infra, however, provides a configuration interface for users to customize the optimization level they want to execute.

```python
with relay.build_config(opt_level=3):
    mod2 = seq(mod)
print(mod2)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```
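As a side note (an addition for illustration), each pass object carries a PassInfo that records, among other things, the name and opt_level that Sequential consults when deciding whether to run it:

```python
# Inspect a pass's metadata. Sequential compares info.opt_level against the
# level configured in build_config to decide whether the pass should run.
fold = relay.transform.FoldConstant()
print(fold.info.name)       # prints "FoldConstant"
print(fold.info.opt_level)  # the level at or above which this pass is enabled
```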
Now we can see that only one of the two identical additions is kept. In addition, users can selectively disable some passes using the disabled_pass config, which is similar to the -fno-xxx options used by general-purpose compilers such as Clang and GCC. For example, we can disable EliminateCommonSubexpr as follows. The printed module will again show two identical addition operations.

```python
with relay.build_config(opt_level=3, disabled_pass=["EliminateCommonSubexpr"]):
    mod3 = seq(mod)
print(mod3)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```
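The converse of disabled_pass also exists. The following sketch is an addition for illustration and assumes the required_pass parameter of relay.build_config in this TVM version; it forces a pass to run even when the current opt_level would normally skip it:

```python
# Hedged sketch (assumes relay.build_config's required_pass parameter):
# force EliminateCommonSubexpr to run even though opt_level=2 would skip it.
with relay.build_config(opt_level=2, required_pass=["EliminateCommonSubexpr"]):
    mod_req = seq(mod)
print(mod_req)  # the duplicated add should be eliminated despite opt_level=2
```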
All the passes applied so far are target-independent. The pass infra also provides a means to make a pass target-aware. For example, the layout alteration pass falls into this category.

```python
with relay.build_config(opt_level=3):
    mod4 = seq(mod)
print(mod4)

seq1 = relay.transform.Sequential([relay.transform.AlterOpLayout()])
with relay.build_config(opt_level=3):
    with tvm.target.create("llvm"):
        mod5 = seq1(mod)
print(mod5)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {
    %0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    %2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
    add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */
  };
  %3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = layout_transform(%x, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;
  %1 = nn.conv2d(%0, %weight, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %2 = add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = multiply(%2, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %4 = layout_transform(%3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %5 = add(%1, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %6 = layout_transform(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %7 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %8 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  %9 = add(%7, %8) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;
  layout_transform(%9, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```

## Implement a Pass Using Python Decorators
The next example illustrates how we can orchestrate a customized optimization pipeline through the pass infra using Python decorators. This functionality greatly eases the implementation of passes. For example, users can simply define a decorated class to do function-level optimizations, as the following example shows. transform_function wraps a class to replace all constants with a multiple of c. Later on, each function in the given module will be visited when we invoke the custom pass, and each constant in the function will be replaced.

```python
@relay.transform.function_pass(opt_level=1)
class CustomPipeline:
    """Simple test function to replace one argument to another."""

    def __init__(self, multiplier):
        self.multiplier = multiplier

    # This function can define a pass.
    def transform_function(self, func, mod, ctx):
        obj = self

        class ReplaceConstant(tvm.relay.ExprMutator):
            def visit_constant(self, c):
                return relay.multiply(obj.multiplier, c)

        return ReplaceConstant().visit(func)
```
```python
f = example()
mod = relay.Module.from_expr(f)
custom_pass = CustomPipeline(multiplier=relay.const(3, "float32"))
assert custom_pass.info.name == "CustomPipeline"
mod3 = custom_pass(mod)
print(mod3)
```

Output:

```
def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {
  %0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %1 = multiply(3f /* ty=float32 */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */ /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %2 = add(%1, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %3 = multiply(3f /* ty=float32 */, 2f /* ty=float32 */) /* ty=float32 */;
  %4 = multiply(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %5 = add(%0, %4) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %6 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  %7 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;
  add(%6, %7) /* ty=Tensor[(1, 64, 54, 54), float32] */
}

// meta data omitted. you can use show_meta_data=True to include meta data
```
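A decorated pass like this is an ordinary pass object, so it composes with the built-in passes. The following sketch is an addition for illustration, not part of the original tutorial:

```python
# Illustrative sketch: mix the custom pass with built-in passes in Sequential.
f = example()
mod = relay.Module.from_expr(f)
seq = relay.transform.Sequential([relay.transform.FoldConstant(),
                                  CustomPipeline(multiplier=relay.const(3, "float32"))])
with relay.build_config(opt_level=3):
    mod = seq(mod)
print(mod)
```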
## Debug a Pass
Relay provides users with a plug-and-play style debugging pass that prints the IR after a given pass is done. For example, we can print out the IR right after constant folding and operator fusion by adding the debugging pass tvm.transform.PrintIR() after each of them.

```python
f = example()
mod = relay.Module.from_expr(f)
seq = relay.transform.Sequential([relay.transform.FoldConstant(),
                                  relay.transform.PrintIR(),
                                  relay.transform.EliminateCommonSubexpr(),
                                  relay.transform.FuseOps(),
                                  relay.transform.PrintIR()])
with relay.build_config(opt_level=3):
    mod = seq(mod)

print("done")
```

Output:

```
done
```