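1. Introduction to the Expression API
2. Data structures
- 2.1 Class definitions
  - 2.1.1 Common classes
  - 2.1.2 Related classes
- 2.2 Relationship diagram
3. The basic expression data type: VARP
- 3.1 Variable types
- 3.2 Reading / writing
  - 3.2.1 Getting variable information
  - 3.2.2 Reading and writing data
  - 3.2.3 Loading / saving
- 3.3 Computation
  - 3.3.1 Computing / building graphs
  - 3.3.2 Converting to constants
4. Model inference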

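1. Introduction to the Expression API

The Expression API is an API design that combines AI inference with general-purpose computation. It provides the following features:

- Model inference
- Numerical computation
- Model building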

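2.1.2 Related classes

- Expr: each Variable refers to one output of an Expr. Users do not need to use this data structure directly.
- RuntimeManager: holds a Runtime internally and allocates the compute resources that Module and Expr need. A Runtime corresponds to one backend implementation, such as CPU, OpenCL, Vulkan, or Metal.
- Executor: holds several RuntimeManager instances and provides the memory-management interface. Each Executor must run in a single-threaded environment. A global Executor is provided by default; create your own when you need concurrent execution.
- ExecutorScope: binds an Executor in a child thread. It is required for multi-threaded concurrency.
- Tensor: MNN's basic data structure, and the basic data structure of the Interpreter-Session family of APIs.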
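
The relationship between Executor and ExecutorScope described above follows a common pattern: a scope object binds an executor to the current thread for as long as the scope is alive, and code that does not bind one falls back to a global default. The following is a minimal standard-library sketch of that pattern only. `ToyExecutor`, `ToyExecutorScope`, `boundStack`, and `globalToyExecutor` are hypothetical names invented for illustration, and nothing here is MNN's actual implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an executor; it only carries a name for illustration.
struct ToyExecutor {
    std::string name;
};

// Process-wide default, playing the role of a global executor.
inline ToyExecutor& globalToyExecutor() {
    static ToyExecutor global{"global"};
    return global;
}

// Each thread keeps its own stack of bound executors (thread_local storage).
inline std::vector<ToyExecutor*>& boundStack() {
    thread_local std::vector<ToyExecutor*> stack;
    return stack;
}

// RAII guard: binds an executor to the calling thread for the guard's lifetime.
class ToyExecutorScope {
public:
    explicit ToyExecutorScope(ToyExecutor* executor) {
        boundStack().push_back(executor);
    }
    ~ToyExecutorScope() {
        assert(!boundStack().empty());
        boundStack().pop_back();
    }
    ToyExecutorScope(const ToyExecutorScope&) = delete;
    ToyExecutorScope& operator=(const ToyExecutorScope&) = delete;

    // Executor seen by the calling thread: the innermost bound one,
    // or the global default when nothing is bound.
    static ToyExecutor& current() {
        auto& stack = boundStack();
        return stack.empty() ? globalToyExecutor() : *stack.back();
    }
};
```

Because the stack is `thread_local`, a binding made in one thread is invisible to every other thread, which is why each worker thread would need its own scope object in this model.
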

2.2 Relationship diagram

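3. The basic expression data type: VARP

The expression module is a lazy-evaluation engine. It provides the following features:
(1) Constructing computation graphs
(2) Loading, saving, and modifying computation graphs
(3) Computing on a computation graph (implemented on top of the MNN inference engine)

The API follows a "reactive programming" design: after you modify an input value, you simply read the value at the corresponding output node. There is no explicit compute call.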
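
To make the lazy, reactive behaviour concrete, here is a toy model written with the standard library only. It is a sketch of the idea, not of MNN's engine: `ToyVar`, `ToyVARP`, `toyInput`, and `squarePlusSelf` are hypothetical names, and the toy recomputes on every read instead of caching.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Toy node: either an input that holds a value, or a function of other nodes.
struct ToyVar {
    bool hasValue = false;                      // false until written or computed
    float value = 0.0f;
    std::vector<std::shared_ptr<ToyVar>> deps;  // empty for inputs
    std::function<float(const std::vector<float>&)> op;

    // Write an input value.
    void write(float v) {
        value = v;
        hasValue = true;
    }

    // Read the value. Computation happens here, on demand.
    // Returns nullptr when some upstream input has not been written yet.
    const float* read() {
        if (deps.empty()) {
            return hasValue ? &value : nullptr;
        }
        std::vector<float> args;
        for (auto& d : deps) {
            const float* p = d->read();
            if (p == nullptr) {
                return nullptr;
            }
            args.push_back(*p);
        }
        assert(op);
        value = op(args);
        hasValue = true;
        return &value;
    }
};
using ToyVARP = std::shared_ptr<ToyVar>;

inline ToyVARP toyInput() {
    return std::make_shared<ToyVar>();
}

// Builds the graph for x * x + x without computing anything.
inline ToyVARP squarePlusSelf(const ToyVARP& x) {
    auto y = std::make_shared<ToyVar>();
    y->deps = {x};
    y->op = [](const std::vector<float>& a) { return a[0] * a[0] + a[0]; };
    return y;
}
```

In this toy, building `y` does no arithmetic. Reading `y` before `x` is written yields a null pointer, and reading it again after `x` changes yields the new result, with no separate "run" call in between.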

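3.1 Variable types

The data type that users work with is VARP. Its value can be read the same way as a Tensor. Variables fall into three kinds, depending on how they are saved.

3.1.1 Input

Created by MNN::Express::_Input, or obtained by loading a model. Only the dimension information (shape) is stored when it is saved. Its value can be written.

3.1.2 Const / Trainable

Created by MNN::Express::_Const / _Trainable, or obtained by loading a model. Its values are stored when it is saved. It is read-only and cannot be written.

3.1.3 Function

Any variable that is neither an input nor a constant, that is, every variable produced by computation. It cannot be written. The computation graph it depends on is stored when it is saved.
A Function variable can be converted to the corresponding kind by calling fix_as_const / fix_as_input. The conversion computes the value and removes the dependency on upstream nodes.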

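3.2 Reading / writing

3.2.1 Getting variable information

Call getInfo to obtain a Variable::Info, a structure that describes the basic information of a variable:

dim : array of dimensions
order : dimension layout of the variable
- Kinds: NHWC / NCHW / NC4HW4
  - NHWC corresponds to Tensor::TENSORFLOW
  - NCHW corresponds to Tensor::CAFFE
  - NC4HW4 corresponds to Tensor::CAFFE_C4
- Image-based computations such as Convolution / Deconvolution / Image require NC4HW4 input
type : numeric type
- code : floating point / signed integer / unsigned integer
- bits : number of bits
size : number of elements in the Tensor, generally the product of all entries in dim

  struct Info {
      Dimensionformat order = NHWC;
      std::vector<int> dim;
      halide_type_t type;
      int size;
      void syncSize();
  };
  const Info* getInfo();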
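
The relationship between dim, order, and size can be sketched with plain index arithmetic. The helpers below use only the standard library and hypothetical names (`elementCount`, `offsetNCHW`, `offsetNHWC`, `offsetNC4HW4`, `storageNC4HW4`). The NC4HW4 formulas assume the usual packing in which channels are grouped in blocks of 4, giving a memory layout of [N, ceil(C/4), H, W, 4]; treat that as an assumption of this sketch and not as a statement about what the size field reports.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of logical elements: the product of all dimensions.
inline std::size_t elementCount(const std::vector<int>& dim) {
    std::size_t n = 1;
    for (int d : dim) {
        assert(d >= 0);
        n *= static_cast<std::size_t>(d);
    }
    return n;
}

// Offset of logical element (n, c, h, w) in a dense NCHW buffer.
inline std::size_t offsetNCHW(int n, int c, int h, int w, int C, int H, int W) {
    return ((static_cast<std::size_t>(n) * C + c) * H + h) * W + w;
}

// Offset of the same element in a dense NHWC buffer.
inline std::size_t offsetNHWC(int n, int c, int h, int w, int C, int H, int W) {
    return ((static_cast<std::size_t>(n) * H + h) * W + w) * C + c;
}

// Offset in NC4HW4, assuming channels are packed in groups of 4:
// the layout is [N, ceil(C/4), H, W, 4].
inline std::size_t offsetNC4HW4(int n, int c, int h, int w, int C, int H, int W) {
    const int groups = (C + 3) / 4;
    return (((static_cast<std::size_t>(n) * groups + c / 4) * H + h) * W + w) * 4 + c % 4;
}

// Elements needed to store an NC4HW4 buffer under the same assumption:
// the channel count is padded up to a multiple of 4.
inline std::size_t storageNC4HW4(int N, int C, int H, int W) {
    return static_cast<std::size_t>(N) * ((C + 3) / 4) * 4 * H * W;
}
```

For a {1, 3, 224, 224} variable, `elementCount` gives the dim product, while `storageNC4HW4(1, 3, 224, 224)` is larger because the 3 channels are padded to 4 in the packed layout.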

3.2.2 Reading and writing data

A VARP created with _Input can be written: first obtain the write pointer, then fill in the data.
For example:
```cpp
VARP input = _Input({1, 3, 224, 224}, NHWC, halide_type_of<float>());
float* inputPtr = input->writeMap<float>();
::memset(inputPtr, 0, input->getInfo()->size * sizeof(float));
```

Any variable can be read once its data is available. For example:
```cpp
#include <cstdio>
#include <MNN/expr/ExprCreator.hpp>
using namespace MNN::Express;
void demo() {
    VARP x = _Input({1, 3, 224, 224}, NHWC, halide_type_of<float>());
    int inputSize = x->getInfo()->size;
    // Fill the input through its write pointer
    float* inputPtr = x->writeMap<float>();
    for (int i=0; i<inputSize; ++i) {
        inputPtr[i] = (float)i / 1000.0f;
    }
    const float* inputPtr2 = x->readMap<float>();
    // Do something with inputPtr2
    // Build y = x * x + x; it is computed when y is read
    VARP y = x * x + x;
    int outputSize = y->getInfo()->size;
    const float* outputPtr = y->readMap<float>();
    for (int i=0; i<outputSize; ++i) {
        printf("%d: %f\n", i, outputPtr[i]);
    }
}
```

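If a variable's value is unavailable (it has not been computed), readMap returns nullptr, as in the following case:
```cpp
#include <MNN/expr/ExprCreator.hpp>
using namespace MNN::Express;

void demo() {
    VARP x = _Input({1, 3, 224, 224}, NHWC, halide_type_of<float>());
    VARP y = x * x + x;
    const float* outputPtr = y->readMap<float>();
    // No data has been written to x, so y cannot be computed and outputPtr is null here
}
```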

### 3.2.3 Loading / saving
A variable of any kind can be saved as an mnn model or loaded from an mnn model. The relevant interfaces are:
```cpp
// Load
static std::vector<VARP> load(const char* fileName);
static std::map<std::string, VARP> loadMap(const char* fileName);
static std::vector<VARP> load(const uint8_t* buffer, size_t length);
static std::map<std::string, VARP> loadMap(const uint8_t* buffer, size_t length);
// Save
static void save(const std::vector<VARP>& vars, const char* fileName);
```

Example:

#include <cstdio>
#include <MNN/expr/ExprCreator.hpp>
using namespace MNN::Express;
// Load model.mnn and save only the part of the graph that computes prob
void splitDemo() {
    auto varMap = Variable::loadMap("model.mnn");
    std::vector<VARP> vars = std::vector<VARP> {varMap["prob"]};
    Variable::save(vars, "part.mnn");
}
// Save variable data
void saveOutput(float* data0, size_t n0, float* data1, size_t n1) {
    // _Const takes (pointer, shape, format); shape entries are int
    VARP input0 = _Const(data0, {(int)n0}, NHWC);
    VARP input1 = _Const(data1, {(int)n1}, NHWC);
    Variable::save({input0, input1}, "result.mnn");
}
// Load model.mnn, whose input and output are named input and output,
// write data to input, then compute output
void loadAndCompute() {
    auto varMap = Variable::loadMap("model.mnn");
    float* inputPtr = varMap["input"]->writeMap<float>();
    int inputSize = varMap["input"]->getInfo()->size;
    for (int i=0; i<inputSize; ++i) {
        inputPtr[i] = (float)i/(float)1000;
    }
    auto outputPtr = varMap["output"]->readMap<float>();
    auto outputSize = varMap["output"]->getInfo()->size;
    for (int i=0; i<outputSize; ++i) {
        printf("%f, ", outputPtr[i]);
    }
}

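3.3 Computation

3.3.1 Computing / building graphs

Applying a computation to VARPs is equivalent to building a computation graph.
MNN/expr/ExprCreator.hpp contains the functions that compute on VARPs; they are not covered in detail here.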

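3.3.2 Converting to constants
Executes the computation graph in a Variable to obtain its data, then converts the Variable to the specified type.
```
enum InputType {
  INPUT = 0,
  CONSTANT = 1,
  TRAINABLE = 2,
};
bool fix(InputType type) const;
```
There are three types:
INPUT : input type. When the Variable is saved, only the dimension information is kept and no data is stored.
CONSTANT : constant type. When the Variable is saved, both the dimension information and the data are stored.
TRAINABLE : trainable type. When the Variable is saved, both the dimension information and the data are stored. When an mnn model is converted to a trainable model, this type determines whether the Variable's value needs to be updated.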

Example:

#include <MNN/expr/ExprCreator.hpp>
using namespace MNN::Express;
void demo() {
    auto varp = _Input({1, 3, 224, 224}, NHWC);
    {
        // Initialize the input values
        auto ptr = varp->writeMap<float>();
        auto size = varp->getInfo()->size;
        for (int i=0; i<size; ++i) {
            ptr[i] = (float)i / 100.0f;
        }
    }
    auto input = varp * _Scalar<float>(1.0f/255.0f);
    auto output = input * input + input;
    // After input is fixed, the 1.0f / 255.0f preprocessing is not saved in the graph
    input.fix(VARP::INPUT);
    // graph.mnn describes the computation graph x * x + x
    Variable::save({output}, "graph.mnn");
    // After output is fixed, its values are saved instead of the graph
    output.fix(VARP::CONSTANT);
    Variable::save({output}, "output.mnn");
}

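4. Model inference

4.1 Overall flow

4.2 Creation / destruction

4.2.1 Creating an Executor

In MNN, the Executor gives users interfaces to configure properties such as the inference backend and the number of threads, and to use features such as performance statistics, operator-execution callbacks, and memory reclamation. A global Executor object is provided, so users can use it directly without creating or holding an object.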

Example of creating a new Executor:

MNNForwardType type = MNN_FORWARD_CPU;
MNN::BackendConfig backend_config;    // default backend config
// static std::shared_ptr<Executor> newExecutor(MNNForwardType type,
//                                              const BackendConfig& config,
//                                              int numberThread);
std::shared_ptr<MNN::Express::Executor> executor(
    MNN::Express::Executor::newExecutor(type, backend_config, 4));
// Bind the new executor to the current thread for the lifetime of scope
MNN::Express::ExecutorScope scope(executor);

Example of using the default global Executor:

MNNForwardType type = MNN_FORWARD_CPU;
MNN::BackendConfig backend_config;    // default backend config
MNN::Express::Executor::getGlobalExecutor()->setGlobalExecutorConfig(type, backend_config, 4);

4.2.2 Creating a Module by loading an MNN model

const std::string model_file = "/tmp/mymodule.mnn"; // model file with path
const std::vector<std::string> input_names{"input_1", "input_2", "input_3"};
const std::vector<std::string> output_names{"output_1"};
Module::Config mdconfig; // default module config
std::unique_ptr<Module> module; // module
module.reset(Module::load(input_names, output_names, model_file.c_str(), &mdconfig));

input_names and output_names may be empty, in which case MNN searches the model for its inputs and outputs. You then need to query their order yourself with getInfo, described in 4.3.1.

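4.2.3 Creating a Module from an existing Module

std::unique_ptr<Module> module_shallow_copy;
module_shallow_copy.reset(Module::clone(module.get()));

- Calling Module's clone method creates a new Module as a copy.
- The copy shares constant memory with the original Module.
- This is commonly used to run multiple instances of the same model concurrently.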
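
The idea of a copy that shares constant memory can be illustrated with a small standard-library sketch. `ToyModule` below is a hypothetical class, not MNN's Module: its weights are immutable and shared between clones, while each instance owns its scratch buffer, so two instances never write to the same output memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy module computing y[i] = w[i] * x[i].
class ToyModule {
public:
    explicit ToyModule(std::vector<float> weights)
        : mWeights(std::make_shared<const std::vector<float>>(std::move(weights))) {}

    // Shallow copy: the clone points at the same weights
    // but gets its own, initially empty, scratch buffer.
    std::unique_ptr<ToyModule> clone() const {
        return std::unique_ptr<ToyModule>(new ToyModule(mWeights));
    }

    // Writes only to this instance's scratch buffer.
    const std::vector<float>& forward(const std::vector<float>& x) {
        assert(x.size() == mWeights->size());
        mScratch.resize(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            mScratch[i] = (*mWeights)[i] * x[i];
        }
        return mScratch;
    }

    // Address of the weight storage, exposed so sharing can be observed.
    const float* weightData() const { return mWeights->data(); }

private:
    explicit ToyModule(std::shared_ptr<const std::vector<float>> shared)
        : mWeights(std::move(shared)) {}

    std::shared_ptr<const std::vector<float>> mWeights;  // shared, read-only
    std::vector<float> mScratch;                         // per instance
};
```

In this toy, the original and the clone report the same `weightData()` address, yet each can run `forward` on different inputs without disturbing the other's result.
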
4.2.4 Destruction

As in 4.2.2, the Module pointer is normally held in a smart pointer (std::unique_ptr there, or std::shared_ptr), so it is destroyed automatically when it is no longer used and needs no special handling.

4.3 Usage

4.3.1 Querying information

Call getInfo to obtain information about a Module.

const Info* getInfo() const;

Info is defined as follows:

struct Info {
    // Input info load from model
    std::vector<Variable::Info> inputs;
    // The Module's defaultFormat, NCHW or NHWC
    Dimensionformat defaultFormat;
    // Runtime Info
    std::shared_ptr<MNN::Express::Executor::RuntimeManager> runTimeManager;
    // Input Names By Order
    std::vector<std::string> inputNames;
    // Output Names By Order
    std::vector<std::string> outputNames;
    // Version of MNN which build the model
    std::string version;
};

See tools/cpp/GetMNNInfo.cpp for a usage example.

4.3.2 Inference

Example code:

int dim = 224;
std::vector<VARP> inputs(3);
inputs[0] = _Input({1, dim}, NHWC, halide_type_of<int>());
inputs[1] = _Input({1, dim}, NHWC, halide_type_of<int>());
inputs[2] = _Input({1, dim}, NHWC, halide_type_of<int>());
// set input data
std::vector<int*> input_pointer = {inputs[0]->writeMap<int>(),
                                   inputs[1]->writeMap<int>(),
                                   inputs[2]->writeMap<int>()};
for (int i = 0; i < inputs[0]->getInfo()->size; ++i) {
    // write element i of each input buffer
    input_pointer[0][i] = i + 1;
    input_pointer[1][i] = i + 2;
    input_pointer[2][i] = i + 3;
}
std::vector<VARP> outputs  = module->onForward(inputs);
auto output_ptr = outputs[0]->readMap<float>();

4.3.3 Debugging and callbacks

In debug mode, operator name strings and memory and compute-cost statistics are retained. To enable this mode, first create a runtime manager.

Creating a custom runtime manager

MNN::ScheduleConfig config;  // use default
BackendConfig backendConfig; // use default backendconfig
config.backendConfig     = &backendConfig;
std::shared_ptr<Executor::RuntimeManager> runtime_mgr(Executor::RuntimeManager::createRuntimeManager(config));

Debug mode

Use the custom runtime manager to load the model with debug mode enabled. This reuses the model-loading code from 4.2.2.

const std::string model_file = "/tmp/mymodule.mnn"; // model file with path
const std::vector<std::string> input_names{"input_1", "input_2", "input_3"};
const std::vector<std::string> output_names{"output_1"};
Module::Config mdconfig; // default module config
runtime_mgr->setMode(Interpreter::Session_Debug);
std::shared_ptr<Module> user_module;
user_module.reset(Module::load(input_names, output_names, model_file.c_str(), runtime_mgr, &mdconfig));

Setting and triggering callbacks

// Called before each operator runs, with the operator's input tensors
MNN::TensorCallBackWithInfo beforeCallBack = [&](const std::vector<MNN::Tensor*>& ntensors, const OperatorInfo* info) {
    // do any thing you want.
    auto opName = info->name();
    for (int i = 0; i < ntensors.size(); ++i) {
        auto ntensor    = ntensors[i];
        printf("input op name:%s, shape:", opName.c_str());
        ntensor->printShape();
    }
    return true;
};
// Called after each operator runs, with the operator's output tensors
MNN::TensorCallBackWithInfo callBack = [&](const std::vector<MNN::Tensor*>& ntensors,  const OperatorInfo* info) {
    auto opName = info->name();
    for (int i = 0; i < ntensors.size(); ++i) {
        auto ntensor    = ntensors[i];
        printf("output op name:%s, shape:", opName.c_str());
        ntensor->printShape();
    }
    return true;
};
// set callback function
Express::Executor::getGlobalExecutor()->setCallBack(std::move(beforeCallBack), std::move(callBack));
// forward would trigger callback
std::vector<VARP> outputs  = user_module->onForward(inputs);
