- Overview
- Installation
- mnn command
- mnnconvert command
- mnnquant command
- mnnvisual command
- MNN V2 API
- MNN
- data type
- tensor type
- error code (coming soon)
- Interpreter Class
- Interpreter(model_name)
- createSession(config)
- resizeSession(session)
- runSession(session)
- runSessionWithCallBack(session, begincallback, endcallback)
- getSessionOutput(session, output_tensorname)
- getSessionInput(session, input_tensorname)
- resizeTensor(tensor,shape)
- getSessionInputAll(session)
- getSessionOutputAll(session)
- cache()
- removeCache()
- Session Class
- MNN.Tensor
- MNN
- MNN V3 API
- main package structure
- MNN.expr
- Config settings
- MNN.expr.Var
- MNN expression utils
- MNN expression ops
- MNN.expr.sign(x)
- MNN.expr.abs(x)
- MNN.expr.negative(x)
- MNN.expr.floor(x)
- MNN.expr.ceil(x)
- MNN.expr.square(x)
- MNN.expr.sqrt(x)
- MNN.expr.rsqrt(x)
- MNN.expr.exp(x)
- MNN.expr.log(x)
- MNN.expr.sin(x)
- MNN.expr.cos(x)
- MNN.expr.tan(x)
- MNN.expr.asin(x)
- MNN.expr.acos(x)
- MNN.expr.atan(x)
- MNN.expr.reciprocal(x)
- MNN.expr.log1p(x)
- MNN.expr.tanh(x)
- MNN.expr.sigmoid(x)
- MNN.expr.add(x, y)
- MNN.expr.subtract(x, y)
- MNN.expr.multiply(x, y)
- MNN.expr.divide(x, y)
- MNN.expr.pow(x, y)
- MNN.expr.minimum(x, y)
- MNN.expr.maximum(x, y)
- MNN.expr.bias_add(value, bias)
- MNN.expr.greater(x, y)
- MNN.expr.greater_equal(x, y)
- MNN.expr.less(x, y)
- MNN.expr.floordiv(x, y)
- MNN.expr.equal(x, y)
- MNN.expr.less_equal(x, y)
- MNN.expr.floormod(x, y)
- MNN.expr.reduce_sum(input, axis=[], keep_dims=False)
- MNN.expr.reduce_mean(input, axis=[], keep_dims=False)
- MNN.expr.reduce_max(input, axis=[], keep_dims=False)
- MNN.expr.reduce_min(input, axis=[], keep_dims=False)
- MNN.expr.reduce_prod(input, axis=[], keep_dims=False)
- MNN.expr.reduce_any(input, axis=[], keep_dims=False)
- MNN.expr.reduce_all(input, axis=[], keep_dims=False)
- MNN.expr.cast(x, dtype)
- MNN.expr.matmul(a, b, transposeA=False, transposeB=False)
- MNN.expr.argmax(input, axis=0)
- MNN.expr.batch_matmul(x, y, adj_x=False, adj_y=False)
- MNN.expr.unravel_index(indices, dims)
- MNN.expr.scatter_nd(indices, updates, shape)
- MNN.expr.one_hot(indices, depth, on_value, off_value, axis)
- MNN.expr.broadcast_to(input, shape)
- MNN.expr.placeholder(shape, data_format=data_format.NCHW, dtype=dtype.float)
- MNN.expr.clone(source, deep_copy=False)
- MNN.expr.const(value_list, shape, data_format=data_format.NCHW, dtype=dtype.float)
- MNN.expr.conv2d(input, weight, bias, stride=[1, 1], padding=[0, 0], dilate=[1, 1], group=1, padding_mode=Padding_Mode.VALID)
- MNN.expr.conv2d_transpose(input, weight, bias, stride=[1, 1], padding=[0, 0], dilate=[1, 1], group=1, padding_mode=Padding_Mode.VALID)
- MNN.expr.max_pool(input, kernel, stride, padding_mode=Padding_Mode.VALID, pads=[0, 0])
- MNN.expr.avg_pool(input, kernel, stride, padding_mode=Padding_Mode.VALID, pads=[0, 0])
- MNN.expr.reshape(x, shape, original_format=data_format.NCHW)
- MNN.expr.scale(x, channels, scales, bias)
- MNN.expr.relu(x, slope=0.0)
- MNN.expr.relu6(x)
- MNN.expr.prelu(x, slopes)
- MNN.expr.softmax(logits, axis)
- MNN.expr.softplus(features)
- MNN.expr.softsign(features)
- MNN.expr.slice(input, starts, sizes)
- MNN.expr.split(input, size_splits, axis)
- MNN.expr.strided_slice(input, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask)
- MNN.expr.concat(values, axis)
- MNN.expr.convert(input, format)
- MNN.expr.transpose(x, perm)
- MNN.expr.channel_shuffle(x, group)
- MNN.expr.reverse_sequence(x, y, batch_dim, seq_dim)
- MNN.expr.crop(images, size, axis, offset)
- MNN.expr.resize(images, x_scale, y_scale)
- MNN.expr.pad(x, paddings, mode=PadValue_Mode.CONSTANT)
- MNN.expr.expand_dims(input, axis)
- MNN.expr.shape(input)
- MNN.expr.stack(values, axis=0)
- MNN.expr.crop_and_resize(image, boxes, box_ind, crop_size, method=Interp_Method.BILINEAR, extrapolation_value=0.0)
- MNN.expr.fill(dims, value)
- MNN.expr.tile(input, multiples)
- MNN.expr.gather(params, indices)
- MNN.expr.squeeze(input, axis=[])
- MNN.expr.unsqueeze(input, axis=[])
- MNN.expr.batch_to_space_nd(input, block_shape, crops)
- MNN.expr.gather_nd(params, indices)
- MNN.expr.selu(features, scale, alpha)
- MNN.expr.size(input)
- MNN.expr.elu(features, alpha=1.0)
- MNN.expr.matrix_band_part(input, num_lower, num_upper)
- MNN.expr.moments(x, axes, shift, keep_dims)
- MNN.expr.setdiff1d(x, y)
- MNN.expr.space_to_depth(input, block_size)
- MNN.expr.space_to_batch_nd(input, block_shape, paddings)
- MNN.expr.zeros_like(input)
- MNN.expr.unstack(value, axis=0)
- MNN.expr.rank(input)
- MNN.expr.range(start, limit, delta)
- MNN.expr.depth_to_space(input, block_size)
- MNN.data
- MNN.nn
- MNN.nn.Module
- MNN.nn.conv
- MNN.nn.linear
- MNN.nn.batch_norm
- MNN.nn.dropout
- MNN.nn.FixModule
- MNN.nn.load_module
- MNN.nn.load_module_from_file(file_name, input_names, output_names, dynamic=False, shape_mutable=False, rearrange=False, backend=F.Backend.CPU, memory_mode=F.MemoryMode.Normal, power_mode=F.PowerMode.Normal, precision_mode=F.PrecisionMode.Normal)
- MNN.nn.compress
- MNN.nn.loss
- MNN.optim
Overview
MNN extends its C++ core with a Python binding. The extension consists of two packages: MNN and MNNTools. MNN handles inference and training; MNNTools provides a set of command line tools: mnn, mnnconvert, mnnquant and mnnvisual. mnn lists all supported commands; mnnconvert converts models from other frameworks into MNN model files; mnnquant quantizes MNN model files; mnnvisual visualizes the structure of an MNN model, saving its graph topology as an image.
Installation
Installing dependencies
graphviz
On macOS:
brew install graphviz
On Linux:
apt-get install graphviz
Version requirements
Python 2.7, 3.5, 3.6 and 3.7 are currently supported, except that 2.7 is not supported on Windows.
Installing MNN
On macOS:
pip install -U MNN
On Linux:
pip install -U pip
pip install -U MNN
On Linux, due to PyPI's packaging policy, distributions must use wheel packages with the ManyLinux tag. Older versions of pip cannot find these packages, so upgrade pip to the latest version first.
mnn command
On macOS/Linux, type the mnn command. Example:
mnn
mnn toolsets has following command line tools
$mnn
list out mnn commands
$mnnconvert
convert other model to mnn model
$mnnquant
quantize mnn model
$mnnvisual
save mnn model topology to image
On Windows:
Option a: find the mnn install path, add it to the environment variables, and use the mnn command directly
Option b: run the alternative command
python(3) -m MNN.tools.mnn
mnnconvert command
On macOS/Linux, type the mnnconvert command. Example:
mnnconvert -h
usage: mnnconvert [-h] -f {TF,CAFFE,ONNX,TFLITE,MNN} --modelFile MODELFILE
[--prototxt PROTOTXT] --MNNModel MNNMODEL [--fp16 FP16]
optional arguments:
-h, --help show this help message and exit
-f {TF,CAFFE,ONNX,TFLITE,MNN}, --framework {TF,CAFFE,ONNX,TFLITE,MNN}
model type, for example:TF/CAFFE/ONNX/TFLITE/MNN
--modelFile MODELFILE
tensorflow Pb or caffeModel, for
example:xxx.pb/xxx.caffemodel
--prototxt PROTOTXT only used for caffe, for example: xxx.prototxt
--MNNModel MNNMODEL MNN model, ex: xxx.mnn
--fp16 FP16 if set True, you will get a fp16 float model but not a fp32 float model
On Windows:
Option a: find the mnnconvert install path, add it to the environment variables, and use the mnnconvert command directly
Option b: run the alternative command
python(3) -m MNN.tools.mnnconvert
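For example, to convert a TensorFlow pb into an MNN model (file names are illustrative):
mnnconvert -f TF --modelFile mobilenet.pb --MNNModel mobilenet.mnn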
mnnquant command
On macOS/Linux, type the mnnquant command. Example:
mnnquant -h
usage: mnnquant [-h] src_mnn dst_mnn config
positional arguments:
src_mnn src mnn file, for example:src.mnn
dst_mnn dst mnn file, for example:dst.mnn
config config json file, for example:config.json
optional arguments:
-h, --help show this help message and exit
On Windows:
Option a: find the mnnquant install path, add it to the environment variables, and use the mnnquant command directly
Option b: run the alternative command
python(3) -m MNN.tools.mnnquant
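For example, quantizing a model with a JSON config (file names are illustrative):
mnnquant src.mnn dst.mnn config.json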
mnnvisual command
On macOS/Linux, run the mnnvisual command. Example:
mnnvisual -h
usage: mnnvisual [-h] target_mnn target_pic
positional arguments:
target_mnn target mnn file, for example:target.mnn
target_pic target picture file, for example:target.png, or target.jpeg
optional arguments:
-h, --help show this help message and exit
On Windows:
Option a: find the mnnvisual install path, add it to the environment variables, and use the mnnvisual command directly
Option b: run the alternative command
python(3) -m MNN.tools.mnnvisual
In the generated model structure graph, red rectangles are tensor names and blue ellipses are op types.
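For example, to render a model's topology to an image (file names are illustrative):
mnnvisual target.mnn target.png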
MNN V2 API
MNN
data type
MNN.Halide_Type_Int
MNN.Halide_Type_Int64
MNN.Halide_Type_Float
MNN.Halide_Type_Double
MNN.Halide_Type_Uint8
MNN.Halide_Type_String
tensor type
MNN.Tensor_DimensionType_Tensorflow
MNN.Tensor_DimensionType_Caffe
MNN.Tensor_DimensionType_Caffe_C4
error code (coming soon)
NO_ERROR = 0
OUT_OF_MEMORY = 1
NOT_SUPPORT = 2
COMPUTE_SIZE_ERROR = 3
NO_EXECUTION = 4
# user error
INPUT_DATA_ERROR = 10
CALL_BACK_STOP = 11
TENSOR_NOT_SUPPORT = 20
TENSOR_NEED_DIVIDE = 21
Interpreter Class
Interpreter(model_name)
Loads a .mnn model file and creates an MNN interpreter.
Args:
- model_name: path of the model file, str, default encoding
Returns:
- an Interpreter object
Example:
interpreter = MNN.Interpreter("lenet.mnn")
createSession(config)
Creates a session from the given config.
Args:
- config: session config, optional, dict, with the following keys
- backend: inference backend, str. One of "CPU" (default), "OPENCL", "OPENGL", "VULKAN", "METAL", "TRT", "CUDA", "HIAI". Whether an option is available depends on whether the corresponding backend was built into MNN.whl; the MNN package on PyPI currently supports the CPU backend only. Contact the MNN team if you need other backends.
- numThread: int / long. The number of inference threads on the CPU backend, or the GPU mode on GPU backends
- saveTensors: names of intermediate tensors to keep, tuple of str, see note 1
- inputPaths: input tensor names, tuple of str
- outputPaths: output tensor names, tuple of str
Returns:
- a Session object
Note:
- MNN reuses the input/output memory across the network's layers, so to keep a layer's inference result you must specify its name here
Example:
session = interpreter.createSession({'numThread': 2, 'saveTensors': ('t1', 't2'), 'inputPaths': ('op5', 'op6'), 'outputPaths': ('op30', 'op31')})
session = interpreter.createSession()
resizeSession(session)
Allocates memory for the session and prepares it for inference.
Args:
- session object, required
Returns:
- None
Example:
interpreter.resizeSession(session)
runSession(session)
Runs the session to perform inference.
Args:
- session object, required
Returns:
- error code
Example:
interpreter.runSession(session)
runSessionWithCallBack(session, begincallback, endcallback)
Runs the session to perform inference, executing callback functions before and after each layer.
Args:
- session object, required
- begincallback: optional, a Python callable (e.g. a function or lambda) run before each layer's inference. Returning False aborts inference immediately and invokes endcallback. The callback's arguments are:
- tensors, the layer's input tensors
- name, the layer's name
- endcallback: optional, run after each layer's inference; otherwise the same as above
Returns:
- error code
Example:
interpreter.runSessionWithCallBack(session)
def begin_callback(tensors, name):
'''the begin callback,check op name and each tensor's shape'''
print(name)
for tensor in tensors:
print(tensor.getShape())
return True
def end_callback(tensors, name):
'''the end callback,check op name and each tensor's shape'''
print(name)
for tensor in tensors:
print(tensor.getShape())
return True
interpreter.runSessionWithCallBack(session, begin_callback, end_callback)
getSessionOutput(session, output_tensorname)
Gets an output tensor of the network.
Args:
- session object
- output tensor name, str, optional; if not specified, the default output tensor is returned
Returns:
- the output tensor. Its memory may reside on the device and its internal layout may be NC4HW4, so do not call getData / getNumpyData on it directly
Example:
output_tensor = interpreter.getSessionOutput(session)
output_tensor = interpreter.getSessionOutput(session, "output")
getSessionInput(session, input_tensorname)
Gets an input tensor of the network.
Args:
- session object, required
- input tensor name, str
Returns:
- the input tensor. Its memory may reside on the device and its internal layout may be NC4HW4, so do not call getData / getNumpyData on it directly
Example:
input_tensor = interpreter.getSessionInput(session, "input")
resizeTensor(tensor,shape)
Changes the tensor's shape and reallocates memory.
Args:
- tensor object, required
- target shape of the tensor, required, tuple of ints
Returns:
- None
Example:
interpreter.resizeTensor(input_tensor,(100, 1, 28, 28))
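Note that after resizing an input you should re-allocate the session's memory before running inference again (a sketch of the assumed workflow):
interpreter.resizeTensor(input_tensor, (100, 1, 28, 28))
interpreter.resizeSession(session)
interpreter.runSession(session)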
getSessionInputAll(session)
Gets all input tensors of the network.
Args:
- session object, required
Returns:
- a Python dict mapping each input tensor's name to the tensor
Example:
inputs = interpreter.getSessionInputAll(session)
getSessionOutputAll(session)
Gets all output tensors of the network.
Args:
- session object, required
Returns:
- a Python dict mapping each output tensor's name to the tensor
Example:
outputs = interpreter.getSessionOutputAll(session)
cache()
Takes no arguments and returns nothing.
Example:
interpreter.cache()
removeCache()
Takes no arguments and returns nothing.
Example:
interpreter.removeCache()
Session Class
cache()
Takes no arguments and returns nothing.
Example:
session.cache()
removeCache()
Takes no arguments and returns nothing.
Example:
session.removeCache()
MNN.Tensor
Tensor Class
Tensor(shape, data_type, data, dimType)
Creates a Tensor object.
Args:
- shape of the tensor, required, tuple of ints (each >= 0)
- data type, required; see the data type list above
- data, required; one of the following types
- tuple, flattened to one dimension
- numpy array, whose shape matches the tensor shape or is flattened to one dimension (recommended)
- dimension type of the tensor; see the tensor type list above
Returns:
- a Tensor object
Example:
tmpTensor = MNN.Tensor((100, 1, 28, 28), MNN.Halide_Type_Float, torch_tensor.numpy(), MNN.Tensor_DimensionType_Caffe)
tmpTensor = MNN.Tensor((100, 1, 28, 28), MNN.Halide_Type_Float, tuple(torch_tensor.reshape(100 * 1 * 28 * 28)), MNN.Tensor_DimensionType_Caffe)
getShape()
Gets the tensor's shape as a tuple of ints.
shape = input_tensor.getShape()
getDataType()
Gets the tensor's data type; see the data type list above.
data_type = input_tensor.getDataType()
getData()
Gets the tensor's data as a flattened 1-D tuple. If the tensor was obtained via getSessionInput / getSessionOutput, its dimension type may be Caffe_C4, in which case the returned length may be longer than the theoretical size due to C4 alignment. Since MNN 1.0.1, the output is returned in numpy format.
Example:
data = input_tensor.getData()
fromNumpy(data)
Copies data from a numpy array into the Tensor.
Args:
- a numpy array
Returns:
- None
Example:
input_tensor.fromNumpy(np.zeros([2,2]))
getHost()
Gets the Tensor's internal data pointer (do not use).
Example:
host = input_tensor.getHost()
getDimensionType()
Gets the tensor's dimension type; see the tensor type list above.
Example:
dim_type = input_tensor.getDimensionType()
copyFrom(src_tensor)/copyFromHostTensor(src_tensor)
Copies the data of another tensor into this tensor.
Args:
- src_tensor, required, Tensor type. For copyFromHostTensor, the source tensor's memory must reside on the CPU backend
Returns:
- bool, True if the copy succeeded
Example:
input_tensor.copyFrom(tmp_tensor)
input_tensor.copyFromHostTensor(tmp_tensor)
copyToHostTensor(dst_tensor)
Copies this tensor's data into another tensor.
Args:
- dst_tensor, required, Tensor type
Returns:
- bool, True if the copy succeeded
Example:
output_tensor.copyToHostTensor(tmp_tensor)
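Putting the V2 API together, a minimal end-to-end inference sketch (the model path, the tensor names "input" / "output", and the shapes are assumptions for illustration):
import numpy as np
import MNN

interpreter = MNN.Interpreter("lenet.mnn")
session = interpreter.createSession({'numThread': 2})
input_tensor = interpreter.getSessionInput(session, "input")

# build a host tensor and copy it into the (possibly device-side) input tensor
data = np.random.rand(1, 1, 28, 28).astype(np.float32)
tmp_input = MNN.Tensor((1, 1, 28, 28), MNN.Halide_Type_Float, data, MNN.Tensor_DimensionType_Caffe)
input_tensor.copyFrom(tmp_input)

interpreter.runSession(session)

# copy the output into a host tensor before reading, since it may live on the device / in NC4HW4
output_tensor = interpreter.getSessionOutput(session, "output")
tmp_output = MNN.Tensor(output_tensor.getShape(), MNN.Halide_Type_Float,
                        np.zeros(output_tensor.getShape(), dtype=np.float32),
                        MNN.Tensor_DimensionType_Caffe)
output_tensor.copyToHostTensor(tmp_output)
print(tmp_output.getData())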
MNN V3 API
main package structure
import MNN
MNN.expr # a module that handles expressions, the basis of the V3 architecture
MNN.expr.Var # the Var class is the abstraction of a variable, similar to pytorch's Tensor
MNN.data # data module
MNN.nn # a module that handles neural network logic
MNN.nn.compress # model compression module
MNN.nn.loss # neural network losses
MNN.optim # a module that handles optimization logic
MNN.expr
A module that holds the basic building blocks for constructing graphs with the MNN V3 expression mechanism.
Config settings
F.set_config(backend=F.Backend.*, memory_mode=F.MemoryMode.*, power_mode=F.PowerMode.*, precision_mode=F.PrecisionMode.*, thread_num=1)
Here F stands for MNN.expr (F = MNN.expr).
backend: backend selection. F.Backend.CPU is the CPU backend, F.Backend.HIAI the Huawei NPU backend, F.Backend.OPENCL the OpenCL backend, and F.Backend.METAL the Metal backend on iOS/macOS. Defaults to the CPU backend.
memory_mode: memory usage. F.MemoryMode.Low for low memory usage, F.MemoryMode.High for high memory usage. Defaults to F.MemoryMode.Normal.
power_mode: power usage. F.PowerMode.Low for low power, F.PowerMode.High for high power. Defaults to F.PowerMode.Normal.
precision_mode: precision. F.PrecisionMode.Low for low precision, F.PrecisionMode.High for high precision. Defaults to F.PrecisionMode.Normal.
thread_num: number of threads (1 ~ 8).
To enable fp16: F.set_config(precision_mode=F.PrecisionMode.Low)
MNN.expr.Var
A variable in MNN expressions; behaves like a PyTorch tensor.
MNN.expr.Var.read()
Returns the variable's output.
Args:
None
Returns:
A sequence. Contains the variable's output.
MNN.expr.Var.write(data)
Writes to the variable's input.
Args:
data: A sequence. Contains the input of the variable.
Returns:
None
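A minimal sketch of the write/read round trip through an op (values illustrative; assumes F = MNN.expr):
import MNN
F = MNN.expr

x = F.placeholder([2, 2], F.data_format.NCHW, F.dtype.float)
x.write([1.0, 2.0, 3.0, 4.0])   # flattened input data
y = F.multiply(x, x)
print(y.read())                 # expected 1, 4, 9, 16 as a 2x2 variable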
MNN.expr.Var.shape
Returns the variable's shape.
Args:
None
Returns:
A sequence. Contains the variable's shape.
MNN.expr.Var.data_format
Returns the variable's data_format.
Args:
None
Returns:
An enum. Contains the variable's data_format.
MNN.expr.Var.dtype
Returns the variable's dtype.
Args:
None
Returns:
An enum. Contains the variable's dtype.
MNN.expr.Var.size
Returns the variable's size.
Args:
None
Returns:
An int. Contains the variable's size.
MNN.expr.Var.name
Property. Gets and sets the variable's name.
Args:
None
Returns:
A string. Contains the variable's name.
MNN.expr.Var.fix_as_const()
Detaches the variable, and transforms it into a const.
"Detach" means:
a. shrinks the current node and all the preceding nodes' computing logic into a single node (the current node) and freezes it.
b. from now on, detaches the connections to all the preceding nodes.
This function first computes the output of the variable with its preceding variables, then detaches the connections, and transforms it into a const variable.
Args:
None
Returns:
None
MNN.expr.Var.fix_as_placeholder()
Detaches the variable, and transforms it into a placeholder.
See MNN.expr.Var.fix_as_const().
This function first computes the output of the variable with its preceding variables, then detaches the connections, and transforms it into a placeholder variable.
Args:
None
Returns:
None
MNN.expr.Var.fix_as_trainable()
Detaches the variable, and transforms it into a trainable variable.
See MNN.expr.Var.fix_as_const().
This function first computes the output of the variable with its preceding variables, then detaches the connections, and transforms it into a trainable variable.
Args:
None
Returns:
None
MNN expression utils
MNN.expr.gc()
Triggers garbage collection.
Args:
None
Returns:
None
MNN.expr.set_thread_number(number_thread)
Sets the number of threads used for execution.
Args:
number_thread: An int.
Returns:
None
MNN.expr.load_as_list(file_name)
Loads from the mnn file, and returns a list of variables.
Args:
file_name: A string. The full path name of the mnn file.
Returns:
A list of variables.
MNN.expr.load_as_dict(file_name)
Loads from the mnn file, and returns a dict. The key of the dict is the variable's name, and the value of the dict is the variable.
Args:
file_name: A string. The full path name of the mnn file.
Returns:
A dict. The key of the dict is the variable's name, and the value of the dict is the variable.
MNN.expr.save(vars, file_name, for_inference = True)
Saves variables to an mnn file. It checks all dependencies of the variables and saves the whole needed graph to the file.
Args:
vars: A sequence. Contains the variables.
file_name: A string. The full path name of the mnn file.
for_inference: bool, if true, the model will be optimized for inference and remove or fuse useless ops
Returns:
None
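A minimal sketch of loading a model's variables and saving them back (the file name and the output variable name are hypothetical):
import MNN
F = MNN.expr

var_dict = F.load_as_dict("model.mnn")
output = var_dict["output"]          # assumed output variable name
F.save([output], "model_resaved.mnn")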
MNN expression ops
MNN.expr.sign(x)
Computes the sign of x element-wise.
sign(x) = 0 if x = 0
sign(x) = -1 if x < 0
sign(x) = 1 if x > 0
Args:
x: A variable.
Returns:
A variable. Has the same shape as x.
MNN.expr.abs(x)
Computes the absolute value of a variable.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.negative(x)
Computes numerical negative value element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.floor(x)
Returns element-wise largest integer not greater than x.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.ceil(x)
Returns element-wise smallest integer not less than x.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.square(x)
Computes square of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.sqrt(x)
Computes square root of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.rsqrt(x)
Computes reciprocal of square root of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.exp(x)
Computes exponential of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.log(x)
Computes natural logarithm of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.sin(x)
Computes sine of x element-wise.
Given an input variable, this function computes sine of every element in the variable.
Input range is (-inf, inf) and output range is [-1,1].
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.cos(x)
Computes cos of x element-wise.
Given an input variable, this function computes cosine of every element in the variable.
Input range is (-inf, inf) and output range is [-1,1]. If input lies outside the boundary, nan is returned.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.tan(x)
Computes tan of x element-wise.
Given an input variable, this function computes tangent of every element in the variable.
Input range is (-inf, inf) and output range is (-inf, inf). If input lies outside the boundary, nan is returned.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.asin(x)
Computes the trignometric inverse sine of x element-wise.
The asin operation returns the inverse of sin, such that if y = sin(x), then x = asin(y).
Note: The output of asin will lie within the invertible range of sine, i.e. [-pi/2, pi/2].
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.acos(x)
Computes acos of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.atan(x)
Computes atan of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.reciprocal(x)
Computes the reciprocal of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.log1p(x)
Computes natural logarithm of (1 + x) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.tanh(x)
Computes hyperbolic tangent of x element-wise.
Given an input variable, this function computes hyperbolic tangent of every element in the variable.
Input range is [-inf, inf] and output range is [-1,1].
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.sigmoid(x)
Computes sigmoid of x element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
Returns:
A variable. Has the same type as x.
MNN.expr.add(x, y)
Returns x + y element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float, dtype.int, dtype.int64, dtype.uint8.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.subtract(x, y)
Returns x - y element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float, dtype.int, dtype.int64, dtype.uint8.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.multiply(x, y)
Returns x * y element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float, dtype.int, dtype.int64, dtype.uint8.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.divide(x, y)
Returns Python style division of x by y.
Args:
x: A variable. Must be one of the following types: dtype.float, dtype.int, dtype.int64, dtype.uint8.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.pow(x, y)
Computes the power of one value to another.
Args:
x: A variable. Must be one of the following types: dtype.float, dtype.int, dtype.int64.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.minimum(x, y)
Returns the min of x and y (i.e. x < y ? x : y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int64.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.maximum(x, y)
Returns the max of x and y (i.e. x > y ? x : y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int64.
y: A variable. Must have the same type as x.
Returns:
A variable. Has the same type as x.
MNN.expr.bias_add(value, bias)
Adds bias to value.
This is (mostly) a special case of add where bias is restricted to 1-D.
Broadcasting is supported, so value may have any number of dimensions.
Unlike add, the type of bias is allowed to differ from value in the case where both types are quantized.
Args:
value: A variable with type dtype.float or dtype.int.
bias: A 1-D variable with size matching the channel dimension of value.
Must be the same type as value unless value is a quantized type, in which case a different quantized type may be used.
Returns:
A variable with the same type as value.
MNN.expr.greater(x, y)
Returns the truth value of (x > y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable of dtype.int.
MNN.expr.greater_equal(x, y)
Returns the truth value of (x >= y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable of dtype.int.
MNN.expr.less(x, y)
Returns the truth value of (x < y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable of dtype.int.
MNN.expr.floordiv(x, y)
Divides x / y elementwise, rounding toward the most negative integer.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable with the same type as x.
MNN.expr.equal(x, y)
Returns the truth value of (x == y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable of dtype.int.
MNN.expr.less_equal(x, y)
Returns the truth value of (x <= y) element-wise.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable of dtype.int.
MNN.expr.floormod(x, y)
Calculates element-wise remainder of division.
Args:
x: A variable. Must be one of the following types: dtype.float or dtype.int.
y: A variable. Must have the same type as x.
Returns:
A variable with the same type as x.
MNN.expr.reduce_sum(input, axis=[], keep_dims=False)
Computes the sum of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
MNN.expr.reduce_mean(input, axis=[], keep_dims=False)
Computes the mean of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
MNN.expr.reduce_max(input, axis=[], keep_dims=False)
Computes the maximum of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
MNN.expr.reduce_min(input, axis=[], keep_dims=False)
Computes the minimum of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
MNN.expr.reduce_prod(input, axis=[], keep_dims=False)
Computes the product of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
MNN.expr.reduce_any(input, axis=[], keep_dims=False)
Computes the "logical or" of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
MNN.expr.reduce_all(input, axis=[], keep_dims=False)
Computes the "logical and" of elements across dimensions of a variable.
Reduces input along the dimensions given in axis.
Unless keep_dims is true, the rank of the variable is reduced by 1 for each entry in axis.
If keep_dims is true, the reduced dimensions are retained with length 1.
If axis is empty, all dimensions are reduced, and a variable with a single element is returned.
Args:
input: The variable to reduce. Should have numeric type.
axis: The dimensions to reduce. If empty(the default), reduces all dimensions.
Must be in the range [-rank(input), rank(input)].
keep_dims: If true, retains reduced dimensions with length 1.
Returns:
The reduced variable, of the same dtype as the input.
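The reduce family shares the same calling pattern; a small sketch (values illustrative; assumes F = MNN.expr):
import MNN
F = MNN.expr

x = F.const([1., 2., 3., 4., 5., 6.], [2, 3], F.data_format.NCHW, F.dtype.float)
print(F.reduce_sum(x, [1], False).read())   # expected [6., 15.]
print(F.reduce_sum(x).read())               # empty axis reduces all dims -> 21.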
MNN.expr.cast(x, dtype)
Casts variable x to data type dtype.
Args:
x: A variable.
dtype: the data type to cast x to. Must be either dtype.float or dtype.int.
Returns:
A variable of type dtype.
MNN.expr.matmul(a, b, transposeA=False, transposeB=False)
Matrix product of two variables a and b.
Args:
a: A variable representing a matrix.
b: Another variable representing another matrix.
transposeA: whether to transpose matrix a before the matrix product. Defaults to False.
transposeB: whether to transpose matrix b before the matrix product. Defaults to False.
Returns:
A variable for the matrix product of two variables a and b.
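For instance (a sketch; assumes F = MNN.expr):
import MNN
F = MNN.expr

a = F.const([1., 2., 3., 4.], [2, 2], F.data_format.NCHW, F.dtype.float)
b = F.const([5., 6., 7., 8.], [2, 2], F.data_format.NCHW, F.dtype.float)
print(F.matmul(a, b).read())   # expected [[19., 22.], [43., 50.]]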
MNN.expr.argmax(input, axis=0)
Gets the index with the largest value across axes of a tensor.
Args:
input: A variable. Must be either dtype.float or dtype.int
axis: An int within the range (-rank(input), rank(input)). Describes which axis of the input variable to reduce across.
Returns:
A variable of type int.
MNN.expr.batch_matmul(x, y, adj_x=False, adj_y=False)
Multiplies all slices of variable x and y (each slice can be viewed as an element of a batch), and arranges the individual results in a single output tensor of the same batch size. Each of the individual slices can optionally be adjointed (to adjoint a matrix means to transpose and conjugate it) before multiplication by setting the adj_x or adj_y flag to True, which are by default False.
The input tensors x and y are 2-D or higher with shape [..., r_x, c_x] and [..., r_y, c_y].
The output tensor is 2-D or higher with shape [..., r_o, c_o], where:
r_o = c_x if adj_x else r_x
c_o = r_y if adj_y else c_y
It is computed as:
output[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :]).
Args:
x: A variable representing a matrix. 2-D or higher with shape [..., r_x, c_x].
y: A variable representing another matrix. 2-D or higher with shape [..., r_y, c_y].
adj_x: If True, adjoint the slices of x. Defaults to False.
adj_y: If True, adjoint the slices of y. Defaults to False.
Returns:
A variable for the batch multiplication result. 3-D or higher with shape [..., r_o, c_o]
MNN.expr.unravel_index(indices, dims)
Converts an array of flat indices into a tuple of coordinate arrays. This is equivalent to np.unravel_index.
Args:
indices: An 0-D or 1-D int variable whose elements are indices into the flattened version of an array of dimensions dims.
dims: The shape of the array to use for unraveling indices.
Returns:
A variable with the same type as indices.
MNN.expr.scatter_nd(indices, updates, shape)
Scatter updates into a new variable according to indices.
Creates a new variable by applying sparse updates to individual values or slices within a variable (initially zero for numeric, empty for string) of the given shape according to indices. This is equivalent to tf.scatter_nd.
Args:
indices: A variable for the index.
updates: A variable for the updates to scatter into output.
shape: A variable with the same type as indices. The shape of the resulting variable.
Returns:
A variable with the same type as updates.
MNN.expr.one_hot(indices, depth, on_value, off_value, axis)
Returns a one-hot variable.
Args:
indices: A variable of indices.
depth: A variable defining the depth of the one hot dimension.
on_value: A variable defining the value to fill in output when indices[j] = i. (default: 1)
off_value: A variable defining the value to fill in output when indices[j] != i. (default: 0)
axis: The axis to fill (default: -1, a new inner-most axis).
Returns:
The one-hot variable.
MNN.expr.broadcast_to(input, shape)
Broadcasts an array to a compatible shape.
Args:
input: A variable to broadcast.
shape: A 1D int variable. The shape of the desired output.
Returns:
A variable of the same type as input.
MNN.expr.placeholder(shape, data_format=data_format.NCHW, dtype=dtype.float)
Creates a placeholder variable.
Args:
shape: The shape of the placeholder variable.
data_format: The data format of the placeholder variable. Defaults to NCHW.
dtype: The data type of the placeholder variable. Defaults to dtype.float.
Returns:
A placeholder variable.
MNN.expr.clone(source, deep_copy=False)
Clones a variable from the source variable.
Args:
source: The source variable to clone from.
deep_copy: Whether to deep copy the source variable. Defaults to False.
Returns:
A new cloned variable.
MNN.expr.const(value_list, shape, data_format=data_format.NCHW, dtype=dtype.float)
Creates a constant variable.
Args:
value_list: Python object holding the value of the constant.
dtype: Data type of the constant variable.
shape: Shape of the constant variable. A list of ints.
data_format: The data format of constant variable. Defaults to NCHW.
Returns:
A constant variable.
MNN.expr.conv2d(input, weight, bias, stride=[1, 1], padding=[0, 0], dilate=[1, 1], group=1, padding_mode=Padding_Mode.VALID)
2-D convolution.
Args:
input: A 4-D variable in NC4HW4 format.
weight: A 4-D variable of the same type as input.
bias: A variable for the bias after convolution. Its size must be the same as the channel number of the weight.
stride: A list of ints. The strides of the sliding window for each dimension of input. Defaults to [1, 1].
padding: A list of ints indicating the explicit paddings at the start and end of each dimension. Defaults to [0, 0].
dilate: A list of ints indicating the dilation for each dimension. Defaults to [1, 1].
group: Number of blocked connections from input channels to output channels. Defaults to 1.
padding_mode: Padding mode. Defaults to Padding_Mode.VALID.
Returns:
A variable with the same type as input.
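A short sketch chaining format conversion and conv2d (shapes, random weights, and the NC4HW4 enum path are assumptions):
import numpy as np
import MNN
F = MNN.expr

x = F.const(np.random.rand(1, 3, 32, 32).flatten().tolist(), [1, 3, 32, 32], F.data_format.NCHW)
x = F.convert(x, F.data_format.NC4HW4)   # conv2d expects NC4HW4 input
w = F.const(np.random.rand(8, 3, 3, 3).flatten().tolist(), [8, 3, 3, 3], F.data_format.NCHW)
b = F.const([0.0] * 8, [8], F.data_format.NCHW)
y = F.conv2d(x, w, b, stride=[1, 1], padding=[1, 1])
y = F.convert(y, F.data_format.NCHW)     # convert back before reading
print(y.shape)                           # expected [1, 8, 32, 32]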
MNN.expr.conv2d_transpose(input, weight, bias, stride=[1, 1], padding=[0, 0], dilate=[1, 1], group=1, padding_mode=Padding_Mode.VALID)
Transposed convolution (a.k.a Deconvolution).
Args:
input: A 4-D variable in NC4HW4 format.
weight: A 4-D variable of the same type as input.
bias: A variable for the bias after convolution. Its size must be the same as the channel number of the weight.
stride: A list of ints. The strides of the sliding window for each dimension of input. Defaults to [1, 1].
padding: A list of ints indicating the explicit paddings at the start and end of each dimension. Defaults to [0, 0].
dilate: A list of ints indicating the dilation for each dimension. Defaults to [1, 1].
group: Number of blocked connections from input channels to output channels. Defaults to 1.
padding_mode: Padding mode. Defaults to Padding_Mode.VALID.
Returns:
A variable with the same type as input.
MNN.expr.max_pool(input, kernel, stride, padding_mode=Padding_Mode.VALID, pads=[0, 0])
Performs the max pooling of the input.
Args:
input: A 4-D variable in NC4HW4 format.
kernel: A list of ints. The size of the window for each dimension of the input.
stride: A list of ints. The strides of the sliding window for each dimension of input.
padding_mode: Padding mode. Defaults to Padding_Mode.VALID.
pads: A list of ints indicating the explicit paddings at the start and end of each dimension. Defaults to [0, 0].
Returns:
A variable for the max pooling result.
MNN.expr.avg_pool(input, kernel, stride, padding_mode=Padding_Mode.VALID, pads=[0, 0])
Performs the average pooling of the input.
Args:
input: A 4-D variable in NC4HW4 format.
kernel: A list of ints. The size of the window for each dimension of the input.
stride: A list of ints. The strides of the sliding window for each dimension of input.
padding_mode: Padding mode. Defaults to Padding_Mode.VALID.
pads: A list of ints indicating the explicit paddings at the start and end of each dimension. Defaults to [0, 0].
Returns:
A variable for the average pooling result.
MNN.expr.reshape(x, shape, original_format=data_format.NCHW)
Reshapes the input variable.
Args:
x: the input variable to be reshaped.
shape: the desired shape of the output variable.
original_format: the original data format of the input variable. Can only be NCHW or NHWC. If the input variable x is in NC4HW4 format, this is the original data format from which x was converted. If the input variable x is not in NC4HW4 format, this is simply its data format. Defaults to NCHW.
Returns:
A variable with the type as the input.
MNN.expr.scale(x, channels, scales, bias)
Scales the input variable.
Args:
x: the input variable.
channels: An int. Number of channels.
scales: A list of floats. Scaling factor for each channel.
bias: A list of floats. Bias value after the scale.
Returns:
A variable with the same type as input x.
MNN.expr.relu(x, slope=0.0)
Computes the rectified linear of the input.
Args:
x: The input variable.
slope: A float for the negative slope. Defaults to 0.0f.
Returns:
A variable for the relu result.
MNN.expr.relu6(x)
Computes Rectified Linear 6: min(max(x, 0), 6).
Args:
x: The input variable.
Returns:
A variable for the relu6 result.
MNN.expr.prelu(x, slopes)
Parametric relu.
Args:
x: The input variable.
slopes: A list of floats. Slopes for the negative region.
Returns:
A variable for the prelu result.
MNN.expr.softmax(logits, axis)
Computes softmax of the input logits.
Args:
logits: A variable. Must be of dtype.float.
axis: The dimension the softmax is performed on. Defaults to -1, which indicates the last dimension.
Returns:
A variable with the same type as logits.
MNN.expr.softplus(features)
Computes softplus: log(exp(features) + 1).
Args:
features: A variable. Must be of dtype.float.
Returns:
A variable with the same type as input features.
MNN.expr.softsign(features)
Computes softsign: features / (abs(features) + 1).
Args:
features: A variable. Must be of dtype.float.
Returns:
A variable with the same type as input features.
MNN.expr.slice(input, starts, sizes)
Slices the input variable.
Args:
input: A variable.
starts: A variable. The starting location of the slice.
sizes: A variable. The number of elements of the i-th dimension of the input x that you want to slice.
Returns:
A variable with the same type as input.
MNN.expr.split(input, size_splits, axis)
Splits a variable into a list of sub-variables.
Args:
input: The variable to split.
size_splits: A vector. A 1-D int variable containing the size of each output variable along axis.
axis: An int. The dimension along which to split. Must be in the range [-rank(value), rank(value)). Defaults to 0.
Returns:
A list of variables.
MNN.expr.strided_slice(input, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask)
Strided slice of the input. Roughly speaking, this op extracts a slice of size (end-begin)/stride from the given input variable.
Args:
input: A variable.
begin: A variable. The starting location of the slice.
end: A variable. The ending location of the slice.
strides: A variable. The stride of the slice.
begin_mask: An integer. If the ith bit of begin_mask is set, begin[i] is ignored and the fullest possible range in that dimension is used instead.
end_mask: An integer. Works analogously to begin_mask, except for the end range.
ellipsis_mask: An integer. If the ith bit of ellipsis_mask is set, as many unspecified dimensions as needed will be inserted between other dimensions. Only one non-zero bit is allowed in ellipsis_mask.
new_axis_mask: If the ith bit of new_axis_mask is set, then begin, end, and stride are ignored and a new length 1 dimension is added at this point in the output tensor.
shrink_axis_mask: If the ith bit of shrink_axis_mask is set, it implies that the ith specification shrinks the dimensionality by 1, taking on the value at index begin[i]. end[i] and strides[i] are ignored in this case. For example in Python one might do foo[:, 3, :] which would result in shrink_axis_mask equal to 2.
Returns:
A variable with the same type as input.
MNN.expr.concat(values, axis)
Concatenates variables along one dimension.
Args:
values: A list of variables.
axis: An int. Dimension along which to concatenate. Must be in the range [-rank(values), rank(values)). As in Python, indexing for axis is 0-based: a positive axis in the range [0, rank(values)) refers to the axis-th dimension, and a negative axis refers to the (axis + rank(values))-th dimension.
Returns:
A variable resulting from concatenation of the input variables.
MNN.expr.convert(input, format)
Converts a variable to another format.
Args:
input: A variable.
format: The target format.
Returns:
A variable. If input is already in the target format, input is returned directly; otherwise a new variable in the target format is appended after input.
MNN.expr.transpose(x, perm)
Transposes input x.
Args:
x: A variable.
perm: A list of ints, indicating the permutation of the dimensions of x.
Returns:
A transposed variable.
MNN.expr.channel_shuffle(x, group)
Shuffles the input variable x channel-wise.
Args:
x: the input variable.
group: An int. The number of groups to divide the channel dimension into.
Returns:
The channel shuffled variable.
MNN.expr.reverse_sequence(x, y, batch_dim, seq_dim)
Reverses variable length slices.
Args:
x: A variable. The input variable to be reversed.
y: A variable. 1-D with length input.dims(batch_dim), and max(y) <= input.dims(seq_dim).
batch_dim: An int. The dimension along which reversal is performed.
seq_dim: An int. The dimension which is partially reversed.
Returns:
The partially reversed variable. Has the same type as x.
MNN.expr.crop(images, size, axis, offset)
Crop images.
Args:
images: 4-D variable of NC4HW4 format.
size: A variable. The shape of `size` is used as the shape of the cropped output, while its values/format are ignored.
axis: An int indicating the dimension to crop. Must be >= 2. All dimensions up to but excluding `axis` are preserved, while the dimensions including and trailing `axis` are cropped.
offset: A vector of ints indicating the offsets. length(`offset`) must be >= 1 and <= 2. If length(`offset`) is 1, all cropped dimensions are offset by this amount. Otherwise, the number of offsets must equal the number of cropped axes.
Returns:
The cropped 4-D variable of NC4HW4 format.
MNN.expr.resize(images, x_scale, y_scale)
Resize images.
Args:
images: 4D variable of NC4HW4 format.
x_scale: A float.
y_scale: A float.
Returns:
The resized 4-D variable of NC4HW4 format.
MNN.expr.pad(x, paddings, mode=PadValue_Mode.CONSTANT)
Pads a variable.
Args:
x: A variable.
paddings: A variable of dtype int. The shape is [n, 2], where n is the rank of variable x.
mode: An enum. Defaults to PadValue_Mode.CONSTANT.
Returns:
A variable. Has the same type as x.
MNN.expr.expand_dims(input, axis)
Returns a variable with an additional dimension inserted at index axis.
Args:
input: A variable.
axis: A int, specifying the dimension index at which to expand the shape of input.
Given an input of D dimensions, axis must be in range [-(D+1), D] (inclusive).
Returns:
A variable with the same data as input, with an additional dimension inserted at the index specified by axis.
MNN.expr.shape(input)
Returns the shape of a variable.
Args:
input: A variable.
Returns:
A variable of dtype int.
MNN.expr.stack(values, axis=0)
Stacks a list of rank-R variables into one rank-(R+1) variable.
Packs the list of variables in `values` into a variable with rank one higher than each variable in values, by packing them along the axis dimension.
Given a list of length N of variables of shape (A, B, C);
if axis == 0 then the output variable will have the shape (N, A, B, C).
if axis == 1 then the output variable will have the shape (A, N, B, C). Etc.
Args:
values: A list of variable objects with the same shape and type.
axis: An int. The axis to stack along. Defaults to the first dimension. Negative values wrap around,
so the valid range is [-(R+1), R+1).
Returns:
output: A stacked variable with the same type as `values`.
MNN.expr.crop_and_resize(image, boxes, box_ind, crop_size, method=Interp_Method.BILINEAR, extrapolation_value=0.0)
Extracts crops from the input image variable and resizes them using bilinear sampling or nearest neighbor sampling (possibly with aspect ratio change) to a common output size specified by crop_size.
Returns a variable with crops from the input image at positions defined at the bounding box locations in boxes.
The cropped boxes are all resized (with bilinear or nearest neighbor interpolation) to a fixed size = [crop_height, crop_width].
The result is a 4-D tensor [num_boxes, crop_height, crop_width, depth](supposing NHWC format).
Args:
image: A 4-D variable of shape [batch, image_height, image_width, depth](supposing NHWC format). Both image_height and image_width need to be positive.
boxes: A 2-D variable of shape [num_boxes, 4]. The i-th row of the variable specifies the coordinates of a box in the box_ind[i] image and is specified in normalized coordinates [y1, x1, y2, x2].
A normalized coordinate value of y is mapped to the image coordinate at y * (image_height - 1), so as the [0, 1] interval of normalized image height is mapped to [0, image_height - 1] in image height coordinates. We do allow y1 > y2, in which case the sampled crop is an up-down flipped version of the original image. The width dimension is treated similarly. Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.
box_ind: A 1-D variable of shape [num_boxes] with int values in [0, batch). The value of box_ind[i] specifies the image that the i-th box refers to.
crop_size: A 1-D variable of 2 elements, size = [crop_height, crop_width]. All cropped image patches are resized to this size. The aspect ratio of the image content is not preserved. Both crop_height and crop_width need to be positive.
method: An enum, either Interp_Method.BILINEAR or Interp_Method.NEAREST. Defaults to Interp_Method.BILINEAR.
extrapolation_value: Value used for extrapolation, when applicable. Defaults to 0.0f.
Returns:
Output: A 4-D variable of shape [num_boxes, crop_height, crop_width, depth](supposing NHWC format).
MNN.expr.fill(dims, value)
Creates a variable filled with a scalar value.
Args:
dims: A variable. Must be 1-D dtype int. Represents the shape of the output variable.
value: A variable. 0-D (scalar). Value to fill the returned variable.
Returns:
A variable. Has the same type as value.
MNN.expr.tile(input, multiples)
Constructs a variable by tiling a given variable.
Args:
input: A variable. 1-D or higher.
multiples: A variable. Must be 1-D dtype int. Length must be the same as the number of dimensions in input.
Returns:
A variable. Has the same type as input.
MNN.expr.gather(params, indices)
Gather slices from params according to indices.
Args:
params: The variable from which to gather values.
indices: Index variable. Must be dtype int in range [0, ndims(params)-1].
Returns:
Output: Values from params gathered from indices given by indices.
MNN.expr.squeeze(input, axis=[])
Removes dimensions of size 1 from the shape of a variable.
Args:
input: A variable. The input to squeeze.
axis: A list of ints. If specified, only squeezes the dimensions listed. The dimension index starts at 0.
Must be in the range [-rank(input), rank(input)). Defaults to an empty list.
Returns:
A variable. Has the same type as input. Contains the same data as input, but has one or more dimensions of size 1 removed.
MNN.expr.unsqueeze(input, axis=[])
Returns a new variable with a dimension of size one inserted at the specified position.
Args:
input: A variable.
axis: the position to insert a new dimension.
Returns:
A variable with one more dimension than input.
MNN.expr.batch_to_space_nd(input, block_shape, crops)
BatchToSpace for N-D variables
This operation reshapes the "batch" dimension 0 into M + 1 dimensions of shape block_shape + [batch],
interleaves these blocks back into the grid defined by the spatial dimensions [1, ..., M],
to obtain a result with the same rank as the input.
The spatial dimensions of this intermediate result are then optionally cropped according to crops to
produce the output. This is the reverse of space_to_batch_nd. See below for a precise description.
Args:
input: must be 4-D with NC4HW4 format. N-D with shape input_shape = [batch] + spatial_shape + remaining_shape, where spatial_shape has M dimensions.
block_shape: 1-D with shape [M], all values must be >= 1.
crops: 2-D with shape [M, 2], all values must be >= 0. crops[i] = [crop_start, crop_end] specifies the amount to crop from input dimension i + 1,
which corresponds to spatial dimension i. It is required that crop_start[i] + crop_end[i] <= block_shape[i] * input_shape[i + 1].
This operation is equivalent to the following steps:
Reshape input to reshaped of shape: [block_shape[0], ..., block_shape[M-1], batch / prod(block_shape),
input_shape[1], ..., input_shape[N-1]]
Permute dimensions of reshaped to produce permuted of shape
[batch / prod(block_shape),input_shape[1], block_shape[0], ..., input_shape[M], block_shape[M-1],input_shape[M+1], ..., input_shape[N-1]]
Reshape permuted to produce reshaped_permuted of shape
[batch / prod(block_shape),input_shape[1] * block_shape[0], ..., input_shape[M] * block_shape[M-1],input_shape[M+1], ..., input_shape[N-1]]
Crop the start and end of dimensions [1, ..., M] of reshaped_permuted according to crops to produce the output of shape:
[batch / prod(block_shape),input_shape[1] * block_shape[0] - crops[0,0] - crops[0,1], ..., input_shape[M] * block_shape[M-1] - crops[M-1,0] - crops[M-1,1],input_shape[M+1], ..., input_shape[N-1]]
Some examples:
for the following input of shape [4, 1, 1, 3], block_shape = [2, 2], and crops = [[0, 0], [0, 0]]:
[[[[1, 2, 3]]], [[[4, 5, 6]]], [[[7, 8, 9]]], [[[10, 11, 12]]]]
The output variable has shape [1, 2, 2, 3] and value:
x = [[[[1, 2, 3], [4, 5, 6]],
[[7, 8, 9], [10, 11, 12]]]]
Returns:
Output: The output variable
MNN.expr.gather_nd(params, indices)
Gather slices from params into a variable with shape specified by indices.
Args:
params: A variable. The variables from which to gather values.
indices: A variable. Must be of dtype int.
Returns:
A variable. Has the same type as params.
MNN.expr.selu(features, scale, alpha)
Computes scaled exponential linear: scale * alpha * (exp(features) - 1) if < 0, scale * features otherwise.
Args:
features: A variable of dtype float.
scale: Scaling factor (positive float)
alpha: Alpha factor (positive float)
Returns:
A variable. Has the same type as features.
MNN.expr.size(input)
Computes the size of the variable.
Args:
input: A variable of type dtype float or dtype int
Returns:
A variable. The shape is 0D, and type is dtype int.
MNN.expr.elu(features, alpha=1.0)
Computes exponential linear: alpha * (exp(features) - 1) if < 0, features otherwise.
Args:
features: A variable of type dtype float.
alpha: Alpha factor (positive float). Defaults to 1.0.
Returns:
A variable. Has the same type as features.
MNN.expr.matrix_band_part(input, num_lower, num_upper)
Copies a variable setting everything outside a central band in each innermost matrix.
Args:
input: A rank k variable.
num_lower: A variable. Number of subdiagonals to keep. If negative, keep entire lower triangle.
num_upper: A variable. Number of superdiagonals to keep. If negative, keep entire upper triangle.
Returns:
Output: Rank k variable of the same shape as input. The extracted banded tensor.
MNN.expr.moments(x, axes, shift, keep_dims)
Calculates the mean and variance of x.
Args:
x: A variable. Must be 4-D with NC4HW4 format.
axes: Array of ints. Axes along which to compute mean and variance. Ignored for this implementation: must be {2, 3}
shift: Not used in the current implementation.
keep_dims: Produce moments with the same dimensionality as the input. Ignored for this implementation: must be true.
Returns:
Two variable objects: mean and variance.
MNN.expr.setdiff1d(x, y)
Computes the difference between two lists of numbers or strings.
Given a list x and a list y, this operation returns a list out that represents all values that are in x but not in y.
The returned list out is sorted in the same order that the numbers appear in x (duplicates are preserved).
This operation also returns a list idx that represents the position of each out element in x.
Args:
x: 1-D variable of type dtype int. Values to keep.
y: 1-D variable of type dtype int. Values to remove.
Returns:
out: A 1-D variable of dtype int. Values present in x but not in y.
MNN.expr.space_to_depth(input, block_size)
Rearranges blocks of spatial data, into depth.
More specifically, it outputs a copy of the input variable where values from the height and width dimensions are moved to the depth dimension.
The block_size indicates the input block size.
Non-overlapping blocks of size block_size x block_size are rearranged into depth at each location.
The depth of the output variable is block_size * block_size * input_depth.
The Y, X coordinates within each block of the input become the high order component of the output channel index.
The input variable's height and width must be divisible by block_size.
Args:
input: A variable.
block_size: An int that is >= 2. The size of the spatial block.
Returns:
A variable. Has the same type as input.
MNN.expr.space_to_batch_nd(input, block_shape, paddings)
This operation divides "spatial" dimensions [1, ..., M] of the input into a grid of blocks of shape block_shape,
and interleaves these blocks with the "batch" dimension
such that in the output, the spatial dimensions [1, ..., M] correspond to the position within the grid,
and the batch dimension combines both the position within a spatial block and the original batch position.
Prior to division into blocks, the spatial dimensions of the input are optionally zero padded according to paddings.
See below for a precise description.
Args:
input: A variable. must be 4-D with NC4HW4 format. N-D with shape input_shape = [batch] + spatial_shape + remaining_shape, where spatial_shape has M dimensions.
block_shape: A variable. Must be one of the following types: int32, int64. 1-D with shape [M], all values must be >= 1.
paddings: A variable. Must be one of the following types: int32, int64. 2-D with shape [M, 2], all values must be >= 0. paddings[i] = [pad_start, pad_end] specifies the padding for input dimension i + 1, which corresponds to spatial dimension i. It is required that block_shape[i] divides input_shape[i + 1] + pad_start + pad_end.
Returns:
A variable. Has the same type as input.
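Example (a minimal usage sketch following the NC4HW4 requirement above; block_shape and paddings are assumed to be int variables built with const):
import MNN
F = MNN.expr
x = F.placeholder([1, 1, 4, 4])
x.write([float(i) for i in range(16)])
x = F.convert(x, F.NC4HW4)
block_shape = F.const([2, 2], [2], F.data_format.NCHW, F.dtype.int)
paddings = F.const([0, 0, 0, 0], [2, 2], F.data_format.NCHW, F.dtype.int)
y = F.space_to_batch_nd(x, block_shape, paddings)  # output shape: [4, 1, 2, 2]
y = F.convert(y, F.NCHW)
y.read()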
MNN.expr.zeros_like(input)
Creates a variable with all elements set to zero.
Args:
input: A variable.
Returns:
A variable with all elements set to zero.
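Example (a minimal usage sketch):
import MNN
F = MNN.expr
x = F.const([1., 2., 3., 4.], [2, 2], F.data_format.NCHW)
z = F.zeros_like(x)  # [[0., 0.], [0., 0.]]
z.read()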
MNN.expr.unstack(value, axis=0)
Unpacks the given dimension of a rank-R variable into rank-(R-1) variables.
For example, given a variable of shape (A, B, C, D);
If axis == 0 then the i'th variable in output is the slice value[i, :, :, :] and each variable in output will have shape (B, C, D).
(Note that the dimension unpacked along is gone, unlike split).
If axis == 1 then the i'th variable in output is the slice value[:, i, :, :] and each variable in output will have shape (A, C, D).
Args:
value: A rank R > 0 variable to be unstacked.
axis: An int. The axis to unstack along. Defaults to the first dimension. Negative values wrap around, so the valid range is [-R, R).
Returns:
The list of variable objects unstacked from value.
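Example (a minimal usage sketch):
import MNN
F = MNN.expr
x = F.const([1., 2., 3., 4., 5., 6.], [2, 3], F.data_format.NCHW)
ys = F.unstack(x, 0)  # a list of two variables, each of shape [3]
ys[0].read()          # [1., 2., 3.]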
MNN.expr.rank(input)
Returns the rank of a variable.
Note: The rank of a variable is not the same as the rank of a matrix.
It's the number of indices required to uniquely select each element of the variable.
It's also known as "order", "degree", or "ndims."
Args:
input: A variable.
Returns:
A 0-D variable of dtype int.
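Example (a minimal usage sketch):
import MNN
F = MNN.expr
x = F.const([1., 2., 3., 4., 5., 6.], [2, 3], F.data_format.NCHW)
r = F.rank(x)  # a 0-D int variable holding 2
r.read()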
MNN.expr.range(start, limit, delta)
Creates a sequence of numbers.
Args:
start: A 0-D variable (scalar). The first entry of the sequence.
limit: A 0-D variable (scalar). The exclusive upper limit of the sequence.
delta: A 0-D variable (scalar). The step between consecutive entries.
Returns:
A variable that contains the sequence of numbers.
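Example (a minimal usage sketch; the three arguments are scalar variables built with const):
import MNN
F = MNN.expr
start = F.const([0.], [], F.data_format.NCHW)
limit = F.const([10.], [], F.data_format.NCHW)
delta = F.const([2.], [], F.data_format.NCHW)
seq = F.range(start, limit, delta)  # [0., 2., 4., 6., 8.]
seq.read()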
MNN.expr.depth_to_space(input, block_size)
Rearranges data from depth into blocks of spatial data.
It is the reverse transformation of space_to_depth. More specifically,
it outputs a copy of the input variable where values from the depth dimension are moved in spatial blocks to the height and width dimensions.
Args:
input: A variable.
block_size: An int that is >= 2. The size of the spatial block, same as in space_to_depth.
Returns:
A variable. Has the same type as input.
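Example (a minimal usage sketch, the inverse of the space_to_depth example above):
import MNN
F = MNN.expr
x = F.placeholder([1, 4, 2, 2])
x.write([float(i) for i in range(16)])
y = F.depth_to_space(x, 2)  # output shape: [1, 1, 4, 4]
y.read()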
MNN.data
The data module; you can implement your own dataset and data loader based on it.
MNN.data.Dataset
Base class of a dataset. You need to subclass this class and override its __getitem__ and __len__ methods in order to create your own dataset.
Examples:
try:
    import mnist
except ImportError:
    print("please 'pip install mnist' before running this demo")
import numpy as np
import MNN
F = MNN.expr

class MnistDataset(MNN.data.Dataset):
    def __init__(self, training_dataset=True):
        super(MnistDataset, self).__init__()
        self.is_training_dataset = training_dataset
        if self.is_training_dataset:
            self.data = mnist.train_images() / 255.0
            self.labels = mnist.train_labels()
        else:
            self.data = mnist.test_images() / 255.0
            self.labels = mnist.test_labels()

    def __getitem__(self, index):
        dv = F.const(self.data[index].flatten().tolist(), [1, 28, 28], F.data_format.NCHW)
        dl = F.const([self.labels[index]], [], F.data_format.NCHW, F.dtype.uint8)
        # first the inputs; a model may have several inputs, so it's a list
        # second the targets; likewise, there may be more than one target
        return [dv], [dl]

    def __len__(self):
        # size of the dataset
        if self.is_training_dataset:
            return 60000
        else:
            return 10000
- Note that the __getitem__ method should return two lists of MNN.expr.Var type, one for inputs and one for targets.
MNN.data.DataLoader
Class for constructing your own data loader, with support for batching and random sampling.
DataLoader(train_dataset, batch_size, shuffle = True, num_workers = 0)
Args:
train_dataset: Dataset instance. The dataset the data loader will sample from.
batch_size: int. Batch size of the data loader.
shuffle: bool. Whether or not to shuffle the dataset.
num_workers: int. Number of threads used in the data loader.
Methods:
reset()
the data loader needs to be reset every time it is exhausted
next() ——> List[MNN.expr.Var], List[MNN.expr.Var]
get one batch of data from the dataset: one list for inputs and another list for targets
Properties:
iter_number:
the total number of next() calls after which the data loader is exhausted; note that when the remaining data is not enough for a full batch, it will still be loaded
size:
the size of the dataset
Examples:
# construct training and test data loaders using the MnistDataset from the example above
train_dataset = MnistDataset(True)
test_dataset = MnistDataset(False)
train_dataloader = MNN.data.DataLoader(train_dataset, batch_size = 64, shuffle = True)
test_dataloader = MNN.data.DataLoader(test_dataset, batch_size = 100, shuffle = False)
...
# use in training
def train_func(net, train_dataloader, opt):
    """train function"""
    net.train(True)
    # the data loader needs to be reset when it is exhausted
    train_dataloader.reset()
    t0 = time.time()
    for i in range(train_dataloader.iter_number):
        example = train_dataloader.next()
        input_data = example[0]
        output_target = example[1]
        data = input_data[0]      # the first input; a model may have more than one input
        label = output_target[0]  # likewise, a model may have more than one target
        predict = net.forward(data)
        target = F.one_hot(F.cast(label, F.int), 10, 1, 0)
        loss = nn.loss.cross_entropy(predict, target)
        opt.step(loss)
        if i % 100 == 0:
            print("train loss: ", loss.read())
MNN.nn
a module which contains the neural network functions
MNN.nn.Module
Base class of all neural network modules.
You should subclass this class when writing your own module.
Modules can also contain other modules; sub-modules (such as MNN.nn.conv) are registered, and their parameters are added to the whole module's parameters.
Examples:
import MNN
import MNN.nn as nn
import MNN.expr as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.conv(1, 20, [5, 5])
        self.conv2 = nn.conv(20, 50, [5, 5])
        self.fc1 = nn.linear(800, 500)
        self.fc2 = nn.linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool(x, [2, 2], [2, 2])
        x = F.relu(self.conv2(x))
        x = F.max_pool(x, [2, 2], [2, 2])
        # some ops like conv, pool and resize use the special data format `NC4HW4`,
        # so we need to convert the data format before feeding into reshape;
        # the data format of a variable is available via its `data_format` attribute
        x = F.convert(x, F.NCHW)
        x = F.reshape(x, [0, -1])
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        x = F.softmax(x, 1)
        return x

model = Net()
Methods:
forward(MNN.expr.Var) ——> MNN.expr.Var:
forward propagation of the module
caller:
same as forward (calling a module instance directly invokes forward)
load_parameters(list[MNN.expr.Var]):
set the module's parameters using the given parameter list
set_name(str):
set the name of the module
train(bool):
set the is_training flag of the module
Properties:
parameters ——> list[MNN.expr.Var]:
return the parameters of the module
name:
get the name of the module
is_training:
get the training state of the module
MNN.nn.conv
function that constructs a conv2d module instance
MNN.nn.conv(in_channel, out_channel, kernel_size, stride = [1, 1], padding = [0, 0], dilation = [1, 1], depthwise = False, bias = True, padding_mode = MNN.expr.Padding_Mode.VALID)
Args:
in_channel: int. input channels of conv2d
out_channel: int. output channels of conv2d
kernel_size: list[int]. [kernel_h, kernel_w] of conv2d
stride: list[int]. [stride_h, stride_w] of conv2d
padding: list[int]. [padding_h, padding_w] of conv2d
dilation: list[int]. [dilation_h, dilation_w] of conv2d
depthwise: bool. whether or not to use depthwise conv2d
bias: bool. whether or not to add bias to conv2d
padding_mode: one of Padding_Mode.VALID, Padding_Mode.CAFFE, Padding_Mode.SAME
Returns:
- a conv2d instance
Methods:
- same as MNN.nn.Module
Examples:
import numpy as np
import MNN
conv = MNN.nn.conv(3, 16, [3, 3])
random_input = np.random.random((1, 3, 64, 64))
input_var = MNN.expr.placeholder([1, 3, 64, 64])
input_var.write(random_input.flatten().tolist())
conv_output = conv(input_var)
conv_output.read()
MNN.nn.linear
function that constructs an inner-product module instance
MNN.nn.linear(input_length, output_length, bias = True)
Args:
input_length: int. input length of the inner-product op
output_length: int. output length of the inner-product op
bias: bool. whether or not to add bias to the inner-product op
Returns:
- an inner-product instance
Methods:
- same as MNN.nn.Module
Examples:
import numpy as np
import MNN
ip = MNN.nn.linear(32, 64)
random_input = np.random.random((1, 32))
input_var = MNN.expr.placeholder([1, 32])
input_var.write(random_input.flatten().tolist())
ip_output = ip(input_var)
ip_output.read()
MNN.nn.batch_norm
function that constructs a batch norm module instance
MNN.nn.batch_norm(channels, dims = 4, momentum = 0.99, epsilon = 1e-05)
Args:
channels: int. input channels
dims: int. number of input dimensions
momentum: float. momentum in batch norm
epsilon: float. epsilon in batch norm
Returns:
- a batch norm instance
Methods:
- same as MNN.nn.Module
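Examples (a minimal sketch in the style of the conv example above; depending on the MNN version, the input may first need to be converted to NC4HW4 with MNN.expr.convert):
import numpy as np
import MNN
bn = MNN.nn.batch_norm(3)
random_input = np.random.random((1, 3, 64, 64))
input_var = MNN.expr.placeholder([1, 3, 64, 64])
input_var.write(random_input.flatten().tolist())
bn_output = bn(input_var)
bn_output.read()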
MNN.nn.dropout
function that constructs a dropout module instance
MNN.nn.dropout(drop_ratio)
Args:
- drop_ratio: float. the drop ratio of the dropout module
Returns:
- a dropout instance
Methods:
- same as MNN.nn.Module
Examples:
import numpy as np
import MNN
dropout = MNN.nn.dropout(0.5)
random_input = np.random.random((1, 32))
input_var = MNN.expr.placeholder([1, 32])
input_var.write(random_input.flatten().tolist())
dropout_output = dropout(input_var)
dropout_output.read()
MNN.nn.FixModule
class that fixes an MNN.nn.Module so that the module will not be updated during training
FixModule(module)
Args:
- module: MNN.nn.Module. the module to fix so that it will not be updated during training
Methods:
forward(MNN.expr.Var) ——> MNN.expr.Var:
forward propagation of the module (sets the is_training flag to False)
caller:
same as forward
MNN.nn.load_module
function that restores part of an original graph as a module
load_module(inputs, outputs, for_training)
Args:
inputs: List[MNN.expr.Var]. the start of the graph segment
outputs: List[MNN.expr.Var]. the end of the graph segment
for_training: bool. whether this module is for training or not
Examples:
import MNN
nn = MNN.nn
F = MNN.expr

def load_feature_extractor(model_file):
    # get the variable dict
    var_dict = F.load_as_dict(model_file)
    input_var = var_dict['input']
    output_var = var_dict['MobilenetV2/Logits/AvgPool']
    feature_extractor = nn.load_module([input_var], [output_var], False)
    feature_extractor = nn.FixModule(feature_extractor)  # fix the feature extractor
    return feature_extractor
MNN.nn.load_module_from_file(file_name, input_names, output_names, dynamic=False, shape_mutable=False, rearrange=False, backend=F.Backend.CPU, memory_mode=F.MemoryMode.Normal, power_mode=F.PowerMode.Normal, precision_mode=F.PrecisionMode.Normal)
Load a module from a model file.
Args:
- file_name: string. path to the model file
- input_names: names of the input tensors, a Python list
- output_names: names of the output tensors, a Python list
- dynamic: bool. whether to load the module as a dynamic graph; default is False
- shape_mutable: bool. whether the shapes of the input tensors change frequently; if the input shapes rarely change, set shape_mutable to False for better performance
- rearrange
- backend: inference backend of the module, default F.Backend.CPU
- memory_mode: memory setting of the module, default F.MemoryMode.Normal
- power_mode: power setting of the module, default F.PowerMode.Normal
- precision_mode: precision setting of the module, default F.PrecisionMode.Normal
Returns:
- module
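Examples (a minimal sketch; the file name and tensor names are hypothetical and must match your model):
import MNN
nn = MNN.nn
F = MNN.expr
module = nn.load_module_from_file("mobilenet_v2.mnn", ["input"], ["output"],
                                  dynamic=False, shape_mutable=False)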
MNN.nn.compress
a module which contains Quantization-Aware-Training (QAT) methods
MNN.nn.compress.Feature_Scale_Method
enum, the method to compute the feature scale during QAT; options are Feature_Scale_Method.PER_TENSOR and Feature_Scale_Method.PER_CHANNEL
MNN.nn.compress.Scale_Update_Method
enum, the method to update the feature scales during QAT; options are Scale_Update_Method.MOVING_AVERAGE and Scale_Update_Method.MAXIMUM
MNN.nn.compress.train_quant
function that turns a float module into a QAT module
# args are self-explanatory
train_quant(module, quant_bits = 8, feature_scale_method = Feature_Scale_Method.PER_TENSOR, scale_update_method = Scale_Update_Method.MOVING_AVERAGE)
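A minimal usage sketch, applying QAT to the Net model from the MNN.nn.Module example above (whether the module is converted in place or a converted module is returned may depend on the MNN version):
import MNN
nn = MNN.nn
nn.compress.train_quant(model, quant_bits = 8,
                        feature_scale_method = nn.compress.Feature_Scale_Method.PER_TENSOR,
                        scale_update_method = nn.compress.Scale_Update_Method.MOVING_AVERAGE)
# continue training the model as usual; quantization parameters are learned during training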
MNN.nn.loss
MNN.nn.loss.cross_entropy
calculate the cross entropy loss between the two inputs
MNN.nn.loss.cross_entropy(predict, target)
Args:
predict: MNN.expr.Var. a 2-D float variable of shape (Batch, Classes)
target: MNN.expr.Var. a 2-D float variable of shape (Batch, Classes); you need to one-hot encode the labels before computing the loss
Returns:
- cross entropy loss: MNN.expr.Var
Examples:
import numpy as np
import MNN
predict = MNN.expr.placeholder([2, 3])
temp = np.random.random([2,3])
predict.write(temp.flatten().tolist())
onehot = MNN.expr.placeholder([2, 3])
temp = [1, 0, 0, 0, 1, 0]
onehot.write(temp)
cross_entropy = MNN.nn.loss.cross_entropy
loss = cross_entropy(predict, onehot)
loss.read()
MNN.nn.loss.kl
calculate the KL-Divergence loss between the two inputs.
usage is similar to cross_entropy
MNN.nn.loss.mse
calculate the mean square error between the two inputs.
usage is similar to cross_entropy
MNN.nn.loss.mae
calculate the mean absolute error between the two inputs.
usage is similar to cross_entropy
MNN.nn.loss.hinge
calculate the hinge loss between the two inputs.
usage is similar to cross_entropy
MNN.optim
a module that handles optimization logic
MNN.optim.SGD
function that constructs an SGD optimizer instance
MNN.optim.SGD(module, lr, momentum = 0.9, weight_decay = 0.0, regularization_method = RegularizationMethod.L2)
Args:
- module: MNN.nn.Module. the module to optimize
- lr: float. the learning rate of the SGD optimizer
- momentum: float. the momentum in the SGD algorithm
- weight_decay: float. the weight decay in the SGD algorithm
- regularization_method: MNN.optim.RegularizationMethod. the regularization method; options are RegularizationMethod.L1, RegularizationMethod.L2, RegularizationMethod.L1L2
Returns:
- an SGD optimizer instance
Methods:
- step(MNN.expr.Var): back-propagate to compute the gradients of the parameters, then update the parameters using their corresponding gradients
Properties:
learning_rate:
get and set the learning rate of the optimizer
momentum:
get and set the momentum of the optimizer
weight_decay:
get and set the weight decay factor of the optimizer
regularization_method:
get and set the regularization method of the optimizer
Examples:
import MNN
import MNN.nn as nn
import MNN.expr as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.conv(1, 20, [5, 5])
        self.conv2 = nn.conv(20, 50, [5, 5])
        self.fc1 = nn.linear(800, 500)
        self.fc2 = nn.linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool(x, [2, 2], [2, 2])
        x = F.relu(self.conv2(x))
        x = F.max_pool(x, [2, 2], [2, 2])
        # some ops like conv, pool and resize use the special data format `NC4HW4`,
        # so we need to convert the data format before feeding into reshape;
        # the data format of a variable is available via its `data_format` attribute
        x = F.convert(x, F.NCHW)
        x = F.reshape(x, [0, -1])
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        x = F.softmax(x, 1)
        return x

model = Net()
sgd = MNN.optim.SGD(model, 0.001, 0.9, 0.0005, MNN.optim.Regularization_Method.L2)
# feed some data to the model, then get the loss
loss = ...
sgd.step(loss)  # backward and update the parameters in the model
MNN.optim.ADAM
function that constructs an ADAM optimizer instance
MNN.optim.ADAM(module, lr, momentum = 0.9, momentum2 = 0.999, weight_decay = 0.0, eps = 1e-8, regularization_method = RegularizationMethod.L2)
Args:
- module: MNN.nn.Module. the module to optimize
- lr: float. the learning rate of the optimizer
- momentum: float. the first momentum in the algorithm
- momentum2: float. the second momentum in the algorithm
- weight_decay: float. the weight decay in the algorithm
- eps: float. the eps factor in the algorithm
- regularization_method: MNN.optim.RegularizationMethod. the regularization method; options are RegularizationMethod.L1, RegularizationMethod.L2, RegularizationMethod.L1L2
Returns:
- an ADAM optimizer instance
Methods:
- same as MNN.optim.SGD
Properties:
learning_rate:
get and set the learning rate of the optimizer
momentum:
get and set the momentum of the optimizer
momentum2:
get and set the second momentum of the optimizer
weight_decay:
get and set the weight decay factor of the optimizer
eps:
get and set the eps factor of the optimizer
regularization_method:
get and set the regularization method of the optimizer
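Examples (a minimal sketch mirroring the SGD example above; model is the Net instance from that example):
import MNN
adam = MNN.optim.ADAM(model, 0.001)
# feed some data to the model, then get the loss
loss = ...
adam.step(loss)  # backward and update the parameters in the model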