Query Version
./MNNConvert --version
Model Convert
Parameters
Usage:
  MNNConvert [OPTION...]

  -h, --help                     Convert Other Model Format To MNN Model
  -v, --version                  show current version
  -f, --framework arg            model type, ex: [TF,CAFFE,ONNX,TFLITE,MNN]
      --modelFile arg            tensorflow Pb or caffeModel, ex: *.pb,*caffemodel
      --prototxt arg             only used for caffe, ex: *.prototxt
      --MNNModel arg             MNN model, ex: *.mnn
      --fp16                     save Conv's weight/bias in half_float data type
      --benchmarkModel           Do NOT save big size data, such as Conv's weight, BN's gamma, beta, mean and variance etc. Only used to test the cost of the model
      --bizCode arg              MNN Model Flag, ex: MNN
      --debug                    Enable debugging mode.
      --forTraining              whether or not to save training ops BN and Dropout, default: false
      --weightQuantBits arg      quantize conv/matmul/LSTM float weights to int type, only optimize for model size, optional 2-8 bits, default: 0, which means no weight quant
      --compressionParamsFile arg
                                 The path of the compression parameters that stores activation, weight scales and zero points for quantization or information for sparsity.
      --saveStaticModel          save static model with fix shape, default: false
      --inputConfigFile arg      set input config file for static model, ex: ~/config.txt
Note 1: The benchmarkModel option strips large parameters from the model, such as convolution weights and the mean/variance of BN layers, to reduce its size; the removed parameters are initialized randomly at runtime. This is useful only for performance testing.
Note 2: The weightQuantBits option is used as "--weightQuantBits numBits", where numBits is 2-8. It quantizes conv/matmul/LSTM float weights to an integer type and only reduces model size. The weight-quantized model is decoded back to float32 when it is loaded, so inference speed is the same as the float32 model, but the file is about 4x smaller at 8 bits, and accuracy is almost the same as the original float32 version. Example commands for these options are shown below.
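For illustration, the commands below combine these options with a normal conversion. The file names (model.onnx, model_quant.mnn, model_static.mnn, config.txt) are placeholders; config.txt lists the input names and fixed shapes for the static model, and its exact format is described in the official documentation for your MNN version.

# quantize weights to 8 bits to shrink the model file (inference still runs in float32)
./MNNConvert -f ONNX --modelFile model.onnx --MNNModel model_quant.mnn --bizCode biz --weightQuantBits 8

# save a static model whose input shapes are fixed according to config.txt
./MNNConvert -f ONNX --modelFile model.onnx --MNNModel model_static.mnn --bizCode biz --saveStaticModel --inputConfigFile config.txt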
TensorFlow -> MNN
./MNNConvert -f TF --modelFile XXX.pb --MNNModel XXX.mnn --bizCode biz
TensorFlow Lite -> MNN
./MNNConvert -f TFLITE --modelFile XXX.tflite --MNNModel XXX.mnn --bizCode biz
Caffe -> MNN
./MNNConvert -f CAFFE --modelFile XXX.caffemodel --prototxt XXX.prototxt --MNNModel XXX.mnn --bizCode biz
ONNX -> MNN
./MNNConvert -f ONNX --modelFile XXX.onnx --MNNModel XXX.mnn --bizCode biz
PyTorch -> MNN
Convert PyTorch model to ONNX (https://pytorch.org/docs/stable/onnx.html)
import torch
import torchvision
dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
model = torchvision.models.alexnet(pretrained=True).cuda()
# Providing input and output names sets the display names for values
# within the model's graph. Setting these does not change the semantics
# of the graph; it is only for readability.
#
# The inputs to the network consist of the flat list of inputs (i.e.
# the values you would pass to the forward() method) followed by the
# flat list of parameters. You can partially specify names, i.e. provide
# a list here shorter than the number of inputs to the model, and we will
# only set that subset of names, starting from the beginning.
input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
output_names = [ "output1" ]
torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names)
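Optionally, you can sanity-check the exported file with the onnx package before converting. This step is not required by MNNConvert; it is just a quick structural validity check.

import onnx

# load the exported graph and run ONNX's structural checker
onnx_model = onnx.load("alexnet.onnx")
onnx.checker.check_model(onnx_model)
print("alexnet.onnx passed the ONNX checker")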
Convert ONNX to MNN
./MNNConvert -f ONNX --modelFile XXX.onnx --MNNModel XXX.mnn --bizCode biz
Python Tools for Model Conversion
We also provide model conversion tools in our Python toolchain. See: https://www.yuque.com/mnn/en/usage_in_python
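As a rough sketch, if you install MNN from pip, the package ships a command-line converter whose flags mirror MNNConvert; the entry-point name (mnnconvert) and supported options may vary by release, so confirm against the page linked above for your installed version.

pip install MNN
mnnconvert -f ONNX --modelFile alexnet.onnx --MNNModel alexnet.mnn --bizCode biz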