An obvious benefit of using MMDeploy is that you can deploy models without having to install the bulky mmcv.

Get Started

MMDeploy provides some useful tools that make it easy to deploy OpenMMLab models to various platforms. You can convert models with our pre-defined pipelines or build a custom conversion pipeline yourself. This guide will show you how to convert a model with MMDeploy and integrate MMDeploy’s SDK into your application!

Prerequisites

First we should install MMDeploy following build.md. Note that the build steps are slightly different among the supported backends. Here are some brief introductions to these backends:

  • ONNXRuntime: ONNX Runtime is a cross-platform inference and training machine-learning accelerator. It has the best support for the ONNX IR.
  • TensorRT: NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. It is a good choice if you want to deploy your model on NVIDIA devices.
  • ncnn: ncnn is a high-performance neural network inference computing framework optimized for mobile platforms. ncnn has been designed from the very beginning with deployment and use on mobile phones in mind.
  • PPLNN: PPLNN, which is short for “PPLNN is a Primitive Library for Neural Network”, is a high-performance deep-learning inference engine for efficient AI inferencing. It can run various ONNX models and has better support for OpenMMLab.
  • OpenVINO: OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. It allows seamless integration with Intel AI hardware, the latest neural network accelerator chips, the Intel AI stick, and embedded computers or edge devices.

Choose the backend which can meet your demand and install it following the link provided above.

Convert Model

Once you have installed MMDeploy, you can convert a PyTorch model from the OpenMMLab model zoo to a backend model with one magic spell! For example, if you want to convert Faster R-CNN in MMDetection to TensorRT:

```shell
# Assume you have installed MMDeploy in ${MMDEPLOY_DIR} and MMDetection in ${MMDET_DIR}
# If you do not know where to find the paths, just type `pip show mmdeploy` and `pip show mmdet` in your console.
python ${MMDEPLOY_DIR}/tools/deploy.py \
    ${MMDEPLOY_DIR}/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
    ${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    ${CHECKPOINT_DIR}/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    ${INPUT_IMG} \
    --work-dir ${WORK_DIR} \
    --device cuda:0 \
    --dump-info
```

${MMDEPLOY_DIR}/tools/deploy.py is a tool that does everything you need to convert a model. Read how_to_convert_model for more details. The converted model and other meta information can be found in ${WORK_DIR}. Together they make up the MMDeploy SDK Model, which can be fed to the MMDeploy SDK to do model inference.

detection_tensorrt_dynamic-320x320-1344x1344.py is a config file that contains all arguments you need to customize the conversion pipeline. The name is formed as

```
<task name>_<backend>-[backend options]_<dynamic support>.py
```

It is easy to find the deployment config you need by name. If you want to customize the conversion, you can edit the config file by yourself. Here is a tutorial about how to write config.
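As a concrete illustration of the naming scheme, here is a plain-Python sketch (not an MMDeploy API) that splits the two config names used in this guide into their parts. Reading “320x320-1344x1344” as the minimum and maximum input shapes is our assumption:

```python
# Illustrative helper only: split a deployment config filename into the
# fields of the naming scheme above. This is NOT part of MMDeploy.
def parse_deploy_config_name(filename: str) -> dict:
    stem = filename[:-3] if filename.endswith(".py") else filename
    task, backend, tail = stem.split("_", 2)
    # For dynamic configs the tail may carry extra options, e.g.
    # "dynamic-320x320-1344x1344" (assumed to be min/max input shapes).
    mode, _, options = tail.partition("-")
    return {"task": task, "backend": backend, "mode": mode, "options": options}

print(parse_deploy_config_name("detection_tensorrt_dynamic-320x320-1344x1344.py"))
print(parse_deploy_config_name("detection_onnxruntime_dynamic.py"))
```

With this reading, both files describe a dynamic-shape detection task; they differ only in the target backend and the backend options.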

Inference Model

Now you can do model inference with the APIs provided by the backend. But what if you want to test the model instantly? We have some backend wrappers for you.

```python
from mmdeploy.apis import inference_model

result = inference_model(model_cfg, deploy_cfg, backend_models, img=img, device=device)
```

inference_model will create a wrapper module and do the inference for you. The result has the same format as that of the original OpenMMLab repo.
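As a concrete illustration of that format: for detectors in MMDetection 2.x, a detection result is typically a list with one (N, 5) array per class, where each row is [x1, y1, x2, y2, score]. Below is a minimal sketch with synthetic data (not a real model output):

```python
import numpy as np

# Synthetic stand-in for a detection result in MMDetection's 2.x format:
# a list with one (N, 5) float array per class; rows are
# [x1, y1, x2, y2, score].
num_classes = 80
result = [np.zeros((0, 5), dtype=np.float32) for _ in range(num_classes)]

# Pretend the model found two objects of class 0 ("person" in COCO).
result[0] = np.array([[10., 20., 110., 220., 0.98],
                      [15., 25., 115., 225., 0.35]], dtype=np.float32)

# Keep detections above a score threshold, as downstream code usually does.
score_thr = 0.5
kept = [dets[dets[:, 4] >= score_thr] for dets in result]
print(len(kept), kept[0].shape)  # 80 (1, 5)
```

Other tasks (classification, segmentation, ...) follow their own repo's result conventions in the same way.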

Evaluate Model

You may wonder whether the backend model has the same precision as the original one, and how fast it can run. MMDeploy provides tools to test the model. Take the converted TensorRT Faster R-CNN as an example:

```shell
python ${MMDEPLOY_DIR}/tools/test.py \
    ${MMDEPLOY_DIR}/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
    ${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --model ${BACKEND_MODEL_FILES} \
    --metrics ${METRICS} \
    --device cuda:0
```

Read how to evaluate a model for more details about how to use tools/test.py.

Integrate MMDeploy SDK

Make sure MMDEPLOY_BUILD_SDK is turned on when you build and install the SDK, following build.md.
After that, the installation folder will have the following structure:

```
install
├── example
├── include
│   ├── c
│   └── cpp
└── lib
```

where include/c and include/cpp correspond to the C and C++ APIs, respectively.

Caution: The C++ API is highly volatile and not recommended at the moment.

In the example directory, there are several examples involving classification, object detection, image segmentation and so on.
You can refer to these examples to learn how to use MMDeploy SDK’s C API and how to link ${MMDeploy_LIBS} to your application.

A From-scratch Example

Here is an example of how to deploy the Faster R-CNN model from MMDetection and run inference on it, from scratch.

Create Virtual Environment and Install MMDetection

Please run the following commands in an Anaconda environment to install MMDetection.

```shell
conda create -n openmmlab python=3.7 -y
conda activate openmmlab
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch -y
# install the latest mmcv
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
# install mmdetection
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .
```

Download the Checkpoint of Faster R-CNN

Download the checkpoint from this link and put it in {MMDET_ROOT}/checkpoints, where {MMDET_ROOT} is the root directory of your MMDetection codebase.

Install MMDeploy and ONNX Runtime

Please run the following commands in an Anaconda environment to install MMDeploy.

```shell
conda activate openmmlab
git clone https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
git submodule update --init --recursive
pip install -e .
```

Once MMDeploy is installed, we should select an inference engine for model inference. Here we take ONNX Runtime as an example. Run the following command to install ONNX Runtime:

```shell
pip install onnxruntime==1.8.1
```

Then download the ONNX Runtime library to build the mmdeploy plugin for ONNX Runtime:

```shell
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
cd onnxruntime-linux-x64-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build
# build ONNX Runtime custom ops
cmake -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc)
# build MMDeploy SDK
cmake -DMMDEPLOY_BUILD_SDK=ON \
    -DCMAKE_CXX_COMPILER=g++-7 \
    -DOpenCV_DIR=/path/to/OpenCV/lib/cmake/OpenCV \
    -Dspdlog_DIR=/path/to/spdlog/lib/cmake/spdlog \
    -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} \
    -DMMDEPLOY_TARGET_BACKENDS=ort \
    -DMMDEPLOY_CODEBASES=mmdet ..
make -j$(nproc) && make install
```

Model Conversion

Once we have installed MMDetection, MMDeploy and ONNX Runtime, and built the plugin for ONNX Runtime, we can convert Faster R-CNN to a .onnx model file that ONNX Runtime can consume. Run the following commands to use our deploy tools:

```shell
# Assume you have installed MMDeploy in ${MMDEPLOY_DIR} and MMDetection in ${MMDET_DIR}
# If you do not know where to find the paths, just type `pip show mmdeploy` and `pip show mmdet` in your console.
python ${MMDEPLOY_DIR}/tools/deploy.py \
    ${MMDEPLOY_DIR}/configs/mmdet/detection/detection_onnxruntime_dynamic.py \
    ${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    ${MMDET_DIR}/checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    ${MMDET_DIR}/demo/demo.jpg \
    --work-dir work_dirs \
    --device cpu \
    --show \
    --dump-info
```

If the script runs successfully, two images will be displayed on the screen one by one. The first image is the inference result of ONNX Runtime and the second is the result of PyTorch. At the same time, an ONNX model file end2end.onnx and three JSON files (SDK config files) will be generated in the work directory work_dirs.
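For orientation, the work directory will then look roughly like the following. The three JSON file names shown here (deploy.json, pipeline.json, detail.json) are what recent MMDeploy versions produce with --dump-info; exact names may vary across versions:

```
work_dirs/
├── end2end.onnx    # the converted ONNX model
├── deploy.json     # ┐
├── pipeline.json   # ├ SDK config files
└── detail.json     # ┘
```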

Run MMDeploy SDK demo

After model conversion, the SDK model is saved in the directory ${work_dir}.
Here is a recipe for building and running the object detection demo.

```shell
cd build/install/example
# path to onnxruntime **libraries**
export LD_LIBRARY_PATH=/path/to/onnxruntime/lib
mkdir -p build && cd build
cmake -DOpenCV_DIR=path/to/OpenCV/lib/cmake/OpenCV \
    -DMMDeploy_DIR=${MMDEPLOY_DIR}/build/install/lib/cmake/MMDeploy ..
make object_detection
# suppress verbose logs
export SPDLOG_LEVEL=warn
# run the object detection example
./object_detection cpu ${work_dirs} ${path/to/an/image}
```

If the demo runs successfully, an image named “output_detection.png” should be generated, showing the detected objects.

Add New Model Support?

If the models you want to deploy are not yet supported in MMDeploy, you can try to add support for them yourself. Here are some documents that may help you:

Finally, we welcome your PR!