Get Input Tensor
/**
* @brief get input tensor for given name.
* @param session given session.
* @param name given name. if NULL, return first input.
* @return tensor if found, NULL otherwise.
*/
Tensor* getSessionInput(const Session* session, const char* name);
/**
* @brief get all input tensors.
* @param session given session.
* @return all input tensors mapped by name.
*/
const std::map<std::string, Tensor*>& getSessionInputAll(const Session* session) const;
Interpreter provides two methods for getting input tensors: getSessionInput returns a single input tensor, while getSessionInputAll returns the map of all input tensors keyed by name. When there is only one input tensor, you can pass NULL as the name to getSessionInput to get it.
Fill Data
auto inputTensor = interpreter->getSessionInput(session, NULL);
inputTensor->host<float>()[0] = 1.f;
The simplest way to fill tensor data is to assign values through host directly. However, this only works for the CPU backend; other backends must write data via deviceId. In addition, the caller has to handle the difference between the NC4HW4 and NHWC data layouts.
For non-CPU backends, or if you are not familiar with the data layouts, use the data-copy interfaces instead.
Copy Data
NCHW example:
auto inputTensor = interpreter->getSessionInput(session, NULL);
auto nchwTensor = new Tensor(inputTensor, Tensor::CAFFE);
// nchwTensor->host<float>()[x] = ...
inputTensor->copyFromHostTensor(nchwTensor);
delete nchwTensor;
NC4HW4 example:
auto inputTensor = interpreter->getSessionInput(session, NULL);
auto nc4hw4Tensor = new Tensor(inputTensor, Tensor::CAFFE_C4);
// nc4hw4Tensor->host<float>()[x] = ...
inputTensor->copyFromHostTensor(nc4hw4Tensor);
delete nc4hw4Tensor;
NHWC example:
auto inputTensor = interpreter->getSessionInput(session, NULL);
auto nhwcTensor = new Tensor(inputTensor, Tensor::TENSORFLOW);
// nhwcTensor->host<float>()[x] = ...
inputTensor->copyFromHostTensor(nhwcTensor);
delete nhwcTensor;
When copying data this way, the only thing you need to pay attention to is the data layout of the tensor created with new. copyFromHostTensor takes care of the layout conversion (if needed) and the data copy between backends (if needed).
Image Processing
MNN provides a CV module to simplify image processing for users and to avoid introducing image processing libraries such as OpenCV and libyuv.
Currently, the CV module only supports the CPU backend.
Image Process Config
struct Config
{
Filter filterType = NEAREST;
ImageFormat sourceFormat = RGBA;
ImageFormat destFormat = RGBA;
//Only valid if the dest type is float
float mean[4] = {0.0f,0.0f,0.0f, 0.0f};
float normal[4] = {1.0f, 1.0f, 1.0f, 1.0f};
};
In CV::ImageProcess::Config:
- Specify the input and output formats with sourceFormat and destFormat; RGBA, RGB, BGR, GRAY, BGRA and YUV_NV21 are currently supported.
- Specify the interpolation type with filterType; NEAREST, BILINEAR and BICUBIC are currently supported.
- Specify mean normalization with mean and normal; these settings are ignored when the destination data type is not a floating point type.
Image Transform Matrix
CV::Matrix is ported from Skia, as used on Android. For usage, refer to Skia's Matrix documentation: https://skia.org/user/api/SkMatrix_Reference.
Note that the Matrix set on ImageProcess is the transformation matrix from the target image to the source image. In practice, you can compose the transformation from the source image to the target image and then invert it. For example:
// source image:1280x720
// target image:Rotate 90 degrees counterclockwise
// and then reduce it to 1/10 of the original,
// which becomes 72x128
Matrix matrix;
// reset to identity matrix
matrix.setIdentity();
// zoom out and change to the [0,1] interval:
matrix.postScale(1.0f / 1280, 1.0f / 720);
// rotate 90 degrees from the center point [0.5, 0.5]
matrix.postRotate(90, 0.5f, 0.5f);
// zoom back to 72x128
matrix.postScale(72.0f, 128.0f);
// invert to get the target image -> source image transformation matrix
matrix.invert(&matrix);
Image Process Instance
MNN uses CV::ImageProcess for image processing. ImageProcess maintains a series of internal caches; to avoid repeated memory allocation and release, it is recommended to create an instance once and cache it. Use the convert method of ImageProcess to fill the tensor data.
/*
* source: source image
* iw: source image width
* ih: source image height
* stride: the number of bytes per row in the source image after alignment (if no alignment is required, pass 0, which is equivalent to iw*bpp)
* dest: target tensor, could be uint8 or float type
*/
ErrorCode convert(const uint8_t* source, int iw, int ih, int stride, Tensor* dest);
Complete Example
auto input = net->getSessionInput(session, NULL);
auto output = net->getSessionOutput(session, NULL);
auto dims = input->shape();
int bpp = dims[1];
int size_h = dims[2];
int size_w = dims[3];
auto inputPatch = argv[2];
FREE_IMAGE_FORMAT f = FreeImage_GetFileType(inputPatch);
FIBITMAP* bitmap = FreeImage_Load(f, inputPatch);
auto newBitmap = FreeImage_ConvertTo32Bits(bitmap);
auto width = FreeImage_GetWidth(newBitmap);
auto height = FreeImage_GetHeight(newBitmap);
FreeImage_Unload(bitmap);
Matrix trans;
//Dst -> [0, 1]
trans.postScale(1.0/size_w, 1.0/size_h);
//Flip Y (the image decoded by FreeImage is stored with the Y direction reversed)
trans.postScale(1.0,-1.0, 0.0, 0.5);
//[0, 1] -> Src
trans.postScale(width, height);
ImageProcess::Config config;
config.filterType = NEAREST;
float mean[3] = {103.94f, 116.78f, 123.68f};
float normals[3] = {0.017f,0.017f,0.017f};
::memcpy(config.mean, mean, sizeof(mean));
::memcpy(config.normal, normals, sizeof(normals));
config.sourceFormat = RGBA;
config.destFormat = BGR;
std::shared_ptr<ImageProcess> pretreat(ImageProcess::create(config));
pretreat->setMatrix(trans);
pretreat->convert((uint8_t*)FreeImage_GetScanLine(newBitmap, 0), width, height, 0, input);
net->runSession(session);
Variable Dimension
/**
* @brief resize given tensor.
* @param tensor given tensor.
* @param dims new dims. at most 6 dims.
*/
void resizeTensor(Tensor* tensor, const std::vector<int>& dims);
/**
* @brief resize given tensor by nchw.
* @param batch / N.
* @param channel / C.
* @param height / H.
* @param width / W
*/
void resizeTensor(Tensor* tensor, int batch, int channel, int height, int width);
/**
* @brief call this function to get tensors ready. output tensor buffer (host or deviceId) should be retrieved
* after resize of any input tensor.
* @param session given session.
*/
void resizeSession(Session* session);
When input tensor dimensions are unknown or need to change, call resizeTensor to update the dimension information. This typically happens when the input dimensions were not set at load time, or when they are variable. After updating the dimension information of all such tensors, call resizeSession to perform pre-inference, which allocates or reuses memory. An example is as follows:
auto inputTensor = interpreter->getSessionInput(session, NULL);
interpreter->resizeTensor(inputTensor, {newBatch, newChannel, newHeight, newWidth});
interpreter->resizeSession(session);
inputTensor->copyFromHostTensor(imageTensor);
interpreter->runSession(session);