Get Input Tensor
/**
* @brief get input tensor for given name.
* @param session given session.
* @param name given name. if NULL, return first input.
* @return tensor if found, NULL otherwise.
*/
Tensor* getSessionInput(const Session* session, const char* name);
/**
* @brief get all input tensors.
* @param session given session.
* @return all input tensors mapped by name.
*/
const std::map<std::string, Tensor*>& getSessionInputAll(const Session* session) const;
Interpreter provides two methods for getting input tensors: getSessionInput returns a single input tensor, while getSessionInputAll returns the map of all input tensors keyed by name. When there is only one input tensor, you can pass NULL as the name to getSessionInput to get it.
Fill Data
auto inputTensor = interpreter->getSessionInput(session, NULL);
inputTensor->host<float>()[0] = 1.f;
The simplest way to fill tensor data is to assign values through host directly. However, this only works for the CPU backend; other backends must write data via deviceId. In addition, the caller has to handle the difference between the NC4HW4 and NHWC data layouts.
For non-CPU backends, or if you are not familiar with the data layouts, use the data-copy interfaces instead.
Copy Data
NCHW example:
auto inputTensor = interpreter->getSessionInput(session, NULL);
auto nchwTensor = new Tensor(inputTensor, Tensor::CAFFE);
// nchwTensor->host<float>()[x] = ...
inputTensor->copyFromHostTensor(nchwTensor);
delete nchwTensor;
NC4HW4 example:
auto inputTensor = interpreter->getSessionInput(session, NULL);
auto nc4hw4Tensor = new Tensor(inputTensor, Tensor::CAFFE_C4);
// nc4hw4Tensor->host<float>()[x] = ...
inputTensor->copyFromHostTensor(nc4hw4Tensor);
delete nc4hw4Tensor;
NHWC example:
auto inputTensor = interpreter->getSessionInput(session, NULL);
auto nhwcTensor = new Tensor(inputTensor, Tensor::TENSORFLOW);
// nhwcTensor->host<float>()[x] = ...
inputTensor->copyFromHostTensor(nhwcTensor);
delete nhwcTensor;
When copying data this way, the only thing you need to pay attention to is the data layout of the tensor created with new. copyFromHostTensor takes care of the layout conversion (if needed) and the data copy between backends (if needed).
Image Processing
MNN provides a CV module to simplify image processing for users and to avoid introducing image processing libraries such as OpenCV and libyuv.
Currently, the CV module only supports the CPU backend.
Image Process Config
struct Config
{
Filter filterType = NEAREST;
ImageFormat sourceFormat = RGBA;
ImageFormat destFormat = RGBA;
//Only valid if the dest type is float
float mean[4] = {0.0f,0.0f,0.0f, 0.0f};
float normal[4] = {1.0f, 1.0f, 1.0f, 1.0f};
};
In CV::ImageProcess::Config:
- Specify the input and output formats with sourceFormat and destFormat; RGBA, RGB, BGR, GRAY, BGRA and YUV_NV21 are currently supported.
- Specify the interpolation type with filterType; NEAREST, BILINEAR and BICUBIC are currently supported.
- Specify mean normalization with mean and normal; these settings are ignored when the destination data type is not a floating point type.
Image Transform Matrix
CV::Matrix is ported from Skia, as used on Android. For usage, refer to Skia's Matrix documentation: https://skia.org/user/api/SkMatrix_Reference.
Note that the Matrix set on ImageProcess is the transformation matrix from the target image to the source image. In practice, you can compose the transformation from the source image to the target image and then invert it. For example:
// source image:1280x720
// target image:Rotate 90 degrees counterclockwise
// and then reduce it to 1/10 of the original,
// which becomes 72x128
Matrix matrix;
// reset to identity matrix
matrix.setIdentity();
// zoom out and change to the [0,1] interval:
matrix.postScale(1.0f / 1280, 1.0f / 720);
// rotate 90 degrees from the center point [0.5, 0.5]
matrix.postRotate(90, 0.5f, 0.5f);
// zoom back to 72x128
matrix.postScale(72.0f, 128.0f);
// invert to get the target image -> source image transformation matrix
matrix.invert(&matrix);
Image Process Instance
MNN uses CV::ImageProcess for image processing. ImageProcess maintains a series of internal caches; to avoid repeated memory allocation and release, it is recommended to create an instance once and cache it. Use the convert method of ImageProcess to fill the tensor data.
/*
* source: source image
* iw: source image width
* ih: source image height
* stride: the number of bytes per row in the source image after alignment (if no alignment is required, pass 0, which is equivalent to iw*bpp)
* dest: target tensor, could be uint8 or float type
*/
ErrorCode convert(const uint8_t* source, int iw, int ih, int stride, Tensor* dest);
Complete Example
auto input = net->getSessionInput(session, NULL);
auto output = net->getSessionOutput(session, NULL);
auto dims = input->shape();
int bpp = dims[1];
int size_h = dims[2];
int size_w = dims[3];
auto inputPatch = argv[2];
FREE_IMAGE_FORMAT f = FreeImage_GetFileType(inputPatch);
FIBITMAP* bitmap = FreeImage_Load(f, inputPatch);
auto newBitmap = FreeImage_ConvertTo32Bits(bitmap);
auto width = FreeImage_GetWidth(newBitmap);
auto height = FreeImage_GetHeight(newBitmap);
FreeImage_Unload(bitmap);
Matrix trans;
//Dst -> [0, 1]
trans.postScale(1.0/size_w, 1.0/size_h);
//Flip Y (the image decoded by FreeImage is stored with the Y direction reversed)
trans.postScale(1.0,-1.0, 0.0, 0.5);
//[0, 1] -> Src
trans.postScale(width, height);
ImageProcess::Config config;
config.filterType = NEAREST;
float mean[3] = {103.94f, 116.78f, 123.68f};
float normals[3] = {0.017f,0.017f,0.017f};
::memcpy(config.mean, mean, sizeof(mean));
::memcpy(config.normal, normals, sizeof(normals));
config.sourceFormat = RGBA;
config.destFormat = BGR;
std::shared_ptr<ImageProcess> pretreat(ImageProcess::create(config));
pretreat->setMatrix(trans);
pretreat->convert((uint8_t*)FreeImage_GetScanLine(newBitmap, 0), width, height, 0, input);
net->runSession(session);
Variable Dimension
/**
* @brief resize given tensor.
* @param tensor given tensor.
* @param dims new dims. at most 6 dims.
*/
void resizeTensor(Tensor* tensor, const std::vector<int>& dims);
/**
* @brief resize given tensor by nchw.
* @param batch / N.
* @param channel / C.
* @param height / H.
* @param width / W
*/
void resizeTensor(Tensor* tensor, int batch, int channel, int height, int width);
/**
* @brief call this function to get tensors ready. output tensor buffer (host or deviceId) should be retrieved
* after resize of any input tensor.
* @param session given session.
*/
void resizeSession(Session* session);
When input tensor dimensions are unknown or need to change, call resizeTensor to update the dimension information. This typically happens when the input dimensions were not set at load time, or when they are variable. After updating the dimension information of all such tensors, call resizeSession to perform pre-inference, which allocates or reuses memory. An example is as follows:
auto inputTensor = interpreter->getSessionInput(session, NULL);
interpreter->resizeTensor(inputTensor, {newBatch, newChannel, newHeight, newWidth});
interpreter->resizeSession(session);
inputTensor->copyFromHostTensor(imageTensor);
interpreter->runSession(session);