In MNN, the Interpreter provides three interfaces for running a Session. For the vast majority of use cases, the simple run is sufficient.
Simple Run
```cpp
/**
 * @brief run session.
 * @param session given session.
 * @return result of running.
 */
ErrorCode runSession(Session* session) const;
```
Simply pass in the previously created Session.
Note that the time spent in this function does not always equal the inference time. On the CPU backend, the function time is the inference time; on other backends, the function may not wait synchronously for inference to finish. For example, on GPU, the function time is only the time taken to submit the GPU commands.
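The simple run can be sketched as follows. This is a usage sketch, not a definitive implementation: it assumes a model file named `model.mnn` (a placeholder path) and links against libMNN; `createFromFile`, `createSession`, and `runSession` are the standard `MNN::Interpreter` entry points.

```cpp
// Sketch: creating a Session and timing runSession.
// Assumes a placeholder model file "model.mnn" and linking against libMNN.
#include <MNN/Interpreter.hpp>
#include <chrono>
#include <cstdio>
#include <memory>

int main() {
    std::shared_ptr<MNN::Interpreter> net(
        MNN::Interpreter::createFromFile("model.mnn"));
    MNN::ScheduleConfig config;                  // default config runs on CPU
    MNN::Session* session = net->createSession(config);

    auto begin = std::chrono::steady_clock::now();
    MNN::ErrorCode code = net->runSession(session);
    auto end = std::chrono::steady_clock::now();

    // On the CPU backend this duration is the inference time; on GPU it may
    // only cover command submission, as noted above.
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
    printf("runSession returned %d, took %lld ms\n", (int)code, (long long)ms);
    return 0;
}
```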
Run with Callbacks
```cpp
typedef std::function<bool(const std::vector<Tensor*>&,
                           const std::string& /*opName*/)> TensorCallBack;

/**
 * @brief run session.
 * @param session given session.
 * @param before callback before each op. return true to run the op; return false to skip the op.
 * @param end callback after each op. return true to continue running; return false to interrupt the session.
 * @param sync synchronously wait for finish of execution or not.
 * @return result of running.
 */
ErrorCode runSessionWithCallBack(const Session* session,
                                 const TensorCallBack& before,
                                 const TensorCallBack& end,
                                 bool sync = false) const;
```
Compared with the simple run, the callback run additionally provides:
- a callback before each op executes, which can be used to skip that op;
- a callback after each op executes, which can be used to interrupt the whole inference;
- a synchronous-wait option, off by default; when enabled, all backends wait for inference to complete, so the function time equals the inference time.
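The three points above can be sketched with a pair of callbacks. This is a hedged example, not the library's canonical usage: it assumes an `Interpreter` named `net` and a `Session` named `session` have already been created as described earlier, and the op name `"prob"` is purely hypothetical; substitute a real op name from your own model.

```cpp
// Sketch: before/after callbacks with runSessionWithCallBack.
// Assumes `net` (MNN::Interpreter) and `session` were created beforehand.
MNN::TensorCallBack before = [](const std::vector<MNN::Tensor*>& inputs,
                                const std::string& opName) {
    // Returning false here would skip executing this op.
    return true;
};
MNN::TensorCallBack after = [](const std::vector<MNN::Tensor*>& outputs,
                               const std::string& opName) {
    // Returning false here interrupts the rest of the inference.
    // "prob" is a hypothetical op name: stop once it has executed.
    return opName != "prob";
};
// sync = true: wait until execution finishes on every backend.
net->runSessionWithCallBack(session, before, after, true);
```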
Computation (FLOPs) Estimation
```cpp
class MNN_PUBLIC OperatorInfo {
    struct Info;

public:
    /** Operator's name */
    const std::string& name() const;
    /** Operator's type */
    const std::string& type() const;
    /** Operator's flops, in M */
    float flops() const;

protected:
    OperatorInfo();
    ~OperatorInfo();
    Info* mContent;
};
```
```cpp
typedef std::function<bool(const std::vector<Tensor*>&, const OperatorInfo*)> TensorCallBackWithInfo;

/**
 * @brief run session.
 * @param session given session.
 * @param before callback before each op. return true to run the op; return false to skip the op.
 * @param end callback after each op. return true to continue running; return false to interrupt the session.
 * @param sync synchronously wait for finish of execution or not.
 * @return result of running.
 */
ErrorCode runSessionWithCallBackInfo(const Session* session,
                                     const TensorCallBackWithInfo& before,
                                     const TensorCallBackWithInfo& end,
                                     bool sync = false) const;
```
Generally, this interface is only needed when estimating computation cost. Compared with the callback run, the callbacks additionally receive the op's type and FLOPs information.
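As a sketch of such an estimation, the after-callback can accumulate the per-op FLOPs reported by `OperatorInfo::flops()`. Again this assumes `net` and `session` were created as described earlier; the variable name `totalFlopsM` is our own.

```cpp
// Sketch: summing the model's total FLOPs with runSessionWithCallBackInfo.
// Assumes `net` (MNN::Interpreter) and `session` were created beforehand.
float totalFlopsM = 0.0f;
MNN::TensorCallBackWithInfo before = [](const std::vector<MNN::Tensor*>& inputs,
                                        const MNN::OperatorInfo* info) {
    return true;   // run every op
};
MNN::TensorCallBackWithInfo after = [&totalFlopsM](const std::vector<MNN::Tensor*>& outputs,
                                                   const MNN::OperatorInfo* info) {
    totalFlopsM += info->flops();   // flops() is reported in M (MFLOPs)
    return true;                    // continue the inference
};
// sync = true so the totals are complete when the call returns.
net->runSessionWithCallBackInfo(session, before, after, true);
printf("total: %.2f MFLOPs\n", totalFlopsM);
```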