In MNN, Interpreter provides three interfaces for running a Session. In general, the simple run is sufficient for the vast majority of scenarios.
Simple Run
```cpp
/**
 * @brief run session.
 * @param session   given session.
 * @return result of running.
 */
ErrorCode runSession(Session* session) const;
```
Simply pass in a previously created Session.
Note that the time spent in this call does not always equal the inference time. On the CPU backend, the call time is the inference time; on other backends, the call may not wait synchronously for inference to finish. On GPU, for example, the call time is only the time needed to submit the GPU commands.
Run with Callbacks
```cpp
typedef std::function<bool(const std::vector<Tensor*>&, const std::string& /*opName*/)> TensorCallBack;

/**
 * @brief run session.
 * @param session   given session.
 * @param before    callback before each op. return true to run the op; return false to skip the op.
 * @param after     callback after each op. return true to continue running; return false to interrupt the session.
 * @param sync      synchronously wait for finish of execution or not.
 * @return result of running.
 */
ErrorCode runSessionWithCallBack(const Session* session,
                                 const TensorCallBack& before,
                                 const TensorCallBack& end,
                                 bool sync = false) const;
```
Compared with the simple run, running with callbacks additionally provides:
- a callback before each op executes, which can be used to skip that op;
- a callback after each op executes, which can be used to interrupt the whole inference;
- a synchronization option, off by default; when enabled, all backends wait for inference to finish, so the call time equals the inference time.
Computation (FLOPs) Evaluation
```cpp
class MNN_PUBLIC OperatorInfo {
    struct Info;

public:
    /** Operator's name */
    const std::string& name() const;

    /** Operator's type */
    const std::string& type() const;

    /** Operator's flops, in M */
    float flops() const;

protected:
    OperatorInfo();
    ~OperatorInfo();
    Info* mContent;
};

typedef std::function<bool(const std::vector<Tensor*>&, const OperatorInfo*)> TensorCallBackWithInfo;

/**
 * @brief run session.
 * @param session   given session.
 * @param before    callback before each op. return true to run the op; return false to skip the op.
 * @param after     callback after each op. return true to continue running; return false to interrupt the session.
 * @param sync      synchronously wait for finish of execution or not.
 * @return result of running.
 */
ErrorCode runSessionWithCallBackInfo(const Session* session,
                                     const TensorCallBackWithInfo& before,
                                     const TensorCallBackWithInfo& end,
                                     bool sync = false) const;
```
Generally, this interface is only needed when evaluating computation cost. Compared with the plain callback run, the callbacks additionally receive the op's type and FLOPs information.
