In MNN, Interpreter provides three interfaces for running a Session. In general, the simple run is enough for the vast majority of scenarios.

Simple run

  /**
   * @brief run session.
   * @param session given session.
   * @return result of running.
   */
  ErrorCode runSession(Session* session) const;

Pass in a Session that has already been created.

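The time this function takes is not always the inference time. On the CPU backend the two are equal. On other backends the function may return without waiting for inference to finish; on GPU, for example, the function time is only the time needed to submit the GPU commands.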
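The timing caveat above can be made concrete with a small wall-clock helper. This is a minimal sketch: `elapsedMs` is a hypothetical helper, not part of MNN, and it only measures whatever the callable passed to it does.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of MNN): wall-clock time spent in `work`, in milliseconds.
double elapsedMs(const std::function<void()>& work) {
    assert(work);  // an empty std::function cannot be called
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

On the CPU backend, wrapping only the `runSession` call in the callable would measure inference time. On a backend that returns early, the callable would also have to include a step that waits for the result, for example reading the output back to the host; otherwise the number reflects command submission only.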

Run with callbacks

  typedef std::function<bool(const std::vector<Tensor*>&,
                             const std::string& /*opName*/)> TensorCallBack;
  /*
   * @brief run session.
   * @param session given session.
   * @param before callback before each op. return true to run the op; return false to skip the op.
   * @param end callback after each op. return true to continue running; return false to interrupt the session.
   * @param sync synchronously wait for finish of execution or not.
   * @return result of running.
   */
  ErrorCode runSessionWithCallBack(const Session* session,
                                   const TensorCallBack& before,
                                   const TensorCallBack& end,
                                   bool sync = false) const;

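Compared with the simple run, the callback run additionally provides:

  • a callback before each op executes, which can be used to skip that op;
  • a callback after each op executes, which can be used to interrupt the whole inference;
  • a synchronous-wait option, off by default; when it is on, every backend waits for inference to finish, so the function time equals the inference time.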
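The skip and interrupt semantics listed above can be illustrated with a stand-in loop. This is a sketch of the documented contract only, not MNN's implementation: `FakeTensor`, `FakeCallBack`, `RunLog` and `runWithCallBack` are hypothetical names, and the sketch assumes the after-callback is not invoked for a skipped op, which the text above does not specify.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in types for illustration only; these are not MNN's classes.
struct FakeTensor {};
using FakeCallBack =
    std::function<bool(const std::vector<FakeTensor*>&, const std::string& /*opName*/)>;

struct RunLog {
    std::vector<std::string> executed;  // names of the ops that actually ran
    bool interrupted = false;           // true if an `end` callback returned false
};

// Hypothetical loop mirroring the documented contract:
// `before` returning false skips the op, `end` returning false stops the run.
RunLog runWithCallBack(const std::vector<std::string>& opNames,
                       const FakeCallBack& before, const FakeCallBack& end) {
    assert(before && end);  // both callbacks must be set
    RunLog log;
    const std::vector<FakeTensor*> noTensors;  // real callbacks receive the op's tensors
    for (const auto& name : opNames) {
        if (!before(noTensors, name)) {
            continue;  // assumption: a skipped op does not trigger `end`
        }
        log.executed.push_back(name);  // stands in for executing the op
        if (!end(noTensors, name)) {
            log.interrupted = true;
            break;
        }
    }
    return log;
}
```

A `before` callback that returns `name != "relu1"` would skip only the op named `relu1`, and an `end` callback that returns false for a given op name would stop the run right after that op.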


Computation cost evaluation

  class MNN_PUBLIC OperatorInfo {
      struct Info;
  public:
      /** Operator's name*/
      const std::string& name() const;
      /** Operator's type*/
      const std::string& type() const;
      /** Operator's flops, in M*/
      float flops() const;
  protected:
      OperatorInfo();
      ~OperatorInfo();
      Info* mContent;
  };
  typedef std::function<bool(const std::vector<Tensor*>&, const OperatorInfo*)> TensorCallBackWithInfo;
  /*
   * @brief run session.
   * @param session given session.
   * @param before callback before each op. return true to run the op; return false to skip the op.
   * @param end callback after each op. return true to continue running; return false to interrupt the session.
   * @param sync synchronously wait for finish of execution or not.
   * @return result of running.
   */
  ErrorCode runSessionWithCallBackInfo(const Session* session,
                                       const TensorCallBackWithInfo& before,
                                       const TensorCallBackWithInfo& end,
                                       bool sync = false) const;

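In general, this interface is only needed when evaluating computation cost. Compared with the callback run, its callbacks additionally receive the Op type and the computation cost (flops).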
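One way to use the per-op information is to accumulate flops by Op type inside a callback. The sketch below uses stand-in types so that it is self-contained: `FakeOpInfo` and `FlopsSummary` are hypothetical and only mirror the name, type and flops accessors of `OperatorInfo`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the data exposed by OperatorInfo; this is not MNN's class.
struct FakeOpInfo {
    std::string name;
    std::string type;
    float flops;  // in M, the unit given in the OperatorInfo comment
};

// Hypothetical accumulator that a callback could feed, one call per op.
struct FlopsSummary {
    std::map<std::string, float> perType;  // total flops per Op type
    float total = 0.0f;                    // total flops over all recorded ops

    // Returns true so that, used as a `before` callback result, the op would still run.
    bool record(const FakeOpInfo& info) {
        assert(info.flops >= 0.0f);
        perType[info.type] += info.flops;
        total += info.flops;
        return true;
    }
};
```

With the real interface, the equivalent callback would read `name()`, `type()` and `flops()` from the `OperatorInfo` pointer it receives and pass them to the accumulator.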