In MNN, the Interpreter provides three interfaces for running a Session. For the vast majority of use cases, the simple run is sufficient.
Simple Run
```cpp
/**
 * @brief run session.
 * @param session given session.
 * @return result of running.
 */
ErrorCode runSession(Session* session) const;
```
Simply pass in the previously created Session.
Note that the time spent in this function does not always equal the inference time. On the CPU backend, the function time is the inference time; on other backends, the function may not wait synchronously for inference to finish. For example, on GPU, the function time is only the time taken to submit the GPU commands.
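The simple run can be sketched as follows. This is a usage sketch, not a definitive implementation: it assumes a model file named `model.mnn` (a placeholder path) and links against libMNN; `createFromFile`, `createSession`, and `runSession` are the standard `MNN::Interpreter` entry points.

```cpp
// Sketch: creating a Session and timing runSession.
// Assumes a placeholder model file "model.mnn" and linking against libMNN.
#include <MNN/Interpreter.hpp>
#include <chrono>
#include <cstdio>
#include <memory>

int main() {
    std::shared_ptr<MNN::Interpreter> net(
        MNN::Interpreter::createFromFile("model.mnn"));
    MNN::ScheduleConfig config;                  // default config runs on CPU
    MNN::Session* session = net->createSession(config);

    auto begin = std::chrono::steady_clock::now();
    MNN::ErrorCode code = net->runSession(session);
    auto end = std::chrono::steady_clock::now();

    // On the CPU backend this duration is the inference time; on GPU it may
    // only cover command submission, as noted above.
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
    printf("runSession returned %d, took %lld ms\n", (int)code, (long long)ms);
    return 0;
}
```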
Run with Callbacks
```cpp
typedef std::function<bool(const std::vector<Tensor*>&,
                           const std::string& /*opName*/)> TensorCallBack;

/**
 * @brief run session.
 * @param session given session.
 * @param before callback before each op. return true to run the op; return false to skip the op.
 * @param end callback after each op. return true to continue running; return false to interrupt the session.
 * @param sync synchronously wait for finish of execution or not.
 * @return result of running.
 */
ErrorCode runSessionWithCallBack(const Session* session,
                                 const TensorCallBack& before,
                                 const TensorCallBack& end,
                                 bool sync = false) const;
```
Compared with the simple run, the callback run additionally provides:
- a callback before each op executes, which can be used to skip that op;
- a callback after each op executes, which can be used to interrupt the whole inference;
- a synchronous-wait option, off by default; when enabled, all backends wait for inference to complete, so the function time equals the inference time.
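The three points above can be sketched with a pair of callbacks. This is a hedged example, not the library's canonical usage: it assumes an `Interpreter` named `net` and a `Session` named `session` have already been created as described earlier, and the op name `"prob"` is purely hypothetical; substitute a real op name from your own model.

```cpp
// Sketch: before/after callbacks with runSessionWithCallBack.
// Assumes `net` (MNN::Interpreter) and `session` were created beforehand.
MNN::TensorCallBack before = [](const std::vector<MNN::Tensor*>& inputs,
                                const std::string& opName) {
    // Returning false here would skip executing this op.
    return true;
};
MNN::TensorCallBack after = [](const std::vector<MNN::Tensor*>& outputs,
                               const std::string& opName) {
    // Returning false here interrupts the rest of the inference.
    // "prob" is a hypothetical op name: stop once it has executed.
    return opName != "prob";
};
// sync = true: wait until execution finishes on every backend.
net->runSessionWithCallBack(session, before, after, true);
```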
Computation (FLOPs) Estimation
```cpp
class MNN_PUBLIC OperatorInfo {
    struct Info;

public:
    /** Operator's name */
    const std::string& name() const;
    /** Operator's type */
    const std::string& type() const;
    /** Operator's flops, in M */
    float flops() const;

protected:
    OperatorInfo();
    ~OperatorInfo();
    Info* mContent;
};
```
```cpp
typedef std::function<bool(const std::vector<Tensor*>&, const OperatorInfo*)> TensorCallBackWithInfo;

/**
 * @brief run session.
 * @param session given session.
 * @param before callback before each op. return true to run the op; return false to skip the op.
 * @param end callback after each op. return true to continue running; return false to interrupt the session.
 * @param sync synchronously wait for finish of execution or not.
 * @return result of running.
 */
ErrorCode runSessionWithCallBackInfo(const Session* session,
                                     const TensorCallBackWithInfo& before,
                                     const TensorCallBackWithInfo& end,
                                     bool sync = false) const;
```
Generally, this interface is only needed when estimating computation cost. Compared with the callback run, the callbacks additionally receive the op's type and FLOPs information.
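As a sketch of such an estimation, the after-callback can accumulate the per-op FLOPs reported by `OperatorInfo::flops()`. Again this assumes `net` and `session` were created as described earlier; the variable name `totalFlopsM` is our own.

```cpp
// Sketch: summing the model's total FLOPs with runSessionWithCallBackInfo.
// Assumes `net` (MNN::Interpreter) and `session` were created beforehand.
float totalFlopsM = 0.0f;
MNN::TensorCallBackWithInfo before = [](const std::vector<MNN::Tensor*>& inputs,
                                        const MNN::OperatorInfo* info) {
    return true;   // run every op
};
MNN::TensorCallBackWithInfo after = [&totalFlopsM](const std::vector<MNN::Tensor*>& outputs,
                                                   const MNN::OperatorInfo* info) {
    totalFlopsM += info->flops();   // flops() is reported in M (MFLOPs)
    return true;                    // continue the inference
};
// sync = true so the totals are complete when the call returns.
net->runSessionWithCallBackInfo(session, before, after, true);
printf("total: %.2f MFLOPs\n", totalFlopsM);
```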