Installing on Linux

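Follow the official installation guide: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-pip
Download the TensorRT release that matches your CUDA version from the official site: https://developer.nvidia.com/tensorrt
After downloading, extract the archive. I extracted mine to /usr/local.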

Add the environment variable:

  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/TensorRT-${version}/lib
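To confirm that the export took effect in the current shell, a small check like the following can help. This is a minimal sketch: the helper name lib_dir_on_path and the example version string are hypothetical and not part of TensorRT.

```python
import os

def lib_dir_on_path(lib_dir, env_value=None):
    """Return True if lib_dir is one of the entries in LD_LIBRARY_PATH."""
    if env_value is None:
        env_value = os.environ.get("LD_LIBRARY_PATH", "")
    wanted = os.path.normpath(lib_dir)
    # Skip empty entries produced by leading, trailing, or doubled separators
    entries = [e for e in env_value.split(os.pathsep) if e]
    return any(os.path.normpath(e) == wanted for e in entries)

# Hypothetical install location; substitute your own version string.
example_dir = "/usr/local/TensorRT-8.0.1.6/lib"
print(lib_dir_on_path(example_dir))
```

If this prints False in a new terminal, the export line was probably not added to a shell startup file such as ~/.bashrc.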

Install the TensorRT whl (choose the file whose cp3x tag matches your Python version):

  cd TensorRT-${version}/python
  pip install tensorrt-*-cp3x-none-linux_x86_64.whl
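The cp3x part of the wheel name has to match the running interpreter (cp38 for Python 3.8, for example). The sketch below shows one way to pick the matching file. The helper names and the directory listing are hypothetical and only illustrate the naming scheme.

```python
import sys

def python_tag(version_info=None):
    """Build the CPython wheel tag, e.g. (3, 8) -> 'cp38'."""
    if version_info is None:
        version_info = sys.version_info
    return f"cp{version_info[0]}{version_info[1]}"

def pick_wheel(filenames, tag=None):
    """Return the first wheel filename carrying the given CPython tag, or None."""
    tag = tag or python_tag()
    for name in filenames:
        if not name.endswith(".whl"):
            continue
        # Wheel names are dash-separated: name-version-pythontag-abitag-platform.whl
        parts = name[:-len(".whl")].split("-")
        if tag in parts:
            return name
    return None

# Hypothetical directory listing of TensorRT-${version}/python
wheels = [
    "tensorrt-8.0.1.6-cp36-none-linux_x86_64.whl",
    "tensorrt-8.0.1.6-cp37-none-linux_x86_64.whl",
    "tensorrt-8.0.1.6-cp38-none-linux_x86_64.whl",
]
print(pick_wheel(wheels, tag="cp37"))
```

Calling pick_wheel(wheels) without a tag uses the interpreter that runs the script, which is normally the one pip will install into.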

Install the UFF whl (only needed if you use TensorFlow):

  cd TensorRT-${version}/uff
  pip install uff-0.6.9-py2.py3-none-any.whl

Install the graphsurgeon whl:

  cd TensorRT-${version}/graphsurgeon
  pip install graphsurgeon-0.4.5-py2.py3-none-any.whl

Install the onnx-graphsurgeon whl:

  cd TensorRT-${version}/onnx_graphsurgeon
  pip install onnx_graphsurgeon-0.3.10-py2.py3-none-any.whl
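The wheel version numbers differ between TensorRT releases, so it can be easier to list what the archive actually ships than to type the file names by hand. A minimal sketch, assuming the four subdirectories named in the steps above; the function name find_wheels and the example root path are hypothetical.

```python
import glob
import os

# Subdirectories of the extracted archive that contain a wheel
WHEEL_DIRS = ["python", "uff", "graphsurgeon", "onnx_graphsurgeon"]

def find_wheels(trt_root):
    """Map each subdirectory to the sorted list of .whl files it contains."""
    found = {}
    for sub in WHEEL_DIRS:
        pattern = os.path.join(trt_root, sub, "*.whl")
        found[sub] = sorted(glob.glob(pattern))
    return found

# Hypothetical root; replace with your own extraction directory.
for sub, wheels in find_wheels("/usr/local/TensorRT-8.0.1.6").items():
    print(sub, wheels)
```

An empty list for a subdirectory means that the path is wrong or that this release does not ship that wheel.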

Test the installation (if the final Builder assertion raises an error, check your CUDA and cuDNN environment variables):

  >>> import tensorrt
  >>> print(tensorrt.__version__)
  8.0.1.6
  >>> assert tensorrt.Builder(tensorrt.Logger())
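The same check can be wrapped in a script that reports a failed import instead of stopping with a traceback. This is a sketch only, and the function name tensorrt_version is hypothetical.

```python
def tensorrt_version():
    """Return the installed TensorRT version string, or None if it cannot be imported."""
    try:
        import tensorrt
    except (ImportError, OSError) as exc:
        # ImportError covers a missing wheel or an unresolvable shared library;
        # OSError is caught defensively for loader failures.
        print(f"tensorrt import failed: {exc}")
        return None
    return tensorrt.__version__

version = tensorrt_version()
print("TensorRT version:", version)
```

A failure message that mentions a missing .so file usually points back to the LD_LIBRARY_PATH setting.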

Converting an h5 (Keras) model to pb

TensorRT cannot read h5 models directly, so the model first has to be converted to a frozen pb model.

  # Conversion function (uses the TensorFlow 1.x graph/session API)
  import os
  import os.path as osp
  import tensorflow as tf
  from keras import backend as K
  from tensorflow.python.framework import graph_util, graph_io

  def h5_to_pb(h5_model, output_dir, model_name, out_prefix="output_", log_tensorboard=True):
      os.makedirs(output_dir, exist_ok=True)
      # Give every model output a predictable node name: output_1, output_2, ...
      out_nodes = []
      for i, output in enumerate(h5_model.outputs):
          out_nodes.append(out_prefix + str(i + 1))
          tf.identity(output, out_prefix + str(i + 1))
      sess = K.get_session()
      # Freeze the graph: variables become constants holding their current values
      init_graph = sess.graph.as_graph_def()
      main_graph = graph_util.convert_variables_to_constants(sess, init_graph, out_nodes)
      graph_io.write_graph(main_graph, output_dir, name=model_name, as_text=False)
      if log_tensorboard:
          # Optionally write the frozen graph out for inspection in TensorBoard
          from tensorflow.python.tools import import_pb_to_tensorboard
          import_pb_to_tensorboard.import_to_tensorboard(osp.join(output_dir, model_name), output_dir)
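The conversion function names the output nodes of the frozen graph by appending a 1-based index to out_prefix. Those names are worth noting, because later conversion steps may ask for the output node names. The sketch below reproduces the naming without needing TensorFlow; the helper name output_node_names is hypothetical.

```python
def output_node_names(num_outputs, out_prefix="output_"):
    """Reproduce the 1-based node names that h5_to_pb assigns to model outputs."""
    return [out_prefix + str(i + 1) for i in range(num_outputs)]

# A model with two outputs gets these node names in the frozen graph
print(output_node_names(2))  # ['output_1', 'output_2']
```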

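Two pitfalls:
Keras raises "ValueError: Cannot create group in read-only mode." The cause is that the model file contains only the weights and not the architecture, so load_model cannot be used. Instantiate the network in code first, then call its load_weights method.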

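TensorFlow raises "ValueError: Cannot find the variable that is an input to the ReadVariableOp." The cause is that newer Keras versions distinguish between training mode and evaluation mode (no gradient tracking). Set the learning phase to evaluation, typically before the model is built or loaded:

  from keras import backend as K
  K.set_learning_phase(0)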

Converting pb to uff

  convert-to-uff xxx.pb -o xxx.uff
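When the conversion is part of a script, the command can be assembled in Python. This sketch assumes the converter shipped with uff 0.6.x, which takes the input file as a positional argument and accepts -O to name an output node; check convert-to-uff --help on your installation. The helper name build_convert_cmd is hypothetical.

```python
import shlex

def build_convert_cmd(pb_path, uff_path, output_nodes=()):
    """Assemble the convert-to-uff argument list for a frozen pb file."""
    cmd = ["convert-to-uff", pb_path, "-o", uff_path]
    for node in output_nodes:
        # -O names an output node; assumed to be repeatable in this converter version
        cmd += ["-O", node]
    return cmd

cmd = build_convert_cmd("xxx.pb", "xxx.uff", ["output_1"])
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run the conversion: subprocess.run(cmd, check=True)
```

The node name output_1 follows the naming used by the h5-to-pb conversion function earlier in this post.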