Installation on Linux
Follow the official guide: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-pip
Download the TensorRT build matching your CUDA version from https://developer.nvidia.com/tensorrt
After downloading, extract the archive; I extracted it to /usr/local
Add the environment variable:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/TensorRT-${version}/lib
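The export above only lasts for the current shell session. To make it permanent, you can append it to your shell profile; a minimal sketch, assuming bash and the /usr/local install path used above:

```shell
# Persist the TensorRT library path for future shells (bash assumed;
# replace ${version} with your actual version, e.g. 8.0.1.6)
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/TensorRT-${version}/lib' >> ~/.bashrc
```

Open a new terminal (or `source ~/.bashrc`) for the change to take effect.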
Install the TensorRT wheel:
cd TensorRT-${version}/python
pip install tensorrt-*-cp3x-none-linux_x86_64.whl
Install the UFF wheel (needed if you use TensorFlow):
cd TensorRT-${version}/uff
pip install uff-0.6.9-py2.py3-none-any.whl
Install the graphsurgeon wheel:
cd TensorRT-${version}/graphsurgeon
pip install graphsurgeon-0.4.5-py2.py3-none-any.whl
Install the onnx-graphsurgeon wheel:
cd TensorRT-${version}/onnx_graphsurgeon
pip install onnx_graphsurgeon-0.3.10-py2.py3-none-any.whl
Verify the install (if the third statement errors, check your CUDA and cuDNN environment variable setup):
>>> import tensorrt
>>> print(tensorrt.__version__)
8.0.1.6
>>> assert tensorrt.Builder(tensorrt.Logger())
>>>
Converting an h5 (Keras) model to pb
TensorRT cannot consume h5 models directly, so the model must first be converted to a frozen pb graph
# Conversion function: freeze a Keras h5 model into a pb graph
import os
import os.path as osp
import tensorflow as tf
from keras import backend as K

def h5_to_pb(h5_model, output_dir, model_name, out_prefix="output_", log_tensorboard=True):
    if not osp.exists(output_dir):
        os.mkdir(output_dir)
    # Give each model output a stable, predictable node name
    out_nodes = []
    for i in range(len(h5_model.outputs)):
        out_nodes.append(out_prefix + str(i + 1))
        tf.identity(h5_model.outputs[i], out_prefix + str(i + 1))
    sess = K.get_session()
    from tensorflow.python.framework import graph_util, graph_io
    init_graph = sess.graph.as_graph_def()
    # Fold variables into constants so the graph is self-contained
    main_graph = graph_util.convert_variables_to_constants(sess, init_graph, out_nodes)
    graph_io.write_graph(main_graph, output_dir, name=model_name, as_text=False)
    if log_tensorboard:
        from tensorflow.python.tools import import_pb_to_tensorboard
        import_pb_to_tensorboard.import_to_tensorboard(osp.join(output_dir, model_name), output_dir)
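The frozen graph's output nodes are named with out_prefix plus a 1-based index. A small standalone helper (hypothetical, just mirroring the naming loop above) shows which node names to expect, which is useful later when the converter needs output names:

```python
def output_node_names(num_outputs, out_prefix="output_"):
    # Mirrors h5_to_pb: the i-th output tensor is named out_prefix + str(i + 1)
    return [out_prefix + str(i + 1) for i in range(num_outputs)]

# A two-output model yields ['output_1', 'output_2']
print(output_node_names(2))
```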
Two pitfalls:
Keras error: ValueError: Cannot create group in read-only mode. The model file only stores weights, not the architecture, so load_model cannot be used; instantiate the network first and then call load_weights.
TensorFlow error: ValueError: Cannot find the variable that is an input to the ReadVariableOp. Newer Keras versions distinguish training mode from evaluation mode (which does not track gradients); switch to evaluation mode before freezing:
from keras import backend as K
K.set_learning_phase(0)
Converting pb to UFF
convert-to-uff -input_file xxx.pb -o xxx.uff
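If the converter cannot infer the graph's outputs on its own, you can name them explicitly. A sketch, assuming the output node names produced by h5_to_pb above and that your convert-to-uff build accepts an -O output-node flag (check convert-to-uff -h for your version):

```shell
# Example only: pass the frozen graph's output node explicitly
# (the node name output_1 comes from the h5_to_pb naming convention above)
convert-to-uff -input_file model.pb -o model.uff -O output_1
```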
