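TensorFlow frequently ends up unable to use the GPU.
Environment: CUDA 10.0 is installed. Pay attention to the version number.
First layer of errors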
Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
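The error says cudart64_101.dll cannot be found. In C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin there is a cudart64_100.dll, and that is the directory that should be searched. The system environment variables point there as well, which is exactly why cudart64_101.dll is not found. The real problem is that the loader should be asking for cudart64_100.dll rather than cudart64_101.dll. My guess was that some hidden variable controls the name, but I could not track down the source. (The _101 suffix corresponds to CUDA 10.1, so the installed TensorFlow build may simply expect CUDA 10.1 rather than 10.0.) Since I could not find the source, I searched the system for cudart64_101.dll instead and eventually found it in C:\Program Files\NVIDIA Corporation\NvStreamSrv. Copying cudart64_101.dll into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin makes this error go away.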
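The search-and-copy step can also be scripted instead of done by hand. The following is only a minimal sketch: the helper names find_dll and copy_into are made up for illustration, and the folders in the usage comment are the ones from this machine and may differ on yours.

```python
import os
import shutil
from pathlib import Path


def find_dll(name, roots):
    """Return every file called `name` (case-insensitive) under the given root folders."""
    matches = []
    for root in roots:
        # os.walk silently yields nothing for folders that do not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                if filename.lower() == name.lower():
                    matches.append(Path(dirpath) / filename)
    return matches


def copy_into(src, dst_dir):
    """Copy `src` into `dst_dir` unless a file with that name is already there."""
    target = Path(dst_dir) / Path(src).name
    if not target.exists():
        shutil.copy2(src, target)
    return target


# Example usage (paths are assumptions taken from this post; writing into
# Program Files normally needs an elevated prompt):
# hits = find_dll("cudart64_101.dll", [r"C:\Program Files\NVIDIA Corporation"])
# if hits:
#     copy_into(hits[0], r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin")
```

The copy is skipped when the target already exists, so running it twice does not overwrite anything.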
Second layer of errors
2021-09-04 13:48:20.166871: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
2021-09-04 13:48:20.167337: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-09-04 13:48:20.167897: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-09-04 13:48:20.168483: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2021-09-04 13:48:20.168956: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
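Once the first layer is dealt with, the second layer of errors usually shows up. Files with similar names already exist in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin: make a copy of cublas64_100.dll and name it cublas64_10.dll, then do the same for each file named in the errors. After that the GPU can be used.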
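Copying and renaming five files by hand is error-prone, so the step above could be done in a loop. This is a rough sketch, not a recommended fix: alias_dlls is a hypothetical helper, and it assumes every requested name maps to its CUDA 10.0 counterpart by replacing the _10.dll suffix with _100.dll, as in the cublas example.

```python
import shutil
from pathlib import Path

# DLL names requested in the warnings above.
WANTED = [
    "cublas64_10.dll",
    "cufft64_10.dll",
    "curand64_10.dll",
    "cusolver64_10.dll",
    "cusparse64_10.dll",
]


def alias_dlls(bin_dir, wanted):
    """Copy each *_100.dll in `bin_dir` to the *_10.dll name that was requested.

    Returns (created, skipped). A name is skipped when the target already
    exists or when the assumed *_100.dll source file is missing.
    """
    bin_dir = Path(bin_dir)
    created, skipped = [], []
    for name in wanted:
        # Assumption: the CUDA 10.0 file only differs by the _100 suffix.
        source = bin_dir / name.replace("_10.dll", "_100.dll")
        target = bin_dir / name
        if target.exists() or not source.exists():
            skipped.append(name)
            continue
        shutil.copy2(source, target)  # keep the original file untouched
        created.append(name)
    return created, skipped


# Example usage (path is the one from this post; needs an elevated prompt):
# alias_dlls(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin", WANTED)
```

The originals are copied rather than renamed, so anything that still expects the _100 names keeps working.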
This is obviously an ugly workaround, but with my limited knowledge it is all I could come up with. Sigh.