John Nagle
January 18th, 2020, 05:40 AM
I'm trying to run TensorFlow on Ubuntu 18.04 LTS, which I just upgraded from 16.04 LTS. The machine is x86-64 with an Nvidia GeForce GT 640.
The driver part of the CUDA install seems to be OK:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
john@Nagle-LTS:/usr/local$ nvidia-smi
Fri Jan 17 20:15:42 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 640 Off | 00000000:01:00.0 N/A | N/A |
| 30% 31C P8 N/A / N/A | 302MiB / 1996MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
But if I try to run TensorFlow's basic test, I get:
~/projects/sl/rasa$ python
Python 3.7.6 (default, Jan 17 2020, 10:24:03)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-01-17 19:26:02.912477: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-01-17 19:26:02.912566: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-01-17 19:26:02.912575: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
From the articles I've read, this is a known problem, and for older versions there are all sorts of workarounds that bypass the package manager. I don't want to do that; the drivers and Nvidia command-line tools seem to be OK. But there's nothing in "/usr/local/cuda", the usual place for those libraries.
What installs those shared libraries without messing up anything else?
There's some discussion here: https://devtalk.nvidia.com/default/topic/1066634/could-not-load-dynamic-library-libnvinfer-so-5-/
but it's for an older version of the Nvidia driver. I have driver 435.21 installed; those instructions have very specific version numbers and install driver 418.
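For anyone who wants to check the same thing without importing TensorFlow, here is a minimal sketch that asks the dynamic linker for the two libraries named in the warnings above. It assumes that a plain dlopen by name, which is what Python's ctypes.CDLL does, is a fair stand-in for TensorFlow's loader; the helper names can_dlopen and report are just illustrative, not part of any library.

```python
import ctypes

# Library names copied from the TensorFlow warnings above.
TENSORRT_LIBS = ["libnvinfer.so.6", "libnvinfer_plugin.so.6"]


def can_dlopen(name):
    """Return True if the dynamic linker can load `name`, else False.

    Hypothetical helper: ctypes.CDLL raises OSError when the shared
    object cannot be found or opened.
    """
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def report(names):
    """Map each library name to whether it could be loaded."""
    return {name: can_dlopen(name) for name in names}


if __name__ == "__main__":
    for name, ok in report(TENSORRT_LIBS).items():
        print(f"{name}: {'found' if ok else 'not found'}")
```

If both come back "not found", the libraries are not on the linker's search path at all, which would match the warnings; it does not say which package is supposed to provide them.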