Running your container #1

Open
jgato opened this issue Aug 3, 2020 · 1 comment

Comments

jgato commented Aug 3, 2020

Hi there,

I don't know whether you are still actively supporting this repo, but I am trying to create a Docker image for the Jetson Nano with TensorFlow and GPU access. I have tried many docker build examples, but getting the CUDA and TensorFlow library versions to match has been crazy. So I found your repo, which seems to have a pre-compiled image, but when I try it I get an error on import:

>>> import tensorlfow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorlfow'
>>> import tensorflow
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

I guess this is because the libcublas version installed in your image is 10.2, while TensorFlow is looking for 10.0:

# ls -l /usr/lib/aarch64-linux-gnu/libcublas*
lrwxrwxrwx 1 root root       15 Aug  3 07:21 /usr/lib/aarch64-linux-gnu/libcublas.so -> libcublas.so.10
lrwxrwxrwx 1 root root       22 Aug  3 07:21 /usr/lib/aarch64-linux-gnu/libcublas.so.10 -> libcublas.so.10.2.2.89
-rw-r--r-- 1 root root 80530928 Oct 29  2019 /usr/lib/aarch64-linux-gnu/libcublas.so.10.2.2.89
lrwxrwxrwx 1 root root       17 Aug  3 07:21 /usr/lib/aarch64-linux-gnu/libcublasLt.so -> libcublasLt.so.10
lrwxrwxrwx 1 root root       24 Aug  3 07:21 /usr/lib/aarch64-linux-gnu/libcublasLt.so.10 -> libcublasLt.so.10.2.2.89
-rw-r--r-- 1 root root 33235064 Oct 29  2019 /usr/lib/aarch64-linux-gnu/libcublasLt.so.10.2.2.89
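
The mismatch between the traceback and the listing above can also be checked from Python before importing TensorFlow. This is only a minimal sketch: the helper names `missing_library` and `can_load` are hypothetical, and it assumes the dynamic loader's error message has the same shape as the one in the traceback.

```python
import ctypes
import re


def missing_library(message):
    """Return the shared-library name from a 'cannot open shared object file' error, or None."""
    match = re.search(r"(lib[\w.+-]+\.so[\w.]*): cannot open shared object file", message)
    return match.group(1) if match else None


def can_load(soname):
    """Return True if the dynamic loader can find and open the given library name."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Error text copied from the ImportError in the traceback.
    message = "libcublas.so.10.0: cannot open shared object file: No such file or directory"
    needed = missing_library(message)
    print(needed, "loadable:", can_load(needed))
    # The symlink that actually exists in the image according to the listing.
    print("libcublas.so.10 loadable:", can_load("libcublas.so.10"))
```

If the first check reports not loadable and the second reports loadable, the installed TensorFlow build was linked against a different cuBLAS version than the one present in the container.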

After reading about it, I tried upgrading to tensorflow-gpu 1.15 inside your image, but then I get a segmentation fault:

/# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Segmentation fault (core dumped)

Any ideas or help?

jgato commented Aug 3, 2020

I think I got it. The base Docker image provided by NVIDIA shares the host's local runtime environment with the container. Therefore, no matter what I do, I will have CUDA 10.2 inside the container, so I need TensorFlow 2.2 installed. In your case, the local runtime environment was different when you built this example.
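
To confirm which CUDA version the container actually sees, something like the following could be run inside it. This is a sketch under assumptions: the path `/usr/local/cuda/version.txt` and the `CUDA Version X.Y.Z` text format are what I would expect from a CUDA 10.x toolkit install, not something verified against this image, and the function names are hypothetical.

```python
import re
from pathlib import Path


def parse_cuda_version(text):
    """Extract (major, minor) from text such as 'CUDA Version 10.2.89', or return None."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def container_cuda_version(path="/usr/local/cuda/version.txt"):
    """Read the toolkit version file if present (the default path is an assumption)."""
    version_file = Path(path)
    if not version_file.is_file():
        return None
    return parse_cuda_version(version_file.read_text())


if __name__ == "__main__":
    version = container_cuda_version()
    if version is None:
        print("No CUDA version file found at the assumed path")
    else:
        print("CUDA seen inside the container: %d.%d" % version)
```

If this reports 10.2 regardless of what the image was built with, that matches the explanation above: the CUDA libraries come from the host, so the TensorFlow build has to match the host's CUDA version.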
