The supported AMDGPU versions are listed as gfx1030gfx1100; a ',' may be missing between the devices ("gfx1030,gfx1100") #2524
Comments
This is already fixed in the code, but they seem to be taking forever to release the updated binary. If you are OK with building from source, that should work. This script should give you some idea of how to compile it. The latest Docker image also has the fix, so if you are OK with using a Docker container, try the ROCm 6.1 images from https://hub.docker.com/r/rocm/tensorflow/tags
This has been an issue for many months now... See #2410. If what @briansp2020 said about the Docker image being fixed is accurate, it's quite baffling that they didn't bother to update the package on PyPI... Still, if you do not want to use the Docker image, there's an alternative to compiling TensorFlow: you can download nightly wheels from here: http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r214-rocm-enhanced/job/release-build-whl/. This was mentioned by jayfurmanek in #2410, and it worked quite well for me.
This issue is still present in ROCm 6.1.2:
2024-07-21 22:16:52.680470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2266] Ignoring visible gpu device (device: 0, name: AMD Radeon RX 6600, pci bus id: 0000:03:00.0) with AMDGPU version : gfx1030. The supported AMDGPU versions are gfx1030gfx1100, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.
TensorFlow ignores a gfx1030 GPU.
tensorflow_rocm-2.14.0.600
Agent 2 Name: gfx1030
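For anyone checking whether their own log shows the same symptom, here is a minimal sketch that parses the message quoted above and flags any list entry that looks like two architecture names fused together. The function name `diagnose` and the parsing rules are my own assumptions based only on the log text in this thread; this is not part of TensorFlow or ROCm.

```python
import re

# Log message copied from the comment above (timestamp and source prefix omitted).
LOG_LINE = (
    "Ignoring visible gpu device (device: 0, name: AMD Radeon RX 6600, "
    "pci bus id: 0000:03:00.0) with AMDGPU version : gfx1030. "
    "The supported AMDGPU versions are gfx1030gfx1100, gfx900, gfx906, "
    "gfx908, gfx90a, gfx940, gfx941, gfx942."
)


def diagnose(line):
    """Hypothetical helper: return (device_arch, supported, fused_entries)."""
    # Architecture reported for the ignored device.
    device = re.search(r"AMDGPU version : (\w+)\.", line).group(1)
    # Everything after "versions are", without the trailing period.
    listed = re.search(r"supported AMDGPU versions are (.+?)\.?$", line).group(1)
    supported = [entry.strip() for entry in listed.split(",")]
    # Entries that contain the device arch but are not equal to it,
    # e.g. "gfx1030gfx1100" when the device is "gfx1030".
    fused = [entry for entry in supported if device in entry and entry != device]
    return device, supported, fused


if __name__ == "__main__":
    device, supported, fused = diagnose(LOG_LINE)
    print("device:", device)
    print("exact match in list:", device in supported)
    print("fused entries containing it:", fused)
```

If the device has no exact match but does show up inside a fused entry, you are most likely hitting the missing-comma problem described in this issue rather than a genuinely unsupported GPU.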
I think they may have given up on the pypi package, but instructions on this repo were not updated and the change was poorly communicated (no surprises here). According to https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/tensorflow-install.html:
I can confirm what @Eskander is saying: they dropped the PyPI package, and the PyPI page makes no mention of that. However, I would currently advise against installing tensorflow-rocm on your system, not because of TensorFlow per se, but because of ROCm itself (assuming you are installing ROCm on your system). ROCm currently breaks my system. I'm on a fresh install of Ubuntu 24.04.1 with the iGPU disabled in the BIOS (I have a Ryzen 7950X). I attempted to install via amdgpu with dkms, then uninstalled and installed again with --no-dkms, with no luck. GNOME implodes the moment you get to the login screen: I could only log in, see a bunch of glitching, open a terminal and uninstall. Additionally, if you actually check their repos, there's currently no TensorFlow build for Python 3.12, which Ubuntu 24.04 now ships by default... So yeah, even if you could get ROCm working, tensorflow-rocm for Ubuntu LTS is currently broken, despite AMD claiming support for it... Anyway, the Docker version seems to work perfectly fine with my 7900 XTX, so I believe this particular issue has been solved and can now be closed. Lastly, if you are running Fedora, I hear they are now shipping with ROCm 6 installed by default. You'll still have the Python version issue, but maybe you'll have better luck there.
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
binary
TensorFlow version
v2.14.0-4248-g3448956e87e 2.14.0.600
Custom code
Yes
OS platform and distribution
No response
Mobile device
No response
Python version
No response
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
2024-05-04 09:45:04.334204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2266] Ignoring visible gpu device (device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0) with AMDGPU version : gfx1100. The supported AMDGPU versions are gfx1030gfx1100, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.
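As an illustration of how a single missing comma could produce the "gfx1030gfx1100" entry in the message above: in languages such as Python and C++, adjacent string literals are concatenated, so dropping one comma in a list of literals silently merges two entries. The sketch below uses a made-up list and a made-up `is_supported` helper; it is not TensorFlow's actual source, and where the real list is defined is not stated in this issue.

```python
# Hypothetical list for illustration only, not TensorFlow's actual source.
SUPPORTED_BUGGY = [
    "gfx1030"   # missing comma: this literal is joined with the next one
    "gfx1100",
    "gfx900",
]

# Same list with the comma restored.
SUPPORTED_FIXED = ["gfx1030", "gfx1100", "gfx900"]


def is_supported(arch, supported):
    """Exact-match membership check, as a device filter would typically do."""
    return arch in supported


if __name__ == "__main__":
    print(SUPPORTED_BUGGY)
    for arch in ("gfx1030", "gfx1100", "gfx900"):
        print(arch, is_supported(arch, SUPPORTED_BUGGY), is_supported(arch, SUPPORTED_FIXED))
```

With the buggy list, both gfx1030 and gfx1100 fail the exact-match check while gfx900 still passes, which matches the behavior reported here for the RX 7900 XTX.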
Standalone code to reproduce the issue
Relevant log output