[pytest] nvlink related tests fail on a machine without nvlink #24

Open
ksangeek opened this issue Jan 8, 2020 · 1 comment
ksangeek commented Jan 8, 2020

Describe the bug
The tests for NVLink-related APIs, e.g. test_nvml_nvlink_properties(), fail on a machine without NVLink. The error raised, pynvml.nvml.NVMLError_NotSupported, makes it clear that the failure is caused by the absence of NVLink.
I am opening this issue to ask whether there is a better way to handle this in the tests, or whether it is too much work to bother with.

Steps/Code to reproduce bug
pytest reports failures like the following for NVLink-related test cases:

__________________________________ test_nvml_nvlink_properties ___________________________________

ngpus = 2
handles = [<pynvml.nvml.LP_struct_c_nvmlDevice_t object at 0x7f2f299bfc80>, <pynvml.nvml.LP_struct_c_nvmlDevice_t object at 0x7f2f299bfb70>]

    def test_nvml_nvlink_properties(ngpus, handles):
        for i in range(ngpus):
            for j in range(pynvml.NVML_NVLINK_MAX_LINKS):
>               version = pynvml.nvmlDeviceGetNvLinkVersion(handles[i], j)

test_nvml.py:238:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../../anaconda3/envs/pynvml_py36/lib/python3.6/site-packages/pynvml/nvml.py:2021: in nvmlDeviceGetNvLinkVersion
    check_return(ret)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

ret = 3

    def check_return(ret):
        if (ret != NVML_SUCCESS):
>           raise NVMLError(ret)
E           pynvml.nvml.NVMLError_NotSupported: Not Supported

../../../../anaconda3/envs/pynvml_py36/lib/python3.6/site-packages/pynvml/nvml.py:366: NVMLError_NotSupported
------------------------------------- Captured stdout setup --------------------------------------
[2 GPUs]

rjzamora commented Jan 8, 2020

Thanks for raising @ksangeek - You are correct that the test suite assumes NVLink is supported on the machine being queried. It certainly makes sense to skip tests that do not apply to the target machine (especially if/when we start introducing CI).

For NVLink, we can probably call nvmlDeviceGetNvLinkVersion on the 0th device within a module-level fixture and catch the NVMLError_NotSupported error to detect that NVLink is not supported, so the NVLink tests can be skipped on that machine.
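
A minimal sketch of that idea, not a tested patch. The helper name nvlink_supported and the fixture name require_nvlink are hypothetical, and probing only link 0 of the 0th device is an assumption about what is sufficient. The helper takes the pynvml module as an argument so it can be exercised with a stand-in on a machine without a GPU. The fixture wiring is shown as comments because it needs pytest, pynvml, and the existing handles fixture.

```python
def nvlink_supported(nvml, handle, link=0):
    """Probe one NVLink query; False means NVML reported "Not Supported".

    `nvml` is the imported pynvml module (passed in so the helper can be
    exercised with a stand-in). Any other NVML error is left to propagate.
    """
    try:
        nvml.nvmlDeviceGetNvLinkVersion(handle, link)
    except nvml.NVMLError_NotSupported:
        return False
    return True


# Hypothetical wiring in test_nvml.py (needs pytest, pynvml, and a GPU).
# Assumes the existing `handles` fixture is module-scoped as well;
# otherwise pytest reports a scope mismatch and the fixture below
# should drop scope="module".
#
# @pytest.fixture(scope="module")
# def require_nvlink(handles):
#     if not nvlink_supported(pynvml, handles[0]):
#         pytest.skip("NVLink is not supported on this machine")
#
# def test_nvml_nvlink_properties(ngpus, handles, require_nvlink):
#     ...
```

With this shape, each NVLink test opts in by requesting the fixture, so the tests are reported as skipped rather than failed on machines without NVLink.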
