When failing on NVMLError exception, bug in handling #37

qwertAsc · 2021-05-20T13:25:41Z

When this call in smi.py raises an exception:
nvmlDeviceGetSupportedMemoryClocks(handle)

the handler on the following lines fails with "TypeError: list indices must be integers or slices, not str":
except NVMLError as err:
    supportedClocks['Error'] = nvidia_smi.__handleError(err)

because supportedClocks is defined as a list.
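
The failure mode described above can be reproduced without a GPU. The following is a minimal sketch, not the actual smi.py code: `FakeNVMLError`, `get_supported_memory_clocks`, `query_buggy`, and `query_fixed` are hypothetical stand-ins introduced for illustration. It shows why assigning a string key on a list raises inside the handler, and one possible way to report the error instead (replacing the list with a dict).

```python
class FakeNVMLError(Exception):
    """Hypothetical stand-in for pynvml's NVMLError, so this runs without a GPU."""


def get_supported_memory_clocks(handle):
    # Hypothetical stand-in for nvmlDeviceGetSupportedMemoryClocks on a
    # device that does not support the query.
    raise FakeNVMLError("Not Supported")


def query_buggy(handle):
    supported_clocks = []  # a list, as described in the report
    try:
        supported_clocks.extend(get_supported_memory_clocks(handle))
    except FakeNVMLError as err:
        # A list cannot be indexed with a string, so the handler itself
        # raises TypeError and the original error is never reported.
        supported_clocks['Error'] = str(err)
    return supported_clocks


def query_fixed(handle):
    supported_clocks = []
    try:
        supported_clocks.extend(get_supported_memory_clocks(handle))
    except FakeNVMLError as err:
        # One possible fix (an assumption, not the library's actual patch):
        # report the error in a dict instead of indexing the list.
        supported_clocks = {'Error': str(err)}
    return supported_clocks
```

Calling `query_buggy(None)` raises the `TypeError` from the report, while `query_fixed(None)` returns a dict carrying the error message.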

rjzamora · 2021-05-20T13:32:50Z

Thanks for raising an issue @qwertAsc - Can you provide a full reproducer here? How are you getting the handle?

For example, here is how I would expect someone to use nvmlDeviceGetSupportedMemoryClocks:

In [1]: import pynvml

In [2]: pynvml.nvmlInit()

In [3]: handle = pynvml.nvmlDeviceGetHandleByIndex(0)

In [4]: pynvml.nvmlDeviceGetSupportedMemoryClocks(handle)
Out[4]: [7001, 6501, 5001, 810, 405]

qwertAsc · 2021-05-20T13:45:30Z

Thanks @rjzamora
I am just calling the following:
from pynvml.smi import nvidia_smi
nvidia_smi.getInstance().DeviceQuery()
and receiving this error:
supportedClocks['Error'] = nvidia_smi.__handleError(err)
TypeError: list indices must be integers or slices, not str

When running your example I get the following:
pynvml.nvmlDeviceGetSupportedMemoryClocks(handle)
File "/home/.../env/lib/python3.6/site-packages/pynvml/nvml.py", line 1135, in nvmlDeviceGetSupportedMemoryClocks
raise NVMLError(ret)
pynvml.nvml.NVMLError_NotSupported: Not Supported
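
For callers who hit an unsupported query like the one in this traceback, a small wrapper can turn the exception into a `None` result. This is a sketch under assumptions: `query_or_none` is a hypothetical helper, not part of pynvml, and the pynvml usage appears only in a comment because it needs a GPU and driver to run.

```python
def query_or_none(query, *args, error_type=Exception):
    """Call query(*args); return its result, or None if error_type is raised."""
    try:
        return query(*args)
    except error_type:
        return None


# With pynvml installed and initialized, the call from this thread might be
# wrapped like this (handle obtained via nvmlDeviceGetHandleByIndex):
#   clocks = query_or_none(pynvml.nvmlDeviceGetSupportedMemoryClocks, handle,
#                          error_type=pynvml.NVMLError)
```

Passing the narrowest error type that applies keeps unrelated failures visible rather than silently swallowed.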

rjzamora · 2021-05-20T13:57:17Z

Thanks for the info! Are you passing a query string to DeviceQuery (e.g. nvidia_smi.getInstance().DeviceQuery('memory.free'))?

Also, can you specify the version of CUDA you are using and whether you happen to be using MIG support?

qwertAsc · 2021-05-20T14:12:22Z

No, I am not passing any query string.
I am using CUDA version 11.0.
Also, I am less bothered that the clock query fails, and more that the except block doesn't handle the error.
(Not sure regarding MIG.)

danielbraun89 · 2021-05-23T08:35:20Z

I have the same issue while running nvidia_smi.getInstance().DeviceQuery():

supportedClocks['Error'] = nvidia_smi.__handleError(err)
TypeError: list indices must be integers or slices, not str

Riebart mentioned this issue Jul 19, 2021

[BUG] pynvml.smi.DeviceQuery() errors when run in the Intro01 demo notebook due to bad device brand (10) returned rapidsai-community/notebooks-contrib#338

Closed
