Adding support for CUDA_VISIBLE_DEVICES #567

Closed
lipengfeizju opened this issue Jun 6, 2024 · 5 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@lipengfeizju

  • CodeCarbon version: 2.4.2
  • Python version: 3.9.19
  • Operating System: Rocky Linux release 8.8 (Green Obsidian)

Description

On our university's cluster, we want to measure the energy consumption of a deep learning model. The server is managed by SLURM and we are allocated only 1 A100 (out of 8 GPUs). However, the GPU power reported by codecarbon covers all 8 GPUs rather than just the GPU allocated to us.

Technically speaking, I guess that line 184 of codecarbon/core/gpu.py queries all GPUs from NVML instead of only the GPU we are actually using. To get a more accurate measurement, it would be better to look up only the power consumption of the devices listed in CUDA_VISIBLE_DEVICES.

A similar discussion about this topic can also be found in pynvml.

So is it possible to add a new feature that restricts measurements to CUDA_VISIBLE_DEVICES? I think this is important for deep learning applications, since the non-visible devices are usually unrelated to the power consumption of the DL application.
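To make the request concrete, here is a minimal sketch of what "only measure the visible devices" could look like when talking to NVML directly. It is not codecarbon's implementation. The helper names `visible_gpu_indices` and `visible_gpu_power_watts` are hypothetical, and the sketch assumes that CUDA_VISIBLE_DEVICES contains plain integer indices (UUID and MIG entries are not handled) and that CUDA and NVML enumerate the GPUs in the same order, which is only guaranteed when CUDA_DEVICE_ORDER=PCI_BUS_ID is set.

```python
import os
from typing import List, Mapping


def visible_gpu_indices(environ: Mapping[str, str], device_count: int) -> List[int]:
    """Return the integer GPU indices named in CUDA_VISIBLE_DEVICES.

    Assumption: entries are plain integers. UUID entries ("GPU-...") and
    MIG identifiers are not handled by this sketch.
    """
    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable unset: every device is visible.
        return list(range(device_count))
    indices: List[int] = []
    for token in raw.split(","):
        token = token.strip()
        # Mirror CUDA's documented behaviour as I understand it: entries
        # after the first invalid index are ignored.
        if not token.isdigit() or int(token) >= device_count:
            break
        indices.append(int(token))
    return indices


def visible_gpu_power_watts(environ: Mapping[str, str] = os.environ) -> float:
    """Sum the instantaneous power draw of the visible GPUs only.

    Requires the pynvml package and an NVIDIA driver at call time.
    """
    import pynvml  # imported lazily so the parser above works without it

    pynvml.nvmlInit()
    try:
        count = pynvml.nvmlDeviceGetCount()
        total_milliwatts = 0
        for index in visible_gpu_indices(environ, count):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            # nvmlDeviceGetPowerUsage reports milliwatts.
            total_milliwatts += pynvml.nvmlDeviceGetPowerUsage(handle)
        return total_milliwatts / 1000.0
    finally:
        pynvml.nvmlShutdown()
```

With CUDA_VISIBLE_DEVICES=3 on an 8-GPU node, only device 3 would be queried, so the reading would no longer include the other seven GPUs.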

Thank you again for providing the code base for carbon measurement.

@SaboniAmine added the "good first issue" and "enhancement" labels Jun 6, 2024
@inimaz
Contributor

inimaz commented Jun 6, 2024

Hello @lipengfeizju! Thanks for using codecarbon.
I didn't know about the CUDA_VISIBLE_DEVICES variable. If pynvml ends up implementing this, it would indeed be useful to use it.

In the meantime, codecarbon has a way to filter the GPUs that are tracked if you provide their ids via gpu_ids: https://github.com/mlco2/codecarbon/blob/master/codecarbon/emissions_tracker.py#L191
So maybe you can get the ids of the GPUs you want to track via nvidia-smi and do:

from codecarbon import EmissionsTracker

EmissionsTracker(
    ...
    gpu_ids="0,3,4",  # only these GPU ids are tracked
)

Is this what you need?

@lipengfeizju
Author

Thanks! That's exactly what I need.

@lipengfeizju
Author

Sorry to reopen the issue. Is it possible to measure the power of several specific CPU cores (maybe just like we do with the GPU ids)?

@benoit-cty
Contributor

Maybe we could initialize gpu_ids with os.environ.get("CUDA_VISIBLE_DEVICES")?
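A rough sketch of that idea, from the caller's side, might look like the following. The helper `resolve_gpu_ids` is hypothetical and not part of codecarbon. It assumes that `gpu_ids` accepts a comma-separated string as in the example earlier in this thread, and it only handles integer entries in CUDA_VISIBLE_DEVICES.

```python
import os
from typing import Mapping, Optional


def resolve_gpu_ids(
    explicit: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Choose the value to pass as gpu_ids (hypothetical helper).

    Precedence: an explicit value wins, then CUDA_VISIBLE_DEVICES,
    then None, which leaves the tracker's default behaviour untouched.
    """
    if explicit is not None:
        return explicit
    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None or not raw.strip():
        # Unset or empty: fall back to the default. (An empty value hides
        # all GPUs from CUDA; falling back here is a choice of this sketch.)
        return None
    tokens = [token.strip() for token in raw.split(",")]
    if not all(token.isdigit() for token in tokens):
        # UUID or MIG entries are not handled by this sketch.
        return None
    return ",".join(tokens)


# Usage sketch (requires codecarbon to be installed):
# from codecarbon import EmissionsTracker
# tracker = EmissionsTracker(gpu_ids=resolve_gpu_ids())
```

Keeping the explicit argument ahead of the environment variable means existing callers that already pass gpu_ids would see no change in behaviour.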

For CPU cores, could you open another issue and provide the codecarbon debug logs? It's not possible yet, but maybe we could imagine a way to do it.

@lipengfeizju
Author

Thanks!

4 participants