
Pixi environment activation slow due to invocation of nvidia-smi #2430

Open
2 tasks done
pkgw opened this issue Nov 6, 2024 · 2 comments

pkgw commented Nov 6, 2024

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pixi, using pixi --version.

Reproducible example

This happens on my computer with any Pixi tree; description of the circumstances below.

Issue description

On my Linux laptop, any operation that needs to activate a Pixi environment is very slow. For instance:

$ time pixi run echo ok
ok

real     0m9.416s
user     0m3.700s
sys      0m3.999s

That is, it takes almost 10 seconds for pixi run echo ok to complete.

I have traced this to some part of the activation process invoking the program nvidia-smi --query -u -x. On my Linux laptop running Fedora 40, this program is quite slow: run on its own, it takes about 10 seconds, and the output I get is:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. 
Make sure that the latest NVIDIA driver is installed and running.

This is probably happening because my laptop has an NVIDIA card, and I have the Linux NVIDIA userspace installed, but most of the time I have the actual NVIDIA kernel drivers disabled because I need some of the functionality of the open-source Nouveau driver. I'd bet that if I booted up with the actual NVIDIA drivers enabled, the program would run a lot faster.

As far as I can tell, this nvidia-smi invocation isn't part of my environments' activation scripts, so I think that Pixi itself must be running it as part of the activation process. (Also, activations with plain Conda are nowhere near this slow.) Is that true? Is there a way to avoid it? I'm clearly in a bit of a corner case, so if the call can't be skipped in general, maybe there could be some kind of low-level configuration setting that disables it for situations like mine.

Expected behavior

Activation is fast on my laptop.


wolfv commented Nov 6, 2024

Wow, that's really slow.
We use nvidia-smi to detect the version for the __cuda virtual package. We should double-check what conda is doing ...
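For illustration, here is a minimal shell sketch (not pixi's actual Rust implementation) of how such a probe could be bounded: run nvidia-smi under coreutils timeout so that a wedged driver scan is abandoned quickly instead of stalling activation for ~10 seconds. The one-second limit and the sed-based XML scraping are assumptions chosen for brevity.

```shell
# Sketch: bound the CUDA-version probe so a slow or failing nvidia-smi
# cannot stall environment activation. `timeout` is GNU coreutils; the
# 1-second limit is an illustrative choice, not pixi's actual behavior.
probe_cuda_version() {
    timeout 1 nvidia-smi --query -u -x 2>/dev/null |
        sed -n 's:.*<cuda_version>\([^<]*\)</cuda_version>.*:\1:p'
}

cuda_version=$(probe_cuda_version)
# An empty result means no usable driver was found within the time limit;
# the __cuda virtual package would simply not be set in that case.
```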


pkgw commented Nov 7, 2024

Yeah, I think it's so slow because of my special situation where the NVIDIA tool is available but the kernel drivers aren't. If I strace the program, it's iterating through a bunch of stuff in /sys, so I think it's probing for the hardware in some exhaustive way that normally gets the right answer very quickly.

My environments don't need anything to do with GPUs, so if there's some way to avoid that version test until/unless __cuda is needed, that would improve my experience. Or some kind of hack where I can put something along the lines of

avoid_cuda = true

in my pixi.toml files — not particularly elegant, but would make a difference for me in practice.
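In the meantime, a possible user-side workaround (an untested sketch, not a pixi feature): shadow nvidia-smi with a stub that fails instantly, so any probe returns immediately instead of scanning /sys. The stub directory path is an arbitrary choice.

```shell
# Untested workaround sketch: put a stub nvidia-smi earlier on PATH so any
# probe fails instantly instead of spending ~10 s scanning /sys.
mkdir -p "$HOME/.local/stub-bin"
cat > "$HOME/.local/stub-bin/nvidia-smi" <<'EOF'
#!/bin/sh
# Report immediately that the NVIDIA driver is unavailable.
exit 1
EOF
chmod +x "$HOME/.local/stub-bin/nvidia-smi"
export PATH="$HOME/.local/stub-bin:$PATH"
```

The obvious caveat: the stub must be removed (or the PATH entry dropped) whenever the real NVIDIA drivers are booted and actually needed.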
