Add GPU utilization meter #406

Closed
ThyrixYang opened this issue Dec 16, 2020 · 28 comments · Fixed by #1288
Labels
needs-discussion 🤔 Changes need to be discussed and require consent new feature Completely new feature

Comments

@ThyrixYang

I think it would be nice to have a usage bar for GPUs, like the one for CPUs.
The only available option is the nvidia-smi command, but it's awkward.

@ThyrixYang ThyrixYang changed the title Add gpu support [Feature Request] Add gpu support Dec 16, 2020
@BenBE BenBE added the new feature Completely new feature label Dec 16, 2020
@BenBE
Member

BenBE commented Dec 16, 2020

While this sounds interesting, it'd be appreciated if we can keep this as vendor-neutral as possible. An implementation should therefore cover most systems, regardless of which vendor supplies the hardware.

Also, as was done with some other features that rely on external libraries, the implementation should try to load the required libraries dynamically at runtime, or avoid external libraries altogether where possible.

@BenBE BenBE added the needs-discussion 🤔 Changes need to be discussed and require consent label Dec 16, 2020
@BenBE BenBE changed the title [Feature Request] Add gpu support Add GPU utilization meter Dec 16, 2020
@fasterit
Member

https://github.com/rib/gputop for inspiration

@ThyrixYang
Author

ThyrixYang commented Dec 16, 2020

It seems that the gputop mentioned by @fasterit only supports Intel GPUs.
As someone who trains deep learning models, I'm looking for a monitor for NVIDIA GPUs. I believe discrete graphics cards need monitoring more than Intel GPUs, since we often work with multiple NVIDIA GPUs.
AMD GPUs are not usable for deep learning training, at least for now, so I think supporting only NVIDIA GPUs would already be very useful for us. It would be better if AMD and Intel GPUs could be supported as well.

@stulluk

stulluk commented Dec 19, 2020

I wish that I could write a script inside htoprc, such that:

...
GpuTemp_handler=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits)
...
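
The nvidia-smi query in the snippet above prints one temperature value per GPU, one per line. Purely as an illustration of what such a wished-for handler would have to do (this is not an htop feature, and the names `parse_temperatures` and `read_gpu_temperatures` are made up for this sketch), the output could be collected in Python roughly like this:

```python
import subprocess

# Same query as in the comment above: one temperature per GPU, no header, no units.
QUERY = ["nvidia-smi", "--query-gpu=temperature.gpu", "--format=csv,noheader,nounits"]


def parse_temperatures(output: str) -> list[int]:
    """Return one integer per numeric output line; anything non-numeric is skipped."""
    temperatures = []
    for line in output.splitlines():
        field = line.strip()
        if field.isdigit():
            temperatures.append(int(field))
    return temperatures


def read_gpu_temperatures() -> list[int]:
    """Run nvidia-smi once; return an empty list if it is missing or fails."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True,
                                check=True, timeout=5)
    except (OSError, subprocess.SubprocessError):
        return []
    return parse_temperatures(result.stdout)


if __name__ == "__main__":
    print(read_gpu_temperatures())
```

Spawning an external command on every refresh is comparatively expensive, which is one reason a built-in meter may be preferable to a scripted one.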

@daniejstriata

If I were to use this feature, I'd expect the results to show me the utilisation of the GPU for doing work with CUDA, OpenGL, OpenCV, Xtend, or OpenCL. Also, I don't see how a busy GPU would be judged by the same metrics as a busy CPU.

@sevagh

sevagh commented Feb 13, 2021

There is an issue on the previous version of the repo: hishamhm/htop#899

Implementing generic custom meters should solve the problem of vendor neutrality.

@NicTanghe

if (card == nvidia)
    data = nvidia-smi -l --query-gpu=timestamp,temperature.gpu,memory.used,memory.free --format=csv
elif (card == AMD)
    data = (AMD version)
elif (card == Intel)
    data = (Intel version)
else
    return "card not supported"

It doesn't have the same tick rate as htop itself, though, so custom code would probably be required.

I'd be willing to have a go at coding it myself, but I'm not a C programmer and I have no idea where to look in the codebase to add something like this.
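
The vendor dispatch in the pseudocode above needs some way to tell which card is present. One possible sketch, assuming a Linux system where each PCI GPU appears as `/sys/class/drm/cardN` with a `device/vendor` file holding the PCI vendor ID, is shown below. The helper names are hypothetical, and this is not how htop does it:

```python
import re
from pathlib import Path

# Well-known PCI vendor IDs for the three vendors named in the pseudocode.
PCI_VENDORS = {0x10DE: "nvidia", 0x1002: "amd", 0x8086: "intel"}


def vendor_name(vendor_id_text: str) -> str:
    """Map the hex text of a sysfs vendor file (e.g. '0x10de') to a vendor name."""
    try:
        vendor_id = int(vendor_id_text.strip(), 16)
    except ValueError:
        return "card not supported"
    return PCI_VENDORS.get(vendor_id, "card not supported")


def detect_gpu_vendors(drm_root: str = "/sys/class/drm") -> dict[str, str]:
    """Return {card name: vendor} for every cardN entry that has a vendor file."""
    vendors = {}
    for card_dir in sorted(Path(drm_root).glob("card*")):
        # Skip connector entries such as card0-HDMI-A-1.
        if not re.fullmatch(r"card\d+", card_dir.name):
            continue
        try:
            text = (card_dir / "device" / "vendor").read_text()
        except OSError:
            continue  # e.g. a non-PCI GPU without a vendor file
        vendors[card_dir.name] = vendor_name(text)
    return vendors


if __name__ == "__main__":
    print(detect_gpu_vendors())
```

Each detected vendor would still need its own data source, which is exactly the per-vendor branching the pseudocode describes.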

@natoscott
Member

natoscott commented Oct 4, 2022

Hi folks,
You can achieve this in htop today if you use the pcp-htop variant and the NVIDIA PCP metrics. This is probably the best long-term solution here, since the way the metrics are extracted (at least in the case of NVIDIA) requires a separate daemon, as was done for the 'gputop server'; this is also the architecture PCP already provides.

https://man7.org/linux/man-pages/man1/pmdanvidia.1.html
https://man7.org/linux/man-pages/man1/pcp-htop.1.html

@NicTanghe

NicTanghe commented Oct 5, 2022

I've installed the packages, but when I run pcp htop

I seem to just get an htop which doesn't even show my running processes.

Also, there is no GPU option in the header layout menu.

@natoscott
Member

[...] I seem to just get an htop which doesn't even show my running processes.

Hmm, maybe an installation issue; everything's working fine here. Can you fetch values via:

pminfo --fetch proc.psinfo.rss

Which Linux distribution? PCP version? Can you paste output from 'pcp summary'?

[...] also no GPU option in headers layout menu.

Yep, you're blazing a trail here - this will involve adding a new text config file alongside the others below pcp/meters/ in the htop repo specifying which metrics you want to display.

@ghost

ghost commented Oct 21, 2022

same here as @NicTanghe's issue
Linux 6.0.2-arch1-1
also:

pcp-summary: Cannot connect to PMCD on host "local:": Connection refused

@natoscott
Member

"Connection refused" means pmcd(1) is not running - try 'systemctl start pmcd' (or equivalent for your local init system). If pmcd isn't available you likely haven't installed pmdanvidia(1) either.

@NicTanghe

I've been busy with other stuff and probably won't reply any time soon.

@benjamin051000

see https://github.com/Syllo/nvtop, they seem to have this figured out. htop needs this!

@BenBE
Member

BenBE commented Aug 24, 2023

Thank you for this pointer. When we last had a look at the GPU utilization stuff there was no real unified interface available yet. But given that fdinfo seems to be the way to go, this seems to have become reasonable to implement and maintain.

@benjamin051000: Do you mind helping with a PR for initial support for these?
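
For context on the fdinfo approach mentioned above: on Linux, DRM drivers that implement the kernel's client usage statistics expose per-file-descriptor `key: value` lines under `/proc/<pid>/fdinfo/<fd>`, including `drm-driver`, `drm-client-id` and cumulative `drm-engine-<name>: <N> ns` busy times. The following is a rough sketch of parsing such text and turning two samples into a utilization figure. It is not htop's implementation, the function names are invented for illustration, and which keys actually appear depends on the driver:

```python
def parse_drm_fdinfo(text: str) -> dict:
    """Extract the driver name, client id and per-engine busy time (ns) from fdinfo text."""
    info = {"driver": None, "client_id": None, "engine_ns": {}}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "drm-driver":
            info["driver"] = value
        elif key == "drm-client-id" and value.isdigit():
            info["client_id"] = int(value)
        elif (key.startswith("drm-engine-")
              and not key.startswith("drm-engine-capacity-")):
            fields = value.split()
            # Expected form: "<unsigned integer> ns"
            if len(fields) == 2 and fields[1] == "ns" and fields[0].isdigit():
                info["engine_ns"][key[len("drm-engine-"):]] = int(fields[0])
    return info


def busy_percent(prev_ns: int, curr_ns: int, interval_ns: int) -> float:
    """Utilization between two samples: busy-time delta over wall-clock interval."""
    if interval_ns <= 0 or curr_ns < prev_ns:
        return 0.0
    return min(100.0, 100.0 * (curr_ns - prev_ns) / interval_ns)
```

Because the engine counters are cumulative, a monitor has to sample them twice and divide the difference by the elapsed time. Several file descriptors of one process can share the same `drm-client-id`, so a real implementation would count each client only once.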

@djh00t

djh00t commented Aug 29, 2023

I know this is scope creep, but adding Apple Metal GPU stats would be awesome too.

@BenBE BenBE linked a pull request Aug 29, 2023 that will close this issue
@stulluk

stulluk commented Mar 29, 2024

I confirm this works on my Ryzen 5700G out of the box. I compiled htop as follows:

./autogen.sh
./configure --enable-static --enable-sensors
make

It can even show GPU usage time per process!

Thank you so much.

@Samueru-sama

Samueru-sama commented May 20, 2024

Sorry how does one enable this? I have an RX 580 and I can't find the option to enable the GPU usage in htop. (I also tried building it manually like you did)

@stulluk

stulluk commented May 20, 2024

Press F2. Scroll down to "Meters". In the 4th column, you will see "GPU usage". Focus on it and then press ENTER to move it to the 2nd column.

image

Hope this helps.

@Samueru-sama

Yeah it is not on my system.

image

Thank you either way, now I know that at least the issue isn't that I couldn't find the option.

@stulluk

stulluk commented May 20, 2024

Not sure if this helps, but I wanted to share my htoprc for you:

stulluk ~ $  cat .config/htop/htoprc 
# Beware! This file is rewritten by htop when settings are changed in the interface.
# The parser is also very primitive, and not human-friendly.
htop_version=3.4.0-dev
config_reader_min_version=3
fields=0 48 17 18 38 39 40 2 46 47 132 49 1
hide_kernel_threads=1
hide_userland_threads=1
hide_running_in_container=0
shadow_other_users=0
show_thread_names=1
show_program_path=1
highlight_base_name=0
highlight_deleted_exe=1
shadow_distribution_path_prefix=0
highlight_megabytes=1
highlight_threads=1
highlight_changes=0
highlight_changes_delay_secs=5
find_comm_in_cmdline=1
strip_exe_from_cmdline=1
show_merged_command=0
header_margin=1
screen_tabs=0
detailed_cpu_time=0
cpu_count_from_one=0
show_cpu_usage=1
show_cpu_frequency=1
show_cpu_temperature=1
degree_fahrenheit=0
update_process_names=0
account_guest_in_cpu_meter=0
color_scheme=0
enable_mouse=1
delay=15
hide_function_bar=0
header_layout=two_50_50
column_meters_0=LeftCPUs2 Memory DiskIO NetworkIO Swap
column_meter_modes_0=1 1 2 2 1
column_meters_1=RightCPUs2 Tasks LoadAverage Systemd GPU
column_meter_modes_1=1 2 2 2 1
tree_view=0
sort_key=46
tree_sort_key=46
sort_direction=-1
tree_sort_direction=-1
tree_view_always_by_pid=0
all_branches_collapsed=0
screen:Main=PID USER PRIORITY NICE M_VIRT M_RESIDENT M_SHARE STATE PERCENT_CPU PERCENT_MEM GPU_PERCENT TIME Command
.sort_key=PERCENT_CPU
.tree_sort_key=PERCENT_CPU
.tree_view_always_by_pid=0
.tree_view=0
.sort_direction=-1
.tree_sort_direction=-1
.all_branches_collapsed=0
screen:I/O=PID USER IO_PRIORITY IO_RATE IO_READ_RATE IO_WRITE_RATE Command
.sort_key=IO_RATE
.tree_sort_key=PID
.tree_view_always_by_pid=0
.tree_view=0
.sort_direction=-1
.tree_sort_direction=1
.all_branches_collapsed=0
stulluk ~ $ 

Other than this, I am just wondering whether you installed "libsensors5" and rebooted, or whether you have run "sensors-detect"?
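
In the htoprc shown above, the GPU meter appears as the `GPU` entry in `column_meters_1`. As a small sketch, assuming the `column_meters_N=` layout seen in that file (the function names are hypothetical), one could check whether a given htoprc requests the meter at all:

```python
import re


def configured_meters(htoprc_text: str) -> dict[str, list[str]]:
    """Return {'column_meters_N': [meter names]} for each header column in an htoprc."""
    meters = {}
    for line in htoprc_text.splitlines():
        # Matches column_meters_0, column_meters_1, ... but not column_meter_modes_N.
        match = re.match(r"(column_meters_\d+)=(.*)$", line.strip())
        if match:
            meters[match.group(1)] = match.group(2).split()
    return meters


def has_gpu_meter(htoprc_text: str) -> bool:
    """True if any header column lists a meter named exactly 'GPU'."""
    return any("GPU" in names for names in configured_meters(htoprc_text).values())
```

A matching entry only shows that the configuration asks for the meter. Whether the htop binary in use actually provides it is a separate question, so this check cannot confirm that the meter will be displayed.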

@Samueru-sama

Didn't work. Could you share the htop binary that works on your end? It might be that my GPU isn't supported.

Yes, lm_sensors is installed; it is even a dependency of mesa (that's what I assume you meant by libsensors5, since that package isn't on the Arch-based distro that I use).

This is what I get when running sensors-detect:
image

@stulluk

stulluk commented May 21, 2024

htop-x86-64-3.4.0-stulluk-gpu-works-static-compile.tar.gz

I hope it helps.

MD5SUM: f8b654c937c72591d9a3a5599cfd6cef
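
Before running a downloaded binary, the archive can be checked against the MD5 sum posted above. A minimal sketch using Python's standard `hashlib` follows. The file name and checksum are the ones from this comment, the archive is assumed to be in the current directory, and `file_md5` is a helper name chosen for this example:

```python
import hashlib
import os

ARCHIVE = "htop-x86-64-3.4.0-stulluk-gpu-works-static-compile.tar.gz"
EXPECTED_MD5 = "f8b654c937c72591d9a3a5599cfd6cef"


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if os.path.exists(ARCHIVE):
        actual = file_md5(ARCHIVE)
        print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: {actual}")
    else:
        print(f"{ARCHIVE} not found in the current directory")
```

An MD5 match only guards against a corrupted download; it says nothing about whether the binary is trustworthy.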

@Samueru-sama

Samueru-sama commented May 21, 2024

Thank you, this works; it looks like I have an issue with libraries on my end. I just tried to build it again and I can't compile statically on Artix Linux, even though I have all the libraries needed.

Omg, this has taken so long that I give up; I can't get it to compile with that feature.

I tried the official Arch package, I downloaded the Debian package as well, and I also built the htop package statically using GitHub workflows on an Ubuntu machine. None of them gave me an htop with a working GPU meter; the only one that has it is your binary.

@myclevorname

The GPU meter is not on my htop either. I am using NixOS unstable with htop 3.3.0.

@stulluk

stulluk commented Jun 22, 2024

If I am not mistaken, this feature was added in 3.4.0 (see my build version above).

@myclevorname

Thanks.

@gsoul

gsoul commented Oct 30, 2024

Since htop 3.4 does not seem to be available yet, I made this small script for myself. Maybe it will help somebody else as well:

tmux new-session -s htop_nvtop -d
tmux split-window -h
tmux send-keys -t 0 "htop" C-m
tmux send-keys -t 1 "nvtop" C-m
tmux attach-session -t htop_nvtop
