fix grafana dashboard and clarify dashboard usage more clearly. #543

Open
wants to merge 1 commit into
base: master

jiangsanyin
Signed-off-by: jiangsanyin <1327212357@qq.com>

What type of PR is this?
/kind bug

What this PR does / why we need it:
Fix the Grafana dashboard and clarify the dashboard usage documentation. Thanks to fangfenghuang (https://github.com/fangfenghuang) for the help.

Which issue(s) this PR fixes:
Fixes #498 #468

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
resources:
  limits:
    nvidia.com/vgpu: 2 # requesting 2 vGPUs

Should "nvidia.com/vgpu" be "nvidia.com/gpu"?

Author

> Should "nvidia.com/vgpu" be "nvidia.com/gpu"?

I forgot to explain this: it depends on the individual setup.
To distinguish the resource from "nvidia.com/gpu" in nvidia-device-plugin, I used the resourceName parameter and set its value to "nvidia.com/vgpu", for example:
helm install hami hami-charts/hami --set resourceName=nvidia.com/vgpu --set scheduler.kubeScheduler.imageTag=v1.23.10 -n kube-system
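
For illustration, here is a minimal sketch of a Pod manifest that would match an install like the one above. It assumes the chart was installed with resourceName=nvidia.com/vgpu. The Pod and container names are hypothetical and not taken from this PR; only the image and the resource limit come from the snippet under review.

```yaml
# Sketch only: metadata.name and the container name are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod                # hypothetical name
spec:
  containers:
    - name: cuda-vectoradd     # hypothetical name
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
      resources:
        limits:
          # This key must equal the resourceName given at install time.
          # If resourceName was not overridden, use the resource name
          # your installation actually advertises instead.
          nvidia.com/vgpu: 2   # requesting 2 vGPUs
```

If the key in limits does not match a resource advertised on any node, the Pod would be expected to stay Pending, so the documentation example only works as written for installs that set this resourceName.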

@wawa0210
Member

@fangfenghuang Can you help review this PR?

codecov bot commented Oct 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Flag Coverage Δ
unittests 27.09% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

@fangfenghuang fangfenghuang left a comment

Fix some HTTP URLs.


You can see the monitoring details in the dashboard. The contents are as follows:

![image-20241003215400685](https://s2.loli.net/2024/10/03/RFJuthzAGYw5UHk.png)

It is best to place the referenced images in ../imgs/.

Author

OK, the changes have been made.

…he image display problem in document and document format

Signed-off-by: jiangsanyin <1327212357@qq.com>
Development

Successfully merging this pull request may close these issues.

monitoring data node_name not exists ,GPU power usage is not correct in Grafana Dashboard
4 participants