Investigate Flakey Tests in E2E CI #40

aiyengar2 · 2022-12-19T18:22:52Z

The following dashboard queries always seem to be flakey in E2E CI:

db/kubernetes-compute-resources-namespace-pods/IOPS(Reads+Writes)_query0
db/kubernetes-compute-resources-namespace-pods/Current_Storage_IO_query0
db/kubernetes-compute-resources-namespace-pods/Current_Storage_IO_query1
db/kubernetes-compute-resources-namespace-pods/Current_Storage_IO_query2

db/kubernetes-compute-resources-pod/IOPS_query0
db/kubernetes-compute-resources-pod/IOPS_query1
db/kubernetes-compute-resources-pod/IOPS(Reads+Writes)_query0
db/kubernetes-compute-resources-pod/Current_Storage_IO_query0
db/kubernetes-compute-resources-pod/Current_Storage_IO_query1
db/kubernetes-compute-resources-pod/Current_Storage_IO_query2

db/kubernetes-compute-resources-project/IOPS(Reads+Writes)_query0
db/kubernetes-compute-resources-project/Current_Storage_IO_query0
db/kubernetes-compute-resources-project/Current_Storage_IO_query1
db/kubernetes-compute-resources-project/Current_Storage_IO_query2

In #39, these tests are automatically skipped, but we should investigate why these dashboards intermittently return no data in some runs and fix the underlying cause.
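One way to tell "no data yet" apart from "never any data" while investigating is to poll a query several times before declaring it empty. The sketch below is a minimal illustration of that idea, not the repository's actual test code. The `wait_for_data` helper, its parameters, and the `run_query` callable are all hypothetical names; `run_query` stands in for whatever the E2E suite uses to execute a dashboard query.

```python
import time
from typing import Callable, Sequence


def wait_for_data(
    run_query: Callable[[], Sequence],
    attempts: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as run_query() yields a non-empty result.

    run_query is a hypothetical stand-in for executing one dashboard query.
    Returns False if every attempt comes back empty.
    """
    for attempt in range(attempts):
        if len(run_query()) > 0:
            return True
        # Only wait when another attempt is still to come.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

If a query such as `Current_Storage_IO_query0` only passes after several attempts, that would suggest the metrics simply arrive late; if it never passes, the series is probably absent in that environment.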


aiyengar2 · 2022-12-19T18:26:03Z

In addition to fixing these flakey tests, we should add a CI step that performs a helm install of Longhorn before creating the ProjectHelmChart. That lets the ProjectHelmChart enable persistent storage for Prometheus and Grafana, so that the Persistent Volume metrics that are currently being skipped can be validated.

aiyengar2 · 2022-12-19T18:50:51Z

We should also investigate the issues in https://github.com/rancher/prometheus-federator/actions/runs/3734091815/jobs/6335719834; this is probably related to rancher/rancher#39430

aiyengar2 assigned aiyengar2 and geethub97 Dec 19, 2022

github-actions bot added the team/area3 label Dec 19, 2022

aiyengar2 added the area/monitoring, bug, and [zube]: To Triage labels and removed the team/area3 label Dec 19, 2022

aiyengar2 mentioned this issue Dec 19, 2022

Bump Helm Project Operator to add 1.25 support for Prometheus Federator main chart and fix Grafana templating bug #39

Merged

2 tasks

This was referenced Dec 20, 2022

Update BCI micro version to 15.4 #41

Merged

Update Go to 1.19 and bump some dependencies #42

Merged

MKlimuszka unassigned aiyengar2 Sep 20, 2023

MKlimuszka added team/opni and removed [zube]: To Triage labels Sep 20, 2023

alexandreLamarre removed the team/opni label Feb 14, 2024

jbiers added the team/observability&backup label Jul 29, 2024
