~NGC release testing #112

Triggered: manually, October 23, 2024 07:11
Status: Failure
Total duration: 56m 39s
Artifacts: 31

ngc-release-testing.yaml

on: workflow_dispatch
Matrix: test-maxtext / maxtext-multinode
Matrix: test-maxtext / single-process-multi-device
Matrix: test-jax / run-unit-test
Matrix: test-rosetta-pax / rosetta-pax-multi-node-te
Matrix: test-rosetta-pax / rosetta-pax-multi-node
Matrix: test-rosetta-pax / rosetta-pax-single-node-dropout-te
Matrix: test-rosetta-pax / single-process-evaluation-te
Matrix: test-rosetta-pax / single-process-multi-device-te
test-jax / runner / launch-slurm-runner               18m 19s
test-maxtext / test-maxtext-summary                   0s
test-maxtext / test-maxtext-metrics                   14s
test-rosetta-pax / test-pax-rosetta-summary           0s
test-rosetta-pax / test-pax-rosetta-metrics           20s
test-maxtext / test-maxtext-sitrep / sitrep           17s
test-rosetta-pax / test-pax-rosetta-sitrep / sitrep   12s
test-maxtext / test-maxtext-outcome                   0s
test-rosetta-pax / test-pax-rosetta-outcome           0s
finalize / workflow-badge                             4s
finalize / report                                     6s
finalize / upload-badge                               7s
finalize / publish-badge                              4s

Annotations

1 error and 4 warnings
test-rosetta-pax / test-pax-rosetta-outcome
    Process completed with exit code 1.
test-maxtext / maxtext-multinode (1, 1, 8, 1)
    Failed to download action 'https://api.github.com/repos/webfactory/ssh-agent/tarball/dc588b651fe13675774614f8e6a936a468676387'. Error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
test-maxtext / maxtext-multinode (1, 1, 8, 1)
    Back off 18.25 seconds before retry.
test-rosetta-pax / single-process-evaluation-te (1, 8, 1, 1)
    Failed to download action 'https://api.github.com/repos/webfactory/ssh-agent/tarball/dc588b651fe13675774614f8e6a936a468676387'. Error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
test-rosetta-pax / single-process-evaluation-te (1, 8, 1, 1)
    Back off 12.94 seconds before retry.

Artifacts

Produced during runtime
Name                                                        Size
artifact-final-report                                       2.92 KB
artifact-maxtext-test                                       670 Bytes
artifact-rosetta-pax-mgmn-test                              2 KB
artifact-workflow-metadata                                  269 Bytes
jax-unit-test-A100                                          16.9 KB
jax-unit-test-V100                                          17.9 KB
rosetta-pax-11474884093-16DP1FSDP1TP1PP_TE                  474 KB
rosetta-pax-11474884093-1DP1FSDP1TP1PP_TE                   86.8 KB
rosetta-pax-11474884093-1DP2FSDP4TP1PP_single_process_TE    63.3 KB
rosetta-pax-11474884093-1DP8FSDP1TP1PP_TE                   259 KB
rosetta-pax-11474884093-2DP1FSDP1TP4PP                      268 KB
rosetta-pax-11474884093-2DP1FSDP2TP4PP                      499 KB
rosetta-pax-11474884093-4DP1FSDP2TP1PP                      362 KB
rosetta-pax-11474884093-4DP1FSDP2TP1PP_TE                   262 KB
rosetta-pax-11474884093-5B_fused_attn_0                     167 KB
rosetta-pax-11474884093-5B_fused_attn_1                     169 KB
rosetta-pax-11474884093-8DP1FSDP1TP1PP                      364 KB
rosetta-pax-11474884093-8DP1FSDP1TP1PP_TE                   258 KB
rosetta-pax-11474884093-8DP1FSDP1TP1PP_eval_TE              77.4 KB
rosetta-pax-11474884093-8DP1FSDP1TP1PP_single_process_TE    63.2 KB
rosetta-pax-11474884093-8DP_TE_dropout                      265 KB
rosetta-pax-11474884093-LLaMA_eval_TE                       155 KB
rosetta-pax-metrics-test-log                                7.86 KB
upstream-maxtext-11474884093-1DP1FSDP1TP1PP                 13.9 KB
upstream-maxtext-11474884093-1DP1FSDP8TP1PP                 18.8 KB
upstream-maxtext-11474884093-1DP2FSDP4TP1PP_single_process  14.1 KB
upstream-maxtext-11474884093-1DP4FSDP2TP1PP                 19.1 KB
upstream-maxtext-11474884093-1DP8FSDP1TP1PP                 19 KB
upstream-maxtext-11474884093-2DP2FSDP2TP1PP                 19 KB
upstream-maxtext-11474884093-4DP2FSDP2TP1PP                 24.5 KB
upstream-maxtext-metrics-test-log                           1.85 KB