Triton: use CUDA 12.3 tools from the base image #656
Conversation
Given that the paths are hard-coded, can we add a test so that we get notified if the binaries change locations? e.g.
RUN if [[ ! -x ${TRITON_PTXAS_PATH} ]]; then <THROW-ERROR>; fi
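A runnable sketch of the suggested guard (the `check_tool` helper name is an assumption for illustration; in the Dockerfile the check would be inlined in a single RUN step, and `/bin/sh` stands in for `${TRITON_PTXAS_PATH}` so the sketch runs anywhere):

```shell
#!/bin/sh
# Hypothetical helper sketching the reviewer's suggestion: fail the build
# fast when a hard-coded tool path no longer points at an executable.
check_tool() {
    if [ ! -x "$1" ]; then
        echo "ERROR: expected an executable at $1" >&2
        return 1
    fi
    echo "OK: found $1"
}

# In the real Dockerfile this argument would be "${TRITON_PTXAS_PATH}".
check_tool /bin/sh
```

Because the check runs at image-build time, a CUDA base-image update that moves the binaries breaks the build immediately instead of failing later at runtime.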
Thank you for the fix! Left a minor suggestion adding some context.
Should we fix this upstream?
@andportnoy approved this pull request.
In .github/container/Dockerfile.triton:
@@ -2,10 +2,18 @@
ARG BASE_IMAGE=ghcr.io/nvidia/jax-mealkit:jax
ARG SRC_PATH_TRITON=/opt/openxla-triton
+FROM ${BASE_IMAGE} as base
+# Tell Triton to use CUDA binaries from the host container. These should be set
⬇️ Suggested change
-# Tell Triton to use CUDA binaries from the host container. These should be set
+# Triton setup.py downloads and installs CUDA binaries at specific versions
+# hardcoded in the script itself:
+# https://github.com/openxla/triton/blob/84f9d9de158fb866fac67970f0f5d323999d9db1/python/setup.py#L373-L393
+# Tell Triton to use CUDA binaries from the host container instead. These should be set
Great suggestion, Andrey. Good job.
Co-authored-by: Andrey Portnoy <aportnoy@nvidia.com>
Force-pushed from 4289497 to fe2f64f
I'm not sure what that would look like. What did you have in mind?
I thought it was for Triton via Pallas. All is good.
Previously, Triton would download its own copies of ptxas, cuobjdump and nvdisasm: https://github.com/openxla/triton/blob/cl617459344/python/setup.py#L373-L393

This began to cause problems when those pinned versions were bumped to CUDA 12.4, meaning that Triton started to generate PTX with ISA version 8.4. When that PTX was compiled inside XLA using the older ptxas from the base container, there were errors in the nightly tests, which are taken from JAX-Triton.
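The failure mode can be sketched as a version-compatibility check. This is an illustrative model, not Triton or XLA code; the CUDA-to-PTX-ISA mapping below is a subset of the version history in NVIDIA's PTX ISA documentation:

```python
# Each CUDA toolkit's compiler emits a specific PTX ISA version, and its
# ptxas cannot compile PTX whose .version is newer than that.
PTX_ISA_BY_CUDA = {"12.2": (8, 2), "12.3": (8, 3), "12.4": (8, 4)}

def ptxas_accepts(ptxas_cuda: str, ptx_isa: tuple) -> bool:
    """True if a ptxas from the given CUDA toolkit can compile PTX at
    the given ISA version (tuple comparison handles major.minor)."""
    return ptx_isa <= PTX_ISA_BY_CUDA[ptxas_cuda]

# Triton's downloaded CUDA 12.4 tools emit PTX ISA 8.4, which the base
# container's CUDA 12.3 ptxas rejects, while 8.3 PTX still compiles.
print(ptxas_accepts("12.3", PTX_ISA_BY_CUDA["12.4"]))  # → False
print(ptxas_accepts("12.3", PTX_ISA_BY_CUDA["12.3"]))  # → True
```

Pinning Triton to the base image's tools keeps both sides of this comparison at the same CUDA version, which is why the fix works.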
Setting environment variables like TRITON_PTXAS_PATH has two effects: Triton's setup.py does not download its own copies of the binaries, and Triton invokes the binaries at the given paths at runtime.
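The runtime half of this can be sketched as an environment-variable override lookup (`resolve_tool` and the bundled paths are hypothetical names for illustration; Triton's actual lookup lives in its own internals):

```python
import os

def resolve_tool(name: str, bundled_path: str) -> str:
    # Hypothetical helper: an env var such as TRITON_PTXAS_PATH, when set,
    # overrides the path to the copy Triton would otherwise have downloaded.
    return os.environ.get(f"TRITON_{name.upper()}_PATH", bundled_path)

# Assumed host path, as set in the Dockerfile:
os.environ["TRITON_PTXAS_PATH"] = "/usr/local/cuda/bin/ptxas"

print(resolve_tool("ptxas", "/opt/triton/bin/ptxas"))
# → /usr/local/cuda/bin/ptxas (env var set, host copy wins)
print(resolve_tool("nvdisasm", "/opt/triton/bin/nvdisasm"))
# → /opt/triton/bin/nvdisasm (no env var, bundled copy used)
```

Setting the variables in the image (rather than per-process) makes the override apply both at build time, when setup.py runs, and later at runtime.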
If Triton starts depending on new features before the base container is updated to CUDA 12.4, problems may resurface.
Thanks to @andportnoy for help debugging.