Merge main into protected branch 24.10-devel #1062

Closed: DwarKapex wants to merge 11 commits

DwarKapex
Contributor

Merge main to stay up to date before the code freeze for the NGC 24.10 release.

kocchop and others added 8 commits September 5, 2024 11:57
1. `xla_gpu_enable_triton_gemm` is still needed.
2. Removed some other deprecated XLA flags:
   `xla_gpu_enable_triton_softmax_fusion`
3. Also removed some XLA flags that are now turned on by default:
   `xla_enable_async_all_gather` etc.
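
As a rough illustration of the flag cleanup described above, the sketch below drops deprecated flags before exporting the rest through `XLA_FLAGS`. The helper name `build_xla_flags`, the flag values, and the contents of the `DEPRECATED` set are assumptions made for illustration. Only the flag names come from the commit message.

```python
import os

# Flags the commit message says were removed as deprecated.
# Only the one named in the message is listed; the real list may be longer.
DEPRECATED = {"xla_gpu_enable_triton_softmax_fusion"}


def build_xla_flags(flags):
    """Render a dict of flag name -> value as an XLA_FLAGS string,
    skipping any flag listed in DEPRECATED."""
    parts = []
    for name, value in flags.items():
        if name in DEPRECATED:
            continue
        # Write booleans as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"--{name}={value}")
    return " ".join(parts)


# Hypothetical values: the commit only says the triton_gemm flag is still needed.
wanted = {
    "xla_gpu_enable_triton_gemm": False,
    "xla_gpu_enable_triton_softmax_fusion": True,
}
xla_flags = build_xla_flags(wanted)

# The variable has to be set before JAX initialises its backend to take effect.
os.environ["XLA_FLAGS"] = xla_flags
```

Keeping the deprecated names in one set means a later removal only touches that set, not every caller that assembles flags.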
Fixed the TensorBoard directory path after a recent change in MaxText:
AI-Hypercomputer/maxtext#863
Example as of 8-28-2024

```
$ docker run --entrypoint='' --rm -it ghcr.io/nvidia/jax:pax-2024-08-28 ls -lah /opt/jaxlibs
total 20K
drwxr-xr-x 1 root root 4.0K Aug 28 09:43 .
drwxr-xr-x 1 root root 4.0K Aug 28 10:04 ..
drwx------ 1 root root 4.0K Aug 28 09:43 jax_gpu_pjrt
drwx------ 1 root root 4.0K Aug 28 09:43 jax_gpu_plugin
drwx------ 1 root root 4.0K Aug 28 09:43 jaxlib
```

Signed-off-by: Terry Kong <terryk@nvidia.com>
Provide an option to run XLA cuDNN flash attention as an alternative to
TE cuDNN flash attention.
Forced by this change in the JAX build system:
jax-ml/jax#23787
Co-authored-by: Olli Lupton <olupton@nvidia.com>
olupton previously approved these changes Sep 26, 2024
Moves XLA flags from model CI into their own files that can be sourced.
Each file can be sourced and will print what it sets.

Some files source other files. This is intentional: it avoids
introducing symlinks into the repo, which can cause platform
issues (for example on Windows).

---------

Signed-off-by: Terry Kong <terryk@nvidia.com>
DwarKapex and others added 2 commits September 27, 2024 10:45
The latest MaxText uses `pathwayutils`, which was added as a dependency.
It needs to be added to our manifest.yaml file to resolve a reference
issue during final installation.
…tives (#1073)

This adds logic to treat `dynamic[-update]-slice` operations that have a
source/destination operand in the host memory space as being
communication operations, labelling them as single-device "collectives".

The goal is to improve support for analysing profiles of execution
including offloading to host memory.
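
The labelling rule described above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Op` class, the `HOST`/`DEVICE` labels, and the `classify` helper are all hypothetical stand-ins for whatever representation the profile analysis actually uses.

```python
from dataclasses import dataclass

HOST = "host"      # hypothetical label for the host memory space
DEVICE = "device"  # hypothetical label for device memory

SLICE_OPCODES = {"dynamic-slice", "dynamic-update-slice"}


@dataclass
class Op:
    """Hypothetical, simplified view of one HLO instruction."""
    opcode: str
    operand_spaces: tuple       # memory space of each operand
    result_space: str = DEVICE  # memory space of the result


def is_host_transfer(op):
    """True if a dynamic[-update]-slice reads from or writes to host memory."""
    if op.opcode not in SLICE_OPCODES:
        return False
    return HOST in op.operand_spaces or op.result_space == HOST


def classify(op):
    # Host-offload slices are labelled as single-device "collectives".
    # Genuine multi-device collectives are outside the scope of this sketch.
    return "collective" if is_host_transfer(op) else "other"


# Offload to host, fetch back from host, and a purely on-device slice.
offload = Op("dynamic-update-slice", (DEVICE, DEVICE), HOST)
fetch = Op("dynamic-slice", (HOST,), DEVICE)
local = Op("dynamic-slice", (DEVICE,), DEVICE)

for op in (offload, fetch, local):
    print(op.opcode, "->", classify(op))
```

Treating these transfers as communication lets them be analysed alongside other communication operations in a profile, rather than being counted as ordinary compute.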

Also fix support for nsys 2024.6 by applying the same patch as for
2024.5, which adds the thread ID.
@DwarKapex DwarKapex closed this Oct 1, 2024
6 participants