Merge main into protected branch 24.10-devel #1062

Closed
11 commits

Commits on Sep 5, 2024

  1. remove deprecated XLA flag (#1010)

    1. `xla_gpu_enable_triton_gemm` is still needed.
    2. Removed some other deprecated XLA flags: `xla_gpu_enable_triton_softmax_fusion`.
    3. Also removed some XLA flags that are now turned on by default: `xla_enable_async_all_gather` etc.
    kocchop authored Sep 5, 2024
    ecacd5b

Commits on Sep 6, 2024

  1. fix tensorboard events dir path (#1032)

    Fixed the TensorBoard events directory path after a recent change in
    MaxText:
    AI-Hypercomputer/maxtext#863
    kocchop authored Sep 6, 2024
    44b4dfe
  2. Makes jaxlib wheel dirs readable for non-root users (#1023)

    Example as of 8-28-2024
    
    ```
    $ docker run --entrypoint='' --rm -it ghcr.io/nvidia/jax:pax-2024-08-28 ls -lah /opt/jaxlibs
    total 20K
    drwxr-xr-x 1 root root 4.0K Aug 28 09:43 .
    drwxr-xr-x 1 root root 4.0K Aug 28 10:04 ..
    drwx------ 1 root root 4.0K Aug 28 09:43 jax_gpu_pjrt
    drwx------ 1 root root 4.0K Aug 28 09:43 jax_gpu_plugin
    drwx------ 1 root root 4.0K Aug 28 09:43 jaxlib
    ```
    
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    terrykong authored Sep 6, 2024
    f808df5
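
The permission problem from #1023 (the `drwx------` wheel directories in the listing above) can be reproduced and fixed on a scratch tree. This is only a sketch: the actual change made in that commit is not shown here, and it assumes a recursive `chmod a+rX` is an acceptable fix. The helper name `make_world_readable` and the scratch file `example.whl` are hypothetical.

```shell
set -eu

# Hypothetical helper: make a directory tree readable by every user.
make_world_readable() {
    # a+r : everyone may read files and list directories
    # a+X : add execute (traverse) permission only to directories and to
    #       files that are already executable for someone
    chmod -R a+rX "$1"
}

# Build a scratch tree that mimics the root-only layout from the listing.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/jaxlibs/jaxlib" \
         "$demo_root/jaxlibs/jax_gpu_pjrt" \
         "$demo_root/jaxlibs/jax_gpu_plugin"
touch "$demo_root/jaxlibs/jaxlib/example.whl"
chmod 700 "$demo_root"/jaxlibs/*
chmod 600 "$demo_root/jaxlibs/jaxlib/example.whl"

make_world_readable "$demo_root/jaxlibs"

# The directories should now show drwxr-xr-x instead of drwx------.
ls -la "$demo_root/jaxlibs"
```

The capital `X` is the reason for choosing `a+rX` over `a+rx` in this sketch: directories become traversable, while regular files such as wheels do not gain an execute bit.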

Commits on Sep 9, 2024

  1. f116054

Commits on Sep 18, 2024

  1. Add an option to test-pax.sh to enable XLA cuDNN flash attention (#1045)

    Provide an option to run XLA cuDNN flash attention as an alternative to
    TE cuDNN flash attention.
    Cjkkkk authored Sep 18, 2024
    056a3b0

Commits on Sep 25, 2024

  1. 57919e0
  2. Bump clang to 18 (#1060)

    Forced by this change in the JAX build system:
    jax-ml/jax#23787
    DwarKapex authored Sep 25, 2024
    3a2e8c8
  3. Add CI argument for user-defined CUDA base image (#1013)

    Co-authored-by: Olli Lupton <olupton@nvidia.com>
    yhtang and olupton authored Sep 25, 2024
    ccededf

Commits on Sep 26, 2024

  1. Model XLA Flags (#1052)

    Moves XLA flags from model CI into their own files. Each file can be
    sourced and prints what it sets.
    
    Some files source other files. This is intentional: it avoids
    introducing symlinks into the repo, which can sometimes cause platform
    issues (for example on Windows).
    
    ---------
    
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    terrykong authored Sep 26, 2024
    1a3febb
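
The layout described in #1052 (sourceable flag files that print what they set, with one file sourcing another in place of a symlink) might look roughly like the sketch below. The file names `base.env` and `model-a.env`, the `FLAGS_DIR` variable, and the flag value are all hypothetical; only the flag name `xla_gpu_enable_triton_gemm` comes from the commit list above, and the repository's real files are not shown here.

```shell
set -eu

flags_dir="$(mktemp -d)"

# Hypothetical base file: appends one flag to XLA_FLAGS and prints the result.
# The value "false" is a placeholder, not a recommended setting.
cat > "$flags_dir/base.env" <<'EOF'
export XLA_FLAGS="${XLA_FLAGS:+$XLA_FLAGS }--xla_gpu_enable_triton_gemm=false"
echo "XLA_FLAGS=${XLA_FLAGS}"
EOF

# Hypothetical per-model file: reuses the base flags by sourcing base.env
# rather than being a symlink to it. FLAGS_DIR is an assumption of this
# sketch; a bash-only file could derive its own directory instead.
cat > "$flags_dir/model-a.env" <<'EOF'
. "${FLAGS_DIR}/base.env"
EOF

# Start from a clean environment, then source the per-model file.
unset XLA_FLAGS
FLAGS_DIR="$flags_dir"
. "$flags_dir/model-a.env"
```

Because the files are sourced rather than executed, the exported `XLA_FLAGS` stays set in the calling shell, and the printed line records what was applied in the CI log.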

Commits on Sep 27, 2024

  1. Add pathwaysutils for MaxText to manifest file (#1065)

    The latest MaxText uses `pathwaysutils`, which is added as a dependency.
    It needs to be added to our manifest.yaml file to resolve a reference
    issue during final installation.
    DwarKapex authored Sep 27, 2024
    3638a66

Commits on Oct 1, 2024

  1. nsys-jax post-processing: treat host-device copies as 1-device collectives (#1073)

    
    This adds logic to treat `dynamic[-update]-slice` operations that have a
    source/destination operand in the host memory space as being
    communication operations, labelling them as single-device "collectives".
    
    The goal is to improve support for analysing profiles of execution
    including offloading to host memory.
    
    Also fixes usage with nsys 2024.6 by applying the same patch as for
    2024.5, which adds the thread ID.
    olupton authored Oct 1, 2024
    ef3fd66