Update ROCm CI (#357)
Co-authored-by: Binyang Li <binyli@microsoft.com>
chhwang and Binyang2014 authored Sep 20, 2024
1 parent 74130c7 commit 8a330f9
Showing 3 changed files with 53 additions and 9 deletions.
1 change: 0 additions & 1 deletion .azure-pipelines/integration-test-rocm.yml
@@ -64,7 +64,6 @@ jobs:
 set -e
 git clone https://$(GIT_USER):$(GIT_PAT)@msazure.visualstudio.com/DefaultCollection/One/_git/azure-mscclpp
 cd azure-mscclpp
-git checkout binyli/ci
 mkdir execution-files
 python3 algos/allreduce_mi300_packet.py 8 8 > execution-files/allreduce_mi300_packet.json
 python3 algos/allreduce_mi300_sm_mscclpp.py 8 8 > execution-files/allreduce_mi300_sm_mscclpp.json
58 changes: 52 additions & 6 deletions .github/workflows/codeql-analysis.yml
@@ -9,11 +9,11 @@ on:
 - cron: "30 1 * * 1"

 jobs:
-analyze:
-name: Analyze
+analyze-cuda:
+name: Analyze (CUDA)
 runs-on: 'ubuntu-latest'
 container:
-image: ghcr.io/microsoft/mscclpp/mscclpp:base-dev-${{ matrix.cuda-version }}
+image: ghcr.io/microsoft/mscclpp/mscclpp:base-dev-${{ matrix.version }}

 permissions:
 actions: read
@@ -24,7 +24,7 @@ jobs:
 fail-fast: false
 matrix:
 language: [ 'cpp', 'python' ]
-cuda-version: [ 'cuda11.8', 'cuda12.2' ]
+version: [ 'cuda11.8', 'cuda12.2' ]

 steps:
 - name: Checkout repository
@@ -45,10 +45,56 @@ jobs:
 - name: Build
 run: |
-cmake -DBYPASS_GPU_CHECK=ON -DUSE_CUDA=ON .
+rm -rf build && mkdir build && cd build
+cmake -DBYPASS_GPU_CHECK=ON -DUSE_CUDA=ON ..
 make -j
 - name: Perform CodeQL Analysis
 uses: github/codeql-action/analyze@v2
 with:
-category: "/language:${{matrix.language}}/cuda-version:${{matrix.cuda-version}}"
+category: "/language:${{matrix.language}}/version:${{matrix.version}}"
+
+analyze-rocm:
+name: Analyze (ROCm)
+runs-on: 'ubuntu-latest'
+container:
+image: ghcr.io/microsoft/mscclpp/mscclpp:base-dev-${{ matrix.version }}
+
+permissions:
+actions: read
+contents: read
+security-events: write
+
+strategy:
+fail-fast: false
+matrix:
+language: [ 'cpp', 'python' ]
+version: [ 'rocm6.2' ]
+
+steps:
+- name: Checkout repository
+uses: actions/checkout@v4
+
+- name: Check disk space
+run: |
+df -h
+- name: Initialize CodeQL
+uses: github/codeql-action/init@v2
+with:
+languages: ${{ matrix.language }}
+
+- name: Dubious ownership exception
+run: |
+git config --global --add safe.directory /__w/mscclpp/mscclpp
+- name: Build
+run: |
+rm -rf build && mkdir build && cd build
+CXX=/opt/rocm/bin/hipcc cmake -DBYPASS_GPU_CHECK=ON -DUSE_ROCM=ON ..
+make -j
+- name: Perform CodeQL Analysis
+uses: github/codeql-action/analyze@v2
+with:
+category: "/language:${{matrix.language}}/version:${{matrix.version}}"
3 changes: 1 addition & 2 deletions README.md
@@ -8,8 +8,7 @@
 |--------------------------|-------------------|
 | Unit Tests (CUDA) | [![Build Status](https://dev.azure.com/binyli/HPC/_apis/build/status%2Fmscclpp-ut?branchName=main)](https://dev.azure.com/binyli/HPC/_build/latest?definitionId=4&branchName=main) |
 | Integration Tests (CUDA) | [![Build Status](https://dev.azure.com/binyli/HPC/_apis/build/status%2Fmscclpp-test?branchName=main)](https://dev.azure.com/binyli/HPC/_build/latest?definitionId=3&branchName=main) |
-
-*NOTE (Nov 2023): Azure pipelines for ROCm will be added soon.*
+| Integration Tests (ROCm) | [![Build Status](https://dev.azure.com/binyli/HPC/_apis/build/status%2Fmscclpp-test-rocm?branchName=main)](https://dev.azure.com/binyli/HPC/_build/latest?definitionId=7&branchName=main) |

 A GPU-driven communication stack for scalable AI applications.

