Add info about new contraction data type and permutation feature to README #167

Merged
merged 12 commits into from
Jan 30, 2024
2 changes: 2 additions & 0 deletions .wordlist.txt
@@ -0,0 +1,2 @@
GoogleTest
rocm
120 changes: 85 additions & 35 deletions README.md
@@ -9,10 +9,11 @@ through general purpose kernel languages, like HIP C++.
* AMD CDNA class GPU featuring matrix core support:
gfx908, gfx90a, gfx940, gfx941, gfx942 as 'gfx9'

> Note: Double precision FP64 datatype support requires
> gfx90a, gfx940, gfx941 or gfx942
:::{note}
Double-precision (FP64) data type support requires gfx90a, gfx940, gfx941, or gfx942.
:::

## Minimum Software Requirements
## Minimum software requirements

* ROCm stack minimum version 5.7
* ROCm-cmake minimum version 0.8.0 for ROCm 5.7
@@ -28,7 +29,7 @@ Optional:

Run the steps below to build documentation locally.

```shell
```bash
cd docs

pip3 install -r sphinx/requirements.txt
@@ -38,15 +39,38 @@ python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html

## Currently supported

Operations - Contraction Tensor
Data Types - FP32 , FP64
### Operation: Contraction tensor

Supported data-type combinations are:

| typeA | typeB | typeC | typeCompute | notes |
|-------------|-------------|-------------|-------------------|------------------------------------|
| bf16 | bf16 | bf16 | f32 | |
| __half | __half | __half | f32 | |
| f32 | f32 | f32 | bf16 | |
| f32 | f32 | f32 | __half | |
| f32 | f32 | f32 | f32 | |
| f64 | f64 | f64 | f32 | f64 is only supported on gfx90a + |
| f64 | f64 | f64 | f64 | f64 is only supported on gfx90a + |
| cf32 | cf32 | cf32 | cf32 | cf32 is only supported on gfx90a + |
| cf64 | cf64 | cf64 | cf64 | cf64 is only supported on gfx90a + |

### Operation: Permutation tensor

Supported data-type combinations are:

| typeA | typeB | descCompute | notes |
|-----------|-----------|-----------------|-------|
| f16 | f16 | f16 | |
| f16 | f16 | f32 | |
| f32 | f32 | f32 | |

## Contributing to the code

1. Create and track a hipTensor fork.
2. Clone your fork:

```shell
```bash
git clone -b develop https://github.com/<your_fork>/hipTensor.git .
.githooks/install
git checkout -b <new_branch>
@@ -69,24 +93,24 @@ git push origin <new_branch>

### Project options

|Option|Description|Default Value|
|---|---|---|
|AMDGPU_TARGETS|Build code for specific GPU target(s)|gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942|
|HIPTENSOR_BUILD_TESTS|Build Tests|ON|
|HIPTENSOR_BUILD_SAMPLES|Build Samples|ON|
| Option | Description | Default Value |
|-------------------------|---------------------------------------|----------------------------------------------------------------|
| AMDGPU_TARGETS | Build code for specific GPU target(s) | gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942 |
| HIPTENSOR_BUILD_TESTS | Build Tests | ON |
| HIPTENSOR_BUILD_SAMPLES | Build Samples | ON |

### Example configurations

By default, the project is configured in Release mode.
Here are some example configurations:
|Configuration|Command|
|---|---|
|Basic|`CC=hipcc CXX=hipcc cmake -B<build_dir> .`|
|Targeting gfx908|`CC=hipcc CXX=hipcc cmake -B<build_dir> . -DAMDGPU_TARGETS=gfx908:xnack-` |
|Debug build|`CC=hipcc CXX=hipcc cmake -B<build_dir> . -DCMAKE_BUILD_TYPE=Debug` |
|Build without tests (default on)|`CC=hipcc CXX=hipcc cmake -B<build_dir> . -DHIPTENSOR_BUILD_TESTS=OFF` |
| Configuration | Command |
|----------------------------------|---------------------------------------------------------------------------|
| Basic | `CC=hipcc CXX=hipcc cmake -B<build_dir> .` |
| Targeting gfx908 | `CC=hipcc CXX=hipcc cmake -B<build_dir> . -DAMDGPU_TARGETS=gfx908:xnack-` |
| Debug build | `CC=hipcc CXX=hipcc cmake -B<build_dir> . -DCMAKE_BUILD_TYPE=Debug` |
| Build without tests (default on) | `CC=hipcc CXX=hipcc cmake -B<build_dir> . -DHIPTENSOR_BUILD_TESTS=OFF` |

After configuration, build with `cmake --build <build_dir> -- -j<nproc>`
After configuration, build with `cmake --build <build_dir> -- -j<nproc>`.
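
As a rough sketch of how the options above combine, the snippet below assembles a configure command that targets a single GPU architecture and skips the tests, followed by the build command. The `build_dir` and `targets` values are illustrative assumptions, not project defaults. The commands are only printed, so they can be reviewed before being run.

```bash
#!/usr/bin/env bash
# Illustrative values (assumptions, not project defaults).
build_dir="build"
targets="gfx90a:xnack-"

# Configure: combine the target and test options from the tables above.
configure_cmd="CC=hipcc CXX=hipcc cmake -B${build_dir} . -DAMDGPU_TARGETS=${targets} -DHIPTENSOR_BUILD_TESTS=OFF"

# Build: one job per available core (falls back to 1 if nproc is missing).
jobs="$(nproc 2>/dev/null || echo 1)"
build_cmd="cmake --build ${build_dir} -- -j${jobs}"

# Print the commands for review; paste them into a shell to run them.
echo "${configure_cmd}"
echo "${build_cmd}"
```

Keeping the configure and build steps as separate commands mirrors the tables above and makes it easy to change one option at a time.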

### Tips to reduce tests compile time

@@ -99,44 +123,70 @@ After configuration, build with `cmake --build <build_dir> -- -j<nproc>`

Tests API implementation of logger verbosity and functionality.

* `<build_dir>/bin/logger_test`
```bash
<build_dir>/bin/logger_test
```

## Running Contraction Tests
## Running contraction tests

### Bilinear contraction tests
* Bilinear contraction tests

Tests the API implementation of bilinear contraction algorithm with validation.

* `<build_dir>/bin/bilinear_contraction_f32_test`
* `<build_dir>/bin/bilinear_contraction_f64_test`
```bash
<build_dir>/bin/bilinear_contraction_test
<build_dir>/bin/complex_bilinear_contraction_test
```

### Scale contraction tests
* Scale contraction tests

Tests the API implementation of scale contraction algorithm with validation.

* `<build_dir>/bin/scale_contraction_f32_test`
* `<build_dir>/bin/scale_contraction_f64_test`
```bash
<build_dir>/bin/scale_contraction_test
<build_dir>/bin/complex_scale_contraction_test
```

## Running permutation tests

### Samples
Tests the API implementation of the permutation algorithm with validation.

```bash
<build_dir>/bin/permutation_test
```
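
To run every test binary listed above in one pass, a small loop along these lines may help. It is only a sketch: `run_hiptensor_tests` is a hypothetical helper name, not part of hipTensor, and it assumes the binaries live under `<build_dir>/bin` as shown above. Binaries that were not built are skipped rather than treated as failures.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run each hipTensor test binary found in <build_dir>/bin.
# Returns non-zero if any binary that was run reported a failure.
run_hiptensor_tests() {
    local build_dir="$1"
    local failed=0
    local t
    for t in logger_test \
             bilinear_contraction_test complex_bilinear_contraction_test \
             scale_contraction_test complex_scale_contraction_test \
             permutation_test; do
        if [ -x "${build_dir}/bin/${t}" ]; then
            echo "RUN  ${t}"
            "${build_dir}/bin/${t}" || failed=1
        else
            echo "SKIP ${t} (not built)"
        fi
    done
    return "${failed}"
}

# Usage: run_hiptensor_tests <build_dir>
```

The loop keeps going after a failure so that a single run reports on all test binaries.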

## Samples

These are stand-alone use-cases of the hipTensor contraction operations.

## F32 Bilinear contraction
### F32 bilinear contraction

Demonstrates the API implementation of bilinear contraction operation without validation.

* `<build_dir>/bin/simple_contraction_bilinear_f32`
```bash
<build_dir>/bin/simple_bilinear_contraction_<typeA>_<typeB>_<typeC>_<typeD>_compute_<typeCompute>
```

## F32 Scale contraction
### F32 scale contraction

Demonstrates the API implementation of scale contraction operation without validation.

* `<build_dir>/bin/simple_contraction_scale_f32`
```bash
<build_dir>/bin/simple_scale_contraction_<typeA>_<typeB>_<typeD>_compute_<typeCompute>
```

### Permutation

Demonstrates the API implementation of the permutation operation without validation.

```bash
<build_dir>/bin/simple_permutation
```

### Build Samples as external client
### Build samples as external client

Client application links to hipTensor library,
and therefore hipTensor library needs to be installed before building client applications.
The client application links to the hipTensor library; therefore, you must install the
hipTensor library before building client applications.

## Build

2 changes: 1 addition & 1 deletion docs/Contributors_Guide.rst
@@ -24,7 +24,7 @@ The hipTensor repository follows a workflow which dictates a /master branch wher
- ensure code builds successfully.
- do not break existing test cases
- new functionality will only be merged with new unit tests
- new unit tests should integrate within the existing googletest framework.
- new unit tests should integrate within the existing GoogleTest framework.
- tests must have good code coverage
- code must also have benchmark tests, and performance must approach
the compute bound limit or memory bound limit.
2 changes: 1 addition & 1 deletion docs/Linux_Install_Guide.rst
@@ -59,7 +59,7 @@ For Centos use
yum info rocm-libs

The ROCm version has major, minor, and patch fields, possibly followed by a build-specific identifier. For example, the ROCm version could be 4.0.0.40000-23, which corresponds to major = 4, minor = 0, patch = 0, and build identifier 40000-23.
There are GitHub branches at the hipTensor site with names rocm-major.minor.x where major and minor are the same as in the ROCm version. For ROCm version 4.0.0.40000-23 you need to use the following to download hipTensor:
There are GitHub branches at the hipTensor site with names `rocm-major.minor.x` where major and minor are the same as in the ROCm version. For ROCm version 4.0.0.40000-23 you need to use the following to download hipTensor:

::
