fix and update SYCL targets #2390

AuroraPerego · 2024-09-18T15:37:35Z

Now all device targets are defined as 0, while the one(s) we are compiling for are defined as 1. Therefore `#if defined` can no longer be used to select a target.
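To illustrate why the `defined` check stops working, here is a minimal sketch. The macro names `EXAMPLE_TARGET_GPU_A` and `EXAMPLE_TARGET_GPU_B` are hypothetical stand-ins, not the real SYCL target macros, and the snippet is not the code in `SyclSubgroupSize.hpp`.

```cpp
// Hypothetical target macros (assumption: the names are invented for this example).
// Under the new scheme every known target is defined: 1 for the target
// being compiled for, 0 for all the others.
#define EXAMPLE_TARGET_GPU_A 0
#define EXAMPLE_TARGET_GPU_B 1

// Old-style check: true for every target, because the macro is defined
// even when its value is 0.
#if defined(EXAMPLE_TARGET_GPU_A)
constexpr bool definedCheckSelectsA = true;
#else
constexpr bool definedCheckSelectsA = false;
#endif

// Value check: true only when the macro expands to a non-zero value.
#if EXAMPLE_TARGET_GPU_A
constexpr bool valueCheckSelectsA = true;
#else
constexpr bool valueCheckSelectsA = false;
#endif

#if EXAMPLE_TARGET_GPU_B
constexpr bool valueCheckSelectsB = true;
#else
constexpr bool valueCheckSelectsB = false;
#endif

// The defined() check wrongly selects target A; the value check does not.
static_assert(definedCheckSelectsA, "defined() is true for a macro set to 0");
static_assert(!valueCheckSelectsA, "value check rejects a macro set to 0");
static_assert(valueCheckSelectsB, "value check accepts a macro set to 1");
```

A plain `#if MACRO` also treats an undefined macro as 0, so the value check behaves the same whether a target macro is missing or explicitly set to 0.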

I've added some new targets as well.

Note that there is probably a bug with NVIDIA targets, but we are not using them for now.

fwyzard

I need to check some of the updates.

fwyzard · 2024-09-19T07:26:48Z

@psychocoderHPC the error in mathTest (link, attached log) should be unrelated to this PR:

Randomness seeded to: 2485158815
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mathTest is a Catch2 v3.5.2 host application.
Run with -? for options
-------------------------------------------------------------------------------
mathOpsComplexFloat - TestAccFunctorTuplesComplex - 49
-------------------------------------------------------------------------------
/builds/hzdr/crp/alpaka/test/unit/math/src/mathComplexFloat.cpp:26
...............................................................................
/builds/hzdr/crp/alpaka/test/unit/math/src/TestTemplate.hpp:145: FAILED:
  REQUIRE( isApproxEqual(results(i), std_result) )
with expansion:
  false
with messages:
  testing acc:alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::
  integral_constant<unsigned long, 1ul>, unsigned long> data type:alpaka::
  internal::Complex<float> functor:mathtest::OpPowComplex seed:4140714306
  Operator: OpPowComplex
  Type: alpaka::internal::Complex<float>
  The args buffer: 
  capacity: 1000
  0: [ (1e+01,1e+01), (0,0) ]
  ...
  999: [ (2,2), (-0.8,-0.3) ]
  
  Idx i: 1 computed : (506.109,-1.62104e+06) vs expected: (509.202,-1.62104e+
  06)
===============================================================================
test cases:    278 |    277 passed | 1 failed
assertions: 267086 | 267085 passed | 1 failed

Any ideas what may have caused it, or how to reproduce it?

psychocoderHPC · 2024-09-19T07:50:51Z

I retriggered the failing test.

fwyzard

After reviewing the behaviour of HIP with @AuroraPerego, we think that all AMD GPUs starting from gfx 10.0 should support only a subgroup size of 32.

psychocoderHPC · 2024-09-19T12:21:26Z

> After reviewing the behaviour of HIP with @AuroraPerego, we think that all AMD GPUs starting from gfx 10.0 should support only a subgroup size of 32.

Yes, I checked it against https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html#accelerator-and-gpu-hardware-specifications (the table has different tabs).

AMD also writes, in https://rocm.docs.amd.com/projects/HIP/en/latest/understand/hardware_implementation.html#rdna-architecture:

RDNA makes a fundamental change to CU design, by changing the size of a warp to 32 threads. 
This is done by effectively combining two GCN5 SIMDs, creating a VALU of width 32, 
so that a whole warp can be issued in one cycle. 
The CU is also replaced by the work group processor (WGP), 
which encompasses two CUs. For backwards compatibility the WGP can also run in wave64 mode, 
in which it issues a warp of size 64 in two cycles.

fwyzard · 2024-09-19T13:11:16Z

Yes... the RDNA1, RDNA2 and RDNA3 architecture whitepapers suggest that both 32 and 64 can be used.

And clang++ does have an -mwavefrontsize64 option to enable wavefronts (warps) with 64 elements.

But HIP does not like it:

`/opt/rocm/include/hip/hip_runtime.h`

#if __HIP_DEVICE_COMPILE__ && !__GFX7__ && !__GFX8__ && !__GFX9__ && __AMDGCN_WAVEFRONT_SIZE == 64
#error HIP is not supported on the specified GPU ARCH with wavefront size 64
#endif

I assume that the ROCm backend for SYCL/oneAPI is built with HIP, so on gfx 10 and later we should use only a wavefront size of 32.
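The conclusion above can be sketched as a small compile-time helper. This is only an illustration of the rule discussed in this thread: `hipWavefrontSize` is a hypothetical name, not part of alpaka, HIP, or SYCL, and it assumes the gfx major version is already known and is 7 or later.

```cpp
// Hypothetical helper (assumption: not an alpaka, HIP, or SYCL API).
// Maps an AMD gfx major version (7 or later) to the wavefront size that
// the guard in hip_runtime.h accepts.
constexpr int hipWavefrontSize(int gfxMajor)
{
    // gfx7, gfx8 and gfx9 (GCN) use wavefronts of 64 threads.
    // gfx10 and later (RDNA) can run wave64 in hardware, but HIP refuses to
    // compile for them with a wavefront size of 64, so only 32 is usable.
    return gfxMajor < 10 ? 64 : 32;
}

static_assert(hipWavefrontSize(9) == 64, "GCN targets use wave64");
static_assert(hipWavefrontSize(10) == 32, "RDNA1 is limited to wave32 under HIP");
static_assert(hipWavefrontSize(11) == 32, "RDNA3 is limited to wave32 under HIP");
```

Because the helper is `constexpr`, the result could be used wherever a compile-time subgroup size is needed, for example as a template argument.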

include/alpaka/kernel/SyclSubgroupSize.hpp

fwyzard approved these changes Sep 18, 2024

fwyzard requested changes Sep 18, 2024

psychocoderHPC added this to the 1.2.0 milestone Sep 19, 2024

psychocoderHPC added Type:Refactoring Backend:SYCL labels Sep 19, 2024

fwyzard requested changes Sep 19, 2024

AuroraPerego force-pushed the SYCLsubGroups branch from 926ee82 to f44a981 Compare September 19, 2024 11:42

fwyzard reviewed Sep 19, 2024

fwyzard reviewed Sep 19, 2024

fwyzard reviewed Sep 19, 2024

add new targets and move to if

d378ed0

now all targets are defined as 0 (the one we are compiling for as 1), therefore `if defined` cannot be used. Co-authored-by: Andrea Bocci <fwyzard@gmail.com>

AuroraPerego force-pushed the SYCLsubGroups branch from 228aac2 to d378ed0 Compare September 19, 2024 13:48

fwyzard approved these changes Sep 19, 2024

psychocoderHPC approved these changes Sep 20, 2024

psychocoderHPC merged commit 5c5a690 into alpaka-group:develop Sep 20, 2024
22 checks passed
