Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations #1845

Merged 10 commits on Jul 27, 2023

Commits on Jul 26, 2023

  1. Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 1)
    
    Initial work to support the SYCL 2020 standard, using USM allocations instead of
    SYCL buffers and accessors:
      - bring the SYCL interface in line with the other backends, and remove the last
        uses of the alpaka::experimental namespace;
      - reimplement the alpaka memory buffers, memset and memcpy tasks for the USM
        SYCL backend;
      - make the SYCL native handles more consistent with the other backends;
      - use the oneAPI printf extension, and implement a workaround for the OpenCL
        limitation on variadic functions and the conflict with AMD HIP/ROCm device code;
      - add more debug print messages;
      - various fixes for kernel names, memory_scope Grid and atomics;
      - update copyright information.
    
    Initial work on the SYCL random number generators (not fully working yet).
    Parsifal-2045 authored and fwyzard committed Jul 26, 2023
    be6421e
  2. Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 2)
    
    More changes to the SYCL backend:
      - move printf to alpaka/core and use it in ALPAKA_CHECK;
      - remove IsView -> false in mem/buf/sycl/Accessor;
      - remove wrong attribute in mem/buf/sycl/Copy;
      - remove the SYCL experimental BuildAccessor<BufGenericSycl>, use the default
        implementation from alpaka/mem/view.
    
    Fix the examples to work with the SYCL backend:
      - fix the accelerator in the vectorAdd example;
      - move AccCpuSerial to the end of the ExampleDefaultAcc list, as it was
        preventing the SYCL accelerators from being selected.
    
    Complete the work on the SYCL random number generators.
    AuroraPerego authored and fwyzard committed Jul 26, 2023
    5b1b2b7
  3. Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 3)
    
    Update the documentation.
    
    Implement various fixes to the SYCL math functions:
      - add missing "if constexpr" to rsqrt();
      - do not call math functions with arguments of mixed types; this fixes
        errors due to the implicit conversion between floating point types of
        different sizes in sycl::atan2() and sycl::pow();
      - add explicit type casts to silence warnings;
      - cast the result of isfinite/isinf/isnan to bool.
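The mixed-argument and bool-cast fixes listed above can be illustrated with a minimal host-side sketch. It is not the code from this commit: `pow_common` and `is_finite` are hypothetical names, and `std::pow` and `std::isfinite` stand in for the back-end math functions.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical wrapper: promote both arguments to their common floating-point
// type before the call, so the math function never sees mixed float/double.
template<typename T, typename U>
auto pow_common(T base, U exponent)
{
    static_assert(std::is_floating_point_v<T> && std::is_floating_point_v<U>);
    using C = std::common_type_t<T, U>;
    // std::pow is only a stand-in for the back-end pow here.
    return std::pow(static_cast<C>(base), static_cast<C>(exponent));
}

// Hypothetical wrapper: always return a plain bool, whatever type the
// underlying classification function yields.
template<typename T>
bool is_finite(T value)
{
    return static_cast<bool>(std::isfinite(value));
}
```

Making the promotion explicit keeps the choice of overload in the caller's hands instead of relying on implicit conversions.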
    
    Implement various fixes to the SYCL atomic functions:
      - fix the cas/compare_exchange loops;
      - clarify which atomic types are supported.
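As an illustration of a compare-exchange loop, here is a minimal host-side sketch built on `std::atomic`. It is an assumption-laden analogue, not the code from this commit: `atomic_fetch_max` is a hypothetical helper, and a SYCL version would presumably use `sycl::atomic_ref` in place of `std::atomic`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically store max(target, value) and return the
// value observed before the update.
template<typename T>
T atomic_fetch_max(std::atomic<T>& target, T value)
{
    T expected = target.load();
    // When compare_exchange_weak fails it writes the current value into
    // `expected`, so the condition is re-checked against fresh data on
    // every iteration.
    while(expected < value && !target.compare_exchange_weak(expected, value))
    {
    }
    return expected;
}
```

The loop exits either because the stored value is already large enough or because the exchange succeeded.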
    
    Implement various fixes to the SYCL warp-level functions:
      - fix compilation warnings;
      - extract bits from sub_group_mask.
    
    Mark the use of global device variables and constants as unsupported: the
    SYCL backend does not support them yet.
    
    Add explicit checks on the dimensionality of the SYCL accelerator and work division.
    
    Silence warnings about the use of GNU extensions, and those coming from the
    Intel oneMKL and oneDPL headers.
    
    Update more tests for the SYCL backend:
      - add a special case for 0-dimensional tests;
      - disable the use of STL rand;
      - disable the test of global device variables and constants.
    fwyzard committed Jul 26, 2023
    97ebacd
  4. Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 4)
    
    Update the documentation related to FPGAs.
    
    Various fixes and updates to the SYCL backend and tests, the copyright
    information and code formatting.
    j-stephan authored and fwyzard committed Jul 26, 2023
    085365a
  5. 0fadd2a
  6. Rewrite the SYCL memcpy and memset operations

    Rewrite the N-dimensional Copy and Set memory operations to support pitched
    memory buffers, based on the Cpu implementation. This may require more than one
    memset or memcpy call per operation, which is not supported by command group
    handlers. Rewrite the Copy and Set memory operations to use queues instead.
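To show why a pitched copy can need more than one memcpy per operation, here is a minimal host-side sketch of a 2D copy. It is only an analogue under stated assumptions: `copy_pitched_2d` is a hypothetical function, `std::memcpy` stands in for the queue's memcpy, and the real implementation handles N dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `rows` rows of `rowBytes` bytes between two
// pitched 2D buffers. Returns the number of memcpy calls that were issued.
inline std::size_t copy_pitched_2d(
    void* dst,
    std::size_t dstPitch,
    void const* src,
    std::size_t srcPitch,
    std::size_t rowBytes,
    std::size_t rows)
{
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);
    auto* d = static_cast<unsigned char*>(dst);
    auto const* s = static_cast<unsigned char const*>(src);

    // Without padding on either side the rows are contiguous: one call.
    if(dstPitch == rowBytes && srcPitch == rowBytes)
    {
        std::memcpy(d, s, rowBytes * rows);
        return 1;
    }

    // With padding, each row has to be copied separately.
    for(std::size_t r = 0; r < rows; ++r)
    {
        std::memcpy(d + r * dstPitch, s + r * srcPitch, rowBytes);
    }
    return rows;
}
```

The padded case issues one call per row, which is the situation the commit message describes as not fitting a single command group handler.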
    fwyzard committed Jul 26, 2023
    93c607d
  7. 01afbee
  8. Support compile-time warp size in SYCL kernels

    Introduce a new optional trait to describe at compile time the warp size that a
    kernel should use.  The default behaviour is to let the back-end compiler pick
    the preferred size.
    
    Before launching a kernel with a compile-time sub-group size the user should
    query the sizes supported by the device, and choose accordingly.  If the device
    does not support the requested size, the SYCL runtime will throw a synchronous
    exception.
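The optional trait and the "query first, then choose" step can be sketched in plain C++ as follows. All names here (`WarpSizeOf`, `check_warp_size`, the two kernel types) are hypothetical and are not the trait introduced by this commit. In SYCL the supported sizes would presumably come from the `sycl::info::device::sub_group_sizes` device query; here they are passed in as a plain vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical optional trait: 0 means "let the back-end compiler pick".
template<typename TKernel>
struct WarpSizeOf
{
    static constexpr std::size_t value = 0;
};

struct DefaultKernel
{
};

struct Warp16Kernel
{
};

// A kernel opts in by specializing the trait.
template<>
struct WarpSizeOf<Warp16Kernel>
{
    static constexpr std::size_t value = 16;
};

// Hypothetical pre-launch check: throws if the kernel requests a size that
// the device does not list; returns the requested size (0 if none).
template<typename TKernel>
std::size_t check_warp_size(std::vector<std::size_t> const& supported)
{
    constexpr std::size_t requested = WarpSizeOf<TKernel>::value;
    if constexpr(requested != 0)
    {
        if(std::find(supported.begin(), supported.end(), requested) == supported.end())
        {
            throw std::runtime_error("requested warp size not supported by the device");
        }
    }
    return requested;
}
```

Kernels that do not specialize the trait skip the check entirely, which mirrors the default behaviour of leaving the choice to the back-end compiler.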
    
    During just-in-time (JIT) compilation this guarantees that a kernel is compiled
    only for the sizes supported by the device.  During ahead-of-time (AOT)
    compilation this is not enough, because the device is not known at compile
    time.  The SYCL specification mandates that the back-end compilers should not
    fail if a kernel uses unsupported features, like unsupported sub-group sizes.
    Unfortunately the Intel OpenCL CPU and GPU compilers currently fail with a hard
    error.  To work around this limitation, use the preprocessor macros defined
    when compiling AOT for the new SYCL targets to enable the compilation only for
    the sub-group sizes supported by each device.
    
    Note: while the CPU OpenCL back-end does support a sub-group size of 64, the
    SYCL code currently does not.  To avoid issues with the sub-group primitives,
    always consider the sub-group size of 64 as not supported by the device.
    
    Other changes:
      - remove the use of SYCL streams in favour of the printf() extension;
      - remove the ALPAKA_FN_HOST attribute;
      - fix the GetSize test for the different sub-group sizes;
      - fix the use of sycl::exceptions;
      - use different member names for nd_item in different classes, to avoid
        ambiguous name lookup error when accessing the nd_item in the accelerator
        object.
    AuroraPerego authored and fwyzard committed Jul 26, 2023
    b26b05a
  9. Various fixes related to the SYCL back-end

      - add the missing specialization of CreateViewPlainPtr for SYCL devices
      - improve the comments on the ALPAKA_FN_INLINE macro
      - remove unnecessary ALPAKA_FN_HOST attributes
      - rename QueueGenericSyclBase::m_impl to m_spQueueImpl, to align with the other back-ends
    AuroraPerego authored and fwyzard committed Jul 26, 2023
    218ab6e
  10. c66b7a5