Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make SYCL runtime objects static. #1865

Merged
merged 1 commit into from
Jan 17, 2023

Conversation

j-stephan
Copy link
Member

This is a supporting effort for #1845. The alpaka SYCL platforms are now specialized using the SYCL device selector. Each specialization will contain (as static members) the corresponding sycl::platform, sycl::context, std::vector<sycl::device>.

CC @fwyzard @Parsifal-2045 @AuroraPerego

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

Hi @j-stephan, if I read the code correctly, these changes would build all SYCL platforms, contexts and devices at program initialisation time:

template<typename TSelector>
class PltfGenericSycl : public concepts::Implements<ConceptPltf, PltfGenericSycl<TSelector>> {
    ...
    inline static sycl::platform sycl_platform{TSelector{}};
    inline static std::vector<sycl::device> sycl_devices{sycl_platform.get_devices()};
    inline static sycl::context sycl_context{sycl_devices, async_handler};
};

We tried this approach and run into some with in the past, due to the order of initialisation and of destruction.

Now we switched to some kind of lazy initialisation, where the objects are initialised the first time they are used instead of at application initialisation time, and an explicit destruction / finalisation.

@j-stephan
Copy link
Member Author

We tried this approach and run into some with in the past, due to the order of initialisation and of destruction.

Interesting. Do you remember how you approached this? According to cppreference.com (see points 2 and 3) the way I implemented this here should work since the initializations happen in the right order and in the same translation unit. See also this StackOverflow question + answer which addresses the same issue.

CC @bernhardmgruber , our local C++ lawyer.

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

According to cppreference.com (see points 2 and 3) the way I implemented this here should work since the initializations happen in the right order and in the same translation unit.

Sorry for not being clear - I didn't mean initialisation order between the SYCL objects, I meant between them and other global objects.

For example, at some point we were having a crash because the destructor of cout/cerr was done before the destruction of the SYCL objects, while they are/were trying to print a message during the destruction.

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

Another problem actually was with CUDA, thought it may be a general issue that affects other runtimes as well: using a static CUDA object would prevent the CUDA profiling tools to work.

@j-stephan
Copy link
Member Author

Oh, I see. Yes, that makes sense. I'll change it to the "construct on first use" idiom.

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

More in general, we found that initialising all runtimes and objects irrespective if they are used or not may range from being sub-optimal (slow startup and extra resource usage) to outright problematic (some SYCL runtimes would crash).

@bernhardmgruber
Copy link
Member

I would also go for lazy initialization:

template<typename TSelector>
class PltfGenericSycl : public concepts::Implements<ConceptPltf, PltfGenericSycl<TSelector>> {
    ...
    static auto syclPlatform() {
        static sycl::platform p{TSelector{}};
        return p:
    }
...
};

As Andrea said, I can imagine interferrence or indeterminism if we start to construct all objects right at program startup. We may still run into issues on program termination, but this is something we need to test in practice.

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

As Andrea said, I can imagine interferrence or indeterminism if we start to construct all objects right at program startup. We may still run into issues on program termination, but this is something we need to test in practice.

I would suggest to add a static void reset() method to the SYCL platforms, so that people can call it explicitly if they run into those kind of problems.

We would be happy to have it and call it :-)

@j-stephan
Copy link
Member Author

j-stephan commented Dec 12, 2022

Just for clarity: What would reset() do? Destruct everything and then reconstruct everything? Or should it act as a shutdown mechanism?

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

I'm just making this up as we go along... I guess it should destruct the static objects in the correct order, and leave things in a pre-initialisation state.

So if anything re-uses the objects afterwords, they would just be re-initialised from scratch.

@fwyzard
Copy link
Contributor

fwyzard commented Dec 12, 2022

From a quick look, I'm concerned that this approach is not thread-safe: calling e.g. syclDevices() from more than one thread could create more than one collection of devices, with the first one being destroyed leading to some left-over objects.

@j-stephan j-stephan force-pushed the sycl_static_pltf branch 2 times, most recently from 0f8f313 to 275424c Compare December 13, 2022 13:18
Copy link
Member

@bernhardmgruber bernhardmgruber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, I think this PR is fine. I have a couple of questions and suggestions though, but none of them necessarily blocks a merge.

include/alpaka/pltf/PltfGenericSycl.hpp Outdated Show resolved Hide resolved
include/alpaka/pltf/PltfGenericSycl.hpp Outdated Show resolved Hide resolved
include/alpaka/pltf/PltfGenericSycl.hpp Outdated Show resolved Hide resolved
include/alpaka/pltf/PltfGenericSycl.hpp Outdated Show resolved Hide resolved
include/alpaka/pltf/PltfGenericSycl.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@fwyzard fwyzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, I think this approach should be fine.

I am vaguely concerned if we could run into a dead lock if different threads try to call different initialisation functions at the same time. Or, if one thread tries to re-initialise an object while another one is calling reset().

My naive approach would have bee to use a single mutex: then, either initialise all objects in a single method, or split the methods into a public interface (that does the locking) and a private implementation (that does the initialisation).

That said, if the current approach is deemed safe against race conditions and dead locks, I'm fine with it.

@j-stephan
Copy link
Member Author

Thanks for the comments. I'll address them in the afternoon, so please don't merge just yet.

@j-stephan j-stephan force-pushed the sycl_static_pltf branch 2 times, most recently from d89ef94 to 62a5258 Compare January 16, 2023 10:09
fwyzard
fwyzard previously approved these changes Jan 16, 2023
Copy link
Contributor

@fwyzard fwyzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the latest round of changes, everything looks good to me now.

Being paranoid about performance, you might make two changes:

  • replace std::optional<std::vector<sycl::device>> with simply std::vector<sycl::device>, and leave the vector empty until it is initialised
  • replace all call top optional.value() with *optional, since they are anyway protected by if (initialized)

Neither of this is likely to show up in any kind of profiling, so do use the style you like better.

@bernhardmgruber bernhardmgruber merged commit 9716c56 into alpaka-group:develop Jan 17, 2023
@j-stephan j-stephan deleted the sycl_static_pltf branch January 17, 2023 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants