diff --git a/CHANGELOG.md b/CHANGELOG.md index a5be4d71..15dc84dd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -19,6 +19,7 @@ Documentation for rocThrust available at ### Known issues * `thrust::reduce_by_key` outputs are not bit-wise reproducible, as run-to-run results for pseudo-associative reduction operators (e.g. floating-point arithmetic operators) are not deterministic on the same device. +* Currently, rocThrust allocates memory in such a way that most algorithmic API functions cannot be called from within hipGraphs. ## rocThrust 3.0.0 for ROCm 6.0 diff --git a/docs/environment.yml b/docs/environment.yml index fa46ba40..599d9013 100644 --- a/docs/environment.yml +++ b/docs/environment.yml @@ -3,7 +3,7 @@ channels: - conda-forge - defaults dependencies: - - python=3.8 + - python=3.10 - pip - doxygen=1.9.8 - pip: diff --git a/docs/hip-execution-policies.rst b/docs/hip-execution-policies.rst new file mode 100644 index 00000000..c8b03b1a --- /dev/null +++ b/docs/hip-execution-policies.rst @@ -0,0 +1,87 @@ +.. meta:: + :description: rocThrust documentation and API reference + :keywords: rocThrust, ROCm, API, reference, execution policy + +.. _hip-execution-policies: + +****************************************** +Execution Policies +****************************************** + +In addition to the standard Thrust execution policies (e.g. ``thrust::host``, ``thrust::device``, ``thrust::seq``), +rocThrust's HIP backend provides the following: + +* ``hip_rocprim::par`` - This policy causes algorithms to be launched in a parallel configuration. + API calls are blocking (synchronous with respect to the host). + +* ``hip_rocprim::par_nosync`` - This policy tells Thrust that algorithms may avoid synchronization + barriers when it is possible to do so. As a result, algorithms may be launched asynchronously with + respect to the host. This can allow you to perform other host-side work while the algorithms + are running on the device. 
If you use this policy, you must synchronize before accessing results + on the host side. + +The example below illustrates the behaviour of these two policies. + +.. code-block:: cpp + + #include <thrust/host_vector.h> + #include <thrust/device_vector.h> + #include <thrust/execution_policy.h> + #include <thrust/count.h> + #include <thrust/reduce.h> + #include <thrust/random.h> + #include <hip/hip_runtime.h> + #include <ctime> + #include <iostream> + + int main(int argc, char* argv[]) + { + // Allocate host and device vectors. + const size_t size = 100; + thrust::host_vector<int> h_vec(size); + thrust::device_vector<int> d_vec1(size); + thrust::device_vector<int> d_vec2(size); + + // Fill host vector with random values. + const int limit = 100; + auto seed = std::time(nullptr); + thrust::default_random_engine rng(seed); + for (int i = 0; i < size; i++) + h_vec[i] = rng() % limit; + + // Copy data to device vectors. + d_vec1 = h_vec; + d_vec2 = h_vec; + + // Launch some algorithms using the hip_rocprim::par policy. + // The calls below are blocking with respect to the host. + // However, internally, each algorithm will run in parallel. + auto par_policy = thrust::hip_rocprim::par; + int count = thrust::count(par_policy, d_vec1.begin(), d_vec1.end(), 50); + int reduction = thrust::reduce(par_policy, d_vec2.begin(), d_vec2.end()); + + // Print out the results. + std::cout << "par results:" << std::endl; + std::cout << "count: " << count << std::endl; + std::cout << "reduction: " << reduction << std::endl; + + // Launch the algorithms using the hip_rocprim::par_nosync policy. + // These calls may not be blocking with respect to the host. + auto nosync_policy = thrust::hip_rocprim::par_nosync; + int count2 = thrust::count(nosync_policy, d_vec1.begin(), d_vec1.end(), 50); + int reduction2 = thrust::reduce(nosync_policy, d_vec2.begin(), d_vec2.end()); + + // We can perform other host-side work here, and it may overlap with the + // algorithms launched above. + DoHostSideWork(); + + // We must synchronize before accessing the results on the host. + hipDeviceSynchronize(); + + // Print out the results. 
+ std::cout << "par_nosync results:" << std::endl; + std::cout << "count: " << count2 << std::endl; + std::cout << "reduction: " << reduction2 << std::endl; + + return 0; + } \ No newline at end of file diff --git a/docs/hipgraph-support.rst b/docs/hipgraph-support.rst new file mode 100644 index 00000000..f24d0c1a --- /dev/null +++ b/docs/hipgraph-support.rst @@ -0,0 +1,24 @@ +.. meta:: + :description: rocThrust documentation and API reference + :keywords: rocThrust, ROCm, API, reference, hipGraph + +.. _hipgraph-support: + +****************************************** +hipGraph Support +****************************************** +Currently, rocThrust does not support the use of ``hipGraphs``. ``hipGraphs`` are not allowed to contain any synchronous +function calls or barriers. Thrust API functions are blocking (synchronous with respect to the host) by default. + +Thrust does provide asynchronous versions of a number of algorithms. These are contained in the ``thrust::async`` namespace +(see the headers in ``rocThrust/thrust/async/``). However, these algorithms operate asynchronously by returning futures. +This approach is different from the form of asynchronous execution required within ``hipGraphs``, which must be achieved by +issuing calls into a user-defined ``hipStream``. + +While it is possible to create execution policies that encourage Thrust API algorithms to execute within a user-defined stream +(e.g. ``thrust::hip_rocprim::par.on(stream)``), this is not enough to guarantee that synchronization will not occur within +a given algorithm. This is because some Thrust functions require execution policies to be passed in at compile time (via template +arguments) rather than at runtime. Since streams must be created at runtime, there is no way to pass these functions a stream. +Adding a stream argument to such functions breaks compatibility with the Thrust API. + +For these reasons, we recommend that you do not use hipGraphs together with rocThrust code. 
\ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index bf7454ed..b6921a54 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -25,7 +25,9 @@ The documentation is structured as follows: * :ref:`data-type-support` * :ref:`bitwise-repro` + * :ref:`hipgraph-support` * :ref:`api-reference` + * :ref:`hip-execution-policies` * :ref:`genindex` To contribute to the documentation, refer to diff --git a/docs/sphinx/requirements.in b/docs/sphinx/requirements.in index 0a3deabf..c00b16b0 100644 --- a/docs/sphinx/requirements.in +++ b/docs/sphinx/requirements.in @@ -1 +1 @@ -rocm-docs-core==0.38.1 +rocm-docs-core==1.4.1 diff --git a/docs/sphinx/requirements.txt b/docs/sphinx/requirements.txt index 71cac415..967422a2 100644 --- a/docs/sphinx/requirements.txt +++ b/docs/sphinx/requirements.txt @@ -1,112 +1,106 @@ # -# This file is autogenerated by pip-compile with Python 3.8 +# This file is autogenerated by pip-compile with Python 3.10 # by the following command: # # pip-compile requirements.in # -accessible-pygments==0.0.3 +accessible-pygments==0.0.5 # via pydata-sphinx-theme -alabaster==0.7.13 +alabaster==0.7.16 # via sphinx -babel==2.12.1 +babel==2.15.0 # via # pydata-sphinx-theme # sphinx -beautifulsoup4==4.11.2 +beautifulsoup4==4.12.3 # via pydata-sphinx-theme -breathe==4.34.0 +breathe==4.35.0 # via rocm-docs-core -certifi==2023.7.22 +certifi==2024.6.2 # via requests -cffi==1.15.1 +cffi==1.16.0 # via # cryptography # pynacl -charset-normalizer==3.1.0 +charset-normalizer==3.3.2 # via requests -click==8.1.3 +click==8.1.7 # via sphinx-external-toc -cryptography==42.0.4 +cryptography==42.0.8 # via pyjwt -deprecated==1.2.13 +deprecated==1.2.14 # via pygithub -docutils==0.19 +docutils==0.21.2 # via # breathe # myst-parser # pydata-sphinx-theme # sphinx -fastjsonschema==2.16.3 +fastjsonschema==2.19.1 # via rocm-docs-core -gitdb==4.0.10 +gitdb==4.0.11 # via gitpython -gitpython==3.1.41 +gitpython==3.1.43 # via rocm-docs-core idna==3.7 # via requests 
imagesize==1.4.1 # via sphinx -importlib-metadata==6.8.0 - # via sphinx -importlib-resources==6.1.1 - # via rocm-docs-core -jinja2==3.1.3 +jinja2==3.1.4 # via # myst-parser # sphinx -markdown-it-py==2.2.0 +markdown-it-py==3.0.0 # via # mdit-py-plugins # myst-parser -markupsafe==2.1.2 +markupsafe==2.1.5 # via jinja2 -mdit-py-plugins==0.3.5 +mdit-py-plugins==0.4.1 # via myst-parser mdurl==0.1.2 # via markdown-it-py -myst-parser==1.0.0 +myst-parser==3.0.1 # via rocm-docs-core -packaging==23.0 +packaging==24.0 # via # pydata-sphinx-theme # sphinx -pycparser==2.21 +pycparser==2.22 # via cffi -pydata-sphinx-theme==0.13.3 +pydata-sphinx-theme==0.15.3 # via # rocm-docs-core # sphinx-book-theme -pygithub==1.58.1 +pygithub==2.3.0 # via rocm-docs-core -pygments==2.15.0 +pygments==2.18.0 # via # accessible-pygments # pydata-sphinx-theme # sphinx -pyjwt[crypto]==2.6.0 +pyjwt[crypto]==2.8.0 # via pygithub pynacl==1.5.0 # via pygithub -pytz==2023.3.post1 - # via babel -pyyaml==6.0 +pyyaml==6.0.1 # via # myst-parser # rocm-docs-core # sphinx-external-toc -requests==2.31.0 +requests==2.32.3 # via # pygithub # sphinx -rocm-docs-core==0.38.1 +rocm-docs-core==1.4.1 # via -r requirements.in -smmap==5.0.0 +smmap==5.0.1 # via gitdb snowballstemmer==2.2.0 # via sphinx -soupsieve==2.4 +soupsieve==2.5 # via beautifulsoup4 -sphinx==5.3.0 +sphinx==7.3.7 # via # breathe # myst-parser @@ -117,35 +111,37 @@ sphinx==5.3.0 # sphinx-design # sphinx-external-toc # sphinx-notfound-page -sphinx-book-theme==1.0.1 +sphinx-book-theme==1.1.2 # via rocm-docs-core -sphinx-copybutton==0.5.1 +sphinx-copybutton==0.5.2 # via rocm-docs-core -sphinx-design==0.4.1 +sphinx-design==0.6.0 # via rocm-docs-core -sphinx-external-toc==0.3.1 +sphinx-external-toc==1.0.1 # via rocm-docs-core -sphinx-notfound-page==0.8.3 +sphinx-notfound-page==1.0.2 # via rocm-docs-core -sphinxcontrib-applehelp==1.0.4 +sphinxcontrib-applehelp==1.0.8 # via sphinx -sphinxcontrib-devhelp==1.0.2 +sphinxcontrib-devhelp==1.0.6 # via sphinx 
-sphinxcontrib-htmlhelp==2.0.1 +sphinxcontrib-htmlhelp==2.0.5 # via sphinx sphinxcontrib-jsmath==1.0.1 # via sphinx -sphinxcontrib-qthelp==1.0.3 +sphinxcontrib-qthelp==1.0.7 # via sphinx -sphinxcontrib-serializinghtml==1.1.5 +sphinxcontrib-serializinghtml==1.1.10 # via sphinx -typing-extensions==4.5.0 - # via pydata-sphinx-theme -urllib3==1.26.18 - # via requests -wrapt==1.15.0 - # via deprecated -zipp==3.17.0 +tomli==2.0.1 + # via sphinx +typing-extensions==4.12.2 # via - # importlib-metadata - # importlib-resources + # pydata-sphinx-theme + # pygithub +urllib3==2.2.2 + # via + # pygithub + # requests +wrapt==1.16.0 + # via deprecated diff --git a/thrust/replace.h b/thrust/replace.h index 0c3fa5b1..68d55983 100644 --- a/thrust/replace.h +++ b/thrust/replace.h @@ -49,7 +49,7 @@ THRUST_NAMESPACE_BEGIN * \tparam DerivedPolicy The name of the derived execution policy. * \tparam ForwardIterator is a model of Forward Iterator, * and \p ForwardIterator is mutable. - * \tparam T is a model of Assignable, + * \tparam T is a model of Assignable, * \p T is a model of EqualityComparable, * objects of \p T may be compared for equality with objects of * \p ForwardIterator's \c value_type, @@ -101,7 +101,7 @@ __host__ __device__ * * \tparam ForwardIterator is a model of Forward Iterator, * and \p ForwardIterator is mutable. - * \tparam T is a model of Assignable, + * \tparam T is a model of Assignable, * \p T is a model of EqualityComparable, * objects of \p T may be compared for equality with objects of * \p ForwardIterator's \c value_type,