
Stream aware outputs #5684

Merged: 23 commits merged from stream_aware_output into NVIDIA:main on Oct 30, 2024

Conversation

@mzient (Contributor) commented on Oct 21, 2024

Category:

  • New feature (non-breaking change which adds functionality)
  • Refactoring (redesign of existing code that doesn't affect functionality)

Description:

This PR adds support for returning the pipeline outputs as DLPack without copying.

Additional information:

Affected modules and functionalities:

This PR adds the following features (a brief usage sketch follows the list):

  • stream parameters for Pipeline Run/Outputs
  • a __dlpack__ interface for Tensors; the incomplete _expose_dlpack_capsule is removed
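
For orientation, here is a minimal, hedged sketch of how these pieces could fit together from Python. The pipeline definition is illustrative, and the exact signature of the new stream parameters on Run/Outputs is not shown in this description, so the stream handling below relies only on PyTorch's stream context rather than on any assumed DALI argument name.

# Minimal sketch (hypothetical usage, not the confirmed API from this PR):
# build a trivial GPU pipeline and hand its outputs to PyTorch via DLPack
# without a copy, using the __dlpack__ interface this PR adds to Tensors.
import torch
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

@pipeline_def(batch_size=4, num_threads=2, device_id=0)
def example_pipe():
    return fn.random.uniform(shape=[2, 2]).gpu()

pipe = example_pipe()
pipe.build()

stream = torch.cuda.Stream()
with torch.cuda.stream(stream):
    (out,) = pipe.run()
    # Each GPU tensor in `out` exposes __dlpack__, so from_dlpack wraps the
    # same device memory as a torch.Tensor; no copy is made.
    batch = [torch.from_dlpack(t) for t in out]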

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-4075

@mzient (author) commented:

TODO: Remove this class entirely.

@dali-automaton (Collaborator): CI MESSAGE: [19571767]: BUILD STARTED

@dali-automaton (Collaborator): CI MESSAGE: [19572115]: BUILD STARTED

@dali-automaton (Collaborator): CI MESSAGE: [19572115]: BUILD FAILED

@dali-automaton (Collaborator): CI MESSAGE: [19600864]: BUILD STARTED

@dali-automaton (Collaborator): CI MESSAGE: [19600864]: BUILD FAILED

Comment on lines +32 to +33
# convert the tensors in the batch to DLPack
batch = [torch.from_dlpack(t) for t in out]
A collaborator commented:

Nit: this comment is slightly confusing: it says "convert to DLPack", but the function is from_dlpack. Maybe rephrase it to indicate that we're not really converting to DLPack, but going DALI -> DLPack -> Torch (without a copy).
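
A possible rewording along these lines (just a sketch; the final wording is up to the author):

# Wrap each DALI tensor as a torch.Tensor via the DLPack protocol
# (DALI -> DLPack -> Torch); the memory is shared, nothing is copied.
batch = [torch.from_dlpack(t) for t in out]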

for t in batch:
    means[flat_idx] = torch.mean(t)
    flat_idx += 1
# those are meant to overwrite the results if synchronization fails
A collaborator commented:

Maybe check that we're actually sharing the memory:

batch_a = [torch.from_dlpack(t) for t in out]
batch_b = [torch.from_dlpack(t) for t in out]
# now change batch_b and make sure that batch_a is changed as well
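
One way the suggested check could look, as a sketch that continues the snippet above and assumes the same DALI output can be exported through __dlpack__ more than once:

# Export the same outputs twice and verify the two views alias the same memory.
batch_a = [torch.from_dlpack(t) for t in out]
batch_b = [torch.from_dlpack(t) for t in out]
for a, b in zip(batch_a, batch_b):
    b.fill_(42)               # mutate one view in place
    assert torch.equal(a, b)  # the other view must observe the change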

@mzient (author) commented:

Done.

- assert jax_array.device() == jax.devices()[0]
+ assert jax_array.device == jax.devices()[0]
A collaborator commented:

A breaking change?

@mzient (author) commented:

Slipped in... That's a breaking change... in JAX 0.4.31 :\

@mzient (author) commented:

Reverted.
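
If supporting both JAX versions in this test ever becomes necessary, one hedged way to tolerate the API change (the .device() method became a .device property around JAX 0.4.31, per the discussion above) is the sketch below, which reuses jax_array and jax from the snippet:

# Works with both the old (method) and new (property) JAX device APIs.
device_attr = getattr(jax_array, "device")
device = device_attr() if callable(device_attr) else device_attr
assert device == jax.devices()[0]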

#include "dali/core/static_switch.h"

namespace dali {

class DLTensorGraveyard {
A collaborator commented:

Maybe add some docs explaining why we need this and how it works?

@mzient (author) commented:

Will do.

@mzient (author) commented:

Done.
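
For readers of this thread, a hypothetical Python-level illustration of the lifetime question such a class presumably addresses; this is not a description of DLTensorGraveyard's actual implementation, it only shows why zero-copy DLPack export needs some mechanism to keep exported buffers alive (reusing the pipeline pipe and torch from the earlier sketch):

(out,) = pipe.run()
view = torch.from_dlpack(out[0])  # zero-copy view of DALI-owned device memory
del out                           # DALI may want to recycle its output buffers...
pipe.run()                        # ...while producing the next batch,
print(view.sum())                 # yet the exported view must still read valid data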

@mzient force-pushed the stream_aware_output branch 2 times, most recently from 9b42d28 to 71854e7 on October 29, 2024 at 14:49
mzient and others added 17 commits on October 29, 2024 at 17:43
* Add output order handling to exec2
* Add CUDA stream to Outputs and SharedOutputs in Python bindings for
  Pipeline.
* Refactor stream pointer handling in Python

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton (Collaborator): CI MESSAGE: [19878075]: BUILD STARTED

@dali-automaton (Collaborator): CI MESSAGE: [19878075]: BUILD FAILED

Adjust dlpack tests.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton (Collaborator): CI MESSAGE: [19899374]: BUILD STARTED

@NVIDIA deleted a comment from dali-automaton on Oct 30, 2024
@mzient merged commit 000ba4d into NVIDIA:main on Oct 30, 2024 (5 of 6 checks passed)
@dali-automaton (Collaborator): CI MESSAGE: [19899374]: BUILD PASSED
