Describe the question.
Hello everyone,
I have a question regarding some design choices when building a video dataset with DALI. My pipeline consists of several steps, some of which happen inside DALI pipelines and some of which are plain Python code. Specifically, I have a WebDataset made up of tar files that contain video files, so my first step is to invoke DALI's webdataset reader within a pipeline. Next, I would like to filter out unwanted video files based on their metadata, before decoding. Then I invoke a second DALI pipeline that decodes the video files. After that, I process the decoded videos in Python (e.g., cutting them up into smaller snippets) and finally forward the snippets to another DALI processing pipeline (e.g., for resizing). A dummy version of the code looks something like this:
@pipeline_def()
def wds_extraction(paths):
    raw_video_bytes = fn.readers.webdataset(paths=paths, ...)
    return raw_video_bytes


def filter(source):
    for video_bytes in source:
        duration, fps = get_metadata(video_bytes)
        ...
        yield video_bytes, duration, fps


@pipeline_def()
def decoding(source, device):
    inputs = fn.external_source(source, num_outputs=3)  # bytes, duration, fps
    video = fn.experimental.decoders.video(inputs[0], device=device)
    return video, *inputs[1:]  # simply forward duration and fps unchanged ...


def cutting_snippets(source):
    ...


@pipeline_def()
def resizing(source):
    fn.external_source(source, ...)
    ...


def iterator(paths):
    source = wds_extraction_iter(paths)  # wraps the wds_extraction pipeline in a DALIRaggedIterator
    source = filter(source)
    source = decoding_iter(source)       # wraps the decoding pipeline in a DALIRaggedIterator
    source = cutting_snippets(source)
    source = resizing_iter(source)       # wraps the resizing pipeline in a DALIRaggedIterator
    yield from source
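For illustration, a wrapper like wds_extraction_iter could also be written as a plain generator around Pipeline.run() instead of a framework iterator (the helper below is a sketch; its name, parameters, and output handling are assumptions, not code from this issue):

# Hypothetical sketch of a wrapper such as wds_extraction_iter.
# It builds the pipeline and yields one encoded video at a time as a NumPy array.
import numpy as np

def wds_extraction_iter(paths, batch_size=1, num_threads=2, device_id=0):
    pipe = wds_extraction(paths, batch_size=batch_size,
                          num_threads=num_threads, device_id=device_id)
    pipe.build()
    while True:  # simplified; a real wrapper needs a stopping condition per epoch
        (raw_video_bytes,) = pipe.run()          # one batch as a TensorList
        for i in range(batch_size):
            yield np.array(raw_video_bytes[i])   # raw, encoded bytes of one video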
I wanted to ask whether this design is efficient despite the context switches between pure Python and DALI pipelines. Are there any performance disadvantages? Another thing that bothers me is that I have to forward each piece of data through every DALI pipeline even when it is no longer updated. For example, I extract the duration and fps of each video in the filter method and want to forward them all the way to the user at the end. Hence, I must also feed them into each DALI pipeline and simply output them again.
Is there a better way to achieve a pipeline like this?
Check for duplicates
I have searched the open bugs/issues and have found no duplicates for this bug report
Thank you for reaching out.
Your design is overall reasonable. One improvement I can think of is using a parallel external source to load and filter the videos asynchronously before decoding.
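As a rough illustration of that suggestion (the callable, the path list, and the parameters below are assumptions made for the sketch, not code from this issue), a parallel external source could look like:

# Hypothetical sketch: a per-sample callable executed in worker processes via
# DALI's parallel external source, so loading/filtering overlaps with decoding.
import numpy as np
from nvidia.dali import pipeline_def, fn

def load_filtered_sample(sample_info):
    # filtered_video_paths is assumed to be a module-level list of paths that
    # passed the metadata filter, so it is available in the worker processes.
    path = filtered_video_paths[sample_info.idx_in_epoch]
    with open(path, "rb") as f:
        return np.frombuffer(f.read(), dtype=np.uint8)

@pipeline_def(batch_size=4, num_threads=2, device_id=0,
              py_num_workers=4, py_start_method="spawn")
def decoding_parallel():
    encoded = fn.external_source(source=load_filtered_sample,
                                 parallel=True, batch=False)
    return fn.experimental.decoders.video(encoded, device="mixed")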
Regarding the metadata you are passing through the pipelines, my impression is that it is not heavy, so the overhead will be small.
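Should the pass-through ever become inconvenient, one possible alternative (not something proposed in this issue, just a sketch under the assumption that the decoding pipeline preserves sample order) is to keep duration and fps in a plain Python side channel and re-attach them after decoding:

# Hypothetical sketch: carry duration/fps in a Python-side list instead of
# feeding them through every DALI pipeline. Because a sample must be consumed
# by the pipeline before it can be output, metadata[i] is filled in before the
# i-th decoded video is yielded, even with prefetching.
def split_metadata(source):
    metadata = []                          # (duration, fps) per consumed sample

    def bytes_only():
        for video_bytes, duration, fps in source:
            metadata.append((duration, fps))
            yield video_bytes

    return bytes_only(), metadata

def iterator_with_side_channel(paths):
    source = filter(wds_extraction_iter(paths))
    bytes_stream, metadata = split_metadata(source)
    for i, video in enumerate(decoding_iter(bytes_stream)):  # decodes bytes only
        duration, fps = metadata[i]        # re-attach metadata by position
        yield video, duration, fps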