More efficient video processing #170

Merged: 3 commits merged into main on Jul 10, 2023

Conversation

@ejolly (Contributor) commented on Jun 22, 2023

This is an attempt to dramatically lower the memory usage of `.detect_video()`. This PR touches #139 and #165.

Previously we relied on torch's `read_video`, which always reads all video frames into memory at once. Our `skip_frames` argument was then used to slice the frame data during processing to speed things up. This approach is reasonable for short videos, lower framerates, or smaller resolutions, but incredibly inefficient for larger files.
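For reference, here's a minimal sketch of that eager approach; the function name and exact plumbing are illustrative, not the actual py-feat internals:

```python
from torchvision.io import read_video

def load_all_frames(video_path, skip_frames=1):
    # frames: (num_frames, height, width, channels) uint8 tensor, fully
    # materialized in RAM before any slicing happens
    frames, _audio, info = read_video(video_path, pts_unit="sec")
    # skip_frames only reduces downstream processing work, not peak memory
    return frames[::skip_frames], info
```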

We had hoped torch would raise an informative error when this process ran out of memory, but multiple user reports (which we reproduced locally) show that the kernel just crashes, hangs, or even freezes the whole computer... hardly a graceful failure.

While torch has a `VideoReader` class, it's currently in beta and using it requires compiling torch from source 🙃.

I first tried storing video frame counts and making repeated calls to `read_video` on a per-frame basis, but getting the `pts` and timing right was non-trivial.

So, to avoid adding another hard-to-install dependency like OpenCV, this PR instead lazy-loads video frames by wrapping and slicing a PyAV generator object; PyAV is the library that torch's `read_video` uses under the hood.
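A minimal sketch of the trick, assuming PyAV is importable as `av` (it's already a torchvision dependency); `frame_generator` and `get_frame` are hypothetical names for illustration, not the PR's actual helpers:

```python
import itertools
import av

def frame_generator(video_path):
    # Decode frames one at a time; nothing is materialized up front
    container = av.open(video_path)
    for frame in container.decode(video=0):
        yield frame.to_ndarray(format="rgb24")  # (H, W, 3) uint8 array

def get_frame(video_path, idx):
    # "Slicing" the generator: decode up to frame idx and keep only that
    # one, so peak memory is a single frame rather than the whole video
    gen = frame_generator(video_path)
    return next(itertools.islice(gen, idx, idx + 1))
```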

I've verified that even extremely long, high-resolution videos work with `.detect_video()`, and that the approach still works with batching, since we're still wrapping our `VideoDataset` in a torch `DataLoader`.
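As a rough illustration of why batching still composes with lazy loading, here's a hedged, self-contained sketch (the real `VideoDataset` lives in this PR; the class name and file name below are placeholders):

```python
import itertools
import av
import torch
from torch.utils.data import Dataset, DataLoader

class LazyVideoDataset(Dataset):
    """Toy stand-in for the PR's VideoDataset: one frame decoded per item."""

    def __init__(self, video_path, num_frames, skip_frames=1):
        self.video_path = video_path
        self.frame_idx = list(range(0, num_frames, skip_frames))

    def __len__(self):
        return len(self.frame_idx)

    def __getitem__(self, i):
        # A fresh generator per call: decode only up to the requested frame
        container = av.open(self.video_path)
        frames = (f.to_ndarray(format="rgb24") for f in container.decode(video=0))
        target = self.frame_idx[i]
        arr = next(itertools.islice(frames, target, target + 1))
        return torch.from_numpy(arr)

# Batches are assembled frame by frame, so peak memory stays at roughly
# batch_size frames instead of the whole video
loader = DataLoader(LazyVideoDataset("clip.mp4", num_frames=3000), batch_size=8)
```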

@ljchang @TiankangXie If you could test this branch on GPU machines or other platforms to see whether we're incurring any unexpected additional overhead, that would be super helpful!

@ejolly (Contributor, Author) commented on Jun 29, 2023

@ljchang This PR now includes approximate times for each processed frame as a column in the Fex output. Also, I realized that parallelization is possible on top of batching because `DataLoader` takes a `num_workers` argument. For this reason I'm keeping the current strategy of creating a new generator for each call to `load_frame` in `VideoDataset`, rather than using a single shared generator, to avoid race conditions (sketched below).
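A hedged sketch of the parallel decode, reusing the toy `LazyVideoDataset` from the sketch above: because each `__getitem__` call opens its own container, worker processes never share decoder state.

```python
from torch.utils.data import DataLoader

# num_workers spawns separate worker processes, each with its own copy of
# the dataset; a fresh generator per call means no shared decoder to race on
loader = DataLoader(
    LazyVideoDataset("clip.mp4", num_frames=3000),
    batch_size=8,
    num_workers=4,
)
for batch in loader:
    pass  # batch shape: (8, H, W, 3), frames decoded in parallel workers
```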

@ejolly merged commit 3973bf1 into main on Jul 10, 2023
1 of 10 checks passed