Quest: Add SAM Preprocessing for Videos #206

gkielian · 2024-07-29T23:38:39Z

One way LLMs can process video is via image segmentation: each frame is segmented, and each entity is labeled with a token in an ASCII rendering of the frame.

This method can substantially reduce the context length required.

Adding this recent repo for pre-processing; its output could then be used for next-frame prediction, including with actions in the loop (as demonstrated previously at the Mistral SF hackathon):
https://github.com/facebookresearch/segment-anything-2
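
To make the ASCII labeling idea concrete, here is a minimal sketch of one possible encoding. It assumes the segmentation step has already produced an integer label mask per frame (0 for background, 1..N for entities); it does not call SAM 2 itself. The function name `mask_to_ascii`, the `cell` parameter, and the letter palette are hypothetical choices for illustration, not part of any existing API.

```python
from collections import Counter
import string

# Hypothetical palette: "." for background (label 0), then one letter per entity.
# This toy palette only supports entity IDs 0..26.
PALETTE = "." + string.ascii_uppercase


def mask_to_ascii(mask, cell=2):
    """Downsample an integer label mask into rows of ASCII characters.

    mask: list of equal-length rows of entity IDs (0 = background).
    cell: side length, in pixels, of the block summarized by one character.
    """
    height, width = len(mask), len(mask[0])
    rows = []
    for top in range(0, height, cell):
        chars = []
        for left in range(0, width, cell):
            block = [
                mask[y][x]
                for y in range(top, min(top + cell, height))
                for x in range(left, min(left + cell, width))
            ]
            # Majority vote: the most common label in the block wins.
            label = Counter(block).most_common(1)[0][0]
            chars.append(PALETTE[label])
        rows.append("".join(chars))
    return "\n".join(rows)


# Toy 4x8 "frame" with two entities (IDs 1 and 2) on a background of 0.
frame = [
    [0, 0, 1, 1, 0, 0, 2, 2],
    [0, 0, 1, 1, 0, 0, 2, 2],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]
ascii_frame = mask_to_ascii(frame, cell=2)
print(ascii_frame)
```

With `cell=2`, the 32-pixel toy frame becomes 8 characters. How many LLM tokens that corresponds to depends on the tokenizer, so the actual context savings would need to be measured on real frames.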
