How to use Chroma with YOLOv5
See the chroma docs and quick-start guide for general information about Chroma.
This PR is intended to illustrate the changes necessary to attach Chroma to a more complex model like the popular YOLO family of object detection models. It is not intended to be merged upstream. We use the YOLOv5 repository from Ultralytics as the base.
`get_embeddings.py` is a fully functional script for extracting embeddings from YOLOv5. It takes parts from the detection and validation scripts, but focuses on logging the necessary information to the Chroma client. It also contains a function for associating labels with detections when appropriate. In most cases you will not need to write such a function; it is an artifact of how the existing YOLOv5 repo is set up.
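To illustrate what associating labels with detections can look like, here is a minimal sketch that gives each detection the class of the ground-truth box it overlaps most, provided the overlap clears an IoU threshold. It is not the function from `get_embeddings.py`: the names `iou` and `associate_labels`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are all assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate_labels(
    detections: Sequence[Box],
    labels: Sequence[Tuple[int, Box]],
    iou_threshold: float = 0.5,  # hypothetical default, tune per dataset
) -> List[Optional[int]]:
    """Return, per detection, the class id of the best-overlapping label.

    A detection whose best overlap is below `iou_threshold` gets None.
    """
    matched: List[Optional[int]] = []
    for det in detections:
        best_cls, best_iou = None, iou_threshold
        for cls, box in labels:
            overlap = iou(det, box)
            if overlap >= best_iou:
                best_cls, best_iou = cls, overlap
        matched.append(best_cls)
    return matched
```

With this sketch, a detection that overlaps no label well enough keeps `None` as its label, so it can still be logged with its embedding but without a ground-truth class.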
We also modify the following:
- In `models/yolo.py`, the `Detect` class is modified so that the model outputs embeddings alongside detections in the forward pass. This could also be accomplished with a forward hook, without modifying the model, at the cost of some additional complexity. Note that this change does not require the model to be re-trained; it's fine to load any existing weight dict.
- In `utils/general.py`, we modify `non_max_suppression` so that embeddings are filtered alongside predictions when they are passed in, so that each prediction that survives is still correctly associated with its embedding.
- `utils/dataloaders.py` is modified so that we can work with datasets that have images but no corresponding labels. This is mostly a convenience.

Try the branch out for yourself!
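The idea behind the `non_max_suppression` change can be sketched in a few lines: compute which indices survive suppression, then use those same indices to select the embeddings. The code below is a pure-Python illustration of that idea only, not the tensor-based YOLOv5 implementation; the function name `nms_with_embeddings`, the `(x1, y1, x2, y2)` box layout, and the 0.45 threshold are assumptions for this example.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_with_embeddings(
    boxes: Sequence[Box],
    scores: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    iou_threshold: float = 0.45,  # hypothetical default for this sketch
) -> Tuple[List[Box], List[float], List[Sequence[float]]]:
    """Greedy NMS that keeps each surviving box paired with its embedding."""
    # Visit candidates from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap an already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    # The same index list selects boxes, scores, and embeddings, so the
    # pairing between a prediction and its embedding is preserved.
    return (
        [boxes[i] for i in keep],
        [scores[i] for i in keep],
        [embeddings[i] for i in keep],
    )
```

The key design point is that suppression produces indices rather than filtered boxes, so any per-prediction data passed in alongside the boxes can be filtered with exactly the same selection.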