[Bug]: Missing embeddings in collections after a system reboot #2905
Comments
Hi @ymzayek, is it possible to share your DB if it's not sensitive data? I'm also happy to take it via Discord/email if you don't want to post it here. That would help us debug on our end. Otherwise, can we start by knowing what files are under /<chroma_path>/<collection_id>/?
Closes #2922
Closes #2912
It might be related to #2905

## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
  - ...
- New functionality
  - ...

## Test plan
*How are these changes tested?*
- [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust

## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
@tazarov yes, it seems to explain our problem, so I think you can close this issue. Thanks!
@ymzayek, excellent. However, you will have to recreate the missing embeddings. Let me know if you need help with that.
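Recreating the missing embeddings could look roughly like the sketch below. It is an illustration under stated assumptions, not a procedure confirmed in this thread: `batched` and `reembed` are hypothetical helper names, `embed_fn` stands in for your own embedder (a callable that maps a list of texts to a list of vectors), and the only Chroma call used is `collection.update(ids=..., embeddings=...)`. Depending on the state of the index on disk, recreating the collection from scratch may be the safer route.

```python
def batched(seq, size):
    """Yield consecutive slices of `seq` with at most `size` items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def reembed(collection, embed_fn, ids, documents, batch_size=100):
    """Recompute embeddings for `documents` and write them back by id.

    `collection` is assumed to be a chromadb Collection (or any object with
    a compatible `update(ids=..., embeddings=...)` method).
    `embed_fn` is a placeholder for your own embedder.
    Returns the number of records written.
    """
    written = 0
    for id_batch, doc_batch in zip(batched(ids, batch_size),
                                   batched(documents, batch_size)):
        vectors = embed_fn(doc_batch)  # one vector per document in the batch
        collection.update(ids=id_batch, embeddings=vectors)
        written += len(id_batch)
    return written
```

The `ids` and `documents` lists can be taken from a `collection.get(include=["documents"])` call, since the chunks and metadata are reported as still present.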
@tazarov yes, we will handle that. Thanks a lot :)
@ymzayek, can you confirm whether this solved your problem?
Apologies for the delay. For the moment we haven't upgraded (after a downgrade to 0.5.5). Are there any compatibility issues between the latest version and 0.5.5? In any case, feel free to close this issue.
@ymzayek, there have been a number of bugfixes and improvements introduced in |
What happened?
Hello, I'm working on a project where we use chromadb:0.5.11 as part of RAG pipelines. We have successfully used it to create collections and query them. We use our own embedder for the queries and chunks and do not rely on the Chroma embedding method.

We have just had an issue where the embeddings in a collection appear to have been "deleted", or at least went missing, over the weekend after a reboot of the servers we work on. To be clear, query tests on the collections before the weekend and the reboot showed that the embeddings had been added correctly at creation time: retrieval returned the relevant documents as expected. Now the same tests return an empty list of documents.

I did some debugging by connecting to the Chroma server with the Chroma HTTP client to manually check the collections, and I see that the chunks and metadata still exist but the embeddings are empty. Has anyone seen this problem before? Any ideas about what could have happened, or whether it could be related to a system reboot? In the Chroma logs everything looks fine. I sometimes see `DEBUG: Starting component PersistentLocalHnswSegment`, but I'm not sure that is related.
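The manual check described above can be sketched roughly as follows. This is a minimal sketch, not the script that was actually run: the host, port, and collection name are placeholders, and `ids_missing_embeddings` and `check_collection` are hypothetical helper names. It assumes the `chromadb` Python client, where `collection.get(include=[...])` returns a dict with `ids` and, when requested, `embeddings`.

```python
def ids_missing_embeddings(result):
    """Return the ids in a `collection.get(...)` result that have no embedding."""
    ids = list(result.get("ids") or [])
    embeddings = result.get("embeddings")
    if embeddings is None:
        # No embeddings came back at all: every record counts as missing.
        return ids
    missing = []
    for i, item_id in enumerate(ids):
        emb = embeddings[i] if i < len(embeddings) else None
        if emb is None or len(emb) == 0:
            missing.append(item_id)
    return missing


def check_collection(name, host="localhost", port=8000):
    """Connect to a Chroma server and report records without embeddings.

    `host`, `port`, and `name` are placeholders for your own deployment.
    """
    import chromadb  # imported here so the helper above works without it

    client = chromadb.HttpClient(host=host, port=port)
    collection = client.get_collection(name=name)
    result = collection.get(include=["embeddings", "documents", "metadatas"])
    missing = ids_missing_embeddings(result)
    print(f"{name}: {collection.count()} records, {len(missing)} without embeddings")
    return missing
```

Running `check_collection("my_collection")` before and after a restart would show whether the documents survive while the embeddings do not.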
This is possibly related to the issues linked below, but in our use case we do not delete documents and then add new ones to the same collection. We create one collection at a time and then query it without any further modification of its data or documents.
#2512
#2062
#870
Versions
0.5.11
Relevant log output
No response