improve embeddings handling #67

Merged
merged 41 commits into main from tableMethod
Mar 5, 2024

ChuckHend commented Mar 2, 2024

  • Changes the default embedding location from a new column on the existing table to a new, separate table.
  • Updates the triggers so that they always process all inserts/updates as a single batch rather than treating each row as a separate item. This is a major performance improvement.
  • Realtime updates are now only compatible with embeddings stored in their own table (TableMethod::join). This improves efficiency in cases where realtime is beneficial, because the large tuples are not rewritten twice, and it keeps the triggers simpler: if the embeddings were on the same table, writing the embeddings would itself fire the update trigger, causing an infinite loop.
  • Adds a view that joins the embeddings to the source table.
  • Adds a migration that transitions all existing realtime vectorize jobs from TableMethod::append to TableMethod::join, moving their embeddings to their own table in the vectorize schema.
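
The rules in the list above can be sketched in a few lines of Rust. This is an illustrative model only, not the code from this PR: `TableMethod::append` and `TableMethod::join` are the names used in the description, while `Schedule`, `EmbeddingJob`, `validate`, `migrated_method`, and `enqueue_batched` are hypothetical names introduced for the example.

```rust
// Illustrative sketch only. Apart from TableMethod::append and
// TableMethod::join, every name here is a hypothetical stand-in.

/// Where embeddings are stored, mirroring the two variants named in the PR.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TableMethod {
    /// Embeddings live in a new column on the source table.
    append,
    /// Embeddings live in their own table, joined back to the source by a view.
    join,
}

/// How a job is kept up to date (assumed for the example).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Schedule {
    /// Trigger-driven updates.
    Realtime,
    /// Any non-realtime schedule.
    Scheduled,
}

/// Realtime is only allowed with `join`. With `append`, writing the embedding
/// back would update the source table and fire the update trigger again.
pub fn validate(method: TableMethod, schedule: Schedule) -> Result<(), String> {
    match (method, schedule) {
        (TableMethod::append, Schedule::Realtime) => {
            Err("realtime requires TableMethod::join".to_string())
        }
        _ => Ok(()),
    }
}

/// Migration rule: existing realtime jobs using `append` move to `join`;
/// every other job keeps its current method.
pub fn migrated_method(method: TableMethod, schedule: Schedule) -> TableMethod {
    match (method, schedule) {
        (TableMethod::append, Schedule::Realtime) => TableMethod::join,
        (other, _) => other,
    }
}

/// One unit of embedding work covering a set of changed rows.
#[derive(Debug, PartialEq, Eq)]
pub struct EmbeddingJob {
    pub record_ids: Vec<i64>,
}

/// Batched trigger behaviour: all changed rows become a single job
/// instead of one job per row. No rows means no job.
pub fn enqueue_batched(changed_ids: &[i64]) -> Vec<EmbeddingJob> {
    if changed_ids.is_empty() {
        return Vec::new();
    }
    vec![EmbeddingJob { record_ids: changed_ids.to_vec() }]
}
```

Under these assumptions, `validate(TableMethod::append, Schedule::Realtime)` returns an error, `migrated_method` maps that same combination to `TableMethod::join`, and `enqueue_batched(&[1, 2, 3])` yields one job carrying all three ids rather than three separate jobs.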

ChuckHend marked this pull request as ready for review March 4, 2024 19:12
ChuckHend merged commit e56ee15 into main Mar 5, 2024
6 checks passed
ChuckHend deleted the tableMethod branch March 5, 2024 04:16