Add support for random prefix with deletion vector in Delta Lake #24040

Open
wants to merge 1 commit into
base: master
Conversation

ebyhr
Member

@ebyhr ebyhr commented Nov 6, 2024

Description

Support the delta.randomizeFilePrefixes and delta.randomPrefixLength table properties to account for the S3 request rate limit per prefix.
This change places each deletion vector binary file under a directory whose name is N random alphanumeric characters, for example:

s3://bucket/schema/table/f0B/deletion_vector_bc33e4fa-ee86-412d-9fa3-44606c38abe2.bin
s3://bucket/schema/table/K08/deletion_vector_fd194359-4323-4349-971c-67fd066c28f3.bin

The corresponding Spark implementation is in DMLWithDeletionVectorsHelper.scala#L350-L359.
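
For illustration, here is a minimal sketch of how such a path could be built. It is not the code in this PR or in Spark: the class `DeletionVectorPathSketch`, both method names, and all parameters are hypothetical, and the assumption that the file sits directly under the table location when randomization is disabled is mine.

```java
import java.util.Random;
import java.util.UUID;

// Hypothetical helper for illustration only; not the connector's actual code.
final class DeletionVectorPathSketch
{
    private static final String ALPHANUMERIC =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    private DeletionVectorPathSketch() {}

    // Returns `length` characters drawn uniformly from [A-Za-z0-9].
    static String randomPrefix(Random random, int length)
    {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        StringBuilder prefix = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            prefix.append(ALPHANUMERIC.charAt(random.nextInt(ALPHANUMERIC.length())));
        }
        return prefix.toString();
    }

    // Builds <tableLocation>/<prefix>/deletion_vector_<uuid>.bin when
    // randomization is enabled. Assumption: without it, the file is placed
    // directly under the table location.
    static String deletionVectorPath(
            String tableLocation,
            boolean randomizeFilePrefixes,
            int randomPrefixLength,
            Random random,
            UUID id)
    {
        String base = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
        String fileName = "deletion_vector_" + id + ".bin";
        if (!randomizeFilePrefixes) {
            return base + fileName;
        }
        return base + randomPrefix(random, randomPrefixLength) + "/" + fileName;
    }
}
```

Calling `deletionVectorPath("s3://bucket/schema/table", true, 3, new Random(), UUID.randomUUID())` yields a path shaped like the examples above. The two boolean and integer arguments stand in for the values of delta.randomizeFilePrefixes and delta.randomPrefixLength.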

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.

@cla-bot cla-bot bot added the cla-signed label Nov 6, 2024
@github-actions github-actions bot added the delta-lake Delta Lake connector label Nov 6, 2024
@ebyhr ebyhr force-pushed the ebi/delta-dv-random-prefix branch 2 times, most recently from a1a99ee to 7427b76 November 6, 2024 08:06
Labels
cla-signed delta-lake Delta Lake connector
1 participant