
Replicate reap requests #5

Open
martinsumner opened this issue Sep 8, 2023 · 1 comment
@martinsumner

Both un-reaped tombstones and the process of reaping tombstones will lead to deltas between clusters that must be resolved by some mechanism (e.g. ttaae full-sync). The full-sync process adds load to the cluster and takes time - time in which other deltas may not be discovered.

For example, with a delete_mode other than keep, a batch of deletes that occurs whilst a node is down in a standby cluster will lead to a situation where all the deletes are reaped on one cluster, but some of the deletes (those with a preflist including a primary on the failed node) are not reaped on the other cluster - creating a potentially large discrepancy for full-sync to resolve. This is because tombstones are only reaped when all primaries for that key are available in the cluster.
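
For reference, delete_mode is set in the riak_kv section of advanced.config. A minimal sketch of the documented options (the value shown is illustrative):

```erlang
%% advanced.config (extract) - delete_mode governs what happens to a
%% tombstone once a delete has been acknowledged.
[
 {riak_kv, [
    %% keep          - retain tombstones indefinitely
    %% immediate     - remove the tombstone as soon as the delete completes
    %% integer (ms)  - remove after a delay (the default is 3000)
    {delete_mode, keep}
 ]}
].
```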

The safest, and recommended, delete_mode is keep. This addresses the full-sync issue, but each tombstone has an overhead within the vnodes that hold it (typically a memory overhead per key). In high-churn environments, reap policies may still be required to address this overhead (e.g. reaping tombstones older than 30 days) - but for efficiency this requires two sync'd clusters to reap independently with full-sync disabled (as otherwise full-sync will continuously try to resurrect tombstones intended for deletion).
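
As an illustration of such a policy, tombstones can be targeted by modified date with a reap_tombs aae_fold. A hedged sketch from riak remote_console, assuming the reap_tombs query form taken by riak_client:aae_fold - the 30-day window and the local change method are illustrative:

```erlang
%% Reap all tombstones last modified more than 30 days ago. Today this
%% must be run on each sync'd cluster independently, with full-sync
%% disabled - the coordination gap this issue aims to close.
{ok, C} = riak:local_client(),
Now = os:system_time(second),
Query = {reap_tombs,
         all,                          %% all buckets
         all,                          %% all keys
         all,                          %% no segment filter
         {date, 0, Now - 30 * 86400},  %% modified more than 30 days ago
         local},                       %% reap via the local riak_kv_reaper
riak_client:aae_fold(Query, C).
```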

To resolve this problem, it would be preferable for nextgenrepl to replicate reap requests made via the riak_kv_reaper, so that when reaping occurs on one cluster (e.g. as a result of an aae_fold), the reap is replicated to sync'd clusters - coordinating the reap across the environment and keeping the clusters in sync.
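
A minimal sketch of the intent, not an implementation: the module below is hypothetical, with local_reap/1 and enqueue_for_sinks/2 standing in for the real riak_kv_reaper and replication-queue (riak_kv_replrtq_src) plumbing this change would touch.

```erlang
%% Hypothetical sketch only - none of these names exist in riak_kv.
-module(reap_repl_sketch).
-export([reap_and_replicate/2, sink_handle_reap/1]).

-type reap_ref() :: {{binary(), binary()}, non_neg_integer()}.

%% Source cluster: reap locally, then enqueue the reap reference on a
%% real-time replication queue so that sink clusters reap in step.
-spec reap_and_replicate(reap_ref(), atom()) -> ok.
reap_and_replicate(ReapRef, QueueName) ->
    ok = local_reap(ReapRef),
    enqueue_for_sinks(QueueName, ReapRef).

%% Sink cluster: a replicated reap is applied through the local reaper,
%% so no delta is ever created for full-sync to find.
-spec sink_handle_reap(reap_ref()) -> ok.
sink_handle_reap(ReapRef) ->
    local_reap(ReapRef).

%% Stand-in for handing the reference to riak_kv_reaper.
-spec local_reap(reap_ref()) -> ok.
local_reap({{Bucket, Key}, DeleteHash}) ->
    io:format("reap ~p/~p (hash ~p)~n", [Bucket, Key, DeleteHash]),
    ok.

%% Stand-in for queueing onto the nextgenrepl real-time queue.
-spec enqueue_for_sinks(atom(), reap_ref()) -> ok.
enqueue_for_sinks(QueueName, ReapRef) ->
    io:format("queue ~p <- ~p~n", [QueueName, ReapRef]),
    ok.
```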

The intention is that it should be possible for an operator to run an aae_fold that removes tombstones from multiple sync'd clusters, so that full-sync can continue running during the reap without resurrecting any tombstones or detecting any sync issues.

@martinsumner (Author)

#6
