Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore improvement: index #4044

Merged
merged 10 commits into from
Oct 3, 2024
Merged

Restore improvement: index #4044

merged 10 commits into from
Oct 3, 2024

Conversation

Michal-Leszczynski
Copy link
Collaborator

@Michal-Leszczynski Michal-Leszczynski commented Sep 25, 2024

This PR is the core of restore improvements prepared for SM 3.4.
It refactors the general workflow from restoring location by location, manifest by manifest, table by table, to a version where nothing waits on nothing! All of the remote sstable dirs are indexed at the beginning, fed to batch dispatcher, which controls batch creation. For now, the batch dispatcher creates batches in the same way as before, but it will be imporved as a part of #3979.

Rebased on #4040.

Fixes #3981

@Michal-Leszczynski Michal-Leszczynski force-pushed the ml/restore-index branch 3 times, most recently from db5c9ec to 1f15c79 Compare September 26, 2024 12:41
Right now SM restores location by location, manifest by manifest, table by table.
That's why it tracks restore progress by keeping location/manifest/table in the DB.
We are moving away from sequential restore approach in favor of
restoring from all locations/manifests/tables at the same time.
…dFiles

There is no need to iterate versioned files (ListVersionedFiles) and not versioned files (buildFilesSizesCache) separately.
Doing it in a single iteration is faster, and it allows to store all size information in a single place.
@Michal-Leszczynski
Copy link
Collaborator Author

@karol-kokoszka This PR is ready for review!
It already passed CI once, but I decided to split one commit, so the CI is running again. Let's hope this refactor won't break anything.

@mykaul
Copy link
Contributor

mykaul commented Sep 30, 2024

This PR is the core of restore improvements prepared for SM 3.4. It refactors the general workflow from restoring location by location, manifest by manifest, table by table, to a version where nothing waits on nothing! All of the remote sstable dirs are indexed at the beginning, fed to batch dispatcher, which controls batch creation. For now, the batch dispatcher creates batches in the same way as before, but it will be imporved as a part of #3979.

Rebased on #4040.

Fixes #3981

General note - I'm lacking some documentation here. For example, how do we sort them (if we do) after indexing? do we have any way to control it, etc.

@karol-kokoszka
Copy link
Collaborator

We have just the document (design doc?) describing the restore in Scylla Manager 3.1.
https://docs.google.com/document/d/1ugf6XCs28c4pwKsw9Lbu_EFb6ZSgMpJ_-Usf-Q9XAP4/edit#heading=h.f7jv2tbf2ofd

I agree that all the details that are going to be included into 3.4 must be documented.

We can either make it public, by editing README.md of https://github.com/scylladb/scylla-manager/tree/master/pkg/service/restore and putting reference to it in main README, or by just preparing new design doc documenting all the changes/updates or current state.

The design, list of changes were introduced on our meeting where we wanted to understand the bottlenecks from the current implementation. We haven't created formal design doc.... but just set of issues.

Issue for the documentation #4051

Copy link
Collaborator

@karol-kokoszka karol-kokoszka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

pkg/service/restore/index.go Outdated Show resolved Hide resolved
pkg/service/restore/index.go Outdated Show resolved Hide resolved
pkg/service/restore/index.go Show resolved Hide resolved
pkg/service/restore/index.go Outdated Show resolved Hide resolved
pkg/service/restore/index.go Show resolved Hide resolved
@Michal-Leszczynski
Copy link
Collaborator Author

@karol-kokoszka I addressed all the comments, please take a look.

@Michal-Leszczynski
Copy link
Collaborator Author

@karol-kokoszka
Copy link
Collaborator

Please check why Sanity check fails.

@Michal-Leszczynski
Copy link
Collaborator Author

Failures are known issues/flakes:

Please check why Sanity check fails.

Because of the known flake: #3848

This commit introduces the structure of restore workload.
Workload is divided per location->table->remote sstable directory.
This changes the hierarchy established by manifests (location->node->table->remote sstable dir).
It also aggregates files into actual sstables, extracts their IDs, and aggregates their sizes,
and keeps track of sstable versioning.
Indexed workload won't contain sstables that were already
restored during previous restore run.
This is a temporary implementation used for integrating workload
indexing with the rest of the code. It will be improved as a part
of the #3979.
This commit makes use of the new indexing and batching
approaches and uses them in the restore tables codebase.
Recent commits changed versioned batch download so that
if any sstable component is versioned, then all sstable components
are downloaded as versioned files.
It was done in that way to allow easier versioned progress calculation
(we don't store per file size, only the whole sstable size).

This brought to light a bug (that existed before, but was more difficult to hit),
in which restoring batch failed when the whole batch was versioned,
as calling RcloneSyncCopyPaths on empty paths parameter resulted in broken download.

We could just skip the RcloneSyncCopyPaths call when the whole batch is versioned,
but this would leave us without the agentJobID which is a part of sort key
in RestoreRunProgress. Without it, we could potentially overwrite one
restore run progress with another - if both of them happened on the same
RemoteSSTableDir, by the same Host, and were fully versioned.
It would also introduce a different path for restoring regular batch and fully versioned batch,
which is not desirable.

That's why I decided to modify rclone server to allow empty path parameter,
so that it still generates agentJobID, but it doesn't do anything except for that.
Workload info contains location/table/remote sstable dir sstable count,
total size, max and average sstable size.
@Michal-Leszczynski Michal-Leszczynski merged commit 14aef7b into master Oct 3, 2024
49 of 51 checks passed
@Michal-Leszczynski Michal-Leszczynski deleted the ml/restore-index branch October 3, 2024 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create batches from all manifest at once
3 participants