[Fetch Migration] Migrate index templates and component templates as a part of metadata migration #477

Merged
kartg merged 4 commits into opensearch-project:main from fetch-index-templates
Jan 25, 2024

Conversation

kartg
Member

@kartg kartg commented Dec 15, 2023

Description

This change enables index template and component template migration from source to target cluster as a part of the metadata migration step of Fetch Migration. Now that metadata migration performs multiple distinct steps, these have been refactored into encapsulated functions rather than being inline in the run entrypoint.

New functions have been added to index_operations.py to fetch and create component templates and index templates. The IndexTemplateInfo class derives from the ComponentTemplateInfo class because an index template can include a "template" definition that overrides its components, and it carries additional fields (such as "composed_of", "priority" and "index_patterns").
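A minimal sketch of the class relationship described above, not the code from this PR. It assumes the payload shape of one entry in the `_component_template` and `_index_template` GET responses. The `INDEX_TEMPLATE_KEY` constant, the getter names, and the sample payloads are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Key that wraps a component template's body in one response entry
DEFAULT_TEMPLATE_KEY = "component_template"
# Assumed key for index template entries (hypothetical constant name)
INDEX_TEMPLATE_KEY = "index_template"


class ComponentTemplateInfo:
    def __init__(self, template_payload: dict, template_key: str = DEFAULT_TEMPLATE_KEY):
        self.__name: str = template_payload["name"]
        self.__template_def: Optional[dict] = None
        body = template_payload.get(template_key, {})
        # The "template" definition is optional for index templates
        if "template" in body:
            self.__template_def = body["template"]

    def get_name(self) -> str:
        return self.__name

    def get_template_definition(self) -> Optional[dict]:
        return self.__template_def


class IndexTemplateInfo(ComponentTemplateInfo):
    def __init__(self, template_payload: dict):
        # Same parsing as the parent class, but under a different key
        super().__init__(template_payload, INDEX_TEMPLATE_KEY)
        body = template_payload.get(INDEX_TEMPLATE_KEY, {})
        # Fields that only index templates carry
        self.__index_patterns: list = body.get("index_patterns", [])
        self.__composed_of: list = body.get("composed_of", [])
        self.__priority: Optional[int] = body.get("priority")

    def get_index_patterns(self) -> list:
        return self.__index_patterns

    def get_composed_of(self) -> list:
        return self.__composed_of

    def get_priority(self) -> Optional[int]:
        return self.__priority
```

With this shape, an index template entry such as `{"name": "logs", "index_template": {"index_patterns": ["logs-*"], "composed_of": ["shared"]}}` parses without a "template" definition, while the inherited getters still work.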

Unit tests have been added to maintain high test coverage.

  • Category: Enhancement, New feature, Refactoring

Testing

Tested empirically by running Fetch Migration from a source cluster that had index templates and component templates defined, to a target cluster. The templates were correctly copied over (verified by diffing their structures).
Unit test coverage:

$ python -m coverage run -m unittest
............................................................................................
----------------------------------------------------------------------
Ran 92 tests in 0.869s

OK

$ python -m coverage report --omit "*/tests/*"
Name                           Stmts   Miss  Cover
--------------------------------------------------
component_template_info.py        15      0   100%
endpoint_info.py                  24      0   100%
endpoint_utils.py                106      1    99%
fetch_orchestrator.py             76      0   100%
fetch_orchestrator_params.py      22      0   100%
index_diff.py                     20      0   100%
index_doc_count.py                 5      0   100%
index_operations.py              117      0   100%
index_template_info.py             5      0   100%
metadata_migration.py             87      0   100%
metadata_migration_params.py       7      0   100%
metadata_migration_result.py       5      0   100%
migration_monitor.py              90      0   100%
migration_monitor_params.py        6      0   100%
progress_metrics.py               91      0   100%
utils.py                          13      0   100%
--------------------------------------------------
TOTAL                            689      1    99%

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

codecov bot commented Dec 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f7b10a9) 73.51% compared to head (da5b09a) 73.59%.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #477      +/-   ##
============================================
+ Coverage     73.51%   73.59%   +0.08%     
- Complexity     1180     1183       +3     
============================================
  Files           124      124              
  Lines          4886     4886              
  Branches        439      439              
============================================
+ Hits           3592     3596       +4     
+ Misses          999      997       -2     
+ Partials        295      293       -2     
Flag Coverage Δ
unittests 73.59% <ø> (+0.08%) ⬆️

Collaborator

@gregschohn gregschohn left a comment

Could you add to the comments or PR description how one would support only metadata migration?

__name: str
__template_def: Optional[dict]

def __init__(self, template_payload: dict, template_key: str = DEFAULT_TEMPLATE_KEY):
Collaborator

When would/could there be a different template_key?

Member Author

IndexTemplateInfo is a subclass of this class and overrides the template_key value:

https://github.com/kartg/opensearch-migrations/blob/917f273917867a37a4b81a9b613e14e27887a94c/FetchMigration/python/index_template_info.py#L23

This was done because the structures of the index-template and component-template responses are largely identical; they differ only in the template key string.
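To illustrate why a configurable key is enough, here is a hedged sketch of one parsing routine serving both response types. Only the `root_key` and `factory` parameter names come from this PR's `__fetch_templates` signature; `TemplateEntry`, `make_factory`, `parse_templates`, and the sample payloads are hypothetical. The root keys are assumed to be `component_templates` and `index_templates`, as in the cluster's GET responses.

```python
from typing import Callable, List, NamedTuple, Optional


class TemplateEntry(NamedTuple):
    # Hypothetical stand-in for the PR's template info classes
    name: str
    definition: Optional[dict]


def make_factory(template_key: str) -> Callable[[dict], TemplateEntry]:
    """Build a factory that reads one response entry under the given key."""
    def factory(entry: dict) -> TemplateEntry:
        body = entry.get(template_key, {})
        return TemplateEntry(entry["name"], body.get("template"))
    return factory


def parse_templates(response_body: dict, root_key: str,
                    factory: Callable[[dict], TemplateEntry]) -> List[TemplateEntry]:
    """Apply the factory to every entry listed under root_key."""
    return [factory(entry) for entry in response_body.get(root_key, [])]


component_response = {
    "component_templates": [
        {"name": "shared", "component_template": {"template": {"settings": {}}}}
    ]
}
index_response = {
    "index_templates": [
        {"name": "logs", "index_template": {"index_patterns": ["logs-*"]}}
    ]
}

components = parse_templates(component_response, "component_templates",
                             make_factory("component_template"))
indices = parse_templates(index_response, "index_templates",
                          make_factory("index_template"))
```

The two calls differ only in the key strings passed in, which mirrors the reasoning behind the `template_key` parameter.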

Comment on lines 13 to 14
import requests

Collaborator

(on the filename, not this line). Index is both a noun and a verb, so it isn't clear to me whether this applies to operations on indices (nouns) or if this is a set of operations of a particular type (verb). index_management.py, or something that could shake the ambiguity would probably help new contributors.

Member Author

Point taken. Note that I'd like to do this in a follow-up PR to keep the diff in this one focused on the template migration changes.


def __fetch_templates(endpoint: EndpointInfo, path: str, root_key: str, factory) -> set:
    url: str = endpoint.add_path(path)
    # raises RuntimeError in case of any request errors
Collaborator

Can you scope this to something more specific to disambiguate errors up the call stack? Leaving them as ConnectionError, HTTPError, etc., or preserving the exception (rather than stringifying it), would be preferable. SonarQube has some rules against using RuntimeExceptions in Java, and I think the principles apply to other languages.

Collaborator

What happens if there's an exception? It looks like it propagates up? It might be fair that this code doesn't handle it, but if there are compound instructions with dependencies, it will be harder to guarantee success without a more granular retry strategy. Saying that everything executes in < 5s and that these calls have never been observed to fail are probably valid reasons NOT to invest time in making them granular, though.

Member Author

@kartg kartg Jan 24, 2024

Thanks for this comment. I just learned that Python 3 supports exception chaining, so I'll incorporate that here instead of stringifying the underlying exception. Note that I'll continue to use RuntimeError since it's the only built-in exception type that's applicable here. I agree that exception typing in Fetch could be more specific. If you believe that's valuable, I can pick this up as a follow-up refactoring/improvement task.

What happens if there's an exception? It looks like it propagates up?

That's correct. This was an intentional decision to fail the metadata migration if templates could not be fetched (and therefore migrated).
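For reference, a small sketch of Python 3 exception chaining with `raise ... from`. This is not the commit's code: `fetch_raw` is a hypothetical helper that always fails, and `ConnectionError` stands in for whatever the underlying request error might be.

```python
def fetch_raw(url: str) -> dict:
    # Hypothetical low-level call; always fails so the chaining is visible
    raise ConnectionError(f"could not reach {url}")


def fetch_component_templates(url: str) -> dict:
    try:
        return fetch_raw(url)
    except ConnectionError as e:
        # "from e" stores the original exception on __cause__ instead of
        # flattening it into the message string
        raise RuntimeError(
            "Failed to fetch component template metadata from cluster endpoint"
        ) from e


try:
    fetch_component_templates("http://localhost:9200")
except RuntimeError as err:
    underlying = err.__cause__  # the original ConnectionError, still typed
```

Callers further up the stack can then inspect `__cause__` to tell a connection failure from other errors, and the traceback shows both exceptions.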

Member Author

Exception chaining added with this commit

Comment on lines 133 to 134
except RuntimeError as e:
    raise RuntimeError(f"Failed to fetch component template metadata from cluster endpoint: {e!s}")
Collaborator

For the reasons mentioned on the previous comment, I'd rather see the RuntimeException be propagated rather than handling and obscuring the underlying cause.

Member Author

Exception chaining added with this commit

@kartg kartg force-pushed the fetch-index-templates branch 2 times, most recently from 87f156d to d3c0435 Compare January 19, 2024 19:30
Signed-off-by: Kartik Ganesh <gkart@amazon.com>
…template migration

This includes class representations of component and index template information (and their unit tests). Index_operations now also has new functions to fetch and create component/index templates - unit test coverage for this is TBD. Finally, the template migration logic has been added to metadata_migration

Signed-off-by: Kartik Ganesh <gkart@amazon.com>
@kartg
Member Author

kartg commented Jan 24, 2024

Could you add to the comments or PR description how one would support only metadata migration?

@gregschohn This is done by including an extra flag with the orchestrator:

$ python fetch_orchestrator.py --create-only ....

This is also captured in the code's argparse / help documentation
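As a rough sketch of how such a flag is typically declared with argparse (not the orchestrator's actual parser): only the `--create-only` flag name comes from this thread. The help text and the `build_parser` function are assumptions, and the real script's other arguments are omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fetch_orchestrator.py")
    # store_true makes the flag default to False and become True when passed
    parser.add_argument(
        "--create-only",
        action="store_true",
        help="Run only the metadata migration step (assumed help text)",
    )
    return parser


# Parse an explicit argument list so the sketch does not read sys.argv
args = build_parser().parse_args(["--create-only"])
metadata_only = args.create_only
```

argparse converts the dashes to an underscore, so the value is read as `args.create_only`.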

Signed-off-by: Kartik Ganesh <gkart@amazon.com>
Signed-off-by: Kartik Ganesh <gkart@amazon.com>
@kartg kartg merged commit 71420f2 into opensearch-project:main Jan 25, 2024
8 checks passed
@kartg kartg deleted the fetch-index-templates branch January 25, 2024 18:04