[Fetch Migration] Migrate index templates and component templates as a part of metadata migration #477
Changes from all commits
e8938b2
917f273
aaf3761
da5b09a
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# | ||
# Copyright OpenSearch Contributors | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# The OpenSearch Contributors require contributions made to | ||
# this file be licensed under the Apache-2.0 license or a | ||
# compatible open source license. | ||
# | ||
|
||
|
||
# Constants | ||
from typing import Optional | ||
|
||
NAME_KEY = "name" | ||
DEFAULT_TEMPLATE_KEY = "component_template" | ||
|
||
|
||
# Class that encapsulates component template information | ||
class ComponentTemplateInfo: | ||
# Private member variables | ||
__name: str | ||
__template_def: Optional[dict] | ||
|
||
def __init__(self, template_payload: dict, template_key: str = DEFAULT_TEMPLATE_KEY): | ||
self.__name = template_payload[NAME_KEY] | ||
self.__template_def = None | ||
if template_key in template_payload: | ||
self.__template_def = template_payload[template_key] | ||
|
||
def get_name(self) -> str: | ||
return self.__name | ||
|
||
def get_template_definition(self) -> Optional[dict]: | ||
return self.__template_def |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,15 +12,21 @@ | |
import jsonpath_ng | ||
import requests | ||
|
||
Review comment on lines 13 to 14 (on the filename, not this line): Index is both a noun and a verb, so it isn't clear to me whether this file applies to operations on indices (noun) or is a set of operations of a particular type (verb). index_management.py, or some other name that removes the ambiguity, would probably help new contributors.
Reply: Point taken. Note that I'd like to do this in a follow-up PR to keep the diff in this one focused on the template migration changes. |
||
from component_template_info import ComponentTemplateInfo | ||
from endpoint_info import EndpointInfo | ||
from index_doc_count import IndexDocCount | ||
from index_template_info import IndexTemplateInfo | ||
|
||
# Constants | ||
SETTINGS_KEY = "settings" | ||
MAPPINGS_KEY = "mappings" | ||
ALIASES_KEY = "aliases" | ||
COUNT_KEY = "count" | ||
__INDEX_KEY = "index" | ||
__COMPONENT_TEMPLATE_LIST_KEY = "component_templates" | ||
__INDEX_TEMPLATE_LIST_KEY = "index_templates" | ||
__INDEX_TEMPLATES_PATH = "/_index_template" | ||
__COMPONENT_TEMPLATES_PATH = "/_component_template" | ||
__ALL_INDICES_ENDPOINT = "*" | ||
# (ES 7+) size=0 avoids the "hits" payload to reduce the response size since we're only interested in the aggregation, | ||
# and track_total_hits forces an accurate doc-count | ||
|
@@ -106,3 +112,58 @@ def doc_count(indices: set, endpoint: EndpointInfo) -> IndexDocCount: | |
return IndexDocCount(total, count_map) | ||
except RuntimeError as e: | ||
raise RuntimeError(f"Failed to fetch doc_count: {e!s}") | ||
|
||
|
||
def __fetch_templates(endpoint: EndpointInfo, path: str, root_key: str, factory) -> set: | ||
url: str = endpoint.add_path(path) | ||
# raises RuntimeError in case of any request errors | ||
Review comment: Can you scope this to something more specific to disambiguate errors up the call stack? Leaving them as ConnectionError, HTTPError, etc., or preserving the exception (rather than stringifying it) would be preferable. SonarQube has some rules against using RuntimeExceptions in Java, and I think the principles apply to other languages.
Review comment: What happens if there's an exception? It looks like it propagates up? It might be fair that this code doesn't handle it, but if there are compound instructions with dependencies, it will be harder to guarantee success without a more granular retry strategy. Saying that everything executes in < 5s and that these calls have never been observed to fail are probably valid reasons NOT to invest time in making them granular, though.
Reply: Thanks for this comment. I just learned that Python 3 supports exception chaining, so I'll incorporate that here instead of stringifying the underlying exception. Note that I'll continue to use
That's correct. This was an intentional decision to fail the metadata migration if templates could not be fetched (and therefore migrated).
Reply: Exception chaining added with this commit |
||
try: | ||
resp = __send_get_request(url, endpoint) | ||
result = set() | ||
if root_key in resp.json(): | ||
for template in resp.json()[root_key]: | ||
result.add(factory(template)) | ||
return result | ||
except RuntimeError as e: | ||
# Chain the underlying exception as a cause | ||
raise RuntimeError("Failed to fetch template metadata from cluster endpoint") from e | ||
|
||
|
||
def fetch_all_component_templates(endpoint: EndpointInfo) -> set[ComponentTemplateInfo]: | ||
try: | ||
# raises RuntimeError in case of any request errors | ||
return __fetch_templates(endpoint, __COMPONENT_TEMPLATES_PATH, __COMPONENT_TEMPLATE_LIST_KEY, | ||
lambda t: ComponentTemplateInfo(t)) | ||
except RuntimeError as e: | ||
raise RuntimeError("Failed to fetch component template metadata") from e | ||
|
||
|
||
def fetch_all_index_templates(endpoint: EndpointInfo) -> set[IndexTemplateInfo]: | ||
try: | ||
# raises RuntimeError in case of any request errors | ||
return __fetch_templates(endpoint, __INDEX_TEMPLATES_PATH, __INDEX_TEMPLATE_LIST_KEY, | ||
lambda t: IndexTemplateInfo(t)) | ||
except RuntimeError as e: | ||
raise RuntimeError("Failed to fetch index template metadata") from e | ||
|
||
|
||
def __create_templates(templates: set[ComponentTemplateInfo], endpoint: EndpointInfo, template_path: str) -> dict: | ||
failures = dict() | ||
for template in templates: | ||
template_endpoint = endpoint.add_path(template_path + "/" + template.get_name()) | ||
try: | ||
resp = requests.put(template_endpoint, auth=endpoint.get_auth(), verify=endpoint.is_verify_ssl(), | ||
json=template.get_template_definition(), timeout=__TIMEOUT_SECONDS) | ||
resp.raise_for_status() | ||
except requests.exceptions.RequestException as e: | ||
failures[template.get_name()] = e | ||
# Loop completed, return failures if any | ||
return failures | ||
|
||
|
||
def create_component_templates(templates: set[ComponentTemplateInfo], endpoint: EndpointInfo) -> dict: | ||
return __create_templates(templates, endpoint, __COMPONENT_TEMPLATES_PATH) | ||
|
||
|
||
def create_index_templates(templates: set[IndexTemplateInfo], endpoint: EndpointInfo) -> dict: | ||
return __create_templates(templates, endpoint, __INDEX_TEMPLATES_PATH) |
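The diff above handles errors in two different ways: the fetch functions re-raise with a chained cause, while the create functions collect per-template failures and keep going. A minimal, network-free sketch of both patterns follows. The helper names `send_get_request`, `flaky_put`, and `root_cause`, the URL, and the use of `ValueError` in place of `requests.exceptions.RequestException` are all assumptions made for illustration and are not part of the PR.

```python
def send_get_request(url: str) -> dict:
    # Hypothetical stand-in for the module's GET helper; it always fails here
    raise RuntimeError(f"GET {url} failed")


def fetch_templates(url: str) -> dict:
    try:
        return send_get_request(url)
    except RuntimeError as e:
        # "from e" keeps the original exception on __cause__
        # instead of flattening it into the message string
        raise RuntimeError("Failed to fetch template metadata") from e


def create_templates(names: list, put_template) -> dict:
    # Collect per-template failures instead of aborting on the first one
    failures = {}
    for name in names:
        try:
            put_template(name)
        except ValueError as e:  # stand-in for requests.exceptions.RequestException
            failures[name] = e
    return failures


def flaky_put(name: str) -> None:
    # Hypothetical PUT that rejects one template name
    if name == "bad":
        raise ValueError(f"PUT rejected for {name}")


def root_cause(exc: BaseException) -> BaseException:
    # Walk the __cause__ links down to the first exception that was raised
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


try:
    fetch_templates("http://source:9200/_index_template")
except RuntimeError as err:
    fetch_error = err

failures = create_templates(["good", "bad"], flaky_put)
print(root_cause(fetch_error))
print(sorted(failures))
```

With chaining, a caller that only sees the outermost `RuntimeError` can still reach the underlying failure through `__cause__`, which seems to be what the review thread was asking for. The failure dictionary, by contrast, lets the caller decide whether a partially successful template creation is acceptable.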
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# | ||
# Copyright OpenSearch Contributors | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# The OpenSearch Contributors require contributions made to | ||
# this file be licensed under the Apache-2.0 license or a | ||
# compatible open source license. | ||
# | ||
|
||
from component_template_info import ComponentTemplateInfo | ||
|
||
# Constants | ||
INDEX_TEMPLATE_KEY = "index_template" | ||
|
||
|
||
# Class that encapsulates index template information from a cluster. | ||
# Subclass of ComponentTemplateInfo because the structure of an index | ||
# template is identical to a component template, except that it uses | ||
# a different template key. Also, index templates can be "composed" of | ||
# one or more component templates. | ||
class IndexTemplateInfo(ComponentTemplateInfo): | ||
def __init__(self, template_payload: dict): | ||
super().__init__(template_payload, INDEX_TEMPLATE_KEY) |
Review comment: When would/could there be a different template_key?
Reply: IndexTemplateInfo is a subclass of this class and overrides the template_key value: https://github.com/kartg/opensearch-migrations/blob/917f273917867a37a4b81a9b613e14e27887a94c/FetchMigration/python/index_template_info.py#L23
This was done because the structures of the index-template and component-template responses are largely identical, differing only in the template key string.
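To make the answer above concrete, here is a small self-contained sketch of how one base class can parse both response shapes when only the template key differs. The classes are condensed re-statements of the ones in the diff (single-underscore attributes and `dict.get` are simplifications), and the two payload entries are illustrative samples, not data from the PR.

```python
from typing import Optional

NAME_KEY = "name"


class ComponentTemplateInfo:
    def __init__(self, template_payload: dict, template_key: str = "component_template"):
        self._name = template_payload[NAME_KEY]
        # Stays None when the payload has no entry under template_key
        self._template_def: Optional[dict] = template_payload.get(template_key)

    def get_name(self) -> str:
        return self._name

    def get_template_definition(self) -> Optional[dict]:
        return self._template_def


class IndexTemplateInfo(ComponentTemplateInfo):
    def __init__(self, template_payload: dict):
        # Same structure as a component template, different key
        super().__init__(template_payload, "index_template")


# Illustrative entries, shaped like items of the "component_templates"
# and "index_templates" response lists
component_payload = {
    "name": "base-settings",
    "component_template": {"template": {"settings": {"number_of_shards": 1}}},
}
index_payload = {
    "name": "logs",
    "index_template": {"index_patterns": ["logs-*"], "composed_of": ["base-settings"]},
}

component = ComponentTemplateInfo(component_payload)
index = IndexTemplateInfo(index_payload)
print(component.get_name(), index.get_template_definition()["composed_of"])
```

Parsing an index-template entry with the base class and its default key would leave the definition as `None`, which is why the subclass passes its own key.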