
[Console] Metadata Migration #756

Merged
24 commits
b5559fe
Add outline of metadata implementation
mikaylathompson Jun 20, 2024
5b12d5b
Add logic, implement migrate
mikaylathompson Jun 20, 2024
ce00934
Merge branch 'main' into console-lib-metdata-migration
mikaylathompson Jun 20, 2024
40bb291
Add more metadata stuff
mikaylathompson Jun 21, 2024
77c52e3
Support local snapshots from CLI
mikaylathompson Jun 21, 2024
8b59c29
Put the imports back where they were
mikaylathompson Jun 21, 2024
81b7d16
Merge branch 'support-file-system-snapshots' into console-lib-metdata…
mikaylathompson Jun 21, 2024
519b845
Merge support-file-system-snapshots into branch
mikaylathompson Jun 21, 2024
cc82c49
Make changes for filesystem snapshots
mikaylathompson Jun 21, 2024
6a6e7a8
Make changes for filesystem snapshots
mikaylathompson Jun 21, 2024
5ef5302
fix up services.yaml
mikaylathompson Jun 22, 2024
b376543
Add tests
mikaylathompson Jun 24, 2024
a75c5f6
remove target-host parameter
mikaylathompson Jun 24, 2024
2f2ca1b
Update cdk
mikaylathompson Jun 24, 2024
293ab14
Add a few more snapshot tests
mikaylathompson Jun 24, 2024
6364600
Add fs snapshot to service.yaml for docker
mikaylathompson Jun 24, 2024
862e15c
Move auth check to snapshot create instead of init
mikaylathompson Jun 24, 2024
13dcf5a
Merge branch 'console-lib-snapshot-for-filesystem' into console-lib-m…
mikaylathompson Jun 24, 2024
d56d1b7
Get metadata working for fs snapshot
mikaylathompson Jun 24, 2024
c790c52
Add detached metadata migration, simplify logic
mikaylathompson Jun 24, 2024
f1558b6
Merge branch 'main' into console-lib-metdata-migration
mikaylathompson Jun 24, 2024
1945c29
Get metadata in aws working
mikaylathompson Jun 25, 2024
b759838
pre-review cleanup
mikaylathompson Jun 25, 2024
53095aa
Add tests
mikaylathompson Jun 25, 2024
2 changes: 1 addition & 1 deletion FetchMigration/Dockerfile
@@ -5,7 +5,7 @@ COPY python/Pipfile python/Pipfile.lock ./
RUN apt -y update
RUN apt -y install python3 python3-pip
RUN pip3 install pipenv
RUN pipenv install --system --deploy --ignore-pipfile
Collaborator (Author) commented:
This is included because I was getting this error while deploying to AWS. It seems odd that I would run into this and no one else would, so it's possible there's another explanation.
cc: @peternied

#10 [ 6/10] RUN pipenv install --system --deploy --ignore-pipfile
#10 0.471 Usage: pipenv install [OPTIONS] [PACKAGES]...
#10 0.471
#10 0.471 ERROR:: --system is intended to be used for Pipfile installation, not installation of specific packages. Aborting.
#10 0.471 See also: --deploy flag.
#10 ERROR: process "/bin/sh -c pipenv install --system --deploy --ignore-pipfile" did not complete successfully: exit code: 2
------
 > [ 6/10] RUN pipenv install --system --deploy --ignore-pipfile:
0.471 Usage: pipenv install [OPTIONS] [PACKAGES]...
0.471
0.471 ERROR:: --system is intended to be used for Pipfile installation, not installation of specific packages. Aborting.
0.471 See also: --deploy flag.
------
Dockerfile:8
--------------------
   6 |     RUN apt -y install python3 python3-pip
   7 |     RUN pip3 install pipenv
   8 | >>> RUN pipenv install --system --deploy --ignore-pipfile
   9 |
  10 |
--------------------
ERROR: failed to solve: process "/bin/sh -c pipenv install --system --deploy --ignore-pipfile" did not complete successfully: exit code: 2

Member replied:
Did you have this commit when you got that error? #753

RUN pipenv install --deploy --ignore-pipfile
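For context: pipenv's --system flag installs the Pipfile's dependencies into the system interpreter rather than a virtualenv, and some 2024 pipenv releases rejected it even for plain Pipfile installs, which would explain the error surfacing only on a fresh image build. A minimal sketch of the install steps as they read with that change applied:

# Sketch of the Dockerfile install steps after dropping --system (per #753).
RUN apt -y update
RUN apt -y install python3 python3-pip
RUN pip3 install pipenv
# Without --system, pipenv resolves Pipfile.lock into its own virtualenv.
RUN pipenv install --deploy --ignore-pipfile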


ENV FM_CODE_PATH /code
@@ -58,6 +58,9 @@ public static class Args {
@Parameter(names = {"--target-password"}, description = "Optional. The target password; if not provided, will assume no auth on target", required = false)
public String targetPass = null;

@Parameter(names = {"--target-insecure"}, description = "Allow untrusted SSL certificates for target", required = false)
public boolean targetInsecure = false;

@Parameter(names = {"--index-allowlist"}, description = ("Optional. List of index names to migrate"
+ " (e.g. 'logs_2024_01, logs_2024_02'). Default: all indices"), required = false)
public List<String> indexAllowlist = List.of();
@@ -103,11 +106,14 @@ public static void main(String[] args) throws Exception {
final String targetHost = arguments.targetHost;
final String targetUser = arguments.targetUser;
final String targetPass = arguments.targetPass;
final boolean targetInsecure = arguments.targetInsecure;
final List<String> indexTemplateAllowlist = arguments.indexTemplateAllowlist;
final List<String> componentTemplateAllowlist = arguments.componentTemplateAllowlist;
final int awarenessDimensionality = arguments.minNumberOfReplicas + 1;

final ConnectionDetails targetConnection = new ConnectionDetails(targetHost, targetUser, targetPass);
final ConnectionDetails targetConnection = new ConnectionDetails(targetHost, targetUser, targetPass, targetInsecure);





@@ -1,20 +1,22 @@
from console_api.apps.orchestrator.serializers import OpenSearchIngestionCreateRequestSerializer
from console_link.models.osi_utils import (InvalidAuthParameters, create_pipeline_from_json, start_pipeline,
stop_pipeline)
from console_link.models.migration import MigrationType
from rest_framework.decorators import api_view, parser_classes
from rest_framework.parsers import JSONParser
from rest_framework.response import Response
from rest_framework import status
from pathlib import Path
import boto3
import datetime
from enum import Enum
import logging

logger = logging.getLogger(__name__)

PIPELINE_TEMPLATE_PATH = f"{Path(__file__).parents[4]}/osiPipelineTemplate.yaml"

MigrationType = Enum('MigrationType', ['OSI_HISTORICAL_MIGRATION'])

Comment on lines +18 to +19 (Collaborator, Author):
@lewijacn -- neither the file this was being imported from nor the referenced enum exists anymore. I added it locally for the time being, since I don't think this data is likely to change at this point. Post-demo, we should 1/ add tests to the API so we don't accidentally break it again, and 2/ figure out what this should be calling.
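A minimal sketch of point 1/ (hypothetical; the route, payload, and expected status are assumptions, not part of this PR), enough to catch import-time breakage in the view module:

# Hypothetical smoke test for the orchestrator API; the route name is assumed
# for illustration and should come from console_api's urls.py.
from rest_framework.test import APIClient

def test_osi_create_migration_smoke():
    client = APIClient()
    response = client.post("/orchestrator/osi-create-migration", data={}, format="json")
    # An empty payload should fail serializer validation, not crash on import.
    assert response.status_code == 400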


def pretty_request(request, data):
headers = ''
@@ -5,6 +5,7 @@
import console_link.logic.metrics as logic_metrics
import console_link.logic.backfill as logic_backfill
import console_link.logic.snapshot as logic_snapshot
import console_link.logic.metadata as logic_metadata

from console_link.models.utils import ExitCode
from console_link.environment import Environment
@@ -36,10 +37,10 @@
@click.option('-v', '--verbose', count=True, help="Verbosity level. Default is warn, -v is info, -vv is debug.")
@click.pass_context
def cli(ctx, config_file, json, verbose):
ctx.obj = Context(config_file)
ctx.obj.json = json
logging.basicConfig(level=logging.WARN - (10 * verbose))
logger.info(f"Logging set to {logging.getLevelName(logger.getEffectiveLevel())}")
ctx.obj = Context(config_file)
ctx.obj.json = json


# ##################### CLUSTERS ###################
@@ -238,6 +239,26 @@
# ##################### METRICS ###################


@cli.group(name="metadata")
@click.pass_obj
def metadata_group(ctx):
"""All actions related to metadata migration"""
if ctx.env.metadata is None:
raise click.UsageError("Metadata is not set")

[Codecov: added line console_link/cli.py#L247 not covered by tests]


@metadata_group.command(name="migrate")
@click.option("--detach", is_flag=True, help="Run metadata migration in detached mode")
@click.pass_obj
def migrate_metadata_cmd(ctx, detach):
exitcode, message = logic_metadata.migrate(ctx.env.metadata, detach)
if exitcode != ExitCode.SUCCESS:
raise click.ClickException(message)

[Codecov: added line console_link/cli.py#L256 not covered by tests]
click.echo(message)
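As a usage sketch, the new group can be exercised through click's test runner (the --config-file option name and the config path are assumed from the cli signature above; that option is not shown in this hunk):

# Minimal sketch: invoking the new metadata commands via click's CliRunner.
from click.testing import CliRunner
from console_link.cli import cli

runner = CliRunner()
# Synchronous run; append --detach to return immediately and log to a file.
result = runner.invoke(cli, ["--config-file", "/etc/migration_services.yaml",
                             "metadata", "migrate"])
print(result.exit_code, result.output)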

# ##################### METRICS ###################


@cli.group(name="metrics")
@click.pass_obj
def metrics_group(ctx):
@@ -9,6 +9,7 @@
import yaml
from cerberus import Validator

from console_link.models.metadata import Metadata

logger = logging.getLogger(__name__)

@@ -26,6 +27,7 @@ def get_snapshot(config: Dict, source_cluster: Cluster, target_cluster: Cluster)
"backfill": {"type": "dict", "required": False},
"metrics_source": {"type": "dict", "required": False},
"snapshot": {"type": "dict", "required": False},
"metadata_migration": {"type": "dict", "required": False}
}


@@ -35,11 +37,14 @@ class Environment:
backfill: Optional[Backfill] = None
metrics_source: Optional[MetricsSource] = None
snapshot: Optional[Snapshot] = None
metadata: Optional[Metadata] = None

def __init__(self, config_file: str):
logger.info(f"Loading config file: {config_file}")
self.config_file = config_file
with open(self.config_file) as f:
self.config = yaml.safe_load(f)
logger.info(f"Loaded config file: {self.config}")
v = Validator(SCHEMA)
if not v.validate(self.config):
logger.error(f"Config file validation errors: {v.errors}")
@@ -79,3 +84,7 @@ def __init__(self, config_file: str):
logger.info(f"Snapshot initialized: {self.snapshot}")
else:
logger.info("No snapshot provided")
if 'metadata_migration' in self.config:
self.metadata: Metadata = Metadata(self.config["metadata_migration"],
target_cluster=self.target_cluster,
snapshot=self.snapshot)
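For reference, a hypothetical services.yaml entry that exercises this wiring and satisfies the schema in models/metadata.py (values are illustrative; from_snapshot may instead be null when a standalone snapshot service is configured):

metadata_migration:
  from_snapshot:
    snapshot_name: "my-snapshot"
    local_dir: "/tmp/snapshot-work"
    s3:
      repo_uri: "s3://my-bucket/snapshot-repo"
      aws_region: "us-east-2"
  min_replicas: 1
  index_allowlist:
    - "logs-2024-01"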
@@ -0,0 +1,22 @@
from typing import Tuple

from console_link.models.metadata import Metadata
from console_link.models.utils import ExitCode, generate_log_file_path
import logging

logger = logging.getLogger(__name__)


def migrate(metadata: Metadata, detached: bool) -> Tuple[ExitCode, str]:
logger.info("Migrating metadata")
if detached:
log_file = generate_log_file_path("metadata_migration")
logger.info(f"Running in detached mode, writing logs to {log_file}")

[Codecov: added lines console_link/logic/metadata.py#L13-L14 not covered by tests]
try:
result = metadata.migrate(detached_log=log_file if detached else None)
except Exception as e:
logger.error(f"Failed to migrate metadata: {e}")
return ExitCode.FAILURE, f"Failure when migrating metadata: {e}"

[Codecov: added lines console_link/logic/metadata.py#L17-L19 not covered by tests]
if result.success:
return ExitCode.SUCCESS, result.value
return ExitCode.FAILURE, result.value

[Codecov: added line console_link/logic/metadata.py#L22 not covered by tests]
@@ -67,7 +67,7 @@ class Cluster:
aws_secret_arn: Optional[str] = None
auth_type: Optional[AuthMethod] = None
auth_details: Optional[Dict[str, Any]] = None
allow_insecure: bool = None
allow_insecure: bool = False

def __init__(self, config: Dict) -> None:
logger.info(f"Initializing cluster with config: {config}")
@@ -128,6 +128,7 @@ def execute_benchmark_workload(self, workload: str,
elif self.auth_type == AuthMethod.SIGV4:
raise NotImplementedError(f"Auth type {self.auth_type} is not currently support for executing "
f"benchmark workloads")
# Note -- we should censor the password when logging this command
logger.info(f"Running opensearch-benchmark with '{workload}' workload")
subprocess.run(f"opensearch-benchmark execute-test --distribution-version=1.0.0 "
f"--target-host={self.endpoint} --workload={workload} --pipeline=benchmark-only --test-mode "
@@ -0,0 +1,205 @@
import os
import subprocess
from typing import Optional
from cerberus import Validator
import tempfile
import logging

from console_link.models.command_result import CommandResult
from console_link.models.schema_tools import list_schema
from console_link.models.cluster import AuthMethod, Cluster
from console_link.models.snapshot import S3Snapshot, Snapshot, FileSystemSnapshot

logger = logging.getLogger(__name__)

FROM_SNAPSHOT_SCHEMA = {
"type": "dict",
# In the future, there should be a "from_snapshot" and a "from_live_cluster" option, but for now only snapshot is
# supported, so this is required. It _can_ be null, but that requires a snapshot to be defined on its own in
# the services.yaml file.
"required": True,
"nullable": True,
"schema": {
"snapshot_name": {"type": "string", "required": True},
"local_dir": {"type": "string", "required": False},
"s3": {
'type': 'dict',
"required": False,
'schema': {
'repo_uri': {'type': 'string', 'required': True},
'aws_region': {'type': 'string', 'required': True},
}
},
"fs": {
'type': 'dict',
"required": False,
'schema': {
'repo_path': {'type': 'string', 'required': True},
}
}
},
# We _should_ have the check below, but I need to figure out how to combine it with a potentially
# nullable block (like this one)
# 'check_with': contains_one_of({'s3', 'fs'})

}

SCHEMA = {
"from_snapshot": FROM_SNAPSHOT_SCHEMA,
"min_replicas": {"type": "integer", "min": 0, "required": False},
"index_allowlist": list_schema(required=False),
"index_template_allowlist": list_schema(required=False),
"component_template_allowlist": list_schema(required=False)
}


def generate_tmp_dir(name: str) -> str:
return tempfile.mkdtemp(prefix=f"migration-{name}-")


class Metadata:
def __init__(self, config, target_cluster: Cluster, snapshot: Optional[Snapshot] = None):
logger.debug(f"Initializing Metadata with config: {config}")
v = Validator(SCHEMA)
if not v.validate(config):
logger.error(f"Invalid config: {v.errors}")
raise ValueError(v.errors)
self._config = config
self._target_cluster = target_cluster
self._snapshot = snapshot

if (not snapshot) and (config["from_snapshot"] is None):
raise ValueError("No snapshot is specified or can be assumed "
"for the metadata migration to use.")

self._min_replicas = config.get("min_replicas", 0)
self._index_allowlist = config.get("index_allowlist", None)
self._index_template_allowlist = config.get("index_template_allowlist", None)
self._component_template_allowlist = config.get("component_template_allowlist", None)
logger.debug(f"Min replicas: {self._min_replicas}")
logger.debug(f"Index allowlist: {self._index_allowlist}")
logger.debug(f"Index template allowlist: {self._index_template_allowlist}")
logger.debug(f"Component template allowlist: {self._component_template_allowlist}")

# If `from_snapshot` is fully specified, use those values to define snapshot params
if config["from_snapshot"] is not None:
logger.debug("Using fully specified snapshot config")
self._init_from_config()
else:
logger.debug("Using independently specified snapshot")
if isinstance(snapshot, S3Snapshot):
self._init_from_s3_snapshot(snapshot)
elif isinstance(snapshot, FileSystemSnapshot):
self._init_from_fs_snapshot(snapshot)

[Codecov: added lines console_link/models/metadata.py#L92-L93 not covered by tests]

if config["from_snapshot"] is not None and "local_dir" in config["from_snapshot"]:
self._local_dir = config["from_snapshot"]["local_dir"]
else:
self._local_dir = generate_tmp_dir(self._snapshot_name)

logger.debug(f"Snapshot name: {self._snapshot_name}")
if self._snapshot_location == 's3':
logger.debug(f"S3 URI: {self._s3_uri}")
logger.debug(f"AWS region: {self._aws_region}")
else:
logger.debug(f"Local dir: {self._local_dir}")

logger.info("Metadata migration configuration defined")

def _init_from_config(self) -> None:
config = self._config
self._snapshot_location = 's3' if 's3' in config["from_snapshot"] else 'fs'
self._snapshot_name = config["from_snapshot"]["snapshot_name"]

if self._snapshot_location == 'fs':
self._repo_path = config["from_snapshot"]["fs"]["repo_path"]
else:
self._s3_uri = config["from_snapshot"]["s3"]["repo_uri"]
self._aws_region = config["from_snapshot"]["s3"]["aws_region"]

def _init_from_s3_snapshot(self, snapshot: S3Snapshot) -> None:
self._snapshot_name = snapshot.snapshot_name
self._snapshot_location = "s3"
self._s3_uri = snapshot.s3_repo_uri
self._aws_region = snapshot.s3_region

def _init_from_fs_snapshot(self, snapshot: FileSystemSnapshot) -> None:
self._snapshot_name = snapshot.snapshot_name
self._snapshot_location = "fs"
self._repo_path = snapshot.repo_path

[Codecov: added lines console_link/models/metadata.py#L127-L129 not covered by tests]

def migrate(self, detached_log=None) -> CommandResult:
password_field_index = None
command = [
"/root/metadataMigration/bin/MetadataMigration",
# Initially populate only the required params
"--snapshot-name", self._snapshot_name,
"--target-host", self._target_cluster.endpoint,
"--min-replicas", str(self._min_replicas)
]
if self._snapshot_location == 's3':
command.extend([
"--s3-local-dir", self._local_dir,
"--s3-repo-uri", self._s3_uri,
"--s3-region", self._aws_region,
])
elif self._snapshot_location == 'fs':
command.extend([
"--file-system-repo-path", self._repo_path,
])

if self._target_cluster.auth_type == AuthMethod.BASIC_AUTH:
try:
command.extend([

[Codecov: added lines console_link/models/metadata.py#L152-L153 not covered by tests]
"--target-username", self._target_cluster.auth_details.get("username"),
"--target-password", self._target_cluster.auth_details.get("password")
])
password_field_index = len(command) - 1
logger.info("Using basic auth for target cluster")
except KeyError as e:
raise ValueError(f"Missing required auth details for target cluster: {e}")

[Codecov: added lines console_link/models/metadata.py#L157-L160 not covered by tests]

if self._target_cluster.allow_insecure:
command.append("--target-insecure")

if self._index_allowlist:
command.extend(["--index-allowlist", ",".join(self._index_allowlist)])

if self._index_template_allowlist:
command.extend(["--index-template-allowlist", ",".join(self._index_template_allowlist)])

if self._component_template_allowlist:
command.extend(["--component-template-allowlist", ",".join(self._component_template_allowlist)])

if password_field_index:
display_command = command[:password_field_index] + ["********"] + command[password_field_index + 1:]

[Codecov: added line console_link/models/metadata.py#L175 not covered by tests]
else:
display_command = command
logger.info(f"Migrating metadata with command: {' '.join(display_command)}")

if detached_log:
return self._run_as_detached_process(command, detached_log)
return self._run_as_synchronous_process(command)

def _run_as_synchronous_process(self, command) -> CommandResult:
try:
# Pass None to stdout and stderr to not capture output and show in terminal
subprocess.run(command, stdout=None, stderr=None, text=True, check=True)
logger.info(f"Metadata migration for snapshot {self._snapshot_name} completed")
return CommandResult(success=True, value="Metadata migration completed")
except subprocess.CalledProcessError as e:
logger.error(f"Failed to migrate metadata: {str(e)}")
return CommandResult(success=False, value=f"Failed to migrate metadata: {str(e)}")

[Codecov: added lines console_link/models/metadata.py#L190-L192 not covered by tests]

def _run_as_detached_process(self, command, log_file) -> CommandResult:
try:
with open(log_file, "w") as f:
# Start the process in detached mode
process = subprocess.Popen(command, stdout=f, stderr=subprocess.STDOUT, preexec_fn=os.setpgrp)
logger.info(f"Metadata migration process started with PID {process.pid}")
logger.info(f"Metadata migration logs available at {log_file}")
return CommandResult(success=True, value=f"Metadata migration started with PID {process.pid}\n"
f"Logs are being written to {log_file}")
except OSError as e:
logger.error(f"Failed to start metadata migration process: {str(e)}")
return CommandResult(success=False, value=f"Failed to migrate metadata: {str(e)}")

[Codecov: added lines console_link/models/metadata.py#L203-L205 not covered by tests]
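Putting migrate() together: for a filesystem snapshot against a basic-auth target, the assembled command would look roughly like the following (values are illustrative; the password is shown censored, as in the display_command path above):

/root/metadataMigration/bin/MetadataMigration \
  --snapshot-name my-snapshot \
  --target-host https://target-cluster:9200 \
  --min-replicas 0 \
  --file-system-repo-path /snapshots/repo \
  --target-username admin \
  --target-password ******** \
  --target-insecure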