[Console] Backfill -- Add interface and YAML specification #728

Merged
@@ -1,17 +1,34 @@
# Console_link Library
- [Services.yaml spec](#servicesyaml-spec)
- [Cluster](#cluster)
- [Metrics Source](#metrics-source)
- [Backfill](#backfill)
- [Reindex From Snapshot](#reindex-from-snapshot)
- [OpenSearch Ingestion](#opensearch-ingestion)
- [Usage](#usage)
- [Library](#library)
- [CLI](#cli)
- [Global Options](#global-options)
- [Objects](#objects)
- [Commands \& options](#commands--options)
- [Development](#development)
- [Unit Tests](#unit-tests)
- [Coverage](#coverage)



The console link library is designed to provide a unified interface for the many possible backend services involved in a migration. The interface can be used by multiple frontends--a CLI app and a web API, for instance.

![Console_link Library Diagram](console_library_diagram.svg)


The user defines their migration services in a `migration_services.yaml` file, by default found at `/etc/migration_services.yaml`.

Currently, the supported services are:
* `source_cluster`: Source cluster details
* `target_cluster`: Target cluster details
* `metrics_source`: Metrics source details
* `backfill`: Backfill migration details

- `source_cluster`: Source cluster details
- `target_cluster`: Target cluster details
- `metrics_source`: Metrics source details
- `backfill`: Backfill migration details

For example:

@@ -45,14 +62,17 @@ backfill:
- "migration_deployment=1.0.6"
```

### Services.yaml spec

#### Cluster
## Services.yaml spec

### Cluster

Source and target clusters have the following options:

- `endpoint`: required, the endpoint to reach the cluster.

Exactly one of the following blocks must be present:

- `no_auth`: may be empty, no authorization to use.
- `basic_auth`:
- `username`
@@ -61,58 +81,119 @@ Exactly one of the following blocks must be present:

Having a `source_cluster` and `target_cluster` is required.

#### Metrics Source
### Metrics Source

Currently, the two supported metrics source types are `prometheus` and `cloudwatch`.
Exactly one of the following blocks must be present:

- `prometheus`:
    - `endpoint`: required

- `cloudwatch`: may be empty if region is not specified
    - `aws_region`: optional. If not provided, the usual rules are followed for determining the AWS region (`AWS_DEFAULT_REGION`, `~/.aws/config`, etc.).
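
The "exactly one of the following blocks must be present" rule applies to both the cluster auth blocks and the metrics source blocks. A minimal sketch of how such a rule could be checked is shown here; `require_exactly_one` is a hypothetical helper introduced for illustration and is not part of console_link, which may validate its config differently.

```python
def require_exactly_one(config: dict, allowed: tuple) -> str:
    """Return the single key from `allowed` that is present in `config`.

    Hypothetical helper for illustration; not part of console_link.
    """
    # Check key presence rather than truthiness: blocks such as `no_auth`
    # or `cloudwatch` may legitimately be empty (None after YAML parsing).
    present = [key for key in allowed if key in config]
    if len(present) != 1:
        raise ValueError(
            f"Exactly one of {list(allowed)} must be present, found {present}"
        )
    return present[0]


# An empty `cloudwatch:` block is still a valid selection.
metrics_source = {"cloudwatch": None}
print(require_exactly_one(metrics_source, ("prometheus", "cloudwatch")))
```

Passing the allowed names as a tuple keeps the error message order stable, which makes the message easier to test.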

### Backfill

Backfill can be performed via several mechanisms. The two primary ones supported by the console library are Reindex From Snapshot (RFS) and
OpenSearch Ingestion Pipeline (OSI).

#### Reindex From Snapshot

#### Backfill
Depending on the purpose/deployment strategy, RFS can be used in Docker or on AWS in an Elastic Container Service (ECS) deployment.
Most of the parameters for these two are the same, with some additional ones specific to the deployment.

Currently, the only supported backfill migration type is `opensearch_ingestion`.
- `reindex_from_snapshot`
    - `snapshot_repo`: optional, path to the snapshot repo. If not provided, ???
    - `snapshot_name`: optional, name of the snapshot to use as the source. If not provided, ???
    - `scale`: optional int, number of instances to enable when `backfill start` is run. In the future, this will be modifiable during the run with `backfill scale X`. Default is 1.

There is also a block that specifies the deployment type. Exactly one of the following blocks must be present:

- `docker`:
    - `socket`: optional, path to mounted docker socket, defaults to `/var/run/docker.sock`

- `ecs`:
    - `service_name`: required, name of the ECS service for RFS
    - `aws_region`: optional. If not provided, the usual rules are followed for determining the AWS region (`AWS_DEFAULT_REGION`, `~/.aws/config`, etc.).

Both of the following are valid RFS backfill specifications:

```yaml
backfill:
    reindex_from_snapshot:
        docker:
```

```yaml
backfill:
    reindex_from_snapshot:
        snapshot_repo: "abc"
        snapshot_name: "def"
        scale: 3
        ecs:
            service_name: migration-aws-integ-reindex-from-snapshot
            aws_region: us-east-1
```
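
To make the defaults and the "exactly one deployment block" rule concrete, here is a rough sketch of how the two specifications above might be interpreted once the YAML has been loaded into plain dictionaries. `RfsSpec` and `parse_rfs` are hypothetical names used only for illustration; the actual console_link models may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_DOCKER_SOCKET = "/var/run/docker.sock"  # default stated in the spec above


@dataclass
class RfsSpec:
    """Hypothetical container for a parsed `reindex_from_snapshot` block."""
    deployment: str  # "docker" or "ecs"
    scale: int = 1
    snapshot_repo: Optional[str] = None
    snapshot_name: Optional[str] = None
    docker_socket: Optional[str] = None
    ecs_service_name: Optional[str] = None
    aws_region: Optional[str] = None


def parse_rfs(backfill: dict) -> RfsSpec:
    rfs = backfill.get("reindex_from_snapshot") or {}
    deployments = [name for name in ("docker", "ecs") if name in rfs]
    if len(deployments) != 1:
        raise ValueError("Exactly one of `docker` or `ecs` must be present")
    deployment = deployments[0]
    # A bare `docker:` key parses to None, so treat it as an empty block.
    block = rfs[deployment] or {}
    spec = RfsSpec(
        deployment=deployment,
        scale=int(rfs.get("scale", 1)),
        snapshot_repo=rfs.get("snapshot_repo"),
        snapshot_name=rfs.get("snapshot_name"),
    )
    if deployment == "docker":
        spec.docker_socket = block.get("socket", DEFAULT_DOCKER_SOCKET)
    else:
        if "service_name" not in block:
            raise ValueError("`ecs.service_name` is required")
        spec.ecs_service_name = block["service_name"]
        spec.aws_region = block.get("aws_region")  # None: fall back to the usual AWS rules
    return spec


# Dictionaries equivalent to the two YAML examples above.
docker_spec = parse_rfs({"reindex_from_snapshot": {"docker": None}})
ecs_spec = parse_rfs({"reindex_from_snapshot": {
    "snapshot_repo": "abc", "snapshot_name": "def", "scale": 3,
    "ecs": {"service_name": "migration-aws-integ-reindex-from-snapshot",
            "aws_region": "us-east-1"}}})
print(docker_spec)
print(ecs_spec)
```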

#### OpenSearch Ingestion
- `pipeline_role_arn`: required, IAM pipeline role containing permissions to read from source and read/write to target, more details [here](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/pipeline-security-overview.html#pipeline-security-sink)
- `vpc_subnet_ids`: required, VPC subnets to place the OSI pipeline in
- `security_group_ids`: required, security groups to apply to OSI pipeline for accessing source and target clusters
- `aws_region`: required, AWS region to look for pipeline role and secrets for cluster
- `pipeline_name`: optional, name of OSI pipeline
- `index_regex_selection`: optional, list of index inclusion regex strings for selecting indices to migrate
- `log_group_name`: optional, name of existing CloudWatch log group to use for OSI logs
- `tags`: optional, list of tags to apply to OSI pipeline

# Usage

- `opensearch_ingestion`
    - `pipeline_role_arn`: required, IAM pipeline role containing permissions to read from source and read/write to target, more details [here](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/pipeline-security-overview.html#pipeline-security-sink)
    - `vpc_subnet_ids`: required, VPC subnets to place the OSI pipeline in
    - `security_group_ids`: required, security groups to apply to OSI pipeline for accessing source and target clusters
    - `aws_region`: required, AWS region to look for pipeline role and secrets for cluster
    - `pipeline_name`: optional, name of OSI pipeline
    - `index_regex_selection`: optional, list of index inclusion regex strings for selecting indices to migrate
    - `log_group_name`: optional, name of existing CloudWatch log group to use for OSI logs
    - `tags`: optional, list of tags to apply to OSI pipeline

## Usage

### Library

The library can be imported and used within another application.
Use `pip install .` from the top-level `console_link` directory to install it locally, then import it with, e.g., `from console_link.models.metrics_source import MetricsSource`.

#### CLI
### CLI

The CLI comes installed on the migration console. If you'd like to install it elsewhere, `pip install .` from the top-level `console_link` directory will install it and set up a `console` executable to access it.

Autocomplete can be enabled by adding `eval "$(_CONSOLE_COMPLETE=bash_source console)"` to your `.bashrc` file, or `eval "$(_CONSOLE_COMPLETE=zsh_source console)"` to your `.zshrc` and re-sourcing your shell.

The structure of CLI commands is:
`console [--global-options] OBJECT COMMAND [--options]`

##### Global Options
#### Global Options

The available global options are:

- `--config-file FILE` to specify the path to a config file (default is `/etc/migration_services.yaml`)
- `--json` to get output in JSON designed for machine consumption instead of printing to the console

##### Objects
#### Objects

Currently, the two objects available are `cluster` and `metrics`.

##### Commands & options
#### Commands & options

Each object has its own commands available, and each command has its own options. To see the available commands and options, use:
```

```sh
console OBJECT --help
```

## Development

To install the library for development purposes, create a virtual env and install the library. It will automatically install its dependencies as well.

```shell
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
pip install -e . # or the path to the console_link directory
```

### Unit Tests

Unit tests can be run from the `console_link/` directory by first installing dependencies, then running pytest:
@@ -3,7 +3,8 @@
 import click
 import console_link.logic.clusters as logic_clusters
 import console_link.logic.metrics as logic_metrics
-from console_link.logic.instantiation import Environment
+import console_link.logic.backfill as logic_backfill
+from console_link.environment import Environment
 from console_link.models.metrics_source import Component, MetricStatistic
 import logging

@@ -15,7 +16,10 @@
 class Context(object):
     def __init__(self, config_file) -> None:
         self.config_file = config_file
-        self.env = Environment(config_file)
+        try:
+            self.env = Environment(config_file)
+        except Exception as e:

Review comment (Member): Not a blocker, but should we be more specific here? For example, it isn't necessarily a user input issue if they don't have access to the file.

+            raise click.ClickException(str(e))
         self.json = False


@@ -32,16 +36,17 @@
     logging.basicConfig(level=logging.WARN - (10 * verbose))
     logger.info(f"Logging set to {logging.getLevelName(logger.getEffectiveLevel())}")
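
The `logging.WARN - (10 * verbose)` expression relies on the standard logging levels being spaced 10 apart, so each additional verbosity step lowers the threshold by one level. A small standalone illustration follows; `level_for` is a hypothetical helper, not something defined in cli.py.

```python
import logging


def level_for(verbose: int) -> str:
    """Return the level name produced by WARN - 10 * verbose."""
    return logging.getLevelName(logging.WARN - (10 * verbose))


# verbose=0 keeps the default threshold; each extra step is one level lower.
for count in range(3):
    print(count, level_for(count))
```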


 # ##################### CLUSTERS ###################


 @cli.group(name="clusters")
 @click.pass_obj
 def cluster_group(ctx):
     if ctx.env.source_cluster is None:
-        raise ValueError("Source cluster is not set")
+        raise click.UsageError("Source cluster is not set")
Codecov / codecov/patch: added line #L47 of TrafficCapture/dockerSolution/src/main/docker/migrationConsole/lib/console_link/console_link/cli.py was not covered by tests.
     if ctx.env.target_cluster is None:
-        raise ValueError("Target cluster is not set")
+        raise click.UsageError("Target cluster is not set")
Codecov / codecov/patch: added line #L49 of cli.py was not covered by tests.


 @cluster_group.command(name="cat-indices")
@@ -74,7 +79,7 @@
 @click.pass_obj
 def replayer_group(ctx):
     if ctx.env.replayer is None:
-        raise ValueError("Replayer is not set")
+        raise click.UsageError("Replayer is not set")
Codecov / codecov/patch: added line #L82 of cli.py was not covered by tests.


 @replayer_group.command(name="start")
@@ -93,42 +98,75 @@
 def backfill_group(ctx):
     """All actions related to historical/backfill data migrations"""
     if ctx.env.backfill is None:
-        raise ValueError("Backfill migration is not set")
+        raise click.UsageError("Backfill migration is not set")
Codecov / codecov/patch: added line #L101 of cli.py was not covered by tests.


+@backfill_group.command(name="describe")
+@click.pass_obj
+def describe_backfill_cmd(ctx):
+    click.echo(logic_backfill.describe(ctx.env.backfill, as_json=ctx.json))
+
-@backfill_group.command(name="create-migration")
+
+@backfill_group.command(name="create")
 @click.option('--pipeline-template-file', default='/root/osiPipelineTemplate.yaml', help='Path to config file')
 @click.option("--print-config-only", is_flag=True, show_default=True, default=False,
               help="Flag to only print populated pipeline config when executed")
 @click.pass_obj
-def create_migration_backfill_cmd(ctx, pipeline_template_file, print_config_only):
-    """Create migration action"""
-    ctx.env.backfill.create(pipeline_template_path=pipeline_template_file, print_config_only=print_config_only)
+def create_backfill_cmd(ctx, pipeline_template_file, print_config_only):
+    exitcode, message = logic_backfill.create(ctx.env.backfill,
Codecov / codecov/patch: added line #L116 of cli.py was not covered by tests.
+                                              pipeline_template_path=pipeline_template_file,
+                                              print_config_only=print_config_only)
+    if exitcode != 0:
+        raise click.ClickException(message)
+    click.echo(message)
Codecov / codecov/patch: added lines #L119 - L121 of cli.py were not covered by tests.


-@backfill_group.command(name="start-migration")
+@backfill_group.command(name="start")
 @click.option('--pipeline-name', default=None, help='Optionally specify a pipeline name')
 @click.pass_obj
-def start_migration_backfill_cmd(ctx, pipeline_name):
-    """Start migration action"""
-    ctx.env.backfill.start(pipeline_name=pipeline_name)
+def start_backfill_cmd(ctx, pipeline_name):
+    exitcode, message = logic_backfill.start(ctx.env.backfill, pipeline_name=pipeline_name)
+    if exitcode != 0:
+        raise click.ClickException(message)
+    click.echo(message)
Codecov / codecov/patch: added lines #L128 - L131 of cli.py were not covered by tests.


-@backfill_group.command(name="stop-migration")
+@backfill_group.command(name="stop")
 @click.option('--pipeline-name', default=None, help='Optionally specify a pipeline name')
 @click.pass_obj
-def stop_migration_backfill_cmd(ctx, pipeline_name):
-    """Stop migration action"""
-    ctx.env.backfill.stop(pipeline_name=pipeline_name)
+def stop_backfill_cmd(ctx, pipeline_name):
+    exitcode, message = logic_backfill.stop(ctx.env.backfill, pipeline_name=pipeline_name)
+    if exitcode != 0:
+        raise click.ClickException(message)

Review comment (Member): Not a blocker, just curious: why raise ClickException rather than anything else?

Reply (Collaborator Author): It exits with a non-zero exit code and also displays the error "nicely", without a full traceback. I think subclassing that exception to create custom ones could be a good way to go in the future to get more specific errors while keeping the convenience features.

Docs are here: https://click.palletsprojects.com/en/8.1.x/exceptions/#exception-handling
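
As a rough illustration of the subclassing idea mentioned in the reply, the sketch below defines a custom exception on top of `click.ClickException` and exercises it with click's test runner. `BackfillError`, `stop_cmd`, and the `--fail` flag are hypothetical and exist only for this example; the tuple assignment stands in for a call such as `logic_backfill.stop(...)`.

```python
import click
from click.testing import CliRunner


class BackfillError(click.ClickException):
    """Hypothetical subclass: same tidy "Error: ..." output, distinct exit code."""
    exit_code = 2


@click.command(name="stop")
@click.option("--fail", is_flag=True, help="Simulate a failing backfill call")
def stop_cmd(fail):
    # Stand-in for `exitcode, message = logic_backfill.stop(...)`.
    exitcode, message = (1, "pipeline not found") if fail else (0, "stopped")
    if exitcode != 0:
        raise BackfillError(message)
    click.echo(message)


# CliRunner invokes the command in-process and captures output and exit code.
failed = CliRunner().invoke(stop_cmd, ["--fail"])
succeeded = CliRunner().invoke(stop_cmd, [])
print(failed.exit_code, failed.output.strip())
print(succeeded.exit_code, succeeded.output.strip())
```

Because click handles `ClickException` subclasses itself, the command still exits without a traceback, while callers and scripts can tell failure categories apart by exit code.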

+    click.echo(message)
Codecov / codecov/patch: added lines #L138 - L141 of cli.py were not covered by tests.


+@backfill_group.command(name="scale")
+@click.argument("units", type=int, required=True)
+@click.pass_obj
+def scale_backfill_cmd(ctx, units: int):
+    exitcode, message = logic_backfill.scale(ctx.env.backfill, units)
+    if exitcode != 0:
+        raise click.ClickException(message)
+    click.echo(message)
Codecov / codecov/patch: added lines #L148 - L151 of cli.py were not covered by tests.


+@backfill_group.command(name="status")
+@click.pass_obj
+def status_backfill_cmd(ctx):
+    exitcode, message = logic_backfill.status(ctx.env.backfill)
+    if exitcode != 0:
+        raise click.ClickException(message)
+    click.echo(message)
Codecov / codecov/patch: added lines #L157 - L160 of cli.py were not covered by tests.

 # ##################### METRICS ###################


 @cli.group(name="metrics")
 @click.pass_obj
 def metrics_group(ctx):
     if ctx.env.metrics_source is None:
-        raise ValueError("Metrics source is not set")
+        raise click.UsageError("Metrics source is not set")
Codecov / codecov/patch: added line #L169 of cli.py was not covered by tests.


 @metrics_group.command(name="list")