Add RFS to CDK #575
Merged

Commits (17):
- 55c74d1 lewijacn: Add docker compose setup for RFS with preloading data, also minor NPE…
- ee71868 lewijacn: Update preload data mechanism and add min replicas attribute for now
- 9ee2113 lewijacn: Add structure for RFS service in CDK
- 40d2564 lewijacn: Minor changes after testing RFS E2E with CDK
- f618704 lewijacn: Improvements to RFS gradle build, documentation, RFS completion refresh
- e42989c lewijacn: Merge remote-tracking branch 'origin/main' into setup-rfs-compose
- 6f0814b lewijacn: Merge remote-tracking branch 'origin/setup-rfs-compose' into rfs-cdk
- d0558b7 lewijacn: Add doc for starting RFS with CDK and update CDK test cases
- f66a4b8 lewijacn: Update index file
- 0d5c8a7 lewijacn: Minor additional to build Docker image for RFS as well in buildDocker…
- 231e4f0 lewijacn: Minor updates per PR feedback
- 967966a lewijacn: Remove archive data file in favor of generating datasets with a multi…
- 5657830 lewijacn: Merge remote-tracking branch 'origin/main' into setup-rfs-compose
- c8ab364 lewijacn: Merge remote-tracking branch 'origin/setup-rfs-compose' into rfs-cdk
- 378c3fa lewijacn: Update naming for preload data RFS and slightly alter pattern. Remove…
- 4f4e5a3 lewijacn: Add comment for RFS ES Source Dockerfile
- 2819114 lewijacn: Merge remote-tracking branch 'origin/main' into rfs-cdk
New file (+31 lines):

```sh
#!/bin/bash

generate_data_requests() {
  endpoint="http://localhost:9200"
  # If auth or SSL is used, the correlating OSB options should be provided in this array
  options=()
  client_options=$(IFS=,; echo "${options[*]}")
  set -o xtrace

  echo "Running opensearch-benchmark workloads against ${endpoint}"
  echo "Running opensearch-benchmark w/ 'geonames' workload..." &&
  opensearch-benchmark execute-test --distribution-version=1.0.0 --target-host=$endpoint --workload=geonames --pipeline=benchmark-only --test-mode --kill-running-processes --workload-params "target_throughput:0.5,bulk_size:10,bulk_indexing_clients:1,search_clients:1" --client-options=$client_options &&
  echo "Running opensearch-benchmark w/ 'http_logs' workload..." &&
  opensearch-benchmark execute-test --distribution-version=1.0.0 --target-host=$endpoint --workload=http_logs --pipeline=benchmark-only --test-mode --kill-running-processes --workload-params "target_throughput:0.5,bulk_size:10,bulk_indexing_clients:1,search_clients:1" --client-options=$client_options &&
  echo "Running opensearch-benchmark w/ 'nested' workload..." &&
  opensearch-benchmark execute-test --distribution-version=1.0.0 --target-host=$endpoint --workload=nested --pipeline=benchmark-only --test-mode --kill-running-processes --workload-params "target_throughput:0.5,bulk_size:10,bulk_indexing_clients:1,search_clients:1" --client-options=$client_options &&
  echo "Running opensearch-benchmark w/ 'nyc_taxis' workload..." &&
  opensearch-benchmark execute-test --distribution-version=1.0.0 --target-host=$endpoint --workload=nyc_taxis --pipeline=benchmark-only --test-mode --kill-running-processes --workload-params "target_throughput:0.5,bulk_size:10,bulk_indexing_clients:1,search_clients:1" --client-options=$client_options
}

dataset=$1

if [[ "$dataset" == "default_osb_test_workloads" ]]; then
  /usr/local/bin/docker-entrypoint.sh eswrapper & echo $! > /tmp/esWrapperProcess.pid && sleep 10 && generate_data_requests
elif [[ "$dataset" == "skip_dataset" ]]; then
  echo "Skipping data generation step"
  mkdir -p /usr/share/elasticsearch/data/nodes
else
  echo "Unknown dataset provided: ${dataset}"
  exit 1
fi
```
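The script joins its `options` array into the single comma-separated string that opensearch-benchmark's `--client-options` flag expects. A minimal sketch of that join, using hypothetical option values (the script above starts with an empty array and expects auth/SSL options to be added when needed):

```shell
#!/bin/bash
# Hypothetical OSB client options; the real script's array is empty by default
options=("use_ssl:true" "verify_certs:false")

# Join array elements with commas, exactly as the script builds client_options.
# Setting IFS inside the $( ) subshell keeps the change scoped to the join.
client_options=$(IFS=,; echo "${options[*]}")
echo "$client_options"   # prints use_ssl:true,verify_certs:false
```

Because the assignment happens in a subshell, the caller's `IFS` is left untouched.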
...oyment/cdk/opensearch-service-migration/lib/service-stacks/reindex-from-snapshot-stack.ts (65 additions, 0 deletions):
```typescript
import {StackPropsExt} from "../stack-composer";
import {IVpc, SecurityGroup} from "aws-cdk-lib/aws-ec2";
import {CpuArchitecture} from "aws-cdk-lib/aws-ecs";
import {Construct} from "constructs";
import {join} from "path";
import {MigrationServiceCore} from "./migration-service-core";
import {Effect, PolicyStatement} from "aws-cdk-lib/aws-iam";
import {StringParameter} from "aws-cdk-lib/aws-ssm";
import {
    createOpenSearchIAMAccessPolicy,
    createOpenSearchServerlessIAMAccessPolicy
} from "../common-utilities";


export interface ReindexFromSnapshotProps extends StackPropsExt {
    readonly vpc: IVpc,
    readonly sourceEndpoint: string,
    readonly fargateCpuArch: CpuArchitecture,
    readonly extraArgs?: string,
    readonly otelCollectorEnabled?: boolean
}

export class ReindexFromSnapshotStack extends MigrationServiceCore {

    constructor(scope: Construct, id: string, props: ReindexFromSnapshotProps) {
        super(scope, id, props)
        let securityGroups = [
            SecurityGroup.fromSecurityGroupId(this, "serviceConnectSG", StringParameter.valueForStringParameter(this, `/migration/${props.stage}/${props.defaultDeployId}/serviceConnectSecurityGroupId`)),
            SecurityGroup.fromSecurityGroupId(this, "defaultDomainAccessSG", StringParameter.valueForStringParameter(this, `/migration/${props.stage}/${props.defaultDeployId}/osAccessSecurityGroupId`)),
        ]

        const artifactS3Arn = StringParameter.valueForStringParameter(this, `/migration/${props.stage}/${props.defaultDeployId}/artifactS3Arn`)
        const artifactS3AnyObjectPath = `${artifactS3Arn}/*`
        const artifactS3PublishPolicy = new PolicyStatement({
            effect: Effect.ALLOW,
            resources: [artifactS3Arn, artifactS3AnyObjectPath],
            actions: [
                "s3:*"
            ]
        })

        const openSearchPolicy = createOpenSearchIAMAccessPolicy(this.region, this.account)
        const openSearchServerlessPolicy = createOpenSearchServerlessIAMAccessPolicy(this.region, this.account)
        let servicePolicies = [artifactS3PublishPolicy, openSearchPolicy, openSearchServerlessPolicy]

        const osClusterEndpoint = StringParameter.valueForStringParameter(this, `/migration/${props.stage}/${props.defaultDeployId}/osClusterEndpoint`)
        const s3Uri = `s3://migration-artifacts-${this.account}-${props.stage}-${this.region}/rfs-snapshot-repo`
        let rfsCommand = `/rfs-app/runJavaWithClasspath.sh com.rfs.ReindexFromSnapshot --s3-local-dir /tmp/s3_files --s3-repo-uri ${s3Uri} --s3-region ${props.env?.region} --snapshot-name rfs-snapshot --min-replicas 1 --enable-persistent-run --lucene-dir '/lucene' --source-host ${props.sourceEndpoint} --target-host ${osClusterEndpoint} --source-version es_7_10 --target-version os_2_11`
        rfsCommand = props.extraArgs ? rfsCommand.concat(` ${props.extraArgs}`) : rfsCommand

        this.createService({
            serviceName: 'reindex-from-snapshot',
            taskInstanceCount: 0,
            dockerDirectoryPath: join(__dirname, "../../../../../", "RFS/docker"),
            dockerImageCommand: ['/bin/sh', '-c', rfsCommand],
            securityGroups: securityGroups,
            taskRolePolicies: servicePolicies,
            cpuArchitecture: props.fargateCpuArch,
            taskCpuUnits: 1024,
            taskMemoryLimitMiB: 4096,
            ...props
        });
    }

}
```
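The stack resolves SSM parameters into an `rfsCommand` string and hands it to the container as a `/bin/sh -c` command. A sketch of the same string assembly in shell, with hypothetical account, stage, region, and endpoint values (only the flags from the CDK code above are used):

```shell
#!/bin/bash
# Hypothetical deployment values standing in for the SSM/stack lookups
account="123456789012"
stage="dev"
region="us-east-1"
source_endpoint="http://source-cluster:9200"
os_cluster_endpoint="https://target-cluster.example.com"

# Mirrors the s3Uri construction in the stack
s3_uri="s3://migration-artifacts-${account}-${stage}-${region}/rfs-snapshot-repo"

# Mirrors rfsCommand; backslash-newlines are for readability only
rfs_command="/rfs-app/runJavaWithClasspath.sh com.rfs.ReindexFromSnapshot \
 --s3-local-dir /tmp/s3_files --s3-repo-uri ${s3_uri} --s3-region ${region} \
 --snapshot-name rfs-snapshot --min-replicas 1 --enable-persistent-run \
 --lucene-dir '/lucene' --source-host ${source_endpoint} \
 --target-host ${os_cluster_endpoint} --source-version es_7_10 \
 --target-version os_2_11"

# extraArgs, when provided, are appended verbatim (empty here)
extra_args=""
if [ -n "$extra_args" ]; then
  rfs_command="${rfs_command} ${extra_args}"
fi
echo "$rfs_command"
```

Note that the service is created with `taskInstanceCount: 0`, so deploying the stack provisions the task definition but does not start RFS until the service is scaled up.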
Review comment:
I checked, but didn't find a way to JUST LOAD the data. The opensearch-benchmark runs load the data and then run tests on it; that latter part takes up most of the time and is work we're not interested in for RFS. We may well want to load the data, do an RFS migration, and then run the rest of the test so that we could test CDC on a historically migrated cluster.
It might be a good idea to open an issue or submit a PR with a new option for OSB.
Reply:
We should probably raise an issue, as I don't have a good understanding of how intertwined these two things are from looking at the actual workloads: https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/geonames