Update ClinVar update docs for v4 #1251

Merged 1 commit on Nov 2, 2023
12 changes: 7 additions & 5 deletions deploy/docs/UpdateClinvarVariants.md
@@ -8,7 +8,7 @@

2. Run data pipeline

- ClinVar pipelines use VEP and thus must be run on clusters with VEP installed and configured. To match gnomAD v2.1, GRCh37 ClinVar variants should be annotated with VEP 85. To match gnomAD v3.1, GRCh38 ClinVar variants should be annotated with VEP 101.
+ ClinVar pipelines use VEP and thus must be run on clusters with VEP installed and configured. To match gnomAD v2.1 (GRCh37), ClinVar variants should be annotated with VEP 85. To match gnomAD v4.0 (GRCh38), ClinVar variants should be annotated with VEP 105.

1. Start Dataproc cluster

@@ -23,8 +23,8 @@
GRCh38

```
- ./deployctl dataproc-cluster start vep101 \
- --init=gs://gcp-public-data--gnomad/resources/vep/v101/init-vep101.sh \
+ ./deployctl dataproc-cluster start vep105 \
+ --init=gs://gcp-public-data--gnomad/resources/vep/v105/vep105-init.sh \
--metadata=VEP_CONFIG_PATH=/vep_data/vep-gcloud.json,VEP_CONFIG_URI=file:///vep_data/vep-gcloud.json,VEP_REPLICATE=us \
--master-machine-type n1-highmem-8 \
--worker-machine-type n1-highmem-8 \
@@ -44,9 +44,11 @@
GRCh38

```
- ./deployctl data-pipeline run --cluster vep101 clinvar_grch38
+ ./deployctl data-pipeline run --cluster vep105 clinvar_grch38
```

+ \*Note: The `vep105-init.sh` script does not reliably start Docker. As a workaround, after starting the Dataproc cluster, SSH into each node and run `sudo systemctl start docker`.
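
The Docker workaround in the note above can be scripted rather than done by hand on each node. The following is a minimal sketch, not part of the PR: `start_docker_on_nodes` and the `GCLOUD` override are hypothetical names, and the usage comment assumes the cluster is named `vep105` and that its VMs carry Dataproc's automatic `goog-dataproc-cluster-name` label.

```shell
# Sketch: start Docker on every node of a Dataproc cluster.
# Reads "NAME ZONE" pairs on stdin, one node per line.
# GCLOUD is an assumed override hook (defaults to gcloud) that allows a dry run.
start_docker_on_nodes() {
  local gcloud="${GCLOUD:-gcloud}"
  local name zone
  while read -r name zone; do
    # Skip blank lines.
    [ -n "$name" ] || continue
    # </dev/null keeps ssh from consuming the rest of the node list.
    "$gcloud" compute ssh "$name" --zone "$zone" \
      --command "sudo systemctl start docker" </dev/null
  done
}

# Usage (assumes the cluster is named vep105):
# gcloud compute instances list \
#   --filter="labels.goog-dataproc-cluster-name=vep105" \
#   --format="value(name,zone.basename())" \
#   | start_docker_on_nodes
```

Setting `GCLOUD=echo` prints the commands instead of running them, which is a cheap way to check the node list before touching the cluster.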

3. Load variants to Elasticsearch

GRCh37
@@ -58,7 +60,7 @@
GRCh38

```
- ./deployctl elasticsearch load-datasets --dataproc-cluster vep101 clinvar_grch38_variants
+ ./deployctl elasticsearch load-datasets --dataproc-cluster vep105 clinvar_grch38_variants
```

4. [Update Elasticsearch index aliases](./ElasticsearchIndexAliases.md)
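
Steps 2 and 3 each name the cluster separately, so a VEP version bump has to be repeated in every command. The following is a minimal sketch of a wrapper that takes the cluster name once for the GRCh38 pipeline run and the Elasticsearch load. It assumes the cluster is already running with Docker started on its nodes. `update_clinvar_grch38` and the `DEPLOYCTL` override are hypothetical names; the two `deployctl` invocations are the ones shown in the steps above.

```shell
# Sketch: run the GRCh38 ClinVar pipeline, then load the variants into
# Elasticsearch, using a single cluster name for both steps.
# DEPLOYCTL is an assumed override hook (defaults to ./deployctl).
update_clinvar_grch38() {
  local cluster="${1:-vep105}"
  local deployctl="${DEPLOYCTL:-./deployctl}"

  # Step 2: run the data pipeline on the VEP-enabled cluster.
  # Stop here if it fails so stale data is not loaded.
  "$deployctl" data-pipeline run --cluster "$cluster" clinvar_grch38 || return 1

  # Step 3: load the resulting variants into Elasticsearch.
  "$deployctl" elasticsearch load-datasets --dataproc-cluster "$cluster" clinvar_grch38_variants
}

# Usage:
# update_clinvar_grch38 vep105
```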