Azure-compatibility #610
base: main
Looking like awesome progress!
    '-c',
    '--cloud',
    required=False,
    default=DEFAULT_CLOUD_ENVIRONMENT,
Should we omit the default instead of providing one? That would let the analysis-runner decide the default.
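A minimal sketch of what omitting the default could look like, assuming the CLI only forwards the cloud when the user passes `--cloud` and the server falls back to its own default. The names `build_submit_payload` and `resolve_cloud`, the payload keys, and the `'gcp'` fallback are all hypothetical and not taken from this PR:

```python
from typing import Optional


def build_submit_payload(script: str, cloud: Optional[str] = None) -> dict:
    """Client side (hypothetical): build the request body for a submission."""
    payload = {'script': script}
    if cloud is not None:
        # Forward only an explicit choice; otherwise leave the key out entirely
        payload['cloud'] = cloud
    return payload


def resolve_cloud(payload: dict, server_default: str = 'gcp') -> str:
    """Server side (hypothetical): use the server's default when 'cloud' is absent."""
    return payload.get('cloud', server_default)
```

With this shape the default lives in one place on the server, so changing it would not require users to upgrade the CLI.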
server/util.py
Outdated
    if environment == 'gcp':
        # do this to check access-members cache
        gcp_project = dataset_config.get('gcp', {}).get('projectId')

        if not gcp_project:
            raise web.HTTPBadRequest(
                reason=f'The analysis-runner does not support checking group members for the {environment} environment'
            )
    elif environment == 'azure':
        azure_resource_group = dataset_config.get('azure', {}).get('resourceGroup')

        if not azure_resource_group:
            raise web.HTTPBadRequest(
                reason=f'The analysis-runner does not support checking group members for the {environment} environment'
            )
You can remove this. Group member checks are not in secrets, so no gcp_project ID is needed anymore (I think):
    if environment == 'gcp':
        # do this to check access-members cache
        gcp_project = dataset_config.get('gcp', {}).get('projectId')
        if not gcp_project:
            raise web.HTTPBadRequest(
                reason=f'The analysis-runner does not support checking group members for the {environment} environment'
            )
    elif environment == 'azure':
        azure_resource_group = dataset_config.get('azure', {}).get('resourceGroup')
        if not azure_resource_group:
            raise web.HTTPBadRequest(
                reason=f'The analysis-runner does not support checking group members for the {environment} environment'
            )
server/util.py
Outdated
    if environment == 'gcp':
        output_dir = f'gs://cpg-{dataset}-{cpg_namespace(access_level)}/{output_prefix}'
    elif environment == 'azure':
        # TODO: need a way for analysis runner to know where to save metadata
It follows the same sort of convention, right? The storage account is cpg{datasetWithoutTabs}:
azure://{storage-account}/{main,test}/{output_prefix}
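A rough sketch of how that convention could replace the TODO, assuming the namespace (`main` or `test`) has already been resolved, e.g. via `cpg_namespace(access_level)`. The function name `get_output_dir` is hypothetical, and reading `datasetWithoutTabs` as "lowercase the dataset name and drop anything that is not a letter or digit" is an assumption, based on Azure storage account names allowing only lowercase letters and numbers:

```python
import re


def get_output_dir(environment: str, dataset: str, namespace: str, output_prefix: str) -> str:
    """Build the output directory for a dataset in the given cloud (hypothetical helper)."""
    if environment == 'gcp':
        # Same pattern as the existing GCP branch in server/util.py
        return f'gs://cpg-{dataset}-{namespace}/{output_prefix}'
    if environment == 'azure':
        # ASSUMPTION: strip characters that are not valid in a storage account name
        storage_account = 'cpg' + re.sub(r'[^a-z0-9]', '', dataset.lower())
        return f'azure://{storage_account}/{namespace}/{output_prefix}'
    raise ValueError(f'Unsupported environment: {environment}')
```

Here the namespace moves from the bucket name on GCP to the container segment on Azure, which matches the `{main,test}` part of the path above.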
test/hail_batch_job.py
Outdated
    import hailtop.batch as hb


    @click.command()
There are a couple of test workflows in examples/batch; can you use them, or move this one there?
Changes to the analysis runner to enable Azure compatibility.