The continuous integration (CI) and continuous delivery (CD) pipeline of
GATK-SV is implemented using GitHub Actions.
The CI/CD pipeline is defined by multiple workflows, each of which is a `.yml`
file stored under the `.github/workflows` directory. The workflows are
triggered automatically when a pull request (PR) is issued or merged.
They automate testing, building, and deploying the pipeline, and they
currently cover the following areas:
- Lint Python scripts (`pytest.yaml`): uses `flake8` to check that the Python
  scripts follow the PEP-8 style guide;
- Test, build, and publish Docker images using `build_docker.py`
  (`sv_pipeline_docker.yml`).
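For reference, the lint check can be reproduced locally with a plain `flake8` invocation along the following lines. This is a sketch only; the exact arguments and configuration used by the workflow may differ.

```shell
# Sketch: run flake8 over the repository's Python scripts, as the lint
# workflow does. The workflow's exact flake8 configuration may differ.
pip install flake8
flake8 .
```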
The GATK-SV Docker images are built and published using the `build_docker.py`
script, which is documented in this README and can be executed locally.
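As a rough illustration, a local run might look like the following. The script path, the example target name, and the tag flag are assumptions here (only `--targets` is described below); consult the linked README for the authoritative interface.

```shell
# Hypothetical local invocation of build_docker.py; the path and all flags
# other than --targets are assumptions, see the build_docker.py README.
python scripts/docker/build_docker.py \
    --targets sv-pipeline \
    --image-tag my-test-tag
```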
The Docker Images workflow (DIW) automates the testing, building, and
publication of GATK-SV Docker images using `build_docker.py`: the images are
built when a PR is issued against the repository, and they are published to
Google Cloud Platform (GCP) Container Registry (GCR) (at `us.gcr.io`) when the
PR is merged.
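Once published, an image can be pulled from GCR in the usual way; the project and image names below are placeholders, and the tag follows the format described later in this section.

```shell
# Placeholder project and image names; the tag follows the DATE-HEAD_SHA_8
# scheme described below.
docker pull us.gcr.io/my-gcp-project/sv-pipeline:20211201-86fe06fd
```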
The DIW consists of three jobs:
- `Determine Build Args`. This job determines the arguments to be used by the
  `build_docker.py` script, specifically:
  - Given the size and the number of GATK-SV Docker images, DIW builds and
    publishes only the Docker images affected by the changes introduced in a
    PR. Accordingly, the files changed between the `HEAD` and `BASE` commits
    of the PR'd branch are first determined using `git diff` (for details,
    please refer to the in-line comments for the step `Determine Commit SHAs`
    in DIW), and then the affected images are determined. These images are
    used as the values of the `--targets` argument of the `build_docker.py`
    script.
  - A step to compose a tag for the Docker images in the `DATE-HEAD_SHA_8`
    format, where `DATE` is `YYYYMMDD` extracted from the timestamp of the
    last commit on the PR'd branch, and `HEAD_SHA_8` is the first eight
    characters of its commit SHA; for instance, `20211201-86fe06fd`. A shell
    sketch of this logic is given after this list.
- `Test Images Build`. This job is triggered when a commit is pushed to the
  PR'd branch; it builds the Docker images determined by the
  `Determine Build Args` job. This job fails if building the Docker images is
  unsuccessful. The Docker images built by this job are not published to GCR
  and are discarded when the job succeeds.
- `Publish`. This job is triggered when a commit is pushed to the `main`
  branch (e.g., when a PR is merged). Similar to the `Test Images Build` job,
  this job builds Docker images and fails if the build process is
  unsuccessful. In addition, however, this job pushes the built images to GCR.
  To authorize access to GCR, this job assumes a GCP service account (SA) with
  read and write access to the GCR registry. The secrets related to the SA are
  defined as encrypted environment secrets.
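The following is a minimal shell sketch of the `Determine Build Args` logic described above; `BASE_SHA` and `HEAD_SHA` are placeholders for the PR's `BASE` and `HEAD` commits, and the authoritative implementation is the workflow itself together with `build_docker.py`.

```shell
# Sketch only: determine affected files and compose the image tag.

# Files changed on the PR'd branch; used to decide which images to rebuild.
CHANGED_FILES=$(git diff --name-only "${BASE_SHA}" "${HEAD_SHA}")

# Compose the DATE-HEAD_SHA_8 tag, e.g., 20211201-86fe06fd.
DATE=$(git log -1 --date=format:'%Y%m%d' --format='%cd' "${HEAD_SHA}")
HEAD_SHA_8=$(git rev-parse --short=8 "${HEAD_SHA}")
IMAGE_TAG="${DATE}-${HEAD_SHA_8}"

# The affected images (derived from CHANGED_FILES) are then passed to
# build_docker.py via its --targets argument.
```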
_This section describes configuring the `Deploy` environment to be used by the
`Publish` job and is intended for repository admins._
An SA is used to authorize DIW to access GCR. (A future extension may adopt
OpenID Connect [OIDC]-based authentication and authorization.) In order to
assume the SA, the `Publish` job needs the SA secrets (e.g., `private key` and
`client email`) and the `project name`. This information is defined in a
GitHub environment and is exposed to the `Publish` job as encrypted
environment secrets. The encrypted secrets are decrypted in the environment
context and are not exposed to the user code (provided best practices are
followed).

GitHub's environment secrets are a subset of repository-wide secrets, which
are in turn a subset of organization-level secrets. We encrypt the SA
credentials as GitHub environment secrets because they allow pausing the
execution of any job that accesses the environment until it is approved by
assigned individuals.
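The following is a minimal sketch of how a job could consume these secrets once it has access to the `Deploy` environment. The actual `Publish` job may use dedicated GitHub Actions rather than raw shell, and the assumption here is that the secrets are exposed to the step as environment variables named after the secrets.

```shell
# Sketch only: decode the base64-encoded SA key and authenticate to GCR.
# GCP_GCR_SA_KEY and GCP_PROJECT_ID are assumed to be environment variables
# populated from the Deploy environment's secrets.
echo "${GCP_GCR_SA_KEY}" | openssl base64 -d -out service-account.json
gcloud auth activate-service-account \
    --key-file=service-account.json \
    --project="${GCP_PROJECT_ID}"

# Configure Docker to use gcloud credentials for gcr.io registries.
gcloud auth configure-docker
```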
In order to set up the `Deploy` environment, you may take the following steps:
- Create an SA on GCP IAM. For simplicity, you may assign the service account
  the `Editor` role. However, in order to follow the principle of least
  privilege, you may instead assign the `Storage Object Admin`,
  `Storage Legacy Bucket Writer`, and `Storage Object Viewer` roles as the
  minimum required permissions (ref).
- Get the service account's keys by going to the Service Accounts page,
  selecting the above-created service account, and opening the `KEYS` tab.
  Click on the `ADD KEY` button and choose `Create new key`. In the pop-up
  window, select the `JSON` key type and click on the `CREATE` button. This
  downloads a JSON file containing the secrets required to assume the service
  account.
- Base64 encode the service account's secrets in the JSON format as follows:

  ```shell
  openssl base64 -in service-account.json -out service-account.txt
  ```
- Create an environment following these steps and name it `Deploy`.
. -
Create the following two encrypted secrets in the
Deploy
environment using these steps:name
:GCP_PROJECT_ID
;value
: the ID of the GCP project under which you will use the GCR registry.name
:GCP_GCR_SA_KEY
;value
: the above-created base64 encoding of the SA's secrets. After you set this encrypted secret, we recommend that you delete both the.json
and.txt
files containing SA's secrets.
- [Optional] Under `Environment protection rules` on the `Deploy`
  environment's configuration page, you may check the `Required reviewers`
  checkbox and assign maintainers who can approve runs of the jobs that
  require access to the `Deploy` environment.
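For admins who prefer the command line, the following is an illustrative sketch of scripting the service-account and secret steps above with the `gcloud` and `gh` CLIs. All names are placeholders, and the `Deploy` environment itself must already exist (see the steps above).

```shell
# Illustrative only: placeholder names throughout; adjust to your project.
PROJECT_ID="my-gcp-project"
SA_NAME="gatk-sv-gcr-pusher"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Create the service account and grant the minimum roles listed above.
gcloud iam service-accounts create "${SA_NAME}" --project="${PROJECT_ID}"
for ROLE in roles/storage.objectAdmin \
            roles/storage.legacyBucketWriter \
            roles/storage.objectViewer; do
  gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
      --member="serviceAccount:${SA_EMAIL}" --role="${ROLE}"
done

# Create a JSON key and base64 encode it.
gcloud iam service-accounts keys create service-account.json \
    --iam-account="${SA_EMAIL}"
openssl base64 -in service-account.json -out service-account.txt

# Store the secrets in the Deploy environment (run from a clone of the repo;
# the Deploy environment must already exist).
gh secret set GCP_PROJECT_ID --env Deploy --body "${PROJECT_ID}"
gh secret set GCP_GCR_SA_KEY --env Deploy < service-account.txt

# Clean up the local copies of the SA's secrets.
rm service-account.json service-account.txt
```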
Once the `Deploy` environment is set up and the `Required reviewers` option
under the `Environment protection rules` section is checked, every push to the
`main` branch (e.g., merging a PR) will pause the DIW execution at the
`Publish` job with the following message:

> Waiting for review: Deploy needs approval to start deploying changes.

The assigned `Required reviewers` will then see the following additional link,
which they can click to approve or reject running the job:

> Review pending deployments