Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-40924: [MATLAB][Packaging] Add script for uploading Release Candidate (RC) MLTBX packages for the MATLAB bindings to the Apache Arrow GitHub Releases area #40956

Closed
wants to merge 29 commits into from

Conversation

sgilmore10
Copy link
Member

@sgilmore10 sgilmore10 commented Apr 2, 2024

Rationale for this change

This is a follow up to #38660 and this mailing list discussion.

We would like to upload the MLTBX packages for the MATLAB bindings to the GitHub Releases area for the Apache Arrow GitHub project.

Using GitHub Releases will enable tight integration with the MathWorks File Exchange, which should make it easy for users to discover and install the MATLAB interface using MATLAB.

To make this possible, we need to update the release scripts under /dev/release to upload the MLTBX packages built by crossbow to the GitHub Releases area.

What changes are included in this PR?

  1. Modified dev/release/05-binary-upload.sh to sign and upload release candidate MLTBX files to apache/arrow's GitHub Releases Area using the gh release create command.
  2. Marked the created GitHub Release as a prerelease to differentiate between release candidates and official releases.
  3. Compared the format of the created GitHub Releases to the format of apache/arrow-flight-sql-postgresql's GitHub Releases to validate our changes made sense.

Are these changes tested?

I was not able to run dev/release/05-binary-upload.sh directly, but I did test the gh create release, gpg, and shasum commands worked as expected by creating a "dummy" release for this repository: https://github.com/sgilmore10/TestRepo/releases/tag/1.0.0rc1

Are there any user-facing changes?

No.

Future Directions

  1. Modify dev/release/post-02-upload.sh to create a non-prerelease GitHub Release for the MATLAB bindings.
  2. Create an Apache Arrow entry on the MathWorks File Exchange to facilitate a one-click install workflow for the MATLAB bindings.
  3. Refactor the changes made to 05-binary-upload.sh into a separate function that can be reused by post-02-upload.sh to reduce code duplication.

Open Questions

  1. Should we include the list of issues resolved in the release notes for prereleases?
  2. @kou, @assignUser, or @raulcd, do you have suggestions on how we can validate the changes to dev/release/05-binary-upload.sh?

Copy link

github-actions bot commented Apr 2, 2024

⚠️ GitHub issue #40924 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting review Awaiting review label Apr 2, 2024
2. Update git tag
3. Fix target branch name
@kou
Copy link
Member

kou commented Apr 3, 2024

Ah, sorry for not sharing how to implement this in https://lists.apache.org/thread/hm7cbrkggomhx7nxdzr5xkrmkrztmmwo ...

How about using git tag -a apache-arrow-X.Y.Z-rcN, git push apache apache-arrow-X.Y.Z-rcN and GitHub Actions for the RC tag?

We can detect whether this push is for tag or commit by if: github.ref_type == 'tag'.
See also: https://docs.github.com/en/actions/learn-github-actions/contexts#github-context

If this push is for tag, we can download the built .mltbx file and upload it to GitHub Releases by GitHub Actions. We can't sign by GitHub Actions. So we need to sign on local later. We can do it by a release script.

This is a workflow what arrow-adbc already uses.
See also:

  1. git tag: https://github.com/apache/arrow-adbc/blob/a35fae0f11c29b6207c1361986d7900441562d64/dev/release/01-prepare.sh#L76
  2. Upload artifacts to GitHub Releases by GitHub Actions: https://github.com/apache/arrow-adbc/blob/a35fae0f11c29b6207c1361986d7900441562d64/.github/workflows/packaging.yml#L945-L999
  3. Download artifacts from GitHub Releases, sign them and upload to GitHub Releases: https://github.com/apache/arrow-adbc/blob/main/dev/release/02-sign.sh
  4. Change the associated tag of GitHub Releases from apache-arrow-adbc-X.Y.Z-rcN to apache-arrow-adbc-X.Y.Z after a vote is passed: https://github.com/apache/arrow-adbc/blob/main/dev/release/post-02-binary.sh
    • I want to use different approach in apache/arrow for this. I don't want to change existing GitHub Releases for this. I want to create a new GitHub Releases for apache-arrow-X.Y.Z instead of reusing a GitHub Releases for apache-arrow-X.Y.Z-rcN.
    • arrow-flight-sql-postgresql uses the different approach: https://github.com/apache/arrow-flight-sql-postgresql/blob/main/.github/workflows/release.yaml
      • git tag -a apache-arrow-X.Y.Z and git push apache apache-arrow-X.Y.Z trigger this GitHub Actions workflow.

@assignUser
Copy link
Member

We can't sign by GitHub Actions

Not yet but the option exists! We can sign via gha workflow (in apache/arrow), INFRA has to review and approve the workflow doing the signing and will add the key to be used as a repo secret. See https://www.apache.org/legal/release-policy.html#release-signing

This change was added sometime last year but I never had the time to follow up but with the current discussions about making releases easier and this new addition it might be the right time to tackle it! (for both the monorepo and subprojects)

@kou
Copy link
Member

kou commented Apr 4, 2024

#37350 is a related issue for automated release signing.

@sgilmore10
Copy link
Member Author

Hi @kou and @assignUser,

Thanks for sending us a lot of helpful resources on how to best cut a GitHub Release. @kevingurney and I will review them and re-work our implementation to use GitHub Actions.

To start, we may use the manual signing approach @kou laid out just to get something working, but we would like to transition to an automated signing approach as @assignUser suggested. We'd be happy to investigate how we can get approval from the ASF to automate signing our release artifacts.

Thanks again!
Sarah

@assignUser
Copy link
Member

I just wanted to say I really appreciate your work on this and your style of communication!

Just to clarify the comment about the automated release signing was more a tangent and we would likely implement it for all things in the monorepo at once. So you can continue with the manual way for now with no problems!

@sgilmore10
Copy link
Member Author

sgilmore10 commented Apr 4, 2024

Thanks for clarifying @assignUser! We're excited to be working with you guys on this!

@sgilmore10
Copy link
Member Author

sgilmore10 commented Apr 4, 2024

As we were looking into this, @kevingurney and I realized we need to be able to connect the crossbow job ID that created the MLTBX file with the GitHub Actions workflow. We aren't aware of any existing mechanism that can provide inputs to a GitHub Actions Workflow that is trigged automatically. As an aside, it is possible to supply inputs to a GitHub Actions Workflow that is triggered manually (see this article).

One workaround to enable the automatic approach would be to embed the crossbow job ID in the git tag's message and then extract the ID in the workflow. Using this ID and the archery crossbow download-artifacts command, we could download the appropriate MLTBX file.

We recognize that embedding the ID in the git tag message is not ideal, so please let us know if you have any other suggestions.

@sgilmore10 sgilmore10 marked this pull request as draft April 4, 2024 20:27
@kou
Copy link
Member

kou commented Apr 5, 2024

We can use dev/release/04-binary-download.sh ${MAJOR}.${MINOR}.${PATCH} ${RC} --task-filter matlab to download the .mltbx file. :-)

release_notes="Release Candidate: ${version} RC${rc}"
title="Apache Arrow ${version} RC${rc}"
repository="https://github.com/apache/arrow"
gh release create \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this means we requires the gh CLI installed locally, right?
Can we add a note on the requirements for the release tasks that we require gh CLI installed?
https://github.com/apache/arrow/blob/main/docs/source/developers/release.rst

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @raulcd,

This change is out of date - sorry about that. Based on @kou's feedback, we going to use a GitHub Actions Workflow - triggered by pushing the tag apache-arrow-X.Y.Z-rc{N} - to create release instead of using gh. I just haven't updated 05-binary-upload.sh yet, but will do so soon!

Copy link
Member Author

@sgilmore10 sgilmore10 Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I just realized you are right. We will need to add a note about requiring the GItHub CLI to be installed locally. While we are no longer using gh to create the release in 04-binary-upload.sh, we will need to use gh once the GitHub release is created to upload the signed MLTBX files.

I'll make sure to add a note about this requirement to the release documentation.

Sorry about the confusion!

release_tag="apache-arrow-${version_with_rc}"
tag_message="Release candidate: ${version_with_rc}"
git tag -a ${release_tag} -m ${tag_message}
git push apache ${release_tag}
Copy link
Member

@kou kou Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you do this in dev/release/01-prepare.sh something like the following?

diff --git a/dev/release/01-prepare.sh b/dev/release/01-prepare.sh
index 01fa2f3d80..f5ec0a8e26 100755
--- a/dev/release/01-prepare.sh
+++ b/dev/release/01-prepare.sh
@@ -33,7 +33,7 @@ next_version=$2
 next_version_snapshot="${next_version}-SNAPSHOT"
 rc_number=$3
 
-release_tag="apache-arrow-${version}"
+rc_tag="apache-arrow-${version}-rc${rc_number}"
 release_branch="release-${version}"
 release_candidate_branch="release-${version}-rc${rc_number}"
 
@@ -45,9 +45,9 @@ release_candidate_branch="release-${version}-rc${rc_number}"
 : ${PREPARE_TAG:=${PREPARE_DEFAULT}}
 
 if [ ${PREPARE_TAG} -gt 0 ]; then
-  if [ $(git tag -l "${release_tag}") ]; then
-    echo "Delete existing git tag $release_tag"
-    git tag -d "${release_tag}"
+  if [ $(git tag -l "${rc_tag}") ]; then
+    echo "Delete existing git tag $rc_tag"
+    git tag -d "${rc_tag}"
   fi
 fi
 
@@ -91,7 +91,7 @@ if [ ${PREPARE_LINUX_PACKAGES} -gt 0 ]; then
 fi
 
 if [ ${PREPARE_VERSION_PRE_TAG} -gt 0 ]; then
-  echo "Prepare release ${version} on tag ${release_tag} then reset to version ${next_version_snapshot}"
+  echo "Prepare release ${version} on tag ${rc_tag} then reset to version ${next_version_snapshot}"
 
   update_versions "${version}" "${next_version}" "release"
   git commit -m "MINOR: [Release] Update versions for ${version}"
@@ -100,5 +100,5 @@ fi
 ############################## Tag the Release ##############################
 
 if [ ${PREPARE_TAG} -gt 0 ]; then
-  git tag -a "${release_tag}" -m "[Release] Apache Arrow Release ${version}"
+  git tag -a "${rc_tag}" -m "[Release] Apache Arrow Release ${version}"
 fi

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kou,

Thanks for sharing this diff! We just have one clarification question:

Based on the diff you shared, it looks like you want us to replace the release_tag with the rc_tag. Did you mean for us to actually add the rc_tag in addition to the release_tag? We asking because we assume we still want to create a release_tag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean for us to actually add the rc_tag in addition to the release_tag?

No. I mean for us to replace the release_tag with rc_tag.

We asking because we assume we still want to create a release_tag.

I want to change the current workflow:

  1. Create a apache-arrow-X.Y.Z tag for X.Y.Z RC0
  2. (Found a problem for X.Y.Z RC0)
  3. Remove the apache-arrow-X.Y.Z tag for X.Y.Z RC0
  4. Create a apache-arrow-X.Y.Z tag for X.Y.Z RC1
  5. ...

to the following:

  1. Create a apache-arrow-X.Y.Z-rc0 tag for X.Y.Z RC0
  2. (Found a problem for X.Y.Z RC0)
  3. Create a apache-arrow-X.Y.Z-rc1 tag for X.Y.Z RC1
  4. Vote
  5. Passed
  6. Create a apache-arrow-X.Y.Z tag from apache-arrow-X.Y.Z-rc1 like apache/arrow-adbc and apache/arrow-flight-sql-postgresql do

Hmm. It's better that we handle the workflow change related task in a separated issue...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification! I've created a issue #41102 to track this work. Once this PR is merged, I'll take ownership of that issue and make the required changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kou,

On second thought, we think it actually makes more sense to tackle #41102 first since the new GitHub Actions Workflow (package.yml) requires the release candidate tag (apache-arrow.X.Y.Z-rcN) to be pushed. We'll put this PR on pause for a bit until that #41102 is closed.

Best,
Sarah

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense!

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Apr 5, 2024
@github-actions github-actions bot added Component: Documentation awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Apr 16, 2024
@assignUser
Copy link
Member

Superseeded by #41131

@assignUser assignUser closed this Apr 25, 2024
sgilmore10 added a commit that referenced this pull request Jun 13, 2024
…didates (e.g. apache-arrow-{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUM}) (#41131)

### Rationale for this change

As per @ kou's [suggestion](#40956 (comment)) in #40956, we should create unique git tags (e.g. `apache-arrow-{MAJOR}.{MINOR}.{VERSION}-rc{RC_NUM}`) instead re-using the same git tag (`apache-arrow-{MAJOR}.{MINOR}.{VERSION}`) for each release candidate. The official release candidate tag (`apache-arrow-{MAJOR}.{MINOR}.{VERSION}`) should be created **only** after a release candidate is voted on and accepted. This "official" release tag should point to the same object in the git database as the accepted release candidate tag. 

The new release workflow could look like the following:

> 1. Create a apache-arrow-X.Y.Z-rc0 tag for X.Y.Z RC0 
> 2. (Found a problem for X.Y.Z RC0)
> 3. Create a apache-arrow-X.Y.Z-rc1 tag for X.Y.Z RC1
> 4. Vote
> 5. Passed
> 6. Create a apache-arrow-X.Y.Z tag from apache-arrow-X.Y.Z-rc1 ike apache/arrow-adbc and apache/arrow-flight-sql-postgresql do

See @ kou's [comment](#40956 (comment)) for more details.

### What changes are included in this PR?

1. Updated `dev/release/01-prepare.sh` to create release-candidate-specific git tags (e.g. `apache-arrow-{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUM}`).
2. Updated scripts in `dev/release` to use the new git tag name. 
3. Added GitHub Workflow file  `publish_release_candidate.yml`. This workflow is triggered when a release candidate git tag is pushed and creates a Prerelease GitHub Release.
4. Added logic to `dev/release/02-post-binary.sh` to create and push the release git tag (i.e. `apache-arrow-{MAJOR}.{MINOR}.{PATCH}`).
5. Added GitHub Workflow `publish_release.yml`. This workflow is triggered when the release tag is pushed and creates a GitHub Release for the approved release (i.e. the voted upon release).
6. Added `dev/release/post-16-delete-release-candidates.sh` to delete the release candidate git tags and their associated GitHub Releases. 
7. Updated `docs/developers/release.rst` with the new steps. 

### Are these changes tested?

1. We were not able to verify the changes made to the scripts in `dev/release`. Any suggestions on how we can verify these scripts would be much appreciated :)
2. We did test the new GitHub Workflows (`publish_release_candidate.yml` and `publish_release.yml`) work as intended by pushing git tags to [`mathworks/arrow`](https://github.com/mathworks/arrow).

### Are there any user-facing changes?

No.

### Open Questions

1. We noticed that [apache/arrow-flight-sql-postgresql](https://github.com/apache/arrow-flight-sql-postgresql/releases) does **not** delete the release candidate Prereleases from their GitHub Releases area. Should we be doing the same? Or would it be preferable to just delete the the release candidates **without** deleting the release candidate tags.
2. We're not that familiar with ruby, so we're not sure if the changes we made to `dev/release/02-source-test.rb` make sense.

### Future Directions

1.  Continue working on #40956
2. Add logic to auto-sign release artifacts in GitHub Actions Workflows.

* GitHub Issue: #41102

Lead-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Co-authored-by: Sarah Gilmore <74676073+sgilmore10@users.noreply.github.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sarah Gilmore <sgilmore@mathworks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants