Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Openshift-ci onboarding #39

Merged
merged 1 commit into from
Aug 1, 2023

Conversation

israel-hdez
Copy link

@israel-hdez israel-hdez commented Jul 26, 2023

What this PR does / why we need it:
These are the needed changes to have openshift-ci running the E2E tests successfully.

There are several groups of E2E tests that can be deduced from the .github/workflows/e2e-test.yaml file: fast, slow, explainer, transformer-mms, qpext, grpc, helm, raw and kourier. For ODH, the fast, slow and grpc groups are the ones that cover the features that are going to be supported in the initial adoption of ODH.

This commit contains the needed adaptations to the E2E tests of the fast and slow groups to successfully run them in an openshift cluster. It also adds a few scripts on test/scripts/openshift-ci to run these E2Es in the openshift-ci operator.

Some of these changes should be seen as provisional and should be rolled back:

  • test/e2e/common/utils.py: because of the networking/DNS expectations, that are currently not covered in ODH's installation. These changes should be rolled back once the following ticked is fixed: Expose predictor route for KServe InferenceServices odh-model-controller#59
  • test/e2e/predictor/*:
    • In general all changes under this path should be seen as provisional. However, since ODH won't support all ServingRuntimes, it is possible that some of the tests will stay out.
    • There are some GRPC-related tests marked as skipped. Since this work is not enabling the grpc group, a subsequent commit/PR for enabling GRPC E2Es should remove/revert those skip marks.
    • Also, there are some tests skipped with the Not testable in ODH at the moment reason. The root cause of the failure should be investigated to re-enable these tests.
  • python/kserve/kserve/models/v1beta1_inference_service.py: This is injecting an annotation that is required given the specifics of OSSM/Maistra and OpenShift-Serverless that are used in ODH. This annotation is, currently, user responsibility and this was the cleanest way to add it in the E2Es. Being platform-specific, it's been discussed that this (and some other) annotation should be injected by some controller to relief the user from this responsibility. If this happens, this change should be reverted.

Also, ideally, changes to the following files should be contributed back to upstream. Those changes are not required in upstream and should have no effect, but in openshift-ci become required because a different builder image is being used:

  • Dockerfile
  • agent.Dockerfile

Which issue(s) this PR fixes :
Related #5

Feature/Issue validation/testing:

This is about the tests themselves :-)

Checklist:

🙂 Have you added unit/e2e tests that prove your fix is effective or that this feature works?
✔️ Has code been commented, particularly in hard-to-understand areas?
➖ Have you made corresponding changes to the documentation?

Release note:

NONE

@heyselbi heyselbi mentioned this pull request Jul 26, 2023
@heyselbi heyselbi linked an issue Jul 26, 2023 that may be closed by this pull request
@israel-hdez israel-hdez force-pushed the os-ci-onboarding branch 6 times, most recently from a070dc9 to cffa750 Compare July 27, 2023 20:33
These are the needed changes to have openshift-ci running the E2E tests successfully.

There are several groups of E2E tests that can be deduced from the .github/workflows/e2e-test.yaml file: fast, slow, explainer, transformer-mms, qpext, grpc, helm, raw and kourier. For ODH, the `fast`, `slow` and `grpc` groups are the ones that cover the features that are going to be supported in the initial adoption of ODH.

This commit contains the needed adaptations to the E2E tests of the `fast` and `slow` groups to successfully run them in an openshift cluster. It also adds a few scripts on test/scripts/openshift-ci to run these E2Es in the openshift-ci operator.

Some of these changes should be seen as provisional and should be rolled back:
* test/e2e/common/utils.py: because of the networking/DNS expectations, that are currently not covered in ODH's installation.
* test/e2e/predictor/*:
  * In general all changes under this path should be seen as provisional. However, since ODH won't support all ServingRuntimes, it is possible that some of the tests will stay out.
  * There are some GRPC-related tests marked as skipped. Since this work is not enabling the `grpc` group, a subsequent commit/PR for enabling GRPC E2Es should remove/revert those skip marks.
  * Also, there are some tests skipped with the `Not testable in ODH at the moment` reason. The root cause of the failure should be investigated to re-enable these tests.
* python/kserve/kserve/models/v1beta1_inference_service.py: This is injecting an annotation that is required given the specifics of OSSM/Maistra and OpenShift-Serverless that are used in ODH. This annotation is, currently, user responsibility and this was the cleanest way to add it in the E2Es. Being platform-specific, it's been discussed that this (and some other) annotation should be injected by some controller to relief the user from this responsibility. If this happens, this change should be reverted.

Also, ideally, changes to the following files should be contributed back to upstream. Those changes are not required in upstream and should have no effect, but in openshift-ci become required because a different builder image is being used:
* Dockerfile
* agent.Dockerfile

Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
@israel-hdez israel-hdez marked this pull request as ready for review July 27, 2023 20:56
@@ -150,6 +150,8 @@ def metadata(self, metadata):
:param metadata: The metadata of this V1beta1InferenceService. # noqa: E501
:type: V1ObjectMeta
"""
if metadata is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these tests from upstream? If so, why are we modifying them downstream first?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these come from upstream. In the main comment of the PR is on the list of things that we may want to revert.

It is here because upstream does not need this. It is specific to our OS prerequisites.

@Xaenalt Xaenalt self-requested a review July 28, 2023 18:16
@openshift-ci
Copy link

openshift-ci bot commented Aug 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, Xaenalt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Xaenalt,israel-hdez]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Aug 1, 2023
@Xaenalt
Copy link
Member

Xaenalt commented Aug 1, 2023

/retest

@openshift-ci
Copy link

openshift-ci bot commented Aug 1, 2023

@israel-hdez: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/pr-image-mirror-kserve-router ecff079 link true /test pr-image-mirror-kserve-router
ci/prow/pr-image-mirror-kserve-agent ecff079 link true /test pr-image-mirror-kserve-agent
ci/prow/pr-image-mirror-kserve-controller ecff079 link true /test pr-image-mirror-kserve-controller

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@israel-hdez
Copy link
Author

@vaibhavjainwiz There seems to be a bad configuration for the image push.

Merging, since images build OK, but cannot be pushed.

@israel-hdez israel-hdez merged commit 23c823c into opendatahub-io:master Aug 1, 2023
14 of 18 checks passed
@israel-hdez israel-hdez deleted the os-ci-onboarding branch August 2, 2023 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Expose predictor route for KServe InferenceServices [P0] Update KServe e2e
3 participants