Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NO-JIRA: chore(nbcs): update tooling in controller's Makefiles - kubernetes and kustomize #432

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jstourac
Copy link
Member

@jstourac jstourac commented Oct 25, 2024

Some tooling updates and fix. I bumped into this during my work on https://issues.redhat.com/browse/RHOAIENG-14687.

This fix is part of https://issues.redhat.com/browse/RHOAIENG-15141.

Description

The update of the kubernetes version isn't mandatory and I can remove it from here if you want to. But I think it's reasonable.
The fix for the kustomize package is necessary to make our e2e tests pass (at least partially) again. This change is a followup of the #391.

How Has This Been Tested?

Try to run in components/odh-notebook-controller:

  • make test - all should pass.
  • make run-ci-e2e-tests - 15 pass, 2 fail - you need to be logged into some cluster to make this work

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

The version 1.23 is no longer supported [1] and only releases from
1.28 - 1.31 versions are supported. Based on the [2], the
OpenShift 4.13 (oldest we care about right now) is based on Kubernetes
in version 1.26, so let's allign to that version.

* [1] https://kubernetes.io/releases/
* [2] https://access.redhat.com/solutions/4870701
This is a followup of opendatahub-io#391 (1b2dd4f).

Without this, we can see the following failure during the e2e tests
execution:
```
make run-ci-e2e-tests
...
GOBIN=/home/jstourac/workspace/rhosai/odh/kubeflow/components/odh-notebook-controller/bin go install sigs.k8s.io/kustomize/v3/cmd/kustomize@v5.0.2
go: sigs.k8s.io/kustomize/v3/cmd/kustomize@v5.0.2: sigs.k8s.io/kustomize/v3/cmd/kustomize@v5.0.2: invalid version: unknown revision v3/cmd/kustomize/v5.0.2
make[1]: *** [Makefile:243: kustomize] Error 1
...
```
This caused that all e2e tests failed. After this fix, some e2e tests
pass. The other seem to fail on a problem unrelated to this one.
@jstourac jstourac self-assigned this Oct 25, 2024
Copy link

openshift-ci bot commented Oct 25, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jstourac. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek
Copy link
Member

/lgtm

What's that we need to be running in ci and we aren't? Because, both GitHub actions and ocp-ci are passing now, right? Bit the customize line is clearly wrong...

@jstourac
Copy link
Member Author

/lgtm

What's that we need to be running in ci and we aren't? Because, both GitHub actions and ocp-ci are passing now, right? Bit the customize line is clearly wrong...

Hm, but we're not running e2e tests anywhere in our CI here in GitHub, are we? I didn't investigate the failure very much, I can do so later maybe.

@jiridanek
Copy link
Member

jiridanek commented Oct 25, 2024

The e2e that need you to bring your own openshift? Right, we don't run that in gha ci. We should, eventually. Thought it runs in ocp či, but apparently really not.

@jstourac
Copy link
Member Author

ci/prow/odh-notebook-controller-e2e

Hm, @jiridanek you are right, we're truly running them: https://github.com/openshift/release/blob/master/ci-operator/config/opendatahub-io/kubeflow/opendatahub-io-kubeflow-main.yaml#L107.

And it is green in your PR #430 (this run) even though they failed, see the results here:

/go/src/github.com/opendatahub-io/kubeflow/components/odh-notebook-controller/bin/controller-gen rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases
GOBIN=/go/src/github.com/opendatahub-io/kubeflow/components/odh-notebook-controller/bin go install sigs.k8s.io/kustomize/v3/cmd/kustomize@v5.0.2
go: sigs.k8s.io/kustomize/v3/cmd/kustomize@v5.0.2: sigs.k8s.io/kustomize/v3/cmd/kustomize@v5.0.2: invalid version: unknown revision v3/cmd/kustomize/v5.0.2
make[1]: *** [kustomize] Error 1
make[1]: Leaving directory `/go/src/github.com/opendatahub-io/kubeflow/components/odh-notebook-controller'
make[1]: Entering directory `/go/src/github.com/opendatahub-io/kubeflow/components/odh-notebook-controller'
go test ./e2e/ -run ^TestE2ENotebookController -v \
	--nb-namespace=odh-notebook-controller-system \
	--deploymentMode=oauth \
	"--skip-deletion=false" 
go: downloading github.com/stretchr/testify v1.8.4
go: downloading github.com/pmezard/go-difflib v1.0.0
=== RUN   TestE2ENotebookController
=== RUN   TestE2ENotebookController/validate_controllers
=== RUN   TestE2ENotebookController/validate_controllers/Validate_Kubeflow_notebook_controller
    notebook_controller_test.go:32: 
        	Error Trace:	/go/src/github.com/opendatahub-io/kubeflow/components/odh-notebook-controller/e2e/notebook_controller_test.go:32
        	Error:      	Not equal: 
        	            	expected: true
        	            	actual  : false
        	Test:       	TestE2ENotebookController/validate_controllers/Validate_Kubeflow_notebook_controller
        	Messages:   	error getting notebook-controller configmap
=== RUN   TestE2ENotebookController/validate_controllers/Validate_ODH_notebook_controller
    notebook_controller_test.go:45: 
        	Error Trace:	/go/src/github.com/opendatahub-io/kubeflow/components/odh-notebook-controller/e2e/notebook_controller_test.go:45
        	Error:      	Received unexpected error:
        	            	timed out waiting for the condition
        	Test:       	TestE2ENotebookController/validate_controllers/Validate_ODH_notebook_controller
        	Messages:   	error in validating odh notebook controller
--- FAIL: TestE2ENotebookController (60.29s)
    --- FAIL: TestE2ENotebookController/validate_controllers (60.29s)
        --- FAIL: TestE2ENotebookController/validate_controllers/Validate_Kubeflow_notebook_controller (0.20s)
        --- FAIL: TestE2ENotebookController/validate_controllers/Validate_ODH_notebook_controller (60.08s)
FAIL
FAIL	github.com/opendatahub-io/kubeflow/components/odh-notebook-controller/e2e	60.321s
FAIL
make[1]: *** [e2e-test-oauth] Error 1

We probably don't propagate the return code in that place properly or don't fail as we should.

My run here failed even before the e2e tests were triggered, so I'll try to re-run it to see whether I'll see what I saw during my local testing:

/retest

@jstourac
Copy link
Member Author

We probably don't propagate the return code in that place properly or don't fail as we should.

I raised the #433 to address this issue. Also created https://issues.redhat.com/browse/RHOAIENG-15141 to track these E2E test failures so we can plan on it.

My run here failed even before the e2e tests were triggered, so I'll try to re-run it to see whether I'll see what I saw during my local testing:

So, retest didn't work - the failure was the very same and again - it even didn't get to the point of the tests execution --- no e2e directory in here. This is the failure, something related to the cluster preparation:

{"component":"entrypoint","error":"wrapped process failed: exit status 6","file":"sigs.k8s.io/prow/pkg/entrypoint/run.go:84","func":"sigs.k8s.io/prow/pkg/entrypoint.Options.internalRun","level":"error","msg":"Error executing test process","severity":"error","time":"2024-10-25T18:47:20Z"}
error: failed to execute wrapped command: exit status 6

@jstourac
Copy link
Member Author

Let’s try again

/retest

Copy link

openshift-ci bot commented Oct 26, 2024

@jstourac: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/odh-notebook-controller-e2e a2c94c8 link unknown /test odh-notebook-controller-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants