prevent dspa status update on a terminating dspa #729

Merged 1 commit on Oct 18, 2024


HumairAK

When a DSPA is marked for deletion, there is no state logic to handle DSPA statuses, so the status update can at times target a DSPA that has already been deleted. This change should prevent (or at least reduce) the chances of this happening. In the future, proper graceful DSPA termination handling should be added.

The following error can appear when you delete a DSPA. It does not always show up; it depends on reconcile timing and the Kubernetes API server response:

2024-10-18T17:10:32-04:00       ERROR   Encountered error when updating the DSPA status {"namespace": "dspa2", "dspa_name": "sample", "error": "Operation cannot be fulfilled on datasciencepipelinesapplications.datasciencepipelinesapplications.opendatahub.io \"sample\": StorageError: invalid object, Code: 4, Key: /kubernetes.io/datasciencepipelinesapplications.opendatahub.io/datasciencepipelinesapplications/dspa2/sample, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: a17d9b5d-127f-4e5d-8560-1fbce2e27dcd, UID in object meta: "}
github.com/opendatahub-io/data-science-pipelines-operator/controllers.(*DSPAReconciler).updateStatus
        /home/hukhan/projects/github/rhods/data-science-pipelines-operator/controllers/dspipeline_controller.go:370
github.com/opendatahub-io/data-science-pipelines-operator/controllers.(*DSPAReconciler).Reconcile
        /home/hukhan/projects/github/rhods/data-science-pipelines-operator/controllers/dspipeline_controller.go:224
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226

Note the log line:

ERROR   Encountered error when updating the DSPA status

This happens when the DSPA is deleted between these two calls:

r.refreshDspa(ctx, dspa, req, log)
...
err := r.Status().Update(ctx, dspa)

So we just check the DeletionTimestamp field and, if it is set, we skip the status update.
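
As a rough illustration of that check, here is a minimal, dependency-free sketch. It is not the operator's actual code: `objectMeta`, `markForDeletion`, `isTerminating`, and `updateStatusUnlessTerminating` are hypothetical names, and `objectMeta` only imitates the `DeletionTimestamp` field that Kubernetes object metadata carries. The `update` callback stands in for the `r.Status().Update(ctx, dspa)` call.

```go
package main

import (
	"fmt"
	"time"
)

// objectMeta is a hypothetical stand-in for the part of the Kubernetes
// object metadata that matters here. The API server sets DeletionTimestamp
// when an object is marked for deletion.
type objectMeta struct {
	Name              string
	DeletionTimestamp *time.Time
}

// markForDeletion is a hypothetical helper that imitates the API server
// setting the deletion timestamp on an object.
func markForDeletion(obj *objectMeta) {
	now := time.Now()
	obj.DeletionTimestamp = &now
}

// isTerminating reports whether the object has been marked for deletion.
func isTerminating(obj *objectMeta) bool {
	return obj.DeletionTimestamp != nil && !obj.DeletionTimestamp.IsZero()
}

// updateStatusUnlessTerminating calls update only when the object is not
// terminating. It returns whether the update was attempted, plus any error
// returned by update.
func updateStatusUnlessTerminating(obj *objectMeta, update func(*objectMeta) error) (bool, error) {
	if isTerminating(obj) {
		// Skip: the object may already be gone by the time the update lands.
		return false, nil
	}
	return true, update(obj)
}

func main() {
	calls := 0
	// update stands in for the real status update and just counts calls.
	update := func(*objectMeta) error { calls++; return nil }

	dspa := &objectMeta{Name: "sample"}
	attempted, err := updateStatusUnlessTerminating(dspa, update)
	fmt.Println("live object: attempted =", attempted, "err =", err, "calls =", calls)

	markForDeletion(dspa)
	attempted, err = updateStatusUnlessTerminating(dspa, update)
	fmt.Println("terminating object: attempted =", attempted, "err =", err, "calls =", calls)
}
```

Note that a guard like this only narrows the window: the object can still be marked for deletion after the check and before the update, which is why the description says "prevent (or reduce)".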

When a DSPA is marked for deletion, there is no state logic to handle
DSPA statuses, so the status update can at times target a DSPA that has
already been deleted. This change should prevent (or at least reduce)
the chances of this happening. In the future, proper graceful DSPA
termination handling should be added.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

openshift-ci bot commented Oct 18, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from humairak. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dsp-developers

A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-729
An OCP cluster where you are logged in as cluster admin is required.

To use this image run the following:

cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines-operator.git
cd data-science-pipelines-operator/
git fetch origin pull/729/head
git checkout -b pullrequest 54b2cee97f092279c072f9a0cf2cd773b1bc8adb
oc new-project opendatahub
make deploy IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-729"

More instructions here on how to deploy and test a Data Science Pipelines Application.

@gregsheremeta

/lgtm

possibly follow up with #692

@openshift-ci openshift-ci bot added the lgtm label Oct 18, 2024
@HumairAK HumairAK merged commit ef30372 into opendatahub-io:main Oct 18, 2024
5 of 7 checks passed