
[bug][v1.23]: cluster_ca_cert and cluster_ca_key always trigger cluster updater #530

Open · ddelange opened this issue Mar 31, 2022 · 31 comments

@ddelange (Author)

Hi again!

[screenshot]

I just tried out v1.23 and spotted cluster_ca_cert and cluster_ca_key triggering the cluster updater. I haven't provided the secrets block in my cluster config.

@eddycharly (Owner) commented Mar 31, 2022

Thanks for reporting.
Does it happen every time, or only when you upgrade from a previous version of the provider?

@ddelange (Author) commented Apr 1, 2022

I would say every time. After a couple of applies like that, I checked again this morning and hit the same scenario on a terraform apply: revision = 13 -> 14 on kops_cluster.cluster.

Maybe useful info: I completely destroyed the 1.22 cluster and restarted with 1.23 in a different AZ as part of the upgrade, so the chances of 1.22 remnants are small (although I didn't explicitly check whether the tfstate was empty/deleted before spinning up 1.23).

@argoyle (Contributor) commented Apr 1, 2022

Just to help narrow down what the problem might be: I don't see this on my own prod cluster, where I have a docker config defined, nor on a fresh test cluster where I have no docker config.

@eddycharly (Owner)

@ddelange did you see the same problem with 1.22, or only with 1.23?

@eddycharly (Owner)

I spent some time trying to reproduce the issue, but I didn't succeed.
Can you share your tf config?

@ddelange (Author) commented Apr 1, 2022

Hmm, interesting! Thanks for checking, guys. This was not the case before upgrading to 1.23. The only diff to cluster.tf, apart from the version bump, was adding the containerd.config_override block. Adding a (basic-auth) private docker registry to the cluster was the whole reason for upgrading to 1.23: 1.22 tops out at containerd 1.4, and I needed 1.5+ to make use of containerd's registry mirror auth functionality.

Here's an excerpt of our cluster.tf
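
For illustration only (this is a hedged sketch, not the author's excerpt), a containerd config_override block enabling registry mirror auth could look roughly like this; the registry host and credentials are placeholders, and the TOML keys follow containerd 1.5's CRI registry config:

resource "kops_cluster" "cluster" {
  // ...
  containerd {
    // containerd 1.5+ CRI registry mirror with basic auth (placeholder host/credentials)
    config_override = <<EOF
version = 2
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.example.com"]
  endpoint = ["https://registry.example.com"]
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth]
  username = "example-user"
  password = "example-password"
EOF
  }
  // ...
}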

@eddycharly (Owner)

This looks similar to what I tested.
You used the provider v1.23.0-alpha.1 with k8s 1.23, right?

@ddelange (Author) commented Apr 1, 2022

Correct, 1.23.0-alpha.1 and:

variable "kubernetes_version" {
  type        = string
  description = "Kubernetes version to use for the cluster. MAJOR.MINOR here should not be newer than the kops provider version in versions.tf ref https://kops.sigs.k8s.io/welcome/releases/"
  default     = "v1.23.5"
}

I'm now trying to isolate the bug to the config_override block (applying without it now, to see whether a subsequent apply still triggers the updater).
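
As a hedged sketch of how this presumably ties together (the provider pin in versions.tf that the variable's description refers to, and the variable feeding the cluster spec; the source address and attribute names are assumptions, not taken from the author's files):

// versions.tf (assumed): the cluster's Kubernetes MAJOR.MINOR should not be newer than this provider version
terraform {
  required_providers {
    kops = {
      source  = "eddycharly/kops"
      version = "1.23.0-alpha.1"
    }
  }
}

// cluster.tf (assumed wiring of the variable above)
resource "kops_cluster" "cluster" {
  // ...
  kubernetes_version = var.kubernetes_version
  // ...
}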

@eddycharly (Owner)

I don't think your issue is related to the config override block, but thanks for trying it out.

@ddelange (Author) commented Apr 1, 2022

Looks unrelated indeed. Behaviour stays the same 🤔

@eddycharly (Owner)

Could be related to permissions on the S3 bucket.
Can you check that the files are correctly stored in the bucket (pki/private/kubernetes-ca/keyset.yaml)?

@ddelange (Author) commented Apr 1, 2022

The file exists and looks like a valid manifest. It was created (r/w perms) by my AWS account yesterday when I created the 1.23 cluster, with the same permissions as the rest of the files, e.g. those under /instancegroups.

@eddycharly (Owner)

Hmmm, I'm running dry on ideas :-(
There was an invalid reference to the k/k package in v1.23.0-alpha.1, but I don't think it would cause such an issue.
I can try to cut v1.23.0-alpha.2, though.

@ddelange (Author) commented Apr 1, 2022

Many thanks for taking the time!

I'll just close this for now, and if it disappears over time (or I miraculously find a fix) I'll report back here.

Have a nice weekend 💥

@ddelange closed this as completed Apr 1, 2022
@ddelange (Author) commented Apr 2, 2022

The hotfix seems to be holding steady :)

  lifecycle {
    ignore_changes = [
      secrets,
    ]
  }

@eddycharly (Owner)

It’s a hack, you shouldn’t need this.
If possible I would empty the bucket and try from scratch completely.

@eddycharly (Owner)

I just released v1.23.0-alpha.2 but I doubt it will fix your issue.

@ddelange (Author) commented Apr 2, 2022

Thanks for the ping!

@ddelange (Author) commented Apr 3, 2022

> If possible I would empty the bucket and try from scratch completely.

Seems like that did not solve the issue 🤔

@peter-svensson

@argoyle and I had this issue as well on some clusters.
It seems to work if we always define secrets in the config, like:

  secrets {
    docker_config   = "{}"
  }

@eddycharly (Owner)

Interesting, I am going to reopen and investigate the issue again.

@eddycharly reopened this Jun 2, 2022
@eddycharly (Owner)

I was able to somewhat reproduce the issue:

  • create a cluster with ca cert/private key
  • update the cluster, removing ca cert/private key
  • next applies always trigger an update

Would that look like the scenario you are hitting?

@ddelange (Author) commented Jun 2, 2022

FWIW, I never used the secrets block.

@eddycharly (Owner)

I have a good suspect in mind but no access to an AWS account, making it difficult to track it down.

CAs are stored in pki/private/kubernetes-ca/keyset.yaml (maybe in pki/private/ca/keyset.yaml with older kOps versions).

When one doesn't provide a CA cert/key, kOps will create one, and I'm not sure whether it is stored in the same place (I suppose it is, but can't confirm).

From what I understand, it's not possible to remove a CA that is in use, so that is probably related.

What could happen:

  • You create a cluster without specifying the CA
  • kOps generates a CA automatically and stores it
  • Finally the CA ends up in the state
  • When you run a plan, if you don't specify the secrets block, terraform tries to delete the CA
  • Deleting a CA is not possible (it can be rotated, but a CA that is in use cannot be deleted)
  • The CA keeps being added back to the state, so there is a permanent diff

Now, if you specify the secrets block (you can leave it empty), it looks like terraform no longer complains (because the CA cert/key are marked as computed).

resource "kops_cluster" "cluster" {
  // ...
  secrets {}
  // ...
}

It would be nice if someone could confirm this.

@eddycharly (Owner)

@peter-svensson @argoyle as mentioned in my comment above, I suspect you can use an empty secrets block; no need for a dummy docker_config.

@ddelange (Author) commented Jun 2, 2022

--- a/k8s/kops/cluster.tf
+++ b/k8s/kops/cluster.tf
@@ -217,12 +217,7 @@ EOF
     }
   }

-  lifecycle {
-    ignore_changes = [
-      secrets,
-    ]
-  }
+  secrets {}
 }

This change did indeed not trigger the updater!

@argoyle (Contributor) commented Jun 2, 2022

Seems to do the trick with our setup as well 🎉

@ddelange (Author) commented Jun 2, 2022

We also have an

  authorization {
    always_allow {}
  }

block for the same reason btw :)

@eddycharly (Owner)

Great, thanks for testing it!
I will consider making secrets required and documenting that it can be left empty.

@eddycharly (Owner)

@ddelange why not RBAC?

  authorization {
    rbac {}
  }

@ddelange (Author) commented Jun 15, 2022

We're spinning up Rancher v2 on the cluster via helm chart v2.6.5.

I flipped the authorization config this morning and recreated the cluster, but the Rancher mechanics (lots of opaque helm jobs, etc.) don't like it, and the cluster no longer shows up in the Rancher UI. I've been skimming through the various pod logs, but found no RBAC-related messages.

Ironically, some time ago I managed to fix another Rancher-related issue by creating a ClusterRole, which shouldn't have been necessary given that we had always_allow, if I understand correctly 🤔

Like I wrote there, spinning up with always_allow still shows rbac:

$ kubectl api-versions | grep rbac
rbac.authorization.k8s.io/v1

EDIT: now successfully changed to rbac. The cluster was only not being recognised because I was logged in to Rancher via GitHub OAuth (my permissions had been recreated with the default standard-user role after I wiped the cluster); logging in as admin allowed me to see the cluster again.
