
Reports aggregation in a separate process #32

Open
wants to merge 1 commit into base: main

Conversation

eddycharly
Member

This PR proposes to change the way we generate reports: generate per-resource reports, let Kubernetes take care of the reports' lifecycle, and aggregate them into higher-level reports in a separate process/controller.

It is aligned with other KDPs meant to decompose Kyverno into multiple processes.
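
As a rough sketch of the lifecycle part (the helper, group/version, and kind below are illustrative assumptions, not the actual Kyverno API), a per-resource report can carry an ownerReference to the resource it describes, so Kubernetes garbage collection deletes the report when the resource is deleted:

```go
package reports

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// newPerResourceReport is a hypothetical helper: it builds a report object
// (an unstructured stand-in for the per-resource report type) owned by the
// resource it describes, so deleting the resource also deletes the report.
func newPerResourceReport(owner *unstructured.Unstructured) *unstructured.Unstructured {
	report := &unstructured.Unstructured{}
	report.SetAPIVersion("kyverno.io/v1alpha2") // illustrative group/version
	report.SetKind("ReportChangeRequest")       // illustrative kind
	report.SetNamespace(owner.GetNamespace())
	report.SetName("report-" + string(owner.GetUID()))
	report.SetOwnerReferences([]metav1.OwnerReference{{
		APIVersion: owner.GetAPIVersion(),
		Kind:       owner.GetKind(),
		Name:       owner.GetName(),
		UID:        owner.GetUID(),
	}})
	return report
}
```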

@chipzoller
Contributor

You've got some relics from another KDP you used as a template in here :)

@eddycharly
Member Author

> You've got some relics from another KDP you used as a template in here :)

Fixed the one in the Metas section

@eddycharly
Member Author

Removed Prior art that leaked from the other KDP too.

Comment on lines +48 to +50
1. At admission time, all policies running in audit mode are run against the admission request and produce report results.
1. When a policy is created/updated/deleted, if the policy can run in background mode, reports are updated according to the policy changes.
1. Periodically, policies running in background mode are re-evaluated against resources present in the cluster and reports are updated accordingly.
Member

Index:
1.
2.
3.

Member Author

This will appear correctly in md viewer.

# Proposal

In this proposal, we study the possibility of changing the way reports are generated by:
- creating one report per resource
Member

Is report here a report change request or a policy report?

Member Author

I tested with RCR, but it could be a policy report as well.

@Boojapho

@eddycharly If you align the reports and TTL with each resource, then it seems like we could just populate the report in the status of the resource itself. I would think that would simplify the architecture. Aggregation would occur by reading the resource status.

@eddycharly
Member Author

@Boojapho The status of the analysed resources?

@Boojapho

> @Boojapho The status of the analysed resources?

Just realized when you asked this that Kyverno doesn't own the status fields of the analyzed resources. So, this wouldn't work.

You could achieve a similar effect using annotations on the analyzed resource. But, that would require you to have write access to all resources, so that's not going to work either.

@eddycharly
Member Author

Yeah, basically we want a report; it makes sense that it lives in its own type.
I prefer not to bake that into the resource itself TBH.

@chipzoller
Contributor

It seems that creating a 1:1 mapping of resource:report could really put a squeeze on Kubernetes' data store (etcd). To me, it still seems like splitting reports per policy (and overflowing to more if the need arises) may be more efficient.

@eddycharly
Member Author

Splitting per policy has some advantages too, but we won't be able to use k8s native garbage collection.
Let's try to list the pros/cons of both approaches.

@realshuting
Member

> Splitting per policy has some advantages too, but we won't be able to use k8s native garbage collection. Let's try to list the pros/cons of both approaches.

When we discussed offline, I initially thought we were going to set ownerReference for RCRs (to solve the current RCR cleanup issue) and generate a report per namespace.

@eddycharly
Member Author

eddycharly commented Sep 12, 2022

> Initially I thought we were going to set ownerReference for RCRs (to solve the current RCR cleanup issue)

That's what I have now. This doesn't make a real difference for @chipzoller's point though; it's still a one-to-one mapping.

@realshuting
Member

realshuting commented Sep 12, 2022

> Initially I thought we were going to set ownerReference for RCRs (to solve the current RCR cleanup issue)

> That's what I have now. This doesn't make a real difference for @chipzoller's point though; it's still a one-to-one mapping.

I meant to generate an RCR per resource, and merge RCRs into a single policy report for one namespace.

@eddycharly
Member Author

> It seems that creating a 1:1 mapping of resource:report could really put a squeeze on Kubernetes' data store

True, but I don't know if it's an issue; when managing a large cluster you usually size the control plane adequately.

@eddycharly
Member Author

eddycharly commented Sep 12, 2022

> I meant to generate an RCR per resource, and merge RCRs into a single policy report for one namespace.

Do you expect that we remove the RCR once merged into the report?
My initial idea was to keep it for as long as the resource exists, hence creating a 1:1 mapping.

@realshuting
Member

> I meant to generate an RCR per resource, and merge RCRs into a single policy report for one namespace.

> Do you expect that we remove the RCR once merged into the report? My initial idea was to keep it for as long as the resource exists, hence creating a 1:1 mapping.

> Do you expect that we remove the RCR once merged into the report?

No, per this proposal, the RCR shares the same lifecycle as the resource. It is just that, in addition to creating the RCR per resource, we will merge RCRs into one report per namespace. But this could increase the size by 2x, as we duplicate the data.
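
A minimal sketch of that merge step, assuming simplified stand-in types rather than the real RCR/PolicyReport schemas; it also shows where the roughly 2x duplication comes from, since the per-resource reports stay around after their results are copied into the namespace report:

```go
package reports

// Result and Report are simplified stand-ins for the real per-resource
// report and namespace report schemas; field names are illustrative.
type Result struct {
	Policy, Rule, Resource, Outcome string
}

type Report struct {
	Namespace string
	Results   []Result
}

// mergeNamespaceReport copies every result from the per-resource reports
// into one namespace-level report. The per-resource reports are kept (they
// follow the resource's lifecycle), so each result ends up stored twice,
// which is the ~2x growth mentioned above.
func mergeNamespaceReport(namespace string, perResource []Report) Report {
	merged := Report{Namespace: namespace}
	for _, r := range perResource {
		merged.Results = append(merged.Results, r.Results...)
	}
	return merged
}
```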

@eddycharly
Member Author

eddycharly commented Sep 12, 2022

> But this could increase the size by 2x, as we duplicate the data

Are you suggesting that we don't need to aggregate at the namespace level? ;-)
In the end, if we have the reports for all resources, why do we need to aggregate?
We could just list reports instead of getting a single one, and maybe only aggregate the summaries?
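
A possible middle ground, sketched with hypothetical types: keep the full results only in the per-resource reports and aggregate just the summary counters at the namespace level:

```go
package reports

// Summary mirrors the pass/fail/warn/error/skip counters a policy report
// summary typically carries; the field names are illustrative.
type Summary struct {
	Pass, Fail, Warn, Error, Skip int
}

// aggregateSummaries folds the per-resource summaries into a single
// namespace-level summary without copying individual results; consumers
// that need the details can still list the per-resource reports.
func aggregateSummaries(perResource []Summary) Summary {
	var total Summary
	for _, s := range perResource {
		total.Pass += s.Pass
		total.Fail += s.Fail
		total.Warn += s.Warn
		total.Error += s.Error
		total.Skip += s.Skip
	}
	return total
}
```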

@Boojapho

You should consider minimizing the Kubernetes API call load in the solution. The more you can aggregate in memory and then write to the report once, the lower the probability of hitting throttling issues.
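
For illustration, a sketch of that batching idea (types and names are hypothetical): collect results in memory and issue one aggregated write per interval rather than one write per admission event:

```go
package reports

import (
	"context"
	"time"
)

// resultUpdate is a placeholder for whatever per-resource result payload the
// admission path produces.
type resultUpdate struct {
	Policy, Resource, Outcome string
}

// batcher accumulates results in memory and flushes the aggregated report
// with a single API write per interval, instead of one write per admission
// event.
type batcher struct {
	updates chan resultUpdate
	flush   func(ctx context.Context, pending []resultUpdate) error // one API write
}

func (b *batcher) run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	var pending []resultUpdate
	for {
		select {
		case r := <-b.updates:
			pending = append(pending, r)
		case <-ticker.C:
			if len(pending) == 0 {
				continue
			}
			// A single write covers everything collected in this window,
			// keeping the client well under API server throttling limits.
			if err := b.flush(ctx, pending); err == nil {
				pending = nil
			}
		case <-ctx.Done():
			return
		}
	}
}
```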

@eddycharly
Member Author

> You should consider minimizing the Kubernetes API call load in the solution

Definitely.

On the other hand, the solution we have now is not perfect: we are constantly creating/deleting RCRs instead of keeping them around and updating them when necessary.
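
A sketch of that update-in-place alternative, using a hypothetical client interface rather than the actual Kyverno clients: the report is created once and then updated for as long as the resource exists:

```go
package reports

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// reportClient is a hypothetical, narrow interface over whichever client the
// reports controller ends up using.
type reportClient interface {
	Get(ctx context.Context, namespace, name string) (*unstructured.Unstructured, error)
	Create(ctx context.Context, report *unstructured.Unstructured) error
	Update(ctx context.Context, report *unstructured.Unstructured) error
}

// upsertReport keeps a single report object per resource alive, updating it
// in place and only creating it the first time, instead of deleting and
// recreating it on every evaluation.
func upsertReport(ctx context.Context, c reportClient, desired *unstructured.Unstructured) error {
	existing, err := c.Get(ctx, desired.GetNamespace(), desired.GetName())
	if apierrors.IsNotFound(err) {
		return c.Create(ctx, desired)
	}
	if err != nil {
		return err
	}
	// Carry over the resourceVersion so the update replaces the existing object.
	desired.SetResourceVersion(existing.GetResourceVersion())
	return c.Update(ctx, desired)
}
```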

Signed-off-by: Charles-Edouard Brétéché <charled.breteche@gmail.com>
@chipzoller
Contributor

This has already been completed, yes?
