Predict config changes for purging #563

m5r · 2023-06-21T14:30:39Z

Describe the issue

Ensure community members can own sustainable CHT deployments without Medic directly involved

App developers can easily visualize and quantify the impact of a change to config for purging

Additional context
Related allies OKR

jkuester · 2023-06-27T17:59:17Z

Behavior Overview

(@m5r please correct this if it is wrong!)

A new cht-conf action, dry-run-purge-config, has been added. When you execute this action, it calls the new API endpoint with your current purge config and prints the results. The results indicate:

When the next purge will run
How many total docs would be purged with the new config
How many currently purged docs would be unpurged with the new config
How many docs would not have their purged status change
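
To make the last three counts concrete, here is a minimal sketch of how they could be derived from two sets of doc ids. It is not the cht-conf or cht-core implementation: the names `summarize_dry_run`, `DryRunSummary`, `purged_now` and `purged_new` are hypothetical, and it assumes "unchanged" means every doc whose purged status is the same under both configs (purged in both, or purged in neither).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DryRunSummary:
    """Hypothetical result shape; the real API response may differ."""
    would_purge: int    # total docs purged under the new config
    would_unpurge: int  # purged today, but not under the new config
    unchanged: int      # same purged status under both configs


def summarize_dry_run(all_docs, purged_now, purged_new):
    """Compare the current purged set against the set a new config would purge."""
    all_docs = set(all_docs)
    # Ignore ids that are not part of the known doc set.
    purged_now = set(purged_now) & all_docs
    purged_new = set(purged_new) & all_docs

    would_unpurge = purged_now - purged_new  # purged -> not purged
    newly_purged = purged_new - purged_now   # not purged -> purged
    # Assumption: everything whose status does not flip counts as unchanged.
    unchanged = all_docs - would_unpurge - newly_purged

    return DryRunSummary(
        would_purge=len(purged_new),
        would_unpurge=len(would_unpurge),
        unchanged=len(unchanged),
    )


summary = summarize_dry_run(
    all_docs={"a", "b", "c", "d", "e"},
    purged_now={"a", "b"},
    purged_new={"b", "c", "d"},
)
print(summary)
```

With this toy data, three docs would be purged, one ("a") would be unpurged, and two ("b" and "e") keep their status. Computing the two input sets is the expensive part on a real deployment; the sketch only covers the comparison.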

m5r · 2024-01-25T11:11:18Z

As noted in the initial cht-core PR, we tried to solve this by running the purging code without the database mutations (a dry run). We ran into the same limits as actual purging: slow queries made the dry run take hours to complete. Here is a copy of our test results:

I got some disappointing news about our purging dry run solution 😞

I started a dry run of a purge this morning (my time) on a clone of Muso-Mali, running on a beefy machine with similar specs: Xeon E5-2686 v4 @ 2.30GHz, 256 GB of RAM, ~650 GB of data stored on a 1.5 TB disk. I'm using a fork of CHT 3.13.0 with the purging dry run API living on the temporary branch 3.13.0-FR-dry-run-purging.

It's the beginning of the night over here and the dry run is still going. It took nearly 5 hours to simulate purging contacts, processing ~10k records with each batched request. Our assumption was that queries were cheap and that mutating the data was the expensive part that makes purging so slow. It turns out the queries are expensive too: we're seeing roughly the same performance as actual purging, despite using CouchDB views.

CPU usage averages 35%, with spikes to 80%. Any loss of connection between cht-conf and the API during the dry run results in wasted CPU usage: the API keeps running the dry run, but cht-conf can't reconnect to it to wait for the results.

With all this, it's safe to say we cannot move forward with this solution and we should go back to the design step for this feature.

m5r added the "Type: Feature" label Jun 21, 2023

m5r self-assigned this Jun 21, 2023

m5r mentioned this issue Jun 21, 2023

feat(#563): command to measure the effect of purging rules changes #564

Closed

m5r removed their assignment Oct 24, 2023
