
Fix logging of closure limits #34

Merged
merged 7 commits into from
Oct 9, 2024

Conversation


@pratikunterwegs pratikunterwegs commented Oct 7, 2024

This PR fixes an issue where closures were not ended in model runs where the closure began after the epidemic had stopped growing. This mostly affected edge cases: countries with large spare hospital capacity and relatively late response_times. In such cases the closure end time was assigned as the simulation end time, inflating costs related to closures. The fix:

  • Prevents closures from being activated by the hospital_capacity trigger if the epidemic is not growing, even if the hospital capacity threshold is crossed;

  • Prevents closures from being activated by the response_time trigger if the epidemic is not growing between stage 01 and stage 02. Closures are manually turned off if the epidemic is not growing ($R_t < 1.0$).
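The guard can be sketched roughly as follows. This is a minimal sketch with hypothetical names (`rt`, `hosp_switch`, `switch`), not the package code in R/closure.R:

```r
# Sketch (hypothetical names): a closure only starts if the hospital
# trigger has fired AND the epidemic is growing (rt >= 1); otherwise
# the closure switch stays off and no start time is logged
maybe_start_closure <- function(state, rt) {
  if (state$hosp_switch && rt >= 1.0 && !state$switch) {
    state$switch <- TRUE
    state$closure_start_time <- state$time
  }
  state
}
```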

Tests for different response times check that the model behaves as expected.

Miscellaneous changes

  • The package now requires R >= 4.3.0, due to the use of array2DF(), which was added to base R in that version.
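For reference, array2DF() converts a named array into a long data frame, one row per cell, with the dimnames as columns and the cell contents in a `Value` column; the array below is an illustrative example, not model data:

```r
# array2DF() is available from base R 4.3.0
x <- array(
  1:4,
  dim = c(2, 2),
  dimnames = list(age = c("young", "old"), strata = c("s1", "s2"))
)
array2DF(x)
# a 4-row data frame with columns `age`, `strata`, and `Value`
```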

This PR builds on #33 and needs to be rebased on main before merging.

@pratikunterwegs pratikunterwegs self-assigned this Oct 7, 2024

@richfitz richfitz left a comment


Some queries here quickly - I'm losing track of the switches sorry

R/closure.R Outdated
state,
c(N_AGE_GROUPS, N_MODEL_COMPARTMENTS, N_ECON_STRATA, N_VACCINE_STRATA)
)
cm <- parameters[["contact_matrix"]] %*% diag(parameters[["demography"]])
Contributor

If neither of these inputs ever changes, is it perhaps better to compute this once in parameter set-up rather than adding this matrix multiplication to every root check?
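A sketch of that suggestion, assuming a hypothetical parameter-construction helper (the real parameter names in the package may differ):

```r
# compute the demography-weighted contact matrix once during
# parameter set-up, rather than inside every root check
make_parameters <- function(contact_matrix, demography, r0) {
  list(
    contact_matrix = contact_matrix,
    demography = demography,
    r0 = r0,
    # precomputed once; the root check then just reads it
    cm_weighted = contact_matrix %*% diag(demography)
  )
}

# in the root check, instead of recomputing the product:
# cm <- parameters[["cm_weighted"]]
```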

Collaborator Author

Added to parameters.

R/closure.R Outdated
c(N_AGE_GROUPS, N_MODEL_COMPARTMENTS, N_ECON_STRATA, N_VACCINE_STRATA)
)
cm <- parameters[["contact_matrix"]] %*% diag(parameters[["demography"]])
rt <- r_eff(parameters[["r0"]], state, cm)
Contributor

We might want to get a "r_greater_than_one" check function written at some point that could be much faster than doing this accurately - would be good to get some profiles done at some point

Collaborator Author

Would that be a simple wrapper around this operation, or is there something more complex I'm missing?

Contributor

Something more complex - we don't care about the eigenvalues very much - we just care about the sign of the leading one. (Much) faster algorithms exist for finding the leading eigenvalue and I think we can generalise that (or find an existing generalisation) if we just care about the sign. But no point doing any of that unless this is actually a timesink, hence the suggestion of profiling first

Collaborator Author

I have thoughts on this that open up another can of worms related to the $R_0$ to $\beta$ conversion used in this model.

I've worked out a relatively simple mathematical expression for the eigenvalue here: https://hackmd.io/xfzSdlY-SnGblvP8aOA7Fw, as a precursor to calculating $\beta$, but we don't really use this in the model as I wasn't able to explain it sufficiently to EPPI (and they have their own maths, which I think is equivalent). So this alternative exists if we can sit down and hash out whether the maths is correct.


# NOTE: trigger response and log closure start time only if
# epidemic is growing
if (rt >= 1.0) {
Contributor

should we also be checking somewhere here that switch is already FALSE? Or is it sufficient that hosp_switch is FALSE?

Collaborator Author

@pratikunterwegs pratikunterwegs Oct 8, 2024


I think that switch and hosp_switch are equivalent flags in stage 01 (1:response_time).

The sequence of events is:

  • Hospitalisations cross the capacity threshold, but hosp_switch is FALSE;
  • hosp_switch is set to TRUE, activating excess mortality;
  • IFF the epidemic is growing, the response is triggered and switch is turned on, activating a response;
  • time is logged as closure_start_time.

I have split up the two switches in make_rt_end_event() as they are independent once the response is triggered by the default response_time (t = 30).

@pratikunterwegs
Collaborator Author

pratikunterwegs commented Oct 8, 2024

Some queries here quickly - I'm losing track of the switches sorry

The issue with the switches is that they're split over multiple model stages and events, rather than having an integrated event function that modifies each switch from within the ODE RHS.

This isn't going to get any easier to keep track of as I need to add a vaccination switch which I omitted in #33.

@pratikunterwegs
Collaborator Author

Just following up on the eigenvalue calculation comment with this reprex benchmarking the power iteration method for leading-eigenvalue approximation against R's eigen(). For our use in r_eff() the speed-up is about 2x (on my machine). The benefits are greater for larger matrices; ours is only 4x4, so we don't see the gains as much.

I can look into other algorithms, or we could work out the maths with EPPI as I'd be more confident in the model if both use-cases of the eigenvalue ($R_0$ and $R_t$) used the same maths.

# power iteration method for leading eigenvalue
power_iteration <- function(A, tol = 1e-6, max_iter = 1000) {
  # Initialize a random vector
  n <- nrow(A)
  v <- rnorm(n)
  v <- v / sqrt(sum(v^2))  # Normalize the vector
  
  lambda_old <- 0  # Initial value for the leading eigenvalue
  
  for (i in 1:max_iter) {
    # Multiply the matrix by the vector
    v_new <- A %*% v
    
    # Normalize the new vector
    v_new <- v_new / sqrt(sum(v_new^2))
    
    # Estimate the leading eigenvalue (Rayleigh quotient)
    lambda_new <- as.numeric(t(v_new) %*% A %*% v_new)
    
    # Check for convergence
    if (abs(lambda_new - lambda_old) < tol) {
      break
    }
    
    # Update the vector and eigenvalue for the next iteration
    v <- v_new
    lambda_old <- lambda_new
  }
  
  sign(lambda_new)
}

# R `eigen()` method for eigenvalues
f <- function(A) {
  eigv <- max(Re(eigen(A)$values))
  
  sign(eigv)
}

# small matrix similar to contact matrix %*% diag(p_susc)
A <- matrix(
  runif(16, 0, 9999), 4, 4
)

# check for 4x4 matrix
microbenchmark::microbenchmark(
  power_iteration = power_iteration(A),
  trad = f(A),
  check = "identical"
)
#> Unit: microseconds
#>             expr    min      lq     mean  median      uq      max neval
#>  power_iteration 23.410 27.9335 31.84206 29.7945 35.4180   64.752   100
#>             trad 44.339 48.2375 66.72733 50.8470 60.3515 1322.670   100

# check for 100x100 matrix
A <- matrix(
  runif(10000, 0, 9999), 100, 100
)

# check for 100x100 matrix
microbenchmark::microbenchmark(
  power_iteration = power_iteration(A),
  trad = f(A),
  check = "identical"
)
#> Unit: microseconds
#>             expr      min       lq      mean  median       uq      max neval
#>  power_iteration  111.798  134.937  145.8999  140.80  154.838  225.568   100
#>             trad 3759.610 3883.711 3955.9885 3923.24 3968.626 5882.294   100

@richfitz
Contributor

richfitz commented Oct 9, 2024

Let's also find out what fraction of the total time it takes before worrying - in the case that prompted writing eigen1 it was taking something like 70% of the total time to simulate! But these were also quite large matrices (on the order of hundreds of rows/columns) where the gains were large.

In the case here, we should be able to adapt the power method to work out whether we're heading above or below 1, which would be faster again. But if you have an actual maths solution, which I'd believe is findable for a 4x4, that will surely be the fastest.
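One way that adaptation could look (a sketch with assumed names, not tested against the model): run the power iteration, but return as soon as the eigenvalue estimate has stabilised clearly on one side of 1.

```r
# sketch: TRUE if the leading eigenvalue of a non-negative matrix A
# exceeds 1; exits early once the estimate settles away from 1
r_greater_than_one <- function(A, margin = 1e-3, max_iter = 100L) {
  v <- rep(1 / sqrt(nrow(A)), nrow(A))  # positive start vector
  lambda_old <- 0
  for (i in seq_len(max_iter)) {
    v_new <- as.numeric(A %*% v)
    lambda <- sqrt(sum(v_new^2))  # |A v| with |v| = 1
    v <- v_new / lambda
    # stop once the estimate is stable and clearly not equal to 1
    if (abs(lambda - lambda_old) < margin && abs(lambda - 1) > margin) {
      return(lambda > 1)
    }
    lambda_old <- lambda
  }
  lambda > 1
}
```

The positive start vector matters: next-generation matrices are non-negative, so the Perron–Frobenius leading eigenvalue is real and the iteration converges to it from any vector with a positive component along its eigenvector.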

@pratikunterwegs pratikunterwegs merged commit 0c84220 into main Oct 9, 2024
10 checks passed
pratikunterwegs added a commit that referenced this pull request Oct 9, 2024
pratikunterwegs added a commit that referenced this pull request Oct 10, 2024
pratikunterwegs added a commit that referenced this pull request Oct 10, 2024