Adding forecasting functionality #14

Closed
adamkucharski opened this issue Dec 9, 2022 · 7 comments

Comments

@adamkucharski
Member

Once we have an estimate of CFR, which under current estimation methods will typically run up to the most recent death, it would be possible to generate a forecast forward in time based on the estimated CFR, the onset-to-outcome delay and recent case numbers.
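
As a rough illustration of the idea, expected future deaths can be written as recent cases convolved with the onset-to-outcome delay and scaled by the CFR. The sketch below is not taken from any package; the function name `forecast_deaths` and its arguments are hypothetical, the CFR is treated as a fixed point estimate, and only cases already observed contribute (no model for future case incidence is assumed).

```python
def forecast_deaths(cases, delay_pmf, cfr, horizon):
    """Expected deaths per day for `horizon` days after the last observed day.

    cases:     daily case counts by onset date, observed so far
    delay_pmf: delay_pmf[k] = P(outcome occurs k days after onset)
    cfr:       estimated case fatality risk, treated as fixed (an assumption)
    Only already-observed cases contribute; future cases are not modelled.
    """
    n = len(cases)
    forecast = []
    for t in range(n, n + horizon):
        expected_outcomes = 0.0
        for k, p in enumerate(delay_pmf):
            onset_day = t - k
            if 0 <= onset_day < n:
                expected_outcomes += cases[onset_day] * p
        forecast.append(cfr * expected_outcomes)
    return forecast


# Toy inputs: outcomes occur 1 or 2 days after onset with equal probability.
print(forecast_deaths([10, 20], [0.0, 0.5, 0.5], cfr=0.5, horizon=2))
```

Because uncertainty in the CFR and in the delay is ignored here, the output is an expected value only, not a forecast distribution.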

@sbfnk
Contributor

sbfnk commented Dec 9, 2022

This functionality already exists in EpiNow2 (with example applications in the US and UK) so I don't see the value of replicating this here.

What might be good would be to ensure interoperability by providing output in a format that can be used by forecast_secondary for forecasting.

@adamkucharski
Member Author

Thanks for flagging that functionality. Agree doesn't make sense to duplicate, but a couple of thoughts/questions:

  • Does it introduce an inconsistency in interpretation of uncertainty if method used to estimate is different to one used to forecast? E.g. feeding MLE CFR into EpiNow2 model? Also (but less of an issue perhaps) introduces an rstan dependency for all forecasts.
  • We'll be including functionality to calculate CFR from individual level data (e.g. among those with known outcomes in line lists), and could therefore forecast on individual level outcomes then aggregate into incidence, which I think would be more robust than aggregating then forecasting? Of course, if data only available as incidence, then makes no difference.
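
To make the second point concrete, here is a minimal sketch of forecasting at the individual level and then aggregating. It is an illustration only: the function name and arguments are hypothetical, and it assumes the same onset-to-outcome delay for deaths and recoveries, so that a case still awaiting an outcome keeps probability `cfr` of dying. It also assumes every onset day is on or before `today`.

```python
def forecast_from_line_list(onset_days, today, delay_pmf, cfr, horizon):
    """Expected deaths on days today+1 .. today+horizon among open cases.

    onset_days: integer onset day of each case with no outcome yet
    delay_pmf:  delay_pmf[k] = P(outcome occurs k days after onset)
    Assumes one shared delay distribution for deaths and recoveries.
    """
    expected = [0.0] * horizon
    for onset in onset_days:
        elapsed = today - onset
        # Probability the outcome is still to come, given none so far.
        remaining = sum(delay_pmf[elapsed + 1:])
        if remaining <= 0:
            continue  # the delay distribution has no mass beyond today
        for h in range(1, horizon + 1):
            k = elapsed + h
            if k < len(delay_pmf):
                # Per-case contribution, summed into daily incidence.
                expected[h - 1] += cfr * delay_pmf[k] / remaining
    return expected


# Two open cases with onsets on days 8 and 9, forecast made on day 9.
print(forecast_from_line_list([8, 9], today=9, delay_pmf=[0.0, 0.5, 0.5],
                              cfr=0.4, horizon=2))
```

Each case's delay is conditioned on the time it has already spent without an outcome, which is the information lost when the data are aggregated into incidence first.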

@sbfnk
Contributor

sbfnk commented Dec 14, 2022

> Does it introduce an inconsistency in interpretation of uncertainty if method used to estimate is different to one used to forecast? E.g. feeding MLE CFR into EpiNow2 model? Also (but less of an issue perhaps) introduces an rstan dependency for all forecasts.

Re inconsistency: possibly. I guess ideally this would all be done in the same generative model that jointly estimates the delays and makes a forecast, rather than first making an estimate and then feeding it into a forward model. The rstan dependency is a fair point. There is a PR with an R implementation of the simulate_secondary model (which is used by the forecasting function) in EpiNow2 - in theory this could live in a separate package.

> We'll be including functionality to calculate CFR from individual level data (e.g. among those with known outcomes in line lists), and could therefore forecast on individual level outcomes then aggregate into incidence, which I think would be more robust than aggregating then forecasting? Of course, if data only available as incidence, then makes no difference.

Ah, that would definitely be different. I think you'd still need a model for expected case incidence and how it behaves in the future, though, unless you were only interested in the outcomes of already observed cases. Note that functionality to estimate delays from individual-level data has recently also been implemented, e.g. in EpiLine and dynamicaltruncation (though both are Stan-based).

@adamkucharski
Member Author

Thanks, very useful. Have had a go at implementing EpiLine for the Ebola 2022 global.health line list here. I looked into a direct comparison between the two, but the challenge is that EpiLine appears to be hard-coded for a Johnson SU distribution, whereas dynamicaltruncation uses a lognormal, so it is not straightforward to do a direct simulation recovery.

It also made me aware there are two slightly different problems being tackled here. One is joint estimation of delay function and truncated infection dynamics (i.e. above packages) and another is estimation of CFR, adjusting for truncation (i.e. focus of datadelay). For small, noisy datasets (e.g. Ebola) the full joint distribution is often not identifiable.
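
For the second problem, one common correction divides deaths by the number of cases expected to have a known outcome so far, rather than by all cases. The sketch below illustrates that idea under a known, fixed delay distribution; it is not necessarily the method datadelay uses, and the function names are hypothetical.

```python
def naive_cfr(cases, deaths):
    """Deaths over all cases, ignoring that recent cases lack outcomes."""
    return sum(deaths) / sum(cases)


def delay_adjusted_cfr(cases, deaths, delay_pmf):
    """Deaths over cases expected to have a known outcome by the last day.

    cases, deaths: daily counts over the same days
    delay_pmf:     delay_pmf[k] = P(outcome occurs k days after onset),
                   assumed known here rather than jointly estimated
    """
    expected_known = 0.0
    for t in range(len(cases)):
        for k, p in enumerate(delay_pmf):
            if t - k >= 0:
                expected_known += cases[t - k] * p
    if expected_known == 0:
        raise ValueError("no cases are expected to have an outcome yet")
    return sum(deaths) / expected_known


# Toy data: every outcome occurs exactly 2 days after onset and half of
# cases die, so only the first two days of cases have resolved.
cases = [10, 10, 10, 10]
deaths = [0, 0, 5, 5]
print(naive_cfr(cases, deaths), delay_adjusted_cfr(cases, deaths, [0.0, 0.0, 1.0]))
```

In the toy data the naive ratio is half the true value because half of the cases have not yet had time to reach an outcome.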

However, a generative model like your EpiNow2 implementation would be useful if the CFR (or more likely, the reporting rate) is changing over time, which I think would end up with a similar underlying framework to our earlier under-ascertainment analysis.

In the case where the delay from onset to outcome is independent of outcome (i.e. delay to death and to recovery are the same, which is quite a strong assumption), individual-level data conditioned on known outcome wouldn't need any further adjustment: we just need to calculate the CFR directly as a ratio, because the generative process for both deaths and non-deaths is the same.
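
A small simulation can illustrate this claim. The sketch below is hypothetical throughout: the function name, the uniform onset times and the exponential delay are illustrative choices, not anything used in the packages discussed. Each case gets the same delay distribution whether or not it dies, and only cases whose outcome falls before the cutoff are counted.

```python
import random


def simulate_known_outcome_cfr(n_cases, true_cfr, mean_delay, cutoff, seed=1):
    """Ratio of deaths to known outcomes in a truncated, simulated line list.

    Onsets are uniform on [0, cutoff]; the onset-to-outcome delay is
    exponential with mean `mean_delay` for deaths and recoveries alike
    (the strong assumption under discussion).
    """
    rng = random.Random(seed)
    deaths = 0
    known = 0
    for _ in range(n_cases):
        onset = rng.uniform(0, cutoff)
        dies = rng.random() < true_cfr
        delay = rng.expovariate(1.0 / mean_delay)  # independent of outcome
        if onset + delay <= cutoff:  # outcome observed before the cutoff
            known += 1
            deaths += dies
    if known == 0:
        raise ValueError("no simulated case had a known outcome")
    return deaths / known


estimate = simulate_known_outcome_cfr(20000, true_cfr=0.3, mean_delay=10.0, cutoff=30.0)
print(round(estimate, 3))
```

If deaths were instead given a shorter delay than recoveries, the same ratio would overestimate the CFR, which is why the independence assumption matters.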

On a wider note, I noticed I spent a fair bit of time getting the right data structures, dependency versions, valid inputs etc. set up to compare these different outbreak models, so this would be a natural area for contributions if this functionality is a priority for some of the pipelines we're working on.

@pratikunterwegs
Collaborator

Just checking to see whether this issue is also still relevant? Thoughts or suggestions for implementation?

@adamkucharski
Member Author

We've been exploring an EpiNow2 comparison in the case studies @CarmenTamayo has been working on (with static estimation). But had some stability issues recently - would be good to revisit @sbfnk to make sure the estimation function is still appropriate. We'll share the Rmd when Carmen is back.

@pratikunterwegs
Collaborator

Related to epiverse-trace/epiparameter#250. Closing this issue as this will be covered in {howto} using {EpiNow2}.
