R scripts for MODY prediction in CPRD
Bayesian hierarchical model combining case-control data with UNITED (population-representative data). It uses a mixture approach, splitting patients based on a latent variable T. If the patient is
The function used for predictions is available in 00.prediction_functions.R. To use it, load the function into your environment:
source("00.prediction_functions.R")
You also need to load the Bayesian model posteriors (model parameters) and wrap them in an object of class "T1D":
rcs_parms <- readRDS("model_posteriors/rcs_parms.rds")
posterior_samples_T1D <- readRDS("model_posteriors/type_1_model_posteriors.rds")
posterior_samples_T1D_obj <- list(post = posterior_samples_T1D$samples)
class(posterior_samples_T1D_obj) <- "T1D"
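Setting the class is presumably what lets the generic predict() find the right method, since R's S3 dispatch looks for a function named predict.<class>. The toy sketch below illustrates that mechanism only. The class name toy_model and the method body are made up for illustration and are not the code in 00.prediction_functions.R.

```r
# Toy illustration of S3 dispatch; "toy_model" and its contents are hypothetical.
toy_obj <- list(post = matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2))
class(toy_obj) <- "toy_model"

# predict() is an S3 generic, so predict(toy_obj) calls predict.toy_model().
predict.toy_model <- function(object, ...) {
  # Average each column of the stored matrix.
  colMeans(object$post)
}

toy_pred <- predict(toy_obj)
```

By the same mechanism, an object of class "T1D" would be routed to a predict.T1D method, assuming one is defined in the sourced file.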
In order to make predictions, you need the following variables:
pardm: Parent history of diabetes, with history represented by 1 and no history represented by 0
agerec: Age at recruitment
hba1c: HbA1c
agedx: Age at diagnosis
sex: Sex, with Male represented by 0 and Female represented by 1
bmi: BMI
T: C-peptide and autoantibody testing, with C+ and A- represented by 0, C- or A+ represented by 1, and any other combination represented by NA
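The coding of T can be sketched as follows. This is one reading of the rule above, assuming each test result is stored as a logical with NA meaning "not tested or unknown". The column names cpep_positive and ab_positive and the helper encode_T are hypothetical and do not come from the repository.

```r
# Hypothetical raw test results; column names are illustrative only.
tests <- data.frame(
  cpep_positive = c(TRUE, FALSE, TRUE, NA, TRUE),
  ab_positive   = c(FALSE, FALSE, TRUE, FALSE, NA)
)

# T = 0 when C-peptide positive and antibody negative (C+ and A-),
# T = 1 when C-peptide negative or antibody positive (C- or A+),
# T = NA when the available results do not settle either condition.
encode_T <- function(cpep_positive, ab_positive) {
  c_pos_and_a_neg <- cpep_positive & !ab_positive
  c_neg_or_a_pos  <- (!cpep_positive) | ab_positive
  out <- rep(NA_real_, length(cpep_positive))
  out[which(c_pos_and_a_neg)] <- 0
  out[which(c_neg_or_a_pos)] <- 1
  out
}

tests$T <- encode_T(tests$cpep_positive, tests$ab_positive)
```

R's three-valued logic handles partial results: a known positive antibody test gives T = 1 even when C-peptide is missing, while a missing result that leaves both conditions undetermined yields NA.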
The dataset has to be formatted as a tibble.
library(dplyr)  # provides select(), as_tibble(), bind_rows() and %>%
predictions_x <- as_tibble(as.matrix(select(patients_dataset, pardm, agerec, hba1c, agedx, sex, bmi, T)))
The last step is to make the predictions themselves:
# For each column of the predict() output, compute the mean and the 2.5% / 97.5% quantiles
predictions_T1D <- predict(posterior_samples_T1D_obj, predictions_x, rcs_parms) %>%
  apply(., 2, function(x) {
    data.frame(prob = mean(x), LCI = quantile(x, probs = 0.025), UCI = quantile(x, probs = 0.975))
  }) %>%
  bind_rows()
This makes the predictions and calculates the mean probability of having a MODY gene, together with the lower and upper bounds of the 95% credible interval (the 2.5% and 97.5% quantiles). If you are not interested in the uncertainty, use only the mean prediction.
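To see what the summary step does in isolation, here is a minimal sketch on simulated numbers. It assumes the predict() output is a matrix with one row per posterior draw and one column per patient, which is inferred from the use of apply(., 2, ...) and is not confirmed by the source. It uses base R's rbind in place of bind_rows() so that it runs without extra packages.

```r
set.seed(1)
# Simulated stand-in for the predict() output: 1000 draws (rows) for
# 3 patients (columns). The values are made up for illustration.
draws <- cbind(rbeta(1000, 2, 50), rbeta(1000, 10, 30), rbeta(1000, 30, 10))

# Same per-column summary as the pipeline above; apply() returns a list
# of one-row data frames because a data frame is a recursive object.
per_column <- apply(draws, 2, function(x) {
  data.frame(prob = mean(x),
             LCI = quantile(x, probs = 0.025),
             UCI = quantile(x, probs = 0.975))
})

# Stack the one-row data frames into one row per patient.
summary_df <- do.call(rbind, per_column)
rownames(summary_df) <- NULL

# Mean prediction only, for when the interval is not needed.
mean_only <- summary_df$prob
```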
Bayesian shrinkage recalibration logistic model which scales the odds ratios and adjusts the intercept based on the general population data.
The function used for predictions is available in 00.prediction_functions.R. To use it, load the function into your environment:
source("00.prediction_functions.R")
You also need to load the Bayesian model posteriors (model parameters).
posterior_samples_T2D <- readRDS("model_posteriors/type_2_model_posteriors.rds")
posterior_samples_T2D_obj <- list(post = posterior_samples_T2D$samples)
class(posterior_samples_T2D_obj) <- "T2D"
In order to make predictions, you need the following variables:
pardm: Parent history of diabetes, with history represented by 1 and no history represented by 0
agerec: Age at recruitment
hba1c: HbA1c
agedx: Age at diagnosis
sex: Sex, with Male represented by 0 and Female represented by 1
bmi: BMI
insoroha: Patient currently on insulin or tablets, with TRUE represented by 1 and FALSE represented by 0
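Before building the tibble, it can help to check that the dataset matches the coding described above. The helper below is a hypothetical sketch and is not part of 00.prediction_functions.R. The example values are made up and imply nothing about units. Missing values are let through, because the source does not say whether the model accepts them.

```r
# Hypothetical helper: checks the columns and 0/1 coding listed above.
check_t2d_inputs <- function(df) {
  required <- c("pardm", "agerec", "hba1c", "agedx", "sex", "bmi", "insoroha")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Binary variables must be 0 or 1 (NA is not flagged here).
  for (col in c("pardm", "sex", "insoroha")) {
    bad <- !is.na(df[[col]]) & !(df[[col]] %in% c(0, 1))
    if (any(bad)) stop("Column '", col, "' must be coded 0/1")
  }
  invisible(TRUE)
}

# Made-up example patients.
example_patients <- data.frame(
  pardm = c(1, 0), agerec = c(35, 52), hba1c = c(55, 68),
  agedx = c(22, 40), sex = c(0, 1), bmi = c(23.5, 31.2), insoroha = c(1, 0)
)
check_t2d_inputs(example_patients)
```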
The dataset has to be formatted as a tibble.
predictions_x <- as_tibble(as.matrix(select(patients_dataset, pardm, agerec, hba1c, agedx, sex, bmi, insoroha)))
The last step is to make the predictions themselves:
# For each column of the predict() output, compute the mean and the 2.5% / 97.5% quantiles
predictions_T2D <- predict(posterior_samples_T2D_obj, predictions_x) %>%
  apply(., 2, function(x) {
    data.frame(prob = mean(x), LCI = quantile(x, probs = 0.025), UCI = quantile(x, probs = 0.975))
  }) %>%
  bind_rows()
This makes the predictions and calculates the mean probability of having a MODY gene, together with the lower and upper bounds of the 95% credible interval (the 2.5% and 97.5% quantiles). If you are not interested in the uncertainty, use only the mean prediction.
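One way to use the result is to attach it back to the patient records. The sketch below assumes the prediction rows come out in the same order as the input rows, which the source does not state. The data frames patients_example and predictions_example are made-up stand-ins for patients_dataset and predictions_T2D.

```r
# Made-up stand-ins for patients_dataset and predictions_T2D.
patients_example <- data.frame(id = c("a", "b", "c"))
predictions_example <- data.frame(prob = c(0.02, 0.40, 0.75),
                                  LCI  = c(0.01, 0.25, 0.60),
                                  UCI  = c(0.05, 0.55, 0.88))

# Guard against a row mismatch before binding columns side by side.
stopifnot(nrow(patients_example) == nrow(predictions_example))
results <- cbind(patients_example, predictions_example)

# Order patients by mean predicted probability, highest first.
results <- results[order(results$prob, decreasing = TRUE), ]
```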