Major update that improves support for formulas specification #582

Open
wants to merge 35 commits into
base: master
4cf4953
Major update that improves support for formulas specification
stefvanbuuren Sep 11, 2023
ea84be3
Convert documentation Rd tags to markdown tags for roxygen2
stefvanbuuren Sep 11, 2023
5c6bee2
Add a data argument to nimp() to calculate number of imputations per …
stefvanbuuren Sep 12, 2023
755c23a
Restore classic predictorMatrix behaviour that sets predictorMatrix[j…
stefvanbuuren Sep 13, 2023
c2da03c
Clean up source, identicate that there is still a problem with edit.s…
stefvanbuuren Sep 13, 2023
28821a6
Create a make.nest(), n2b() and b2n() function for working with nest …
stefvanbuuren Sep 13, 2023
731bf25
Insist that predictorMatrix has a zero diagonal
stefvanbuuren Sep 13, 2023
8f92307
- Prevention of NA propagation
stefvanbuuren Sep 18, 2023
772c876
Add exit checks on mids object
stefvanbuuren Sep 18, 2023
465bd5c
Add test for zero predictorMatrix row if method == "", deal with rela…
stefvanbuuren Sep 18, 2023
c8ed335
Update news
stefvanbuuren Sep 18, 2023
05a0209
Update documentation for mice() arguments
stefvanbuuren Sep 18, 2023
6033fc6
Update list of builtin imputation methods
stefvanbuuren Sep 18, 2023
29fee22
Reorder sequence of mice() arguments
stefvanbuuren Sep 18, 2023
fef881b
Reorder nest in data sequence
stefvanbuuren Sep 19, 2023
ba383eb
Use lowercase 'b' and 'f' for automatic naming of blocks and formulas
stefvanbuuren Sep 19, 2023
4175534
Update error message in mpmm
stefvanbuuren Sep 19, 2023
0166992
Sort terms both for pred and formulas
stefvanbuuren Sep 19, 2023
35b6084
Create a mechanism to inform check.method() of the set of variables t…
stefvanbuuren Sep 21, 2023
65f544f
Introduce NA types in initialize.imp()
stefvanbuuren Sep 21, 2023
d9c6fa6
Update nest printing in print.mids()
stefvanbuuren Sep 21, 2023
b9e398e
Add support for blots to multivariate imputation models
stefvanbuuren Sep 21, 2023
0345ec3
Rename `nest` to `parcel`
stefvanbuuren Sep 21, 2023
07a79e9
Use lower case default block names
stefvanbuuren Sep 21, 2023
53916f4
Rename `blots` to `dots`
stefvanbuuren Sep 21, 2023
3c09055
Rename files from blots/nest to dots/parcel
stefvanbuuren Sep 21, 2023
3cebc30
Add deprecation support for make.blots()
stefvanbuuren Sep 21, 2023
7b7a17c
Implement autoremove in check.predictorMatrix() and check.formulas()
stefvanbuuren Sep 21, 2023
8c4bb38
Write one loggedEvent for each removed variable
stefvanbuuren Sep 22, 2023
24688b1
Abort mice when user speficies mixes of `formulas` and `predictorMatr…
stefvanbuuren Sep 22, 2023
e1c475f
Update NEWS.md
stefvanbuuren Sep 22, 2023
da6396b
Reorder mice() arguments into a clusters of operations
stefvanbuuren Oct 2, 2023
db5caf6
Remove superfluous construct.parcel(), make remove.rhs.variables() in…
stefvanbuuren Oct 2, 2023
f5d5c99
Add MICE 4 Syntax Documentation CONCEPT as a vignette
stefvanbuuren Oct 2, 2023
6edcd71
Rebuild site to include article mice4syntax
stefvanbuuren Oct 2, 2023
4 changes: 3 additions & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: mice
Type: Package
Version: 3.16.5
Version: 3.16.5.9001
Title: Multivariate Imputation by Chained Equations
Date: 2023-09-04
Authors@R: c(person("Stef", "van Buuren", role = c("aut","cre"),
@@ -100,3 +100,5 @@ BugReports: https://github.com/amices/mice/issues
LinkingTo: cpp11, Rcpp
License: GPL (>= 2)
RoxygenNote: 7.2.3
Roxygen: list(markdown = TRUE)
VignetteBuilder: knitr
5 changes: 5 additions & 0 deletions NAMESPACE
@@ -69,6 +69,7 @@ export(convergence)
export(densityplot)
export(estimice)
export(extractBS)
export(f2p)
export(fico)
export(filter)
export(fix.coef)
@@ -90,8 +91,10 @@ export(is.mitml.result)
export(lm.mids)
export(make.blocks)
export(make.blots)
export(make.dots)
export(make.formulas)
export(make.method)
export(make.parcel)
export(make.post)
export(make.predictorMatrix)
export(make.visitSequence)
@@ -148,6 +151,7 @@ export(nelsonaalen)
export(nic)
export(nimp)
export(norm.draw)
export(p2f)
export(parlmice)
export(pool)
export(pool.compare)
@@ -256,6 +260,7 @@ importFrom(stats,spline)
importFrom(stats,summary.glm)
importFrom(stats,terms)
importFrom(stats,update)
importFrom(stats,update.formula)
importFrom(stats,var)
importFrom(stats,vcov)
importFrom(tidyr,complete)
44 changes: 44 additions & 0 deletions NEWS.md
@@ -1,3 +1,47 @@
# mice 3.16.5.9001

## New behaviours and features

1. TWO SEPARATE INTERFACES FOR MODEL SPECIFICATION: This version promotes two interfaces for specifying imputation models: predictor (`predictorMatrix` + `parcel` + `method`) and formula (`formulas` + `method`). This version no longer accepts mixes of `predictorMatrix` and `formulas` arguments in the call to `mice()`.

2. NA-PROPAGATION PREVENTION: This version detects when a predictor contains missing values that are not imputed. To prevent NA propagation, `mice()` can follow two strategies: "Autoremove" (remove the incomplete predictor(s) from the RHS, set `method` to `""`, adapt `predictorMatrix`, `formulas` and `blocks`, and write to `loggedEvents`), or "Autoimpute" (impute the incomplete predictor and adapt `method`, `predictorMatrix`, `formulas`, and so on). "Autoremove" is implemented and is the current default. Use `mice(..., autoremove = FALSE)` to revert to the old behavior (NA propagation).

3. SUBMODELS: The `predictorMatrix` input can be a square submatrix of the full `predictorMatrix` when its dimensions are named. `mice()` will augment the small `predictorMatrix` to the full matrix and always return a p * p named matrix corresponding to the p columns in the data. Unmentioned variables will not be imputed, and the `predictorMatrix`, `formulas` and `method` are adapted accordingly.

4. DROP NON-SQUARE PREDICTOR MATRIX: Version 3.0 introduced non-square predictor matrices, but their interpretation turned out to be complex and ambiguous. For clarity, this update works with a predictor matrix that is square, with both dimensions named identically after the variables in the data. Variable groups are now specified through the `parcel` argument.

5. NEW PARCEL ARGUMENT: There is a new `parcel` argument that is easier to use. Printing the `mids` object shows `parcel` when it differs from the default.
`parcel` can take over the role of `blocks` in specification. `blocks` is soft-deprecated, but still widely used within the program code.

6. NEW DOTS ARGUMENT: The `blots` argument is renamed to `dots`.

7. EXIT VALIDATION: Adds a new `validate.mids()` function that checks the `mids` object before exit.
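
The "Autoremove" strategy from item 2 can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper `autoremove_sketch()` and its return structure are hypothetical, and the sketch only covers the `predictorMatrix` and logging part for a square, named matrix with one `method` entry per variable.

```r
# Hypothetical helper (not part of mice): zero out predictor columns that
# would propagate NAs, and record one message per removed predictor.
autoremove_sketch <- function(data, method, pred) {
  vars <- colnames(data)
  stopifnot(identical(rownames(pred), vars), identical(colnames(pred), vars))
  stopifnot(identical(names(method), vars))

  incomplete <- vapply(data, anyNA, logical(1))
  # Incomplete variables without an imputation method keep their NAs, so
  # any model that uses them as a predictor would inherit those NAs.
  offenders <- vars[incomplete & method == ""]

  events <- character(0)
  for (v in offenders) {
    # Imputed variables whose model currently uses v as a predictor
    users <- vars[pred[, v] != 0 & method != ""]
    if (length(users) > 0) {
      pred[, v] <- 0
      events <- c(events, sprintf("removed predictor %s from: %s",
                                  v, paste(users, collapse = ", ")))
    }
  }
  list(predictorMatrix = pred, loggedEvents = events)
}

# Toy data: chl is incomplete but has no imputation method
data <- data.frame(age = c(1, 2, 3, 1),
                   bmi = c(22.1, NA, 27.3, 24.0),
                   chl = c(187, NA, 199, NA))
method <- c(age = "", bmi = "pmm", chl = "")
pred <- matrix(1, 3, 3, dimnames = list(names(data), names(data)))
diag(pred) <- 0

out <- autoremove_sketch(data, method, pred)
```

In this toy case `chl` is dropped from the model for `bmi`, while the complete predictor `age` is kept.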


## Changes

- Adds functions to convert between `predictorMatrix` and `formulas` specification
- Adds support to pass down user-specified options to multivariate imputation methods
- Now uses lowercase default block names
- The `predictorMatrix` input may be unnamed if its size is p * p. For any other size, an unnamed matrix generates an error.
- Performs stricter checks on zero rows in predictorMatrix under empty imputation method
- Adds new function `remove.rhs.variables()`
- Removes code designed to work specifically with a non-square `predictorMatrix`
- Generates an error if `predictorMatrix` has fewer rows than length of `blocks`
- Better initialization using typed `NA`s in `initialize.imp()`
- Rewrites the documentation of all `mice()` arguments to be precise and consistent
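
The idea behind converting between the `predictorMatrix` and `formulas` specifications can be sketched in base R. The helpers `pred_to_formulas()` and `formulas_to_pred()` below are hypothetical names chosen for illustration. They are not the exported `p2f()` and `f2p()` functions, whose signatures are not shown in this diff, and they assume a square, identically named matrix with one formula per variable.

```r
# Hypothetical helper: one formula per row of a square, named predictor matrix.
pred_to_formulas <- function(pred) {
  vars <- rownames(pred)
  stopifnot(!is.null(vars), identical(vars, colnames(pred)))
  lapply(stats::setNames(vars, vars), function(y) {
    rhs <- vars[pred[y, ] != 0]
    if (length(rhs) == 0) rhs <- "1"  # no predictors: intercept-only model
    stats::reformulate(rhs, response = y)
  })
}

# Hypothetical helper: rebuild the 0/1 predictor matrix from the formulas.
formulas_to_pred <- function(formulas, vars) {
  pred <- matrix(0, length(vars), length(vars), dimnames = list(vars, vars))
  for (f in formulas) {
    y <- all.vars(f[[2]])                   # left-hand side variable
    x <- intersect(all.vars(f[[3]]), vars)  # right-hand side variables
    if (length(x) > 0) pred[y, x] <- 1
  }
  pred
}

vars <- c("age", "bmi", "chl")
pred <- matrix(1, 3, 3, dimnames = list(vars, vars))
diag(pred) <- 0
pred["age", ] <- 0  # age has no predictors

formulas <- pred_to_formulas(pred)
round_trip <- formulas_to_pred(formulas, vars)
```

For a 0/1 matrix with a zero diagonal, the round trip returns the original matrix; interaction terms or transformations in a formula would not survive this simple conversion.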

## New exit checks

- `rownames(predictorMatrix)` must match `colnames(data)`
- length of `formulas` and `blocks` must be equal
- length of `formulas` and `method` must be equal
- length of `dots` and `method` must be equal
- length of `method` vector cannot exceed number of variables
- length of `imp` and number of variables must be equal
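
The exit checks listed above can be sketched as plain base-R conditions. The function `exit_checks_sketch()` is a hypothetical illustration, not the `validate.mids()` function added in this update, and it assumes one block per variable so that all lengths line up.

```r
# Hypothetical helper: returns the messages of all failed checks
# (character(0) when the specification is consistent).
exit_checks_sketch <- function(data, predictorMatrix, formulas, blocks,
                               method, dots, imp) {
  p <- ncol(data)
  checks <- c(
    "rownames(predictorMatrix) must match colnames(data)" =
      identical(rownames(predictorMatrix), colnames(data)),
    "length of formulas and blocks must be equal" =
      length(formulas) == length(blocks),
    "length of formulas and method must be equal" =
      length(formulas) == length(method),
    "length of dots and method must be equal" =
      length(dots) == length(method),
    "length of method vector cannot exceed number of variables" =
      length(method) <= p,
    "length of imp and number of variables must be equal" =
      length(imp) == p
  )
  names(checks)[!checks]
}

data <- data.frame(age = c(1, 2, 3),
                   bmi = c(22.1, NA, 27.3),
                   chl = c(187, NA, 199))
vars <- names(data)
pred <- matrix(1, 3, 3, dimnames = list(vars, vars))
diag(pred) <- 0
blocks <- as.list(stats::setNames(vars, vars))
formulas <- list(age = age ~ bmi + chl,
                 bmi = bmi ~ age + chl,
                 chl = chl ~ age + bmi)
method <- c(age = "", bmi = "pmm", chl = "pmm")
dots <- stats::setNames(vector("list", 3), vars)
imp <- stats::setNames(vector("list", 3), vars)

ok <- exit_checks_sketch(data, pred, formulas, blocks, method, dots, imp)
# Dropping one element of dots violates exactly one check
bad <- exit_checks_sketch(data, pred, formulas, blocks, method, dots[-1], imp)
```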

## Other fixes

* Prepares for the deprecation of the `blocks` argument at various places
* Removes the need for `blocks` in `initialize_chain()`
* In `rbind()`, when formulas are concatenated and duplicate names are found, also rename the duplicated variables in formulas by their new name
20 changes: 10 additions & 10 deletions R/D1.R
@@ -2,25 +2,25 @@
#'
#' The D1-statistic is the multivariate Wald test.
#'
#' @param fit1 An object of class \code{mira}, produced by \code{with()}.
#' @param fit0 An object of class \code{mira}, produced by \code{with()}. The
#' model in \code{fit0} is nested within \code{fit1}. The default null
#' model \code{fit0 = NULL} compares \code{fit1} to the intercept-only model.
#' @param fit1 An object of class `mira`, produced by `with()`.
#' @param fit0 An object of class `mira`, produced by `with()`. The
#' model in `fit0` is nested within `fit1`. The default null
#' model `fit0 = NULL` compares `fit1` to the intercept-only model.
#' @param dfcom A single number denoting the
#' complete-data degrees of freedom of model \code{fit1}. If not specified,
#' it is set equal to \code{df.residual} of model \code{fit1}. If that cannot
#' complete-data degrees of freedom of model `fit1`. If not specified,
#' it is set equal to `df.residual` of model `fit1`. If that cannot
#' be done, the procedure assumes (perhaps incorrectly) a large sample.
#' @param df.com Deprecated
#' @note Warning: `D1()` assumes that the order of the variables is the
#' same in different models. See
#' \url{https://github.com/amices/mice/issues/420} for details.
#' <https://github.com/amices/mice/issues/420> for details.
#' @references
#' Li, K. H., T. E. Raghunathan, and D. B. Rubin. 1991.
#' Large-Sample Significance Levels from Multiply Imputed Data Using
#' Moment-Based Statistics and an F Reference Distribution.
#' \emph{Journal of the American Statistical Association}, 86(416): 1065–73.
#' *Journal of the American Statistical Association*, 86(416): 1065–73.
#'
#' \url{https://stefvanbuuren.name/fimd/sec-multiparameter.html#sec:wald}
#' <https://stefvanbuuren.name/fimd/sec-multiparameter.html#sec:wald>
#' @examples
#' # Compare two linear models:
#' imp <- mice(nhanes2, seed = 51009, print = FALSE)
@@ -34,7 +34,7 @@
#' fit0 <- with(imp, glm(gen > levels(gen)[1] ~ hgt + hc, family = binomial))
#' D1(fit1, fit0)
#' }
#' @seealso \code{\link[mitml]{testModels}}
#' @seealso [mitml::testModels()]
#' @export
D1 <- function(fit1, fit0 = NULL, dfcom = NULL, df.com = NULL) {
install.on.demand("mitml")
8 changes: 4 additions & 4 deletions R/D2.R
@@ -7,13 +7,13 @@
#' @inheritParams mitml::testModels
#' @note Warning: `D2()` assumes that the order of the variables is the
#' same in different models. See
#' \url{https://github.com/amices/mice/issues/420} for details.
#' <https://github.com/amices/mice/issues/420> for details.
#' @references
#' Li, K. H., X. L. Meng, T. E. Raghunathan, and D. B. Rubin. 1991.
#' Significance Levels from Repeated p-Values with Multiply-Imputed Data.
#' \emph{Statistica Sinica} 1 (1): 65–92.
#' *Statistica Sinica* 1 (1): 65–92.
#'
#' \url{https://stefvanbuuren.name/fimd/sec-multiparameter.html#sec:chi}
#' <https://stefvanbuuren.name/fimd/sec-multiparameter.html#sec:chi>
#' @examples
#' # Compare two linear models:
#' imp <- mice(nhanes2, seed = 51009, print = FALSE)
@@ -27,7 +27,7 @@
#' fit0 <- with(imp, glm(gen > levels(gen)[1] ~ hgt + hc, family = binomial))
#' D2(fit1, fit0)
#' }
#' @seealso \code{\link[mitml]{testModels}}
#' @seealso [mitml::testModels()]
#' @export
D2 <- function(fit1, fit0 = NULL, use = "wald") {
install.on.demand("mitml")
26 changes: 13 additions & 13 deletions R/D3.R
@@ -3,34 +3,34 @@
#' The D3-statistic is a likelihood-ratio test statistic.
#'
#' @details
#' The \code{D3()} function implements the LR-method by
#' The `D3()` function implements the LR-method by
#' Meng and Rubin (1992). The implementation of the method relies
#' on the \code{broom} package, the standard \code{update} mechanism
#' for statistical models in \code{R} and the \code{offset} function.
#' on the `broom` package, the standard `update` mechanism
#' for statistical models in `R` and the `offset` function.
#'
#' The function calculates \code{m} repetitions of the full
#' The function calculates `m` repetitions of the full
#' (or null) models, calculates the mean of the estimates of the
#' (fixed) parameter coefficients \eqn{\beta}. For each
#' imputed dataset, it calculates the likelihood for the model with
#' the parameters constrained to \eqn{\beta}.
#'
#' The \code{mitml::testModels()} function offers similar functionality
#' for a subset of statistical models. Results of \code{mice::D3()} and
#' \code{mitml::testModels()} differ in multilevel models because the
#' \code{testModels()} also constrains the variance components parameters.
#' The `mitml::testModels()` function offers similar functionality
#' for a subset of statistical models. Results of `mice::D3()` and
#' `mitml::testModels()` differ in multilevel models because the
#' `testModels()` also constrains the variance components parameters.
#' For more details on
#'
#' @seealso \code{\link{fix.coef}}
#' @seealso [fix.coef()]
#' @inheritParams D1
#' @return An object of class \code{mice.anova}
#' @return An object of class `mice.anova`
#' @references
#' Meng, X. L., and D. B. Rubin. 1992.
#' Performing Likelihood Ratio Tests with Multiply-Imputed Data Sets.
#' \emph{Biometrika}, 79 (1): 103–11.
#' *Biometrika*, 79 (1): 103–11.
#'
#' \url{https://stefvanbuuren.name/fimd/sec-multiparameter.html#sec:likelihoodratio}
#' <https://stefvanbuuren.name/fimd/sec-multiparameter.html#sec:likelihoodratio>
#'
#' \url{http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#setting-residual-variances-to-a-fixed-value-zero-or-other}
#' <http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#setting-residual-variances-to-a-fixed-value-zero-or-other>
#' @examples
#' # Compare two linear models:
#' imp <- mice(nhanes2, seed = 51009, print = FALSE)