README.Rmd

---
output:
  github_document:
    fig_width: 5
    fig_height: 5
  html_document:
    fig_width: 5
    fig_height: 5
    self_contained: no
    keep_md: yes
---

```{r setup, include=FALSE}
options(width = 90)
knitr::opts_chunk$set(fig.retina = 2,
                      tidy.opts = list(width.cutoff = 90),  # For code
                      width = 90)  # For output
knitr::read_chunk("broomify-amelia.R")
```

# Show multiply imputed results in a side-by-side regression table with broom and huxtable

[Andrew Heiss](https://www.andrewheiss.com) • 2018-03-08

---

*tl;dr*: Use the functions in [`broomify-amelia.R`](broomify-amelia.R) to use `broom::tidy()`, `broom::glance()`, and `huxtable::huxreg()` on lists of multiply imputed models.

---

The whole reason I went into the rabbit hole of the mechanics of merging imputed regression results [in the previous post](https://www.andrewheiss.com/blog/2018/03/07/amelia-tidy-melding/) was so I could easily report these results in papers and writeups. In political science and economics (and probably other social science disciplines), it's fairly standard to report many regression models in a side-by-side table, with a column for each model and rows for each coefficient. R packages like [`stargazer`](https://cran.r-project.org/package=stargazer) and [`huxtable`](https://cran.r-project.org/package=huxtable) make this fairly straightforward.


```{r load-libraries-impute, warning=FALSE, message=FALSE}
library(tidyverse)
library(Amelia)
library(stargazer)
library(huxtable)
library(broom)

# Use the africa dataset from Ameila
data(africa)

# Build some example models
model_original1 <- lm(gdp_pc ~ trade + civlib, data = africa)
model_original2 <- lm(gdp_pc ~ trade + civlib + infl, data = africa)
```

Stargazer takes a list of models:

```{r stargazer, warning=FALSE, results="markup"}
stargazer(model_original1, model_original2, type = "text")
```

As does huxtable's `huxreg()`:

```{r huxtable, results="markup"}
huxreg(model_original1, model_original2) %>%
  print_screen()
```

[Stargazer has support for a ton of different model types](https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf) (see `?stargazer` for details), but they're all hardcoded into stargazer's internal code and adding more is tricky. [Huxtable](https://hughjonesd.github.io/huxtable/), on the other hand, doesn't rely on hardcoded model processing, but instead will display any model that works with `broom::tidy()` and `broom::glance()`. The [`broom` package](https://cran.r-project.org/package=broom) supports way more models than stargazer (including models created with [`rstan`](https://cran.r-project.org/package=rstan) and [`rstanarm`](https://cran.r-project.org/package=rstanarm)!), and because of this, huxtable is far more extensible—if you can create a `tidy()` and a `glance()` function for a type of model, huxtable can use it. 

Also, stargazer was written before [R Markdown](https://rmarkdown.rstudio.com/) was really a thing, so it has excellent support for HTML and LaTeX output, but that's it. Including stargazer tables in an R Markdown document is a hassle, especially if you want to be able to convert it to Word ([I've written a Python script for doing this](https://github.com/andrewheiss/edb-social-pressure/blob/master/bin/stargazer2docx.py)—that's how much extra work it takes). Huxtable, though, was written after the R Markdown and tidyverse revolutions, so it supports piping *and* can output to HTML, LaTeX, *and* Markdown (with `huxtable::print_md()`). 

This history is important because it means that models based on multiple imputation **will not work with stargazer.** Melding all the coefficients across imputations creates nice data frames of model results, but it doesn't create a model that stargazer can work with. This is unfortunate, especially given [how much I use stargazer](https://github.com/search?l=&q=stargazer+user%3Aandrewheiss&ref=advsearch&type=Code&utf8=%E2%9C%93). However, if we could make a `tidy()` and a `glance()` function that could work with a list of multiply imputed models, huxtable would solve all our problems.

So here's how to solve all your problems :)

First, we'll impute the missing data in the Africa data set, nest the imputed data in a larger data frame, and run a model on each imputed dataset:

```{r imputation-models}
set.seed(1234)
imp_amelia <- amelia(x = africa, m = 5, cs = "country", ts = "year", logs = "gdp_pc", p2s = 0)

models_imputed_df <- bind_rows(unclass(imp_amelia$imputations), .id = "m") %>%
  group_by(m) %>%
  nest() %>% 
  mutate(model = data %>% map(~ lm(gdp_pc ~ trade + civlib, data = .)))

models_imputed_df
```

Before we do anything with the models in `models_imputed_df$model`, first we can define a few functions to extend broom. R's S3 object system means that a function named `whatever.blah()` will automatically work when called on objects with the class `blah`. This is how broom generally works—there are functions named `tidy.anova()`, `tidy.glm()`, `tidy.lm()`, etc. that will do the correct tidying when run on `anova`, `glm`, and `lm` objects. Huxtable also takes advantage of this S3 object system—it will call the appropriate tidy and glance functions based on the class of the models passed to it.

To make a list of models work with broom, we need to invent a new class of model. In this example I've named it `melded`, but it could be anything. Here are three functions designed to work on `melded` objects (the code for these is largely based on [the previous post about melding coefficients](https://www.andrewheiss.com/blog/2018/03/07/amelia-tidy-melding/)). These functions are also found in [`broomify-amelia.R`](broomify-amelia.R), which you can add to your project (maybe someday this could be an actual package, but I don't see a reason for it yet).

```{r broomify, message=FALSE}
```

With these three functions, we can now use `glance()` and `tidy()` on a list of models with the class `melded`, like so:

```{r}
# Extract the models into a vector and make it a "melded" class
models_imputed <- models_imputed_df$model
# Without this, R won't use our custom tidy.melded() or glance.melded() functions
class(models_imputed) <- "melded"

glance(models_imputed)
tidy(models_imputed)
```

Even better, though, is that we can use these imputed models in a huxtable regression table. And, because I included a column named `m` in `glance.melded()`, we can also include it in the regression output!

```{r huxtable-table, warning=FALSE, message=FALSE, results="markup"}
huxreg(model_original1, model_original2, models_imputed,
       statistics = c(N = "nobs", R2 = "r.squared", `Adj R2` = "adj.r.squared", 
                      "logLik", "AIC", "m")) %>% 
  print_screen()
```