freqtables

The goal of freqtables is to quickly make tables of descriptive statistics for categorical variables (i.e., counts, percentages, confidence intervals). This package is designed to work in a tidyverse pipeline, and it aims to make moving results from R into Microsoft Word® as painless as possible.

Installation

You can install the released version of freqtables from CRAN with:

install.packages("freqtables")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("brad-cannell/freqtables")

Example

Because freqtables is intended to be used in a dplyr pipeline, loading dplyr into your current R session is recommended.

library(dplyr)
library(freqtables)

The examples below will use R’s built-in mtcars data set.

data("mtcars")

freq_table()

The freq_table() function produces one-way and two-way frequency tables for categorical variables. In addition to frequencies, the freq_table() function displays percentages, and the standard errors and confidence intervals of the percentages. For two-way tables only, freq_table() also displays row (subgroup) percentages, standard errors, and confidence intervals.

For one-way tables, the default 95 percent confidence intervals are logit transformed confidence intervals equivalent to those used by Stata. Alternatively, freq_table() returns Wald ("linear") confidence intervals when ci_type = "wald".

For two-way tables, freq_table() returns logit transformed confidence intervals equivalent to those used by Stata.

Here is an example of using freq_table() to create a one-way frequency table with all function arguments left at their default values:

mtcars %>% 
  freq_table(am)
#>   var cat  n n_total percent       se   t_crit      lcl      ucl
#> 1  am   0 19      32  59.375 8.820997 2.039513 40.94225 75.49765
#> 2  am   1 13      32  40.625 8.820997 2.039513 24.50235 59.05775
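
As a rough cross-check, the se, t_crit, lcl, and ucl columns above can be reproduced in base R. The formulas below are inferred from the printed output (a standard error with n - 1 in the denominator, a t critical value with n - 1 degrees of freedom, and an interval built on the logit scale); they are a sketch under those assumptions, not the package's source code. The object name one_way and the lcl_wald and ucl_wald columns are made up for illustration.

```r
# Sketch: recompute the one-way interval columns for mtcars$am in base R.
# Formulas are inferred from the printed output, not taken from freqtables.
counts  <- table(mtcars$am)               # 19 cars with am = 0, 13 with am = 1
n_total <- sum(counts)
prop    <- as.numeric(counts) / n_total   # proportions on the 0-1 scale

# Standard error of a proportion, assuming an n - 1 denominator
se     <- sqrt(prop * (1 - prop) / (n_total - 1))
t_crit <- qt(0.975, df = n_total - 1)     # two-sided 95% critical value

# Logit-transformed interval: build it on the log-odds scale,
# then map back to a proportion with plogis()
logit    <- log(prop / (1 - prop))
se_logit <- se / (prop * (1 - prop))
lcl <- plogis(logit - t_crit * se_logit)
ucl <- plogis(logit + t_crit * se_logit)

# Wald ("linear") interval for comparison: symmetric around the estimate
lcl_wald <- prop - t_crit * se
ucl_wald <- prop + t_crit * se

# Report everything as percentages, like the table above
one_way <- data.frame(
  cat      = names(counts),
  percent  = prop * 100,
  se       = se * 100,
  t_crit   = t_crit,
  lcl      = lcl * 100,
  ucl      = ucl * 100,
  lcl_wald = lcl_wald * 100,
  ucl_wald = ucl_wald * 100
)
one_way
```

The logit interval always stays inside 0 to 100 percent and is asymmetric around the estimate, whereas the Wald interval is symmetric and can cross those bounds when a percentage is near 0 or 100.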

Here is an example of using freq_table() to create a two-way frequency table with all function arguments left at their default values:

mtcars %>% 
  freq_table(am, cyl)
#> # A tibble: 6 × 17
#>   row_var row_cat col_var col_cat     n n_row n_total percent_total se_total
#>   <chr>   <chr>   <chr>   <chr>   <int> <int>   <int>         <dbl>    <dbl>
#> 1 am      0       cyl     4           3    19      32          9.38     5.24
#> 2 am      0       cyl     6           4    19      32         12.5      5.94
#> 3 am      0       cyl     8          12    19      32         37.5      8.70
#> 4 am      1       cyl     4           8    13      32         25        7.78
#> 5 am      1       cyl     6           3    13      32          9.38     5.24
#> 6 am      1       cyl     8           2    13      32          6.25     4.35
#> # … with 8 more variables: t_crit_total <dbl>, lcl_total <dbl>,
#> #   ucl_total <dbl>, percent_row <dbl>, se_row <dbl>, t_crit_row <dbl>,
#> #   lcl_row <dbl>, ucl_row <dbl>

You can learn more about the freq_table() function and ways to adjust default behaviors in vignette("descriptive_analysis").

freq_test()

The freq_test() function is an S3 generic. It currently has methods for conducting hypothesis tests on one-way and two-way frequency tables. Further, it is made to work in a dplyr pipeline with the freq_table() function.

For the freq_table_two_way class, the methods used are Pearson’s chi-square test of independence and Fisher’s exact test. When cell counts are <= 5, Fisher’s exact test is considered more reliable.

Here is an example of using freq_test() to test the equality of proportions on a one-way frequency table with all function arguments left at their default values:

mtcars %>%
  freq_table(am) %>%
  freq_test() %>%
  select(var:percent, p_chi2_pearson)
#>   var cat  n n_total percent p_chi2_pearson
#> 1  am   0 19      32  59.375      0.2888444
#> 2  am   1 13      32  40.625      0.2888444

Here is an example of using freq_test() to conduct a chi-square test of independence on a two-way frequency table with all function arguments left at their default values:

mtcars %>%
  freq_table(am, vs) %>%
  freq_test() %>%
  select(row_var:n, percent_row, p_chi2_pearson)
#> # A tibble: 4 × 7
#>   row_var row_cat col_var col_cat     n percent_row p_chi2_pearson
#>   <chr>   <chr>   <chr>   <chr>   <int>       <dbl>          <dbl>
#> 1 am      0       vs      0          12        63.2          0.341
#> 2 am      0       vs      1           7        36.8          0.341
#> 3 am      1       vs      0           6        46.2          0.341
#> 4 am      1       vs      1           7        53.8          0.341
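
For readers who want to sanity-check these p-values, the same tests can be run with base R's chisq.test() and fisher.test(). This is a sketch for comparison only, not a description of how freq_test() is implemented. It assumes the two-way p-value shown above comes from a Pearson test without Yates' continuity correction, which is why correct = FALSE is set. The object names are made up for illustration.

```r
# Sketch: cross-check the freq_test() p-values with base R (stats package).

# One-way: test whether the two am categories are equally likely
tab_one <- table(mtcars$am)
p_one   <- chisq.test(tab_one)$p.value

# Two-way: Pearson chi-square test of independence between am and vs.
# chisq.test() applies Yates' continuity correction to 2 x 2 tables by
# default; it is turned off here (an assumption) to match the output above.
tab_two <- table(am = mtcars$am, vs = mtcars$vs)
p_two   <- chisq.test(tab_two, correct = FALSE)$p.value

# Fisher's exact test on the same 2 x 2 table
p_fisher <- fisher.test(tab_two)$p.value

c(one_way = p_one, two_way = p_two, fisher = p_fisher)
```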

You can learn more about the freq_test() function and ways to adjust default behaviors in vignette("using_freq_test").

freq_format()

The freq_format() function is intended to make it quick and easy to format the output of freq_table() for tables that may be used for publication. For example, a percentage and its 95% confidence interval could be formatted as "24.00 (21.00 - 27.00)".

mtcars %>%
  freq_table(am) %>%
  freq_format(
    recipe = "percent (lcl - ucl)",
    name = "percent_95",
    digits = 2
  ) %>%
  select(var, cat, percent_95)
#>   var cat            percent_95
#> 1  am   0 59.38 (40.94 - 75.50)
#> 2  am   1 40.62 (24.50 - 59.06)
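
To make the recipe idea concrete, here is a rough base-R analogue that pastes an estimate and its limits into one string with sprintf(). It is only an illustration of the formatting step, not how freq_format() is implemented. The data frame est is a hand-typed copy of the one-way values shown earlier in this README.

```r
# Sketch: combine an estimate and its interval into a single display column.
# `est` is a hand-typed copy of the one-way output shown earlier.
est <- data.frame(
  cat     = c("0", "1"),
  percent = c(59.375, 40.625),
  lcl     = c(40.94225, 24.50235),
  ucl     = c(75.49765, 59.05775)
)

# "%.2f" prints each number with two decimal places, mirroring digits = 2
est$percent_95 <- sprintf("%.2f (%.2f - %.2f)", est$percent, est$lcl, est$ucl)

est[, c("cat", "percent_95")]
```

The advantage of the recipe string in freq_format() over a hand-written sprintf() call is that the column names appear directly in the template, so the layout of the formatted cell is visible at a glance.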

You can learn more about the freq_format() function by reading the function documentation.