# Inference for two independent means
<!-- Please don't mess with the next few lines! -->
<style>h5{font-size:2em;color:#0000FF}h6{font-size:1.5em;color:#0000FF}div.answer{margin-left:5%;border:1px solid #0000FF;border-left-width:10px;padding:25px} div.summary{background-color:rgba(30,144,255,0.1);border:3px double #0000FF;padding:25px}</style>`r options(scipen=999)`<p style="color:#ffffff">`r intToUtf8(c(50,46,48))`</p>
<!-- Please don't mess with the previous few lines! -->
::: {.summary}
### Functions introduced in this chapter: {-}
No new R functions are introduced here.
:::
## Introduction
If we have a numerical variable and a categorical variable with two categories, we can think of the numerical variable as response and the categorical variable as predictor. The idea is that the two categories sort your numerical data into two groups which can be compared. Assuming the two groups are independent of each other, we can use them as samples of two larger populations. This leads to inference to decide if the difference between the means of the two groups is statistically significant and then estimate the difference between the means of the two populations represented. The relevant hypothesis test is called a two-sample t test (or Welch's t test, to be specific).
### Install new packages
There are no new packages used in this chapter.
### Download the R notebook file
Check the upper-right corner in RStudio to make sure you're in your `intro_stats` project. Then click on the following link to download this chapter as an R notebook file (`.Rmd`).
<a href = "https://vectorposse.github.io/intro_stats/chapter_downloads/21-inference_for_two_independent_means.Rmd" download>https://vectorposse.github.io/intro_stats/chapter_downloads/21-inference_for_two_independent_means.Rmd</a>
Once the file is downloaded, move it to your project folder in RStudio and open it there.
### Restart R and run all chunks
In RStudio, select "Restart R and Run All Chunks" from the "Run" menu.
## Load packages
We load the standard `tidyverse`, `janitor`, and `infer` packages. We also use the `MASS` package for the `birthwt` data.
```{r}
library(tidyverse)
library(janitor)
library(infer)
library(MASS)
```
## Research question
Recall the `birthwt` data that was collected at Baystate Medical Center, Springfield, Massachusetts, during 1986. In a previous chapter, we identified low birth weight babies using a categorical variable that served as an indicator for low birth weight.
##### Exercise 1 {-}
How was it determined if a baby was considered "low birth weight" for purposes of constructing the variable `low`? Use the help file to find out.
::: {.answer}
Please write up your answer here.
:::
*****
We have the actual birth weight of the babies in this data. So, rather than using a coarse classification into a binary "yes or no" variable, why not use the full precision of the birth weight measured in grams? This is a very precisely measured numerical variable.
We'd like to compare mean birth weights between two groups: women who smoked during pregnancy, and women who didn't.
## Data preparation
The actual mean weights in each sample (the smoking women and the nonsmoking women) can be found using a `group_by` and `summarise` pipeline:
```{r}
birthwt %>%
group_by(smoke) %>%
summarise(mean(bwt))
```
Note that 0 means "nonsmoker" and 1 means "smoker". We need to address the fact that the `smoke` variable is recorded as a numerical variable instead of a categorical variable. Here is `birthwt2`, which we will use from here on out:
```{r}
birthwt2 <- birthwt %>%
mutate(smoke_fct = factor(smoke, levels = c(0, 1), labels = c("Nonsmoker", "Smoker")))
birthwt2
```
```{r}
glimpse(birthwt2)
```
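As a small illustration of how `factor` recodes numbers into labels, here is a sketch on a made-up vector. The names `smoke_codes_demo` and `smoke_labels_demo` and the four values are invented for this example; they are not part of the `birthwt` data.

```{r}
# Hypothetical toy vector of 0/1 codes (not taken from birthwt)
smoke_codes_demo <- c(0, 1, 1, 0)

# levels lists the values to look for; labels gives their names in the same order
smoke_labels_demo <- factor(smoke_codes_demo,
                            levels = c(0, 1),
                            labels = c("Nonsmoker", "Smoker"))
smoke_labels_demo
```

Because the labels are matched to the levels by position, swapping the order of `labels` without swapping `levels` would silently mislabel the groups.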
The difference between the means is now calculated using `infer` tools. We will store the result as `obs_diff` for "observed difference".
```{r}
obs_diff <- birthwt2 %>%
specify(response = bwt, explanatory = smoke_fct) %>%
calculate(stat = "diff in means", order = c("Nonsmoker", "Smoker"))
obs_diff
```
##### Exercise 2 {-}
What would happen if we used `order = c("Smoker", "Nonsmoker")` instead? Why might we have a slight preference for `order = c("Nonsmoker", "Smoker")`?
::: {.answer}
Please write up your answer here.
:::
*****
Note that the order in which we subtract will not actually make a difference to the inferential process. However, we do have to be consistent and use the same order throughout. When interpreting the test statistic, effect size, and confidence interval, we will need to pay attention to the order of subtraction to make sure we are interpreting our results correctly.
## Every day I'm shuffling
Whenever there are two groups, the obvious null hypothesis is that there is no difference between them.
Consider the `smoke` variable. If there were truly no difference in mean birth weights between women who smoked and women who didn't, then it shouldn't matter if we know the smoking status or not. It becomes irrelevant under the assumption of the null.
We can simulate this assumption by shuffling the list of smoking status. More concretely, we can randomly assign a smoking status label to each mother and then calculate the average birth weight in each group. Since the smoking labels are random, there's no reason to expect a difference between the two average weights other than random fluctuations due to sampling variability.
For example, here is the actual smoking status of the women:
```{r}
birthwt2$smoke_fct
```
But we're going to use values that have been randomly shuffled, like this one, for example:
```{r}
set.seed(1729)
sample(birthwt2$smoke_fct)
```
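To make a single shuffle concrete, here is a minimal base R sketch of what happens to one shuffled set of labels. It works directly with the original 0/1 `smoke` codes, and the names ending in `_demo` are invented for this illustration. This is only a way to see the idea; it is not necessarily how `infer` does the work internally.

```{r}
# Work on a copy of the raw data (0 = nonsmoker, 1 = smoker)
bwt_shuffle_demo <- MASS::birthwt

# Shuffle the smoking labels once
set.seed(1729)
shuffled_smoke_demo <- sample(bwt_shuffle_demo$smoke)

# Difference in mean birth weight: label 0 group minus label 1 group
shuffled_diff_demo <- mean(bwt_shuffle_demo$bwt[shuffled_smoke_demo == 0]) -
    mean(bwt_shuffle_demo$bwt[shuffled_smoke_demo == 1])
shuffled_diff_demo
```

Shuffling keeps the group sizes the same (there are still 74 labels equal to 1), but the labels no longer have anything to do with actual smoking status.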
The `infer` package will perform this random shuffling over and over again. Given the now arbitrary labels of "Nonsmoker" and "Smoker" (which are meaningless because each woman was assigned to one of these labels randomly with no regard to her actual smoking status), `infer` will calculate the mean birth weights among the first group of women (labeled "Nonsmokers" but not really consisting of all nonsmokers) and the second group of women (labeled "Smokers" but not really consisting of all smokers). Finally `infer` will compute the difference between those two means. And it will do this process 1000 times.
```{r}
set.seed(1729)
bwt_smoke_test <- birthwt2 %>%
specify(response = bwt, explanatory = smoke_fct) %>%
hypothesize(null = "independence") %>%
generate(reps = 1000, type = "permute") %>%
calculate(stat = "diff in means", order = c("Nonsmoker", "Smoker"))
bwt_smoke_test
```
##### Exercise 3 {-}
Before we graph these simulated values, what do you guess will be the mean value? Keep in mind that we have computed differences in the mean birth weights between two groups of women. But because we have shuffled the smoking labels randomly, we aren't really calculating the difference in mean birth weights of nonsmokers vs smokers. We're just computing the difference in mean birth weights of randomly assigned groups of women.
::: {.answer}
Please write up your answer here.
:::
*****
Here's the visualization:
```{r}
bwt_smoke_test %>%
visualize()
```
No surprise that this histogram looks nearly normal, centered at zero: the simulation is working under the assumption of the null hypothesis of no difference between the groups.
Here is the same plot but including our sample difference:
```{r}
bwt_smoke_test %>%
visualize() +
shade_p_value(obs_stat = obs_diff, direction = "two-sided")
```
Our observed difference (from the sampled data) is quite far out into the tail of this simulated sampling distribution, so it appears that our actual data would be somewhat unlikely due to pure chance alone if the null hypothesis were true.
We can even find a P-value by calculating what proportion of our simulated values are as extreme as or more extreme than the observed difference.
```{r}
bwt_smoke_test %>%
get_p_value(obs_stat = obs_diff, direction = "two-sided")
```
Indeed, this is a small P-value.
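The counting idea behind a permutation P-value can be sketched in base R as follows. This is an illustration only: the names ending in `_demo` are invented here, the shuffles will not match the ones `infer` generated, and `infer` may define the two-sided P-value slightly differently (for example, by doubling the smaller tail), so the number need not agree exactly with the output above.

```{r}
bwt_perm_demo <- MASS::birthwt

# Observed difference in means: nonsmokers (0) minus smokers (1)
obs_diff_demo <- mean(bwt_perm_demo$bwt[bwt_perm_demo$smoke == 0]) -
    mean(bwt_perm_demo$bwt[bwt_perm_demo$smoke == 1])

# Shuffle the labels 1000 times, recording the difference in means each time
set.seed(1729)
perm_diffs_demo <- replicate(1000, {
    shuffled <- sample(bwt_perm_demo$smoke)
    mean(bwt_perm_demo$bwt[shuffled == 0]) - mean(bwt_perm_demo$bwt[shuffled == 1])
})

# Proportion of shuffled differences at least as far from zero as the observed one
perm_p_demo <- mean(abs(perm_diffs_demo) >= abs(obs_diff_demo))
perm_p_demo
```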
## The sampling distribution model
In the previous section, we simulated the sampling distribution under the assumption of a null hypothesis of no difference between the groups. It certainly looked like a normal model, but which normal model? The center is obviously zero, but what about the standard deviation?
Let's assume that both groups come from populations that are normally distributed with normal models $N(\mu_{1}, \sigma_{1})$ and $N(\mu_{2}, \sigma_{2})$. If we take samples of size $n_{1}$ from group 1 and $n_{2}$ from group 2, some fancy math shows that the distribution of the differences between sample means is
$$
N\left(\mu_{1} - \mu_{2}, \sqrt{\frac{\sigma_{1}^{2}}{n_{1}} + \frac{\sigma_{2}^{2}}{n_{2}}}\right).
$$
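If you would rather see this than take the "fancy math" on faith, here is a small simulation sketch. The population means, standard deviations, and sample sizes below are made-up values chosen only for illustration, and all the names ending in `_demo` are hypothetical.

```{r}
# Hypothetical populations: N(10, 2) and N(8, 3), with samples of size 30 and 40
mu1_demo <- 10; sigma1_demo <- 2; n1_demo <- 30
mu2_demo <- 8;  sigma2_demo <- 3; n2_demo <- 40

# Draw many pairs of samples and record the difference between the sample means
set.seed(1)
sim_diffs_demo <- replicate(10000, {
    mean(rnorm(n1_demo, mu1_demo, sigma1_demo)) -
        mean(rnorm(n2_demo, mu2_demo, sigma2_demo))
})

# Compare the simulated center and spread to what the formula predicts
formula_sd_demo <- sqrt(sigma1_demo^2 / n1_demo + sigma2_demo^2 / n2_demo)
c(simulated_mean = mean(sim_diffs_demo), formula_mean = mu1_demo - mu2_demo,
  simulated_sd = sd(sim_diffs_demo), formula_sd = formula_sd_demo)
```

The simulated mean and standard deviation should land close to the values from the formula, though not exactly on them, since the simulation is itself subject to random variability.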
Under the assumption of the null, the difference of the means is zero ($\mu_{1} - \mu_{2} = 0$). Unfortunately, though, we make no assumption on the standard deviations. It should be clear that the only solution is to substitute the sample standard deviations $s_{1}$ and $s_{2}$ for the population standard deviations $\sigma_{1}$ and $\sigma_{2}$.^[When we were testing two proportions with categorical data, one option (described in an optional appendix in that chapter) was to pool the data. With numerical data, we can calculate a pooled mean, but that doesn't help with the unknown standard deviations. Nothing in the null hypothesis suggests that the standard deviations of the two groups should be the same. In the extremely rare situation in which one can assume equal standard deviations in the two groups, then there is a way to run a pooled t test. But this "extra" assumption of equal standard deviations is typically questionable at best.]
$$
SE = \sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}.
$$
However, $s_{1}$ and $s_{2}$ are not perfect estimates of $\sigma_{1}$ and $\sigma_{2}$; they are subject to sampling variability too. This extra variability means that a normal model is no longer appropriate as the sampling distribution model.
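The standard error formula above can be evaluated for the birth weight data with a few lines of base R. This is a sketch for illustration; the names ending in `_demo` are invented here and are not used elsewhere in the chapter.

```{r}
bwt_se_demo <- MASS::birthwt

# Sample standard deviation and sample size for each group (0 = nonsmoker, 1 = smoker)
s_demo <- tapply(bwt_se_demo$bwt, bwt_se_demo$smoke, sd)
n_demo <- tapply(bwt_se_demo$bwt, bwt_se_demo$smoke, length)

# SE = sqrt(s1^2 / n1 + s2^2 / n2)
se_demo <- sqrt(sum(s_demo^2 / n_demo))
se_demo
```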
In the one-sample case, a Student t model with $df = n - 1$ was the right choice. In the two-sample case, we don't know the right answer. And I don't mean that we haven't learned it yet in our stats class. I mean, statisticians have not found a formula for the correct sampling distribution. It is a famous unsolved problem, called the Behrens-Fisher problem.
Several researchers have proposed solutions that are "close" though. One compelling one is called "Welch's t test". Welch showed that even though it's not quite right, a Student t model is very close as long as you pick the degrees of freedom carefully. Unfortunately, the way to compute the right degrees of freedom is crazy complicated. Fortunately, R is good at crazy complicated computations.
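For the curious, the degrees of freedom approximation usually attributed to Welch and Satterthwaite is written below. It is quoted from standard references rather than derived in this chapter, and you will not need to compute it by hand.

$$
df \approx \frac{\left(\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}\right)^{2}}{\frac{1}{n_{1} - 1}\left(\frac{s_{1}^{2}}{n_{1}}\right)^{2} + \frac{1}{n_{2} - 1}\left(\frac{s_{2}^{2}}{n_{2}}\right)^{2}}.
$$

The numerator is the fourth power of the standard error $SE$ from above, so the same two quantities $s_{1}^{2}/n_{1}$ and $s_{2}^{2}/n_{2}$ drive both the standard error and the degrees of freedom.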
Let's go through the full rubric.
## Exploratory data analysis
### Use data documentation (help files, code books, Google, etc.) to determine as much as possible about the data provenance and structure.
Type `?birthwt` at the Console to read the help file. We have the same concerns about the lack of details as we did in Chapter 16.
```{r}
birthwt
```
```{r}
glimpse(birthwt)
```
### Prepare the data for analysis.
We need to be sure `smoke` is a factor variable, so we create the new tibble `birthwt2` with the mutated variable `smoke_fct`.
```{r}
birthwt2 <- birthwt %>%
mutate(smoke_fct = factor(smoke, levels = c(0, 1), labels = c("Nonsmoker", "Smoker")))
birthwt2
```
```{r}
glimpse(birthwt2)
```
### Make tables or plots to explore the data visually.
How many women are in each group?
```{r}
tabyl(birthwt2, smoke_fct) %>%
adorn_totals()
```
With a numerical response variable and a categorical predictor variable, there are two useful plots: a side-by-side boxplot and a stacked histogram.
```{r}
ggplot(birthwt2, aes(y = bwt, x = smoke_fct)) +
geom_boxplot()
```
```{r}
ggplot(birthwt2, aes(x = bwt)) +
geom_histogram(binwidth = 250, boundary = 0) +
facet_grid(smoke_fct ~ .)
```
The histograms for both groups look sort of normal, but the nonsmoker group may be a little left skewed and the smoker group may have some low outliers. Here are the QQ plots to give us another way to ascertain normality of the data.
```{r}
ggplot(birthwt2, aes(sample = bwt)) +
geom_qq() +
geom_qq_line() +
facet_grid(smoke_fct ~ .)
```
There's a little deviation from normality, but nothing too crazy.
Commentary: The boxplots and histograms show why statistical inference is so important. It's clear that there is some difference between the two groups, but it's not obvious if that difference will turn out to be statistically significant. There appears to be a lot of variability in both groups, and both groups have a fair number of lighter and heavier babies.
## Hypotheses
### Identify the sample (or samples) and a reasonable population (or populations) of interest.
The samples consist of 115 nonsmoking mothers and 74 smoking mothers. The populations are those women who do not smoke during pregnancy and those women who do smoke during pregnancy.
### Express the null and alternative hypotheses as contextually meaningful full sentences.
$H_{0}:$ There is no difference in the mean birth weight of babies born to mothers who do not smoke versus mothers who do smoke.
$H_{A}:$ There is a difference in the mean birth weight of babies born to mothers who do not smoke versus mothers who do smoke.
### Express the null and alternative hypotheses in symbols (when possible).
$H_{0}: \mu_{Nonsmoker} - \mu_{Smoker} = 0$
$H_{A}: \mu_{Nonsmoker} - \mu_{Smoker} \neq 0$
Commentary: As mentioned before, the order in which you subtract will not change the inference, but it will affect your interpretation of the results. Also, once you've chosen a direction to subtract, be consistent about that choice throughout the rubric.
## Model
### Identify the sampling distribution model.
We use a t model with the number of degrees of freedom to be determined.
Commentary: For Welch's t test, the degrees of freedom won't usually be a whole number. Be sure you understand that the formula is no longer $df = n - 1$. That doesn't even make any sense as there isn't a single $n$ in a *two*-sample test. The `infer` package will tell us how many degrees of freedom to use later in the Mechanics section.
### Check the relevant conditions to ensure that model assumptions are met.
* Random (for both groups)
- We have very little information about these women. We hope that the 115 nonsmoking mothers at this hospital are representative of other nonsmoking mothers, at least in that region at that time. And same for the 74 smoking mothers.
* 10% (for both groups)
- 115 is less than 10% of all nonsmoking mothers and 74 is less than 10% of all smoking mothers.
* Nearly normal (for both groups)
- Since the sample sizes are more than 30 in each group, we meet the condition.
## Mechanics
### Compute the test statistic.
```{r}
obs_diff <- birthwt2 %>%
specify(response = bwt, explanatory = smoke_fct) %>%
calculate(stat = "diff in means", order = c("Nonsmoker", "Smoker"))
obs_diff
```
```{r}
obs_diff_t <- birthwt2 %>%
specify(response = bwt, explanatory = smoke_fct) %>%
calculate(stat = "t", order = c("Nonsmoker", "Smoker"))
obs_diff_t
```
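Commentary: The t score is just the observed difference divided by its standard error. Here is a base R sketch of that arithmetic, offered as a cross-check rather than as a description of how `infer` computes it. The names ending in `_demo` are invented for this illustration.

```{r}
bwt_t_demo <- MASS::birthwt

# Group means, standard deviations, and sizes (0 = nonsmoker, 1 = smoker)
means_t_demo <- tapply(bwt_t_demo$bwt, bwt_t_demo$smoke, mean)
sds_t_demo <- tapply(bwt_t_demo$bwt, bwt_t_demo$smoke, sd)
ns_t_demo <- tapply(bwt_t_demo$bwt, bwt_t_demo$smoke, length)

# Difference in means (nonsmokers minus smokers) and its standard error
diff_t_demo <- unname(means_t_demo["0"] - means_t_demo["1"])
se_t_demo <- sqrt(sum(sds_t_demo^2 / ns_t_demo))

# t score: how many standard errors the difference lies from the null value of zero
t_score_demo <- (diff_t_demo - 0) / se_t_demo
t_score_demo
```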
### Report the test statistic in context (when possible).
The difference in the mean birth weight of babies born to nonsmoking mothers and smoking mothers is `r obs_diff %>% pull(1)` grams. This was obtained by subtracting nonsmoking mothers minus smoking mothers. In other words, the fact that this is positive indicates that nonsmoking mothers had heavier babies, on average, than smoking mothers.
The t score is `r obs_diff_t %>% pull(1)`. The sample difference in birth weights is about 2.7 standard errors higher than the null value of zero.
Commentary: Remember that whenever you are computing the difference between two quantities, you must indicate the direction of that difference so your reader knows how to interpret the value, whether it is positive or negative.
### Plot the null distribution.
```{r}
bwt_smoke_test_t <- birthwt2 %>%
specify(response = bwt, explanatory = smoke_fct) %>%
hypothesize(null = "independence") %>%
assume("t")
bwt_smoke_test_t
```
```{r}
bwt_smoke_test_t %>%
visualize() +
shade_p_value(obs_stat = obs_diff_t, direction = "two-sided")
```
Commentary: We use the new variable name `bwt_smoke_test_t` (for the assumption of a Student t model) so that it doesn't overwrite the variable `bwt_smoke_test` from the permutation test we performed earlier (the one with the shuffling). The results of using `bwt_smoke_test` versus `bwt_smoke_test_t` will be very similar.
Note that the `infer` output tells us there are 170 degrees of freedom. (It turns out to be 170.1.) Note that this number is the result of a complicated formula, and it's not just a simple function of the sample sizes 115 and 74.
Finally, note that the alternative hypothesis indicated a two-sided test, so we need to specify a "two-sided" P-value in the `shade_p_value` command.
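Commentary: To see where a number like 170.1 can come from, here is a base R sketch that applies the usual Welch-Satterthwaite approximation and compares it to the degrees of freedom reported by base R's `t.test` (which runs Welch's test by default). The names ending in `_demo` are invented for this illustration, and we are assuming, not verifying, that `infer` uses the same approximation.

```{r}
bwt_df_demo <- MASS::birthwt

# For each group (0 = nonsmoker, 1 = smoker): variance divided by sample size
ns_df_demo <- tapply(bwt_df_demo$bwt, bwt_df_demo$smoke, length)
v_df_demo <- tapply(bwt_df_demo$bwt, bwt_df_demo$smoke, var) / ns_df_demo

# Welch-Satterthwaite approximation for the degrees of freedom
welch_df_demo <- sum(v_df_demo)^2 / sum(v_df_demo^2 / (ns_df_demo - 1))

# Compare with the value reported by t.test
c(by_hand = welch_df_demo,
  t_test = unname(t.test(bwt ~ smoke, data = bwt_df_demo)$parameter))
```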
### Calculate the P-value.
```{r}
bwt_smoke_p <- bwt_smoke_test_t %>%
get_p_value(obs_stat = obs_diff_t, direction = "two-sided")
bwt_smoke_p
```
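Commentary: A two-sided P-value from a t model is the area in both tails beyond the observed t score. As a cross-check, here is a sketch using base R's `t.test` and `pt` instead of `infer`. The names ending in `_demo` are invented for this illustration.

```{r}
# Welch's t test is the default for t.test; groups are 0 = nonsmoker, 1 = smoker
welch_p_demo <- t.test(bwt ~ smoke, data = MASS::birthwt)

# Pull out the t score and the degrees of freedom
t_p_demo <- unname(welch_p_demo$statistic)
df_p_demo <- unname(welch_p_demo$parameter)

# Area in the lower tail beyond -|t|, doubled to account for both tails
p_by_hand_demo <- 2 * pt(-abs(t_p_demo), df = df_p_demo)
c(by_hand = p_by_hand_demo, t_test = welch_p_demo$p.value)
```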
### Interpret the P-value as a probability given the null.
The P-value is `r bwt_smoke_p %>% pull(1)`. If there were no difference in the mean birth weights between nonsmoking and smoking women, there would be a `r 100 * bwt_smoke_p %>% pull(1)`% chance of seeing data at least as extreme as what we saw.
## Conclusion
### State the statistical conclusion.
We reject the null hypothesis.
### State (but do not overstate) a contextually meaningful conclusion.
We have sufficient evidence that there is a difference in the mean birth weight of babies born to mothers who do not smoke versus mothers who do smoke.
### Express reservations or uncertainty about the generalizability of the conclusion.
As when we looked at this data before, our uncertainty about the data provenance means that we don't know if the difference observed in these samples at this one hospital at this one time is generalizable to larger populations. Also keep in mind that this data is observational, so we cannot draw any causal conclusion about the "effect" of smoking on birth weight.
### Identify the possibility of either a Type I or Type II error and state what making such an error means in the context of the hypotheses.
If we've made a Type I error, then there is actually no difference in the mean birth weights of babies from nonsmoking versus smoking mothers, but we got some unusual samples that showed a difference.
## Confidence interval
### Check the relevant conditions to ensure that model assumptions are met.
There are no additional conditions to check.
### Calculate the confidence interval.
```{r}
bwt_smoke_ci <- bwt_smoke_test_t %>%
get_confidence_interval(point_estimate = obs_diff, level = 0.95)
bwt_smoke_ci
```
Commentary: Pay close attention to when we use `obs_diff` and `obs_diff_t`. In the hypothesis test, we assumed a t distribution for the null and so we have to use the t score `obs_diff_t` to shade the P-value. However, for a confidence interval, we are building the interval centered on our sample difference `obs_diff`.
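Commentary: The interval has the familiar form "estimate plus or minus a critical t value times the standard error". Here is a base R sketch of that arithmetic using the pieces reported by `t.test`. The names ending in `_demo` are invented for this illustration, and the result should agree closely with the `infer` output, assuming `infer` builds the interval the same way.

```{r}
# Welch's t test (the default for t.test); groups are 0 = nonsmoker, 1 = smoker
welch_ci_demo <- t.test(bwt ~ smoke, data = MASS::birthwt)

# Observed difference in means: nonsmokers minus smokers
diff_ci_demo <- unname(welch_ci_demo$estimate[1] - welch_ci_demo$estimate[2])

# Recover the standard error from the t score (t = difference / SE)
se_ci_demo <- diff_ci_demo / unname(welch_ci_demo$statistic)

# Critical t value for 95% confidence with Welch's degrees of freedom
t_star_demo <- qt(0.975, df = unname(welch_ci_demo$parameter))

ci_by_hand_demo <- c(lower = diff_ci_demo - t_star_demo * se_ci_demo,
                     upper = diff_ci_demo + t_star_demo * se_ci_demo)
ci_by_hand_demo
```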
### State (but do not overstate) a contextually meaningful interpretation.
We are 95% confident that the true difference in mean birth weight between babies born to nonsmoking and smoking mothers is captured in the interval (`r bwt_smoke_ci$lower_ci` g, `r bwt_smoke_ci$upper_ci` g). We obtained this by subtracting nonsmokers minus smokers.
Commentary: Again, remember to indicate the direction of the difference by indicating the order of subtraction.
### If running a two-sided test, explain how the confidence interval reinforces the conclusion of the hypothesis test.
Since zero is not contained in the confidence interval, zero is not a plausible value for the true difference in birth weights between the two groups of mothers.
### When comparing two groups, comment on the effect size and the practical significance of the result.
In order to know if smoking is a risk factor for low birth weight, we would need to know what a difference of 80 grams or 490 grams means for babies. Although most of us presumably don't have any special training in obstetrics, we could do a quick internet search to see that even half a kilogram is not a large amount of weight difference between two babies. Having said that, though, any difference in birth weight that might be attributable to smoking could be a concern to doctors. In any event, our data is observational, so we cannot make causal claims here.
## Your turn
Continue to use the `birthwt` data set. This time, see if a history of hypertension is associated with a difference in the mean birth weight of babies. In the "Prepare the data for analysis" section, you will need to create a new tibble---call it `birthwt3`---in which you convert the `ht` variable to a factor variable.
The rubric outline is reproduced below. You may refer to the worked example above and modify it accordingly. Remember to strip out all the commentary. That is just exposition for your benefit in understanding the steps, but is not meant to form part of the formal inference process.
Another word of warning: the copy/paste process is not a substitute for your brain. You will often need to modify more than just the names of the data frames and variables to adapt the worked examples to your own work. Do not blindly copy and paste code without understanding what it does. And you should **never** copy and paste text. All the sentences and paragraphs you write are expressions of your own analysis. They must reflect your own understanding of the inferential process.
**Also, so that your answers here don't mess up the code chunks above, use new variable names everywhere.**
##### Exploratory data analysis {-}
###### Use data documentation (help files, code books, Google, etc.) to determine as much as possible about the data provenance and structure. {-}
::: {.answer}
Please write up your answer here.
```{r}
# Add code here to print the data
```
```{r}
# Add code here to glimpse the variables
```
:::
###### Prepare the data for analysis. [Not always necessary.] {-}
::: {.answer}
```{r}
# Add code here to prepare the data for analysis.
```
:::
###### Make tables or plots to explore the data visually. {-}
::: {.answer}
```{r}
# Add code here to make tables or plots.
```
:::
##### Hypotheses {-}
###### Identify the sample (or samples) and a reasonable population (or populations) of interest. {-}
::: {.answer}
Please write up your answer here.
:::
###### Express the null and alternative hypotheses as contextually meaningful full sentences. {-}
::: {.answer}
$H_{0}:$ Null hypothesis goes here.
$H_{A}:$ Alternative hypothesis goes here.
:::
###### Express the null and alternative hypotheses in symbols (when possible). {-}
::: {.answer}
$H_{0}: math$
$H_{A}: math$
:::
##### Model {-}
###### Identify the sampling distribution model. {-}
::: {.answer}
Please write up your answer here.
:::
###### Check the relevant conditions to ensure that model assumptions are met. {-}
::: {.answer}
Please write up your answer here. (Some conditions may require R code as well.)
:::
##### Mechanics {-}
###### Compute the test statistic. {-}
::: {.answer}
```{r}
# Add code here to compute the test statistic.
```
:::
###### Report the test statistic in context (when possible). {-}
::: {.answer}
Please write up your answer here.
:::
###### Plot the null distribution. {-}
::: {.answer}
```{r}
# IF CONDUCTING A SIMULATION...
set.seed(1)
# Add code here to simulate the null distribution.
```
```{r}
# Add code here to plot the null distribution.
```
:::
###### Calculate the P-value. {-}
::: {.answer}
```{r}
# Add code here to calculate the P-value.
```
:::
###### Interpret the P-value as a probability given the null. {-}
::: {.answer}
Please write up your answer here.
:::
##### Conclusion {-}
###### State the statistical conclusion. {-}
::: {.answer}
Please write up your answer here.
:::
###### State (but do not overstate) a contextually meaningful conclusion. {-}
::: {.answer}
Please write up your answer here.
:::
###### Express reservations or uncertainty about the generalizability of the conclusion. {-}
::: {.answer}
Please write up your answer here.
:::
###### Identify the possibility of either a Type I or Type II error and state what making such an error means in the context of the hypotheses. {-}
::: {.answer}
Please write up your answer here.
:::
##### Confidence interval {-}
###### Check the relevant conditions to ensure that model assumptions are met. {-}
::: {.answer}
Please write up your answer here. (Some conditions may require R code as well.)
:::
###### Calculate and graph the confidence interval. {-}
::: {.answer}
```{r}
# Add code here to calculate the confidence interval.
```
```{r}
# Add code here to graph the confidence interval.
```
:::
###### State (but do not overstate) a contextually meaningful interpretation. {-}
::: {.answer}
Please write up your answer here.
:::
###### If running a two-sided test, explain how the confidence interval reinforces the conclusion of the hypothesis test. [Not always applicable.] {-}
::: {.answer}
Please write up your answer here.
:::
###### When comparing two groups, comment on the effect size and the practical significance of the result. [Not always applicable.] {-}
::: {.answer}
Please write up your answer here.
:::
## Conclusion
A numerical variable can be split into two groups using a categorical variable. As long as the groups are independent of each other, we can use inference to determine if there is a statistically significant difference between the mean values of the response variable for each group. Such a test can be run by simulation (using a permutation test) or by meeting the conditions for and assuming a t distribution (with a complicated formula for the degrees of freedom).
### Preparing and submitting your assignment
1. From the "Run" menu, select "Restart R and Run All Chunks".
2. Deal with any code errors that crop up. Repeat steps 1--2 until there are no more code errors.
3. Spell check your document by clicking the icon with "ABC" and a check mark.
4. Hit the "Preview" button one last time to generate the final draft of the `.nb.html` file.
5. Proofread the HTML file carefully. If there are errors, go back and fix them, then repeat steps 1--5.
If you have completed this chapter as part of a statistics course, follow the directions you receive from your professor to submit your assignment.