# Exploring Robustness of Mean-Difference Confidence Intervals

This Demonstration examines confidence intervals at the 95% level for the difference in means of two random samples, each drawn from a normal, uniform, Laplace, or centered exponential distribution. The distributions all have mean 0. The variance of one distribution can be varied from 1 to 10, while the variance of the other is fixed at 1. Several methods available in the *Mathematica* function MeanDifferenceCI are examined. The "t (Welch)" method is the default for MeanDifferenceCI; "t (pooled)" corresponds to the option setting EqualVariances -> True; "Z" and "Z (pooled)" correspond to setting the option KnownVariance to the two sample variances or to the pooled variance estimate, respectively. The fifth method, "t (conservative)", is a conservative approximation using the two-sample *t*-statistic with degrees of freedom equal to the smaller sample size minus 1.
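To make the differences between the *t*-based methods concrete, here is a minimal Python sketch (not the Demonstration's own *Mathematica* code) of the standard error and degrees of freedom each variant would use. The function name `standard_errors_and_df` is hypothetical, and the conservative degrees of freedom are assumed to follow the usual textbook convention of the smaller sample size minus 1.

```python
from math import sqrt
from statistics import variance


def standard_errors_and_df(x, y):
    """Return (standard error, degrees of freedom) for three t-based variants."""
    m, n = len(x), len(y)
    vx, vy = variance(x), variance(y)  # unbiased sample variances

    # Welch: unpooled standard error with Welch-Satterthwaite degrees of freedom.
    a, b = vx / m, vy / n
    se_unpooled = sqrt(a + b)
    df_welch = (a + b) ** 2 / (a ** 2 / (m - 1) + b ** 2 / (n - 1))

    # Pooled: one common variance estimate, m + n - 2 degrees of freedom.
    v_pooled = ((m - 1) * vx + (n - 1) * vy) / (m + n - 2)
    se_pooled = sqrt(v_pooled * (1 / m + 1 / n))

    return {
        "welch": (se_unpooled, df_welch),
        "pooled": (se_pooled, m + n - 2),
        # Assumption: conservative df taken as min(m, n) - 1.
        "conservative": (se_unpooled, min(m, n) - 1),
    }
```

The Welch degrees of freedom always lie between `min(m, n) - 1` and `m + n - 2`, which is why the conservative variant can only widen the interval relative to Welch.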

Contributed by: Ian McLeod (March 2011)

Open content licensed under CC BY-NC-SA

## Details

Snapshot 1: Using "t (Welch)", setting , , 's from Laplace and 's from centered exponential distribution, we find after about 10,000 simulations ; in this case the "t (Welch)" method is robust. Using "Z" or "Z (pooled)", after about 10,000 simulations, so clearly the Z-methods are not suitable when the standard deviations are estimated.

Snapshot 2: As in Snapshot 1, except the roles of and are reversed; in this case after 10,000 simulations and we conclude even "t (Welch)" is not robust. Increasing the sample sizes, , we obtain so for large enough samples, "t (Welch)" is accurate.

Snapshot 3: Assuming and both normal, setting , and selecting "t (pooled)", we find after 10,000 simulations that ; instead of 95% coverage, it is less. The "t (pooled)" intervals err on the side of not being conservative, that is, they are too narrow; for this reason, the use of the "t (pooled)" method is disparaged; see p. 487 of [1].

Snapshot 4: With , , both and normal, and using the "t (conservative)" method, after 10,000 simulations, , showing that this method is indeed conservative.

Snapshot 5: Settings as in Snapshot 1 but using the -method; after 10,000 simulations , showing that this method is not conservative and not acceptable with small samples. Increasing , after 10,000 simulations, showing that for larger samples the -method is sufficiently accurate.
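The coverage estimates quoted in the snapshots come from repeated simulation. A rough Python sketch of that kind of experiment, for the "Z" interval with estimated standard deviations, might look like the following. It is an illustration only and not the Demonstration's code: the names `z_interval_coverage`, `laplace`, and `centered_exponential` are hypothetical, and the samplers are scaled to unit variance as an assumption.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, variance


def laplace(rng, sd=1.0):
    """Mean-0 Laplace draw as the difference of two exponentials."""
    rate = sqrt(2.0) / sd  # Laplace scale b = sd / sqrt(2); rate = 1 / b
    return rng.expovariate(rate) - rng.expovariate(rate)


def centered_exponential(rng, sd=1.0):
    """Exponential draw shifted to have mean 0 and the given sd."""
    return rng.expovariate(1.0 / sd) - sd


def z_interval_coverage(draw_x, draw_y, m, n, reps=10_000, level=0.95, seed=0):
    """Estimate how often the Z interval covers the true difference (0)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    hits = 0
    for _ in range(reps):
        x = [draw_x(rng) for _ in range(m)]
        y = [draw_y(rng) for _ in range(n)]
        diff = fmean(x) - fmean(y)
        # Plug the sample variances in as if they were known.
        half_width = z * sqrt(variance(x) / m + variance(y) / n)
        hits += diff - half_width <= 0.0 <= diff + half_width
    return hits / reps
```

Calling, for example, `z_interval_coverage(laplace, centered_exponential, m=5, n=5)` and then repeating with larger `m` and `n` gives a feel for how the estimated coverage moves toward the nominal level as the samples grow.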

Some textbooks recommend using the pooled *t*-test if a simple *F*-test for equal variances is not rejected. As noted on p. 488 of [1], this is not a good idea, because this test itself is not robust against skewness and/or outliers.

The robustness for other confidence levels could be explored by resetting the variable in the initialization code.
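Changing the confidence level only changes the critical value that multiplies the standard error. As a small Python illustration for the Z-methods (the *t*-methods would use the corresponding *t* quantile instead), with the hypothetical helper name `z_critical`:

```python
from statistics import NormalDist


def z_critical(level):
    """Two-sided standard normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.2f} -> {z_critical(level):.3f}")
```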

[1] D. S. Moore, *The Basic Practice of Statistics*, 5th ed., New York: W. H. Freeman and Company, 2010.
