Although the exact sampling distribution for the proportion defective is a binomial distribution, in which $p$ is the probability of an individual item being defective, this Demonstration uses the normal approximation to the binomial distribution, which is valid for large $n$. A rule of thumb is that $n$ should be large enough for both expected counts to be adequate; a commonly cited form of the test is $np \geq 5$ and $n(1-p) \geq 5$.
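The adequacy check for the normal approximation can be sketched in a few lines of Python. This is only an illustration: the function name `normal_approx_reasonable` and the threshold of 5 are assumptions taken from a commonly cited textbook rule, not necessarily the exact test used by the Demonstration.

```python
def normal_approx_reasonable(n: int, p: float, threshold: float = 5.0) -> bool:
    """Return True if both expected counts meet the (assumed) rule of thumb.

    n         -- sample size
    p         -- probability that an individual item is defective
    threshold -- minimum expected count; 5 is a common choice, assumed here
    """
    if n <= 0 or not 0.0 < p < 1.0:
        raise ValueError("n must be positive and p must lie strictly between 0 and 1")
    expected_defective = n * p
    expected_good = n * (1.0 - p)
    return expected_defective >= threshold and expected_good >= threshold


# A sample of 100 with p = 0.10 expects 10 defectives, so the check passes;
# a sample of 20 expects only 2, so it fails.
print(normal_approx_reasonable(100, 0.10))
print(normal_approx_reasonable(20, 0.10))
```

When the check fails, the exact binomial distribution is the safer basis for the calculation.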
Suppose an experiment is to be conducted wherein a treatment of some sort is to be applied to a population. The investigator is interested in knowing the minimum sample size that should be randomly selected from the population to detect a change in the proportion having a certain characteristic as a result of the treatment. The statistical factors that influence the sample size are:
- the value selected for $\alpha$, the type I risk (the risk of rejecting a true null hypothesis)
- the value selected for $\beta$, the type II risk (the risk of accepting the null hypothesis as true when, in reality, it is false)
- the value of the population proportion defective, $p$ (this is often an estimate, based on judgment and experience)
- the minimum size of the change to be detected by the sample, $\delta$
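One standard textbook way to combine these four factors under the normal approximation is shown below. It is offered as a sketch, not as the formula the Demonstration necessarily implements. The symbols are assumptions of this example: $\alpha$ and $\beta$ are the type I and type II risks, $p$ is the population proportion, $\delta$ is the minimum change to be detected, and $z_{q}$ is the $q$ quantile of the standard normal distribution.

$$
n \;=\; \left( \frac{ z_{1-\alpha}\,\sqrt{p\,(1-p)} \;+\; z_{1-\beta}\,\sqrt{(p+\delta)\,(1-p-\delta)} }{ \delta } \right)^{2}
$$

For a two-sided test, $z_{1-\alpha}$ is replaced by $z_{1-\alpha/2}$, and the result is rounded up to the next whole number.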
The type I risk is that of getting a false positive from the sample. In other words, there is a probability $\alpha$ that the sample will indicate that there is a difference of at least $\delta$ when, in fact, there is not.
The type II risk is that of getting a false negative from the sample. In other words, there is a probability $\beta$ that the sample will indicate there is not a difference of at least $\delta$ when, in fact, there is such a difference. The ability to detect a difference when there actually is one is called the power of the test and is equal to $1 - \beta$.
As mentioned, whether the test of the hypothesis is one-sided or two-sided has an effect on the sample size. A one-sided test would, for example, look either at whether the proportion defective is more than the hypothesized value or at whether it is less than the hypothesized value. A two-sided test would consider whether the proportion defective is significantly different from the hypothesized value in either direction, so the type I risk is split between both tails of the distribution of the test statistic.
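The effect of the one-sided versus two-sided choice can be seen in the critical value of the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is hypothetical and introduced only for illustration.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_sided: bool) -> float:
    """Critical value of the standard normal for a given type I risk.

    A two-sided test splits alpha equally between the two tails,
    so each tail receives alpha / 2.
    """
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


z_one_sided = critical_z(0.05, two_sided=False)
z_two_sided = critical_z(0.05, two_sided=True)
print(f"one-sided: {z_one_sided:.3f}, two-sided: {z_two_sided:.3f}")
```

At a 5% type I risk the one-sided critical value is about 1.645 and the two-sided value is about 1.960. Because the critical value enters the sample size calculation, the two-sided test calls for a larger sample, all else being equal.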
The user of this Demonstration should mentally formulate a null hypothesis and an alternative hypothesis (either one- or two-sided). Then the appropriate control selections can be made and the resulting sample size can be examined. Typical values for $\alpha$ and $\beta$ are 5% and 10%, respectively, corresponding to a confidence level of 95% and a test power of 90%.
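To make the link between sample size and power concrete, the following sketch estimates the power of a test of a proportion under the normal approximation. The function `approx_power` and its parameter names are hypothetical; `p0` stands for the hypothesized proportion and `delta` for the change to be detected. For the two-sided case the sketch ignores the very small rejection probability in the far tail, which is a simplifying assumption.

```python
import math
from statistics import NormalDist


def approx_power(n: int, p0: float, delta: float,
                 alpha: float = 0.05, two_sided: bool = False) -> float:
    """Approximate power (1 - beta) of a test of a single proportion.

    Uses the normal approximation to the binomial distribution.
    """
    p1 = p0 + delta  # true proportion under the alternative hypothesis
    if n <= 0 or delta == 0 or not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("need n > 0, delta != 0, and proportions strictly in (0, 1)")
    std_normal = NormalDist()
    tail_area = alpha / 2 if two_sided else alpha
    z_crit = std_normal.inv_cdf(1 - tail_area)
    # Standardized distance between the rejection boundary (set under p0)
    # and the true proportion p1, measured in standard errors under p1.
    z = (abs(delta) * math.sqrt(n) - z_crit * math.sqrt(p0 * (1 - p0))) \
        / math.sqrt(p1 * (1 - p1))
    return std_normal.cdf(z)


# Power grows with the sample size for a fixed alpha and delta.
for n in (100, 200, 400, 800):
    print(n, round(approx_power(n, p0=0.10, delta=0.05), 3))
```

Reading the loop output from top to bottom shows the power rising toward 1 as the sample size grows, which is the trade-off the controls of the Demonstration let the user explore.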
Note that making the difference to be detected, $\delta$, small drives up the required sample size, as would be expected.
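The growth in sample size as the detectable difference shrinks can be illustrated numerically. The sketch below uses a standard normal-approximation formula for a test of a single proportion; it is an assumption that this matches the Demonstration's own calculation, and the name `required_sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(p0: float, delta: float, alpha: float = 0.05,
                         beta: float = 0.10, two_sided: bool = False) -> int:
    """Approximate minimum sample size to detect a change of size delta.

    p0    -- hypothesized population proportion
    delta -- minimum change to detect (may be negative for a decrease)
    alpha -- type I risk
    beta  -- type II risk
    """
    p1 = p0 + delta  # proportion under the alternative hypothesis
    if delta == 0 or not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("need delta != 0 and proportions strictly in (0, 1)")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = std_normal.inv_cdf(1 - beta)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1)))
    # Round up so that the requested power is not undershot.
    return math.ceil((numerator / delta) ** 2)


# Sweep the detectable difference from large to small.
sizes = {d: required_sample_size(p0=0.10, delta=d) for d in (0.10, 0.05, 0.025)}
for d, n in sizes.items():
    print(f"delta = {d:.3f} -> n = {n}")
```

Because $\delta$ appears squared in the denominator, halving it raises the required sample size by close to a factor of four; the factor is not exactly four because the variance under the alternative also changes.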