The Pearson correlation coefficient, $r$, is covered in most introductory statistics courses. One question students may ask is how large $r$ needs to be before it is likely to be important. Before presenting a formal significance test, it is helpful to show some simulations.

Simulations of independent normal $X$ and $Y$ are used to compute the Pearson correlation coefficient $r$. The histogram of $r$ is obtained and compared with its exact distribution (solid curve) and a normal approximation (dashed red curve). As the sample size $n$ increases from 5 to 30, the normal approximates the exact distribution more and more closely, and the distribution becomes much more narrowly concentrated around the true value, $\rho = 0$. The histogram shown uses only 100 simulations; increase the number of simulations to get a more accurate histogram density estimate.
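The simulation described above can be sketched outside the Demonstration as follows. This is a minimal illustration in Python using only the standard library, not the Demonstration's own code; the function names, the seed, and the choice of 2000 replications are assumptions made for the example.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_null_r(n, n_sims, rng):
    """Simulate n_sims values of r from independent standard normal samples of size n."""
    rs = []
    for _ in range(n_sims):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rs.append(pearson_r(xs, ys))
    return rs

def std_dev(values):
    """Sample standard deviation (divisor len(values) - 1)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

rng = random.Random(12345)  # fixed seed so the run is repeatable
spread_n5 = std_dev(simulate_null_r(5, 2000, rng))
spread_n30 = std_dev(simulate_null_r(30, 2000, rng))
print(f"sd of r, n=5:  {spread_n5:.3f}")
print(f"sd of r, n=30: {spread_n30:.3f}")
```

The spread of the simulated values of $r$ should be visibly smaller for $n = 30$ than for $n = 5$, which is the narrowing seen in the histograms.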

The robustness of the distribution of $r$ may be checked by examining the histograms when $X$ and/or $Y$ has a non-normal distribution. The exponential and distributions may be selected.
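One way to probe robustness numerically is to compare how often the simulated correlation falls far from zero under normal and under exponential sampling. The sketch below is an assumption-laden illustration: the sample size of 10, the cutoff 0.5, the unit-rate exponential, and the helper `tail_fraction` are all choices made here, and both variables are given the same distribution for simplicity.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def tail_fraction(sampler, n, cutoff, n_sims, rng):
    """Fraction of simulated |r| values above cutoff for independent samples."""
    hits = 0
    for _ in range(n_sims):
        xs = [sampler(rng) for _ in range(n)]
        ys = [sampler(rng) for _ in range(n)]
        if abs(pearson_r(xs, ys)) > cutoff:
            hits += 1
    return hits / n_sims

rng = random.Random(2024)  # fixed seed so the run is repeatable
normal_tail = tail_fraction(lambda g: g.gauss(0.0, 1.0), 10, 0.5, 4000, rng)
expo_tail = tail_fraction(lambda g: g.expovariate(1.0), 10, 0.5, 4000, rng)
print(f"P(|r| > 0.5), normal data:      {normal_tail:.3f}")
print(f"P(|r| > 0.5), exponential data: {expo_tail:.3f}")
```

If the two fractions are close, the null distribution of the correlation is not very sensitive to this particular departure from normality at this sample size.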

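Given data $(x_i, y_i)$, $i = 1, \dots, n$, the sample Pearson correlation coefficient may be simply defined (see [1]) as

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right),$$

where $\bar{x}$ and $\bar{y}$ are the sample means and $s_x$ and $s_y$ are the sample standard deviations, and computed using the Mathematica function Correlation.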
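As a small worked example, the sample correlation can be computed as the average product of standardized values, with divisor one less than the sample size. The Python sketch below is a hand-rolled analogue of what a built-in correlation function returns, not a reproduction of Mathematica's Correlation; the five data points are made up for illustration.

```python
import math

def standardized_r(xs, ys):
    """Pearson r as the sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    total = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data: deviations from the means are (-2,-1,0,1,2) and (-1,-2,1,0,2),
# so the cross-product sum is 8 and both sums of squares are 10, giving r = 0.8.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = standardized_r(x, y)
print(round(r, 6))
```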

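The exact probability density function of $r$ for the null case, $\rho = 0$, when the data are normal can be written [2, equation 27]

$$f(r) = \frac{(1 - r^2)^{(n-4)/2}}{B\left(\tfrac{1}{2}, \tfrac{n-2}{2}\right)}, \qquad -1 < r < 1,$$

where $B$ is the beta function.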
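The null density of the sample correlation for normal data is proportional to one minus the squared correlation, raised to the power of half the sample size minus two, with a beta-function normalizing constant. A quick numerical check, sketched here in Python, is that this density integrates to 1 and has variance equal to the reciprocal of the sample size minus one. The function names and the midpoint-rule integration are choices made for this illustration.

```python
import math

def null_density_r(r, n):
    """Density of r at sample size n when rho = 0 and the data are normal (n >= 3)."""
    if not -1.0 < r < 1.0:
        return 0.0
    # log B(1/2, (n-2)/2) via log-gamma functions
    log_beta = math.lgamma(0.5) + math.lgamma((n - 2) / 2) - math.lgamma((n - 1) / 2)
    return math.exp(((n - 4) / 2) * math.log1p(-r * r) - log_beta)

def integrate_midpoint(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

n = 10
total = integrate_midpoint(lambda r: null_density_r(r, n), -1.0, 1.0)
variance = integrate_midpoint(lambda r: r * r * null_density_r(r, n), -1.0, 1.0)
print(f"integral of density: {total:.6f}")
print(f"variance of r:       {variance:.6f}  (1/(n-1) = {1 / (n - 1):.6f})")
```

For a sample of size 4 the exponent vanishes and the density reduces to the uniform density 1/2 on the interval from -1 to 1, which gives a second easy sanity check.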

Based on this exact distribution, an exact test of $H_0\colon \rho = 0$ may be constructed [2, §34]. Under $H_0$, the test statistic $t = r\sqrt{n-2}/\sqrt{1-r^2}$ is $t$-distributed on $n - 2$ degrees of freedom.
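The exact test can be sketched as a conversion from the sample correlation to a Student t statistic on two fewer degrees of freedom than the sample size, and back again. In the Python illustration below, the observed correlation 0.8 with a sample of size 5 is a made-up example, and 3.182 is the standard tabled two-sided 5% critical value of the t distribution on 3 degrees of freedom; the function names are assumptions.

```python
import math

def t_statistic(r, n):
    """Convert a sample correlation r (sample size n) to its t statistic on n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def r_from_t(t, n):
    """Invert the conversion: the correlation that corresponds to a given t value."""
    return t / math.sqrt(t * t + n - 2)

T_CRIT_DF3 = 3.182  # tabled two-sided 5% critical value of t on 3 df

t_obs = t_statistic(0.8, 5)          # hypothetical observed r = 0.8 with n = 5
reject = abs(t_obs) > T_CRIT_DF3     # two-sided test at the 5% level
r_crit = r_from_t(T_CRIT_DF3, 5)     # same critical value on the correlation scale
print(f"t = {t_obs:.3f}, reject H0: {reject}, critical |r| = {r_crit:.3f}")
```

With only five observations, even a correlation of 0.8 does not reach the 5% critical value, which illustrates how wide the null distribution is for very small samples.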

A simple but reasonable approximation [3, §34] is to use the normal distribution with mean zero and variance . This Demonstration shows that this approximation works well unless $n$ is very small.

The table below compares the critical values for a 5% two-sided test based on $r$. That is, we reject $H_0\colon \rho = 0$ in favor of $H_1\colon \rho \neq 0$ when the observed value of $|r|$ exceeds the critical value. Again, this table confirms that the approximation is quite good, provided $n$ is not too small.
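A comparison of this kind can be reproduced approximately with the Python sketch below. It finds the exact critical value by bisection on the upper-tail probability of the null density, and the approximate one from a normal quantile. Two points are assumptions of this sketch rather than statements about the source: the normal approximation is taken to have variance equal to the reciprocal of the sample size minus one (the exact null variance of the correlation), and the sample sizes 5, 10, and 30 are chosen only for illustration.

```python
import math
from statistics import NormalDist

def null_density_r(r, n):
    """Density of r at sample size n when rho = 0 and the data are normal."""
    if not -1.0 < r < 1.0:
        return 0.0
    log_beta = math.lgamma(0.5) + math.lgamma((n - 2) / 2) - math.lgamma((n - 1) / 2)
    return math.exp(((n - 4) / 2) * math.log1p(-r * r) - log_beta)

def upper_tail(c, n, steps=2000):
    """P(r > c) under the null, by midpoint integration over (c, 1)."""
    h = (1.0 - c) / steps
    return h * sum(null_density_r(c + (i + 0.5) * h, n) for i in range(steps))

def exact_critical_r(n, alpha=0.05):
    """Two-sided critical value of |r|: bisection for upper tail equal to alpha / 2."""
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if upper_tail(mid, n) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def normal_critical_r(n, alpha=0.05):
    """Normal-approximation critical value. ASSUMPTION: variance 1 / (n - 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2) / math.sqrt(n - 1)

for n in (5, 10, 30):
    print(f"n={n:2d}  exact={exact_critical_r(n):.3f}  normal={normal_critical_r(n):.3f}")
```

Under these assumptions the two critical values differ noticeably at the smallest sample size and agree much more closely at the largest.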

[1] D. S. Moore, The Basic Practice of Statistics, New York: W. H. Freeman and Company, 2010.