Visualizing Correlations

In dimension two, the data come from either the bivariate normal distribution with mean zero, unit variances, and correlation parameter ρ, or, in the contaminated case, from a version of it in which each observation is replaced, with probability 10%, by one from the same distribution multiplied by 3. The contaminated distribution is sometimes used to describe non-normal data with a higher proportion of outliers than the normal. The estimated correlation is shown and reflects the pattern seen in the data, but it may not be an accurate estimate of ρ when the sample size n is small, even in the normal case. Increasing n increases the accuracy of the estimator. If n is kept fixed, the variability of the estimator decreases as the absolute magnitude of ρ increases. You can see this by varying the seed and then experimenting with different values of ρ. As we zoom out, our perception may spuriously suggest that the association between the variables increases. Using the contaminated normal distribution increases the variability of the estimate and the likelihood of an apparent spurious association when ρ = 0.
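The bivariate setup described above can be imitated outside the Demonstration with a short simulation. The following is only a sketch in Python using the standard library, not the Demonstration's own code: the function names, the default seed, and the reading of "multiplied by 3" as scaling both coordinates of the contaminated observation are assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_bivariate(n, rho, contaminated=False, seed=0, eps=0.1, scale=3.0):
    """Draw n points from a mean-zero, unit-variance bivariate normal with
    correlation rho. If contaminated, each point is multiplied by `scale`
    with probability `eps` (assumed reading of the 10% / factor 3 rule)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1
        y = rho * z1 + math.sqrt(1.0 - rho * rho) * z2  # gives corr(x, y) = rho
        if contaminated and rng.random() < eps:
            x *= scale
            y *= scale
        xs.append(x)
        ys.append(y)
    return xs, ys


def spread_of_r(n, rho, contaminated=False, reps=200):
    """Standard deviation of the estimated correlation across `reps` seeds."""
    rs = [pearson_r(*sample_bivariate(n, rho, contaminated, seed=s))
          for s in range(reps)]
    mean = sum(rs) / reps
    return math.sqrt(sum((r - mean) ** 2 for r in rs) / (reps - 1))


if __name__ == "__main__":
    for n, rho, contaminated in [(20, 0.0, False), (200, 0.0, False),
                                 (20, 0.9, False), (20, 0.0, True)]:
        print(n, rho, contaminated, round(spread_of_r(n, rho, contaminated), 3))
```

Comparing the printed spreads across the four settings mirrors the experiments suggested in the text: a larger n or a larger absolute ρ should tighten the estimate, while contamination should loosen it.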
In dimension three, the symmetrically correlated trivariate normal distribution is used. Once again, the effect of the contaminated normal is to increase the variability of the estimated correlation.
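One simple way to generate symmetrically correlated trivariate normal data is a shared-factor construction, sketched below in Python. This is an illustrative assumption, not necessarily how the Demonstration generates its data, and it only covers 0 ≤ ρ ≤ 1, whereas a symmetric correlation matrix in three dimensions is valid for any ρ between -1/2 and 1. The function names are hypothetical.

```python
import math
import random


def sample_trivariate(n, rho, seed=0):
    """Draw n points from a mean-zero, unit-variance trivariate normal in
    which every pair of coordinates has correlation rho.
    Assumption: shared-factor construction, valid only for 0 <= rho <= 1."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this shared-factor sketch needs 0 <= rho <= 1")
    rng = random.Random(seed)
    a = math.sqrt(rho)        # weight of the factor shared by all coordinates
    b = math.sqrt(1.0 - rho)  # weight of each coordinate's own noise
    points = []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        points.append(tuple(a * shared + b * rng.gauss(0.0, 1.0)
                            for _ in range(3)))
    return points


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pairwise_r(points):
    """Estimated correlation for each of the three coordinate pairs."""
    cols = list(zip(*points))
    return {(i, j): pearson_r(cols[i], cols[j])
            for i in range(3) for j in range(i + 1, 3)}


if __name__ == "__main__":
    for pair, r in pairwise_r(sample_trivariate(500, 0.6)).items():
        print(pair, round(r, 3))
```

Each coordinate has variance ρ + (1 - ρ) = 1 and each pair has covariance ρ, so all three estimated correlations should scatter around the same value.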

DETAILS

In the bivariate normal case, this Demonstration provides a dynamic and more accurate visualization of figure 4.5 in [1]. The illusion that the degree of association increases when we zoom out was discussed in [2]. The contaminated normal distribution was proposed as a realistic model for outliers in [3].
[1] D. S. Moore, The Basic Practice of Statistics, New York: W. H. Freeman and Company, 2010.
[2] W. S. Cleveland, P. Diaconis, and R. McGill, "Variables on Scatterplots Look More Highly Correlated When the Scales Are Increased," Science, 216(4550), 1982 pp. 1138–1141.
[3] J. W. Tukey, "A Survey of Sampling from Contaminated Distributions," Contributions to Probability and Statistics (I. Olkin, ed.), Stanford: Stanford University Press, 1960 pp. 448–485.