Hidden Correlation in Regression

This Demonstration simulates a linear regression in which the explanatory variables are independent random variables from a continuous uniform distribution and the error vector is generated from a multivariate normal distribution with mean vector 0 and a covariance matrix that makes the errors correlated. The thumbnail shows the Poincaré plot (a lagged scatterplot) of the reordered residuals from the linear model fit, plotted against their lagged values. The Kendall rank correlation and its two-sided p-value shown in the plot provide a diagnostic test for the presence of hidden correlation. From this residual plot, we clearly see that the errors violate the usual regression assumption of independence. This model misspecification is less obvious in the traditional residual dependency plot.
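The diagnostic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Demonstration's own code: it uses a single covariate, hypothetical true coefficients (intercept 1, slope 2), errors that follow a first-order autoregression along the sorted covariate (one simple way to obtain a multivariate normal error vector with a non-diagonal covariance matrix), lag 1, and the large-sample normal approximation for the Kendall p-value. The function names are made up for this sketch.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall rank correlation with a two-sided p-value from the
    large-sample normal approximation (ties contribute zero)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    tau = s / (n * (n - 1) / 2)
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    return tau, math.erfc(abs(z) / math.sqrt(2))


def simulate(n=200, phi=0.8, seed=1):
    """Assumed setup: x ~ Uniform(0, 1); errors are AR(1) with parameter
    phi along the sorted x values, so neighbours in x are correlated."""
    rng = random.Random(seed)
    x = sorted(rng.random() for _ in range(n))
    e = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        e.append(phi * e[-1] + math.sqrt(1 - phi ** 2) * rng.gauss(0, 1))
    rows = [(xi, 1.0 + 2.0 * xi + ei) for xi, ei in zip(x, e)]
    rng.shuffle(rows)  # hide the ordering, as in data recorded arbitrarily
    return rows


def ols_residuals(rows):
    """Residuals from a least-squares fit of y on x, paired with x."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    b1 = sum((x - mx) * (y - my) for x, y in rows) / sxx
    b0 = my - b1 * mx
    return [(x, y - b0 - b1 * x) for x, y in rows]


def poincare_test(rows):
    """Reorder residuals by x, then test each residual against the next."""
    res = [r for _, r in sorted(ols_residuals(rows))]
    return kendall_tau(res[:-1], res[1:])


if __name__ == "__main__":
    tau, p = poincare_test(simulate())
    print(f"Kendall tau = {tau:.3f}, two-sided p-value = {p:.3g}")
```

Plotting `res[1:]` against `res[:-1]` gives the lagged scatterplot itself. Because consecutive pairs share a residual, the p-value here should be read as approximate.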
  • Contributed by: Ian McLeod and Yun Shi
  • Department of Statistical and Actuarial Sciences, Western University



Snapshot 1: The parameter estimates, their standard errors, and p-values in the fitted regression are shown. Due to model misspecification, the standard errors are too small: the p-value for one coefficient falsely suggests that it is nonzero, while the estimate for the other, with a p-value of about 6%, is borderline.
Snapshot 2: The residual dependency plot is flat, suggesting model adequacy. Looking at this plot more carefully, we do see a nonrandom pattern, but it is less evident than in the Poincaré plot.
Snapshot 3: Comparison of the estimated and theoretical correlation functions. The parameter of the correlation function is estimated by nonlinear least squares.
Snapshots 4-6: In the next three snapshots, one setting is changed and the other settings remain the same. In this case the effect of model misspecification increases and is again detected better in the Poincaré plot than in the residual dependency plot. Both regression parameters are erroneously reported as very significant.
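The claim that the reported standard errors are too small can be checked with a small Monte Carlo experiment. The sketch below rests on assumptions that are not taken from the Demonstration: a single uniform covariate, true intercept and slope both equal to 0, and errors that follow a first-order autoregression with hypothetical parameter `phi` along the sorted covariate. It compares the average standard error reported by ordinary least squares with the actual spread of the slope estimates over repeated simulations.

```python
import math
import random
import statistics


def one_fit(rng, n, phi):
    """Simulate one data set and return (slope estimate, reported SE).
    Assumed setup: x ~ Uniform(0, 1), true slope 0, AR(1) errors along
    the sorted x values."""
    x = sorted(rng.random() for _ in range(n))
    y = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        y.append(phi * y[-1] + math.sqrt(1 - phi ** 2) * rng.gauss(0, 1))
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # usual OLS standard error
    return b1, se


def compare(n=50, phi=0.8, reps=500, seed=2):
    """Return (empirical SD of slope, mean reported SE, share with |t| > 2)."""
    rng = random.Random(seed)
    fits = [one_fit(rng, n, phi) for _ in range(reps)]
    empirical_sd = statistics.stdev([b for b, _ in fits])
    mean_se = statistics.mean([se for _, se in fits])
    reject = sum(abs(b / se) > 2 for b, se in fits) / reps
    return empirical_sd, mean_se, reject


if __name__ == "__main__":
    for phi in (0.0, 0.8):
        sd, se, rej = compare(phi=phi)
        print(f"phi={phi}: empirical SD={sd:.3f}, mean SE={se:.3f}, "
              f"share |t|>2 = {rej:.2f}")
```

With `phi=0.0` the two quantities should roughly agree; with positively correlated errors the reported standard error falls well below the true variability, so a slope whose true value is 0 is declared significant far more often than the nominal rate.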
Residual dependency plots for checking regression fits are discussed in most regression textbooks; see, for example, [1, 2].
Lagged scatterplots are sometimes called Poincaré plots ([3, 4]).
See [5] for further discussion of hidden correlation in regression.
[1] W. S. Cleveland, Visualizing Data, Summit, NJ: Hobart Press, 1993.
[2] S. J. Sheather, A Modern Approach to Regression with R, New York: Springer, 2009.
[3] D. Kaplan and L. Glass, Understanding Nonlinear Dynamics, New York: Springer, 1995.
[4] Wikipedia, "Poincaré plot," (Mar 20, 2013), en.wikipedia.org/wiki/Poincare_plot.
[5] E. Mahdi, Diagnostic Checking, Time Series and Regression, Ph.D. thesis, Western University, http://ir.lib.uwo.ca/etd/244.