Spread-Location Regression Diagnostic Check

The spread-location plot from a linear regression, shown on the left, plots a power transformation of the absolute residuals against the fitted values.
The red line is a nonparametric smoother used to enhance the visualization. The purpose of this plot is to check for possible model misspecification caused by a monotonic change in variance related to the level, as estimated by the fitted values. On the right, a box-whisker chart of the transformed absolute residuals is shown. Adjusting the power transformation parameter to make the distribution of the transformed absolute residuals more symmetric improves the visualization of the spread-location relationship. Some popular software programs fix this parameter at a single value, but this Demonstration illustrates that a fixed value might not always be the best choice. Vary the sample size and the random seed to see the impact of sample size and randomness. See Details for further discussion.
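The quantities behind the plot can be sketched in a few lines of Python. This is only an illustration, not the Demonstration's own code: the function names, the toy data, and the convention that a power of 0 means the log transformation are all assumptions made for the example.

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def power_transform(a, lam):
    """Power transform of a positive value; lam == 0 is taken as the log (assumed convention)."""
    return math.log(a) if lam == 0 else a ** lam

def spread_location(x, y, lam):
    """Return (fitted values, power-transformed absolute residuals)."""
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    spread = [power_transform(abs(yi - fi), lam) for yi, fi in zip(y, fitted)]
    return fitted, spread

# Hypothetical toy data, used only to exercise the functions.
rng = random.Random(1)
x = [rng.random() for _ in range(50)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.3) for xi in x]

# The pairs (fitted[i], spread[i]) are the points of the spread-location plot;
# the values in spread alone are what the box-whisker chart summarizes.
fitted, spread = spread_location(x, y, lam=0.5)
```

Plotting the pairs with any smoother of choice then gives the left-hand panel. The log case fails for a residual that is exactly zero, which this sketch does not guard against.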

SNAPSHOTS

  • [Snapshot]
  • [Snapshot]
  • [Snapshot]
  • [Snapshot]

DETAILS

The spread-location plot was suggested in [1], and the version in this Demonstration was proposed in [2], which used Mathematica to derive the optimal symmetrizing transformation of the absolute residuals for a variety of error distributions.
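A rough numerical counterpart to the symmetrizing idea is to compare the sample skewness of the transformed absolute errors over a few candidate powers. The sketch below is a simulation stand-in, not the analytic derivation of [2]; the standard normal errors, the candidate grid, and the name lam are assumptions chosen for illustration.

```python
import math
import random

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def power_transform(a, lam):
    """Power transform of a positive value; lam == 0 is taken as the log (assumed convention)."""
    return math.log(a) if lam == 0 else a ** lam

# Assumed error distribution: standard normal. Other distributions can be swapped in.
rng = random.Random(0)
errors = [rng.gauss(0.0, 1.0) for _ in range(2000)]
abs_err = [abs(e) for e in errors if e != 0.0]  # drop exact zeros so the log is defined

candidates = [0.0, 0.25, 1 / 3, 0.5, 2 / 3, 1.0]  # hypothetical grid of powers
skews = {
    lam: skewness([power_transform(a, lam) for a in abs_err])
    for lam in candidates
}
best = min(candidates, key=lambda lam: abs(skews[lam]))

for lam in candidates:
    print(f"lam={lam:.3f}  skewness={skews[lam]: .3f}")
print("least skewed candidate:", best)
```

The power with skewness closest to zero is the one for which the box-whisker chart would look most symmetric on this particular sample; a different seed or error distribution can change the answer.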
In this Demonstration, the linear regression is fitted to data generated from a model in which the error term is t-distributed on four degrees of freedom and the explanatory variable is uniformly distributed. The data-generating model is such that the fitted linear regression is misspecified and a log transformation of the response variable is needed. The purpose of the spread-location plot is to detect this type of misspecification. The loess smoother, shown in red, helps to show whether there is a relationship between the variance, as measured by the transformed absolute residuals, and the location, as measured by the fitted values.
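This kind of misspecification can be imitated with a small simulation. The generator below is hypothetical: it assumes that the log of the response is linear in the explanatory variable, and the coefficients, the error scale, and the unit interval for the uniform distribution are illustrative choices, not the settings used in the Demonstration.

```python
import math
import random

def t4(rng):
    """One draw from a t distribution on 4 degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(2.0, 2.0)  # gamma(shape 2, scale 2) is chi-square with 4 df
    return z / math.sqrt(chi2 / 4.0)

def simulate(n, seed, b0=0.0, b1=2.0, sigma=0.5):
    """Hypothetical generator: log(y) is linear in x with scaled t4 errors."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]  # uniform on [0, 1), an assumed interval
    y = [math.exp(b0 + b1 * xi + sigma * t4(rng)) for xi in x]
    return x, y

def ols_residuals(x, y):
    """Fit y = a + b*x by least squares; return (fitted values, residuals)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid

# Fit a straight line to the untransformed response (the misspecified model).
x, y = simulate(n=500, seed=1)
fitted, resid = ols_residuals(x, y)

# Crude spread-versus-location check: compare the mean absolute residual
# in the lower and upper halves of the fitted values.
pairs = sorted(zip(fitted, resid))
half = len(pairs) // 2
low = sum(abs(r) for _, r in pairs[:half]) / half
high = sum(abs(r) for _, r in pairs[half:]) / (len(pairs) - half)
print(f"mean |residual|, lower half of fitted values: {low:.3f}")
print(f"mean |residual|, upper half of fitted values: {high:.3f}")
```

A clear difference between the two printed means is the kind of pattern the smoother in the spread-location plot is meant to reveal. Changing n and seed mimics the sample-size and random-seed controls.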
Snapshot 1: using a log transformation of the absolute residuals improves the visualization in the spread-location plot for the data shown in the thumbnail; the box-whisker chart confirms that the transformed absolute residuals are more symmetrically distributed
Snapshot 2: referring again to the data used in the thumbnail, Snapshot 2 shows that an alternative setting of the power transformation parameter does not work as well
Snapshots 3 and 4: a smaller sample is used; the effect of the skewness of the transformed absolute residuals is less dramatic, and so is the improvement from changing the power transformation parameter
References
[1] W. S. Cleveland, Visualizing Data, Summit, NJ: Hobart Press, 1993.
[2] A. I. McLeod, "Improved Spread-Location Visualization," Journal of Computational and Graphical Statistics, 8(1), 1999 pp. 135–141. doi:10.1080/10618600.1999.10474806.