Spread-Location Regression Diagnostic Check
The spread-location plot from a linear regression, shown on the left, plots the power-transformed absolute residuals against the fitted values. The red line is a nonparametric smoother used to enhance the visualization. The purpose of this plot is to check for possible model mis-specification caused by a monotonic change in variance related to the level, as estimated by the fitted values. On the right, a box-whisker chart of the power-transformed absolute residuals is shown. Adjusting the power transformation parameter to make the distribution of the transformed absolute residuals more symmetric improves the visualization of the spread-location relationship. Some popular software programs fix the power transformation parameter, but this Demonstration illustrates that a fixed value might not always be the best choice. Vary the sample size and the random seed to see the impact of sample size and randomness. See Details for further discussion.
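The quantities behind the two panels can be sketched in a few lines of Python. This is a minimal illustration, not the Demonstration's own code: the function names are hypothetical, the regression is restricted to one explanatory variable, and treating a power of 0 as the log transformation is an assumption based on the usual limiting-case convention. Sample skewness stands in for the visual symmetry check that the box-whisker chart provides.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def power_transform(values, lam):
    """Return |v|**lam for each v; lam == 0 is treated as log|v| (assumed convention).

    A residual that is exactly zero cannot be log-transformed.
    """
    if lam == 0:
        return [math.log(abs(v)) for v in values]
    return [abs(v) ** lam for v in values]

def skewness(values):
    """Moment-based sample skewness; 0 for a perfectly symmetric sample."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def spread_location(x, y, lam):
    """Return (fitted values, power-transformed absolute residuals)."""
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, power_transform(residuals, lam)
```

Plotting the second returned list against the first gives the points of the spread-location plot. Comparing `skewness` of the transformed residuals across several values of `lam` is one rough way to choose a power that makes their distribution more symmetric.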
Contributed by: Ian McLeod (December 2013)
Open content licensed under CC BY-NC-SA
The spread-location plot was suggested in [1] and the version in this Demonstration in [2], which used Mathematica to derive the optimal symmetrizing transformation of the absolute residuals for a variety of error distributions.
In this Demonstration, the linear regression is fitted to generated data in which the error term is t-distributed on four degrees of freedom and the explanatory variable is uniformly distributed. With data generated this way, the linear regression model is mis-specified and a log transformation of the response variable is needed. The purpose of the spread-location plot is to detect this type of mis-specification. The loess smoother, shown in red, helps to show whether there is a relationship between the variance, as measured by the power-transformed absolute residuals, and the location, as measured by the fitted values.
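The kind of mis-specification described above can be imitated with a small simulation. The sketch below is an assumption-laden illustration, not the Demonstration's generator: the exponential form of the response, the coefficients `b0`, `b1`, the scale `sigma`, and the interval (0, 1) for the explanatory variable are all hypothetical choices, picked only so that a log transformation of the response yields a correctly specified linear model. Only the t-distributed errors on four degrees of freedom and the uniform explanatory variable come from the text.

```python
import math
import random

def t_variate(rng, df):
    """Draw from Student's t as Z / sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(chi2 / df)

def simulate(n, seed, b0=1.0, b1=2.0, sigma=0.5):
    """Hypothetical generator: y = exp(b0 + b1*x + sigma*e), e ~ t(4), x ~ U(0, 1)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    y = [math.exp(b0 + b1 * xi + sigma * t_variate(rng, 4)) for xi in x]
    return x, y

def ols_line(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    suv = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    suu = sum((a - ubar) ** 2 for a in u)
    svv = sum((b - vbar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def spread_location_trend(x, y, lam):
    """Correlation between |residual|**lam and the fitted values (lam > 0)."""
    a, b = ols_line(x, y)
    fitted = [a + b * xi for xi in x]
    spread = [abs(yi - fi) ** lam for yi, fi in zip(y, fitted)]
    return pearson(spread, fitted)

x, y = simulate(n=200, seed=1)
raw_trend = spread_location_trend(x, y, lam=0.5)                         # mis-specified fit
log_trend = spread_location_trend(x, [math.log(v) for v in y], lam=0.5)  # log response
```

Under these assumptions, a clearly nonzero `raw_trend` would point to spread changing with level, while `log_trend` would be expected to sit closer to zero. A correlation is only a crude numerical stand-in for the loess smoother in the plot, which can also reveal non-monotonic patterns.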
Snapshot 1: using a log transformation improves the visualization in the plot of the transformed absolute residuals versus the fitted values for the data shown in the thumbnail; the box-whisker chart confirms that the transformed absolute residuals are more symmetrically distributed
Snapshot 2: referring again to the data used in the thumbnail, Snapshot 2 shows that another value of the power transformation parameter does not work as well
Snapshots 3 and 4: a smaller sample is used; the effect of the skewness of the transformed absolute residuals is less dramatic, and so is the improvement from changing the power transformation
[1] W. S. Cleveland, Visualizing Data, Summit, NJ: Hobart Press, 1993.
[2] A. I. McLeod, "Improved Spread-Location Visualization," Journal of Computational and Graphical Statistics, 8(1), 1999 pp. 135–141. doi:10.1080/10618600.1999.10474806.