Nonparametric Curve Estimation by Smoothing Splines: Unbiased-Risk-Estimate Selector and its Robust Version via Randomized Choices

This Demonstration considers a simple nonparametric regression problem: how to recover a function $f$ of one variable, here over a fixed interval, when only $n$ pairs $(x_i, y_i)$ are known, for $i = 1, \dots, n$, that satisfy the model $y_i = f(x_i) + \sigma \varepsilon_i$, where $\sigma$ is the noise magnitude and the $\varepsilon_i$ are independent, standard normal random variables. For simplicity, assume that the variance $\sigma^2$ is also known.
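The data model above can be simulated with a few lines of standard-library Python. This is only a sketch: the interval $[0, 1]$, the uniformly drawn design, the test function `example_f`, and the helper name `simulate_data` are illustrative assumptions, not taken from the Demonstration.

```python
import math
import random

def simulate_data(f, n, sigma, seed=0):
    """Return (x, y) with y_i = f(x_i) + sigma * eps_i, eps_i ~ N(0, 1).

    Assumption: the design points are sorted uniform draws on [0, 1),
    which gives a design that is not regularly spaced.
    """
    rng = random.Random(seed)
    x = sorted(rng.random() for _ in range(n))
    y = [f(xi) + sigma * rng.gauss(0.0, 1.0) for xi in x]
    return x, y

def example_f(t):
    # Hypothetical underlying function, chosen only for illustration.
    return math.sin(2.0 * math.pi * t)

x, y = simulate_data(example_f, n=50, sigma=0.2, seed=1)
```

Setting `sigma=0.0` returns the noise-free values of the function at the design points, which is a convenient sanity check.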
The setting is the same as in [1] except that the $x_i$ (the design) are not regularly spaced, and this Demonstration uses the well-known smoothing spline method instead of kernel smoothers (allowing fast computations; see notably the recent forum thread [2], where useful code is provided). In place of a bandwidth, a good value has to be chosen for the smoothing parameter, denoted by $\lambda$. A very small $\lambda$ produces a quasi-interpolation of the data, and a very large $\lambda$ yields the polynomial regression fit, here of degree 1 since classical cubic splines are considered. A very popular way to choose $\lambda$ is to try several values, compute Mallows's $C_L$ criterion for each one, and retain the $\lambda$ that yields the minimal $C_L$ (as in [1], where $C_L$ is denoted UBR since it is an unbiased risk estimate of the global prediction error).
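For reference, the cubic smoothing spline and the criterion just described can be written as follows. The notation is an assumption made here for illustration: $\hat f_\lambda$ is the spline fit, $A_\lambda$ is the $n \times n$ matrix mapping the data vector $y$ to the fitted values, and normalizations of the penalty vary between authors.

```latex
% Cubic smoothing spline: penalized least squares over smooth functions g
\hat f_\lambda \;=\; \arg\min_{g}\;
  \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - g(x_i)\bigr)^2
  \;+\; \lambda \int g''(t)^2 \, dt

% Mallows's C_L (UBR) for the linear smoother \hat y = A_\lambda y, with known \sigma^2
C_L(\lambda) \;=\; \frac{1}{n}\,\bigl\| (I - A_\lambda)\, y \bigr\|^2
  \;+\; \frac{2\sigma^2}{n}\,\operatorname{tr}(A_\lambda) \;-\; \sigma^2

% Unbiasedness for the prediction risk, with f = (f(x_1), \dots, f(x_n))^T
\mathbb{E}\, C_L(\lambda) \;=\; \frac{1}{n}\,\mathbb{E}\,\bigl\| A_\lambda y - f \bigr\|^2
```

The penalty term explains the two limits mentioned above: as $\lambda \to 0$ only the data-fit term matters, and as $\lambda \to \infty$ the fit is forced toward $g'' = 0$, that is, a straight line.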
It is frequently observed that the $C_L$ criterion, as a function of the smoothing parameter, may be rather flat around its minimum (this is also true for the similar GCV criterion). In such a case, even if the global prediction error is itself similarly flat (so that the impact on the predictive quality of the fit may be weak), minimizing $C_L$ can produce a $\lambda$ that is too small, where "too small" means that spurious oscillations (which could be wrongly interpreted as real peaks) are present in the final estimate of $f$.
See [3] for a recent review of several approaches to remedy this problem. Recall that Mallows emphasized, in his original paper, that a careful examination of the whole $C_L$ curve should be preferred to a blind minimization of the pure $C_L$ (or GCV) criterion.
In this Demonstration, we have implemented the randomization-based method introduced in [4, section 7.2], which permits computing a "more parsimonious yet 'near-optimal' fit". Such a fit is parameterized by a percentile, which determines an upward modification of the original $C_L$ choice of $\lambda$.
As in [3], this parameterized modification is called the "robust choice" corresponding to that percentile.
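The following standard-library sketch illustrates one possible reading of such a percentile-based upward modification; it is not confirmed to be the procedure of [4]. Assumptions made here: a Whittaker-type discrete smoother with a second-divided-difference penalty stands in for the cubic smoothing spline; each randomized criterion replaces $\operatorname{tr}(A_\lambda)$ in $C_L$ by the Monte Carlo estimate $n\, w^T A_\lambda w / w^T w$ with Gaussian $w$; and the robust choice is taken as the given percentile of the randomized minimizers, never below the plain $C_L$ choice. All function names are hypothetical.

```python
import math
import random

def invert(M):
    """Gauss-Jordan inverse with partial pivoting (fine for small n)."""
    n = len(M)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def hat_matrix(x, lam):
    """A = (I + lam * D^T D)^{-1}, with D the second divided differences on x."""
    n = len(x)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 2):
        a, b, c = x[j], x[j + 1], x[j + 2]
        row = (2.0 / ((b - a) * (c - a)),
               -2.0 / ((b - a) * (c - b)),
               2.0 / ((c - b) * (c - a)))
        for p in range(3):
            for q in range(3):
                M[j + p][j + q] += lam * row[p] * row[q]
    return invert(M)

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def robust_lambda(x, y, sigma, lambdas, percentile, draws=50, seed=0):
    """Return (plain C_L choice, percentile-based upward-modified choice)."""
    rng = random.Random(seed)
    n = len(y)
    hats = [hat_matrix(x, lam) for lam in lambdas]
    rss = [sum((yi - fi) ** 2 for yi, fi in zip(y, matvec(A, y))) for A in hats]

    def pick(traces):
        # Minimize C_L over the grid for the supplied trace values.
        scores = [r / n + 2.0 * sigma ** 2 * t / n - sigma ** 2
                  for r, t in zip(rss, traces)]
        return lambdas[scores.index(min(scores))]

    lam_cl = pick([sum(A[i][i] for i in range(n)) for A in hats])
    randomized = []
    for _ in range(draws):
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ww = sum(v * v for v in w)
        # Randomized trace estimate, same w for every lambda on the grid.
        traces = [n * sum(wi * ai for wi, ai in zip(w, matvec(A, w))) / ww
                  for A in hats]
        randomized.append(pick(traces))
    randomized.sort()
    k = min(draws - 1, int(percentile / 100.0 * draws))
    return lam_cl, max(lam_cl, randomized[k])

# Illustrative data: jittered (irregular) design on [0, 1], hypothetical f.
rng = random.Random(1)
n, sigma = 30, 0.3
x = [(i + rng.uniform(0.25, 0.75)) / n for i in range(n)]
y = [math.sin(2.0 * math.pi * t) + sigma * rng.gauss(0.0, 1.0) for t in x]
lambdas = [10.0 ** (k / 2.0) for k in range(-16, 1)]
lam_cl, lam_rob = robust_lambda(x, y, sigma, lambdas, percentile=90)
```

By construction `lam_rob` is never smaller than `lam_cl`, which mirrors the "upward modification" described above; the grid of candidate values and the number of random draws are arbitrary choices for this sketch.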
By playing with this Manipulate, with various underlying functions $f$, you can observe that the results are often very satisfactory for a large range of $\sigma$ values (the noise magnitude), in the sense that the spurious oscillations mentioned above are almost always eliminated. Furthermore, the percentile is rather easy to choose since, very often, several different percentile values yield a quite similar (at least visually) final estimate of $f$.




[1] D. A. Girard, "Nonparametric Curve Estimation by Kernel Smoothers: Efficiency of Unbiased Risk Estimate and GCV Selectors," from the Wolfram Demonstrations Project—A Wolfram Web Resource. (Jan 9, 2013) demonstrations.wolfram.com/NonparametricCurveEstimationByKernelSmoothersEfficiencyOfUnb.
[2] jojosthegreat, "Implementation of Smoothing Splines Function," Mathematica Stack Exchange. (Sep 5, 2017) mathematica.stackexchange.com/questions/33206/implementation-of-smoothing-splines-function/33262.
[3] M. A. Lukas, F. R. de Hoog and R. S. Anderssen, "Practical Use of Robust GCV and Modified GCV for Spline Smoothing," Computational Statistics, 31(1), 2016 pp. 269–289. doi:10.1007/s00180-015-0577-7.
[4] D. A. Girard, "Estimating the Accuracy of (Local) Cross-Validation via Randomised GCV Choices in Kernel or Smoothing Spline Regression," Journal of Nonparametric Statistics, 22(1), 2010 pp. 41–64. doi:10.1080/10485250903095820.
© 2017 Wolfram Demonstrations Project & Contributors