Nonparametric Additive Modeling by Smoothing Splines: Robust Unbiased-Risk-Estimate Selector and a Nonisotropic-Smoothing Improvement
This Demonstration is dedicated to the memory of François de Crécy.
An earlier series of Demonstrations [1–3] concerned the well-known cross-validation approach to the optimal estimation of smooth univariate regression functions. Notably,  analyzes, on simple examples, how one can construct good confidence statements (asymptotically justified) about the -optimal amount of smoothing, and it is demonstrated by  that the selector (a variant of cross-validation) can be easily robustified via randomized choices. The present Demonstration discusses an extension of the latter robustification method to a simple bivariate context. Specifically, the regression function is modeled by additive cubic smoothing splines and computed using the well-known backfitting algorithm. Furthermore, this Demonstration introduces a natural nonisotropy in the smoothing operators.
In detail, this Demonstration considers the estimation of a smooth additive function of two variables, here over . Only triplets () are known for , which satisfy the model , where the are independent, standard normal random variables. For simplicity, we also assume that the noise variance is known.
In a first step, solve
for all the in a grid of equally spaced (in log scale) trial values. Recall that assuming a standard mean-zero restriction on each of the components and (see ) for any , generically gives a unique solution , denoted by , the being omitted when there is no ambiguity, where the two components and are cubic spline functions.
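For orientation, a penalized least-squares criterion of the standard kind for additive cubic smoothing splines, with one common smoothing parameter, can be written as below. This is only a sketch: the notation ($y_i$, $x_{i1}$, $x_{i2}$, $\alpha$, $f_1$, $f_2$, $\lambda$, $n$) is assumed here for illustration, following the textbook form in Chapter 9.1.1 of Hastie, Tibshirani and Friedman, and the normalization may differ from that of criterion (1).

```latex
\min_{\alpha,\, f_1,\, f_2}\;
\sum_{i=1}^{n} \bigl( y_i - \alpha - f_1(x_{i1}) - f_2(x_{i2}) \bigr)^2
\;+\; \lambda \left( \int f_1''(t)^2 \, dt + \int f_2''(t)^2 \, dt \right),
\qquad
\sum_{i=1}^{n} f_1(x_{i1}) = \sum_{i=1}^{n} f_2(x_{i2}) = 0 .
```

The two integrals are the smoothness energies, and the constraints on the right express the mean-zero restriction on each component.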
Of course, giving the same weight (here ) to the two integrals of criterion (1) (the two so-called smoothness energies) is rather arbitrary, and there is room for improvement even without introducing differing weights, which would complicate their robust selection by percentiles of randomized unbiased-risk estimators.
As a second step, introduce a dilation of one of the two explanatory variables, say , and denoting by the dilation factor, write . At the new observation sites, we consider the second problem:
We solve this problem again for a grid of values, after having tuned by the simple formula (see the Details section for its justification):
where and are the two components given by , with selected by the robust method corresponding to the percentile. (This is similar to the univariate context in , except that here, the randomized degrees of freedom are drawn from a list that has been previously computed.)
Let be the minimizer of , where is selected, among the grid of values mentioned above, by a fast robust method. "Fast" means here that instead of precisely computing a percentile, we simply take the fifth largest among six randomized choices. This Demonstration displays the two smooth components , taking into account the tuned dilation that was used, denoted by .
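The "fast" rule can be sketched in a few lines. The function name and the example values below are hypothetical, and the sketch does not depend on which quantity the six randomized choices are expressed in; it only shows the order-statistic step, where the fifth largest of six values is the same as the second smallest.

```python
def fifth_largest_of_six(choices):
    """Return the fifth largest of exactly six randomized choices.

    This replaces an exact percentile computation by a cheap order
    statistic (the fifth largest of six is the second smallest).
    """
    if len(choices) != 6:
        raise ValueError("expected exactly six randomized choices")
    return sorted(choices, reverse=True)[4]


# Hypothetical values standing in for six randomized selections.
example_choices = [0.8, 0.1, 0.5, 0.3, 1.2, 0.4]
picked = fifth_largest_of_six(example_choices)
print(picked)
```

Only six randomized selections are needed, so the cost is six runs of the selector rather than the many runs an accurate percentile would require.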
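Refer to Chapter 9.1.1 of Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, for a recent detailed presentation of the additive modeling of multivariate regression functions and of the backfitting algorithm that implements it for a given value of the smoothing parameter.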
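As a rough illustration of backfitting for two components, here is a minimal sketch. It is not the implementation behind this Demonstration: instead of cubic smoothing splines it uses a discrete second-difference roughness penalty as a stand-in smoother, and all names (`roughness_smoother`, `backfit`, `lam`) and the simulated data are assumptions made for the example.

```python
import numpy as np


def roughness_smoother(x, lam):
    """Smoother matrix of a discrete second-difference penalty.

    Stand-in for a cubic smoothing spline: points are ordered by x and the
    squared second differences of the fitted values are penalized (the
    spacing between the x values is ignored in this sketch).
    """
    n = len(x)
    order = np.argsort(x)
    D = np.diff(np.eye(n), n=2, axis=0)           # (n-2) x n second differences
    S_sorted = np.linalg.inv(np.eye(n) + lam * D.T @ D)
    P = np.eye(n)[order]                          # P @ v == v[order]
    return P.T @ S_sorted @ P                     # acts on data in original order


def backfit(x1, x2, y, lam, max_iter=200, tol=1e-10):
    """Backfitting for y ~ alpha + f1(x1) + f2(x2) with a common lam."""
    S1, S2 = roughness_smoother(x1, lam), roughness_smoother(x2, lam)
    alpha = y.mean()
    f1 = np.zeros_like(y)
    f2 = np.zeros_like(y)
    for _ in range(max_iter):
        f1_new = S1 @ (y - alpha - f2)            # smooth partial residuals on x1
        f1_new -= f1_new.mean()                   # mean-zero restriction
        f2_new = S2 @ (y - alpha - f1_new)        # smooth partial residuals on x2
        f2_new -= f2_new.mean()
        change = max(np.abs(f1_new - f1).max(), np.abs(f2_new - f2).max())
        f1, f2 = f1_new, f2_new
        if change < tol:
            break
    return alpha, f1, f2


# Simulated additive data (assumed for illustration only).
rng = np.random.default_rng(0)
n = 80
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
y = np.sin(2 * np.pi * x1) + (x2 - 0.5) ** 2 + 0.1 * rng.standard_normal(n)

alpha, f1, f2 = backfit(x1, x2, y, lam=5.0)
rss = np.sum((y - alpha - f1 - f2) ** 2)
tss = np.sum((y - y.mean()) ** 2)
print(rss < tss)
```

Each pass smooths the partial residuals against one variable while holding the other component fixed, which is the cycle described in the cited chapter; a cubic smoothing spline smoother could be substituted for `roughness_smoother` without changing the loop.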
The first-step estimates use a robustified selector of proposed in  (and evaluated in ). Possibly, the methods discussed in  could have been used as well.
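To make the idea of one randomized unbiased-risk selection concrete, here is a minimal sketch under stated assumptions. It uses a Mallows-type unbiased risk estimate with known noise variance, and replaces the trace of the smoother matrix by the randomized estimate $w^\top A_\lambda w$ with $w$ standard normal. The univariate stand-in smoother `fit`, the names and the simulated data are hypothetical, and the exact estimator used in the cited papers may differ.

```python
import numpy as np


def fit(lam, v):
    """Hypothetical linear smoother A_lam applied to v (second-difference penalty)."""
    n = len(v)
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, v)


def randomized_risk_choice(fit, y, sigma2, lam_grid, rng):
    """Pick the lam in lam_grid minimizing a randomized unbiased-risk estimate."""
    n = len(y)
    w = rng.standard_normal(n)            # one random probe vector per selection
    scores = []
    for lam in lam_grid:
        resid = y - fit(lam, y)
        trace_est = w @ fit(lam, w)       # E[w' A w] = trace(A) for standard normal w
        scores.append(resid @ resid / n + 2.0 * sigma2 * trace_est / n - sigma2)
    return lam_grid[int(np.argmin(scores))]


# Simulated data with known noise variance 0.09 (assumed for illustration).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(60)

lam_grid = np.logspace(-2, 4, 25)         # trial values equally spaced in log scale
lam_hat = randomized_risk_choice(fit, y, 0.09, lam_grid, rng)
print(lam_hat)
```

Repeating the call with fresh random vectors gives a set of randomized choices, from which a percentile (or a cheap order statistic) can then be taken.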
In brief, the idea behind our proposed tuning formula (3) for is that the two smoothness energies occurring in should be roughly equal at the optimum, in the same way that the two terms of the risk (namely the squared bias and the variance) have a similar order when is well chosen. Precisely, the suggestion here is simply, in a second step, to choose so that:
where and are the first-step estimates, that is, the solution of where is given, as mentioned above, by the robust method. It is easy to see that the selected is then simply given by formula (3).
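The scaling argument behind this kind of formula can be sketched as follows. The notation is assumed for illustration only: the second variable is the one dilated, $\tilde{x}_2 = c\,x_2$ with dilation factor $c > 0$, $E_1$ and $E_2$ denote the smoothness energies of the first-step components $f_1$ and $f_2$, and $g_2(\tilde{x}_2) = f_2(\tilde{x}_2 / c)$ is the second component expressed in the dilated variable. If the other variable is dilated, or the factor is defined reciprocally, the ratio below is inverted.

```latex
\begin{align*}
g_2''(\tilde{x}_2) &= c^{-2}\, f_2''(\tilde{x}_2 / c), \\
\int g_2''(\tilde{x}_2)^2 \, d\tilde{x}_2
  &= c^{-4} \int f_2''(\tilde{x}_2 / c)^2 \, d\tilde{x}_2
   = c^{-3} \int f_2''(x_2)^2 \, dx_2
   = c^{-3} E_2, \\
c^{-3} E_2 = E_1
  \quad &\Longrightarrow \quad
  c = \left( \frac{E_2}{E_1} \right)^{1/3},
  \qquad E_j = \int f_j''(t)^2 \, dt .
\end{align*}
```

Under these assumptions, a dilation changes a smoothness energy by the factor $c^{-3}$, so equalizing the two energies amounts to taking a cube root of their ratio.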
It was François de Crécy, a dear friend and colleague over the past 25 years while he was a researcher at CEA (Commissariat à l'énergie atomique et aux énergies alternatives), who had suggested to me that for a more general multivariate additive model (i.e. the natural extension with explanatory variables, whereas in this Demonstration ), the dilation factors (those of ) could be very simply tuned by harmonizing, in a second step, all the smoothness energies in the penalization term (the one that generalizes the penalization term of ), each computed on the first-step estimate . Of course, the procedure could be iterated (i.e. a third step would determine new dilation factors by harmonizing the smoothness energies computed on the second-step estimate etc.), but such a surface estimator has not yet been evaluated.
 T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., New York: Springer, 2009.
 D. A. Girard, "Estimating the Accuracy of (Local) Cross-Validation via Randomised GCV Choices in Kernel or Smoothing Spline Regression," Journal of Nonparametric Statistics, 22(1), 2010 pp. 41–64. doi:10.1080/10485250903095820.
 M. A. Lukas, F. R. de Hoog and R. S. Anderssen, "Practical Use of Robust GCV and Modified GCV for Spline Smoothing," Computational Statistics, 31(1), 2016 pp. 269–289. doi:10.1007/s00180-015-0577-7.