Estimating the Local Mean Function
There is an extensive literature on nonlinear regression and nonparametric estimation techniques. We refer the reader to the notable book by Fan and Yao and to the extensive work by Bjerve and Doksum.

Roughly speaking, the procedure to estimate the local mean function can be implemented in the following fashion. First, take a set of evenly spaced design points over an interior interval of the empirical support of the covariate. Then, at each design point, solve a kernel-weighted least squares problem to locally fit a polynomial of a chosen order. (In this Demonstration, the local fit is parabolic.)

By "kernel-weighted", we mean that the data are weighted according to the Epanechnikov kernel $K(u) = \tfrac{3}{4}(1 - u^2)$ for $|u| \le 1$ (and zero otherwise), where $u = (x - x_0)/h$, $x_0$ is a design point, and $h$ is the "kernel bandwidth". The variable $h$ effectively controls the amount of nearby data that are permitted to influence the estimate of the mean function (and its derivatives) locally. There are a variety of techniques and heuristics available to choose $h$. You can vary the size of the bandwidth. Smaller bandwidths may expose too many of the local features of the data, whereas larger bandwidths oversmooth the data.

By solving a kernel-weighted least squares regression at each design point, we obtain an estimate of the value of the mean function and its first two derivatives at each design point. We then have all the information we need to fit a spline.

The "cubic" set of data is a simulated set formed by generating 600 realizations of the covariate with a standard normal distribution and then simulating the corresponding responses, with noise terms that are independently simulated standard normal random variables.

The "sine" set of data is another simulated set, formed by first generating 600 realizations of the covariate with a uniform distribution over an interval. We then obtain simulations of the response according to a rule whose noise terms are independently simulated standard normal random variables.

The "baseball" data consist of performance data for all regular major league baseball players during the 1999 baseball season.
We compute each player's overall proportion of hits to at-bats and the proportion of hits to at-bats when there is a teammate in scoring position. The two proportions are obviously positively correlated, but the nonlinear regression model offers a potentially more useful fit than the usual linear regression model. These data were taken from John Rasp's website.

"Estimating the Local Mean Function" from The Wolfram Demonstrations Project http://demonstrations.wolfram.com/EstimatingTheLocalMeanFunction/ Contributed by: Jeff Hamrick
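
The kernel-weighted least squares step described above can be sketched in Python with NumPy. This is only a minimal illustration and not the Demonstration's own code: the function names (`epanechnikov`, `local_quadratic_fit`, `estimate_on_grid`), the 10% trimming used to pick an interior interval, the bandwidth value, and the cubic simulation rule in the usage lines are all assumptions made for the example.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def local_quadratic_fit(x, y, x0, h):
    """Kernel-weighted least squares fit of a parabola centred at x0.

    Returns estimates of the mean function and of its first and second
    derivatives at the design point x0, using bandwidth h.
    """
    d = np.asarray(x, dtype=float) - x0
    w = epanechnikov(d / h)
    # Columns are powers of (x - x0), so the intercept estimates the
    # function value at x0 and the slope estimates the first derivative.
    design = np.column_stack([np.ones_like(d), d, d**2])
    # Weighted least squares via ordinary least squares on sqrt-weighted rows.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], np.asarray(y) * sw, rcond=None)
    # The quadratic coefficient is half of the second derivative.
    return beta[0], beta[1], 2.0 * beta[2]


def estimate_on_grid(x, y, h, n_points=25, trim=0.1):
    """Fit at evenly spaced design points over an interior interval of x.

    The interior interval is taken between the trim and 1 - trim empirical
    quantiles (an assumption; the source does not say how it is chosen).
    """
    lo, hi = np.quantile(x, [trim, 1.0 - trim])
    grid = np.linspace(lo, hi, n_points)
    est = np.array([local_quadratic_fit(x, y, x0, h) for x0 in grid])
    return grid, est  # est columns: value, first derivative, second derivative


# Usage on simulated data. The rule y = x^3 + noise is a hypothetical
# stand-in, not necessarily the rule used for the Demonstration's "cubic" set.
rng = np.random.default_rng(0)
x = rng.standard_normal(600)
y = x**3 + rng.standard_normal(600)
grid, est = estimate_on_grid(x, y, h=0.75)
```

Each row of `est` holds the estimated value and first two derivatives at the matching entry of `grid`, which is the information the text says is needed to fit a spline. Rerunning `estimate_on_grid` with a smaller or larger `h` shows the bandwidth trade-off described above.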