
Estimating a Centered Ornstein-Uhlenbeck Process under Measurement Errors

The problem of estimating the two parameters of a stationary process $X_t$ satisfying the differential equation $dX_t = -\theta X_t\,dt + \sigma\,dW_t$, where $W_t$ follows a standard Wiener process, from $n$ observations at equidistant points of a fixed interval $[0, T]$, has been well studied. This is also the classical problem of fitting an autoregressive time series of order 1 (AR1), the case "$n$ large" yielding the "near unit root" situation. This Demonstration considers the important case where the observations may have additive measurement errors: we assume that these errors are independent, normal random variables with mean zero and known variance $\sigma_N^2$.
Recall that $\theta$, assumed positive, is often referred to as the mean reversion speed (here the mean of the process is assumed to be 0). In geostatistics $\theta$ is called the inverse-range parameter. It is well known that the autoregression coefficient in the equivalent AR1 formulation is given by $e^{-\theta\delta}$, where $\delta$ is the spacing between consecutive observation points.
Here we use the two parameters $\sigma^2$ (the diffusion coefficient) and $\theta$ (recall that $b = \sigma^2/(2\theta)$ is then the marginal variance of the process; see the Details section in the help page for the OrnsteinUhlenbeckProcess function). We restrict ourselves to the case $b = 1$ (so that the noise variance $\sigma_N^2$ is also the noise-to-signal ratio).
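The sampling model described so far can be sketched in a few lines of Python. This is only an illustration, not the Demonstration's own code: the function name `simulate_noisy_ou`, its argument names, and the choice of spacing `length / n` are assumptions made here. It uses the exact AR1 recursion with coefficient exp(-theta * delta) and then adds independent normal measurement errors.

```python
import math
import random

def simulate_noisy_ou(n, theta, noise_var, marginal_var=1.0, length=1.0, seed=0):
    """Return (x, y): a centered stationary OU path at n equidistant points
    and the same path with additive independent normal measurement errors.

    Assumption of this sketch: consecutive points are length / n apart.
    """
    rng = random.Random(seed)
    delta = length / n
    rho = math.exp(-theta * delta)          # AR1 autoregression coefficient
    innov_sd = math.sqrt(marginal_var * (1.0 - rho * rho))
    noise_sd = math.sqrt(noise_var)

    # Start from the stationary distribution, then apply the AR1 recursion.
    x = [rng.gauss(0.0, math.sqrt(marginal_var))]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, innov_sd))

    # Noisy observations: underlying value plus measurement error.
    y = [xi + rng.gauss(0.0, noise_sd) for xi in x]
    return x, y

x, y = simulate_noisy_ou(n=200, theta=10.0, noise_var=0.01)
```

With `marginal_var=1.0`, the argument `noise_var` plays the role of the noise-to-signal ratio, and the diffusion coefficient of the simulated path is `2 * theta`.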
A simple "solution" to this fitting problem is to neglect the noise, that is, to take the most appealing estimator among those available for the non-noisy case and apply it to the noisy observations, as was studied in [2]. Here as "most appealing" we choose the celebrated maximum likelihood (ML) estimator. Indeed, it is known that this estimator can be exactly and reliably calculated by first solving a simple cubic equation in the autoregression coefficient (see [3] and the references therein), the ML estimate of the marginal variance being then an explicit "Gibbs energy" (a quadratic form whose computation cost is of order $n$, the number of observations).
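To make the linear cost of the "Gibbs energy" concrete, here is a minimal Python sketch. It assumes the Gibbs energy is the normalized quadratic form (1/n) y' R^{-1} y, where R is the AR1 correlation matrix with entries rho^|i-j|; the normalization by n and the name `gibbs_energy` are assumptions of this sketch. Because the process is Markov, the quadratic form splits into the first squared observation plus a sum of squared one-step prediction errors, so no matrix has to be formed or inverted.

```python
def gibbs_energy(y, rho):
    """Return (1/n) * y' R^{-1} y for R[i][j] = rho**abs(i - j), with |rho| < 1.

    Uses the Markov factorization of the AR1 correlation matrix:
    y' R^{-1} y = y[0]**2 + sum_i (y[i] - rho*y[i-1])**2 / (1 - rho**2),
    which costs O(n) operations.
    """
    n = len(y)
    scale = 1.0 - rho * rho
    total = y[0] ** 2
    for prev, cur in zip(y, y[1:]):
        total += (cur - rho * prev) ** 2 / scale
    return total / n

energy = gibbs_energy([1.0, 2.0], 0.5)
```

For two observations this can be checked by hand against the explicit 2 by 2 inverse: (1 + 4 - 2 * 0.5 * 1 * 2) / 0.75 = 4, and dividing by n = 2 gives 2.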
On the other hand, as soon as the noise variance $\sigma_N^2$ is positive, the exact maximization of the correctly specified likelihood criterion (the one that takes the noise into account) is not so easy.
This Demonstration considers the recently proposed "CGEM-EV" approach [1]. In short, the marginal variance $b$ of the process is first estimated simply by the bias-corrected empirical variance, say $\hat{b}$; an estimating equation is then invoked to estimate the inverse-range parameter $\theta$. Precisely, $\theta$ is searched so that the conditional mean of the "candidate Gibbs energy" (where we substitute $\hat{b}$ in place of the true $b$ so that this conditional mean is a function of $\theta$ only) is equal to $\hat{b}$. It is easy to show that these two equations are unbiased, that is, they are true on average when $b$ and $\theta$ are set to their true values (the averaging is ensemble-averaging, i.e. from infinitely repeated simulations of the process and of the noise under the true model). Stronger properties are studied in [1].
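The first CGEM-EV step can be illustrated with a short Python sketch. It rests on one natural reading of "bias-corrected empirical variance", which may differ in detail from the definition in [1]: since the process is centered and the noise is independent of it with known variance, the mean of the squared observations has expectation equal to the marginal variance plus the noise variance, so subtracting the known noise variance removes the bias. The function name is hypothetical, and the second step (the estimating equation for the inverse-range parameter) is not sketched here.

```python
def bias_corrected_variance(y, noise_var):
    """Estimate the marginal variance of a centered process from noisy data.

    Assumption of this sketch: the mean is known to be zero, so no sample
    mean is subtracted; the known noise variance is removed from the mean
    of the squared observations.
    """
    n = len(y)
    return sum(v * v for v in y) / n - noise_var

b_hat = bias_corrected_variance([1.0, -1.0, 2.0, -2.0], noise_var=0.5)
```

Note that with few observations or a large noise variance this estimate can come out negative, which a practical implementation would need to guard against.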
Implementation of CGEM-EV is much simpler than exact ML, since it reduces to one-dimensional numerical root finding. A simple fixed-point algorithm is used here. It proves to be reliable (with fast convergence) for all the settings in this Demonstration.

SNAPSHOTS

  • [Snapshot]
  • [Snapshot]
  • [Snapshot]

DETAILS

Snapshot 1: Selecting as the true diffusion coefficient (the value of from which the non-noisy data is simulated, being fixed) and choosing , this setting may be thought of as "close" to a case where the noise could be forgotten; this could be confirmed by moving only from to (the underlying is then unchanged but the measurement errors are eliminated) and observing that all the estimates are almost unchanged (up to three digits). Concerning the diffusion coefficient, you can observe that the two estimation methods produce very close results. Such closeness is less pronounced for the two estimates of the variance. By changing the seed (and thus a new and new measurement errors used to generate the data) you can be convinced that this is not an accident. Furthermore, a rather large variability, from seed to seed, for the two estimates of the variance is also observed; it is much larger than the variability of the estimates of the diffusion coefficient; notice that this observation is well in agreement with the known theory about the estimation of the variance, the inverse-range parameter, and their product (see [1] and the references therein). By moving from seed to seed, neither of the two methods seems a clear winner in this "small noise" setting. Let us now consider a higher noise level. Another important point to note is that the estimates are not very influenced by the noise perturbing the data, provided the noise level stays lower than . However, by increasing to 0.05, a clear degradation of the neglecting-errors-ML is observed. In contrast, CGEM-EV still produces reasonable estimates of . Here ; however, you can change from to and the conclusions remain similar.
Snapshot 2: Staying at and selecting as true diffusion coefficient (so that we are close to the "near unit root" situation), a noise with can no longer be considered as a negligible noise. Indeed, we can observe that by diminishing the amplitude of the present noise from to , we restore significantly the quality of the estimates of using neglecting-errors-ML. And by moving from to and trying several seeds, one can be convinced that a noise-to-signal ratio of order or less is required if we want to trust the neglecting-errors-ML estimator (since we can be content with three accurate digits in the estimates). By increasing to 0.05, the neglecting-errors-ML becomes meaningless. In contrast, CGEM-EV still produces reasonable estimates of .
Snapshot 3: Selecting an "intermediate" value as true diffusion coefficient, similar conclusions can be drawn, except that the approximate upper bound on required to trust the neglecting-errors-ML estimator of seems also intermediate between and .
References
[1] D. A. Girard, "Asymptotic Near-Efficiency of the 'Gibbs-Energy and Empirical-Variance' Estimating Functions for Fitting Matérn Models to a Dense (Noisy) Series." arxiv.org/pdf/0909.1046v2.pdf.
[2] A. Gloter and J. Jacod, "Diffusions with Measurement Errors. II. Optimal Estimators," ESAIM: Probability and Statistics, 5, 2001 pp. 243–260.
[3] Y. Zhang, H. Yu, and A. Ian McLeod, "Developments in Maximum Likelihood Unit Root Tests," Communications in Statistics—Simulation and Computation, 42(5), 2013 pp. 1088–1103.