Omitted Variable Bias

In social science research, control variables are often included out of concern that omitting them would bias the coefficients of interest [1, 2]. However, unless the true data-generating process is known, which is unlikely, the inclusion of even relevant controls may in fact aggravate the problem.
This Demonstration shows this for linear (OLS) and logit (GLM) models in which the true model includes three covariates. The first misspecified model omits the second and third covariates, and the second misspecified model omits only the third covariate. According to the logic of including controls, the bias in the expected value of the coefficient for the first covariate should never be smaller in the first misspecified model than in the second, and no bias should arise when the covariates are uncorrelated. The latter does not hold for many GLM link functions, where coefficients may be biased even if included and excluded covariates are uncorrelated [3, 4].
At the red contour line, no difference in bias exists between the first and second misspecified models. In regions where dashed contour lines indicate positive values, the inclusion of controls would indeed reduce bias; the lighter the region, the larger the reduction. (Hover the mouse over a contour line to see the tooltip.) In regions where solid contour lines indicate negative values, however, the inclusion of controls would induce bias; the darker the region, the larger the induced bias. For exact identification of coordinates, drag the cross-hairs locator to the desired position. The notation follows [1].
  • Contributed by: Alrik Thiem
  • After work by: Kevin A. Clarke, University of Rochester (USA)

DETAILS

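For the case of OLS, let $M_T$ be the true model, $M_1$ be the first misspecified model, and $M_2$ be the second misspecified model.
$M_T$: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$, with $E[\varepsilon \mid x_1, x_2, x_3] = 0$,
$M_1$: $y = \alpha_0 + \alpha_1 x_1 + u$, and
$M_2$: $y = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + v$.
Let $r_{jk}$ denote the correlation between $x_j$ and $x_k$, and $s_j$ the standard deviation of $x_j$.
If for all $j \neq k$ we have $r_{jk} = 0$, then $E[\hat{\alpha}_1] = E[\hat{\gamma}_1] = \beta_1$.
If for some $j \neq k$ we have $r_{jk} \neq 0$, then the biases of the expected values of $\hat{\alpha}_1$ for $M_1$ and $\hat{\gamma}_1$ for $M_2$ are given by
(1) $B_1 = \beta_2 r_{12} \frac{s_2}{s_1} + \beta_3 r_{13} \frac{s_3}{s_1}$,
(2) $B_2 = \beta_3 \frac{r_{13} - r_{12} r_{23}}{1 - r_{12}^2} \frac{s_3}{s_1}$.
According to the logic of including controls in order to reduce bias, the following weak inequality should always hold.
$|B_2| \le |B_1|$.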
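The comparison between the two misspecified OLS models can be checked numerically. The following is a minimal sketch, assuming the standard omitted-variable results for a linear model with three covariates: omitting the second and third covariates biases the first coefficient by the sum of each omitted coefficient times the slope of that covariate on the first, while omitting only the third biases it by the third coefficient times the partial slope of the third covariate on the first, holding the second fixed. The function names, parameter names, and example values are hypothetical and are not taken from the Demonstration.

```python
def ols_biases(beta2, beta3, r12, r13, r23, s1=1.0, s2=1.0, s3=1.0):
    """Return (bias_m1, bias_m2) for the coefficient on the first covariate.

    bias_m1: second and third covariates omitted.
    bias_m2: only the third covariate omitted.
    r12, r13, r23 are pairwise correlations; s1, s2, s3 are standard
    deviations (assumed names, unit variances by default).
    """
    if not abs(r12) < 1.0:
        raise ValueError("r12 must lie strictly between -1 and 1")
    # Each omitted covariate contributes its coefficient times its
    # simple regression slope on the first covariate.
    bias_m1 = beta2 * r12 * s2 / s1 + beta3 * r13 * s3 / s1
    # With the second covariate included, only the partial slope of the
    # third covariate on the first (holding the second fixed) matters.
    bias_m2 = beta3 * (r13 - r12 * r23) / (1.0 - r12 ** 2) * s3 / s1
    return bias_m1, bias_m2


def bias_reduction(beta2, beta3, r12, r13, r23):
    """Positive: the control reduces absolute bias. Negative: it adds bias."""
    bias_m1, bias_m2 = ols_biases(beta2, beta3, r12, r13, r23)
    return abs(bias_m1) - abs(bias_m2)


# Omitted effects with opposite signs cancel in the first misspecified
# model, so adding the control makes the bias worse (negative value).
print(bias_reduction(beta2=1.0, beta3=-1.0, r12=0.5, r13=0.5, r23=0.0))

# Omitted effects with the same sign: adding the control helps (positive).
print(bias_reduction(beta2=1.0, beta3=1.0, r12=0.5, r13=0.5, r23=0.0))
```

The sign convention mirrors the contour description: positive values mean the control reduces bias, negative values mean it induces bias. The first call shows a case in which the weak inequality fails even for OLS.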

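For the case of GLM, as before there is a true model with three covariates, a first misspecified model that omits the second and third covariates, and a second misspecified model that omits only the third covariate.
The normalized values of the coefficient for the first covariate in the two misspecified models are given in [1].
According to the logic of including controls in order to reduce bias, the absolute bias in the second misspecified model should never exceed the absolute bias in the first misspecified model.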
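The claim that logit coefficients can be biased even when included and excluded covariates are uncorrelated [3, 4] can be illustrated with a deliberately reduced case. The sketch below is an assumption-laden simplification, not the three-covariate setup of the Demonstration: it uses one included binary covariate and one omitted binary covariate that is independent of it. Because a logit model with an intercept and a single binary regressor is saturated, its large-sample coefficient equals the difference in log-odds between the two groups, which can be computed exactly. All names and values are hypothetical.

```python
import math


def logistic(z):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def marginal_logit_coef(beta0, beta1, beta3, p3=0.5):
    """Large-sample logit coefficient on binary x1 when binary x3 is omitted.

    Assumes the true model logit P(y=1) = beta0 + beta1*x1 + beta3*x3,
    with x3 ~ Bernoulli(p3) independent of x1.
    """
    def prob_y(x1):
        # Average the true success probability over the omitted covariate.
        return ((1.0 - p3) * logistic(beta0 + beta1 * x1)
                + p3 * logistic(beta0 + beta1 * x1 + beta3))

    return logit(prob_y(1)) - logit(prob_y(0))


# True coefficient on x1 is 1.0; the omitted covariate is independent of x1,
# yet the coefficient in the misspecified model is pulled toward zero.
print(marginal_logit_coef(beta0=0.0, beta1=1.0, beta3=2.0))

# With no omitted effect (beta3 = 0) the true coefficient is recovered.
print(marginal_logit_coef(beta0=0.0, beta1=1.0, beta3=0.0))
```

In the OLS case the same independence would leave the coefficient unbiased, which is why the comparison between misspecified models behaves differently under nonlinear link functions.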
References
[1] K. A. Clarke, "Return of the Phantom Menace: Omitted Variable Bias in Political Research," Conflict Management and Peace Science, 26(1), 2009 pp. 46–66.
[2] K. A. Clarke, "The Phantom Menace: Omitted Variable Bias in Econometric Research," Conflict Management and Peace Science, 22(4), 2005 pp. 341–352.
[3] M. H. Gail, S. Wieand, and S. Piantadosi, "Biased Estimates of Treatment Effect in Randomized Experiments with Nonlinear Regressions and Omitted Covariates," Biometrika, 71(3), 1984 pp. 431–444.
[4] J. S. Cramer, Logit Models from Economics and Other Fields, Cambridge: Cambridge University Press, 2003.
© 2014 Wolfram Demonstrations Project & Contributors