Reverse Engineering GPA Distributions from Honors Data

Some universities decline to publish the distribution of their students' grade point averages (GPAs). Such schools often publish, however, the GPAs attained by students at various levels of "honors". A school might say, for example, that the top 30% of students had GPAs of at least 3.3 and the top 15% had GPAs of at least 3.5. This Demonstration shows how the overall distribution of GPAs can be reverse engineered from such honors data. You select two data points, one for each of two levels of honors, and the Demonstration responds with its best estimate of the cumulative distribution function of GPAs.
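The estimate can be sketched in a few lines of Python. This is only an illustration under the normality assumption the Demonstration makes, not the Demonstration's own source code; the function name fit_normal_from_honors and its parameter names are hypothetical. Each honors level says that a fraction p of students has a GPA of at least g, so the normal CDF at g equals 1 - p. Two such points give two linear equations, g = mu + sigma * z, in the unknown mean and standard deviation.

```python
from statistics import NormalDist


def fit_normal_from_honors(top_frac_1, cutoff_1, top_frac_2, cutoff_2):
    """Fit a normal distribution to two honors data points.

    Each point states that a fraction `top_frac` of students has a GPA of
    at least `cutoff`. The function and argument names are illustrative.
    """
    if top_frac_1 == top_frac_2:
        raise ValueError("the two honors levels must use different fractions")

    standard = NormalDist()  # mean 0, standard deviation 1
    # "Top p" means CDF(cutoff) = 1 - p, so convert each fraction to a z-score.
    z1 = standard.inv_cdf(1 - top_frac_1)
    z2 = standard.inv_cdf(1 - top_frac_2)

    # Solve cutoff_i = mu + sigma * z_i for the two unknowns.
    sigma = (cutoff_2 - cutoff_1) / (z2 - z1)
    if sigma <= 0:
        raise ValueError("a smaller top fraction must have a higher cutoff")
    mu = cutoff_1 - sigma * z1
    return NormalDist(mu, sigma)


# The example from the text: top 30% at 3.3 or above, top 15% at 3.5 or above.
fitted = fit_normal_from_honors(0.30, 3.3, 0.15, 3.5)
print("estimated mean:", round(fitted.mean, 3))
print("estimated standard deviation:", round(fitted.stdev, 3))
print("estimated share of students below 3.0:", round(fitted.cdf(3.0), 3))
```

Because the fitted curve is an unbounded normal distribution, it assigns a small probability to GPAs above 4 or below 0; any use of the estimate near those limits should be read with that in mind.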

The Demonstration uses the prevailing American calibration of grades, which requires them to lie between 0 (an "F") and 4 (an "A").

The Demonstration assumes that GPAs are normally distributed, which is unlikely to be exactly true. GPAs are calculated as the mean of draws from an underlying censored distribution. The underlying distribution is censored because, generally speaking, grades cannot exceed a maximum value (such as 4.0) or go below a minimum value (such as 0). The mean of these draws must therefore itself lie between the minimum and maximum values, whereas a normal distribution has unbounded support. Still, some experimentation suggests that the assumption that GPAs are normally distributed does not generally result in large errors.
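One way to run such an experiment is sketched below in Python. Everything about the simulated school is an assumption made for illustration: the number of students and courses, the mean and spread of the course grades, the use of the same grade distribution for every student, and the helper names simulate_gpas, empirical_cutoff, and fit_normal are all hypothetical. The sketch draws course grades from a normal distribution censored to the range 0 to 4, averages them into GPAs, reads off two honors cutoffs, fits a normal distribution to those cutoffs, and reports the largest gap between the fitted and the simulated cumulative distribution functions.

```python
import random
from statistics import NormalDist


def simulate_gpas(n_students=2000, n_courses=30, grade_mean=3.1,
                  grade_sd=0.9, seed=0):
    """Simulate GPAs as means of course grades censored to [0, 4].

    All default values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    gpas = []
    for _ in range(n_students):
        grades = [min(4.0, max(0.0, rng.gauss(grade_mean, grade_sd)))
                  for _ in range(n_courses)]
        gpas.append(sum(grades) / n_courses)
    return gpas


def empirical_cutoff(gpas, top_fraction):
    """Return the GPA that roughly the top `top_fraction` of students reach."""
    ordered = sorted(gpas)
    index = int(round((1 - top_fraction) * (len(ordered) - 1)))
    return ordered[index]


def fit_normal(top_frac_1, cutoff_1, top_frac_2, cutoff_2):
    """Fit a normal distribution through two (top fraction, cutoff) points."""
    standard = NormalDist()
    z1 = standard.inv_cdf(1 - top_frac_1)
    z2 = standard.inv_cdf(1 - top_frac_2)
    sigma = (cutoff_2 - cutoff_1) / (z2 - z1)
    mu = cutoff_1 - sigma * z1
    return NormalDist(mu, sigma)


gpas = simulate_gpas()
cutoff_30 = empirical_cutoff(gpas, 0.30)
cutoff_15 = empirical_cutoff(gpas, 0.15)
fitted = fit_normal(0.30, cutoff_30, 0.15, cutoff_15)

# Largest vertical gap between the fitted CDF and the simulated empirical CDF.
ordered = sorted(gpas)
max_error = max(abs(fitted.cdf(g) - (i + 1) / len(ordered))
                for i, g in enumerate(ordered))
print("honors cutoffs:", round(cutoff_30, 3), round(cutoff_15, 3))
print("largest CDF error:", round(max_error, 4))
```

Changing grade_mean, grade_sd, or n_courses shows how the error responds; averaging fewer courses or pushing the grade mean toward 4.0 makes the censoring matter more.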

The idea for this Demonstration is suggested in Ian Ayres, Super Crunchers, New York: Bantam Dell, 2007.