It is shown that the discrepancy can be traced to a flawed methodology, the "method of pooled department analysis" (PDA): summary correlations are averages of the individual department coefficients, corrected for multivariate restriction of range and weighted by the number of students in each department. This method tends to inflate multiple correlation estimates, especially for population values near zero.
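A short worked equation may help show why averaging small-sample multiple correlations inflates the summary. This is a sketch under stated assumptions, not the report's own derivation: the symbols K (number of departments), n_k (students in department k), N (total students), and p (number of predictors) are introduced here for illustration, the data are assumed multivariate normal with a true multiple correlation of zero, and the restriction-of-range correction is ignored.

```latex
% PDA summary: size-weighted average of the K department coefficients R_k
\bar{R}_{\mathrm{PDA}} = \frac{\sum_{k=1}^{K} n_k R_k}{\sum_{k=1}^{K} n_k},
\qquad N = \sum_{k=1}^{K} n_k .

% Expected squared sample multiple correlation when the population value is zero
E\!\left[R_k^2\right] = \frac{p}{n_k - 1},
\qquad
E\!\left[R_{\mathrm{AGR}}^2\right] = \frac{p}{N - 1}.

% Illustration with assumed values p = 2, K = 19, n_k = 30 (so N = 570)
\frac{2}{29} \approx 0.069
\qquad \text{versus} \qquad
\frac{2}{569} \approx 0.0035 .
```

Because every R_k is nonnegative, the chance inflation in each small department does not cancel when the coefficients are averaged, whereas pooling the scores shrinks it roughly in proportion to the total sample size.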
In principle, there are two different methods for combining data from subsamples:
(A) Aggregate the scores of the smaller samples into a larger sample (hereafter AGR).
(B) Compute the desired statistics (in the present case, correlations) in each of the subsamples, and then average them to obtain parameter estimates for the whole dataset.
Method A is the conventional procedure. Method B is the method of pooled department analysis (PDA) used in the new GRE Board report.
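The two methods can be contrasted in a minimal Python sketch. The function names (`multiple_r`, `agr_estimate`, `pda_estimate`) and the toy data (19 subsamples of 30 cases, two predictors, no true relationship) are assumptions made for illustration only; the restriction-of-range correction used in the report is omitted.

```python
import numpy as np

def multiple_r(X, y):
    """Sample multiple correlation of y on the columns of X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return float(np.sqrt(max(r2, 0.0)))

def agr_estimate(subsamples):
    """Method A (AGR): pool the raw scores, then compute one coefficient."""
    X = np.vstack([X for X, _ in subsamples])
    y = np.concatenate([y for _, y in subsamples])
    return multiple_r(X, y)

def pda_estimate(subsamples):
    """Method B (PDA): one coefficient per subsample, averaged with weights n_k.

    The range-restriction correction is NOT applied in this sketch.
    """
    rs = np.array([multiple_r(X, y) for X, y in subsamples])
    ns = np.array([len(y) for _, y in subsamples])
    return float(np.average(rs, weights=ns))

# Hypothetical toy data: predictors and criterion are independent,
# so the population multiple correlation is zero.
rng = np.random.default_rng(0)
subsamples = [(rng.standard_normal((30, 2)), rng.standard_normal(30))
              for _ in range(19)]
agr = agr_estimate(subsamples)
pda = pda_estimate(subsamples)
print(f"AGR: {agr:.3f}  PDA: {pda:.3f}")
```

With no true relationship in the data, any value above zero is chance inflation, so comparing the two printed numbers gives a direct sense of how much each method overstates the correlation.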
The results of the simulations are graphically illustrated in the figures and show the following:
1. Bias is largest for the PDA method in all cases; conversely, pooling the scores (AGR) is uniformly superior to averaging the estimates (PDA).
2. The magnitude of this advantage declines with increasing subsample size (SSS).
3. The bias tends to decrease with increasing subsample size (SSS).
4. The bias tends to increase as the population correlations (POP) decrease. This point is important because, in practice, long-term predictive validities tend to be small.
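The pattern in these observations can be explored with a small Monte Carlo sketch. The design below is an assumption and not the simulation behind the figures: `simulate_bias` and its parameters are hypothetical names, subsamples are equal-sized and drawn from a single normal population, only the first of two predictors carries signal, and no range-restriction correction is applied.

```python
import numpy as np

def multiple_r(X, y):
    """Sample multiple correlation of y on the columns of X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return float(np.sqrt(max(r2, 0.0)))

def simulate_bias(pop, sss, n_sub=19, n_pred=2, reps=50, seed=0):
    """Return (AGR bias, PDA bias) for population correlation `pop`
    and subsample size `sss`, averaged over `reps` replications."""
    rng = np.random.default_rng(seed)
    total = n_sub * sss
    agr, pda = [], []
    for _ in range(reps):
        X = rng.standard_normal((total, n_pred))
        e = rng.standard_normal(total)
        # Unit-variance criterion whose population multiple correlation is `pop`.
        y = pop * X[:, 0] + np.sqrt(1.0 - pop ** 2) * e
        agr.append(multiple_r(X, y))
        # Equal subsample sizes, so the n-weighted average is a plain mean.
        parts = [multiple_r(X[i:i + sss], y[i:i + sss])
                 for i in range(0, total, sss)]
        pda.append(np.mean(parts))
    return float(np.mean(agr) - pop), float(np.mean(pda) - pop)

if __name__ == "__main__":
    for pop in (0.0, 0.5):
        for sss in (20, 100):
            b_agr, b_pda = simulate_bias(pop, sss)
            print(f"POP={pop:.1f} SSS={sss:3d}  "
                  f"AGR bias={b_agr:+.3f}  PDA bias={b_pda:+.3f}")
```

Varying `pop` and `sss` over a finer grid, or raising `reps`, gives smoother estimates at the cost of run time.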
Observations 3 and 4 are especially relevant to the results in the GRE report. Observation 3 is relevant because 12 of the 19 subsamples (63%) were smaller than 50 (Table C2, p. 56 [1]). Observation 4 is relevant because previous investigators have found that GRE validities for long-term criteria are small.
The graphs also show that the bias of the conventional pooling method, AGR, is negligible compared to that of PDA. PDA bias ceases to be a problem for SSS > 50 and POP > 0.30. However, it is precisely the lower range of the correlation spectrum that is relevant for the GRE.
[1] N. W. Burton and M.-M. Wang, "Predicting Long-Term Success in Graduate School: A Collaborative Validity Study," GRE Board Report No. 99-14R, ETS RR-05-03, Princeton, NJ: Educational Testing Service, 2005. http://www.ets.org/Media/Research/pdf/RR-05-03.pdf