Quality Counts and Center for Business and Economic Research both score “Novice”
It seems like mid-winter is a favorite time for various groups to publish fanciful rankings of Kentucky’s education system compared to other states, and 2016 has proved no exception.
Cases in point include new rankings from the annual Quality Counts report series from Education Week and an update to a rankings series from the Center for Business and Economic Research (CBER) at UK. Just as in the past, these rankings are clearly problematic.
Before discussing each report, let’s review some general concerns about ranking state education systems. This will be old news for our regular readers, but these comments bear repeating since folks at both EdWeek and the CBER apparently still don’t get it.
A major problem with ranking state education systems is that student demographics vary dramatically from state to state. For example, as shown in Table 1, in the 2015 National Assessment of Educational Progress (NAEP) Grade 4 reading assessment, Kentucky’s student population was dramatically different from the US average. Kentucky’s public school enrollment was 79 percent white, while nationwide it was only 49 percent white. Black students made up only 10 percent of Kentucky’s enrollment but 15 percent of the total nationwide, and Hispanic students made up only 5 percent of Kentucky’s student body versus 26 percent of public school enrollment nationally. For additional reference, whites comprised only 25 percent of California’s public school enrollment in late winter of 2015, when the NAEP was administered.
Coupled with these very dramatic differences in demographics are tremendous differences in test results for the different races, as shown in Table 2.
These large achievement gaps create a major comparison problem when demographics differ sharply. Given Kentucky’s overwhelmingly white enrollment, even though the Bluegrass State’s whites scored lower than whites in California or across the nation, Kentucky gets a big advantage when the scores are averaged together using each jurisdiction’s own enrollment percentages.
This isn’t news, by the way. NAEP Report Cards since 2005 have discussed issues that should be considered when comparing scores across states. These include things like differing student demographics, differing rates of exclusion of students from testing, and the fact that the NAEP is a sampled assessment so all the scores have sampling errors and small differences in scores are not significant.
Of particular note, a special section of the NAEP 2009 Science Report Card that begins on Page 32 is titled, “A Closer Look at State Demographics and Performance.” It actually uses Kentucky as an example of how impressions about a state’s performance can change notably once the scores are disaggregated by race. Figure 32 from that NAEP report shows that overall Kentucky outscored the national average by a statistically significant amount. However, when the scores are considered by race, Kentucky’s whites, who comprised more than 80 percent of the state’s public school enrollment in 2009, actually scored statistically significantly lower than the national average for whites.
Let’s explore this with more recent data found in Tables 1 and 2 above. If each state’s weighted average score on the NAEP 2015 Grade 4 Reading Assessment is computed for the three racial groups shown in Tables 1 and 2, Kentucky’s fourth grade reading scores are notably higher than either the US average or California’s, as Table 3 shows. The difference is probably large enough to be statistically significant.
However, look at what happens if we use each state’s scores for whites, blacks and Hispanics but we weight those scores by the demographic percentages found in Kentucky. Table 4 shows the results.
Wow! Kentucky isn’t well ahead of the US average or California at all. The only reason Kentucky looks better in overall score comparisons is because of an unfair advantage due to our state’s very different student demographics. Keep in mind that NAEP Grade 4 Reading is where Kentucky shows its best performance. Things look a lot worse for Kentucky when we consider math, which I will discuss a little later.
By the way, for technical types, the well-understood mathematical fact of life outlined above even has a name: “Simpson’s Paradox.” Simpson’s tells us that only examining overall average scores can hide some really interesting surprises that only become apparent once the different subgroups that go into the average are separately considered.
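For technical types who want to see the mechanics, the reweighting described above can be sketched in a few lines of Python. The enrollment shares below come from Table 1 as quoted earlier; the subgroup scale scores are hypothetical stand-ins (the real values are in Table 2), chosen only to show how the overall comparison can flip:

```python
# Illustrative Simpson's Paradox demo. Only the demographic percentages
# below come from the article; the scale scores are HYPOTHETICAL values
# used to show the mechanics, not actual NAEP results.

def weighted_avg(scores, weights):
    """Average subgroup scores using the given enrollment shares,
    normalizing because these three groups don't cover all students."""
    total = sum(weights[g] for g in scores)
    return sum(scores[g] * weights[g] for g in scores) / total

# Enrollment shares from the article (white, black, Hispanic)
ky_weights = {"white": 0.79, "black": 0.10, "hispanic": 0.05}
us_weights = {"white": 0.49, "black": 0.15, "hispanic": 0.26}

# HYPOTHETICAL scale scores: note US whites outscore Kentucky whites
ky_scores = {"white": 225, "black": 205, "hispanic": 210}
us_scores = {"white": 231, "black": 206, "hispanic": 208}

# Using each jurisdiction's own demographics, Kentucky looks ahead...
print(weighted_avg(ky_scores, ky_weights))  # ~222.1
print(weighted_avg(us_scores, us_weights))  # ~220.2

# ...but reweight the US subgroup scores to Kentucky's demographics
# and the national figure jumps past Kentucky's
print(weighted_avg(us_scores, ky_weights))  # ~227.1
```

Even though every US subgroup outscores or roughly ties its Kentucky counterpart in this sketch, Kentucky’s heavily white enrollment mix pushes its overall average above the national one, which is exactly the trap the overall-average rankings fall into.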
Both Quality Counts and the CBER report fall right into the Simpson’s Paradox trap. Both use “all student” average scores carried out to a ridiculously fine precision of one tenth of a point (remember, NAEP scores have sampling errors and are nowhere near that precise). As a result, both reports portray biased pictures of winners and losers in state-to-state education performance that simply are not correct.
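To see why tenth-of-a-point differences are meaningless, here is a minimal sketch of the kind of significance test NAEP comparisons require. The scores and standard errors below are made up; real NAEP state averages typically carry standard errors on the order of a point:

```python
# Hypothetical illustration of why tiny NAEP score gaps are just noise.
# Both the scores and the standard errors here are ASSUMED values.
import math

def z_stat(score_a, se_a, score_b, se_b):
    """Two-sample z statistic for independent sampled averages."""
    return (score_a - score_b) / math.sqrt(se_a**2 + se_b**2)

# Two states differing by 0.4 points, each with a 1.0-point standard error
z = z_stat(226.4, 1.0, 226.0, 1.0)
print(round(z, 2))  # ~0.28, far below the ~1.96 needed at the 95% level
```

With these assumed sampling errors, a 0.4-point gap does not come close to statistical significance, yet gaps that small routinely move states up or down in these rankings.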
Here are a few specific comments about each of the reports:
CBER’s “Kentucky’s Educational Performance & Points of Leverage,” Issue Brief 19, January 2016
- The vast majority of the report is built around NAEP scores for “all students.” There is no correction for the Simpson’s Paradox issue discussed above, which biases all of the test score comparisons in Kentucky’s favor.
- Carrying out NAEP analysis to the nearest tenth of a point is dubious. NAEP just isn’t that precise.
- The NAEP science scores used by the CBER are now years out of date. Including them in a 2015 analysis further inflates the findings in Kentucky’s favor.
- Small technical issue: the CBER report says “The NAEP data reflect the percentage of public students scoring proficient or higher, and the U.S. data represents the National Public.” In fact, a spot check of the data found in the NAEP Data Explorer web tool indicates the US numbers for both fourth and eighth grade science are for the nation and include non-public school results.
- Comparing Kentucky’s high school graduation rate to other states’ is dubious because diploma requirements are not uniform from state to state. As I have explained elsewhere, there is good evidence that Kentucky is doing a lot of social promotion to a diploma, which does not indicate readiness for either college or career. That social promotion inflates Kentucky’s apparent graduation rate. I suspect many other states have similar issues, but I don’t think there is a way to consistently compute the amount of social promotion in other states.
- Further evidence of the social promotion problem comes from what are labeled the “ACT % College/Career Ready (2015)” numbers in the CBER report. Note that Kentucky might graduate more students, but the percentage ready for college and careers is notably lower than the national average. On a technical note, it appears these numbers are actually reported by the ACT, Inc. as “Students Who Met All 4 ACT Benchmark Scores” for college readiness as shown in Figure 1.1 in that organization’s “ACT Profile Report” for Kentucky for 2015. This is not the same as Kentucky’s “College and Career Ready” statistic. Most importantly, there is controversy about whether the ACT is a solid indicator of career, as opposed to college, readiness. CBER should not change labels from ACT reports without explanation.
Education Week’s Quality Counts 2016
- It is remarkable how dramatically Kentucky’s ranking in Quality Counts has bounced around in just a few years. When Quality Counts ranked states in the 2013 report, Education Week’s report team somehow convinced itself that Kentucky’s education ranked 10th in the nation. Just one year later, in 2014, Education Week listed Kentucky in 35th place! In 2015 Kentucky placed 29th (though the map carries an erroneous 2013 annotation; note this is not the same map as the 2013 map linked from here). In 2016 Kentucky supposedly improved to 27th place. That is a lot of jumping around, from 10th to 35th to 29th to 27th place between 2013 and 2016. It mostly demonstrates that Quality Counts’ ranking schemes are highly unstable from year to year and that trends should be regarded as highly dubious, at best.
- Just like the CBER report, Quality Counts ranks based on “all student” NAEP scores, ensnaring this report in the Simpson’s Paradox trap.
- As with the CBER report, Quality Counts also makes too much from very small NAEP score differences, reporting scores that actually have sampling errors of several points as though they are meaningful to the nearest tenth of a point.
- Also as with the CBER report, Quality Counts includes high school graduation rates in its rankings, which is clearly problematic.
In any event, reports that don’t consider things like differing student demographics are not going to provide accurate pictures of true relative performance of state education systems. That’s just the way things are, but it seems a lot of people putting out research in this area simply don’t know, or don’t want to admit, that.
So, to close, here is a ranking example that does allow both for the sampling errors and the demographic issues in NAEP scores. Our regular readers have seen the next two figures before, but they bear repeating. It is hard to understand how Kentucky can score even in 35th place when its predominant student ethnic population, its white students, does so poorly in NAEP eighth grade math.
Figure 1 shows how Kentucky stacked up on the NAEP 2015 Grade 8 Math assessment. As you can see, the Bluegrass State’s white students were bested by whites in 42 other states plus Washington, DC schools. Kentucky’s whites did statistically significantly better than whites in only two other states. Again, keep in mind that about 80 percent of Kentucky’s total public school enrollment is white.
If you are interested in trends, here is how things looked back in 2011.
Yes, that is right. In 2011 our whites outscored whites in three other states and were outscored by whites in 39 other states plus Washington, DC.
So, between 2011 and 2015, the number of states where whites outscored Kentucky’s whites rose from 39 plus Washington, DC to 42 plus Washington, DC. In turn, Kentucky outscored one fewer state in 2015 than in 2011.
Here is one last shocker for you.
Way back in 1992, Kentucky’s NAEP Grade 4 Reading proficiency rate for “all students” was 23 percent while the nationwide average was 27 percent. BUT, don’t forget those statistical sampling errors! When you do a statistical significance test for these results with the NAEP Data Explorer, it turns out that these results are not statistically significantly different. So, if we are using the “all student” scores that the CBER and Quality Counts want us to use, it turns out Kentucky’s reading performance way back in the early days of KERA was not significantly different from the national average. Given that, Kentucky hasn’t made much improvement since.
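For readers who want to see roughly how that check works, here is a sketch. The 23 and 27 percent proficiency rates come from the text above; the standard errors are hypothetical placeholders, since the actual values must be pulled from the NAEP Data Explorer:

```python
# Rough sketch of the significance check described above. The proficiency
# rates come from the article; the standard errors are ASSUMED stand-ins
# for the real figures available in the NAEP Data Explorer.
import math

ky_rate, ky_se = 23.0, 1.5  # Kentucky 1992 Grade 4 reading, assumed SE
us_rate, us_se = 27.0, 1.5  # national average, assumed SE

z = abs(ky_rate - us_rate) / math.sqrt(ky_se**2 + us_se**2)
print(round(z, 2))  # with these assumed SEs, below the 1.96 cutoff
```

With standard errors in this hypothetical range, a four-point gap in proficiency rates falls short of the 95 percent significance threshold, consistent with what the NAEP Data Explorer reports for these 1992 results.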
Data Source for NAEP scores in Tables 1 through 4 and production of Figures 1 and 2: NAEP Data Explorer online tool