Some sectors of our state and nation seem slow to learn about the challenges of fairly comparing NAEP data from state to state. The Kentucky Department of Education's somewhat misleading headline about the new 2015 NAEP results is a case in point. The news release blares, "Kentucky Students Continue to Outperform National Peers."
Is that really true, or is the department ignoring some important facts of life about what it takes to fairly compare NAEP results across the states?
The challenges of comparing NAEP data across states and between states and the national average are not recent discoveries.
One past problem has been very uneven exclusion rates between states for students with learning disabilities and those still learning English. For example, the NAEP 2009 Science Report Card cautions on Page 6:
“Variations in exclusion and accommodation rates, due to differences in policies and practices for identifying and including SD and ELL students, should be considered when comparing student performance across states. States and jurisdictions also vary in their proportions of special-needs students, particularly ELL students. While the effect of exclusion is not precisely known, comparisons of performance results could be affected if exclusion rates are markedly different among states.”
Kentucky has been a major exclusion state on past NAEP assessments, which inflated the state's scores in some years. Fortunately, this does not seem to be a notable problem for Kentucky in 2015, but an even bigger impediment to comparing Kentucky's performance to other states is still very much in play: the rapid and large shifts in student racial demographics in many other states. If Kentucky ignores the demographic impact on NAEP scores, it only fools itself.
Demographic shifts elsewhere have been truly huge. California, for example, had a white mix in its public school classrooms of about 51 percent when the NAEP started state testing in 1990. In the brand-new 2015 NAEP, whites made up only 25 percent of California's public school enrollment. That puts California at a real disadvantage in any poorly done analysis because of the large achievement gaps for most minority racial groups. Simply because California has a lot more lower-scoring minorities, the state winds up with lower "All Student" scores even though the scores for each of its minority groups might actually be better than the overall score indicates.
How about a current Kentucky – California example? California’s 2015 All Student NAEP Grade 4 math score is MUCH lower than Kentucky’s, 232 versus 242. But, California’s white students essentially tied Kentucky’s, scoring 246 versus the Bluegrass State’s 244.
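To see the arithmetic behind this effect, here is a minimal sketch. California's white score of 246 and the 25 percent white enrollment share come from the figures above; the roughly 227-point combined average I assume for all other groups is an illustrative placeholder, chosen only to show how the demographic mix can reproduce the reported 232 overall score.

```python
def all_student_average(groups):
    """Weighted mean of subgroup scores; each entry is (score, enrollment share)."""
    assert abs(sum(share for _, share in groups) - 1.0) < 1e-9
    return sum(score * share for score, share in groups)

# California, 2015 NAEP Grade 4 math: the white score (246) and 25 percent
# white share are from the figures above; the ~227 combined average for all
# other groups is an illustrative assumption, not an official NAEP number.
california = [(246, 0.25), (227, 0.75)]
print(round(all_student_average(california)))  # 232
```

Even though California's whites essentially tie Kentucky's, the 75 percent weight on lower-scoring groups drags the overall average far below the white score.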
NAEP reports have even used Kentucky as a formal example of the demographic problem. The NAEP 2009 Science Report Card says on Page 32:
“Some might assume that states that score above the national average would have student groups that exhibit similar performance, but that is not necessarily true.”
The report card continues:
“…while the average score for Kentucky was higher than the score for the nation, White students (85 percent of the state’s eighth-graders) scored lower than their peers nationally.”
One more point needs to be mentioned. The NAEP is a sampled assessment, and every score comes with a plus-or-minus sampling error. It takes a notable score difference before you can properly claim that one NAEP score truly "outperforms" another.
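As a rough sketch of how such a check works, treat each score as an estimate with a standard error and call a difference real only when it exceeds about two combined standard errors. The NAEP Data Explorer performs the official test, which also accounts for the survey's complex design; the standard errors used below are illustrative assumptions, not published values.

```python
import math

def significantly_different(score_a, score_b, se_a, se_b, z=1.96):
    """Simple two-sample z-test on scale scores. The NAEP Data Explorer's
    significance tool works along these lines, with design adjustments."""
    return abs(score_a - score_b) > z * math.sqrt(se_a**2 + se_b**2)

# Kentucky (242) vs. the nation (240), 2015 Grade 4 math, assuming
# illustrative standard errors of about 1 point each:
print(significantly_different(242, 240, 1.0, 1.0))  # False: a 2-point gap is a tie
```

With one-point standard errors, a difference needs to exceed roughly 2.8 points before it counts as real, which is why Kentucky's two-point edge is only a statistical tie.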
So, here is the takeaway: if you only look at overall scores from the NAEP for "All Students," and if you ignore the technical limitations in the NAEP's accuracy due to its sampling methods, you might get the wrong impression about how our state really performs.
In fact, due to the very large racial demographic differences and other demographic variations from state to state, only looking at overall scores can lead to very incorrect impressions, especially for a state like Kentucky that still has a very high enrollment of whites in its public schools. The racial achievement gaps can even impact analysis of some subgroup scores, say for free or reduced-price lunch eligible versus non-eligible students.
Let me show you one interesting example of how you can easily get misled if you don't break Kentucky's scores out by race before doing NAEP comparisons. This example uses data from the brand-new 2015 NAEP Grade 4 Math Assessment that I downloaded from the NAEP Data Explorer.
The download of the All Student scores and the scores broken out by race across all the nation’s public schools and for Kentucky produced the data shown in the light green shaded area of the table (click on the table to enlarge it, if necessary).
Notice in the green, official data part of the table that the National Public All Student average NAEP Scale Score on the Grade 4 Math Assessment in 2015 was 240 while Kentucky scored a somewhat higher 242. These scores were not significantly different, by the way. So, Kentucky’s students absolutely did not “outperform” the nation under any condition.
The table also shows scores by race. For example, Kentucky’s whites scored a 244 while whites across the nation scored four points higher at 248. Tools in the NAEP Data Explorer show this was a statistically significant difference in scores, so for whites, Kentucky not only did not score above the nation, but actually scored BELOW the national average.
Kentucky’s blacks appear to have outscored their peers by two points, but this also is not a statistically significant difference in scores. So, it is incorrect to claim our blacks outscored their national peers.
The score difference for Hispanics in Kentucky and across the nation is larger. However, the number of Hispanics in Kentucky is still quite low, and the sampling error in the scores is therefore rather large. Once again, the NAEP Data Explorer indicates the Hispanic difference is not statistically significant. The Kentucky Department of Education is playing games with the quality of NAEP statistics by trying to claim something the data cannot support.
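Why does a small subgroup carry a large sampling error? Because the standard error of an average shrinks only with the square root of the sample size. Here is a quick illustration with made-up numbers; NAEP's actual clustered sampling design inflates the errors further, so these are not its real formulas or figures.

```python
import math

def standard_error(population_sd, n):
    """Simple-random-sample approximation of the standard error of a mean.
    NAEP's clustered, weighted design produces somewhat larger errors."""
    return population_sd / math.sqrt(n)

# Assuming a scale-score standard deviation of about 30 points (a typical
# NAEP magnitude) and illustrative sample sizes:
print(standard_error(30, 2500))        # large white sample -> small error
print(round(standard_error(30, 150), 2))  # small Hispanic sample -> much larger error
```

A subgroup sampled at a few hundred students can easily carry a standard error several times larger than a big subgroup's, which is why a multi-point Hispanic difference can still be a statistical tie.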
As you can see, the representations of the other racial groups in Kentucky are all small and these scores are all statistical ties, as well.
Bottom line: it is highly misleading to use NAEP to try to claim Kentucky’s fourth grade students “Outperform” their peers across the nation.
Let’s come at this from a somewhat different direction. What if the nation kept its same scores for each racial group, but we do a weighted average of those scores using Kentucky’s student demographics? The blue shaded part of the table explores that. As you can see, the national average would change from being two points lower than Kentucky’s to being three points higher. Put Kentucky and the nation on a level playing field and we have to admit we might be behind.
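For readers who want to check the arithmetic, here is a minimal sketch of the reweighting. Only the national white score of 248 comes from the data discussed above; the other national subgroup scores and the Kentucky enrollment shares below are illustrative placeholders I have assumed to show the method, not official NAEP figures.

```python
def weighted_national_average(subgroups):
    """Apply Kentucky's enrollment shares to the nation's subgroup scores.
    Each entry is (national subgroup score, Kentucky enrollment share)."""
    assert abs(sum(share for _, share in subgroups) - 1.0) < 1e-9
    return sum(score * share for score, share in subgroups)

# White score (248) is from the table discussed above; all other scores
# and every share here are illustrative assumptions.
reweighted = weighted_national_average([
    (248, 0.80),  # white
    (224, 0.10),  # black
    (230, 0.05),  # Hispanic
    (250, 0.05),  # all other groups
])
print(round(reweighted))  # about 245, roughly three points above Kentucky's 242
```

Because Kentucky's enrollment is so heavily white, reweighting pushes the national figure up toward the higher national white score, flipping the apparent two-point Kentucky lead into a deficit.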
In fact, if the reported errors in the actual All Student scores from the NAEP Data Explorer are used to analyze the Kentucky-weighted national average score shown in the blue section of the table, the three point difference in scores with Kentucky would be statistically significant.
So, blame this blog on the Kentucky Department of Education (KDE). Its news release yesterday pretty much totally ignores some important facts of life about making comparisons between states and the nation with the NAEP. Let me add a bit more about that.
The news release shows separate comparisons of math and reading scores for Kentucky to other states, but these only cover the All Student scores. As such, they compare whites in Kentucky to many minority kids elsewhere and the numbers are not very reflective of our real educational performance.
The straight presentations of All Student scale scores likewise create incorrect impressions, again comparing lots of Kentucky whites to lots of other racial mixes found elsewhere. These are not apples to apples comparisons.
Presentations of scores by gender and by free or reduced-price lunch eligibility still wind up comparing a lot of our whites to a lot of Hispanic and black kids in other states.
The comparisons of scores by race are a bit more revealing, but the news release's failure to point out that small score differences are only statistical ties with the NAEP is a technical error that I don't think our education experts in Frankfort should be making. I know some of the folks at the department are capable of really good technical work. How about ensuring they get a chance to check future releases involving technical issues like testing?