Over the past two days I have been writing about the new results from the National Assessment of Educational Progress (NAEP). I have been stressing that the NAEP has a fair amount of statistical sampling error in all of its data, and that often turns what appear to be “wins” into nothing more than ties. Today, you will see how this important NAEP fact of life impacts what we can really learn from this assessment about the achievement gaps in Jefferson County Public Schools (JCPS).

We can look at JCPS NAEP data because this very large school district has participated in what the NAEP calls its Trial Urban District Assessment program since 2009. However, the sample sizes collected are fairly small, and that generates a considerable amount of statistical sampling error in the scores. That sampling error limits our ability to detect real changes in the district’s performance. Very simply, it takes more than a few points of difference in scores before we can validly conclude that a true change has occurred.

Sadly, an understanding of the statistical limits of the NAEP seems to have escaped some staffers at JCPS, because they made public claims about gap improvements based on the NAEP that are not really accurate. With one exception, what look like “wins” in achievement gap improvements in Jefferson County are actually only ties with the gaps previously posted. As far as we can validly determine from the NAEP, Jefferson County cannot claim much progress with its achievement gap problems.

To really start this discussion, I need to show you how the score gaps look and how the tests of statistical significance in those gaps are reported in the NAEP Data Explorer.

**How to read the scores and gaps tables**

The table below covers the Grade 4 NAEP Math Scale Scores for Jefferson County’s white and black students for the years from 2009 to 2015. The far right column in the table shows the resulting Scale Score gaps for each year. The table lists scores and gaps to the nearest full point.

For example, in 2009 Jefferson County’s white students had a NAEP Scale Score average of 243 and its black students posted an average of only 216. The gap was 27 points. In 2015, the gap was 20 points, a figure 7 points lower.

The asterisk next to the number 27 indicates that this gap is statistically significantly different from the 2015 gap of 20 points. I’ll point out now that this is the only asterisk signaling statistical significance you are going to see in any scores and gaps table in this blog. Let me tell you how I determined that the 2009 gap warrants that asterisk.

**How to read the statistical significance test tables**

The next graphic is from a download of gap statistical significance testing from the NAEP Data Explorer. To begin, notice that the year 2015 is on the far left of the bottom row. To the right of this number are other table cells with various entries.

As you read across the bottom line, notice that the cell under the 2013 blue-shaded legend entry includes a small x notation. This indicates the difference between the gap for 2015 (shown in the first table above as 20 points) and the 2013 gap (shown in the first table above as 25 points) is not statistically significant. This difference is a statistical tie even though the actual difference – which the NAEP Data Explorer carried out to the nearest tenth of a point as 4.6 points – seems fairly large. While we are talking about these “Diff” calculations, we will see later that in a few cases the numbers don’t exactly agree with the gaps shown in the Scale Scores and Gaps tables. This is just a rounding issue.
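The idea of a “statistical tie” can be sketched in a few lines of Python. This is only a simplified illustration, not the NAEP Data Explorer’s actual procedure, which may use a different test and adjustments. It assumes the two gaps come from independent samples and uses a plain two-sided z-test at the 5 percent level. The function name `gap_change_is_significant` and the standard errors of 2.4 points are hypothetical, made up purely for illustration; only the 4.6-point difference comes from the table.

```python
import math


def gap_change_is_significant(diff, se_gap_a, se_gap_b, critical=1.96):
    """Simplified two-sided z-test for the change between two gaps.

    diff      -- difference between the two gaps, in scale score points
    se_gap_a  -- standard error of the first gap (hypothetical input)
    se_gap_b  -- standard error of the second gap (hypothetical input)
    critical  -- cutoff for a two-sided test at the 5 percent level
    """
    # Standard error of a difference between two independent estimates.
    se_diff = math.sqrt(se_gap_a ** 2 + se_gap_b ** 2)
    z = diff / se_diff
    return z, abs(z) > critical


# 4.6 is the 2013-to-2015 difference reported by the NAEP Data Explorer.
# The standard errors (2.4 each) are ASSUMED values, not NAEP figures.
z, significant = gap_change_is_significant(4.6, 2.4, 2.4)
print(f"z = {z:.2f}, significant = {significant}")
```

With standard errors of that assumed size, a 4.6-point change does not clear the cutoff, which is the sense in which a seemingly large difference can still be a tie.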

Reading further across the statistical significance test table, we see that the gap for 2015 is also not statistically significantly different from the gap in 2011. The first table shows the gap in 2015 was 20 points and it was slightly larger at 22 points in 2011.

The last cell in the statistical significance table compares the 2015 gap to the gap from 2009. The difference here is 6.9 points, which rounds to the 7 points you get from the numbers in the first table. This is a statistically significant difference, shown by the use of the Less Than “<” symbol. So for the case of NAEP Grade 4 math, Jefferson County’s white versus black achievement gap did significantly improve by an amount NAEP can confirm after 2009. Results have been flat since 2011.

Here are all of the tables for math and reading from the NAEP for Jefferson County collected in a format that makes cross-comparison easier.

We pretty well covered the NAEP Grade 4 Math situation in our explanations of the tables. This is the only case you will find where some improvement in Jefferson County achievement gaps can be claimed using the NAEP.

Regarding Grade 4 NAEP Reading in Jefferson County, there is no discernible difference between the gap for 2015 and the gap for any previous year once we allow for the statistical sampling errors in the NAEP.

Note: the difference in gaps for 2011 and 2015 on the left of the Grade 4 Reading table, rounded to the nearest point, is 2 points, but the more precise calculation in the NAEP Data Explorer’s gap significance tool shows it as 1.2 points. This is just a rounding issue.
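Here is a small Python sketch of how that kind of rounding mismatch can arise. The unrounded gaps of 25.6 and 24.4 points are hypothetical values chosen only to reproduce the 2-point versus 1.2-point pattern; they are not the actual NAEP figures, and the function name `rounded_vs_precise` is likewise just for illustration.

```python
def rounded_vs_precise(gap_a, gap_b):
    """Compare a difference of rounded gaps with the rounded precise difference."""
    # What you get by subtracting the whole-point gaps shown in a table.
    rounded_diff = round(gap_a) - round(gap_b)
    # What you get by subtracting first and rounding to a tenth afterward.
    precise_diff = round(gap_a - gap_b, 1)
    return rounded_diff, precise_diff


# ASSUMED unrounded gaps, picked only to illustrate the effect.
rounded_diff, precise_diff = rounded_vs_precise(25.6, 24.4)
print(rounded_diff, precise_diff)
```

Each gap is rounded on its own in the table, so one can round up while the other rounds down, and the difference between the rounded figures can overstate or understate the precise difference by up to a point.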

No matter how we consider the Grade 8 NAEP Math situation in Jefferson County, there has been no discernible improvement.

No matter how we consider the Grade 8 NAEP Reading situation in Jefferson County, there has been no discernible improvement.