It’s no surprise that those pushing federally driven, one-size-fits-all academic standards also heavily endorse assessment tools that not only have a track record of failure in Kentucky, but also offer little chance of success anywhere else in the nation.
It’s not that performance assessments are wrong. They’re not. They can be very beneficial and have been used for years in certain classrooms to achieve valuable academic experiences.
Those chemistry-lab experiments you had in high school taught important lessons such as how matter can change from one state to another. You learned, for example, that while many types of matter move through a three-stage transition from solid to liquid and then gas, some things like dry ice can sublimate, changing directly from solid to gas.
Such performance events – conducted under a capable teacher’s tutelage and with informed evaluation – yield remarkable lessons with lifetime benefits.
But the need for – and types of – such exercises vary from subject to subject and cannot be forced into every classroom in all schools.
Trying to shoehorn the successful, teacher-driven-and-directed performance event from that individual chemistry classroom into a forced, nationwide assessment policy opens up a whole can of wiggly policy worms.
Bluegrass Institute education analyst Richard Innes points to Kentucky’s testing history and says it’s extremely difficult, if not impossible, to employ complex performance events in a statewide assessment program in ways that yield credible information about whether or not students are on the path to success in key academic areas.
Innes in his new report “Selling ‘Performance’ Assessments with Inaccurate Pictures from Kentucky” describes a “typical” performance event from the 1990s-era Kentucky Instructional Results Information System (KIRIS) in which fourth-graders were given rulers, compasses and protractors and told to work in teams to determine the number of life-sized ladybug images on standard sheets of paper.
After developing their answers, students were required to write individual reports that were sent to a central location for grading.
Those who developed this exercise no doubt expected students would do something like divide their papers into fourths, with each team member counting a portion of the images and the team then combining those counts to reach a final answer.
But, as Innes notes, a team could just as quickly count all the images on the page, checking off each one as it went. There’s not much higher-order thinking involved in that approach, of course, but wouldn’t it be just as effective?
While the intention of this performance event was to evaluate math problem-solving ability, it offered the very real prospect of turning into little more than an evaluation of writing skills conducted by disconnected graders who didn’t even observe the students as they worked on the problem.
Good writers in an exercise like this who are terrible at math might pull down excellent grades even if the problem actually was solved by another team member.
Innes also points out that Kentucky’s old, performance event-dominated testing system provided inflated results.
During the years of the failed KIRIS assessment, which was filled with unreliable elements like writing and math portfolios and performance events, education leaders and politicians cited exploding proficiency rates to claim Kentucky’s education program was making great progress. However, the mirage collapsed when the federal “Nation’s Report Card” test results showed far weaker performance.
If Americans want accurate assessments of students’ performance, then state education leaders nationwide must be wary of attempts to once again force unreliable, performance-type events into Common Core testing.
Of course, if it’s just about cool-sounding but ineffective experiments that make mediocre schools and failing systems look good, then educrats will ignore past lessons from failed state assessments that used such unreliable and invalid testing tools.