NEW YORK - President Obama's "Race to the Top" initiative promotes an accountability in education that links teacher evaluations with student test scores. While many will point to the potential problems this poses for the classroom, there's something else to think about. This position exhibits considerable confidence in the results returned to students by the standardized testing industry.
That's not a faith I share. The 15 years I spent scoring student essays and written responses to state K-12 tests revealed a process that was less "scientifically based research" (a heady phrase dropped more than 100 times in No Child Left Behind) than a theater of the absurd.
On one of the first tests I scored, I was hired (after barely flashing my college diploma at a human resources representative) to assess high school writing. It was a job granted with the caveat that as temporary workers, we could keep the position only if we passed a "qualifying" test.
We'd have to score enough student essays accurately during a training session to prove we could do the job in a standardized way, a process that allowed the testing company that hired us to claim just how capable, consistent, and qualified its employees were.
It was a nerve-racking experience to know our jobs hinged on those "qualifying" tests, but in the end we part-timers needn't have worried so much.
After two days of training, nearly half the 100 people applying for the job failed the tests and were fired. Our unemployment lasted only about 12 hours. The next morning nearly every one of us flunkies was hired right back, an employment rebirth that occurred just as soon as the testing company realized it was short on personnel.
I asked the HR representative about the 70 percent accuracy we were required to achieve on the qualifying test in order to keep the job. She informed me that they had decided 60 percent was good enough after all.
During my time in testing I was witness to (or party to) every conceivable manner of corporate chicanery and statistical tomfoolery, any shortcut imaginable that would help those for-profit companies get tests scored, deadlines met, and money made.
On one scoring project I managed, for instance, the government agency in charge passed down an edict stating that all scorers had to go through a remedial retraining (group discussions with their peers about scoring rubrics and training papers) after any work stoppage of 30 minutes or more, including their scheduled half-hour lunch break. The government agency in charge said such retrainings would help ensure the student responses were scored within the proper context of "psychometric rigor."
Those retrainings, however, also invariably devolved into time-consuming and contentious debates among the scorers about what score Anchor Paper No. 3 or Practice Paper No. 9 really deserved, theoretical arguments that completely interfered with the project's real goal of slapping scores on student tests.
Because the testing company that hired us was more concerned with scoring tests than achieving any sort of "psychometric rigor," they solved the problem of any extra retraining simply by changing the work schedule, eliminating from our workday any 30-minute breaks.
"From now on," I told the group of temps I was supervising, "we'll take a 29-minute lunch. Enjoy."
Obviously I can't just be pointing fingers here at big, bad corporate America because I am guilty, too, and during my time in testing there were few corners I didn't cut and few rules I didn't bend. I fudged the numbers like everyone else. Much of my career managing scoring projects was spent manipulating statistics to give off the appearance of a test-scoring process that was consistent and standardized even when it clearly was not.
And while I don't mean to imply such statistical skullduggery was the testing industry's official stance, neither could it have been unknown that such things regularly occurred.
Just before I left, for instance, I saw a temporary supervisor chastised and demoted for too obviously cheating on his scoring group's accountability stats, although that harsh reprimand barely slowed down the accused in his charge right up the company ladder. Within a month that same guy – the one whose blatant manipulation of test stats had earned him both a pay cut and a verbal dressing-down from company management – was given a full-time job as a project manager.
While accountability in education may be an important goal, it's critical to realize how difficult that might be to pin down. The lesson of my career should be that trusting massive corporations that answer to a bottom line to make decisions about American schools is a whole lot different than trusting those men and women who stand every day at the front of the classroom.
Too often in my career the test results we returned had to be viewed not as exemplars of educational progress, but rather as numbers produced in a mad rush to get things done, statistics best viewed solely through the prism of profit. Caveat emptor, Mr. President.
Todd Farley is the author of "Making the Grades: My Misadventures in the Standardized Testing Industry."
Wednesday, October 28, 2009
From a Scorer with 15 Years' Experience Inside the Testing Industry
From an op-ed (ht to Monty Neill) in today's Christian Science Monitor by Todd Farley, author of Making the Grades: My Misadventures in the Standardized Testing Industry: