Friday, August 03, 2007

"Failing" Schools and "Trustworthy" Tests

Andy Rotherham's woozy sycophant, Alex Russo, recently quoted a letter from Bill Sanders warning against the use of any multiple measures in a reauthorized NCLB that are not standardized tests:
“Most of the measures usually advocated under the banner of ‘multiple measures’ have so little reliability that any attempt to use them in summative assessment is certain to provide results so untrustworthy that essentially no distinction among schools can be made."
Let's try to forget for a moment that Sanders is owned by SAS, Inc., and that he is doing his best to help SAS make good on their investment in him and his mysterious proprietary statistical formulae, which purport to eliminate all noise from test data that might intrude from those otherwise annoying peripheral domains such as bias, poverty, culture, or even bullets.

Untrustworthy multiple measures of learning? Give me a break! One must wonder why Sanders has never expressed the same concerns for the thousands of junk tests (James Popham estimates 90% junk) whose invalid and unreliable results Sanders has spent his years sifting through like an alchemist whose optimism knows no limit. If our abusive, maniacal, and stupefying NCLB policy had ever been dependent upon valid and reliable measures, it would have disappeared years ago.

Let me share, then, a brief article on this issue of trustworthiness by Popham that appeared in the March 2005 issue of The School Administrator:
At a time when just about everyone on the planet seems to be urging educators to pursue challenging curricular goals, I may seem somewhat demented because I’ve recently been begging my language arts colleagues to devote more instructional attention to punctuation — and in particular to the use of quotation marks. But I have a reason.

Recently I wrote a general-market book about the No Child Left Behind Act entitled America’s “Failing” Schools. The book’s publishers, understandably eager to sell scads of books, signed me up to do a flock of in-studio radio and telephone interviews regarding the book.

Well, most of the interviews went quite well, for the interviewers already had read at least parts of the book. However, other interviewers had apparently just scanned the book’s title because their first question to me was: “Why do you believe our nation’s schools are doing such a rotten job?” My most accurate response would have been an emphatic, “I don’t!” But such a confrontational reply fails to engender interviewer-interviewee rapport. Thus I explained, hopefully without condescension, that the quotation marks around “Failing” indicated my disagreement with the typical meaning of that word.

So just to set the record straight, I do not think America’s schools are failing. Indeed, the main message of my book was that, because of NCLB, many schools (and school districts) are now regarded as failing although, in fact, such negative appraisals are inaccurate.

Insensitive Testing

Let’s consider for a moment how it is that a public school gets placed on the NCLB-authorized loser list. If any school fails to improve students’ scores sufficiently each year on its state-designated NCLB tests, the school thereby fails to make adequate yearly progress. According to the law, a school that fails to make AYP is identified as being in need of improvement because the school failed to make AYP for its students as a whole or for one of several NCLB-designated subgroups. Yet most such schools will simply be viewed as “failing.” Just because a school is thought to be failing according to a federal law, this does not make the school, in fact, a failing school. Here’s why.

The essence of NCLB’s school evaluation strategy is tied to improvements in students’ scores on state-chosen NCLB tests. But what if a state’s chosen NCLB tests are incapable of detecting instructional improvement even if such improvement takes place? In that case, NCLB’s test-based evaluative approach makes no sense. Schools labeled as failures may not be. Schools not identified as having failed AYP may be doing a dismal instructional job. Regrettably, in all but a few of our states, NCLB tests chosen by their state education agencies are more influenced by students’ socioeconomic status than by a school’s instructional success.

One sort of instructionally insensitive test is the nationally standardized achievement test, including those with added items so they’re better aligned with a state’s curriculum. The chief measurement mission of these tests is to provide comparative interpretations of students’ scores. To do so, the items on these tests must yield a reasonable degree of score-spread. It turns out that one of the most dependable ways to get an item to produce score-spread is to link the item to socioeconomic status so students from more affluent backgrounds are apt to answer the item correctly. But the inclusion of such items, of course, makes the test instructionally insensitive because its scores are too heavily influenced by test-takers’ SES.

Often a state’s custom-built standards-based NCLB tests are also instructionally insensitive. These tests supposedly measure students’ mastery of a state’s curricular aims, that is, its content standards. Unfortunately, because most states have identified so many curricular aims that no one can tell what’s actually going to be assessed, teachers are unable to target their instruction sensibly.

Moreover, because of the massive numbers of curricular aims, these tests’ score reports provide such general feedback that teachers can’t tell which parts of their instruction were effective. Thus after a few years of trying to make sense out of such senseless assessments, teachers begin to pay scant attention to the tests. Thereafter, of course, what turns out to influence a school’s test scores is, you guessed it, students’ SES.

Impending Perils

If your state employs instructionally insensitive NCLB tests, then it is almost certain that NCLB labeling of school quality will be inaccurate. Schools should be judged using as much relevant and accurate evaluative data as can be assembled. For instance, it is important to consider a variety of students’ significant work samples, especially those collected in a credible pre-instruction versus post-instruction manner. Additional indicators of quality (which NCLB permits) might include attendance rates, tardiness indices and suitably measured evidence of students’ attitudes and interests.

School administrators, many of whom will most certainly be evaluated on the basis of their schools’ NCLB-determined success, need to learn enough about these issues to (1) try to improve any serious shortcomings in their state’s NCLB tests and (2) inform parents and pertinent policymakers about the perils of uncritically regarding a “failing” school as one that’s actually failed.

Jim Popham is an emeritus professor in the UCLA Graduate School of Education and Information Studies. He can be reached at 1706 Keoniloa Place, Koloa, HI 96756. E-mail: wpopham@ucla.edu. He is the author most recently of America’s “Failing” Schools: How Parents and Teachers Can Cope With No Child Left Behind (RoutledgeFalmer).
