If I or Jim Popham, or Bob Linn, or anyone else who has worked in the field of psychometrics constructed a test of "assessment literacy," most of the Fourth Estate would fail. Today's low grade goes to Sam Dillon of the NYT (who recently received a compliment from me for noting that comparing a huge, diverse nation like the US with tiny city-states like Hong Kong and Singapore doesn't make any sense). A low mark as well to anyone else who reported the following from a conference call with Secretary of Education Arne Duncan: "student math and reading scores on local tests, as well as on the National Assessment of Educational Progress, a federal test that is more difficult."
No. They will not be reporting SCORES, since SCORES on state tests and NAEP would not be comparable (but, to cut Sam some slack, it is possible that Mr. Cliche Man, Duncan, said that; Duncan managed to conduct a 38-minute interview with Ed. Week editors and reporters and another with the Washington Post's Lois Romano without saying ANYTHING).
What the states will be reporting are the percentages of students reaching the various NAEP achievement levels and the equivalent proficiency rates, passing rates, meets-standards rates, etc., on state tests. These are not SCORES. As I have shown over and over, the NAEP achievement levels are ridiculous (see, for example, "Oh, those NAEP achievement levels" in the September 2005 Principal Leadership or "A test everyone will fail" in the May 3, 2007 Washington Post, also at www.huffingtonpost.com/gerald-bracey).
This is not the paranoid position of a lone crank. I am in the company of the National Academy of Sciences, the National Academy of Education, the General Accounting Office, and the Center for Research on Evaluation, Standards, and Student Testing, as well as individual psychometricians. Before the Bush administration declared war on science, the NAEP reports themselves carried disclaimers including quotes from the National Academy of Sciences like this: "NAEP's current achievement level setting procedures remain fundamentally flawed. The judgment tasks are difficult and confusing; raters' judgments of different item types are internally inconsistent; appropriate validity evidence for the cut scores is lacking; and the process has produced unreasonable results."
VALIDITY lacking? UNREASONABLE results? Fundamentally flawed? Sounds worse to me than credit default swaps. Why didn't IES at USDOE ban them as not being derived from "scientifically based research"? (It actually might be a good thing to make Reid Lyon head of the National Assessment Governing Board, the agency that produced these monstrosities.) The achievement levels were supposed to be used only until something better could be produced. That was almost 20 years ago. But, of course, no one is working on anything better. Why? Because the results from the current procedures make such wonderful cudgels to bash schools with.
Among the "unreasonable results" is this: American students have scored high on some international tests of reading, math, and science, but only a small percentage ever reaches the "proficient" level on NAEP. In the most recent NAEP reading assessment, only 33% of 4th graders were deemed proficient or better. Richard Rothstein and colleagues, in a perfectly straightforward procedure, estimated that the highest-scoring nation in one reading assessment, Sweden, would have about 33% of its kids proficient in reading if Swedish kids had to sit for NAEP (at the time of Rothstein's study, 31% of American kids reached this level).
So states will or will not become eligible for stimulus money based on invalid criteria. This is not change we can believe in. But don't look for the achievement levels to change soon. In his speech to the Hispanic Chamber of Commerce, President Obama said "just a third of our 13- and 14-year-olds can read as well as they should." For extra credit, guess where that statistic came from.
Port Townsend, Washington
Thursday, April 02, 2009
Bracey Offers the Answer Sheet on NAEP
Posted at ARN: