Tuesday, August 18, 2015

Petrilli and Ravitch Blame Team Obama for What Bush's Hacks Cooked Up

Even though much has changed about Diane Ravitch's rhetoric over the past few years, some things have not.  For instance, she has never explained the role of her conservative policy pals in the miseducating testing accountability policies that began with Richard Nixon, gained steam with Reagan, held firm through Clinton, and soared to new heights with Bush. 

Instead, she prefers to focus on Team Obama's role, as if Obama's administration represents something more than the final delivery of rotted education policies that Ravitch, herself, helped to craft over the three decades before Obama came to Washington.  

Ravitch found companionship for her revisionist and selective policy account this week when fellow neolib Michael Petrilli confessed that VAM-based teacher evaluation was a policy mistake that, hold on to your Scantron, Arne Duncan was responsible for.

Somehow, both Ravitch and Petrilli have chosen to forget that using VAM to hold teachers and schools "accountable" for differences in scores that poverty levels create had been around for two decades when Duncan was named chief water carrier for plans that had been formulated in the early days of Bush's NCLB.

Besides, the war on teachers was not simply a matter of bribing states to adopt VAM teacher evaluations.  Under Bush's Rod Paige and, later, Margaret Spellings, policies were put in place to make it harder for experienced teachers to be "highly qualified," and easier for unprepared beginners to be named as "highly qualified." 

Many other teachers were run out of teaching by unethical and abusive testing practices, the proliferation of scripted teaching and curriculum, and a ramped-up policy elite rhetoric that blamed teachers for low test performance.  All of these were put into place back when Ravitch was still openly riding the corporate bandwagon.

Sure, sure, Duncan had his role.  He used the ill-fated Race to the Top grants to bribe states into adopting VAM teacher evaluation, Common Core, unlimited charters, and Big Data, but VAM was already viewed as the new tool of the education industry and conservative ed ideologues back when Arne Duncan was learning his corporate trade as CEO of Chicago schools.

To be sure, the VAM bandwagon had been christened and launched by Bush's Margaret Spellings in 2005, as explained in this short excerpt from The Mismeasure of Education.  The Growth Model Pilot Project had Bill Sanders' version of VAM at its center.

To pretend that Arne Duncan and Team Obama were largely responsible for the coming of VAM is equivalent to blaming the pizza delivery man for the awfulness of the pie.  In fact, Arne was just delivering what had already been cooked up before his shift ever started.


The Growth Model Pilot Project (GMPP) and Sanders’ Testimony in Washington
Following a 2004 letter pleading for flexibility in NCLB accountability requirements from sixteen “state school chiefs” (Olsen, 2004), Secretary of Education Margaret Spellings announced the Growth Model Pilot Project in November 2005, as predicted by Dr. Sanders in his 2004 testimony to the Tennessee House Education Committee.  Under pressure from states and municipalities faced with the increasingly impossible proficiency targets that NCLB imposed on poorer districts, where students were farther behind, the U.S. Department of Education developed a Peer Review Committee to evaluate state growth model proposals.
As indicated in Part I, the potential effects of NCLB’s unachievable proficiency targets were not a secret, even prior to passage. In her policy history of NCLB, Debray (2006) cites Dr. Joseph Johnson’s comments in a public address prior to NCLB passage: “[p]eople are looking at the data and saying, ‘This is going to be catastrophic because there are going to be so many low-performing schools and this isn’t going to work’” (p. 138).  Debray notes, however, that the Bush Administration, which had included a school voucher provision that was eventually struck from the final version of the Act, “had a political reason to want to see nonimproving schools identified so that NCLB’s options to exit such schools for better ones or receive private supplemental instruction would produce visible results of Bush’s educational innovations in the first term.  There was political interest in identifying lots of failing schools” (p. 115).  This would also be a boon for tutoring companies, canned remedial intervention programs, and other “learning corporations,” which could hawk their wares, including value-added testing models for assessing test score improvement over time, in a mass market of desperate educators trying to achieve unrealistic testing targets.
Originally, NCLB disallowed states the use of value-added models and nationally norm-referenced tests to measure the effects of teachers, schools, and districts on student test performance.  Instead, the USDOE directed states to use proficiency benchmarks based on criterion-referenced tests aligned with their own state standards.  In a nod to growing criticism, however, the Spellings Growth Model Pilot Project allowed states to use “projection models” that could predict student performance on future assessments, thus answering the question: “Is the student on an academic trajectory to be proficient or advanced by 2014?” On May 17, 2006, Tennessee and North Carolina, the two states where the Sanders Model was in use, were approved to use their value-added projection models to track individual student progress in meeting NCLB academic goals.  The GMPP allowed projection models as an acceptable “safe harbor” option in providing evidence that a state was making significant progress toward AYP proficiency targets.
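To make concrete what a “projection model” is doing, here is a minimal sketch, in Python, of that trajectory question as we understand it. This is our toy illustration, not the Sanders/EVAAS algorithm (which relies on multivariate mixed models fitted to millions of longitudinal records); it simply fits a line through a student’s past scale scores and extrapolates to the 2014 horizon. The scores and the cut score are hypothetical.

```python
# Toy "projection model": is a student on a trajectory to reach the
# proficiency cut score by 2014?  A least-squares line is fit to the
# student's past (year, scale score) pairs and extrapolated forward.
# Illustration only -- the actual Sanders/EVAAS projection model is a
# far more elaborate multivariate mixed model.

def on_track_to_proficiency(years, scores, cut_score, target_year=2014):
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    # Ordinary least-squares slope and intercept.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
             / sum((x - mean_x) ** 2 for x in years))
    intercept = mean_y - slope * mean_x
    # Extrapolate to the target year and compare to the cut score.
    return intercept + slope * target_year >= cut_score

# A hypothetical student gaining ~8 scale points a year, still short of
# a hypothetical cut score of 520, projects as "on track" for 2014:
print(on_track_to_proficiency([2005, 2006, 2007], [470, 479, 486], 520))  # True
```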
During the implementation of the GMPP, Sanders testified to the U.S. House Committee on Education and the Workforce on “No Child Left Behind: Can Growth Models Ensure Improved Education for All Students” (July 27, 2006). On March 6, 2007, Sanders presented at the U.S. Senate Committee on Health, Education, Labor, and Pensions as part of a roundtable discussion entitled “NCLB Reauthorization: Strategies for Attracting, Supporting, and Retaining High Quality Educators.”  He called on Congress to replace the existing “safe harbor” options of NCLB with projection value-added models like his, promising that “effective schooling will trump socio-economic influences.”  Even though the focus of each hearing was different, Sanders used both opportunities to promote his own brand of value-added modeling that could separate educational influences from “exogenous factors (if not completely, then nearly so) allowing an objective measure of the influence of the district, school, and teacher on the rate of academic progress.” Without naming any of them, Sanders described his growth model competitors as having “been shown to produce simplistic, potentially biased, and unreliable estimates.” However, later in his testimony, he admitted that he “had to engineer the flexibility to accommodate other ‘real world’ situations encountered when providing effectiveness measures at the classroom level: the capability to accommodate different modes of instruction (i.e. self-contained classrooms, team teaching, etc.), ‘fractured’ student records, and data from a diversity of non-vertically scaled tests.” Sanders expressed no doubt that his engineered flexibility was up to the task, even if other growth models on the market could not “and should be rejected because of serious biases.”
Sanders provided the members of the U.S. Senate Committee on Health, Education, Labor, and Pensions a summary of the research findings that he attributed entirely to his “millions of longitudinal student records.”  According to Dr. Sanders, by “addressing research questions that heretofore were not easily addressed,” queries of his databases had yielded that beginning teachers are less effective than veteran teachers, that inner city schools have a “disproportionate number of beginning teachers,” that turnover rates were higher in inner city schools, that high poverty schools have a lower percentage of highly effective teachers as measured by test scores, that math teachers in inner city middle schools were less likely to have high school math certification, and that high poverty students assigned to effective teachers “make comparable academic progress” with low poverty students.  From talking to highly effective teachers across the country, Sanders said that he learned that these teachers knew how to “differentiate” instruction, how to use feedback from formative assessment to make instructional decisions, and how to maximize their instructional time.  Never in his testimony did Dr. Sanders indicate that most of these research questions had been asked and answered before he developed and marketed value-added assessment.
Also known were the obstacles teachers dealt with daily in applying teaching skills consistently across classrooms and schools, especially in high poverty classrooms and schools. This fact, however, is minimized by Dr. Sanders’ misleading and obfuscating claims that “differences in teaching effectiveness is the dominant factor affecting student academic progress” and “the evidence is overwhelming that students within high poverty schools respond to highly effective teaching.” Clearly, there are a couple of important qualifiers that Sanders fails to mention.  First, the more obvious one: Sanders does not make explicit that his claim regarding the importance of teaching effectiveness is based solely on gauging academic progress of individuals on tests as measured by the Sanders algorithm. By omitting this most important point in his presentation to the senators and their staffs, Sanders allows the false impression to be drawn and/or perpetuated that teaching effectiveness is more important than all the other factors, whether in school or outside school, that determine the variability in student achievement across income levels, family education levels, diversity levels, social capital levels, or any of the other variables that researchers have demonstrated are more important than teacher quality in determining variability in levels of achievement among students.  As noted earlier in Part II, an impressive group of researchers (Nye, Konstantopoulos, & Hedges, 2004) had, just two years before Dr. Sanders’ testimony, reported in a most reputable peer-reviewed journal that, among seventeen studies the researchers examined, “7% to 21% of the variance in achievement gains is associated with variation in teacher effectiveness” (p. 240).
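To unpack what that figure means: in the random-effects framework such studies use, the variance of student gain scores is partitioned into a between-teacher component and everything else, and the 7 to 21 percent refers to the teacher component’s share. In simplified notation (ours, collapsing the school-level terms of the original models):

```latex
\[
\sigma^2_{\text{gains}}
  = \underbrace{\tau^2_{\text{teacher}}}_{\text{between-teacher variance}}
  + \underbrace{\sigma^2_{\text{other}}}_{\text{all other sources}},
\qquad
\frac{\tau^2_{\text{teacher}}}{\sigma^2_{\text{gains}}} \approx 0.07 \text{ to } 0.21
\]
```

Even at the top of that range, roughly four-fifths of the variation in gains lies outside the teacher effect, which is precisely the qualifier missing from Sanders’ testimony.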
The second point is not so easy to tease out or to discount, for there is commonsense and empirical veracity to claiming “the evidence is overwhelming that students within high poverty schools respond to highly effective teaching.”  When set atop the previous claim, however, the weight of potential misconception becomes too heavy to ignore.  All sentient beings, we suggest, are more responsive to effective teaching than to ineffective teaching.  Since Dr. Sanders never tells us what effective teaching is, we can only assume it is the kind that produces greater test score gains than would less effective teaching. Thus, higher test score gains are produced by more effective teachers, and we know they are more effective teachers because they have higher test score gains.  We are not the first to point out the obvious circularity of this definition, but the resulting unquestioned tautology is worth keeping in mind when the term “effective teaching” is bandied about.  The more serious difficulty with the Sanders claims about highly effective teaching in high poverty schools comes from the unstated conclusion to this unfinished syllogism that high-ranking politicians rush to, when given one thoroughly misleading premise and another that is full of emotional appeal and that can’t be argued with: If teacher quality largely determines student achievement, and if poor and hungry children respond with higher achievement to good teaching, then teaching holds the key to closing that achievement gap left gaping from the last round of less than effective reforms.
This conclusion has proved appealing to both liberal and conservative political elites: to the former because of a long-held suspicion that teachers are lazy and are just not trying hard enough, and to the latter because of the long-held suspicion that teachers are self-serving louts protected by their big unions.  In either case, the political solution must be better teachers, and any policy that helps advance that priority must be a good policy.  And if Dr. Sanders has a tool that can help tell us who is doing a good job and who is not, then we have an intervention worth investing in that is much less expensive, and with a wider appeal, by the way, than trying to do something substantive about poverty, which has for a hundred years remained the inseparable shadow of the testing achievement gap, from whatever direction it is viewed.
Just days before Dr. Sanders offered his testimony to the Senate Committee in March 2007, the U. S. Chamber of Commerce (2007) published a state-by-state assessment that compared state test results to achievement levels on NAEP.  Tennessee did not fare well, earning an F for “truth in advertising about student proficiency” (p. 52). By Tennessee’s own standards, however, and by Dr. Sanders’ value-added calculations, the state seemed well on its way to meeting its NCLB benchmarks. The 2005 Tennessee Report Card, which provided one score for achievement proficiency and another score for value-added gains, showed 48 percent of 3rd-8th grade students proficient in math and 40 percent advanced. In reading, Tennessee 3rd-8th grade students were 53 percent proficient and 38 percent advanced.  Based on Tennessee’s own scale and timeline for achieving NCLB targets[1], the state gave itself a B for achievement and a B for value-added in both subjects.
For a state with 52 percent of its students economically disadvantaged, 25 percent African American, and 16 percent with special needs, Tennessee’s self-generated report card results looked respectable until set alongside results from the National Assessment of Educational Progress (NAEP), which showed Tennessee’s proficiency rates dropping in 2005, rather than moving up as measured by state proficiency scores and value-added scores.  In 2005, Tennessee claimed 87 percent of its 4th and 8th grade students were proficient in math, while NAEP proficiency levels for 4th and 8th graders were 27.7 and 20.6 percent, respectively. In reading, the discrepancy was no less startling; state proficiency scores for fourth and eighth grade were 87 percent, and the NAEP proficiency scores were 26.7 and 26.2 percent, respectively.
In April 2010, the U.S. Department of Education issued an interim Growth Model Pilot Project (GMPP) Report that reviewed the approved growth models, including Tennessee’s.  In comparing NCLB’s status model, which measured the percentage of proficient students each year, with the pilot states’ growth model projections, the U.S. Department of Education came to the following conclusions:
1) Simply stated, “making AYP under growth,” as defined by the GMPP, does not mean that all students are on-track to proficiency (p. 54).
2) There was little evidence that the type of model selected had an impact on the extent to which schools were identified as making AYP by growth within the GMPP framework (p. 54).
3) Schools enrolling higher proportions of low-income and minority students were more likely to make AYP under growth in the status-plus-growth framework than were schools enrolling higher proportions of more affluent and nonminority students. However, if growth were the sole criterion for determining AYP, schools enrolling higher proportions of low-income and minority students would be more likely to move from making AYP to not making AYP (p. 55). (The two frameworks are contrasted in the sketch below.)
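That third finding is easier to see with the two decision rules laid side by side. Here is a minimal sketch (our simplification; the actual GMPP rules were state-specific and considerably more involved):

```python
# Toy contrast of the two AYP frameworks compared in the GMPP report.
# status_ok: did the school hit the year's proficiency-percentage target?
# growth_ok: did enough students project as "on track" to proficiency?
# Both inputs are simplifications of state-specific GMPP rules.

def makes_ayp_status_plus_growth(status_ok, growth_ok):
    # Pilot framework: growth is layered on top of status, so a school
    # passes if EITHER criterion is met -- a second chance, never a penalty.
    return status_ok or growth_ok

def makes_ayp_growth_only(status_ok, growth_ok):
    # Hypothetical growth-as-sole-criterion framework: status no longer
    # counts, so a school passing on status alone now fails.
    return growth_ok

# A school that clears the status bar but not the growth bar flips from
# making AYP to not making it when growth becomes the sole criterion:
print(makes_ayp_status_plus_growth(True, False))  # True
print(makes_ayp_growth_only(True, False))         # False
```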
      
In January 2011, the U.S. Department of Education published the final GMPP Report that analyzed two years of growth data from the participating states.  The findings from the second year of the study were similar to those of the first.  For Tennessee, that meant very few additional schools made AYP (22 in 2007-2008) using the projection growth model developed by Sanders. The 2011 Report also found that the type of model does make some difference in the number of students identified to be “on-track” to reach the 2014 NCLB target of 100 percent proficiency in math and reading, and with the Tennessee projection growth model, “relatively few students with records of low achievement but evidence of improvement are predicted to meet or exceed future proficiency standards, while students with records of high achievement but evidence of slipping are very likely to be predicted to meet or exceed future proficiency standards” (p. xix).  In short, the Sanders projection model did little to address the essential unfairness perpetuated by NCLB proficiency requirements, which demanded more growth from those further behind and with fewer resources than from privileged schools, whose gains could be much smaller and still reach the same proficiency point.
An interesting finding tucked away in Appendix A of the 2011 Report suggested how the parameters for using growth models could be adjusted to help identify more schools as meeting AYP annual targets:  “…results from growth measures used for state accountability purposes suggest that many more schools would make AYP if the first of the seven core principles of the ESEA project was relaxed” (p. 112). The first of the seven core principles “requires that the growth model, like the status model, be applied to each targeted subgroup as well as all students in the school.”  This means that growth outcomes are to be monitored separately, or “disaggregated,” for major racial and ethnic groups, limited English proficient (LEP) students, special education students, and low-income students. To sacrifice or to “relax” the core principle of disaggregation so that the value-added estimates would identify more schools as making AYP would appear to neutralize NCLB’s purported goal of bringing attention to those subpopulations of students who traditionally have not made adequate progress by any measure. It seems, too, that eliminating the core principle requiring disaggregation could serve to mask the amount of growth that low-SES and minority students would need to make in order to become proficient by any means other than the relaxing of first principles.
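The mechanics of “relaxing” that first principle are simple enough to sketch. In the toy example below (ours; the subgroup names and the on-track threshold are hypothetical), a school whose schoolwide aggregate looks healthy passes once disaggregation is dropped, even though its low-income and LEP subgroups are far off track:

```python
# Toy illustration of core principle 1 (disaggregation) and its "relaxing."
# pct_on_track maps each reported group to the share of its students
# projected as on track to proficiency; names and threshold are hypothetical.

def makes_ayp(pct_on_track, threshold=0.5, disaggregated=True):
    if disaggregated:
        # Core principle 1: every reported subgroup must clear the bar.
        return all(share >= threshold for share in pct_on_track.values())
    # Relaxed rule: only the schoolwide aggregate is checked, so strong
    # groups can mask weak growth among the very students NCLB targeted.
    return pct_on_track["all_students"] >= threshold

school = {"all_students": 0.55, "low_income": 0.35, "LEP": 0.30}
print(makes_ayp(school, disaggregated=True))   # False
print(makes_ayp(school, disaggregated=False))  # True
```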

Even though the U.S. Department of Education had approved the use of value-added projection models to demonstrate which schools and districts were making AYP, skepticism remains deep among highly respected statisticians, psychometricians, and economists when VAM is used for this and other high-stakes purposes. Some of those critical reviews will be examined in Part III.


[1] The Tennessee NCLB target (AYP) for math in 2005 was 79 percent proficient/advanced and 83 percent for reading.


