According to the latest Gates Foundation study, "Gathering Feedback for Teaching," student surveys are more reliable than outside observers in evaluating teachers. I agree with the researchers that such a finding should be no surprise. Even though I trust my students' judgments more than those of principals, or even trained outside evaluators, I would think twice about incorporating those surveys into high-stakes evaluations. Do we want to institutionalize a conflict of interest for educators, tempting some to lower their standards in the hope that their students will reciprocate when evaluating them? Do we want to take the risk that popularity issues will undercut collaboration among teachers?
The New Teacher Project (TNTP) makes a big issue of the Gates finding that value-added results are the better predictor of future value-added results. Well, duh! Since they are mostly studying the students' trajectories within their school, it is no surprise that "Gathering Feedback for Teaching" also concludes that two-thirds of the variance it found was due to factors other than the teacher. (Emphasis is theirs.) But why doesn't the TNTP recognize that such a finding contradicts its faith in teacher quality? The students' current test score trajectories are based on the sum of their in-school and out-of-school experiences, not just that teacher's skill. The human observers are just trying to evaluate that teacher's instruction, and it constitutes only about 15 to 20% of the students' outcomes. The question that the TNTP should ask is whether the test score trajectories of students from the entire district provide evidence for evaluating the value-added of teachers in schools where it is harder to raise student performance.
Secondly, why would the TNTP think that such a finding has anything to do with the policies it advocates? The question is whether test score growth models are valid enough for high-stakes decisions. More precisely, the question is how many teachers would be wrongly indicted as ineffective. Why would educators commit to the toughest schools if they faced a 10%, 20%, or 25% chance EACH YEAR of being subjected to humiliation, constant stress, and perhaps dismissal due to circumstances beyond their control? How many teachers per hall need to be inaccurately charged as ineffective before the morale of the entire building is compromised?
Why can the TNTP not see the three prime lessons of the new Gates research?
- The MET's findings "suggest that the classroom practices of the majority of teachers, as many as 85 percent, are remarkably similar." If the goal is not the mass scapegoating of teachers, then stakes do not need to be attached to data for professional development.
- Teachers score the lowest on the all-important factors of emotional connection, communication, and the teaching of analysis, problem-solving, and inquiry. But the MET does nothing but ask districts to stop imposing counter-productive test prep and abusive evaluations by untrained, overworked, stressed-out administrators.
- The most "vexing question" is whether value-added experiments are biased due to the "infinite number of additional student and peer characteristics." The MET promises to address a few of the easiest of those issues in the next report.
Typically, I am less annoyed by Gates researchers than by the true-believing teacher-bashers at the TNTP or extremists like its founder, Michelle Rhee, but I want to challenge their single most illogical soundbite. As usual, "Gathering Feedback for Teaching" cites the TNTP's polemical "The Widget Effect," and concludes that its expensive and dangerous experiment "significantly outperforms traditional measures." No! It outperforms the NON-USE of traditional measures. Nowhere in the MET literature do they ask what it would take to reform more normative procedures or why principals have not even terminated ineffective probationary teachers who have no due process rights. The MET shows how easy it can be to identify the bottom performers, but it does not mention how hard it is to find qualified replacements.
The TNTP has done its damage, however, by starting the "teacher quality" movement down a disgusting road, so now we should concentrate on non-ideological educators, such as those who have advised the MET. We could then build on two other findings.
Firstly, both evaluations and professional development require the training of objective evaluators. We should make the investments necessary to nurture principals and others so they can gain the ability to recognize good teaching and mentor others. We should welcome dissent and heed the Gates call for considering a robust variety of perceptions. Secondly, we must invest in creating learning cultures and professional development so that teachers can better communicate with, challenge, engage, and nurture their students. The best way to do that, of course, is to use diagnostic data, which is more accurate because it has not been tarnished by the gamesmanship that high stakes create.
Dr. John Thompson was an award-winning historian, lobbyist, and guerrilla gardener who became an award-winning inner-city teacher after crack and gangs hit his neighborhood. He blogs at thisweekineducation.com and huffingtonpost.com, and is writing a book on 18 years of idealistic politics in the classroom and realistic politics outside.