In February 2005, changes were entered into the Federal Register (70 FR 3586, January 25, 2005) that essentially re-defined knowledge for purposes of educational policy and decision making. The gold standard established by Grover Whitehurst and Co. has now been defined as the same experimental and quasi-experimental design that is used in medical research. Now, if you are testing a new drug to determine its effects, that kind of double-blind research sounds entirely appropriate. Control groups, placebos, the whole bit, yes? Here are a few clips from that document (click here for [PDF]):

The definition of scientifically based research in section 9201(37) of NCLB includes other research designs in addition to the random assignment and quasi-experimental designs that are the subject of this priority. However, the Secretary considers random assignment and quasi-experimental designs to be the most rigorous methods to address the question of project effectiveness. While this action is of particular importance for programs authorized by NCLB, it is also an important tool for other programs and, for this reason, is being established for all Department programs. Establishing the priority on a Department-wide basis will permit any office to use the priority for a program for which it is appropriate.
The Secretary establishes a priority for projects proposing an evaluation plan that is based on rigorous scientifically based research methods to assess the effectiveness of a particular intervention. The Secretary intends that this priority will allow program participants and the Department to determine whether the project produces meaningful effects on student achievement or teacher performance.
Evaluation methods using an experimental design are best for determining project effectiveness. Thus, when feasible, the project must use an experimental design under which participants--e.g., students, teachers, classrooms, or schools--are randomly assigned to participate in the project activities being evaluated or to a control group that does not participate in the project activities being evaluated.
If random assignment is not feasible, the project may use a quasi-experimental design with carefully matched comparison conditions. This alternative design attempts to approximate a randomly assigned control group by matching participants--e.g., students, teachers, classrooms, or schools--with non-participants having similar pre-program characteristics.
In cases where random assignment is not possible and participation in the intervention is determined by a specified cutting point on a quantified continuum of scores, regression discontinuity designs may be employed.
For projects that are focused on special populations in which sufficient numbers of participants are not available to support random assignment or matched comparison group designs, single-subject designs such as multiple baseline or treatment-reversal or interrupted time series that are capable of demonstrating causal relationships can be employed.
Proposed evaluation strategies that use neither experimental designs with random assignment nor quasi-experimental designs using a matched comparison group nor regression discontinuity designs will not be considered responsive to the priority when sufficient numbers of participants are available to support these designs. Evaluation strategies that involve too small a number of participants to support group designs must be capable of demonstrating the causal effects of an intervention or program on those participants.
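To see what the priority is actually privileging, here is a minimal sketch, in Python with entirely hypothetical data and names, of the difference between the two assignment rules the document blesses: random assignment (the experimental design) and a cutoff on a quantified score (the regression discontinuity design):

```python
import random

random.seed(1)

# Hypothetical pre-program scores for ten students (illustrative only).
students = {f"student_{i}": random.uniform(0, 100) for i in range(10)}

def random_assignment(names):
    """Experimental design: treatment status is assigned by chance,
    independent of any student characteristic."""
    names = list(names)
    random.shuffle(names)
    half = len(names) // 2
    return set(names[:half]), set(names[half:])

def rd_assignment(scores, cutoff=50.0):
    """Regression discontinuity design: treatment status is determined
    entirely by a specified cutting point on a quantified score."""
    treated = {name for name, s in scores.items() if s < cutoff}
    control = set(scores) - treated
    return treated, control

treat_rct, ctrl_rct = random_assignment(students)
treat_rd, ctrl_rd = rd_assignment(students)

# Under the RD rule, every treated student falls below the cutoff; the
# causal comparison rests on students just on either side of it.
assert all(students[name] < 50.0 for name in treat_rd)
```

The contrast is the whole point of the priority: in the first rule, who gets the program is unrelated to who the students are; in the second, it is determined completely by a single measured score, which is what makes the design analyzable when randomization is off the table.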
The question becomes: is this model feasible for education settings? And if it is feasible, what kinds of research can and cannot be deemed scientific, and thus acceptable? That's right: in schools we can't measure weight loss, blood pressure, or cell counts, but we can measure test scores in a way that, by definition, makes them "scientific."
In this epistemologically arrogant re-definition of what constitutes knowledge, Grover Whitehurst at ED has, in effect, eliminated all of the rich qualitative data that aim at understanding educational experiences within the natural contexts of schools. With this new re-definition of knowledge, one must wonder, in fact, how long ERIC, now under corporate management, will continue to warehouse all of the qualitative research that does not meet the new criteria for scientific respectability.
Today AIR (American Institutes for Research) announced the results of a multi-million dollar study (Study Rates 22 Widely Used Comprehensive School Reform Models) funded by, who else--ED, that concludes that Direct Instruction (remember the Great Carnine and his Oregon empire built on Bush connections?) is the superior "comprehensive reform model" now in use in U.S. schools. (For those of you who have not seen Direct Instruction in action, check out these video clips.)
What shiftless outfit, we wonder, is at the bottom of the list? That's right, the Coalition of Essential Schools, that old-fashioned bunch of democracy supporters that focuses on developing schools built around the needs of persons and communities. (Remember Coalition principal, Deborah Meier, the only educator to ever win a MacArthur Genius Award?)
Okay, okay, you say, but now AIR shows scientifically that Direct Instruction (drill, fill, and kill) is the best thing since the re-education camp. After all, this is what the newswire says:
WASHINGTON, Nov. 29 /U.S. Newswire/ -- A new guide using strict scientific criteria to evaluate the quality and effectiveness of 22 widely adopted comprehensive elementary school reform models rates 15 as "limited" to "moderately strong" in demonstrating positive effects on student achievement.

Well, it seems that such conclusions may be premature, even though ED coughed up millions to have this study done. Remember the demands for "scientifically-based" research you read above in Grover's proclamation? See if there is any fit with this little admission (listed under Limitations) from AIR's "research:"
Although this report builds on the strong prior work of others (e.g., Borman et al., 2002; Herman, et al., 1999) and the best thinking of the education research community regarding how to conduct consumer-friendly evidence reviews, it falls short of the ideal in a number of areas. We hope that over time—with the feedback of education consumers, researchers, and model providers—we will be able to issue future reports that are increasingly accurate and useful. Relying on existing evidence in providing ratings was a major limitation of this report. Our descriptive information was based on a review of publicly available information, often provided by the models themselves. Given limited resources, verifying the claims made by all service providers was impossible. We did attempt to gather independent information through conversations with a small group of randomly selected principals of schools served by the models reviewed.
However, these were informal conversations, conducted with only a very small number of individuals. Given our limitations, other participants and stakeholders involved in CSR—such as teachers, students, parents, and school board members—could not be reached. During the model selection process, we encourage consumers to probe more deeply for further information to support their final choice of a model. For example, schools and districts are in a better position to request detailed cost information for proposed or additional services from a model provider as part of a contracting process.
Likewise, our quantitative information was limited to a review of available prior research that had been conducted on the 22 models. While we searched extensively to uncover all sources of existing evidence, we were not able to conduct original research or to apply common evaluation measures across all models to ease comparability. Also, because models are evolving and refining their design, we can’t be certain whether the “high” or “low” ratings given to a model are truly representative of the current version of that model. Many models may be “new and improved” but may not yet have rigorous evidence to demonstrate such a claim.
I am glad to know that DI now has the backing of science, as required by Grover Whitehurst and the other great minds at ED. But then, so does intelligent design--at least the backing of creation science, anyway.