Two hot-air balloonists get lost, and they’re floating aimlessly. They spot someone down below them, and call out, “Hello!” The person on the ground replies, “Hello!” “Where are we?” one calls down. Up comes the reply: “You’re in a balloon!” They continue to drift, and one of the balloonists says to the other, “Who was that?” And the other responds, “That was obviously an economist.” “An economist? How can you tell?” the first asked. “Because what he said was precise, but irrelevant,” the other replied.
“Precise, but irrelevant” is my three-word assessment of the recent study of traditional and alternative teacher certification conducted by Mathematica Policy Research for the Institute of Education Sciences in the U.S. Department of Education. (And the study really isn’t very precise, but that’s a more technical story.) The design of this study successfully precludes it from addressing the most salient policy questions about alternative teacher certification–but we get a pretty clean estimate of the relative effectiveness of pairs of traditional-route and alternate-route teachers that are not representative of any population of teacher education programs, teachers, or schools.
The biggest weakness of the study, in skoolboy’s opinion, is that it fails to take seriously the idea that the elements of teacher education programs differ from one another, and that there is variability in the quality of programs–within both the population of traditional teacher certification programs and the population of alternative route teacher certification programs. The design of the Mathematica study doesn’t evaluate the operations and outcomes of particular traditional or alternative programs. And yet most of the relevant policy questions pertain to investment in particular programs or the hiring of graduates of particular programs. The study design cannot address these questions.
Here’s what the Mathematica researchers did: They compiled a list of 165 alternative teacher certification programs across the country that are not as selective as high-profile programs such as Teach for America, some sponsored by institutions of higher education and others by school districts or regional education agencies, and drew a stratified random sample of 63 programs. Then they looked for elementary schools that had hired an alternatively-certified teacher from one of these programs within the past three to five years. They then filtered these schools to the subset that had hired a relatively novice traditionally-certified teacher in the same grade, creating a pair of teachers, one traditionally certified and one alternatively certified, both with fewer than five years of experience, in the same grade in the same school. Students in that grade were then randomly assigned either to the traditionally-certified teacher’s classroom or the alternatively-certified teacher’s classroom. The use of random assignment created what Mathematica refers to as a mini-experiment, which led the researchers to attribute any achievement differences observed at the end of the school year to the effect of having an alternatively-certified vs. traditionally-certified teacher. The researchers observed each teacher, rating them on classroom instruction practices, and distinguished between alternatively-certified teachers from high-coursework programs and those from low-coursework programs. The resulting study involved 2,600 students in 63 schools in 20 districts, spread across 7 states. The key conclusion: “The study found no benefit, on average, to student achievement from placing an [alternatively-certified] teacher in the classroom when the alternative was a [traditionally-certified] teacher, but there was no evidence of harm, either.”
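For readers who want to see the logic of a "mini-experiment" spelled out, here is a minimal sketch in Python. It is not Mathematica's actual analysis, which the report describes in far more detail; the function names, the even split between classrooms, and the simple difference in means are all assumptions made for illustration.

```python
import random
from statistics import mean


def assign_students(student_ids, seed):
    """Randomly split one grade's students between the two paired teachers.

    Hypothetical helper: an even split is assumed here for simplicity.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"traditional": shuffled[:half], "alternative": shuffled[half:]}


def mini_experiment_estimate(scores, assignment):
    """End-of-year mean score difference: alternative minus traditional.

    `scores` maps student id -> test score. Because students were assigned
    at random, this difference is attributed to the teacher, not to which
    students landed in which room.
    """
    alt = mean(scores[s] for s in assignment["alternative"])
    trad = mean(scores[s] for s in assignment["traditional"])
    return alt - trad


def pooled_estimate(estimates, weights):
    """Weighted average of the per-pair estimates (weights are an assumption,
    e.g. the number of students in each pair)."""
    return sum(e * w for e, w in zip(estimates, weights)) / sum(weights)
```

Note what the sketch does and does not randomize: students are randomized to classrooms within a school and grade, but the teachers, programs, and schools in each pair are whatever the sampling procedure happened to turn up.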
Sounds pretty good, right? No way to address the fact that prospective teachers self-select into traditional or alternative certification programs, of course, but we can’t rely on random assignment to solve that problem, and the researchers acknowledge this, saying, “Because of likely differences in the types of people who attend various certification programs, the results cannot be used to rigorously address how a graduate of one type of program would fare if he or she had attended another type.” A large, diverse sample … random assignment to control selection into a teacher’s classroom … what’s to complain about?
How about this: the alternatively-certified teachers and traditionally-certified teachers in the study were not necessarily representative of graduates of the programs they attended. In fact, both the alternatively-certified teachers and traditionally-certified teachers in the study were persisters; any program graduate who had left teaching in the early career–and there are concerns about early attrition within the alternate-route population–would not appear in the sample. Moreover, the alternative certification programs and traditional certification programs in the study were not necessarily representative of all alternative certification programs and traditional certification programs. The schools in the study were not necessarily representative of all schools that hire both alternatively-certified and traditionally-certified teachers. And finally, the skills and competencies appearing on the California Achievement Test (CAT-5) may not be aligned with, and hence representative of, state and district curricular priorities.
Some evidence of the problems that this causes: One-half of all of the teachers in the study are from a single state, Texas. 71% of all of the classrooms in the study are kindergarten through second grade. The average number of students per classroom at the start of the year was 15.
So: if you’re a Texas principal interested in hiring an early elementary grade teacher with a few years of experience into a small classroom based on generic standardized test scores, this is the study for you. On the other hand, if you are interested in the quality of particular traditional certification programs or alternative certification programs, at either the elementary or secondary levels … or if you are a policymaker wondering whether to increase investment in alternative certification programs … or if you’re a principal wondering whether your next hire should be a traditionally-certified or alternatively-certified teacher … keep looking. The Mathematica study won’t tell you what you want to know.
Hey Aaron,
Thanks for this post: it summarizes a long paper in a clear manner and gets to the heart of the results. I wasn’t aware of some of the specifics of the sample set.
You write: “… if you’re a Texas principal interested in hiring an early elementary grade teacher with a few years of experience into a small classroom based on generic standardized test scores, this is the study for you.” The study could also be useful to policymakers, principals, and others who are willing to consider the data to be relevant to, if not the same as, their own circumstances. Your list of particular facts about the data can help them to judge the relevance.
Regardless, if I read it correctly, the study gives an example of traditional certification and extra coursework having no impact on effectiveness. My understanding (perhaps incorrect) is that the vast majority of studies on this general subject reach similar conclusions. I would be curious to hear about your favorite studies that reach different conclusions.
Ken,
Many scholars are now reexamining certification and coursework studies in light of two considerations: 1) that the issue of measurement error has confounded estimates of the effects of teacher observable characteristics on student outcomes, and 2) that the selection of teacher candidates into different kinds of training and education experiences makes identifying the effects of these factors difficult.
Here is the critical part of an abstract from a study that is addressing issue 1:
When teacher attributes [i.e. certification, college attended, SAT scores, experience] are considered jointly, based on the teacher attribute combinations commonly observed, the overall effect of teacher attributes is roughly half a standard deviation of universe score gains – even larger when teaching experience is also allowed to vary. The bottom line is that there are important differences in teacher effectiveness that are systematically related to observed teacher attributes. Such effects are important from a policy perspective, and should be taken into account in the formulation and implementation of personnel policies.
The quality of research in education remains appalling.
This study was too poorly designed to draw conclusions.
With a sense of integrity, a researcher would have declined to publish. But they know that there are advocates out there who will treat the null conclusion as a conclusion, and run with it.
Ken: Policymakers, principals and other people rely on all sorts of things to inform their decisions, and research is just one of them. I haven’t been able to think of a circumstance in which I think it would be appropriate to rely heavily on this study to make an important decision, but I suppose I can’t rule out the possibility.
I don’t think that ed schools preparing teachers have, as a group, been very successful in making the case either that they transform the knowledge, skills and dispositions of prospective teachers, or that these qualities increase the likelihood that children and youth will learn what society expects. The lack of compelling evidence makes ed schools vulnerable to criticism. But I’m not sure that the existing literature is all that relevant to key policy issues regarding teacher supply and demand. Alternative certification is here to stay.
eduwonkette: I haven’t figured out how I feel about the effect size argument made by the Pathways group. Sure, if you adjust for random measurement error, you’ll see bigger effects–there’s nothing exotic about that. But there’s a difference between modeling and predicting outcomes, and it seems to me that the predicted and observed outcomes will still reflect the dampening effects of measurement error.
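To make the measurement-error point concrete, here is a small Python sketch of the standard attenuation relationship for a standardized effect size. It is not the Pathways group's model; the function names and the reliability value used in the example are hypothetical. Reliability here means the share of observed score variance that is true-score variance.

```python
import math


def observed_effect_size(true_effect_size, reliability):
    """Effect in observed-score SD units, given the effect in true-score SD units.

    Random measurement error inflates the observed SD by 1/sqrt(reliability),
    so the same raw difference looks smaller when standardized.
    """
    return true_effect_size * math.sqrt(reliability)


def disattenuated_effect_size(observed_effect, reliability):
    """Inverse adjustment: re-express an observed effect in true-score SD units."""
    return observed_effect / math.sqrt(reliability)


# Hypothetical example: a reliability of 0.25 is chosen only to keep the
# arithmetic simple, not because gain scores in any study had that value.
true_d = 0.5
obs_d = observed_effect_size(true_d, reliability=0.25)
recovered = disattenuated_effect_size(obs_d, reliability=0.25)
```

The adjustment changes the yardstick, not the raw score difference: the outcomes anyone actually observes still carry the error, which is the distinction between modeling and predicting drawn above.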
Jonathan: There’s more than enough blame to go around on the Mathematica study, but one thing I’d point out is that the report was prepared under a contract from the Institute of Education Sciences. This means that Mathematica was contractually obligated to deliver a report to IES, which could then release the report at its discretion. In my view, the main problems with this study were in its design more so than its execution, but it can be difficult to reconstruct how and why particular design decisions were made.
RESPONSE:
I would like to focus on the point that “if you are a principal wondering whether your next hire should be a traditionally-certified or alternatively-certified teacher…keep looking.” Principals in urban school systems have no choice, which is why we have alternative routes. The very persistence of alternative routes is due to “early attrition.” But that leaves out the question: why? Why are urban schools a revolving door for both teachers and principals? Studies by Susanna Loeb et al. (Teacher Policy Research) list many reasons for teacher attrition, but they too skirt around the issue by writing that teachers leave disadvantaged schools to teach higher-achieving students. I would argue that it is less about level of achievement and more about the social capital in the classroom that allows one to teach. John Merrow’s PBS series on education (the News Hour program) has highlighted, very boldly, the many problems that face urban schools, including the story of a principal attempting to turn around a school. Schools reflect their social environment, which in turn constrains the potential and options of both policy and the individuals who truly want to make a difference.
I don’t think an honest researcher should or would accept working on this or a similar study, no matter where the funding was from. It is not reasonable to assume that the effect being looked for would be measurable above the noise, even if the sample were larger.
All a study like this can do, since it cannot be completed on a scientific level, is hand people like Mr. Hirsch something to point to, as evidence for what they have already decided. Despite, of course, the study showing nothing.
Yes, the conflict of interest is immediately obvious. But the only good research I have seen recently in education has been Eduwonkette’s anti-research, showing the world that the data the DoE (and others) pump out is full of holes.