Writing in the Autumn, 2009 issue of the City Journal, Marcus Winters seeks to blame the “narrow political interests” of teachers’ unions for resisting the linkage of test scores to teachers, and thereby blocking New York’s access to the Race to the Top honeypot. He’s seen the future, and it’s a data revolution resting on standardized tests. This data revolution “promises to move education policy away from politics,” Winters writes. “Numbers don’t have agendas or run for reelection.”
No, of course they don’t. But the people who produce those numbers do. We would all be wise to recognize that the veneer of scientific objectivity coating most standardized tests is paper-thin. Politics infuses the form that standardized tests take; their length; how they are scored, and by whom; the content standards that appear on the tests; and the judgments about which levels of performance are to be labeled proficient.
Here’s what I saw at the data revolution: (more…)
Last night, at the GothamSchools party, I had the opportunity to say hello to David Cantor, Press Secretary for the New York City Department of Education. As he turned to talk with an angry parent, a piece of paper fell out of his pocket, and I picked it up. It looked like a draft of the press release he issued for the release of the 2009 NYC NAEP math scores, but it was all marked up. Could I have found his annotations as he was drafting the press release?
Chancellor Klein Applauds New York City Public School Students For Six Years of Sustained and Significant Gains in Math on National Exam (Let’s get that “six years” in at the start, to make it look like the growth has been steady, rather than stalled over the past two years.)
City Students Outperform the Rest of the State and Nation on the National Assessment of Educational Progress (“Outperform”? Only in the sense that NYC fourth-graders scored almost as high as students in the nation overall, and NYC eighth-graders scored significantly lower than eighth-graders nationally. But it’s a headline, and who pays attention to them, anyway?)
Record Number of Students Performing at or Above Proficiency
Chancellor Calls on State to Adopt More Rigorous Standards to Ensure Further Progress
Schools Chancellor Joel I. Klein today applauded consistent and sustained gains by New York City public school students on the 2009 National Assessment of Educational Progress (NAEP) math exam. (Consistent and sustained might be a stretch, but maybe it’ll pass.) (more…)
What is it about the Harlem Children’s Zone that causes pundits and reporters to suspend disbelief? Perhaps it’s the deep desire for evidence that the large and persistent racial gap in educational achievement can be overcome. The enduring racial inequalities in educational and social outcomes in the U.S. are a blight on our society, and evidence that these inequalities can be eliminated, however tenuous, can be elevated into the feel-good story of the year.
Last night, Anderson Cooper reported on the Harlem Children’s Zone for the CBS newsmagazine 60 Minutes. “For years, educators have tried and failed to get poor kids from the inner city to do just as well in school as kids from America’s more affluent suburbs,” he began. “Black kids still routinely score well below white kids on national standardized tests. But a man named Geoffrey Canada may have figured out a way to close that racial achievement gap.” Cooper asked Canada, “So you’re trying to level the playing field between kids here in Harlem and middle class kids in a suburb?” “That’s exactly what we have to do,” Canada replied.
As is customary, Cooper spoke with Harvard economist Roland Fryer, who has analyzed the achievement of students attending the HCZ Promise Academy charter schools. Fryer said, “At the elementary school level, he closed the achievement gap in both subjects, math and reading.”
“Actually eliminating the gap in elementary school?” Cooper asked.
“We’ve never seen anything like that. Absolutely eliminating the gap. The gap is gone, and that is absolutely incredible,” Fryer said. (more…)
Monday afternoon, I had the opportunity to respond to Merryl Tisch, Chancellor of the Board of Regents, and David Steiner, the New York State Commissioner of Education, as they talked about the future of P-16 education in New York State at the Phyllis L. Kossoff Policy Lecture at Teachers College, Columbia University. I wasn’t sure what they’d say, so I prepared some remarks responding to the proposals regarding teacher education in New York State that the Commissioner presented to the Board of Regents a few weeks ago. For the handful of readers who might be interested, here’s what I wrote. (Due to time constraints, I didn’t say all of this at the event.) Chancellor Tisch and Commissioner Steiner were quite willing to hear and engage with the critiques that my colleague Lin Goodwin and I offered, and I look forward to continuing this conversation with them.
It’s no surprise that the State Education Department and the Board of Regents have taken up the cause of ensuring an equitable distribution of highly-qualified teachers across New York State. The key justification for such a goal is the fact that the K-12 education system is shortchanging our children. Although some students are highly successful, many more are not, and the problems are concentrated in urban school systems serving large numbers of poor children of color.
If that’s the problem, is improving the education of teachers the solution? It’s certainly part of the solution, given what we know about the centrality of teaching to student learning. But it’s by no means the entire solution, as a great many other forces shape student outcomes. For example, a great teacher can’t compensate for a child coming to school hungry, and great teaching of an out-of-date curriculum only results in great mastery of out-of-date knowledge. I trust that Chancellor Tisch and Commissioner Steiner are not seduced by claims that the single most important determinant of a child’s achievement is the quality of his or her teachers, because that’s simply not true. Family background continues to be the dominant factor. But the quality of teachers is, at least in theory, something that is manipulable via education policy initiatives, and it’s a lot more tractable than addressing the fact that one in five children under the age of 18 in New York State live below the poverty line. (more…)
I’m not sure how much credibility the Progress Reports at the heart of the NYC Department of Education’s accountability system have left. The elementary and middle school Reports issued earlier this fall were ridiculed for their inability to distinguish one school from another, since 97% of the schools received A’s or B’s (and 84% received A’s). Moreover, I showed that the student progress measures that make up 60% of a school’s overall score were highly unreliable from one year to the next. As long as these reports are tied to year-to-year changes in state test scores, they’re likely to be fatally flawed.
On Monday, the Department released the 2008-09 Progress Reports for high schools. Anna Phillips reported that Chancellor Joel Klein said that the high school Progress Reports were more stable and accurate than those for elementary and middle schools because they’re based on multiple measures. Huh? Welcome to the party, Chancellor Klein. I hate to tell you that measures such as credit accumulation are not necessarily accurate measures of a school’s contribution to student learning and development.
But the high school Progress Reports have a bigger problem. Three-quarters of a school’s score comes from a school’s location in relation to a group of 40 peer schools. The idea of comparing a school to peer schools is to create an “apples to apples” comparison. It’s actually a good feature of the Progress Reports that they seek to compare a given school to how schools across the city are doing as well as to how schools that serve similar students are performing. (more…)
A few years ago, the New York State lottery’s slogan was “Hey, you never know.” In its original formulation, the slogan sought to motivate New Yorkers to play the lottery, a game of chance, on the grounds that you never know unless you play if you are a winner. But the slogan is a double entendre when applied to Caroline Hoxby’s highly-publicized study of the effects of attending a charter school in New York City. Propelled by Hoxby’s forceful claims about the superiority of lottery-based research on charter schools, much of the mainstream media has concluded that we now know definitively that New York City charter schools outperform their traditional counterparts—in spite of the fact that her study has not undergone a rigorous peer review process that might identify problems in the study and ways of addressing them. Today, however, an equally forceful critique prepared by Sean Reardon of Stanford University argues that Hoxby’s research is anything but definitive. Citing flaws in the statistical analysis of the report, Reardon writes that it “likely overstates the effects of New York City charter schools on students’ cumulative achievement … It may be that New York City’s charter schools do indeed have positive effects on student achievement, but those effects are likely smaller than the report claims.”
Reardon is careful to point out that it’s not possible, based on the information provided in Hoxby’s report and associated documents, to judge the extent of the bias in Hoxby’s estimates of charter school effects on student achievement. More than anything, he calls for reserving judgment until more information about the study, its data and methods are available, and until the study has undergone rigorous peer review. Until then, he maintains, it would be unwise to rely on the statistics reported in the study, and the inferences Hoxby and her colleagues draw about charter school effects in New York City.
Here I’ll mention two of the features of Reardon’s critique that I find particularly persuasive. The first is that Hoxby used an inappropriate set of statistical models to analyze the data, which likely distorts the charter school effects. You might be surprised to learn that Hoxby used statistical models at all. If her results are based on comparing students who won a charter school lottery with students who lost the lottery, and the lottery was fair, balanced and random, why would a model be needed? It seems like the charter school effect would simply be the difference in the outcomes observed for the lottery winners and the lottery losers. But comparing lottery winners and losers isn’t really estimating an individual causal effect, because an individual student can’t simultaneously be enrolled in a charter school and a traditional public school. Even in the context of a lottery, or any other kind of study that can capitalize on a randomization process, such as a clinical drug trial, statistical models come into play to allow for inferences about cause-and-effect relationships. These inferences are always made in relation to a particular statistical model, and all such models have assumptions. (more…)
Is there anything that gets people’s dander up faster these days than comparisons of charter schools and traditional public schools? On Thursday, reporter Meredith Kolodner filed a story in the Daily News on the relative performance of charter schools and what the NYC Department of Education calls “district” schools. A fall, 2009 presentation emanating from the Department’s Office of Charter Schools, and posted on its website, reported on the charter school landscape in New York City, including the growth and location of charter schools, the composition of students attending them, the DOE’s accountability framework for evaluating charter schools, and some evidence on how charter schools were faring on the School Progress Reports, the crown jewel in the DOE’s accountability system. (Regular readers may know that I’ve been critical of key features of the Progress Reports for elementary and middle schools.)
Kolodner drew attention to the fact that although elementary and middle school charter students had higher rates of proficiency on the state math and English Language Arts assessments this year, charter schools on average had a lower score on the progress component of the School Progress Reports. And since the progress component makes up 60% of the overall score, charter schools also had lower overall scores on the Progress Reports than did district schools. She quoted Patrick Sullivan, an appointed member of the Panel for Educational Policy that the DOE describes as its governance body, on the meaning of this pattern. “Either the progress reports are invalid,” Sullivan said, “or charter schools are lagging.”
The Daily News article and a subsequent posting by Sullivan on the NYC Public School Parents blog prompted a quick reply from Peter Murphy, Director of Policy & Communications for the New York Charter Schools Association (NYCSA), here and here. Murphy called into question the metric used by the DOE in its Progress Reports, especially the fact that student performance only counts for 25% of the overall score, whereas student progress counts for 60%. This, he contended, is “woefully lopsided,” and unfairly penalizes schools that have had students scoring high for several years running. If I read his second posting correctly, he concludes that the progress reports indeed are invalid. (more…)
Down in DC yesterday, Chancellor Michelle Rhee faced sharp questioning from the D.C. Council about her office’s handling of hirings and layoffs of teachers and other staff members over the past several months. The DC Public Schools hired 934 teachers during the spring and summer, with an average age of 32. Faced with a budget shortfall of $43.9 million in the 2010 budget, Rhee announced the layoffs of 266 teachers and other staff on October 2nd.
Critics wondered why this budget shortfall wasn’t identified earlier, before such widespread hiring, and some have questioned whether this pattern of hirings and layoffs was intentionally orchestrated to replace older, veteran teachers with younger, less-costly ones. DCPS, under Chancellor Rhee’s name, posted on October 7th a list of Frequently Asked Questions Concerning the Budget Shortfall and Staffing Reductions. One of the questions was: Did you target veteran teachers? (more…)
I missed Secretary of Education Arne Duncan’s speech at Teachers College on Thursday because I was working on his behalf in Washington. I was one of about 17 researchers on a panel evaluating a batch of research proposals on school reform for the Institute of Education Sciences (IES), the research arm of the federal Department of Education. IES seeks to identify malleable factors (e.g., education programs, policies and practices) that can improve education outcomes. To do so, IES has developed a progressive goal structure for research projects. Goal One projects are exploratory, and intended to inform the development of interventions by examining existing relationships between policies and practices and educational outcomes. Goal Two projects are intended to develop innovative educational interventions that can be implemented in school settings, and to collect some preliminary data on the educational outcomes observed in a pilot implementation of the intervention. Goal Three projects use rigorous methods to examine the efficacy of fully-developed interventions, as well as the feasibility of implementation, in at least one local site. And finally, Goal Four projects attempt to evaluate whether interventions proven to be successful in a local site, with help from the program developers, can be scaled up to be effective under different conditions, and without the direct involvement of the program developers. (There’s also a Goal Five, for research on measurement, but that’s a different animal.) Over the years that IES has had this goal structure, more than 70% of the projects funded under Goals One through Four have been Goal One or Goal Two projects; about one-quarter have been Goal Three projects, and only 3% have been Goal Four projects.
The reasons for this are pretty clear. To be a good prospect for scaling up in a Goal Four project, an intervention must previously have been shown to be effective in at least one site, using rigorous methods for assessing cause-and-effect relationships. Relatively few interventions meet this threshold, because most policies and programs don’t have educationally meaningful effects, even if it seems like they ought to. Similarly, projects that are good candidates for Goal Three funding must previously have shown at least some evidence of effects on student outcomes in pilot studies in which the intervention received a tentative tryout, but not a full-blown test using rigorous experimental or quasi-experimental research methods.
I was struck by a thought experiment: what if my panel of distinguished researchers (the other members, at least) had been presented with a proposal based on the Race to the Top criteria that Secretary Duncan talked about at Teachers College, and which have been acclaimed by opinion writers such as Nick Kristof and David Brooks, as well as the editorial page writers for major newspapers in New York City and around the country? The draft Race to the Top criteria for funding state proposals provide incentives for linking teachers to their students’ standardized test scores, and in his remarks on Thursday, Secretary Duncan drew attention to Race to the Top incentives for states and districts to link student performance to the teacher preparation programs from which students’ teachers had emerged. Only Louisiana currently does this, the Secretary said. What if a scale-up proposal for this intervention had been presented to a panel charged with applying the IES criteria to evaluate its fundability? (more…)
Today’s New York Daily News published a bold editorial on the progress of New York City schoolchildren under the administration of Mayor Mike Bloomberg and Chancellor Joel Klein. “You would be better off arguing that the world is flat, or that the sun revolves around the Earth, than to dispute that New York City kids are performing better and better in school,” writes the Daily News, crowing that there are “fresh and incontrovertible data” pointing to what the newspaper refers to as a “sea change” in New York City.
They might have wanted to wait a day.
This morning, the U.S. Department of Education released the 2009 results of the National Assessment of Educational Progress assessments of fourth-grade and eighth-grade mathematics in each state and for the nation overall. Nationally, fourth-grade performance held steady from 2007 to 2009, and there was a slight but statistically significant increase over this period in eighth-grade math performance. In New York State, the small declines in fourth-grade and gains in eighth grade were not statistically significant, leading to the conclusion that there has been no change in the performance of New York students on the NAEP math assessment from 2007 to 2009.
This is a very different story than the one told by New York’s own assessment system, on which the Bloomberg and Klein administration has staked its claims about the great progress in student achievement. The average scale score in fourth-grade mathematics increased from 680 in 2007 to 689 in 2009, a hefty 9 points; the jump in eighth-grade scores was even more dramatic, as the average scale score rose from 657 in 2007 to 675 in 2009, a remarkable increase of 18 points.
To put these two sets of numbers in context, the chart below shows the gains in fourth-grade and eighth-grade math performance from 2007 to 2009 expressed in standard deviation units (i.e., the amount of variation among individual students in 2007). According to NAEP, fourth-graders’ performance fell .07 standard deviations from 2007 to 2009, a difference that is not significantly different from zero. In contrast, fourth-graders gained .23 standard deviations on the New York State assessment from 2007 to 2009. Similarly, the NAEP results indicate that eighth-graders in New York gained .08 standard deviations from 2007 to 2009 in math performance, a difference that is not significantly different from zero, but they gained .47 standard deviations over this period on the New York State test.
Another way of comparing the implications of the two different sets of test results is to think about where the average student in 2009 would have scored in 2007. Based on these standard deviations, and assuming that the scores follow a bell-curve distribution, the New York State scores indicate that the average fourth-grader in 2009 scored at the 59th percentile of the 2007 fourth-grade distribution, which is a pretty big jump. The increment for eighth-graders is even more striking: the average eighth-grader in 2009 scored at the 68th percentile of the 2007 eighth-grade distribution, based on the New York State tests. In contrast, the NAEP data indicate that the average New York fourth-grader in 2009 scored at the 47th percentile of the 2007 distribution of fourth-grade math performance in New York State, and the average eighth-grader in 2009 scored at the 53rd percentile of the 2007 eighth-grade distribution.
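For readers who want to check the arithmetic, here is a minimal sketch of the conversion from a gain in standard deviation units to a percentile of the 2007 distribution. It assumes, as above, that scores follow a bell curve; the function names are my own, and the only inputs are the four standardized gains already quoted in this post.

```python
import math


def normal_cdf(z: float) -> float:
    """Share of a standard normal distribution falling at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_in_base_year(gain_in_sd: float) -> int:
    """Percentile of the base-year (2007) distribution at which the average
    student of the later year (2009) would fall, given the gain in 2007
    standard deviation units and assuming normally distributed scores."""
    return round(100 * normal_cdf(gain_in_sd))


# Standardized 2007-to-2009 math gains quoted in the text.
gains = {
    "NY State test, grade 4": 0.23,
    "NY State test, grade 8": 0.47,
    "NAEP, grade 4": -0.07,
    "NAEP, grade 8": 0.08,
}

for label, gain in gains.items():
    print(f"{label}: percentile {percentile_in_base_year(gain)} of the 2007 distribution")
```

A gain of zero lands at the 50th percentile by construction, which is why the two NAEP figures sit so close to the middle of the 2007 distribution.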
How can we explain these differences? There are lots of possible explanations, but most of them don’t hold up under close scrutiny. The two tests are taken by similar populations of students under similar conditions, and the grade-level mathematics standards on which the two assessments are based do not differ dramatically. The NAEP test is a low-stakes test, which might result in students not taking it seriously, but the statisticians who oversee the NAEP testing program look for patterns suggesting this, and find little evidence of it. It’s extremely unlikely that there’s rampant cheating going on in the New York State testing system that could explain the differences.
It’s possible that the New York State tests have been getting easier over time. I have yet to see definitive evidence ruling this out. There also is strong suggestive evidence of “score inflation” in the New York State tests, because there are predictable patterns in the standards which appear on the state tests year after year, with some standards showing up repeatedly each year, and some standards having never been tested at all during the life of the testing program. Schools and teachers can make use of these patterns, which also show up in the format of test questions covering particular standards, to focus their instruction on the subset of standards that crop up again and again. Because the New York State tests never test some standards, we have no idea about whether students have mastered them. In contrast, the design of the NAEP assessment allows for a much broader picture of mathematics performance, because so many more standards and test item formats are incorporated into the test.
Whatever the reason, the discrepancy between the NAEP trends and trends in the New York State test scores raises serious questions about what the New York tests are telling us about the academic performance of students in New York State. The same, of course, goes for New York City. We’ll see NAEP scores for New York City in a month or so, but it’s unlikely that they will yield a different story than what I’m describing here.
Is the Earth flat? No. But New York State test scores, and probably New York City scores, are.