GothamSchools — daily independent reporting on NYC public schools

data-driven decisionmaking

Why we won’t publish individual teachers’ value-added scores

Tomorrow’s planned release of 12,000 New York City teacher ratings raises questions for the courts, parents, principals, bureaucrats, teachers — and one other party: news organizations. The journalists who requested the release of the data in the first place now must decide what to do with it all.

At GothamSchools, we joined other reporters in requesting to see the Teacher Data Reports back in 2010. But you will not see the database here, tomorrow or ever, as long as it is attached to individual teachers’ names.

The fact is that we feel a strong responsibility to report on the quality of the work the 80,000 New York City public school teachers do every day. This is a core part of our job and our mission.

But before we publish any piece of information, we always have to ask a question. Does the information we have do a fair job of describing the subject we want to write about? If it doesn’t, is there any additional information — context, anecdotes, quantitative data — that we can provide to paint a fuller picture?

In the case of the Teacher Data Reports, “value-added” assessments of teachers’ effectiveness that were produced in 2009 and 2010 for reading and math teachers in grades 3 to 8, the answer to both those questions was no.

We determined that the data were flawed, that the public might easily be misled by the ratings, and that no amount of context could justify attaching teachers’ names to the statistics. When the city released the reports, we decided, we would write about them, and maybe even release Excel files with names wiped out. But we would not enable our readers to generate lists of the city’s “best” and “worst” teachers or to search for individual teachers at all.

It’s true that the ratings the city is releasing might turn out to be powerful measures of a teacher’s success at helping students learn. The problem lies in that word: might.

Value-added measures do, by many readings, appear to do the job that no measure of a teacher’s quality has done before: They estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability. That is, a teacher judged to be more effective one year by value-added is likely to continue to be judged effective the next year, and the year after that.

But this is not true for every teacher — hardly. Many teachers will be mislabeled; no one disputes this. Value-added scores may be more reliable than existing alternatives, but they are still far from perfectly reliable. It’s completely possible, for instance, that a teacher judged as less effective one year will be judged as very effective the next, and vice versa.

As we reported two years ago, when the NYU economist Sean Corcoran looked at New York City’s value-added data, he found that 31 percent of English teachers who ranked in the bottom quintile of teachers in 2007 had jumped to one of the top two quintiles by 2008. About 23 percent of math teachers made the same jump.

The fluctuation is acknowledged by even the strongest supporters of using value-added measures to evaluate teachers. One of the creators of the city’s original value-added model, the Columbia economist Jonah Rockoff, compares value-added scores to baseball players’ batting averages. One of his reasons: In each case, the year-to-year fluctuations of an individual’s score are about the same.

“If someone hit, you know, .280 last year, that doesn’t guarantee they’re going to hit .280 next year,” Rockoff said today. “However, if you hit .210 last year and I hit .300, there’s a very high likelihood I’m going to hit more than you next year, too. Whereas if you hit .280 and I hit .278, we’re basically the same.”

Another challenge is that many researchers still aren’t convinced that value-added scores are measuring the right sort of teacher impact. The challenge lies in the flaws of the measures on which value-added scores depend — standardized state test scores.

Tests are supposed to measure what a student has learned about a subject, but they can also reflect other things, like how well her teacher prepared her for the test, or how well she mastered the narrow band of the subject the test assessed.

The test-prep concern is magnified by findings that a single teacher can generate two different value-added scores if evaluators use two different student tests to determine them. The Gates Foundation’s Measures of Effective Teaching study calculated value-added scores for teachers based on both state tests and more conceptual tests. They found substantial differences between the two, according to an analysis by the economist Jesse Rothstein of the University of California at Berkeley.

“If it’s right that some teachers are good at raising the state test scores and other teachers are good at raising other test scores, then we have to decide which tests we care about,” Rothstein said today. “If we’re not sure that this is the test that captures what good teaching is, then we might be getting our estimates of teaching quality very wrong.”

Flags about exactly what high value-added ratings reward are also raised by studies that ask how the ratings match up with measures of what teachers actually say and do in the classroom. Heather Hill, a professor at Harvard’s Graduate School of Education, rated math teachers’ teaching quality based on an observation rubric called the Mathematical Quality of Instruction, which looks at factors like whether the teacher made mathematical errors and the quality of her explanations. Then Hill compared the math teaching rating to value-added measures.

Two individual cases stood out: One teacher had made a slew of math errors in her teaching, and the other had failed to connect a class activity to math concepts. But both teachers’ value-added scores put them at the top of their cohort.

There is some reason to think that value-added measures reflect more than test prep. Rockoff points out that while different tests can produce different value-added scores for the same teacher, the two measures are still correlated. Using different tests, he said, is akin to looking at slugging percentage rather than batting average. “I’m sure those two things are positively correlated, but probably not one for one,” he said.

More persuasively, a recent study by Rockoff and two other colleagues concluded that value-added measures can actually predict long-term life outcomes, including higher cumulative lifelong income, reduced chance of teen pregnancy, and living in a high-quality neighborhood as an adult. The study examined an anonymous, very large urban school district that bears several similarities to New York City.

That study targeted another concern about value-added measures: that teachers score consistently well year after year not because of something they are doing, but because they consistently teach students with certain advantages.

Rothstein has used value-added models to conclude that fifth-grade teachers have strong effects on their students’ performance in third grade — something they could not possibly influence, unless value-added scores reflect not just teachers’ influence but also advantages brought by students.

Rockoff and his colleagues evaluated the possibility by testing a question. If high value-added teachers do well because they get the “better” students of those in their grade, then their students’ high test score growth would be linked with mediocre performance in other classrooms. That would mean that, when researchers looked at growth for the entire grade, the “better” students’ growth would be canceled out by their less lucky peers. But the scores were not canceled out, suggesting that effective teachers did more than just have unusually good students.

None of this means that we won’t write about what the data dump includes or that we might not publish an adapted database that strips out information linking the city’s data to individual teachers. With more than 90 columns in the Excel sheet the city has developed — and more than 17,000 rows, representing the number of reports issued over their two-year lifespan — the release might well enable us to examine the city’s value-added experiment in new ways.

Value-added measures certainly aren’t going away. City officials only stopped producing Teacher Data Reports because they knew the State Education Department was preparing its own. The measures, which are expected to come out in 2013, will make up 25 percent of the evaluation for teachers of math and English in tested grades.

  • NUFF SAID

    By the way didn’t the DOE say they would be releasing the names of “18,000” teachers over 2 years of data-the first year alone was 12,000–also there are approx 80,000 teachers not 40

  • Jane S. Gabin

    Kudos to GothamSchools for not releasing names of individual teachers.  These ‘ratings’ make as much sense as the 100 ‘best’ or ‘top 10’ colleges — there are so many subjective factors that are not evident.

  • NYCParent

    Thank you for being the first grown-ups among the press on this issue!  And shame on the editorial boards (probably at the behest of Bloomberg) for pushing for their release in the first place.  What’s next, pushing for publishing students’ report cards?

  • Ellen

    Good for you…..

  • Ellen

      Don’t buy any newspapers tomorrow.  Ya gotta hit ‘em in the pockets and those of us who do not believe in the Mayor’s approach should boycott the Daily News, The Post and the NY Times (even if you do have to give up the daily crossword puzzle)

  • Michael M. (parent still)

    Less sense.  What college has ever been threatened with being closed down — er “turned around” — over such listings?

  • Michael M. (parent still)

     Switch to AMNewYork.  Get your kids hooked on KenKen.

  • NUFF SAID

    Reports are supposed to be released in the morning, it will be interesting to see if they are delayed until a 4pm data-dump time—just curious

  • Michael M. (parent still)

    Huzzahs.  GS staffers validate my faith in journalism.  It doesn’t have to be (per an SF radio station): “Digging down deep to get to the bottom to stay on top.”

  • http://www.accountabletalk.com/ Mr. A. Talk

    I already expressed my kudos to you on my blog, and I am glad to do so again here.

    As for Rockoff, he must be off his rocker. The analogy between batting averages is asinine. Two batters in the majors are all subject to the same rules, must face the same pitchers (or at least pitchers of similar quality), and play in the same parks. As teachers, we face different students from different backgrounds, as we teach different subjects in different neighborhoods with different poverty rates to students of different cultures. 

  • allan w.

    you made the right call.  incidentally, how come it’s always the teachers?   where are the calls for ranking of police officers or fire fighters?  or merit pay for them?

  • Invictus

    Wait until Obama and company have their way with colleges that receive federal aid…it will make Cuomo’s grandstanding and legislative bullying to pass the New Teacher’s evaluation deal look like little kids bullying one another for candy during recess.  

  • Clay

    Elizabeth,

    Here’s an idea for a story.

    Dennis Walcott’s daughter is a newer phys ed teacher in Queens. She was hired during the time that her license fell under a hiring freeze in her district. How and why was this allowed?

  • SuSaw

     or principals or superintendents? How and by whom are they evaluated?

  • Mama Bear

    GS, I like your standards and ethics.

    What I think is more maddening than the questionable ratings: charter school teachers’ names and ratings aren’t included. 

  • Ellen

    I think that Rockoff read “Moneyball” and decided he’d adapt James’ research on performance….which, by the way, blew away all of the old assumptions about baseball and baseball players because James looked at consistent performance and the ability to move a player from one base to the next  not homers and strikes….I’m just sayin’

  • I noticed that…

    To Phillisa and Elizabeth:  From the bottom of my heart and for all the teachers whose TDRs do not truly show the great work they do on a daily basis, thank you, thank you, thank you.  On behalf of my colleagues, the tremendous gratitude I have for both will never be greater than the integrity and ethics you are showing to the public by standing with the teachers. 

  • I noticed that…

    Ellen brought it out.  Let’s boycott the Post, DN, and the Times!  BOYCOTT, BOYCOTT,BOYCOTT,BOYCOTT,BOYCOTT,BOYCOTT,BOYCOTT,BOYCOTT,BOYCOTT,BOYCOTT!

  • Pogue

    I love your idea.  It’s an action like this that UFT leadership could very simply get behind and have an effect.

    They could say, “We are tired of being attacked and abused in the media.  We are tired of having little to no say in education issues, while those outside our institution dictate our actions.  These newspapers seem to be against the teachers, children and parents of our NYC education system and we choose not to support them anymore.”

    Bingo.

  • Natalia Warner

    Teachers are responsible for the education of the country’s youth. This is obviously a very important job. That’s why we need to evaluate them. Is that surprising?

    I am not sure anyone is opposed to evaluating firefighters and police officers (and principals) as well. 

  • Natalia Warner

    I don’t necessarily disagree with the decision by GS, but just to be clear, when someone stands with the teachers, whom are they standing against?

    While I am sure you will say it’s Bloomberg and others, the reality is that there are thousands of parents who know that their kids deserve better than they are getting and want to see this data, too.

  • Natalia Warner

    So what do you suggest? Should we not evaluate teachers?

  • Natalia Warner

    This is a ridiculous comment. Teachers are paid by the public. The public deserves to know what it is getting for its money. Teachers need to be held accountable, like a person in pretty much any other profession.

    There is no reasonable comparison to students’ report cards.

  • Guest

    I already don’t buy their papers.  

  • Guest

    I notice you keep making comments about supporting  Bloomberg and his bashing of teachers.

    As a teacher I deserve students who leave their homes fed, clothed correctly, loved and prepared (money isn’t the issue, some of the kids with the terrible home lives are kids with wealthy parents).  My students deserve these things, too.  Sadly, many of my kids come to school hungry, without winter clothes, without sleep and without love.  I have tons of kids who stay in school for hours and hours after-school because home is not a good place.  Many aren’t physically abused, but ignored and abandoned.  I wish we could post all the parents we call and who never return a call.  I wish we could post all parents who tell the teachers they don’t like their own children or ask me how the teacher can control them in school, but they can’t at home.  I wish we could post all the names of parents who tell the teacher how they are too busy to check up on their child or they are old enough to deal with it on their own (14 year olds).

    I wish we could post how our school budgets are cut so classes are huge and there is very little help for the students.  I wish we could post how principals spend resources on their pet projects and their pets and everyone else must fend for themselves.  

    Kids deserve better than they have, but don’t put this all on the teachers.  For the most part, their teachers are not part of the problem, but are part of the solution.

  • I noticed that…

    To Natalie:  If the data that indicates a child’s growth is faulty, do you feel it’s fair to child and to his/her teacher?  Don’t you want to see reliable data where you can see your child’s strengths and weaknesses?  Why are you not supporting the teachers who worked so diligently with your child?  You would rather believe the distorted data than the teachers whom you’ve spoken to regarding your child’s progress.  I feel that you are not seeing the big picture.  If data were to be produced in the newspaper about parenting skills, do you want the public to know that your parenting skills did not meet the standard?  Just imagine how the teachers feel and the grateful support that teachers are getting from the community and from other parents.  Speak to Leonie Haimson and she’ll explain why she’s against the TDR.

  • Jim

    It’s good that gothamschools is not publishing names, but it appears that you are going to post data in some manner.  Will you be listing teacher ratings without names but identifying schools? Because the data includes a teacher’s number of years of experience, whether they had a co-teacher, what years for which the teacher has data, etc, it is likely that people associated with a particular school will nonetheless be able to identify many of the teachers  (especially at smaller schools).  So while the damage won’t be as severe as, say, what the Times is planning to do (to say nothing of the Post), if you include school name you still aren’t doing enough to protect anonymity.

  • aD15parent

    Natalia,

    I have two children in high school. My 12th grader escaped much of the poor Bloomberg policies. However, my 9th grader has been scarred. My 9th grader has numerous learning disabilities but for whatever reason his schools pushed off an evaluation. The primary reason for this was that he was able to pass the standardized tests. I questioned the validity of the tests and tried to point out that there was no correlation between the scores and his class work. When the tests were made more difficult, his scores suffered. I blame the DOE and the legislators for granting mayoral control. The answer I seemed to get was that all my child needed was more test prep when what is actually needed was for him to see a learning specialist. I could not care less about test scores. What matters to me is the day to day class work. Teachers should not be penalized for test scores as I am sure there are many other students with undiagnosed learning disabilities among other things.

  • NUFF SAID

    It would help if you published the entire AYP analysis report from U. of Wisconsin for the NYC DOE—it is truly mind-boggling     

  • Michael Fiorillo

    Gotham Schools is to be commended for not publishing the data reports, but Elizabeth Green’s article makes a broad value judgement on the subject that her own reporting immediately refutes.

    How is it remotely possible that Value Added Measures “… appear to do the job that no measure of a teacher’s quality has done before: they estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability,” when two paragraphs later she refers to Sean Corcoran’s research showing their extreme unreliability? Am I missing something here?

    Leaving aside for a moment the very revealing and questionable assumptions embedded in the term Value Added – that children are “products” to be “enhanced” before their “sale” to a “customer” (www.investorwords.com/5210/value_added.html) – only in the Through the Looking Glass world of corporate ed reform could numbers like these be taken seriously. The proper response from the UFT or any honest observer would be to laugh in the face of someone who proposed them.

    Michael Bloomberg, Bill Gates, Eli Broad and all the rest would never make decisions for their own companies based on such junk data, yet teacher’s reputations and livelihoods are to be dependent on this politically and economically-driven pseudo science.

    It would take a satirist on the level of Swift or Twain to do full justice to the venality and absurdity of this entire charade.

  • Frogmugsy

    Teachers are held accountable. Come visit us during Parent Teacher conferences. Ask all the questions you want about us and your child. A lot of teachers are more than happy to meet their child’s parent and explain exactly what goes on in the classroom.

  • NUFF SAID

    schools.nyc.gov/NR/…/TDINYCTechnicalReportFinal072010.pdf
    complete AYP analysis–good luck!!

  • allan w.

    what is surprising is that teachers’ importance is always referenced in the negative, eg.  let’s release deeply flawed reports; let’s have evaluations which rely too  heavily on test scores; and when the mayor closes schools let’s make sure the teachers get the lion’s share of the blame.  

  • Noryeln

    To stand with is not to stand against.  It would be a shame if that were your reasoning for every person who stood with someone for a principle.

  • http://perdidostreetschool.blogspot.com/ reality-based educator

    Why is anybody thanking Gotham Schools for not publishing the names of the teachers attached to the data when they are still going to publish the data which, in Elizabeth Green’s words “… appear to do the job that no measure of a teacher’s quality has done before: they estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability,” when the TDR’s and the VAM they are based upon do no such thing and Green even acknowledges such DIRECTLY afterwards?

    Sorry, Gotham Schools is trying to float above the tabloids and media outlets that will name names on some cloud of moral sanctimony over this issue of naming names, but the fact is, they FOILed the TDR’s, they plan to publish the TDR’s and every reason they give for why they are not going to attach names to the TDR articles they post can be used as arguments for why they shouldn’t post any TDR articles at all.

  • Michael M. (parent still)

    If my kid has a fever, I’d like to know his or her temperature. 
    That doesn’t at all mean I would use a broken thermometer for lack of a working one.
    Nor would I fire the doctor should the broken thermometer give an abnormal reading.

  • bee

    The problem, Natalia, is that TDR doesn’t accurately tell the “public” what it is getting for its money. It is unreasonable to judge teachers based on flawed information, it’s even more unreasonable to publish this pack of lies. Don’t forget, Natalia, teachers are part of the public. Life isn’t black and white, there are many shades of grey. Unfortunately, many people fail to see the finer nuances involved in these issues and are quick to rush to judgement without being fully informed.

  • Flerplunk

    Would you like to know if the doctor’s thermometer’s broken?

  • http://perdidostreetschool.blogspot.com/ reality-based educator

    Yoav Gonen, ed reporter at Murdoch’s Post – TDR’s have maximum MOE of 75% for math rankings, 87% for ELA rankings.

    http://twitter.com/#!/yoavgonen/status/173080732535758849

    That would not be a measurement that, as Elizabeth Green wrote “… appear(s) to do the job that no measure of a teacher’s quality has done before: they estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability,”

    Unless Ms. Green believes MOE’s of 75% and 87% are, you know, accurate and reliable.

  • Vote NO!

    Do you really believe publicly ridiculing your child’s teacher using faulty data is going to result in your child getting “better” instruction? Do you REALLY believe that?

  • Pogue

    People should  be wary of ineffective reporting that have high Margin of Error percentages, too.

  • kt

     This is not an evaluation.  An evaluation is done by someone who is familiar with the context of the work being done and is knowledgeable about the different aspects of the job.  An evaluation considers both quantitative and qualitative outcomes.  This is just a bit of inaccurate data that is being published out of context because it’s easier or more controversial or juicier to print lists of data than it is to try to really report on schools.

    The public is welcome to come sit in my classroom at any time — I have parents visit my class frequently and I suppose if a member of the general public wanted to make an appointment and visit, they’d be welcome too.  “The public” also delegates my evaluation as a teacher to people who are trained to do so — my principal and APs (I know not every teacher is lucky enough to work under good leadership…related issue!).  I pay taxes to pay the salaries of other teachers as well as fire, police, soldiers, road construction crews, senators, senators’ staff members, people who work for the DMV, and so on.  With the exception of senators, who I can vote for, I don’t directly supervise any of them.  I am more than happy to let the chief of police evaluate his/her own officers, and if I have a PARTICULAR problem with a PARTICULAR officer’s work or behavior, I’ll speak up then.  But I’m not demanding to see individual officers’ crime stats, because I have other things to do – namely, my own job.  None of us is able to have the time and the expertise to do everyone else’s jobs for them.  I care a lot about crime.  I may get aggravated, anxious, and upset when it seems that my neighborhood is having more muggings than usual.  But that doesn’t change the fact that I am not a police officer and don’t really have any clue about how the police could improve their work.  So my frustration at outcomes doesn’t magically make me an expert who could do the job better.

    So, yes, Natalia, teachers should, and do, get evaluated.  What I object to (and this is more academic for me, since I teach in NJ and as yet am not subject to my name being printed) is PARTS OF but not the WHOLE picture of my evaluation being published in the newspaper, out of context.  If you want to take the time to gain expertise in assessing educators, and you want to visit my classroom, study my data (all the data), read my students’ portfolios, talk to my students and their parents, THEN I would love to hear your thoughts on my job performance.  But then, if you did all of that, you would be my principal and couldn’t do your own job. 

  • Guest

    Gotham Schools please reconsider your decision.  The results expose the truth!   Teaching is about more than standardized testing.  http://www.ny1.com/content/top_stories/156599/now-available–nyc-teacher-performance-data-released-friday#ny1reports
    The same teacher can be rated Above Average and Below Average at the same time.

  • http://twitter.com/leoniehaimson leonie haimson

     link doesn’t work; is it down now or can you repost?

  • NUFF SAID

    twitter your email and i’ll send link
    schools.nyc.gov/NR/…/TDINYCTechnicalReportFinal072010.pdf

  • NUFF SAID

    sent link to your twitter

  • Flerplunk
  • http://profile.yahoo.com/2LMNX4XXUJTFV2HW2PYBGPBAVY Ro

    Since when is the teacher solely responsible for student success? Are they with the student at home after school or on the weekends, summers and holidays? Are teachers providing the necessities of life for students and the support they need for emotional growth and self-esteem? Are teachers, in the 45 or 90 minutes a day, 5-days/week able to fully influence the life that students experience in order to be successful?

  • http://www.gothamschools.org Elizabeth Green

    We agree with you that not connecting value-added ratings to teachers’ identities is not as simple as wiping out names, and we’re working on figuring out what to do.

  • Dianne

    I agree with Ro.  It takes a village to educate the whole child.  This issue is not about rating teachers, it is about politicians taking over education.  This is not about improving the quality of education or allowing us to better compete with other nations, it is about political grandstanding by people who have little background in education.  Parents, educators and everyone concerned about the future of this nation must contact their local representatives and let their opinions be known.  Let’s focus on improving education not blaming one group for all the ills of society.
