GothamSchools — daily independent reporting on NYC public schools

Eye on Education

Reasonable Doubt

I’ve been relatively quiet in the ongoing debate about how best to evaluate teachers in New York City and across New York State. I’m not close to the negotiations and can claim no expertise on the political machinations outside of public view. At its heart, this seems to me a dispute over jurisdiction: Who has the legitimate authority to regulate the work of an occupation that seeks the status of a profession—but one that is in a labor-management relationship?

The laws of New York recognize the labor-management fault line, but they do little to guide a collective-bargaining process toward agreements in the many districts in which teacher-evaluation systems are contested. Each side brings a powerful public value to bear on the disagreement.

For the employers, it’s all about efficiency. It’s in the public interest, they argue, to recruit, retain and reward the best teachers, in order to maximize the collective achievement of students. A teacher-evaluation system that fails to identify those teachers who are effective, and those who are ineffective, can neither weed out consistent low-performers nor target those who might best benefit from intensive help. Rewarding high-performing teachers can, in the short run, help keep them in their classrooms, they claim, and, in the long run, can help expand the pool of talented individuals who enter the occupation.

For teachers, the key concern is fairness. Fairness is primarily a procedural issue: Teachers, and the unions that represent them, seek an evaluation process that is neither arbitrary nor capricious, relying on stable and valid criteria that they believe accurately characterize the quality of their work. In this view, an evaluation process is unfair to the extent that it can be manipulated by a building administrator or school district to yield a particular rating for a teacher’s performance. It is also unfair if random factors beyond a teacher’s control unduly influence the evaluation of his or her performance.

The values of efficiency and fairness collide head-on in New York’s Education Law §3012-c, passed as part of the state’s efforts to bolster its chances in the 2010 Race to the Top competition. The law requires annual professional performance reviews (APPRs) that sort teachers into four categories—“highly effective,” “effective,” “developing” and “ineffective”—based on multiple measures of effectiveness, including student growth on state and locally selected assessments and a teacher’s performance according to a teacher practice rubric.

The fundamental problem is that it’s hard to assess the efficiency or fairness of an evaluation system that doesn’t exist yet. There are too many unknowns to be able to judge, which is one of the arguments for piloting an evaluation system before bringing it to scale. The properties of the state tests that are to be used to assess teachers’ contributions to student learning are a moving target; the tests have been changing in recent years in response to concerns about their difficulty, predictability and coverage of state curricular standards. And in a couple of years, those standards and assessments will change, as New York and many other states phase in the Common Core standards and new assessments designed to measure mastery of them. The models to estimate a teacher’s position relative to other teachers in contributing to students’ test performance are imprecise at the level of the individual teacher, and different models yield different results for a given teacher. There’s been little to no discussion of how to incorporate this uncertainty into the single numerical score a teacher will receive.

The evaluation of teachers’ practices via classroom observations using New York State Education Department (NYSED)-approved rubrics, such as Charlotte Danielson’s Framework for Teaching or Robert Pianta’s Classroom Assessment Scoring System, is another unknown. There’s evidence that with proper training, observers can reliably rate teachers’ classroom practices, but the nature of the training is critical, and there is no evidence to date of New York City’s ability to prepare more than 1,500 principals, or the principals’ “designees,” to carry out multiple observations of many teachers, teaching many different school subjects, each year.

Amazingly, there is even uncertainty about whether the evaluations can or should be based solely on a teacher’s performance in a single year. The statute creating the new evaluation system in New York describes it as an “annual professional performance review.” But is this a professional performance review that occurs annually, or a review of annual professional performance—that is, a teacher’s performance in the most recent year? The guidance provided by the NYSED suggests that it has no idea. “For 2011-12, only one year of teacher or principal student growth percentile scores will factor into each educator’s evaluation,” the guidance states. “When more years of data are available, NYSED will consider whether each evaluation year should include more than one year of educator student growth results. Empirical and policy considerations will determine the decision.”

Well, that certainly clarifies matters. In other words, a “bad” year where a teacher is ranked relatively low compared to other teachers might reverberate, affecting his or her ranking in subsequent years. But a good observational rating in a given year seemingly will have no spillover effect into subsequent years. If, as has been true in Washington, D.C.’s IMPACT teacher-evaluation system, teachers generally score higher on observational ratings than on their value-added or growth-score rankings relative to other teachers, the carryover for value-added performance—but not observations of teachers’ professional practices—appears unfair. And in D.C., this evaluation system has resulted in the termination of hundreds of teachers based on one or two years of performance.

Teacher-evaluation systems have multiple purposes, which might include certifying teachers as competent or selecting some for particular forms of professional development to enhance their professional practice. For most of these purposes, it’s essential that those with a stake in the education system view these evaluation systems as legitimate—and the perceived efficiency and fairness of an evaluation system are central to such judgments. It’s not hard to see why a great many teachers, in New York City and across the state, have serious doubts about the fairness of New York State’s APPR process. And if future teachers do as well, the process could have the unintended consequence of reducing, rather than increasing, the pool of individuals willing to consider teaching as a vocation. This, coupled with the more than 1,300 principals across the state who have raised questions about the efficiency of the process, illuminates the challenges confronting the state as it seeks to implement the APPR system and avoid a scolding from U.S. Secretary of Education Arne Duncan.

William Blackstone, an 18th-century English legal scholar, wrote “better that ten guilty persons escape than that one innocent suffer.” Benjamin Franklin, one of the founders of our country, later upped the ante to 100 to one. The principle captures squarely the trade-off between the value of efficiency and the value of fairness. A legal system that lets the guilty go free is inefficient, as these offenders are free to continue to transgress against the common good. But to Franklin and others, that was still preferable to a legal system that did not provide adequate procedural protections for all, whether innocent or guilty, because such a system would be inconsistent with the principle of fairness so central to the American polity.

It’s important to note that Blackstone and Franklin were concerned with the workings of government; fairness in the private sector was not a central concern, and efficiency was taken for granted as a consequence of market forces. Civil servants, as agents and employees of the state, arguably are subject to a different set of rights and responsibilities than those working in the private sector, and teachers are one of the largest groups of such public servants. What’s an acceptable trade-off between efficiency and fairness in the mix of teachers’ rights and responsibilities? It’s a lot easier to speculate about percentages in the abstract than to confront the possibility that you, or someone close to you, might be out of a job because of an untested teacher-evaluation system that cuts corners on fairness.

This post also appears on Eye on Education, Aaron Pallas’s Hechinger Report blog.

  • http://www.facebook.com/people/Leonie-Haimson/1094324158 Leonie Haimson

    Where’s the evidence for this claim, which you deem “efficient”: “Rewarding high-performing teachers can, in the short run, help keep them in their classrooms, they claim, and, in the long run, can help expand the pool of talented individuals who enter the occupation.” I don’t see any research to back up the support for merit pay. And the fact that teachers themselves overwhelmingly reject the notion of merit doesn’t help.

  • Aaron Pallas

    Leonie,
    I’m stating that proponents of teacher evaluation systems make these kinds of arguments, not asserting that there’s evidence to support them. The fact that there are no definitive evaluations of the impact of such teacher evaluation systems on teacher recruitment and retention, teaching practice, or student learning is an argument against racing to implement them at full-scale.

  • loyal Times reader

    Hmm… could this mean that Sam Dillon was making it up when he said that “many experts” believe that lack of merit pay is the reason for high teacher turnover? Perhaps he was referring to experts (such as Joel Klein, Michelle Rhee, or himself) who don’t need evidence to support their opinions.

  • http://www.facebook.com/people/Leonie-Haimson/1094324158 Leonie Haimson

    Indeed, there is little evidence that the teacher evaluation system that is being proposed is either efficient or fair. Clearly its proponents might argue otherwise, but others might point out that it will take many hours of observation and probably many new tests, with uncertain results. Check out Carol Burris’ piece in the WaPost called “Forging ahead with nutty teacher evaluation plan” at http://goo.gl/3KwQU for the view of many principals on this point.

  • Gaetano

    The variable of the student does not seem to be discussed in this piece, nor, more importantly, in state law or contractual agreements. Can teachers working with different students and different groups of students be evaluated as if they were all working with the same thing, in this case students? Textbooks may read the same, and written tests may be written and graded the same, but even if two teachers could be identical, the end result would differ, because the individuals in the classes differ from each other in biology, aptitude, rearing, and home and value system. This is so within a school, between school years, and even more dramatically so across neighborhoods, communities and societies.

    In short, and in fact, there are far too many variables in and between students and student populations to make it possible to fairly or accurately evaluate teachers based upon student test scores. It is, and always will be, an occupation, profession, vocation whose effect on lives and the society from which those lives come cannot be addressed numerically or with a checklist. Only time and individuals can speak with authority on the quality of the teachers in their lives.

  • http://www.thewastedgenius.com/ bill greene

    The innumerable difficulties involved in designing a formal and mandated evaluation process may well suggest that the task is not worth pursuing. All attempts to establish standards, record them, publish reports, evaluate them, and act on them, will yield flawed results and will divert valuable time from actual teaching activities to wasted and perhaps harmful bureaucratic paper pushing.   

  • http://www.thewastedgenius.com/ bill greene

    When I served on a VocTec school committee, they needed a full-time assistant superintendent to document the misdeeds of one very poor teacher. No evaluations were needed. The administration knew this individual was a very poor teacher, but union rules prevented them from firing the guy. It took a couple of years and a documented and proven third offense to gain traction. Still, the union sued the Board. When I finished my term, the teacher was at home, collecting full pay, while the trials continued to suck up administrators’ time! Give the principals the power to hire and fire and the evaluation problem will disappear.

  • http://twitter.com/nycdoenuts nycdoenuts

    This is certainly a very compelling story. But the problem described within it cannot simply be attributed to one poor teacher. At some point, that teacher was a probationary teacher. He received good reviews from his administrators, enough that he was given tenure (the status a teacher must have before a removal proceeding has to go through the documentation requirements that you described).

    So the story also implies poor administrators. They must at some point be held accountable when reflecting on stories like this, because those same principals (who you assert should just be able to hire and fire someone at will) also gave this guy three years’ worth of good reviews and approved his tenure. That cannot be lost in scenarios like this.

    Now imagine a principal just like that, one who failed to make a proper determination about a probationary teacher to begin with. Do you A) give that principal support and the autonomy to identify and remove teachers like these? Or B) take all of that autonomy away and convert the entire system into a score of 1-100, with strange mathematical assumptions (like 10% of his teachers are going to be ineffective, period)?

    It seems as though your answer is the former. That’s great. Because if the entire population of building leaders is going to be infantilized to the extent that their job is just to check things off on a rubric, then how is anyone going to be qualified enough to decide what a good teacher is or isn’t in, say, five years? Or ten? Or 20?

  • MG

    That’s a very interesting historical nugget — the 10 versus 100 go free question.  

    I wonder…Do you believe these tradeoff values should also be applied to teacher evaluations of their students?  Particularly the decision on whether a kid passes for the year or not?  I.e., would you support overruling all teachers who hold kids back, because better to let 10 or 100 kids get socially promoted than to advance a student who was simply misjudged?  

  • Celia Oyler

    MG: there’s not a scrap of evidence (from 40 years of educational research) that grade retention results in better outcomes for students. Rather: it does not increase learning (by any measure) and it results in a higher drop-out rate and poorer post-secondary outcomes. 

  • http://www.thewastedgenius.com/ bill greene

    The “merit” motivation for all employees, teachers being no exception, should be simply that if your superiors think you are a valuable employee you might get a raise, or not be let go. Instead, teachers get double raises every year regardless of merit! An automatic “step” increase every year for their added year of “experience,” and a cost of living raise to match inflation. In the private sector there is not even one automatic increase.
