<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: New York vs. New York</title>
	<atom:link href="http://gothamschools.org/2009/03/17/new-york-vs-new-york/feed/" rel="self" type="application/rss+xml" />
	<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/</link>
	<description></description>
	<lastBuildDate>Thu, 09 Feb 2012 22:54:00 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-76072</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Mon, 23 Mar 2009 03:57:30 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-76072</guid>
		<description>Oop.  &quot;Pyrrhic.&quot;  Darn public school education.  ;-)</description>
		<content:encoded><![CDATA[<p>Oop.  &#8220;Pyrrhic.&#8221;  Darn public school education.  <img src='http://gothamschools.org/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-76064</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Mon, 23 Mar 2009 03:54:32 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-76064</guid>
		<description>Re Deputy Chancellor Cerf&#039;s (per Aaron Pallas), &quot;...a pay-for-performance system for all 40 community and high school superintendents...&quot;
Please see CECD2&#039;s Policy Analysis on the role of the Superintendent (see link).

Excerpt: &quot;As is the case in all districts throughout the Department of Education, our Superintendent has been deployed to schools outside of her home district -- District 2 -- to serve as Senior Achievement Facilitator (“SAF”).  Upwards of 90% of the Superintendent’s time is spent on her SAF duties, coaching inquiry teams outside of our district on how to use data-driven instruction to improve instruction, student achievement, and test scores, leaving little time to fulfill her duties in our district.&quot;

So I ask: on what performance is any superintendent&#039;s pay based -- software trainer?

The Chancellor has long tried to abolish District Superintendents.  I believe this has been challenged in court.  As I understand, the plaintiffs won.  But it was a Pyrhhic victory -- Klein has simply assigned them other duties 90% of the time, and effectively replaced them with School Support Organizations that he forces the Principals to pay for.</description>
		<content:encoded><![CDATA[<p>Re Deputy Chancellor Cerf&#8217;s (per Aaron Pallas), &#8220;&#8230;a pay-for-performance system for all 40 community and high school superintendents&#8230;&#8221;<br />
Please see CECD2&#8217;s Policy Analysis on the role of the Superintendent (see link).</p>
<p>Excerpt: &#8220;As is the case in all districts throughout the Department of Education, our Superintendent has been deployed to schools outside of her home district &#8212; District 2 &#8212; to serve as Senior Achievement Facilitator (“SAF”).  Upwards of 90% of the Superintendent’s time is spent on her SAF duties, coaching inquiry teams outside of our district on how to use data-driven instruction to improve instruction, student achievement, and test scores, leaving little time to fulfill her duties in our district.&#8221;</p>
<p>So I ask: on what performance is any superintendent&#8217;s pay based &#8212; software trainer?</p>
<p>The Chancellor has long tried to abolish District Superintendents.  I believe this has been challenged in court.  As I understand, the plaintiffs won.  But it was a Pyrhhic victory &#8212; Klein has simply assigned them other duties 90% of the time, and effectively replaced them with School Support Organizations that he forces the Principals to pay for.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: leonie haimson</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-76055</link>
		<dc:creator>leonie haimson</dc:creator>
		<pubDate>Mon, 23 Mar 2009 03:49:36 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-76055</guid>
		<description>If it&#039;s indeed true that the 4th and 8th grade NY state tests changed substantially in 2006, this is yet another reason to rely on the NAEPs to give us a more reliable evaluation of progress, since the NAEPs have been stable over time.  And we know what the NAEPs show in terms of NYC&#039;s lack of progress since 2003.</description>
		<content:encoded><![CDATA[<p>If it&#8217;s indeed true that the 4th and 8th grade NY state tests changed substantially in 2006, this is yet another reason to rely on the NAEPs to give us a more reliable evaluation of progress, since the NAEPs have been stable over time.  And we know what the NAEPs show in terms of NYC&#8217;s lack of progress since 2003.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron Pallas</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-75861</link>
		<dc:creator>Aaron Pallas</dc:creator>
		<pubDate>Mon, 23 Mar 2009 01:57:35 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-75861</guid>
		<description>I had hoped to pull together some new analyses before responding to David Cantor, but it’s taking a bit longer than I expected, and, not surprisingly, blogs abhor a vacuum.  So here’s an interim response.

Cantor is correct that the changes that were introduced into the state’s 4th grade and 8th grade tests in 2006 create a discontinuity in the tests administered before and after 2006.  Until 2006, the 4th grade tests included some content that was at the 2nd and 3rd grade levels.  With the introduction of state tests in grades 3, 5, 6 and 7 in 2006, the 4th grade test became more difficult.  Analyses indicate that the 8th grade test did not become more difficult in 2006.  It’s true that there was no direct equating of the 2006 and 2005 assessments, but the scores were linked through an equipercentile linking procedure.  Because the means and standard deviations of the 2005 tests and the 2006 tests on the 2005 scale were virtually identical, I don’t think that the discontinuity has distorted the NYC-NYS comparisons.  I’m more worried about variability from year to year in the standard deviation of the scale scores, which I suspect stems from the use of number-right scoring to derive the scale scores.  But I’ll concede the point, and will seek to address it in my next analysis of these data.

As for the separation of New York City from New York State, Mr. Cantor should be careful what he wishes for.  Since New York City scores are lower than New York State scores overall (for every grade level, at every time point), a little bit of algebra will show that the New York State scores with NYC removed will be even higher than the NYC scores at every time point.  And a little bit more algebra will show that, unless the fraction of New York State 4th and 8th graders from NYC has changed substantially over time, this will result in a smaller shrinkage over time in the gap between NYC and the rest of the state.  I’ll be addressing this too.

As for the claim that my analysis using 2003 as a baseline is premised on a “non-statistical, political judgment,” let me clarify my reasoning.  This choice is certainly non-statistical—I don’t know what a statistical judgment about a baseline for an impact assessment would consist of.  Is the judgment political?  I don’t think so, if by this Cantor means that my political values are dictating this analytic choice.  Rather, I contend that the choice is rooted in my professional judgment as someone who has been studying the impact of school reform policies and practices for more than 20 years.  Over this period, I have come to believe that understanding the impact of reform initiatives depends heavily on two things:  (a) knowledge about the timing and extent of the implementation of the reforms, and (b) a plausible theory of how the reforms might be expected to produce the observed outcomes.

Diane Ravitch and my intellectual compadre Socrates have already made arguments about the first of these above—whatever the reforms that Joel Klein brought to New York City, there was not enough time for them to be implemented effectively between his start and the mid-year administration of the standardized tests at issue here.  But I’d like to take things one step further, and examine the content of the 2002-03 reforms.  In a spirited debate with Sol Stern played out on the eduwonk blog last summer, Deputy Chancellor Chris Cerf described these reforms as follows:

 “The changes they instituted include creating the Office of School Safety; the appointment of a new senior staff; a pay-for-performance system for all 40 community and high school superintendents; and the appointment of two community superintendents in predominantly African-American communities who generated significantly improved results. And while I’m not sure how to parse the effect on student achievement, the Mayor and Chancellor introduced an idiom of high expectations and accountability to discussions of student learning.”

If there is existing evidence that reforms of this sort have been found to produce relatively immediate, sharp upswings in students’ performance on standardized tests, I would hope that David Cantor would point me to it.  In the absence of such evidence, I’ll stick with my professional judgment—not statistical, but also not political—that 2003 is an appropriate baseline for evaluating the impact of the school reform policies and practices introduced by Mike Bloomberg and Joel Klein.

A final comment for Mr. Cantor:  stating that an analysis is flawed is not the same as demonstrating that the conclusions of the analysis are incorrect.  Researchers make analytic choices all the time, and I can’t think of a piece of educational or social research that I would characterize as being without flaws.   What’s at issue is whether other analytic choices that a scholarly community deems to be as good or better lead to different conclusions.  The DOE response would be more powerful if it demonstrated that other ways of addressing issues such as the 2006 changes in the state tests or the comparison of NYC with the rest of New York State led to substantially different results.</description>
		<content:encoded><![CDATA[<p>I had hoped to pull together some new analyses before responding to David Cantor, but it’s taking a bit longer than I expected, and, not surprisingly, blogs abhor a vacuum.  So here’s an interim response.</p>
<p>Cantor is correct that the changes that were introduced into the state’s 4th grade and 8th grade tests in 2006 create a discontinuity in the tests administered before and after 2006.  Until 2006, the 4th grade tests included some content that was at the 2nd and 3rd grade levels.  With the introduction of state tests in grades 3, 5, 6 and 7 in 2006, the 4th grade test became more difficult.  Analyses indicate that the 8th grade test did not become more difficult in 2006.  It’s true that there was no direct equating of the 2006 and 2005 assessments, but the scores were linked through an equipercentile linking procedure.  Because the means and standard deviations of the 2005 tests and the 2006 tests on the 2005 scale were virtually identical, I don’t think that the discontinuity has distorted the NYC-NYS comparisons.  I’m more worried about variability from year to year in the standard deviation of the scale scores, which I suspect stems from the use of number-right scoring to derive the scale scores.  But I’ll concede the point, and will seek to address it in my next analysis of these data.</p>
<p>As for the separation of New York City from New York State, Mr. Cantor should be careful what he wishes for.  Since New York City scores are lower than New York State scores overall (for every grade level, at every time point), a little bit of algebra will show that the New York State scores with NYC removed will be even higher than the NYC scores at every time point.  And a little bit more algebra will show that, unless the fraction of New York State 4th and 8th graders from NYC has changed substantially over time, this will result in a smaller shrinkage over time in the gap between NYC and the rest of the state.  I’ll be addressing this too.</p>
<p>As for the claim that my analysis using 2003 as a baseline is premised on a “non-statistical, political judgment,” let me clarify my reasoning.  This choice is certainly non-statistical—I don’t know what a statistical judgment about a baseline for an impact assessment would consist of.  Is the judgment political?  I don’t think so, if by this Cantor means that my political values are dictating this analytic choice.  Rather, I contend that the choice is rooted in my professional judgment as someone who has been studying the impact of school reform policies and practices for more than 20 years.  Over this period, I have come to believe that understanding the impact of reform initiatives depends heavily on two things:  (a) knowledge about the timing and extent of the implementation of the reforms, and (b) a plausible theory of how the reforms might be expected to produce the observed outcomes.</p>
<p>Diane Ravitch and my intellectual compadre Socrates have already made arguments about the first of these above—whatever the reforms that Joel Klein brought to New York City, there was not enough time for them to be implemented effectively between his start and the mid-year administration of the standardized tests at issue here.  But I’d like to take things one step further, and examine the content of the 2002-03 reforms.  In a spirited debate with Sol Stern played out on the eduwonk blog last summer, Deputy Chancellor Chris Cerf described these reforms as follows:</p>
<p> “The changes they instituted include creating the Office of School Safety; the appointment of a new senior staff; a pay-for-performance system for all 40 community and high school superintendents; and the appointment of two community superintendents in predominantly African-American communities who generated significantly improved results. And while I’m not sure how to parse the effect on student achievement, the Mayor and Chancellor introduced an idiom of high expectations and accountability to discussions of student learning.”</p>
<p>If there is existing evidence that reforms of this sort have been found to produce relatively immediate, sharp upswings in students’ performance on standardized tests, I would hope that David Cantor would point me to it.  In the absence of such evidence, I’ll stick with my professional judgment—not statistical, but also not political—that 2003 is an appropriate baseline for evaluating the impact of the school reform policies and practices introduced by Mike Bloomberg and Joel Klein.</p>
<p>A final comment for Mr. Cantor:  stating that an analysis is flawed is not the same as demonstrating that the conclusions of the analysis are incorrect.  Researchers make analytic choices all the time, and I can’t think of a piece of educational or social research that I would characterize as being without flaws.   What’s at issue is whether other analytic choices that a scholarly community deems to be as good or better lead to different conclusions.  The DOE response would be more powerful if it demonstrated that other ways of addressing issues such as the 2006 changes in the state tests or the comparison of NYC with the rest of New York State led to substantially different results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-75680</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Sun, 22 Mar 2009 23:37:16 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-75680</guid>
		<description>Eduwonkette notes the ad hominem nature of DOE rebuttals.  And another commenter piles on.  Sheesh.</description>
		<content:encoded><![CDATA[<p>Eduwonkette notes the ad hominem nature of DOE rebuttals.  And another commenter piles on.  Sheesh.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-75085</link>
		<dc:creator>Greg</dc:creator>
		<pubDate>Sun, 22 Mar 2009 11:37:13 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-75085</guid>
		<description>Eduwonkette&#039;s position is fascinating: &quot;The burden is on the critic to pony up.&quot;  Really?  Then what do SkoolBoy and Eduwonkette propose we do to improve public schools in the city?  Go back to a system with a chancellor every 2 years and an unaccountable school board, 32 local sups, and 2% voting rates for local school boards?  How would we even start to measure that system!  The fact that we can actually look at the Chancellor&#039;s reforms (be they 2002 or 2003) as a longitudinal set of reforms is in and of itself a victory.

Seriously,  What do Aaron and Jennifer propose to improve the schools given limited financial and human talent resources?  I&#039;m sure their statistical analysis is excellent, but what has it shown them in terms of policy proposals that will make real change for real kids?  Again if &quot;the burden is on the critic to pony up,&quot; then what do you suggest we do instead of Children First?</description>
		<content:encoded><![CDATA[<p>Eduwonkette&#8217;s position is fascinating: &#8220;The burden is on the critic to pony up.&#8221;  Really?  Then what do SkoolBoy and Eduwonkette propose we do to improve public schools in the city?  Go back to a system with a chancellor every 2 years and an unaccountable school board, 32 local sups, and 2% voting rates for local school boards?  How would we even start to measure that system!  The fact that we can actually look at the Chancellor&#8217;s reforms (be they 2002 or 2003) as a longitudinal set of reforms is in and of itself a victory.</p>
<p>Seriously,  What do Aaron and Jennifer propose to improve the schools given limited financial and human talent resources?  I&#8217;m sure their statistical analysis is excellent, but what has it shown them in terms of policy proposals that will make real change for real kids?  Again if &#8220;the burden is on the critic to pony up,&#8221; then what do you suggest we do instead of Children First?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: eduwonkette</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-74585</link>
		<dc:creator>eduwonkette</dc:creator>
		<pubDate>Sat, 21 Mar 2009 16:22:56 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-74585</guid>
		<description>Hey Socrates, I really have to disagree with you on this scale score point, as most psychometricians and testing researchers advocate for using scale scores rather than proficiency ratings  to assess educational progress (see, for example, Andrew Ho, Dan Koretz, Bob Linn, Paul Barton at ETS - Andrew Ho has a great article that clearly lays out the problem with proficiency rates - if you email me, I&#039;ll send it to you.) 

Also, remember that this is a comparison of NYC with NY State&#039;s scale scores. If the test cannot capture different levels of skill well above the cut score for a 3, then one would think that this analysis would favor NYC as there are many parts of the state that are more likely to have students close to the ceiling. And furthermore, the NAEP *is* designed to capture a wide range of achievement and has always been tracked longitudinally using scale scores; when you compare NYC with the state or even other cities in the NAEP Trial Urban District Assessment as Cantor would prefer, there are cities that have made substantial progress since 2003 while NYC has been stagnant. 

In general, what I find frustrating about the way that the DOE responds to criticism is that they argue &quot;the analysis is flawed&quot; without ever presenting analyses that demonstrate that the results would be different if, for example, you compared NYC with NY state exclusive of NYC, or if you used standardized scores (z scores) instead of continuous scale scores. The burden is on the critic to pony up, but consistently DOE has taken the intellectually lazy position of personally attacking critics.</description>
		<content:encoded><![CDATA[<p>Hey Socrates, I really have to disagree with you on this scale score point, as most psychometricians and testing researchers advocate for using scale scores rather than proficiency ratings  to assess educational progress (see, for example, Andrew Ho, Dan Koretz, Bob Linn, Paul Barton at ETS &#8211; Andrew Ho has a great article that clearly lays out the problem with proficiency rates &#8211; if you email me, I&#8217;ll send it to you.) </p>
<p>Also, remember that this is a comparison of NYC with NY State&#8217;s scale scores. If the test cannot capture different levels of skill well above the cut score for a 3, then one would think that this analysis would favor NYC as there are many parts of the state that are more likely to have students close to the ceiling. And furthermore, the NAEP *is* designed to capture a wide range of achievement and has always been tracked longitudinally using scale scores; when you compare NYC with the state or even other cities in the NAEP Trial Urban District Assessment as Cantor would prefer, there are cities that have made substantial progress since 2003 while NYC has been stagnant. </p>
<p>In general, what I find frustrating about the way that the DOE responds to criticism is that they argue &#8220;the analysis is flawed&#8221; without ever presenting analyses that demonstrate that the results would be different if, for example, you compared NYC with NY state exclusive of NYC, or if you used standardized scores (z scores) instead of continuous scale scores. The burden is on the critic to pony up, but consistently DOE has taken the intellectually lazy position of personally attacking critics.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Socrates</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-74575</link>
		<dc:creator>Socrates</dc:creator>
		<pubDate>Sat, 21 Mar 2009 15:46:14 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-74575</guid>
		<description>Cantor appears to be correct on the scale score issue, though eduwonkette is right to point out a potential inconsistency.  Pallas&#039; skill in interpreting data has always been disappointing to me, precisely because his errors always skew in support of his ideological bias.  As Cantor asserts, skoolboy seems to practice advocacy more than analysis.

Cantor, however, doesn&#039;t do himself any favors with this &quot;the chancellor took over in 2002&quot; nonsense.  Ravitch&#039;s quotes prove that the chancellor (or his people, at the least) agreed at some point with what common sense tells the rest of us:  A chancellor can&#039;t implement reforms at the end of the &#039;02-03 school year and then claim credit for the scores during that year.  If scores were so sensitive to a chancellor&#039;s reforms (in a system with 1 million+ students, no less) that in just a matter of months (or less) the scores dropped that dramatically, we should have seen even more dramatic jumps in each subsequent year, when the reforms had really started to sink in.

I actually like most of Klein&#039;s reforms, and I think they will pay off once they&#039;ve really started to sink in.  It&#039;s understandable that in our impatient political climate the folks at Tweed would feel the need to prove their reforms work, but it&#039;s absurd to think that big systemic changes like the ones they&#039;ve made would turn around a city&#039;s schools in 8 years, let alone half a year.  Cantor does his boss no favors by clinging to the absurd 2002 starting point.</description>
		<content:encoded><![CDATA[<p>Cantor appears to be correct on the scale score issue, though eduwonkette is right to point out a potential inconsistency.  Pallas&#8217; skill in interpreting data has always been disappointing to me, precisely because his errors always skew in support of his ideological bias.  As Cantor asserts, skoolboy seems to practice advocacy more than analysis.</p>
<p>Cantor, however, doesn&#8217;t do himself any favors with this &#8220;the chancellor took over in 2002&#8221; nonsense.  Ravitch&#8217;s quotes prove that the chancellor (or his people, at the least) agreed at some point with what common sense tells the rest of us:  A chancellor can&#8217;t implement reforms at the end of the &#8217;02-03 school year and then claim credit for the scores during that year.  If scores were so sensitive to a chancellor&#8217;s reforms (in a system with 1 million+ students, no less) that in just a matter of months (or less) the scores dropped that dramatically, we should have seen even more dramatic jumps in each subsequent year, when the reforms had really started to sink in.</p>
<p>I actually like most of Klein&#8217;s reforms, and I think they will pay off once they&#8217;ve really started to sink in.  It&#8217;s understandable that in our impatient political climate the folks at Tweed would feel the need to prove their reforms work, but it&#8217;s absurd to think that big systemic changes like the ones they&#8217;ve made would turn around a city&#8217;s schools in 8 years, let alone half a year.  Cantor does his boss no favors by clinging to the absurd 2002 starting point.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: eduwonkette</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-74565</link>
		<dc:creator>eduwonkette</dc:creator>
		<pubDate>Sat, 21 Mar 2009 15:02:50 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-74565</guid>
		<description>David, 
I don&#039;t understand how your critique of scale scores can square with DOE&#039;s own progress report system, which chops up scale scores into faux proficiency units (i.e. 3.1, 3.2, 3.3, etc). DOE has argued many times that these distinctions - which are derived from scale scores, including scale scores above the cut score for proficient - are meaningful, even going so far as to say that a student who scored a &quot;3.3&quot; last year and a &quot;3.1&quot; this year did not make a full year of progress. How can scale scores be useful for the progress reports and not useful for comparing NYC&#039;s progress with the state?</description>
		<content:encoded><![CDATA[<p>David,<br />
I don&#8217;t understand how your critique of scale scores can square with DOE&#8217;s own progress report system, which chops up scale scores into faux proficiency units (i.e. 3.1, 3.2, 3.3, etc). DOE has argued many times that these distinctions &#8211; which are derived from scale scores, including scale scores above the cut score for proficient &#8211; are meaningful, even going so far as to say that a student who scored a &#8220;3.3&#8221; last year and a &#8220;3.1&#8221; this year did not make a full year of progress. How can scale scores be useful for the progress reports and not useful for comparing NYC&#8217;s progress with the state?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-74544</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Sat, 21 Mar 2009 13:27:45 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-74544</guid>
		<description>First, what a terrific blog, and terrific string!  Thank you all.

In layman, non-wonk, terms, I am struck by the following:

1) The population of NYC may be 8M, and the population of NYS may be 19M.  But if improvement in NYC pushes the carrot farther out the state stick, for the reading gap to stay relatively constant in recent years implies Klein is doing no better than the average Chancellor-equivalent elsewhere, and implicitly, Klein is more than &quot;gaining&quot; on his peers in math. 

2) I get the 2002 vs. 2003 topic, but what of 2004 to present?  There is no apparent closing of the reading gap, and in math it&#039;s the same improvement &quot;slope&quot; recently as in the pre-Klein past.  (I won&#039;t belabor all the changes in DOE since 2004, but they simply don&#039;t show an impact in the trends.)

3)   Peer review is a healthy part of not only science, but public policy too.  I call on DOE to open its books in full to such outside independent review, such as that &quot;advocated&quot; on an albeit different topic by Comptroller Thompson in his May 2008 report on DOE growth planning, or the lack thereof (See link to “Growing Pains”) .  Until then, it&#039;s hard to see the statements of DOE itself as anything OTHER than &quot;advocacy.&quot;

4) Like most parents, I just want the straight poop.  Personally, I can no longer trust DOE to issue anything but self-congratulatory reviews.  I am an advocate for the kids and have no vested interest in any &quot;side&quot; but theirs.</description>
		<content:encoded><![CDATA[<p>First, what a terrific blog, and terrific string!  Thank you all.</p>
<p>In layman, non-wonk, terms, I am struck by the following:</p>
<p>1) The population of NYC may be 8M, and the population of NYS may be 19M.  But if improvement in NYC pushes the carrot farther out the state stick, for the reading gap to stay relatively constant in recent years implies Klein is doing no better than the average Chancellor-equivalent elsewhere, and implicitly, Klein is more than &#8220;gaining&#8221; on his peers in math. </p>
<p>2) I get the 2002 vs. 2003 topic, but what of 2004 to present?  There is no apparent closing of the reading gap, and in math it&#8217;s the same improvement &#8220;slope&#8221; recently as in the pre-Klein past.  (I won&#8217;t belabor all the changes in DOE since 2004, but they simply don&#8217;t show an impact in the trends.)</p>
<p>3)   Peer review is a healthy part of not only science, but public policy too.  I call on DOE to open its books in full to such outside independent review, such as that &#8220;advocated&#8221; on an albeit different topic by Comptroller Thompson in his May 2008 report on DOE growth planning, or the lack thereof (See link to “Growing Pains”) .  Until then, it&#8217;s hard to see the statements of DOE itself as anything OTHER than &#8220;advocacy.&#8221;</p>
<p>4) Like most parents, I just want the straight poop.  Personally, I can no longer trust DOE to issue anything but self-congratulatory reviews.  I am an advocate for the kids and have no vested interest in any &#8220;side&#8221; but theirs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Diane Ravitch</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-74529</link>
		<dc:creator>Diane Ravitch</dc:creator>
		<pubDate>Sat, 21 Mar 2009 12:05:31 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-74529</guid>
		<description>David Cantor is wrong to claim that 2002 is the correct baseline for analyzing test score trends. The DOE would like to use 2002 as the baseline because the test scores in the spring of 2003 showed a dramatic increase. But these increases registered in spring 2003 occurred before the introduction of any of the Children First reforms. Consider the timeline: The Mayor announces the Children First reforms in January 2003, as the children are taking their tests. The reforms are implemented in September 2003. And now Cantor would like to claim credit for the big increases that occurred before any of the reforms were introduced! Here is a quote from the New York Times (May 21, 2003) describing the response of DOE leaders to the &quot;sharp gains&quot; on the ELA test: &quot;City officials, who might otherwise have been jubilant about yesterday&#039;s results, offered a muted reaction, saying that the gains were not broad enough and that the school system as a whole was still failing at least half the city&#039;s children.&quot; And here from the Times&#039; story of October 22, 2003 about the math gains: &quot;Fourth graders across the state made stunning gains in their math scores last spring, with even sharper increases in New York City...In the city, news of the gains...elicited cheers among teachers and principals. But not everyone greeted the news so enthusiastically. The suggestion that city schools are on the upswing put Chancellor Joel I. Klein, who is overhauling them, in a tricky position. While the chancellor&#039;s critics pounced upon the higher scores as evidence that the school system did not need such an overhaul, some of his allies acknowledged that he would now be under even more pressure to show gains next spring. Mr. 
Klein&#039;s reaction to the good news was muted, as it was to news of higher reading scores in the spring.&quot; It is interesting that the gains of 2003, preceding the launch of the Children First reforms, were the only ones to be confirmed by the NAEP tests, as NAEP showed achievement in NYC to be flat from 2003 to 2007. No wonder David Cantor and Chancellor Klein are now trying to claim credit for the scores recorded before their reforms were implemented.</description>
		<content:encoded><![CDATA[<p>David Cantor is wrong to claim that 2002 is the correct baseline for analyzing test score trends. The DOE would like to use 2002 as the baseline because the test scores in the spring of 2003 showed a dramatic increase. But these increases registered in spring 2003 occurred before the introduction of any of the Children First reforms. Consider the timeline: The Mayor announces the Children First reforms in January 2003, as the children are taking their tests. The reforms are implemented in September 2003. And now Cantor would like to claim credit for the big increases that occurred before any of the reforms were introduced! Here is a quote from the New York Times (May 21, 2003) describing the response of DOE leaders to the &#8220;sharp gains&#8221; on the ELA test: &#8220;City officials, who might otherwise have been jubilant about yesterday&#8217;s results, offered a muted reaction, saying that the gains were not broad enough and that the school system as a whole was still failing at least half the city&#8217;s children.&#8221; And here from the Times&#8217; story of October 22, 2003 about the math gains: &#8220;Fourth graders across the state made stunning gains in their math scores last spring, with even sharper increases in New York City&#8230;In the city, news of the gains&#8230;elicited cheers among teachers and principals. But not everyone greeted the news so enthusiastically. The suggestion that city schools are on the upswing put Chancellor Joel I. Klein, who is overhauling them, in a tricky position. While the chancellor&#8217;s critics pounced upon the higher scores as evidence that the school system did not need such an overhaul, some of his allies acknowledged that he would now be under even more pressure to show gains next spring. Mr. 
Klein&#8217;s reaction to the good news was muted, as it was to news of higher reading scores in the spring.&#8221; It is interesting that the gains of 2003, preceding the launch of the Children First reforms, were the only ones to be confirmed by the NAEP tests, as NAEP showed achievement in NYC to be flat from 2003 to 2007. No wonder David Cantor and Chancellor Klein are now trying to claim credit for the scores recorded before their reforms were implemented.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Cantor, NYC Department of Education Press Secretary</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-74001</link>
		<dc:creator>David Cantor, NYC Department of Education Press Secretary</dc:creator>
		<pubDate>Fri, 20 Mar 2009 21:57:12 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-74001</guid>
		<description>Aaron Pallas’s analysis is flawed.
 
&lt;strong&gt;Disaggregating NYC from Rest of State&lt;/strong&gt;
Pallas compares average New York City scale scores with average New York State scale scores instead of comparing NYC scores to the rest of NYS scores (NYS minus NYC).  Given that New York City is such a large percentage of the state population, state averages can suppress real differences in gains between NYC and other areas of the state.  DOE’s analysis compared NYC to the rest of the state, which is why Pallas’s analysis did not match Andy Jacob’s reports.  NAEP data disaggregating NYC from the rest of NYS are not available, which is why NYC compares its performance to other Large Cities.
 
&lt;strong&gt;Proficiency levels vs. scale scores&lt;/strong&gt;
Beginning in 2006 the New York State Education Department expanded the ELA and mathematics testing programs to all grades 3-8.  At this time the state also re-scaled the grade 4 and 8 tests, changing the scale scores and their corresponding ranges.  For example, as a result of these changes, the interval between a score of 650 and 660 in 2003 is not equal to the interval between a score of 650 and 660 in 2008. These changes make it necessary to use standardized scores (z-scores) in order to make scale score comparisons prior to and after 2006.

NCLB and state accountability systems define assessment performance by proficiency levels.  Therefore, NYS state assessments are designed to determine whether students are meeting proficiency.  Test items are selected for this purpose.  In particular, the state tests include many more items that assess whether students are meeting standards than whether they are exceeding them.  (For example, in most grade levels, a student would only have to miss one or two questions to be considered a Level 3.  For more information, please refer to the NYS Department of Education&#039;s assessment technical reports.)  For this reason, NYS and NYC report their achievement results in terms of the percentage of students at proficiency.  (Average scores are more appropriate for tests that are designed to measure achievement equally across all levels.)
 
&lt;strong&gt;Bias&lt;/strong&gt;
Pallas’s entire comparison, in using 2003 as opposed to 2002 as a baseline for measuring achievement trends, is premised on a non-statistical, political judgment. Obviously what happened before this administration influenced this administration’s results, just as we will influence those who follow us, but the usual measurement of an administration’s performance looks at what happened while the administration was in office: in this case, from September 2002 to today.  Setting a baseline that forecloses on the administration having any effect on performance during a year it ran the school system isn’t a function of serious analysis; it’s advocacy.</description>
		<content:encoded><![CDATA[<p>Aaron Pallas’s analysis is flawed.</p>
<p><strong>Disaggregating NYC from Rest of State</strong><br />
Pallas compares average New York City scale scores with average New York State scale scores instead of comparing NYC scores to the rest of NYS scores (NYS minus NYC).  Given that New York City is such a large percentage of the state population, state averages can suppress real differences in gains between NYC and other areas of the state.  DOE’s analysis compared NYC to the rest of the state, which is why Pallas’s analysis did not match Andy Jacob’s reports.  NAEP data disaggregating NYC from the rest of NYS are not available, which is why NYC compares its performance to other Large Cities.</p>
<p><strong>Proficiency levels vs. scale scores</strong><br />
Beginning in 2006 the New York State Education Department expanded the ELA and mathematics testing programs to all grades 3-8.  At this time the state also re-scaled the grade 4 and 8 tests, changing the scale scores and their corresponding ranges.  For example, as a result of these changes, the interval between a score of 650 and 660 in 2003 is not equal to the interval between a score of 650 and 660 in 2008. These changes make it necessary to use standardized scores (z-scores) in order to make scale score comparisons prior to and after 2006.</p>
<p>NCLB and state accountability systems define assessment performance by proficiency levels.  Therefore, NYS state assessments are designed to determine whether students are meeting proficiency.  Test items are selected for this purpose.  In particular, the state tests include many more items that assess whether students are meeting standards than whether they are exceeding them.  (For example, in most grade levels, a student would only have to miss one or two questions to be considered a Level 3.  For more information, please refer to the NYS Department of Education&#8217;s assessment technical reports.)  For this reason, NYS and NYC report their achievement results in terms of the percentage of students at proficiency.  (Average scores are more appropriate for tests that are designed to measure achievement equally across all levels.)</p>
<p><strong>Bias</strong><br />
Pallas’s entire comparison, in using 2003 as opposed to 2002 as a baseline for measuring achievement trends, is premised on a non-statistical, political judgment. Obviously what happened before this administration influenced this administration’s results, just as we will influence those who follow us, but the usual measurement of an administration’s performance looks at what happened while the administration was in office: in this case, from September 2002 to today.  Setting a baseline that forecloses on the administration having any effect on performance during a year it ran the school system isn’t a function of serious analysis; it’s advocacy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-71877</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Wed, 18 Mar 2009 01:35:14 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-71877</guid>
		<description>SB,

Thanks.  Will share with CECD2.

Adding to my comment above re CECD2 looking at PI as one metric by which to measure the Achievement Gap, we recognized that the formula itself for calculating PI has a bias:  it treats 3&#039;s and 4&#039;s the same, which, loosely put, for D2, is more likely to &quot;cap&quot; the performance of the Whites on ELA and Math, and Asians on Math than other groups on either measure.

Given such a &quot;3 is good enough&quot; formula, I assert the PI understates what may more precisely be called the Proficiency Gap.  It&#039;s more like an under-performance index.  (And personally, I am concerned it may result in classrooms that neglect 3&#039;s by putting the emphasis on raising a 1 to a 2, or a 2 to a 3.  Raising a 3 to a 4 simply does not boost PI.)

I recognize your Aug 08 Memo used scale scores (which I believe have no such high-end compression).  I now wonder how PI -- used by NYS, and by the city for NCLB reporting, and readily available for every school on its DOE website -- would compare to your analysis in any given year, as well as trend over time.
(Click my name for link to CECD2 Resolution on topic.)</description>
		<content:encoded><![CDATA[<p>SB,</p>
<p>Thanks.  Will share with CECD2.</p>
<p>Adding to my comment above re CECD2 looking at PI as one metric by which to measure the Achievement Gap, we recognized that the formula itself for calculating PI has a bias:  it treats 3&#8217;s and 4&#8217;s the same, which, loosely put, for D2, is more likely to &#8220;cap&#8221; the performance of the Whites on ELA and Math, and Asians on Math than other groups on either measure.</p>
<p>Given such a &#8220;3 is good enough&#8221; formula, I assert the PI understates what may more precisely be called the Proficiency Gap.  It&#8217;s more like an under-performance index.  (And personally, I am concerned it may result in classrooms that neglect 3&#8217;s by putting the emphasis on raising a 1 to a 2, or a 2 to a 3.  Raising a 3 to a 4 simply does not boost PI.)</p>
<p>I recognize your Aug 08 Memo used scale scores (which I believe have no such high-end compression).  I now wonder how PI &#8212; used by NYS, and by the city for NCLB reporting, and readily available for every school on its DOE website &#8212; would compare to your analysis in any given year, as well as trend over time.<br />
(Click my name for link to CECD2 Resolution on topic.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-71535</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Tue, 17 Mar 2009 17:27:19 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-71535</guid>
		<description>Rock on SB! Rock on Eduwonkette!

Time for a spinoff of SB’s September ‘08 classic: “Could a Monkey Do a Better Job of Predicting Which Schools Show Student Progress in English Skills than the New York City Department of Education?”

Hint: “Score: Monkey 6, DOE 0.”
(http COLON //blogs DOT edweek DOT org/edweek/eduwonkette/2008/09/could_a_monkey_do_a_better_job DOT html)</description>
		<content:encoded><![CDATA[<p>Rock on SB! Rock on Eduwonkette!</p>
<p>Time for a spinoff of SB’s September ‘08 classic: “Could a Monkey Do a Better Job of Predicting Which Schools Show Student Progress in English Skills than the New York City Department of Education?”</p>
<p>Hint: “Score: Monkey 6, DOE 0.”<br />
(http COLON //blogs DOT edweek DOT org/edweek/eduwonkette/2008/09/could_a_monkey_do_a_better_job DOT html)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: skoolboy</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-71530</link>
		<dc:creator>skoolboy</dc:creator>
		<pubDate>Tue, 17 Mar 2009 17:17:37 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-71530</guid>
		<description>Michael M., 

You can see my analysis of whether NYC was closing the achievement gap among racial/ethnic groups here:  http://www.nysun.com/files/pallasmemo.pdf</description>
		<content:encoded><![CDATA[<p>Michael M., </p>
<p>You can see my analysis of whether NYC was closing the achievement gap among racial/ethnic groups here:  <a href="http://www.nysun.com/files/pallasmemo.pdf" rel="nofollow">http://www.nysun.com/files/pallasmemo.pdf</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael M.</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-71503</link>
		<dc:creator>Michael M.</dc:creator>
		<pubDate>Tue, 17 Mar 2009 14:18:23 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-71503</guid>
		<description>And what about the in-town Achievement Gap between Whites and Asians; and Blacks, Hispanics, Special Ed, and ELL... W-I-T-H-I-N the city?

Widening or closing over similar periods? 


Note: CECD2 has disaggregated Performance Index (PI) data supporting the above groupings, for D2 only.  In D2, income as a factor appeared to be a wash.  See link for the CECD2 resolution on the general topic.</description>
		<content:encoded><![CDATA[<p>And what about the in-town Achievement Gap between Whites and Asians; and Blacks, Hispanics, Special Ed, and ELL&#8230; W-I-T-H-I-N the city?</p>
<p>Widening or closing over similar periods? </p>
<p>Note: CECD2 has disaggregated Performance Index (PI) data supporting the above groupings, for D2 only.  In D2, income as a factor appeared to be a wash.  See link for the CECD2 resolution on the general topic.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: eduwonkette</title>
		<link>http://gothamschools.org/2009/03/17/new-york-vs-new-york/comment-page-1/#comment-71498</link>
		<dc:creator>eduwonkette</dc:creator>
		<pubDate>Tue, 17 Mar 2009 13:51:54 +0000</pubDate>
		<guid isPermaLink="false">http://gothamschools.org/?p=11363#comment-71498</guid>
		<description>Great post, SB. It still amazes me how misleading proficiency rates can be, and what you have here is a stellar example of that principle. 

I am also surprised that, given the 57% gap closing for 4th grade math between state and NYC from 2003-2008, we don&#039;t see a similar closing on the NAEP. I can see plenty of reasons why they may not match up perfectly - but these are two *very* different stories, and it does make one concerned that these skills aren&#039;t transferring to other tests. 

And finally, an idea -  you might also do a similar comparison between NYC TUDA and the other cities. Klein often says that the NAEP is less reliable because it is a small sample, but it is a small sample across the board and the fact is that cities like Atlanta have been making substantial progress while we haven&#039;t.</description>
		<content:encoded><![CDATA[<p>Great post, SB. It still amazes me how misleading proficiency rates can be, and what you have here is a stellar example of that principle. </p>
<p>I am also surprised that, given the 57% gap closing for 4th grade math between state and NYC from 2003-2008, we don&#8217;t see a similar closing on the NAEP. I can see plenty of reasons why they may not match up perfectly &#8211; but these are two *very* different stories, and it does make one concerned that these skills aren&#8217;t transferring to other tests. </p>
<p>And finally, an idea &#8211;  you might also do a similar comparison between NYC TUDA and the other cities. Klein often says that the NAEP is less reliable because it is a small sample, but it is a small sample across the board and the fact is that cities like Atlanta have been making substantial progress while we haven&#8217;t.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

