GothamSchools — daily independent reporting on NYC public schools

Eye on Education

The Flat Earth Society

Today’s New York Daily News published a bold editorial on the progress of New York City schoolchildren under the administration of Mayor Mike Bloomberg and Chancellor Joel Klein.  “You would be better off arguing that the world is flat, or that the sun revolves around the Earth, than to dispute that New York City kids are performing better and better in school,” writes the Daily News, crowing that there are “fresh and incontrovertible data” pointing to what the newspaper refers to as a “sea change” in New York City. 

They might have wanted to wait a day.

This morning, the U.S. Department of Education released the 2009 results of the National Assessment of Educational Progress (NAEP) in fourth-grade and eighth-grade mathematics for each state and for the nation overall.  Nationally, fourth-grade performance held steady from 2007 to 2009, and there was a slight but statistically significant increase over this period in eighth-grade math performance.  In New York State, the small decline in fourth grade and the small gain in eighth grade were not statistically significant, leading to the conclusion that there has been no change in the performance of New York students on the NAEP math assessment from 2007 to 2009.

This is a very different story than the one told by New York’s own assessment system, on which the Bloomberg and Klein administration has staked its claims about the great progress in student achievement.  The average scale score in fourth-grade mathematics increased from 680 in 2007 to 689 in 2009, a hefty 9 points;  the jump in eighth-grade scores was even more dramatic, as the average scale score rose from 657 in 2007 to 675 in 2009, a remarkable increase of 18 points.

To put these two sets of numbers in context, the chart below shows the gains in fourth-grade and eighth-grade math performance from 2007 to 2009 expressed in standard deviation units (i.e., the amount of variation among individual students in 2007).  According to NAEP, fourth-graders’ performance fell .07 standard deviations from 2007 to 2009, a difference that is not significantly different from zero.  In contrast, fourth-graders gained .23 standard deviations on the New York State assessment from 2007 to 2009.  Similarly, the NAEP results indicate that eighth-graders in New York gained .08 standard deviations from 2007 to 2009 in math performance, a difference that is not significantly different from zero, but they gained .47 standard deviations over this period on the New York State test.

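[Chart: gains in fourth-grade and eighth-grade math performance in New York, 2007 to 2009, in standard deviation units, NAEP versus the New York State assessment]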
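As a rough illustration of what "standard deviation units" means here, a gain is standardized by dividing the change in the average score by the 2007 standard deviation. The post does not state the 2007 standard deviations, so the sketch below backs out the values implied by the reported figures; the function names are hypothetical, and the implied standard deviations are only approximate because the reported gains are rounded.

```python
def standardized_gain(mean_before: float, mean_after: float, sd_before: float) -> float:
    """Change in the average score, expressed in baseline standard deviations."""
    return (mean_after - mean_before) / sd_before


def implied_sd(mean_before: float, mean_after: float, reported_gain_sd: float) -> float:
    """Baseline standard deviation implied by a reported standardized gain.

    Assumption: the reported gain was computed as (after - before) / sd.
    """
    return (mean_after - mean_before) / reported_gain_sd


# New York State test figures quoted in the post (2007 and 2009 average scale scores).
grade4_sd = implied_sd(680, 689, 0.23)  # 9-point gain reported as .23 SD
grade8_sd = implied_sd(657, 675, 0.47)  # 18-point gain reported as .47 SD

print(f"Implied 2007 SD, grade 4: {grade4_sd:.1f} scale-score points")
print(f"Implied 2007 SD, grade 8: {grade8_sd:.1f} scale-score points")
```

Both implied standard deviations come out at a little under 40 scale-score points, which is why the 18-point eighth-grade gain is roughly twice as large in standard deviation units as the 9-point fourth-grade gain.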

Another way of comparing the implications of the two different sets of test results is to think about where the average student in 2009 would have scored in 2007.  Based on these standard deviations, and assuming that the scores follow a bell-curve distribution, the New York State scores indicate that the average fourth-grader in 2009 scored at the 59th percentile of the 2007 fourth-grade distribution, which is a pretty big jump.  The increment for eighth-graders is even more striking:  the average eighth-grader in 2009 scored at the 68th percentile of the 2007 eighth-grade distribution, based on the New York State tests.  In contrast, the NAEP data indicate that the average New York fourth-grader in 2009 scored at the 47th percentile of the 2007 distribution of fourth-grade math performance in New York State, and the average eighth-grader in 2009 scored at the 53rd percentile of the 2007 eighth-grade distribution.
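The percentile figures above can be reproduced with a short sketch. It assumes, as the paragraph does, that scores follow a normal (bell-curve) distribution, so the 2009 average lands at the percentile given by the standard normal cumulative distribution function evaluated at the standardized gain. The dictionary and function names are illustrative only.

```python
from statistics import NormalDist

# Standardized gains from 2007 to 2009, in 2007 standard deviation units,
# as reported in the post.
gains = {
    ("New York State test", "grade 4"): 0.23,
    ("New York State test", "grade 8"): 0.47,
    ("NAEP", "grade 4"): -0.07,
    ("NAEP", "grade 8"): 0.08,
}


def percentile_in_2007(gain_sd: float) -> int:
    """Percentile of the 2007 distribution reached by the average 2009 student.

    Assumption: scores are normally distributed with the same spread in both years.
    """
    return round(NormalDist().cdf(gain_sd) * 100)


percentiles = {key: percentile_in_2007(gain) for key, gain in gains.items()}

for (test, grade), pct in percentiles.items():
    print(f"{test}, {grade}: percentile {pct} of the 2007 distribution")
```

Under the normality assumption this gives 59 and 68 for the state test and 47 and 53 for NAEP, matching the figures quoted above.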
 
How can we explain these differences?  There are lots of possible explanations, but most of them don’t hold up under close scrutiny.  The two tests are taken by similar populations of students under similar conditions, and the grade-level mathematics standards on which the two assessments are based do not differ dramatically.  The NAEP test is a low-stakes test, which might result in students not taking it seriously, but the statisticians who oversee the NAEP testing program look for patterns suggesting this, and find little evidence of it.  It’s extremely unlikely that there’s rampant cheating going on in the New York State testing system that could explain the differences. 

 It’s possible that the New York State tests have been getting easier over time.  I have yet to see definitive evidence ruling this out.  There also is strong suggestive evidence of “score inflation” in the New York State tests, because there are predictable patterns in the standards which appear on the state tests year after year, with some standards showing up repeatedly each year, and some standards having never been tested at all during the life of the testing program.  Schools and teachers can make use of these patterns, which also show up in the format of test questions covering particular standards, to focus their instruction on the subset of standards that crop up again and again.  Because the New York State tests never test some standards, we have no idea about whether students have mastered them.  In contrast, the design of the NAEP assessment allows for a much broader picture of mathematics performance, because so many more standards and test item formats are incorporated into the test.

Whatever the reason, the discrepancy between the NAEP trends and trends in the New York State test scores raises serious questions about what the New York tests are telling us about the academic performance of students in New York State.  The same, of course, goes for New York City.  We’ll see NAEP scores for New York City in a month or so, but it’s unlikely that they will yield a different story than what I’m describing here.
 
Is the Earth flat?  No.  But New York State test scores, and probably New York City scores, are.

  • Fred Smith

    Another phony argument the City has raised to discredit the inconvenient NAEP results is the one about sampling. Somehow NAEP results, which are based on well-designed samples of students, are maligned as incomplete or flawed compared to the NYS and NYC tests, which everyone supposedly takes.

    Just another irrelevant argument used as a smokescreen to defend NY results that have been juiced up by mind-numbing test preparation activities, financial incentives that reward higher test scores achieved by (read my lips) any means necessary, standards set lower making it easier to reach Level 3 in the last two years, the bending of test administration practices, the slanted overkill presentation of favorable data to the public and the shallow reporting of same in the press.

    Shameless. The destruction of another generation of kids by the numbers.

  • Michael M.

    I noticed the cover of the NAEP study PPT here on GS was stamped, “Embargoed until 10/14.”
    Mayoral debate was 10/13.

    Coinkidink?

  • Aaron Pallas

    Michael M,

    Yes, it is a coincidence. The release date was determined by the U.S. Department of Education, and NYC politics did not play a role. Similarly, it’s an unfortunate coincidence that we won’t have the New York City results for NAEP until about a month from now, well after the mayoral election.

  • Michael M.

    AP,
    Thanks.
    It was a fun conspiracy while it lasted.
    Then again…

  • QueensParent

    Ah yes, MM, everything’s a conspiracy to you, it seems. Aaron, you call the NAEP a “low stakes” test, but I don’t see it the way you describe it. In fact, you assign more credibility to it and its results than to the NY state assessments, which are administered and field tested much more often. Why is that? It’s as if just because it’s a federal test it has more credibility without much evidence in support of it. Just because something is “federal” does not make it more reliable and valid. Finally, you did not address the fact that NY’s tests and the Federal tests are testing different learning standards. One is not more valid than the other.

  • Citizen X

    Skoolboy: Thanks for the timely post. I am relying more and more on blogs these days to sort through the terrain of accountability arguments and data in this education free enterprise zone. Can you please explain how the criterion-referenced New York State achievement tests are designed vis-a-vis validity and reliability?

  • leonie haimson

    Good piece; thanks! I think the evidence suggests that many factors have led to test score inflation: the state tests have gotten easier over time, in that the more difficult questions have been removed; the questions have gotten more predictable; and the scoring has gotten easier (i.e., fewer correct answers are needed to get a specific scale score). Meanwhile, financial rewards and the threat of losing jobs have led to excessive test prep taking over our schools; and the protections against cheating have been mostly removed.

  • http://ednotesonline.blogspot.com/ norm

    Ah Queens Parent. Any words discrediting the BloomKlein administration calls you forth as the great defender of the pretenders. Where is the dog in this race?

    Why not give us some of the different learning standards between the state and the NAEP?

    This is an open and shut case validating the information received from many teachers about the ease of the tests. It’s not that these teachers have a dog in the race because the UFT always tries to jump on board when the DOE issues phony test results.

  • Michael M.

    QP et al,
    Next time, I’ll add “*kidding*.” I’m a fan of Occam’s Razor, NOT conspiracy theories.

    Those darn feds. They just don’t test what we’re teaching kids in NYS, let alone NYC.
    *I’m kidding.* But that’s effectively what QP is suggesting.

    Let’s rethink that carefully.

    If Bloomberg and Klein are such super educators, then why is all of NYS outpacing the NAEP?

    To maintain that Mayoral Control, or the Mayor, or the Chancellor, are doing anything super-duper unique, even within the state… NYC results would have to outpace even the NYS results.
    I’m looking forward to the NYC results.

    P.S. Bouncing back to the other GS story re Klein’s presentation…. how does DOE explain that one of their counties is stuck in last place over a number of years, while others show significant movement?

  • QueensParent

    Norm the real issue here I think is how the entire world for you revolves around opposing the Mayor. Liking Mayoral control is not the same thing as liking the Mayor but for you it is or at least it is in your posts. If one likes ending social promotion then they have to be a Bloomberg. I’m glad fortunately that real life isn’t that way. Try being less paranoid. Life’s too short.

  • http://www.sinksalive.blogspot.com KitchenSink

    Leonie, for once I agree with you!!

  • http://ednotesonline.blogspot.com/ norm

    The very fact that you think Bloomberg really ended social promotion as opposed to issuing a press release and installing some procedures is a sign that you buy and support (and attack others for being critical) of every policy he issues. The nitpicking of Aaron’s position and calling both Michael and me paranoid and conspiracy theorists is a sign you have a dog in the race. There is not one thing you say in your comment that has any truth to it or makes any sense. Your comment could come directly from Joel Klein. Did he move to Queens recently? Oh, David Cantor lives there I believe and he is a Queens parent. There goes my paranoia again.

  • QueensParent

    Yeah Norm, I’m Joel Klein. Did you miss a pill this morning?

  • pat

    Most prolific posters on GothamSchools identify themselves. Who is Queens Parent? He/she really seems like a flack for Bloomberg/Klein.

  • QueensParent

    Pat give me a break. I think the real issue here is that anyone who expresses an opinion on here that is seen not in the “I Hate My Life, I Hate Mayor Bloomberg” pattern is seemingly now either Bloomberg and/or Klein. Talk about paranoid.

  • Aaron Pallas

    QueensParent,

    In response to your initial comment here, it’s not that NAEP is a federal enterprise; we all know of many things that state and local governments can do better than the federal government. But NAEP has been around for nearly 40 years, and the federal government devotes more resources to NAEP than underfunded state departments of education can devote to state assessment systems, many of which have only ramped up in the past few years under No Child Left Behind guidelines. There are, for example, more than 20 full-time staff assigned to NAEP at the National Center for Education Statistics. Moreover, because the contract to carry out NAEP is large and prestigious, testing contractors bid their best people for NAEP work.

    I think there are clearer lines of authority and oversight with NAEP than with state assessment systems such as New York State’s. My fear is that unintended gaps in oversight have resulted in a system where no one really knows how it works.

  • QueensParent

    Aaron, perhaps, but if NAEP is so valid and reliable, then why isn’t it used for accountability purposes under NCLB as opposed to each state’s testing system? The Congress had an opportunity to force a federal testing program onto the states with NCLB and it didn’t. This would seem to argue that NAEP isn’t as great as we think it is. I just will never believe it. NAEP tested 2,000 4th graders whereas the state tested all 190,000 of them. I think the State’s results are far more robust given this statistic.

  • Aaron Pallas

    QueensParent,

    I accept that you will never believe it. There are several different issues here. Historically, there has been strong resistance to a mandatory federal test; in fact, the history of NAEP makes clear that initially Congress precluded NAEP from reporting state comparisons. Some people believe that a mandatory federal test exceeds the appropriate role of the federal government in public education, which is a state, not federal, responsibility.

    But there’s a technical issue, and that is that NCLB requires that a test judge each individual student to be proficient or not in math and English language arts in grades three through eight. The design of the NAEP assessment doesn’t generate individual student scores; rather, individual test-takers take a sample of the NAEP items, and then, almost like a jigsaw puzzle, the performance of the individuals gets put together in a way that paints a picture of a large urban district, a state, or the nation overall. The tradeoff is that although NAEP does not provide a reliable look at how individual students are doing, the jigsaw puzzle approach allows for many more topics in a given subject to be considered. In contrast, although state tests may provide a clear picture of the performance of individual students, typically that picture only looks at a small subset of all of the topics (i.e., standards) in a given subject.

    In 2009, there were 31 items on the New York third-grade math assessment. But there are 114 standards in the New York State third-grade mathematics core curriculum. By design, most standards in the New York State curriculum do not get tested in a given year.

  • Fred Smith

    Would it be more robust to take six pints of blood to see if you were anemic or is a small test tube sufficient?

  • http://ednotesonline.blogspot.com/ norm

    “I just will never believe it.”
    “I think the State’s results are far more robust given this statistic.”

    What a perfect match to the title of this piece.
    I also don’t believe the earth is round.

  • http://ednotesonline.blogspot.com/ norm

    Queens parent would be willing to sacrifice 6 pints to make BloomKlein look good.

  • QueensParent

    Aaron but that’s my point exactly. The federal government has taken a SAMPLE of students on a bi-annual test, and that’s supposed to be somehow a more robust picture of student achievement in NY State than testing every student annually. At least that is what apologists of the NAEP are claiming, but it just defies logic. Further, if in the 4th grade the NY State sample was 2,000 students, then proportionally I’d say only about 600 of them are NYC students. So, the test results of 600 4th graders in NYC are going to say more than the thousands of all 4th graders in NYC on the State tests? I think not. Impossible. What I do think the NAEP shows us is that folks are reading far too much into what these tests mean. All I take from them is we know students can be more proficient than they are, but no federal test is going to get to do to it. Education is a local responsibility.

  • Michael M.

    Lying about education is a local responsibility to ferret out.

  • inexile

    Forget the tests for a minute.  All one has to do is look at student work.  I brought home a bunch of essays students had written to type. Why am I typing student work?  There are only laptop computers on rolling carts at my school.  There is no computer lab.  So students have to have a flash drive to save their work to and they have to know how to save their work.  About 50 percent of my students do not know how to do this.  They don’t have basic technological skills. So I bring home student work and type it myself because it’s easier for me and for them to revise and edit their work if it’s typed.  So back to the work itself.  My husband helps me in this endeavor.  He takes 30 essays and I type 30.  He is a businessman and he is stunned by the writing of my 9th grade students.  Of the 60 essays, probably 50% of the kids are functionally illiterate.  They’ve arrived in the 9th grade with 2s on the ELA and they cannot write.  They use no writing conventions.  I have students who write with no periods and no punctuation and show no understanding of capitalization.  Their spelling is atrocious.  How do you arrive in 9th grade functionally illiterate?  I’m sure many would say they’ve had many years of bad teaching – but how do you account for the 50% of them who are very literate and write well?  Many of them have been in school together for 10 years so they presumably received the same instruction.  It’s a mystery.

  • QueensParent

    I accept your point but I also would point out that my son has had two teachers that I would consider to be illiterate based on the written letters and notes they send home to parents. Misspelled words. Lack of punctuation. No subject-verb agreement. Talk about compensatory education. And a school with only one laptop? I’ve never heard of this. I live in one of the poorer Queens districts and the schools have computers. No, every student doesn’t have one, but I don’t think that’s the case at any school.

  • http://ednotesonline.blogspot.com/ norm

    I love the deflection Queens parent of the implied criticism of the level of 9th graders who have spent 7 years under BloomKlein to the 2 teachers. I bet you blame the UFT and not BloomKlein for any of this, maybe even the lack of laptops in the schools. I was in tech ed for the last 15 years I was in the system (I left at the onset of BloomKlein) and word is that many schools are further behind than when I left. The DOE has reorganized tech even more often than the rest of the system and it all kept getting worse. The major difference today is that more kids have computers at home so whatever skills they learn they bring to the table. Many computer labs have been dismantled and even where there are laptops, the test prep mania (one of the factors in the high NY State scores) has pushed out most of the curriculum, with tech ed taking an especially big hit. Only schools with forward looking principals or a very aggressive tech teacher have a semblance of a program. I left the punctuation in for your benefit but I bet I mispeled a phew words. So call me illiterate.
