GothamSchools — daily independent reporting on NYC public schools

fact-check

Feds correct Klein on how to talk about the achievement gap

A statistic that Joel Klein, Al Sharpton, and Mort Zuckerman have all recently employed to bemoan the racial achievement gap appears to be wrong.

Here’s the statistic, as Klein and Sharpton recently summarized it in the Wall Street Journal (Mort Zuckerman has used it as well):

“today the average 12th-grade black or Hispanic student has the reading, writing and math skills of an eighth-grade white student.”

The problem isn’t the principle behind the claim; America definitely has a racial achievement gap. The problem, according to an official at the National Center for Education Statistics, is in the specific way that Klein et al. describe the gap.

The best available measure we have to compare all American kids is the National Assessment of Educational Progress, or the NAEP test. But the NAEP test, which is given only to a sample of students across the country, not to every child, does not permit the kind of detailed comparison Klein’s statistic would demand, Arnold Goldstein, the NCES official, said. “It would be great if we could. It’s kind of frustrating not to be able to make these sorts of statements,” said Goldstein, who is program director for design, analysis, and reporting at NCES’s assessment division. “But that’s a limitation of the data.”

I contacted the Department of Education several times for comment but got no response this week. UPDATE: A spokesman, Andrew Jacob, wrote to say that Klein got the statistic from “No Excuses: Closing the Racial Gap in Learning,” a book by Abigail and Stephan Thernstrom.

Goldstein said that one problem is that the statistic Klein uses aims to compare students who are in different grade levels. But the NAEP test for, say, eighth-grade math, which might look at algebra, is not comparable to the test for, say, twelfth-grade math, which looks at higher-level skills. Observers cannot, then, compare a high school senior’s skill level to a middle schooler’s.

Goldstein said another problem is that NAEP relies on sampling. That means that, in order to draw conclusions about all students in a group, researchers have to make sure that enough students took the tests — in other words, that the sample size was large enough. Narrowing down what you’re looking at so that it’s not just one racial group in one grade, but comparing racial groups across subjects and grades, would require a larger sample size than federal researchers have, Goldstein said.
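To see why sample size matters for the point Goldstein raises, here is a rough sketch in Python. It assumes a simple random sample, which NAEP's actual sampling design is not, and the sample sizes are made up for illustration. The helper `margin_of_error` is hypothetical and not part of any NCES tool. It shows how the uncertainty around a reported proficiency rate grows as the group being examined gets smaller.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p.

    Assumes a simple random sample of size n (an illustrative
    simplification; real survey designs need design-based errors).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sample sizes, NOT NAEP's actual figures.
# Each step narrows the group being examined by a factor of ten.
for n in (10_000, 1_000, 100):
    moe = margin_of_error(0.16, n)
    print(f"n={n:>6}: 16% plus or minus {moe * 100:.1f} points")
```

Under these assumptions, cutting the sample by a factor of 100 makes the margin of error ten times wider, which is why slicing the data by race, grade, and subject at once quickly runs out of statistical power.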

How should the achievement gap be described? Goldstein said the best thing to do is to stick to comparisons within grade levels. One such comparison: In 2005, the most recent year high school seniors were tested, 16% of black twelfth-graders scored proficient on the national reading test, compared to 43% of white twelfth-graders.

Thanks to the Education Writers Association’s president and public editor for asking Goldstein this question first.

  • ceolaf

    This is not quite right, though it is very close. The statement “But the NAEP test, which is given only to a sample of students across the country, not to every child, does not permit the kind of detailed comparison Klein’s statistic would demand,” is actually highly misleading.

    1) This problem is NOT a product of the fact that the test is given to only a sample of students, or even because no single student gets the entire set of questions. You could test every kid (in a given grade) in the entire country with the entire test, and you still couldn’t make this kind of comparison.

    2) This is an issue of test design, not test administration.

    3) The issue is called “linking.” That is, are the different tests (i.e. the 8th and 12th grade tests) linked? And they are not. (I’m stretching my memory a little bit. Linking MIGHT not be the right term, but I think that it is.)

    4) This mistake is easy to make because the scores are reported on very similar looking scales, with the 12th grade scores generally greater than the 8th grade scores. However, this is neither a design nor an administration issue. They could report the scores on any scale they want. It’s an arbitrary decision, and one that has been made poorly. This mistake is not actually that uncommon, because of the dumb way the scores are reported. But the people who run the NAEP chose to report them this way, anyway.

    5) I believe that they actually broke the link between successive administrations of the NAEP at one point. That is, we want to be able to look at this year’s results and compare them to the results of last time, and the time before that and the time before that. And you can generally do that, and are supposed to be able to do that. But one time, you couldn’t. So, you can compare the results of every administration before that point to each other, and every administration after that point to each other. But you can’t compare across that broken link. But because they kept the same scale, people make this mistake all the time.

    ******************

    This is all incredibly important because of the way the country (the policy makers, the practitioners, the public, the media and everyone else) looks to test scores as the most important educational outcome. (Heck, I would suggest that most people look to test scores as the only important educational outcome.) And yet, only a tiny fraction of a tiny fraction of a tiny fraction of people know enough about testing to understand what the reports might actually mean. Even among policy professionals, knowledge of educational measurement is quite poor.

    Alas…

  • Anon E. Muss

    The correct phrase that the previous commenter was searching for is “equating.”
    Because NAEP tests are not scaled from grade to grade, or vertically scaled, one cannot accurately equate the scores between grade 8 and grade 12.

  • ceolaf

    Yes, “vertical scaling” was the term I was looking for.

    Thank you, Anon.

    ************************************

    A little bit more on vertical scaling.

    There are plenty of tests out there that are vertically scaled across many grades, even all the way from K through 12. Obviously, the different tests for each grade will focus on different material. This does not prevent them from being vertically scaled.

    There are technical reasons why the 8th grade NAEP might not be scaled with the 12th grade NAEP, but there are a bunch of practical reasons, too. It costs time and money to do that kind of thing, for example. Also, there are assumptions you have to make about that material in order to vertically scale, but I think that these assumptions have already been made to generate the scores that exist. (We make assumptions in our work all the time, based on experience and judgement. These are not unwarranted assumptions or rash assumptions. Rather, they define the problem we are working on. Others might view the problem differently, and therefore have different assumptions.)

    I don’t buy Goldstein’s explanation for why the two scales are not equated, because what he is talking about does not make it impossible. I do not know if he is trying to give an explanation that is easy for laypeople to understand, or if he is intentionally trying to obfuscate, or if he thinks that this is actually the reason. But I am fairly confident that he is wrong.
