It's no fun being graded on a curve

Mark Palko points to a news article by Michael Winerip on teacher assessment:

No one at the Lab Middle School for Collaborative Studies works harder than Stacey Isaacson, a seventh-grade English and social studies teacher. She is out the door of her Queens home by 6:15 a.m., takes the E train into Manhattan and is standing out front when the school doors are unlocked, at 7. Nights, she leaves her classroom at 5:30. . . .

Her principal, Megan Adams, has given her terrific reviews during the two and a half years Ms. Isaacson has been a teacher. . . . The Lab School has selective admissions, and Ms. Isaacson’s students have excelled. Her first year teaching, 65 of 66 scored proficient on the state language arts test, meaning they got 3’s or 4’s; only one scored below grade level with a 2. More than two dozen students from her first two years teaching have gone on to . . . the city’s most competitive high schools. . . .

You would think the Department of Education would want to replicate Ms. Isaacson . . . Instead, the department’s accountability experts have developed a complex formula to calculate how much academic progress a teacher’s students make in a year — the teacher’s value-added score — and that formula indicates that Ms. Isaacson is one of the city’s worst teachers.

According to the formula, Ms. Isaacson ranks in the 7th percentile among her teaching peers — meaning 93 per cent are better. . . .

How could this happen to Ms. Isaacson? . . . Everyone who teaches math or English has received a teacher data report. On the surface the report seems straightforward. Ms. Isaacson’s students had a prior proficiency score of 3.57. Her students were predicted to get a 3.69 — based on the scores of comparable students around the city. Her students actually scored 3.63. So Ms. Isaacson’s value added is 3.63-3.69.

Remember, the exam is on a 1-4 scale, and we were already told that 65 out of 66 students scored 3 or 4, so an average of 3.63 (or, for that matter, 3.69) is plausible. The 3.57 is “the average prior year proficiency rating of the students who contribute to a teacher’s value added score.” I assume that the “proficiency rating” is the same as the 1-4 test score but I can’t be sure.

The predicted score is, according to Winerip, “based on 32 variables — including whether a student was retained in grade before pretest year and whether a student is new to city in pretest or post-test year. . . . Ms. Isaacson’s best guess about what the department is trying to tell her is: Even though 65 of her 66 students scored proficient on the state test, more of her 3s should have been 4s.”

This makes sense to me. Winerip seems to presenting this is as some mysterious process but it seems pretty clear to me. A “3” is a passing grade, but if you’re teaching in a school with “selective admissions” with the particular mix of kids that this teacher has, the expectation is that most of your students will get “4”s.

We can work through the math (at least approximately). We don’t know this teacher’s students did this year so I’ll use the data given above, from her first year. Suppose that x students in the class got 4’s, 65-x got 3’s, and one student got a 2. To get an average of 3.63, you need 4x + 3(65-x) + 2 = 3.63*66. That is, x = 3.63*66 – 2 – 3*65 = 42.58. This looks like x=43. Let’s try it out: (4*43 + 3*22 + 2)/66 = 3.63 (or, to three decimal places, 3.636). This is close enough for me. To get 3.69 (more precisely, 3.697), you’d need 47 4’s, 18 3’s, and a 2. So the gap would be covered by four students (in a class of 66) moving up from a 3 to a 4. This gives a sense of the difference between a teacher in the 7th percentile and a teacher in the 50th.

I wonder what this teacher’s value-added scores were for the previous two years.

P.S. Further detail on the test scoring here.

Topics on this page

English United States Department of Education

28

It’s no fun being graded on a curve

Topics on this page

Related