Outsiders should evaluate schools

The most important determinant of educational quality is teacher quality. Yet, as a recent study of school principals’ permissiveness in teacher evaluations and a cheating scandal in Atlanta show, teacher performance is difficult to measure.

The best way forward is to move the evaluation of teachers outside the schools entirely, with standardized tests administered by an independent agency. This would be supplemented by classroom assessments based on unobtrusive videotaping, also judged by outsiders, including teachers’ representatives.

Researchers have long noted the power that teachers have over student test scores. In an influential paper published in 2005, economists Steven Rivkin, Eric Hanushek and John Kain examined administrative data in Texas and found that 15 percent of the differences in students’ math scores were explained by variations in teacher quality. Moving from a teacher who is rated average to one who is better than 85 percent of educators generates the same improvement in test-score gains as reducing class size by 10 students.

My Harvard colleagues Raj Chetty and John Friedman, together with Jonah Rockoff, link school data with evidence on adult earnings and find that replacing a teacher “in the bottom 5 percent with an average teacher would increase the present value of students’ lifetime income by more than $250,000.”

Teacher quality matters, but standard observable measures of teacher qualifications don’t. Research, including that cited above, typically finds that extra degrees or certificates or years of experience only marginally affect student performance as measured by test scores.

One approach is to follow standard corporate practices, giving principals the power to decide which teachers are good or bad. Good managers should know which workers are more productive, and good principals should be able to assess their educators, taking into account the ephemeral elements that can influence a classroom. Economists Brian Jacob and Lars Lefgren have found that principals are quite good at identifying which teachers produce the highest test-score gains.

But a recent New York Times article reminds us that you can’t always trust principals to use the knowledge they possess. In Florida, Michigan and Tennessee, principals were called on to rate teachers and graded more than 97 percent of them as being effective or better. Now that’s grade inflation.

Like college teachers who freely dispense A’s to unworthy students, principals want to be popular and avoid the hassles inherent in telling people that their work is subpar. In corporations, managers know that their own promotions depend on firing the incompetent. Principals know no such thing, so they take the easy way out.

Just as colleges could fix grade inflation with a simple policy of assigning a grade distribution to each teacher, requiring a fixed number of A’s, B’s, C’s and F’s, principals could be similarly required to fire the bottom tenth of their teachers each year. I suspect that such a draconian policy would help American children, but good luck getting that one through the teachers’ unions.

The standard alternative to relying on principal evaluations is to use student test-score gains. This approach also has limitations. Unions aren’t fond of evaluating teachers this way. In 2008, the New York Legislature went so far as to stop the use of test scores in teacher evaluation. Luckily, the ban hasn’t lasted.

Many criticisms leveled at using student test scores are refutable. We can deal with the problem that some teachers get tougher students both by looking at test-score gains (rather than score levels) and controlling for observable student attributes.
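As a rough illustration of that adjustment, here is a minimal sketch in Python. It is not the method used in any of the studies cited here. The field names (`teacher`, `prior`, `current`, `low_income`) and the single 0/1 student attribute are hypothetical, and a real analysis would control for many attributes at once and estimate teacher effects jointly rather than in two steps.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Average adjusted test-score gain per teacher.

    records: list of dicts with hypothetical keys
      teacher (str), prior (score), current (score), low_income (0 or 1).
    """
    # Step 1: use gains rather than score levels.
    gains = [r["current"] - r["prior"] for r in records]
    attr = [r["low_income"] for r in records]

    # Step 2: fit a one-variable least-squares line of gain on the attribute.
    a_bar, g_bar = mean(attr), mean(gains)
    sxx = sum((a - a_bar) ** 2 for a in attr)
    sxy = sum((a - a_bar) * (g - g_bar) for a, g in zip(attr, gains))
    slope = sxy / sxx if sxx else 0.0
    intercept = g_bar - slope * a_bar

    # Step 3: a teacher's score is the mean gain left over after the adjustment.
    residuals = defaultdict(list)
    for r, a, g in zip(records, attr, gains):
        residuals[r["teacher"]].append(g - (intercept + slope * a))
    return {t: mean(v) for t, v in residuals.items()}
```

A teacher whose students gain more than the fitted line predicts gets a positive score; one whose students gain less gets a negative score. Because this sketch fits the line first and averages second, it is only reliable when student attributes are not strongly tied to which teacher a student gets.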

Even the best teacher can get a bad class, yet teachers can be evaluated over long enough periods to smooth out the idiosyncrasies of particular student groups. Teaching to the test may not be ideal. Yet as long as the test is sufficiently broad, it will still measure student learning. Moreover, the Chetty-Friedman-Rockoff work confirms that teachers who raise test scores also raise adult earnings.

Test scores become valueless, however, if they reflect teacher cheating rather than student achievement. Last week, 35 Georgia educators were indicted in a scandal in which seven teachers were accused of raising students’ scores by erasing wrong answers and making them right, according to the New York Times report.

A study by Steven Levitt and Brian Jacob, recounted in “Freakonomics,” documented teacher cheating by looking at suspicious patterns of incorrect answers in Chicago’s standardized tests.

Teacher cheating isn’t an excuse to give up on standardized tests. It is a reason to administer them properly. Just imagine if college admissions tests were given by individual teachers rather than by the College Board. Teachers would have a huge incentive to help their favored students; the College Board, therefore, administers tests at well-monitored sites.

If the U.S. is going to use standardized tests to evaluate teachers or schools, it should pay the extra price of using an external agency, such as the College Board.

The idea of outside evaluation also makes sense when it comes to assessing the more ephemeral aspects of teacher quality. One advantage of using outsiders, rather than principals, is that independent experts would be more insulated from the pressure to rate poor teachers as effective. Those outsiders could be held accountable if they routinely gave high marks to teachers who achieved limited test-score gains.

A second advantage of this evaluation approach is that teachers themselves, including teachers’ unions, can be brought into the process. The old-fashioned way of doing this would be to have teams of outside monitors randomly visit classrooms. The modern version would be to have video monitors installed in classrooms, a step that Wyoming debated and dismissed in 2011. In theory, video monitoring has improved to the point where students as well as teachers could be regularly filmed, so that the degree of student engagement could be more accurately assessed.

The obvious downside of videotaping is the loss of privacy for teachers and students. I can sympathize with those concerns, because I don’t typically videotape my lectures. As long as some externally administered evaluation complements the test-score approach, I would happily give up the video-camera idea, for now.

The U.S. needs externally administered student tests and classroom visits. The best teachers must be paid more; poor instructors must be managed out.

By Edward Glaeser

*The author, an economics professor at Harvard University, is a Bloomberg View columnist.