Just the research we all want to see: Multiple measures don't mean muddied mess


Value-added measures are often criticized for giving a narrow view of a teacher's performance, while broader measures like classroom observations are dismissed as too subjective. A new study shows--happily--that the two types of evaluation are consistent and complementary: each predicts future students' achievement, and teachers who score well on one tend to score well on the other. Best of all, combining them produces a stronger, more accurate measure of a teacher's effectiveness than either alone.
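To see why averaging two noisy readings of the same underlying quality can predict outcomes better than either reading alone, here is a minimal simulation. It is purely illustrative and not based on the study's data: the variable names, noise levels, and effect sizes are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers = 10_000

# Hypothetical "true" teacher effectiveness (standardized units).
true_effect = rng.normal(0, 1, n_teachers)

# Two noisy, independent measurements of that same quality:
# a value-added score and a mentor observation rating (assumed noise levels).
value_added = true_effect + rng.normal(0, 1.0, n_teachers)
mentor_rating = true_effect + rng.normal(0, 1.0, n_teachers)

# Future student achievement depends on true effectiveness plus
# classroom-level noise the teacher cannot control.
future_achievement = true_effect + rng.normal(0, 1.5, n_teachers)

# Combined measure: a simple average of the two evaluations.
combined = (value_added + mentor_rating) / 2

for name, measure in [("value-added", value_added),
                      ("mentor rating", mentor_rating),
                      ("combined", combined)]:
    r = np.corrcoef(measure, future_achievement)[0, 1]
    print(f"{name:>13}: correlation with future achievement = {r:.2f}")
```

Because the measurement errors in the two evaluations are independent, averaging them cancels part of the noise, so the combined score tracks true effectiveness, and hence future achievement, more closely than either input does on its own.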

Jonah E. Rockoff and Cecilia Speroni of Columbia University looked at the ability of three measures to predict teacher effectiveness: a rigorous job application process, observations and ratings by trained mentors, and value-added calculations based on students' math and English scores.

Here's a scorecard that sorts out the rather complicated relationship between these evaluations and student performance:

[Scorecard graphic from the original post, summarizing how each measure relates to student performance]
Looking at these findings, one might wonder why schools need to look at mentor scores at all. Wouldn't it be simpler to base a teacher's evaluation entirely on student test data?

The answer is that for teachers in the middle--neither exemplary nor especially weak--value-added data is the least reliable measure. But for these same teachers, as Rockoff and Speroni show, the mentors' input is especially meaningful and offers a way to increase the reliability of their evaluations.

In a few months, we'll get an even better grasp on this issue when the Bill & Melinda Gates Foundation releases the second set of findings from its Measures of Effective Teaching study--this time looking at the correlation between evaluations and test scores.