Growth Scores

Get your kid assigned to the right teacher and they just might grow a little taller, new research suggests.

Tennessee has long used something called “value-added assessment” to determine the amount of academic growth students make from year to year. These “growth scores” are then used to generate a score for teachers. The formula in Tennessee is known as TVAAS — Tennessee Value Added Assessment System. Tennessee was among the first states in the nation to use value-added assessment, and the formula became a part of teacher evaluations in 2011.

Here’s how the Tennessee Department of Education describes the utility of TVAAS:

Because students’ performance is compared to that of their peers, and because their peers are moving through the same standards and assessment transitions at the same time, any drops in proficiency during these transitions have no impact on the ability of teachers, schools, and districts to earn strong TVAAS scores.

Now, research on value-added modeling indicates that teacher assignment predicts students’ height almost as well as it predicts their academic achievement. Here’s the abstract from a National Bureau of Economic Research working paper:

Estimates of teacher “value-added” suggest teachers vary substantially in their ability to promote student learning. Prompted by this finding, many states and school districts have adopted value-added measures as indicators of teacher job performance. In this paper, we conduct a new test of the validity of value-added models. Using administrative student data from New York City, we apply commonly estimated value-added models to an outcome teachers cannot plausibly affect: student height. We find the standard deviation of teacher effects on height is nearly as large as that for math and reading achievement, raising obvious questions about validity. Subsequent analysis finds these “effects” are largely spurious variation (noise), rather than bias resulting from sorting on unobserved factors related to achievement. Given the difficulty of differentiating signal from noise in real-world teacher effect estimates, this paper serves as a cautionary tale for their use in practice.

The researchers offer a word of caution:

Taken together, our results provide a cautionary tale for the naïve application of VAMs to teacher evaluation and other settings. They point to the possibility of the misidentification of sizable teacher “effects” where none exist. These effects may be due in part to spurious variation driven by the typically small samples of children used to estimate a teacher’s individual effect.
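
To see how small samples alone can produce teacher “effects” out of nothing, here is a minimal simulation sketch. It is not the researchers’ model and not the TVAAS formula: it assumes a standardized outcome such as a height z-score, assigns students to teachers purely at random, and uses a deliberately crude stand-in for value-added (each class average minus the overall average). The function name, class sizes, and teacher counts are all made up for illustration.

```python
import random
import statistics


def spurious_teacher_effect_sd(n_teachers, class_size, seed=0):
    """Spread of estimated teacher 'effects' when no real effect exists.

    Hypothetical helper: the 'effect' is simply the class mean minus the
    grand mean, a crude stand-in for a real value-added model.
    """
    rng = random.Random(seed)
    # Standardized outcome (mean 0, SD 1); students are assigned at random,
    # so no teacher influences the outcome in this simulation.
    classes = [
        [rng.gauss(0.0, 1.0) for _ in range(class_size)]
        for _ in range(n_teachers)
    ]
    all_students = [x for c in classes for x in c]
    grand_mean = statistics.fmean(all_students)
    # Naive per-teacher "effect": how far the class average sits from the
    # overall average.
    effects = [statistics.fmean(c) - grand_mean for c in classes]
    return statistics.pstdev(effects)


# Assumed sizes: 25 students resembles one classroom; 100 is a much
# larger sample per teacher.
small_class_sd = spurious_teacher_effect_sd(n_teachers=1000, class_size=25)
large_class_sd = spurious_teacher_effect_sd(n_teachers=1000, class_size=100)

print(f"SD of 'effects' with 25 students per teacher:  {small_class_sd:.3f}")
print(f"SD of 'effects' with 100 students per teacher: {large_class_sd:.3f}")
```

In this setup the spread of the fake effects is roughly one over the square root of the class size, so it shrinks only slowly as more students are added. With a single classroom of students per teacher, pure noise can look like a meaningful difference between teachers.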

In short: Using TVAAS to make decisions regarding hiring, firing, and compensation is bad policy.

However, the authors note that policymakers thirst for low-cost, convenient solutions:

In the face of data and measurement limitations, school leaders and state education departments seek low-cost, unbiased ways to observe and monitor the impact that their teachers have on students. Although many have criticized the use of VAMs to evaluate teachers, they remain a widely-used measure of teacher performance. In part, their popularity is due to convenience: while observational protocols which send observers to every teacher’s classroom require expensive training and considerable resources to implement at scale, VAMs use existing data and can be calculated centrally at low cost.

While states like Hawaii and Oklahoma have moved away from value-added models in teacher evaluation, Tennessee remains committed to this flawed method. Perhaps Tennessee lawmakers are hoping for the formula that will ensure a crop of especially tall kids ready to bring home a UT basketball national title.
