Reform is Working

That’s the message from the Tennessee Department of Education based on recently released TCAP results and an analysis of the data over time.

You can see for yourself here and here.

The one area of concern is reading, but overall, students are performing better than they were when the new TCAP tests were introduced and standards were raised.

Here’s the interesting thing: This is true across school districts and demographic subgroups. The trend is positive.

Here’s something else: A similar trend could be seen in results before the change in the test in 2009.

Tennessee students were steadily making gains. Teachers and schools were hitting the mark set for them by policymakers. This in an age of collective bargaining for teachers and no TVAAS-based evaluation or pay schemes.

When the standards were made higher — certainly a welcome change — teachers again hit the mark.

Of course, since the standards change, lots of other reforms have taken place. Most of these have centered around teachers and the incorporation of TVAAS in teacher evaluation and even pay schemes. The State Board of Education even gutted the old state salary schedule to promote pay differentiation, ostensibly based on TVAAS scores.

But does pay for TVAAS actually lead to improved student outcomes as measured by TVAAS?

Consider this comparison of Putnam County and Cumberland County. Putnam was one of the original TIF recipients and among the first to develop a pay scheme based on teacher evaluations and TVAAS.

Putnam’s 2014 TVAAS results are positive, to be sure. But neighboring Cumberland County (a district that is demographically similar and has a similar assortment of schools) also shows positive TVAAS results.  Cumberland relies on the traditional teacher pay scale. From 2012-13 to 2013-14, Putnam saw a 50% increase in the number of categories (all schools included) in which they earned TVAAS scores of 5. So did Cumberland County.

Likewise, from 2012-13 to 2013-14, Putnam saw a 13% decline in the number of categories in which they earned TVAAS scores below a 3. In Cumberland County, the number was cut by 11%.
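To make the percent-change arithmetic behind those comparisons concrete, here is a minimal sketch in Python. The category counts below are hypothetical placeholders chosen for illustration, not the actual Putnam or Cumberland numbers:

```python
# Percent-change arithmetic behind the Putnam/Cumberland comparison.
# The category counts are hypothetical, not the districts' actual figures.

def percent_change(before, after):
    """Return the percent change from `before` to `after`."""
    return (after - before) / before * 100

# Hypothetical: categories earning a TVAAS score of 5, 2012-13 vs. 2013-14
print(percent_change(10, 15))  # 50.0 -> a "50% increase"

# Hypothetical: categories earning TVAAS scores below 3
print(round(percent_change(15, 13), 1))  # -13.3 -> roughly a "13% decline"
```

The point of the comparison in the text is that both districts moved by similar percentages, whether or not they used performance pay.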

This is one example over a two-year cycle. New district-level results for 2015 will soon be available and will warrant an update. But it’s also worth noting that these results track those seen in Denver in analysis of its ProComp pay system. Specifically, the University of Colorado’s Denver ProComp Evaluation Report (2010-2012) finds little impact of ProComp on student achievement, or on teachers’ professional practices, including their teaching practices or retention.

The Putnam-Cumberland initial analysis tracks with that of the ProComp studies: Teacher performance pay, even if devised in conjunction with teacher groups, cannot be said to have a significant impact on student performance over time.

So, prior to 2008, student academic achievement as measured by Tennessee standardized tests showed steady improvement over time. This occurred in an environment with no performance pay. Again, from 2009 to 2015, across districts and demographic groups, student achievement is improving. Only a small number of Tennessee districts have performance pay schemes, so that alone would indicate that performance pay is not driving improved student outcomes. And a preliminary comparison of two districts suggests that both performance pay and non-performance pay districts see significant (and similar) TVAAS gains.

Reform may be working — but it may not be the reform the reformers want to push.

For more on education politics and policy in Tennessee, follow @TNEdReport

Is John Oliver Reading TN Ed Report?

John Oliver recently took on the issue of standardized testing and it sounds like he’s been reading Tennessee Education Report. In 18 brilliant minutes, he hits on a number of topics covered here time and again.

Oliver discussed teacher merit pay, the recruiting tactics of testing companies, value-added assessment, and testing transparency.

Back in 2013, Tennessee’s State Board of Education moved toward merit pay based on value-added data.

This year, while adding nearly $100 million to the pot for teacher compensation, Governor Haslam continued a push for merit pay.

Oliver noted that Pearson recruits test scorers on Craigslist; Tennessee’s new testing vendor, Measurement, Inc., uses the same practice.

And of course, there’s the issue of value-added assessment — in Tennessee, called TVAAS. While it yields some interesting information, it’s not a reliable predictor of teacher performance and it’s going to be even more unreliable going forward, due to the shift from TCAP to TNReady. Here’s what we’ve learned from TVAAS in Tennessee:

In fact, this analysis demonstrates that the difference between a value-added identified “great” teacher and a value-added identified “average” teacher is about $300 in earnings per year per student.  So, not that much at all.  Statistically speaking, we’d call that insignificant.  That’s not to say that teachers don’t impact students.  It IS to say that TVAAS data tells us very little about HOW teachers impact students.

Surprisingly, Tennessee has spent roughly $326 million on TVAAS and attendant assessment over the past 20 years. That’s over $16 million a year on a system that is not yielding much useful information.

And then there’s testing transparency. Oliver points out that it’s difficult if not impossible to get access to the actual test questions. In fact, Tennessee’s testing vendor, Measurement, Inc., has a contract with Utah’s testing vendor that involves a fine if test questions are revealed — $5000 per question:

The contract further notes that any release of the questions either by accident or as required by law, will result in a fee of $5000 per test item released. That means if Tennessee wants to release a bank of questions generated from the Utah test and used for Tennessee’s assessment, the state would pay $5000 per question.

Here’s the clip from John Oliver:

 

For more on education politics and policy in Tennessee, follow @TNEdReport

 

The End of an Era

Over at Bluff City Ed, Jon Alfuth celebrates the end of the EOC testing era. Those tests will be replaced with TNReady next year.

Alfuth notes that there are many challenges with the current testing regime, including gaming the system and misalignment with current standards.

Here’s what he says he hopes the new tests provide:

First, I’d personally like to see aligned pre- and formative assessments to allow teachers to track tests throughout the year. These could be given to the districts and used to develop a benchmark for where students are starting and track their progress throughout the year. These should be designed by Measurement Inc. to ensure close alignment to the actual test.

Second, we need to see shorter tests. Asking students to sit for between two and four three-hour assessments in a four-day period is a lot, and it does stress kids out. I’d like to see the number of questions reduced on the new TNReady assessments to reflect this reality.

Third, we need better special education and special needs accommodations. I’m not a special education teacher myself, but from talking to some of my colleagues my understanding is that the accommodations for the EOC regime aren’t the greatest. Hopefully a technologically advanced test like TNReady (it can be given on paper or on a computer) could include better accommodations for kids with special needs. I also hope it makes automatic adjustments for students who, say, speak English as a second language.

Fourth, we need to see a substantial increase of resources aligned to the new assessments, and SOON. Teachers need time to internalize the format and the types of questions that students will be asked to complete on the new assessments. That was one of the failings of PARCC and one reason I believe we no longer have it in Tennessee – teachers didn’t have enough supporting resources and backed off support for the assessment. Let’s hope that TNReady doesn’t make the same mistake.

More on TNReady:

TNReady to Borrow Questions from Utah

Transition to TNReady Creates TVAAS Problems

For more on education politics and policy, follow @TNEdReport

A Little Less Bad

From a story in Chalkbeat:

Tennessee’s teacher evaluation system is more accurate than ever in measuring teacher quality…

That’s the conclusion drawn from a report on the state’s teacher evaluation system conducted by the State Department of Education.

The idea is that the system is improving.

Here’s the evidence the report uses to justify the claim of an improving evaluation system:

1) Teacher observation scores now more closely align with teacher TVAAS scores — TVAAS is the value-added modeling system used to determine a teacher’s impact on student growth

2) More teachers in untested subjects are now being evaluated using the portfolio system rather than TVAAS data from students they never taught

On the second item, I’d note that previously, 3 districts were using a portfolio model and now 11 districts use it. This model allows related-arts teachers and those in other untested subjects to present a portfolio of student work to demonstrate that teacher’s impact on growth. The model is generally applauded by teachers who have a chance to use it.

However, there are 141 districts in Tennessee and 11 use this model. Part of the reason is the time it takes to assess portfolios well and another reason is the cost associated with having trained evaluators assess the portfolios. Since the state has not (yet) provided funding for the use of portfolios, it’s no surprise more districts haven’t adopted the model. If the state wants the evaluation model to really improve (and thereby improve teaching practice), they should support districts in their efforts to provide meaningful evaluation to teachers.

A portfolio system could work well for all teachers, by the way. The state could move to a system of project-based learning and thus provide a rich source of material for both evaluating student mastery of concepts AND teacher ability to impact student learning.

On to the issue of TVAAS and observation alignment. Here’s what the report noted:

Among the findings, state education leaders are touting the higher correlation between a teacher’s value-added score (TVAAS), which estimates how much teachers contribute to students’ growth on statewide assessments, and observation scores conducted primarily by administrators.

First, the purpose of using multiple measures of teacher performance is not to find perfect alignment, or even strong correlation, but to utilize multiple inputs to assess performance. Pushing for alignment suggests that the department is actually looking for a way to make TVAAS the central input driving teacher evaluation.

Advocates of this approach will suggest that student growth can be determined accurately by TVAAS and that TVAAS is a reliable predictor of teacher performance.

I would suggest that TVAAS, like most value-added models, is not a significant differentiator of teacher performance. I’ve written before about the need for caution when using value-added data to evaluate teachers.

More recently, I wrote about the problems inherent in attempting to assign growth scores when shifting to a new testing regime, as Tennessee will do next year when it moves from TCAP to TNReady. In short, it’s not possible to assign valid growth scores when comparing two entirely different tests.  Researchers at RAND noted:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.
These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005): you get different results depending on the type of question you’re measuring.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured. 
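The kind of sensitivity the researchers describe can be illustrated with a toy simulation, sketched below under made-up assumptions (simple mean-gain estimates with invented effect sizes and noise levels; this is not the actual TVAAS model):

```python
# Toy simulation of the RAND finding: value-added estimates for the same
# teachers can diverge when student achievement is measured two ways.
# All effect sizes and noise levels are invented for illustration.
import random

random.seed(0)
n_teachers, n_students = 50, 25

rankings = {"measure_a": [], "measure_b": []}
for t in range(n_teachers):
    # A teacher's "true" effect differs by the skill measured:
    # correlated across measures, but not identical.
    effect_a = random.gauss(0, 1)
    effect_b = 0.5 * effect_a + random.gauss(0, 0.87)
    # Estimated effect = mean student gain on each measure, plus noise.
    est_a = sum(effect_a + random.gauss(0, 2) for _ in range(n_students)) / n_students
    est_b = sum(effect_b + random.gauss(0, 2) for _ in range(n_students)) / n_students
    rankings["measure_a"].append(est_a)
    rankings["measure_b"].append(est_b)

# How many teachers land in a different half of the distribution
# depending on which test was used?
med_a = sorted(rankings["measure_a"])[n_teachers // 2]
med_b = sorted(rankings["measure_b"])[n_teachers // 2]
flips = sum((a > med_a) != (b > med_b)
            for a, b in zip(rankings["measure_a"], rankings["measure_b"]))
print(f"{flips} of {n_teachers} teachers switch halves across measures")
```

Even in this simple setup, a nontrivial share of simulated teachers change rank depending on the measure, which is the core of the researchers’ caution.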

And they concluded:

Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests.

So, even if you buy the idea that TVAAS is a significant differentiator of teacher performance, drawing meaningful conclusions from next year’s TNReady simply is not reliable.

The state is touting improvement in a flawed system that may now be a little less bad. And because they insist on estimating growth from two different tests with differing methodologies, the growth estimates in 2016 will be unreliable at best. If they wanted to improve the system, they would take two to three years to build growth data based on TNReady, which would mean two to three years of NO TVAAS data in teacher evaluation.

Alternatively, the state could move to a system of project-based learning and teacher evaluation and professional development based on a Peer Assistance and Review Model. Such an approach would be both student-centered and result in giving teachers the professional respect they deserve. It also carries a price tag — but our students are worth doing the work of both reallocating existing education dollars and finding new ways to invest in our schools.

For more on education politics and policy in Tennessee, follow @TNEdReport


Validating the Invalid?

The Tennessee House of Representatives passed legislation today (HB 108) that makes changes to current practice in teacher evaluation as Tennessee transitions to its new testing regime, TNReady.

The changes adjust the percentage of a teacher’s evaluation that is dependent on TVAAS scores to 10% next year, 20% the following year, and back to the current 35% by the 2017-18 academic year.

This plan is designed to allow for a transition period to the new TNReady tests, which will include constructed-response questions and be aligned to the so-called Tennessee standards that match up with the Common Core State Standards.
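As a rough illustration of how the phase-in changes the arithmetic of an overall evaluation score, here is a minimal sketch. The two-component blend (TVAAS plus observation) is a simplifying assumption for illustration, not the state’s actual formula:

```python
# Sketch of the TVAAS phase-in under HB 108: the growth-score weight
# moves from 10% to 20% and back to 35% over three years.
# Treating everything else as a single observation score is a
# simplifying assumption, not the state's actual evaluation formula.

TVAAS_WEIGHT_BY_YEAR = {2016: 0.10, 2017: 0.20, 2018: 0.35}

def overall_score(year, tvaas_score, observation_score):
    """Blend a 1-5 TVAAS score and a 1-5 observation score by year."""
    w = TVAAS_WEIGHT_BY_YEAR[year]
    return w * tvaas_score + (1 - w) * observation_score

# A teacher with a strong observation score (4) and a weak TVAAS score (2):
for year in sorted(TVAAS_WEIGHT_BY_YEAR):
    print(year, round(overall_score(year, tvaas_score=2, observation_score=4), 2))
# 2016 3.8
# 2017 3.6
# 2018 3.3
```

The same weak TVAAS number drags the overall score down further each year as its weight climbs back to 35 percent, which is why the validity of that number matters so much.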

Here’s the problem: There is no statistically valid way to predict expected growth on a new test based on the historic results of TCAP. First, the new test has (supposedly) not been fully designed. Second, the test is in a different format. It’s both computer-based and contains constructed-response questions. That is, students must write out answers and/or demonstrate their work.

Since Tennessee has never had a test like this, it’s impossible to predict growth at all. Not even with 10% confidence. Not with any confidence. It is the textbook definition of comparing apples to oranges.

Clearly, legislators feel that, at the very least, this is an improvement: a reasonable accommodation to teachers as our state makes a transition.

But, how is using 10% of an invalid number a good thing? Should any part of a teacher’s evaluation be made up of a number that reveals nothing at all about that teacher’s performance?

While value-added data alone is a relatively poor predictor of teacher performance, the value-added estimate used next year is especially poor because it is not at all valid.

But, don’t just take my word for it. Researchers studying the validity of value-added measures asked whether value-added gave different results depending on the type of question asked. Particularly relevant now because Tennessee is shifting to a new test with different types of questions.

Here’s what Lockwood and McCaffrey (2007) had to say in the Journal of Educational Measurement:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.
These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005): you get different results depending on the type of question you’re measuring.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured. 

And they concluded:

Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests.

If you measure different skills, you get different results. That decreases (or eliminates) the reliability of those results. TNReady is measuring different skills in a different format than TCAP. It’s BOTH a different type of test AND a test on different standards. Any value-added comparison between the two tests is statistically suspect, at best. In the first year, such a comparison is invalid and unreliable. As more years of data become available, it may be possible to make some correlation between past TCAP results and TNReady scores.

Or, if the state is determined to use growth scores (and wants to use them with accuracy), they will wait several years and build completely new growth models based on TNReady alone. At least three years of data would be needed in order to build such a model.

It seems likely that the Senate will follow the House’s lead on Monday and overwhelmingly support the proposed evaluation changes. But in doing so, they should be asking themselves if it’s really ok to base any part of a teacher’s evaluation on numbers that reliably predict nothing.

More on Value-Added:

Real World Harms of Value-Added Data

Struggles with Value-Added Data

 

Is THAT even legal?

That’s the question the Tennessee Education Association is asking about the use of value-added data (TVAAS) in teacher evaluations.

The TEA, joining with the National Education Association, has filed a lawsuit challenging the constitutionality of Tennessee’s use of TVAAS data in teacher evaluations.

According to a press release, TEA is specifically concerned about teachers who receive value-added scores based on students they have never taught. A significant number of Tennessee teachers currently receive a portion of their evaluation score based on TVAAS scores from school-wide or other data, meaning teachers are graded based on students they’ve never taught.

The release states:

More than half of the public school teachers in Tennessee receive evaluations that are based substantially on standardized test scores of students in subjects they do not teach. The lawsuit seeks relief for those teachers from the arbitrary and irrational practice of measuring their effectiveness with statistical estimates based on standardized test scores from students they do not teach and may have never met. 

While Governor Haslam is proposing that the legislature reduce the impact of TVAAS scores on teacher evaluations during the state’s transition to new standardized tests, his proposal does not address the issues of statistical validity with the transition. There is no way to determine how TCAP scores will interface with the scores from a test that has not even been developed yet. To hold teachers accountable for data generated in such an unreliable fashion is not only statistically suspect, it’s disrespectful.

Finally, it’s worth noting that value-added data doesn’t do much in terms of differentiating teacher performance. Of course, even if it did, holding teachers accountable for students they don’t teach defies logic.

For more on education politics and policy in Tennessee, follow @TNEdReport

 

Do Your Job, Get Less Money

Over at Bluff City Ed, there’s an article analyzing the new pay scale for teachers in Shelby County Schools. The scale is weighted toward TVAAS data and the evaluation rubric, which rates teachers on a scale of 1-5, 1 being significantly below expectations and 5 being significantly above. A teacher earning a 3 “meets expectations.” That means they are doing their job and doing it well.

Jon does a nice job of breaking down what it means to “meet expectations.” But, here’s the problem he’s highlighting:  Teachers who meet expectations in the new system would see a reduction in their annual step raise. That’s right: They do their job and meet the district’s performance expectations and yet earn LESS than they would with the current pay system.

Jon puts it this way:

But what the district outlines as meeting expectations exemplifies a hardworking and effective educator who is making real progress with their community, school and students. If a teacher is doing all these things, I believe that they should be in line for a yearly raise, not a cut. At its core, this new merit pay system devalues our teachers who fulfill their professional duties in every conceivable way.

I would add to this argument that to the extent that the new pay scale is based on a flawed TVAAS system which provides minimal differentiation among teachers, it is also flawed. Value-added data does not reveal much about the differences in teacher performance. As such, this data shouldn’t weigh heavily (or at all) in performance pay schemes.

Systems like Shelby County may be better served by a pay scale that starts teachers at a high salary and rewards them well over time. Increasing pay overall creates the type of economic incentives that both attract strong teachers and encourage school systems to develop talent and counsel out low performers.

Shelby County can certainly do more to attract and retain strong teaching talent. But the new pay scale is the wrong way to achieve that goal.

For more on education politics and policy in Tennessee, follow @TNEdReport

 

Little Value Added?

 

That’s the conclusion teacher Jon Alfuth draws about Governor Bill Haslam’s recently announced changes to teacher evaluation and support.

Alfuth notes with frustration that Haslam appears happy to support teachers in ways that don’t involve any new money.

Reducing the weight given TVAAS on a teacher’s evaluation, for example, doesn’t cost anything. Adding a few teachers to a “cabinet” to give feedback on tests is a welcome change, but also doesn’t carry a price tag.

Haslam’s changes still unfairly assess teachers in non-tested subjects, in Alfuth’s view:

While reducing the percentage from 25 to 15 percent achievement data for non-EOC teachers is a step in the right direction, I don’t feel that it goes far enough. I personally think it’s unfair to use test scores from courses not taught by a teacher in their evaluation given the concerns surrounding the reliability of these data systems overall.

And, Alfuth says, the financial support teachers and schools need is simply not discussed:

Consider the teacher salary discussion we’ve been having here in Tennessee. This is something that Tennessee teachers have been clamoring for and which the governor promised but then went back on this past spring. There’s no mention of other initiatives that would require extra funding, such as BEP 2.0, which would provide millions of additional dollars to our school districts across the state and do much to help teachers. There’s also no mention of expanding Common Core training, which is essential if we’re going to continue to enable teachers to be successful when the three-year phase-in of growth scores winds down.

In short, while the proposed changes are a step forward, at least in the view of one teacher, much more can be done to truly support teachers and their students.

More on the importance of investing in teacher pay:

Notes on Teacher Pay

More on the state’s broken school funding formula, the BEP:

A BEP Lawsuit?

The Broken BEP

What is BEP 2.0?

For more from Jon Alfuth and education issues in Memphis, follow @BluffCityEd

For more on education politics and policy in Tennessee, follow @TNEdReport

Value Added Changes

 

In what is certain to be welcome news to many teachers across the state, Governor Bill Haslam announced yesterday that he will be proposing changes to the state’s teacher evaluation process in the 2015 legislative session.

Perhaps the most significant proposal is to reduce the weight of value-added data on teacher evaluations during the transition to a new test for Tennessee students.

From the Governor’s press release explaining the proposed changes:

The governor’s proposal would:

• Adjust the weighting of student growth data in a teacher’s evaluation so that the new state assessments in ELA and math will count 10 percent of the overall evaluation in the first year of administration (2016), 20 percent in year two (2017), and 35 percent in year three (2018). Currently, 35 percent of an educator’s evaluation is comprised of student achievement data based on student growth;

• Lower the weight of student achievement growth for teachers in non-tested grades and subjects from 25 percent to 15 percent;

• Make explicit local school district discretion in both the qualitative teacher evaluation model that is used for the observation portion of the evaluation and the specific weight student achievement growth in evaluations will play in personnel decisions made by the district.

 

The proposal does not go as far as some have proposed, but it does represent a transition period to new tests that teachers have been seeking.  It also provides more local discretion in how evaluations are conducted.

Some educators and critics question the ability of value-added modeling to accurately predict teacher performance.

In fact, the American Statistical Association released a statement on value-added models that says, in part:

Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores

Additional analysis of the ability of value-added modeling to predict significant differences in teacher performance finds that this data doesn’t effectively differentiate among teachers.

I certainly have been critical of the over-reliance on value-added modeling in the TEAM evaluation model used in Tennessee. While the proposed change ultimately returns to using VAM for a significant portion of teacher scores, it also represents an opportunity to both transition to a new test AND explore other options for improving the teacher evaluation system.

For more on value-added modeling and its impact on the teaching profession:

Saving Money and Supporting Teachers

Real World Harms of Value-Added Data

Struggles with Value-Added Data

An Ineffective Teacher?

Principals’ Group Challenges VAM

 

For more on education policy and politics in Tennessee, follow @TNEdReport

Ravitch: Ed Reform is a Hoax

Education scholar and activist Diane Ravitch spoke at Vanderbilt University in Nashville last night at an event hosted by Tennesseans Reclaiming Educational Excellence (TREE), the Tennessee BATs (Badass Teachers), and the Momma Bears.

Ravitch touched on a number of hot-button education issues, including vouchers, charter schools, teacher evaluations, and testing. Many of these issues are seeing plenty of attention in Tennessee public policy circles both on the local and state levels.

She singled out K12, Inc. as a bad actor in the education space, calling the Tennessee Virtual Academy it runs a “sham.”

Attempts have been made to cap enrollment and shut down K12, Inc. in Tennessee, but it is still operating this year. More recently, the Union County School Board defied the State Department of Education and allowed 626 students to remain enrolled in the troubled school. The reason? Union County gets a payoff of $132,000 from its contract with K12.

Ravitch noted that there are good actors in the charter sector, but also said she adamantly opposes for-profit charter schools. Legislation that ultimately failed in 2014 would have allowed for-profit charter management companies to be hired by Tennessee charter schools.

On vouchers, an issue that has been a hot topic in the last two General Assemblies, Ravitch pointed to well-established data from Milwaukee that vouchers have made no difference in overall student performance.

Despite the evidence against vouchers, it seems quite likely they will again be an issue in the 2015 General Assembly. In fact, the Koch Brothers and their allies spent heavily in the recent elections to ensure that vouchers are back on the agenda.

Ravitch told the crowd that using value-added data to evaluate teachers makes no sense. The Tennessee Value-Added Assessment System (TVAAS) has been around since the BEP in 1992. It was created by UT Ag Professor Bill Sanders. Outgoing Commissioner of Education Kevin Huffman made an attempt to tie teacher licenses to TVAAS scores, but that was later repealed by the state board of education. A careful analysis of the claims of value-added proponents demonstrates that the data reveals very little in terms of differentiation among teachers.

Ravitch said that instead of punitive evaluation systems, teachers need resources and support. Specifically, she mentioned Peer Assistance and Review as an effective way to provide support and meaningful development to teachers.

A crowd of around 400 listened and responded positively throughout the hour-long speech. Ravitch encouraged the audience to speak up about the harms of ed reform and rally for the reforms and investments our schools truly need.

For more on education politics and policy in Tennessee, follow @TNEdReport