Key Driver

Much is being made of Tennessee’s teacher evaluation system as a “key driver” of recent “success” in the state’s schools.

A closer look, however, reveals there’s more to the story.

Here’s a key piece of information in a recent story in the Commercial Appeal:

The report admits an inability to draw a direct, causal link from the changes in teacher evaluations, implemented during the 2011-12 school year, and the subsequent growth in classrooms across the state.

Over the same years, the state has also raised its education standards, overhauled its assessment and teacher preparation programs and implemented new turnaround programs for struggling schools.

Of course, it’s also worth noting that BEFORE any of these changes, Tennessee students were scoring well on the state’s TCAP test — teachers were given a mark and were consistently hitting the mark, no matter the evaluation style.

Additionally, it’s worth noting that “growth” as it relates to the current TNReady test is difficult to measure due to the unreliable test administration, including this year’s problems with hackers and dump trucks.

While the TEAM evaluation rubric is certainly more comprehensive than those used in the past, the classroom observation piece becomes difficult to capture in a single observation and the TVAAS-based growth component is fraught with problems even under the best circumstances.

Let’s look again, though, at the claim of sustained “success” since the implementation of these evaluation measures as well as other changes.

We’ll turn to the oft-lauded NAEP results for a closer look:

First, notice that between 2009 and 2011, Tennessee saw drops in 4th and 8th grade reading and 8th grade math. That helps explain the “big gains” seen in 2013. Next, note that in 4th and 8th grade reading and 4th grade math, our 2017 scores are lower than the 2013 scores. There’s that leveling off I suggested was likely. Finally, note that in 4th and 8th grade reading, the 2017 scores are very close to the 2009 scores. So much for “fastest-improving.”

Tennessee is four points below the national average in both 4th and 8th grade math. When it comes to reading, we are three points behind the national average in 4th grade and five points behind in 8th grade.

All of this to say: You can’t say you’re the fastest-improving state on NAEP based on one testing cycle. You also shouldn’t make long-term policy decisions based on seemingly fabulous results in one testing cycle. Since 2013, Tennessee has doubled down on reforms with what now appears to be little positive result.

In other words, in terms of a national comparison of education “success,” Tennessee still has a long way to go.

That may well be because we have yet to actually meaningfully improve investment in schools:

Tennessee is near the bottom. The data shows we’re not improving since Bill Haslam became Governor. At least not faster than other states.

We ranked 44th in the country for investment in public schools back in 2010 — just before these reforms — and we rank 44th now.

Next, let’s turn to the issue of assessing growth. Even in good years, that’s problematic using value-added data:

And so perhaps we shouldn’t be using value-added modeling for more than informing teachers about their students and their own performance, as one small tool as they seek to continuously improve practice. One might even mention a VAM score on an evaluation, but one certainly wouldn’t base 35-50% of a teacher’s entire evaluation on such data. In light of these numbers from the Harvard researchers, that seems entirely irresponsible.
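To see why the weight matters so much, here is a minimal simulation. This is not Tennessee’s actual LOE formula; the 1-5 scales, the fixed non-VAM score, and the noise level are assumptions chosen only to show how the same measurement noise translates into composite swings at a 5% weight versus the 35-50% weights discussed above.

```python
import random

# A minimal sketch (not Tennessee's actual LOE formula) of how much a noisy
# growth score can swing a composite evaluation at different weights.
# Assumptions: every component sits on a 1-5 scale, the non-VAM portion is
# held fixed, and the VAM noise level is illustrative rather than measured.

random.seed(42)

TRUE_SCORE = 3.5        # the teacher's underlying effectiveness (assumed)
OTHER_COMPONENTS = 3.5  # observation and other measures, held fixed (assumed)

def composite(vam_score, vam_weight):
    """Weighted composite: VAM share plus everything else."""
    return vam_weight * vam_score + (1 - vam_weight) * OTHER_COMPONENTS

for vam_weight in (0.05, 0.35, 0.50):
    draws = []
    for _ in range(10_000):
        # A noisy VAM estimate around the true score, clamped to the 1-5 scale.
        vam = min(5.0, max(1.0, random.gauss(TRUE_SCORE, 1.0)))
        draws.append(composite(vam, vam_weight))
    spread = max(draws) - min(draws)
    print(f"VAM weight {vam_weight:.0%}: composite spans ~{spread:.2f} points")
```

Even in this toy version, the same measurement noise that barely nudges a composite at a 5% weight swings it by a couple of points at 50%.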

Then, there’s the issue of fairness when it comes to using TVAAS. Two different studies have shown notable discrepancies in the value-added scores of middle school teachers at various levels:

Last year, I wrote about a study of Tennessee TVAAS scores conducted by Jessica Holloway-Libell. She examined 10 Tennessee school districts and their TVAAS score distribution. Her findings suggest that ELA teachers are less likely than Math teachers to receive positive TVAAS scores, and that middle school teachers generally, and middle school ELA teachers in particular, are more likely to receive lower TVAAS scores.

A second, more comprehensive study indicates a similar challenge:

The study used TVAAS scores alone to determine a student’s access to “effective teaching.” A teacher receiving a TVAAS score of a 4 or 5 was determined to be “highly effective” for the purposes of the study. The findings indicate that Math teachers are more likely to be rated effective by TVAAS than ELA teachers and that ELA teachers in grades 4-8 (mostly middle school grades) were the least likely to be rated effective. These findings offer support for the similar findings made by Holloway-Libell in a sample of districts. They are particularly noteworthy because they are more comprehensive, including most districts in the state.

These studies are based on TVAAS when everything else is going well. But, testing hasn’t been going well and testing is what generates TVAAS scores. So, the Tennessee Department of Education has generated a handy sheet explaining all the exceptions to the rules regarding TVAAS and teacher evaluation:

However, to comply with the Legislation and ensure no adverse action based on 2017-18 TNReady data, teachers and principals who have 2017-18 TNReady data included in their LOE (school-wide TVAAS, individual TVAAS, or achievement measure) may choose to nullify their entire evaluation score (LOE) for the 2017-18 school year at their discretion. No adverse action may be taken against a teacher or principal based on their decision to nullify his or her LOE. Nullifying an LOE will occur in TNCompass through the evaluation summative conference.

Then, there’s the guidance document which includes all the percentage options for using TVAAS:

What is included in teacher evaluation in 2017-18 for a teacher with 3 years of TVAAS data? There are three composite options for this teacher:

• Option 1: TVAAS data from 2017-18 will be factored in at 10%, TVAAS data from 2016-17 will be factored in at 10% and TVAAS data from 2015-16 will be factored in at 15% if it benefits the teacher.

• Option 2: TVAAS data from 2017-18 and 2016-17 will be factored in at 35%.

• Option 3: TVAAS data from 2017-18 will be factored in at 35%. The option that results in the highest LOE for the teacher will be automatically applied. Since 2017-18 TNReady data is included in this calculation, this teacher may nullify his or her entire LOE this year.
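To make the arithmetic behind those three options concrete, here is a rough sketch of the selection rule as the guidance describes it. This is not the department’s actual LOE computation, which maps scores into bands and folds in other measures; the 1-5 scales, the collapsed 65% “other measures” share, the simple averaging of multi-year TVAAS pieces, and all input values are assumptions for illustration.

```python
# A sketch of the option-selection rule described above -- not the
# department's actual LOE computation. Assumptions: everything sits on a
# 1-5 scale, the non-TVAAS 65% of the composite is collapsed into one
# number, and all inputs are hypothetical.

OTHER_MEASURES = 3.8  # observation + achievement portion (65% weight), hypothetical
tvaas = {"2017-18": 2.0, "2016-17": 4.0, "2015-16": 5.0}  # hypothetical 1-5 scores

other_part = 0.65 * OTHER_MEASURES

# Option 1: 2017-18 at 10%, 2016-17 at 10%, 2015-16 at 15%.
# (The guidance attaches an "if it benefits the teacher" condition to the
# 2015-16 piece; that wrinkle is not modeled here.)
option1 = other_part + (0.10 * tvaas["2017-18"]
                        + 0.10 * tvaas["2016-17"]
                        + 0.15 * tvaas["2015-16"])

# Option 2: 2017-18 and 2016-17 together carry the full 35% (averaged here).
option2 = other_part + 0.35 * (tvaas["2017-18"] + tvaas["2016-17"]) / 2

# Option 3: 2017-18 alone carries the full 35%.
option3 = other_part + 0.35 * tvaas["2017-18"]

# Per the guidance, whichever option produces the highest LOE applies automatically.
options = {"Option 1": option1, "Option 2": option2, "Option 3": option3}
best = max(options, key=options.get)
for name, score in options.items():
    print(f"{name}: {score:.2f}")
print(f"Automatically applied: {best} ({options[best]:.2f})")
```

Run with these hypothetical numbers, the earlier, stronger scores buffer a weak 2017-18 result, which is precisely the kind of patch a reliable testing program wouldn’t need.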

That’s just one of several scenarios described to make up for the fact that the State of Tennessee simply cannot reliably deliver a test.

Let’s be clear: Using TVAAS to evaluate a teacher AT ALL in this climate is educational malpractice. But, Commissioner McQueen and Governor Haslam have already demonstrated they have a low opinion of Tennesseans:

Let’s get this straight: Governor Haslam and Commissioner McQueen think no one in Tennessee understands Google? They are “firing” the company that messed up this year’s testing and hiring a new company that owns the old one and that also has a reputation for messing up statewide testing.

To summarize, Tennessee is claiming success off of one particularly positive year on NAEP and on TNReady scores that are consistently unreliable. Then, Tennessee’s Education Commissioner suggests the “key driver” of all this success is a highly flawed evaluation system, a significant portion of which is based on junk science.

The entire basis of this spurious claim is that two things happened around the same time. You know what else has happened since Tennessee implemented new teacher evaluations and TNReady? Really successful seasons for the Nashville Predators.

Correlation does NOT equal causation. Claiming teacher evaluations are a “key driver” of some fairly limited success story is highly problematic, though typical of this Administration.

Take a basic stats class, Dr. McQueen.


For more on education politics and policy in Tennessee, follow @TNEdReport

Your support keeps the education news flowing!



This is Fine

Amid the latest round of TNReady troubles that included both miscalculated student scores and errors in how those scores were used in some teacher evaluations, the House of Representatives held hearings last week to search for answers.

On the same day of the committee hearings, Governor Bill Haslam let everyone know that things were going well.

Chalkbeat reports:

Earlier in the day, Gov. Bill Haslam called the controversy overblown because this year’s errors were discovered as part of the state’s process for vetting scores.

“I think the one thing that’s gotten lost in all this discussion is the process worked,” Haslam told reporters. “It was during the embargo period before any of the results were sent out to students and their families that this was caught.”

Here’s the deal: If this were the only problem with TNReady so far, Governor Haslam would be right. This would be no big deal. But, you know, it’s not the only problem. At all.

Let’s start from the beginning. Which was supposed to be 2016. Except it didn’t happen. And then it kept not happening. For full disclosure, I have a child who was in 4th grade at the time of what was to be the inaugural year of TNReady. Watching her prepare for a week of testing only to be told it would happen later, and then later, and then maybe never, was infuriating. That adults at decision-making levels think it is just fine to treat students that way is telling. It also says something that when some adults try to stand up for their students, they are smacked down by our Commissioner of Education.

As for the aforementioned Commissioner of Education, some may remember the blame-shifting and finger-pointing between Commissioner McQueen and then-TNReady vendor Measurement, Inc. That same attitude was on display again this year when key deadlines were missed for the return of “quick scores” to school districts.

Which brings us to the perennial issue of delivering accurate score reports to districts. This year was the fourth year in a row there have been problems delivering these results to school districts. Each year, we hear excuses and promises about how it will be better next year. Then, it isn’t.

Oh, and what if you’re a parent like me and you’re so frustrated you just want to opt your child out of testing? Well, according to Commissioner McQueen and the Governor who supports her, that’s not an option. Sadly, many districts have fallen in line with this way of thinking.

Here’s the thing: McQueen’s reasoning is missing something. Yes, she lacks credibility generally. But, specifically, she’s ignoring some key evidence. As I noted previously:

All along, the state has argued a district’s federal funds could be in jeopardy due to refusal to administer the test or a district’s inability to test at least 95% of its students.

As such, the argument goes, districts should fight back against opt-outs and test refusals by adopting policies that penalize students for taking these actions.

There’s just one problem: The federal government has not (yet) penalized a single district for failing to hit the 95% benchmark. In fact, in the face of significant opt-outs in New York last year (including one district where 89% of students opted out), the U.S. Department of Education communicated a clear message to New York state education leaders: Districts and states will not suffer a loss of federal dollars due to high test refusal rates. The USDOE left it up to New York to decide whether or not to penalize districts financially.

So, you have a system that is far from perfect and based on this system (TNReady), you penalize teachers (through their evaluations) and schools (through an A-F school grading system). Oh yeah, and you generate “growth” scores and announce “reward” schools based on what can best be described as a problematic (so far) measuring stick with no true comparability to the previous measuring stick.

Anyway, Bill Haslam is probably right. This is fine.

For more on education politics and policy in Tennessee, follow @TNEdReport



A Lot of Words

The Murfreesboro City School Board has already expressed concern about the state’s TNReady tests and the delay in receiving results.

More recently, Board members expressed frustration with the response they received from Education Commissioner Candice McQueen.

The Murfreesboro Post reports:

“I felt like it was a lot of words for not really answering our questions,” said Board Member Jared Barrett. He referred to the response as having “excuses” and “dodging the question.”

“My first response when I read this letter was that there’s something in here that doesn’t add up,” said Board Member Phil King. “My fear is they haven’t solved the problem of getting the paper tests in our hands in a timely manner.”

King suggested moving away from using TNReady in teacher evaluations until the state can prove it can get results back to districts in a timely manner.

The Murfreesboro School Board meeting happened before the most recent round of TNReady troubles, with some students receiving incorrect scores and some teachers not having students properly counted in their TVAAS scores.

In response to those issues, House Speaker Beth Harwell has called for hearings on the issue of state testing.

Additionally, yesterday, the United Education Association of Shelby County called for TNReady scores for this year to be invalidated and for a moratorium on including TNReady scores in accountability measures until 2021.

For more on education politics and policy in Tennessee, follow @TNEdReport



Apples and Oranges

Here’s what Director of Schools Dorsey Hopson had to say amid reports that schools in his Shelby County district showed low growth according to recently released state test data:

Hopson acknowledged concerns over how the state compares results from “two very different tests which clearly are apples and oranges,” but he added that the district won’t use that as an excuse.

“Notwithstanding those questions, it’s the system upon which we’re evaluated on and judged,” he said.

State officials stand by TVAAS. They say drops in proficiency rates resulting from a harder test have no impact on the ability of teachers, schools and districts to earn strong TVAAS scores, since all students are experiencing the same change.

That’s all well and good, except when the system upon which you are evaluated is seriously flawed, it seems there’s an obligation to speak out and fight back.

Two years ago, ahead of what should have been the first year of TNReady, I wrote about the challenges of creating valid TVAAS scores while transitioning to a new test. TNReady was not just a different test; it was (is) a different type of test than the previous TCAP. For example, it included constructed-response questions instead of simply multiple-choice bubble-in questions.

Here’s what I wrote:

Here’s the problem: There is no statistically valid way to predict expected growth on a new test based on the historic results of TCAP. First, the new test has (supposedly) not been fully designed. Second, the test is in a different format. It’s both computer-based and it contains constructed-response questions. That is, students must write out answers and/or demonstrate their work.

Since Tennessee has never had a test like this, it’s impossible to predict growth at all. Not even with 10% confidence. Not with any confidence. It is the textbook definition of comparing apples to oranges.

Here’s a statement from the academic article I cited to support this claim. Specifically, here’s what Lockwood and McCaffrey (2007) had to say in the Journal of Educational Measurement:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers.

You get different value-added results depending on the type of test you use. That is, you can’t just say this is a new test but we’ll compare peer groups from the old test and see what happens. Plus, TNReady presents the added challenge of not having been fully administered last year, so you’re now looking at data from two years ago and extrapolating to this year’s results.

Of course, the company that is paid millions to crunch the TVAAS numbers says this transition presents no problem at all. Here’s what their technical document has to say about the matter:

In 2015-16, Tennessee implemented new End-of-Course (EOC) assessments in math and English/language arts. Redesigned assessments in Math and English/language arts were also implemented in grades 3-8 during the 2016-17 school year. Changes in testing regimes occur at regular intervals within any state, and these changes need not disrupt the continuity and use of value-added reporting by educators and policymakers. Based on twenty years of experience with providing value-added and growth reporting to Tennessee educators, EVAAS has developed several ways to accommodate changes in testing regimes.

Prior to any value-added analyses with new tests, EVAAS verifies that the test’s scaling properties are suitable for such reporting. In addition to the criteria listed above, EVAAS verifies that the new test is related to the old test to ensure that the comparison from one year to the next is statistically reliable. Perfect correlation is not required, but there should be a strong relationship between the new test and old test. For example, a new Algebra I exam should be correlated to previous math scores in grades seven and eight and to a lesser extent other grades and subjects such as English/language arts and science. Once suitability of any new assessment has been confirmed, it is possible to use both the historical testing data and the new testing data to avoid any breaks or delays in value-added reporting.

There are a couple of problems with this. First, there was NO complete administration of a new testing regime in 2015-16. It didn’t happen.

Second, EVAAS doesn’t get paid if there’s not a way to generate these “growth scores,” so it is in their interest to find some justification for comparing the two very different tests.

Third, researchers who study value-added modeling are highly skeptical of the reliability of comparisons between different types of tests when it comes to generating value-added scores. I noted Lockwood and McCaffrey (2007) above. Here are some more:

John Papay (2011) did a similar study using three different reading tests, with similar results. He stated his conclusion as follows:

[T]he correlations between teacher value-added estimates derived from three separate reading tests — the state test, SRI [Scholastic Reading Inventory], and SAT [Stanford Achievement Test] — range from 0.15 to 0.58 across a wide range of model specifications. Although these correlations are moderately high, these assessments produce substantially different answers about individual teacher performance and do not rank individual teachers consistently. Even using the same test but varying the timing of the baseline and outcome measure introduces a great deal of instability to teacher rankings.

Two points worth noting here: First, different tests yield different value-added scores. Second, even using the same test but varying the timing can create instability in growth measures.

Then, there’s data from the Measures of Effective Teaching (MET) Project, which included data from Memphis. In terms of reliability when using value-added among different types of tests, here’s what MET reported:

Once more, the MET study offered corroborating evidence. The correlation between value-added scores based on two different mathematics tests given to the same students the same year was only .38. For 2 different reading tests, the correlation was .22 (the MET Project, 2010, pp. 23, 25).

Despite the claims of EVAAS, the academic research raises significant concerns about extrapolating results from different types of tests. In short, when you move to a different test, you get different value-added results. As I noted in 2015:

If you measure different skills, you get different results. That decreases (or eliminates) the reliability of those results. TNReady is measuring different skills in a different format than TCAP. It’s BOTH a different type of test AND a test on different standards. Any value-added comparison between the two tests is statistically suspect, at best. In the first year, such a comparison is invalid and unreliable. As more years of data become available, it may be possible to make some correlation between past TCAP results and TNReady scores.

Or, if the state is determined to use growth scores (and wants to use them with accuracy), they will wait several years and build completely new growth models based on TNReady alone. At least three years of data would be needed in order to build such a model.
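What do correlations in the 0.15-0.58 range mean in practice? Here is a small simulation, using made-up data rather than actual TVAAS estimates, that shows how teacher rankings reshuffle when estimates from two tests correlate at roughly 0.4:

```python
import random

# Illustration of the Papay/MET point above: when value-added estimates from
# two tests correlate only moderately, individual teacher rankings reshuffle.
# Simulated numbers -- not actual TVAAS or TNReady estimates.

random.seed(1)
N = 100  # hypothetical teachers

# Each teacher has a "true" effect; each test adds its own independent noise.
# The noise is scaled so the two sets of estimates correlate at roughly 0.4,
# inside the 0.15-0.58 range the studies report.
true_effect = [random.gauss(0, 1) for _ in range(N)]
test_a = [t + random.gauss(0, 1.2) for t in true_effect]
test_b = [t + random.gauss(0, 1.2) for t in true_effect]

def ranks(xs):
    """Rank positions (0 = lowest estimate)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

ra, rb = ranks(test_a), ranks(test_b)

# How many teachers land in the bottom quintile on one test but not the other?
bottom_a = {i for i in range(N) if ra[i] < N // 5}
bottom_b = {i for i in range(N) if rb[i] < N // 5}
print(f"Bottom-quintile overlap: {len(bottom_a & bottom_b)} of {N // 5} teachers")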

Dorsey Hopson and other Directors of Schools should be pushing back aggressively. Educators should be outraged. After all, this unreliable data will be used as a portion of their teacher evaluations this year. Schools are being rated on a 1-5 scale based on a growth model grounded in suspect methods.

How much is this apple like last year’s orange? How much will this apple ever be like last year’s orange?

If we’re determined to use value-added modeling to measure school-wide growth or district performance, we should at least be determined to do it in a way that ensures valid, reliable results.

For more on education politics and policy in Tennessee, follow @TNEdReport



Mike Stein on the Teachers’ Bill of Rights

Coffee County teacher Mike Stein offers his thoughts on the Teachers’ Bill of Rights (SB14/HB1074), sponsored in the General Assembly by Senator Mark Green of Clarksville and Representative Jay Reedy of Erin.

Here’s some of what he has to say:

In my view, the most impactful elements of the Teachers’ Bill of Rights are the last four items. Teachers have been saying for decades that we shouldn’t be expected to purchase our own school supplies. No other profession does that. Additionally, it makes much-needed changes to the evaluation system. It is difficult, if not impossible, to argue against the notion that we should be evaluated by other educators with the same expertise. While good teaching is good teaching, there are content-specific strategies that only experts in that subject would truly be able to appreciate fully. Both the Coffee County Education Association and the Tennessee Education Association support this bill.

And here are those four items he references:

This bill further provides that an educator is not: (1) Required to spend the educator’s personal money to appropriately equip a classroom; (2) Evaluated by professionals, under the teacher evaluation advisory committee, without the same subject matter expertise as the educator; (3) Evaluated based on the performance of students whom the educator has never taught; or (4) Relocated to a different school based solely on test scores from state mandated assessments.

The legislation would change the teacher evaluation system by effectively eliminating TVAAS scores from the evaluations of teachers in non-tested subjects — those scores may be replaced by portfolios, an idea the state has rolled out but not funded. Additionally, identifying subject matter specific evaluators could prove difficult, but would likely provide stronger, more relevant evaluations.

Currently, teachers aren’t required to spend their own money on classrooms, but many teachers do because schools too often lack the resources to meet the needs of students. It’s good to see Senator Green and Rep. Reedy drawing attention to the important issue of classroom resources.

For more on education politics and policy in Tennessee, follow @TNEdReport



Knox County Takes a Stand

Last night, the Knox County School Board voted 6-3 in favor of a resolution calling on the General Assembly and State Board of Education to waive the use of TCAP/TNReady data in student grades and teacher evaluations this year.

The move comes as the state prepares to administer the tests this year with a new vendor following last year’s TNReady disaster. The lack of a complete testing cycle last year plus the addition of a new vendor means this year is the first year of the new test.

The Board passed the resolution in spite of Governor Haslam warning against taking such a step.

In his warning, Haslam said:

“The results we’ve seen are not by accident in Tennessee, and I think you have to be really careful about doing anything that could cause that to back up,” Haslam said.

He added:

Haslam attributed that progress to three things, including tying standardized tests to teacher evaluations.

“It’s about raising our standards and expectations, it’s about having year-end assessments that match those standards and then I think it’s about having assessments that are part of teachers’ evaluations,” Haslam said. “I think that you have to have all of those for a recipe for success.”

Haslam can present no evidence for his claim about the use of student assessment in teacher evaluation. In fact, it’s worth noting that prior to 2008, Tennessee students achieved at a high level according to what were then the state standards. While the standards themselves were determined to need improvement, the point is teachers were helping students hit the designated mark.

Teachers were moving students forward at this time without evaluations tied to student test results. Policymakers set a mark for student performance, teachers worked to hit that mark and succeeded. Standards were raised in 2008, and since then, Tennessee has seen detectable growth in overall results, including some exciting news when NAEP results are released.

To suggest that a year without the use of TVAAS scores in teacher evaluations will cause a setback is to insult Tennessee’s teachers. As if they’ll just relax and not teach as hard.

Another argument raised against the resolution is that it will somehow absolve teachers and students of accountability.

Joe Sullivan reports in the Knoxville Mercury:

In an email to board members, [Interim Director of Schools Buzz] Thomas asserted that, “We need a good standardized test each year to tell us how we are doing compared to others across the state and the nation. We will achieve greatness not by shying away from this accountability but by embracing it.” And he fretted that, “This resolution puts that at risk. In short, it will divide us. Once again we could find ourselves in two disputing camps. The pro-achievement folks on the one side and the pro-teacher folks on the other.”

Right now, we don’t know if we have a good standardized test. Taking a year to get it right is important, especially in light of the frustrations of last year’s TNReady experience.

Of course, there’s no need for pro-achievement and pro-teacher folks to be divided into two camps, either. Tennessee can have a good, solid test that is an accurate measure of student achievement and also treat teachers fairly in the evaluation process.

To be clear, teachers aren’t asking for a waiver from all evaluation. They are asking for a fair, transparent evaluation system. TVAAS has long been criticized as neither. Even under the best of circumstances, TVAAS provides a minimal level of useful information about teacher performance.

Now, we’re shifting to a new test. That shift alone makes it impossible to achieve a valid value-added score. In fact, researchers in the Journal of Educational Measurement have said:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.

These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005). You get different results depending on the type of question you’re measuring.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured.

Changing to a new type of test creates value-added uncertainty. That means results attributed to teachers based on a comparison of this year’s tests and the old tests will not yield valid results.
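The Lockwood and McCaffrey finding quoted above can be made concrete with a toy simulation: give each teacher a value-added estimate from two different achievement measures, then split the variation into an across-teacher component and a within-teacher (across-measure) component. All numbers below are simulated, and the noise level is an assumption chosen so the measure-driven component dominates, mimicking the pattern the study reports.

```python
import random
from statistics import mean, variance

# A toy version of the Lockwood and McCaffrey comparison: each teacher gets a
# value-added estimate from two different achievement measures, and we compare
# variation across teachers to variation within teachers across measures.
# Simulated data; the noise level is an illustrative assumption.

random.seed(7)
N = 500  # hypothetical teachers

true_effect = [random.gauss(0, 0.5) for _ in range(N)]       # across-teacher spread
estimates = [[t + random.gauss(0, 0.8) for _ in range(2)]    # two measures each
             for t in true_effect]

teacher_means = [mean(e) for e in estimates]

# Within-teacher variance: how much a teacher's estimate moves when only the
# achievement measure changes.
within = mean(variance(e) for e in estimates)

# Across-teacher variance, corrected for the noise that leaks into the means
# (standard one-way random-effects decomposition with two measures per teacher).
across = variance(teacher_means) - within / 2

print(f"Across-teacher component:          {across:.3f}")
print(f"Within-teacher (across measures):  {within:.3f}")
```

When the measure you happen to use moves a teacher’s estimate more than switching teachers does, the estimate is telling you at least as much about the test as about the teacher.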

While insisting that districts use TVAAS in teacher evaluations this year, the state is also admitting it’s not quite sure how that will work.

From Sullivan’s story:

When asked how these determinations will be made, a spokesperson for the state Department of Education acknowledges that a different methodology will have to be employed and says that, “we are still working with various statisticians and experts to determine the exact methodology we will use this year.”

Why not take at least a year, be sure there’s a test that works, and then build a model based on that? What harm would come from giving teachers and students a year with a test that’s just a test? Moreover, the best education researchers have already warned that testing transitions create value-added bumps. Why not avoid the bumps and work to create an evaluation system that is fair and transparent?

Knox County has taken a stand. We’ll soon see if others follow suit. And if the state is listening.

For more on education politics and policy in Tennessee, follow @TNEdReport



Bias Confirmed

Last year, I wrote about a study of Tennessee TVAAS scores conducted by Jessica Holloway-Libell. She examined 10 Tennessee school districts and their TVAAS score distribution. Her findings suggest that ELA teachers are less likely than Math teachers to receive positive TVAAS scores, and that middle school teachers generally, and middle school ELA teachers in particular, are more likely to receive lower TVAAS scores.

The findings, based on a sampling of districts, suggest one of two things:

1) Tennessee’s ELA teachers are NOT as effective as Tennessee’s Math teachers, and middle school teachers are less effective than high school teachers,

OR

2) TVAAS scores are biased against ELA teachers (or in favor of Math teachers) due to the nature of the subjects being tested.

The second option actually has support from data analysis, as I indicated at the time and repeat here:

Holloway-Libell’s findings are consistent with those of Lockwood and McCaffrey (2007) published in the Journal of Educational Measurement:

The researchers tested various VAM models and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured.

That is, it’s totally consistent with VAM to have different estimates for math and ELA teachers, for example. Math questions are often asked in a different manner than ELA questions and the assessment is covering different subject matter.

Now, there’s even more evidence to suggest that TVAAS scores vary based on subject matter and grade level – which would minimize their ability to provide meaningful information about teacher effectiveness.

A recently released study about effective teaching in Tennessee includes the following information:

The study used TVAAS scores alone to determine a student’s access to “effective teaching.” A teacher receiving a TVAAS score of a 4 or 5 was determined to be “highly effective” for the purposes of the study. The findings indicate that Math teachers are more likely to be rated effective by TVAAS than ELA teachers and that ELA teachers in grades 4-8 (mostly middle school grades) were the least likely to be rated effective. These findings offer support for the similar findings made by Holloway-Libell in a sample of districts. They are particularly noteworthy because they are more comprehensive, including most districts in the state.

Here’s a breakdown of the findings, showing the percentage of teachers rated effective and the number of districts used to determine each average:

4-8 Math     47.5% effective     126 districts

HS Math      38.9% effective      94 districts

4-8 ELA      24.2% effective     131 districts

HS ELA       31.1% effective     100 districts

So, TVAAS scores are more likely to result in math teachers being rated effective and middle school ELA teachers are the least likely to receive effective ratings.

Again, the question is: Are Tennessee’s ELA teachers really worse than our Math teachers? And, are middle school ELA teachers the worst teachers in Tennessee?

Alternatively, one might suppose that TVAAS, as data from other value-added models suggests, is susceptible to subject matter bias, and to a lesser extent, grade level bias.

That is, the data generated by TVAAS is not a reliable predictor of teacher performance.

For more on education politics and policy in Tennessee, follow @TNEdReport


Not Yet Ready for Teacher Evaluation?

Last night, the Knox County Board of Education passed a resolution asking the state to not count this year’s new TNReady test in teacher evaluation.

Board members cited the grace period the state is granting to students as one reason for the request. While standardized test scores count in student grades, the state has granted a waiver of that requirement in the first year of the new test.

However, no such waiver was granted for teachers, who are evaluated using student test scores and a metric known as value-added modeling that purports to reflect student growth.

Instead, the Department of Education proposed, and the legislature supported, a plan to phase in TNReady scores in teacher evaluations. This plan presents problems in terms of statistical validity.

Additionally, the American Educational Research Association released a statement recently cautioning states against using value-added models in high-stakes decisions involving teachers:

In a statement released today, the American Educational Research Association (AERA) advises those using or considering use of value-added models (VAM) about the scientific and technical limitations of these measures for evaluating educators and programs that prepare teachers. The statement, approved by AERA Council, cautions against the use of VAM for high-stakes decisions regarding educators.

So, regardless of the phase-in of TNReady, value-added models for evaluating teachers are problematic. When you add the transition to a new test to the mix, you only compound the existing problems, making any “score” assigned to a teacher even more unreliable.

Tullahoma City Schools Superintendent Dan Lawson spoke to the challenges with TVAAS recently in a letter he released in which he noted:

Our teachers are tasked with a tremendous responsibility and our principals who provide direct supervision assign teachers to areas where they are most needed. The excessive reliance on production of a “teacher number” produces stress, a lack of confidence and a drive to first protect oneself rather than best educate the child.

It will be interesting to see if other school systems follow Knox County’s lead on this front. Even more interesting: Will the legislature take action and, at the least, waive TNReady scores from teacher evaluations in the first year of the new test?

A more serious, long-term concern is the use of value-added modeling in teacher evaluation and, especially, in high-stakes decisions like the granting of tenure, pay, and hiring/firing.

More on Value-Added Modeling

The Absurdity of VAM

Unreliable and Invalid

Some Inconvenient Facts About VAM

For more on education politics and policy in Tennessee, follow @TNEdReport