Seeping Scores Sour School Board

Members of the Murfreesboro City School Board are not happy with the slow pace of results coming from the state’s new TNReady test. All seven elected board members sent a letter to Commissioner of Education Candice McQueen expressing their concerns.

The Daily News Journal reports:

“However, currently those test scores seep ever-so-slowly back to their source of origin from September until January,” the letter states. “And every year, precious time is lost. We encourage you to do everything possible to get test results — all the test results — to schools in a timely manner.

“We also encourage you to try to schedule distribution of those results at one time so that months are not consumed in interpreting, explaining and responding to those results,” the letter continued.

Department of Education spokesperson Sara Gast suggested the state wants the results back sooner, too:

“We know educators, families and community members want these results so they can make key decisions and improve, and we want them to be in their hands as soon as possible,” Gast said. “We, at the department, also desire these results sooner.”

Of course, this is the same department that continues to have trouble releasing quick score data in time for schools to use it in student report cards. In fact, this marked the fourth consecutive year there’s been a problem with end-of-year data — either with its timely release or with its clear calculation.

Gast went further in distancing the department from blame, saying:

Local schools should go beyond TNReady tests in determining student placement and teacher evaluations, Gast said.

“All personnel decisions, including retaining, placing, and paying educators, are decisions that are made locally, and they are not required to be based on TNReady results,” Gast said. “We hope that local leaders use multiple sources of feedback in making those determinations, not just one source, but local officials have discretion on their processes for those decisions.”

Here’s the problem with that statement: This is THE test. It is the test that determines a school’s achievement and growth score. It is THE test used to calculate an (albeit invalid) TVAAS score for teachers. It is THE test used in student report cards (when the quick scores come back on time). This is THE test.

Teachers are being asked RIGHT NOW to make choices about the achievement measure they will be evaluated on for their 2017-18 TEAM evaluation. One choice: THE test. The TNReady test. But there aren’t results available to allow teachers and principals to make informed choices.

One possible solution to the concern expressed by the Murfreesboro School Board is to press the pause button. That is, get the testing right before using it for any type of accountability measure. Build some data in order to establish the validity of the growth scores. Administer the test, get the results back, and use the time to work out any challenges. Set a goal of 2019 to have full use of TNReady results.

Another solution is to move to a different set of assessments. Students in Tennessee spend a lot of time taking tests. Perhaps a set of assessments that was less time-consuming could allow for both more instructional time and more useful feedback. I’ve heard some educators suggest the ACT suite of assessments could be adapted in a way that’s relevant to Tennessee classrooms.

It will be interesting to see if more school districts challenge the Department of Education on the current testing situation.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

Apples and Oranges

Here’s what Director of Schools Dorsey Hopson had to say amid reports that schools in his Shelby County district showed low growth according to recently released state test data:

Hopson acknowledged concerns over how the state compares results from “two very different tests which clearly are apples and oranges,” but he added that the district won’t use that as an excuse.

“Notwithstanding those questions, it’s the system upon which we’re evaluated on and judged,” he said.

State officials stand by TVAAS. They say drops in proficiency rates resulting from a harder test have no impact on the ability of teachers, schools and districts to earn strong TVAAS scores, since all students are experiencing the same change.

That’s all well and good, except when the system upon which you are evaluated is seriously flawed, it seems there’s an obligation to speak out and fight back.

Two years ago, ahead of what should have been the first year of TNReady, I wrote about the challenges of creating valid TVAAS scores while transitioning to a new test. TNReady was not just a different test; it was (and is) a different type of test from the previous TCAP. For example, it included constructed-response questions instead of only multiple-choice bubble-in questions.

Here’s what I wrote:

Here’s the problem: There is no statistically valid way to predict expected growth on a new test based on the historic results of TCAP. First, the new test has (supposedly) not been fully designed. Second, the test is in a different format. It’s both computer-based and contains constructed-response questions. That is, students must write out answers and/or demonstrate their work.

Since Tennessee has never had a test like this, it’s impossible to predict growth at all. Not even with 10% confidence. Not with any confidence. It is the textbook definition of comparing apples to oranges.

Here’s what Lockwood and McCaffrey (2007), whom I cited to support this claim, had to say in the Journal of Educational Measurement:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers.

You get different value-added results depending on the type of test you use. That is, you can’t simply declare this a new test, compare peer groups from the old test, and see what happens. Plus, TNReady presents the added challenge of not having been fully administered last year, so you’re now looking at data from two years ago and extrapolating to this year’s results.

Of course, the company that is paid millions to crunch the TVAAS numbers says this transition presents no problem at all. Here’s what its technical document has to say about the matter:

In 2015-16, Tennessee implemented new End-of-Course (EOC) assessments in math and English/language arts. Redesigned assessments in Math and English/language arts were also implemented in grades 3-8 during the 2016-17 school year. Changes in testing regimes occur at regular intervals within any state, and these changes need not disrupt the continuity and use of value-added reporting by educators and policymakers. Based on twenty years of experience with providing value-added and growth reporting to Tennessee educators, EVAAS has developed several ways to accommodate changes in testing regimes.

Prior to any value-added analyses with new tests, EVAAS verifies that the test’s scaling properties are suitable for such reporting. In addition to the criteria listed above, EVAAS verifies that the new test is related to the old test to ensure that the comparison from one year to the next is statistically reliable. Perfect correlation is not required, but there should be a strong relationship between the new test and old test. For example, a new Algebra I exam should be correlated to previous math scores in grades seven and eight and to a lesser extent other grades and subjects such as English/language arts and science. Once suitability of any new assessment has been confirmed, it is possible to use both the historical testing data and the new testing data to avoid any breaks or delays in value-added reporting.

A couple of problems with this. First, there was NO complete administration of a new testing regime in 2015-16. It didn’t happen.

Second, EVAAS doesn’t get paid if there’s not a way to generate these “growth scores,” so it is in the company’s interest to find some justification for comparing the two very different tests.

Third, researchers who study value-added modeling are highly skeptical of the reliability of comparisons between different types of tests when it comes to generating value-added scores. I noted Lockwood and McCaffrey (2007) above. Here are some more:

John Papay (2011) did a similar study using three different reading tests, with similar results. He stated his conclusion as follows:

[T]he correlations between teacher value-added estimates derived from three separate reading tests — the state test, SRI [Scholastic Reading Inventory], and SAT [Stanford Achievement Test] — range from 0.15 to 0.58 across a wide range of model specifications. Although these correlations are moderately high, these assessments produce substantially different answers about individual teacher performance and do not rank individual teachers consistently. Even using the same test but varying the timing of the baseline and outcome measure introduces a great deal of instability to teacher rankings.

Two points worth noting here: First, different tests yield different value-added scores. Second, even using the same test but varying the timing can create instability in growth measures.

Then, there’s data from the Measures of Effective Teaching (MET) Project, which included data from Memphis. In terms of reliability when using value-added among different types of tests, here’s what MET reported:

Once more, the MET study offered corroborating evidence. The correlation between value-added scores based on two different mathematics tests given to the same students the same year was only .38. For 2 different reading tests, the correlation was .22 (the MET Project, 2010, pp. 23, 25).
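
To make the practical meaning of a number like .38 concrete, here’s a minimal simulation sketch. It’s my own illustration, not MET data: I assume 10,000 teachers and treat each teacher’s value-added estimates from the two tests as draws from a bivariate normal distribution with that correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers = 10_000  # assumed count, purely for illustration

# Draw paired value-added estimates for the same teachers, with a
# cross-test correlation of about .38 (the MET figure for two math tests).
r = 0.38
cov = [[1.0, r], [r, 1.0]]
est_a, est_b = rng.multivariate_normal([0.0, 0.0], cov, size=n_teachers).T

# How often does a "top 20%" teacher on test A stay top 20% on test B?
top_a = est_a >= np.quantile(est_a, 0.8)
top_b = est_b >= np.quantile(est_b, 0.8)
print(f"stay in top quintile: {(top_a & top_b).sum() / top_a.sum():.0%}")

# How often does a top-quintile teacher on test A land in the bottom 40%
# on test B, flipping from "highly effective" to below average?
bottom_b = est_b <= np.quantile(est_b, 0.4)
print(f"fall below average:   {(top_a & bottom_b).sum() / top_a.sum():.0%}")
```

Under those assumptions, most teachers who land in the top quintile on one test don’t stay there on the other, and a meaningful share fall below average. That’s the ranking instability Papay describes.
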
Despite the claims of EVAAS, the academic research raises significant concerns about extrapolating results from different types of tests. In short, when you move to a different test, you get different value-added results. As I noted in 2015:

If you measure different skills, you get different results. That decreases (or eliminates) the reliability of those results. TNReady is measuring different skills in a different format than TCAP. It’s BOTH a different type of test AND a test on different standards. Any value-added comparison between the two tests is statistically suspect, at best. In the first year, such a comparison is invalid and unreliable. As more years of data become available, it may be possible to make some correlation between past TCAP results and TNReady scores.

Or, if the state is determined to use growth scores (and wants to use them with accuracy), they will wait several years and build completely new growth models based on TNReady alone. At least three years of data would be needed in order to build such a model.

Dorsey Hopson and other Directors of Schools should be pushing back aggressively. Educators should be outraged. After all, this unreliable data will be used as a portion of their teacher evaluations this year. Schools are being rated on a 1-5 scale based on a growth model grounded in suspect methods.

How much is this apple like last year’s orange? How much will this apple ever be like last year’s orange?

If we’re determined to use value-added modeling to measure school-wide growth or district performance, we should at least be determined to do it in a way that ensures valid, reliable results.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

It Doesn’t Matter Except When It Does

This year’s TNReady quick score setback means some districts will use the results in student report cards and some won’t. Of course, that’s nobody’s fault. 

One interesting note out of all of this came as Commissioner McQueen noted that quick scores aren’t what really matters anyway. Chalkbeat reports:

The commissioner emphasized that the data that matters most is not the preliminary data but the final score reports, which are scheduled for release in July for high schools and the fall for grades 3-8. Those scores are factored into teachers’ evaluations and are also used to measure the effectiveness of schools and districts.

“Not until you get the score report will you have the full context of a student’s performance level and strengths and weaknesses in relation to the standards,” she said.

The early data matters to districts, though, since Tennessee has tied the scores to student grades since 2011.

First, tying the quick scores to student grades is problematic. Assuming TNReady is a good, reliable test, we’d want the best results to be used in any grade calculation. Using pencil and paper this year makes that impossible. Even when we switch to a test fully administered online, it may not be possible to get the full scores back in time to use those in student grades.

Shifting to a model that uses TNReady to inform and diagnose rather than evaluate students and teachers could help address this issue. Shifting further to a project-based assessment model could actually help students while also serving as a more accurate indicator of whether they have met the standards.

Next, the story notes that teachers will be evaluated based on the scores. This will be done via TVAAS — the state’s value-added modeling system. Even as more states move away from value-added models in teacher evaluation, Tennessee continues to insist on using this flawed model.

Again, let’s assume TNReady is an amazing test that truly measures student mastery of standards. It’s still NOT designed for the purpose of evaluating teacher performance. Further, this is the first year the test has been administered. That means it’s simply not possible to generate valid data on teacher performance from this year’s results. You can’t just take this year’s test (TNReady) and compare it to the TCAP from two years ago. They are different tests designed to measure different standards in a different way. You know, the old apples and oranges thing.

One teacher had this to say about the situation:

“There’s so much time and stress on students, and here again it’s not ready,” said Tikeila Rucker, a Memphis teacher who is president of the United Education Association of Shelby County.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

4 Bills Teachers Need To Know About

While the debate around vouchers is loud and needed, we must not forget about the other bills making their way through the legislative process. Here are four bills teachers need to know about, each of which will change how teacher effectiveness and preparation are measured.

All four bills look likely to pass and become law. Spread the word about these bills so teachers will have the most up-to-date information.

SB114/HB695 by Senator Bo Watson & Rep. Ryan Williams

There is a consensus that we need to improve the preparation of future teachers. Future teachers need the most up-to-date information from faculty who still have connections to the classroom.

The amended version of the bill requires education preparation faculty, including education deans, to have direct personal involvement in a school annually. The state summarizes that bill as follows:

Requires full-time educator preparation program faculty members, including academic deans, to have direct personal involvement in public schools or local education agencies (LEAs) annually. Requires faculty involvement to include professional learning targeted to pre-K through grade 12 teachers; learning focused on LEA specific initiatives; direct instruction to pre-K through grade 12 students; district-level partnership; or observation of pre-K through grade 12 teachers.

The bill has passed the Senate and is waiting to be taken up by the House Finance Committee this week.

SB575/HB626 by Senator Dolores Gresham & Rep. Sabi Kumar

Right now, you can log on to the Teacher Prep Report Card to find out how well teacher preparation programs are preparing teachers for the classroom. This bill will add teacher observation data to the mix.

The bill requires the Department of Education to give all teacher training programs approved by the State Board of Education access to annual evaluation data for the teachers and principals graduating from those programs, for a minimum of five years following program completion.

It’s not clear if the public will be able to see the evaluation data from the different preparation programs. Either way, I hope the programs will use the data to improve.

The bill has passed the Senate and is waiting for the House to take it up this week.

SB1196/HB309 by Senator Mark Norris and Rep. David Hawk

This bill deals with the assessment data used in overall teacher evaluations. The bill makes permanent the flexibility to weight the most recent year of TVAAS student growth more heavily if doing so leads to a higher evaluation score for the teacher. I’ve heard that some superintendents like this bill because it could be used to reward a teacher for large one-year growth. The three-year growth option will still allow teachers the flexibility to change schools or grade levels, or to move to support a higher-need population.

And here’s the state’s summary:

Requires the student growth portion of teacher evaluations to account for 10 percent of the overall evaluation criteria in FY16-17 and 20 percent in FY17-18 and each year thereafter. Requires that the most recent year’s student growth evaluation composite account for 35 percent of growth data in a teacher’s evaluation, if such use results in a higher evaluation score. Authorizes the use of educational progress and evaluation data for research purposes at postsecondary institutions. Requires Tennessee Comprehensive Assessment Program (TCAP) subject-area scores to make up the following percentages of elementary and middle school students’ final spring semester grades in grades 3-8: 10 percent in FY16-17; 15 percent in FY17-18; and 15 to 25 percent in FY18-19 and subsequent years.
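
To see how the 35 percent provision might play out for a single teacher, here’s a hypothetical calculation. The scores are invented, and the simple averaging of years is my assumption; the state’s actual composite formula may differ.

```python
# Hypothetical TVAAS growth composites (1-5 scale) for one teacher.
# These values are invented purely for illustration.
growth_by_year = {"2014-15": 2.0, "2015-16": 3.0, "2016-17": 5.0}

# A simple average stands in for the multi-year composite here;
# the state's actual TVAAS composite formula may weight years differently.
multi_year = sum(growth_by_year.values()) / len(growth_by_year)

# The bill's option: count the most recent year's composite as 35 percent
# of the growth data, but only if that yields a higher score.
most_recent = growth_by_year["2016-17"]
weighted = 0.35 * most_recent + 0.65 * multi_year

growth_score = max(multi_year, weighted)
print(f"multi-year composite:   {multi_year:.2f}")
print(f"35% recent-year option: {weighted:.2f}")
print(f"growth score used:      {growth_score:.2f}")
```

In this example, the strong most-recent year lifts the growth score from about 3.33 to about 3.92, exactly the kind of reward for a large one-year gain described above.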

This bill has passed the House and is waiting to be passed in the Senate.

SB250/HB67 by Senator Jim Tracy & Rep. Eddie Smith

This bill is trying to solve the problem that arises with teachers who teach in non-tested subjects. The state summary is pretty clear in this case:

Requires local education agencies (LEAs), by the 2018-2019 academic school year, to adopt at least one appropriate alternative growth model approved by the State Board of Education in order to provide individual growth scores to teachers in non-tested grades and subjects. Requires the Department of Education (DOE) to develop valid and reliable alternative student growth models for non-tested grades and subjects currently without such models.

What do you think?

Teachers, what are your thoughts on these four bills? Let us know in the comments.

For more on education politics and policy in Tennessee, follow @TNEdReport.


 

 

It May Be Ready, But is it Valid?

In today’s edition of her Educator Update, Commissioner Candice McQueen talks about pending legislation addressing teacher evaluation and TNReady.

Here’s what McQueen has to say about the issue:

As we continue to support students and educators in the transition to TNReady, the department has proposed legislation (HB 309) that lessens the impact of state test results on students’ grades and teachers’ evaluations this year.

In 2015, the Tennessee Teaching Evaluation Enhancement Act created a phase-in of TNReady in evaluation to acknowledge the state’s move to a new assessment that is fully aligned to Tennessee state standards with new types of test questions. Under the current law, TNReady data would be weighted at 20 percent for the 2016-17 year.

However, in the spirit of the original bill, the department’s new legislation resets the phase-in of growth scores from TNReady assessments as was originally proposed in the Tennessee Teaching Evaluation Enhancement Act. Additionally, moving forward, the most recent year’s growth score will be used for a teacher’s entire growth component if such use results in a higher evaluation score for the teacher.

We will update you as this bill moves through the legislative process, and if signed into law, we will share detailed guidance that includes the specific options available for educators this year. As we announced last year, if a teacher’s 2015-16 individual growth data ever negatively impacts his or her overall evaluation, it will be excluded. Additionally, as noted above, teachers will be able to use 2016-17 growth data as 35 percent of their evaluation if it results in a higher overall level of effectiveness.

And here’s a handy graphic that describes the change:

[Graphic: TNReady phase-in of growth scores in teacher evaluations]

Of course, there’s a problem with all of this: There’s not going to be valid data to use for TVAAS. Not this year. It’s bad enough that the state is transitioning from one type of test to another. That alone would call into question the validity of any comparison used to generate a value-added score. Now, there’s a gap in the data. As you might recall, there wasn’t a complete TNReady test last year. So, to generate a TVAAS score, the state will have to compare 2014-15 data from the old TCAP tests to 2016-17 data from what we hope is a sound administration of TNReady.

We really need at least three years of data from the new test to make anything approaching a valid comparison. Or, we should start over building a data-set with this year as the baseline. Better yet, we could go the way of Hawaii and Oklahoma and just scrap the use of value-added scores altogether.

Even in the best of scenarios — a smooth transition from TCAP to TNReady — data validity was going to be a challenge.

As I noted when the issue of testing transition first came up:

Here’s what Lockwood and McCaffrey (2007) had to say in the Journal of Educational Measurement:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.

These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005).

You get different results depending on the type of questions used to measure achievement.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured.

And they concluded:

Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests.

If you measure different skills, you get different results. That decreases (or eliminates) the reliability of those results. TNReady is measuring different skills in a different format than TCAP. It’s BOTH a different type of test AND a test on different standards. Any value-added comparison between the two tests is statistically suspect, at best. In the first year, such a comparison is invalid and unreliable.

So, we’re transitioning from TCAP to TNReady AND we have a gap in years of data. That’s especially problematic — but, not problematic enough to keep the Department of Education from plowing ahead (and patting themselves on the back) with a scheme that validates a result sure to be invalid.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

Washington Co. Joins Waiver Wave

Last night, the Washington County School Board voted 6-3 in favor of a resolution asking the State of Tennessee to grant a one-year waiver from the use of TNReady scores in teacher evaluations and student grades. The resolution is similar to those passed in Nashville and Knox County and comes after the State Board of Education voted to change the way End of Course tests are counted in student grades.

The Washington County resolution comes just days before the Tennessee General Assembly returns to action (January 10th). Barring action by the State Board to grant a waiver, the only way it will happen is if lawmakers force the issue.

Similar resolutions were passed last year ahead of the TNReady testing that ultimately failed. That failure makes this year effectively the first year of the new tests, now administered by Questar.

Tune in next week and beyond to see if more school boards pass resolutions asking for a waiver or if the State Board or legislature take action.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

Test Scores Are In! How Did Our Nashville Students Do?

Today, the Tennessee Department of Education released TNReady results for individual districts. The data only show results for high schools because elementary and middle schools did not take the full assessment last school year.

For those of you who just want the gist of it, Nashville’s public high schools are struggling to get kids to proficiency, and they’re particularly struggling with math.

Let’s dig a little deeper, using some screenshots from the state’s Report Card website.

ACT Achievement

[Screenshot: ACT achievement]

I have written previously about the district’s ACT scores. TNReady is designed to align more closely with the ACT.

Math and ELA Achievement 

[Screenshots: math and ELA achievement]

The data show that our high schools are struggling more with math than English language arts (ELA), though each section has only a small percentage of students who are scoring within the top two tiers of TNReady.

Here’s the more in-depth breakdown of the data, including individual subjects. As we see from the graph below, we have new terminology to use when discussing the data.

[Screenshot: achievement by subject, showing the new performance-level terminology]

The data clearly show that too many high school students have neither reached the “on track” level nor achieved mastery of the subjects. We have given our high schools a makeover, but has that makeover really improved student achievement? That will be hard to tell, because this is a brand-new assessment.

Low achievement among high school students is more than just a high school problem. We need more support in the lower grades to give students the skills they need to achieve in high school so that they can graduate and move on to college or a career.

Growth

[Screenshots: literacy and math growth scores]

It’s great to see that we are showing growth in literacy, but we have to do better in math.

We Have to Do Better

Our district has to do better. We have too many students not achieving at the level they should be. I hope our school board will really delve into this issue, instead of spending so much time on petty resolutions that will only hurt the district in the long run.

Turning around our district is not something that will make the newspaper tomorrow. It’s not something that you can brag about in your monthly email in a few weeks. Turning around our district takes time, resources, and a vision to help all students achieve. It means that everyone involved in the education system must work together, which can be hard for some.

It’s results like these that drive people away from Davidson County and into the suburbs and private schools. We can’t let that continue.

Let’s get to work!

For more on education politics and policy in Tennessee, follow @TNEdReport.


 

Waiver Wave

The MNPS School Board unanimously approved a resolution calling for a one-year waiver of the use of TNReady/TCAP scores in both student grades and teacher evaluation. The request follows Knox County’s passage of a similar resolution earlier this month.

Here’s what I wrote about why that was the right move:

Right now, we don’t know if we have a good standardized test. Taking a year to get it right is important, especially in light of the frustrations of last year’s TNReady experience.

Of course, there’s no need for pro-achievement and pro-teacher folks to be divided into two camps, either. Tennessee can have a good, solid test that is an accurate measure of student achievement and also treat teachers fairly in the evaluation process.

To be clear, teachers aren’t asking for a waiver from all evaluation. They are asking for a fair, transparent evaluation system. TVAAS has long been criticized as neither. Even under the best of circumstances, TVAAS provides a minimal level of useful information about teacher performance.

Now, we’re shifting to a new test. That shift alone makes it impossible to achieve a valid value-added score.

Now, two large Tennessee school districts are calling for a waiver from using test data in student grades and teacher evaluations. Will other districts follow suit? Will the General Assembly pay attention?

Here’s the text of the Nashville resolution:

WHEREAS, the Metropolitan Nashville Public Schools Board of Education is responsible for providing a local system of public education; and
WHEREAS, the State of Tennessee, through the work of the Tennessee General Assembly, the Tennessee Department of Education, the State Board of Education and local school boards, has established nationally recognized standards and measures for accountability in public education; and
WHEREAS, the rollout of the TNReady assessment in School Year 2015-2016 was a failure resulting in lost instructional time for students and undue stress for stakeholders; and
WHEREAS, due to the TNReady failure a waiver was provided for School Year 2015-2016; and
WHEREAS, a new assessment vendor, Questar, was not selected until July 6, 2016, yet high school students are set to take EOC exams from November 28-December 16; and
WHEREAS, there are documented errors on the part of Questar to administer similar assessments in New York and Mississippi; and
WHEREAS, score reports will be unavailable until Fall 2017; and
WHEREAS, Tennessee teachers will not be involved in writing test items for the assessment in School Year 2016-2017; and
WHEREAS, there is a reliance on using test items from other states, which may not align with Tennessee standards; and
WHEREAS, more than seventy percent of Metro Nashville Public School teachers do not produce individual TVAAS data; and
WHEREAS, the American Educational Research Association released a statement cautioning against the use of value added models, like TVAAS, for evaluating educators and using such data for high-stakes educational decisions;

NOW THEREFORE BE IT RESOLVED BY THE METRO NASHVILLE BOARD OF EDUCATION AS FOLLOWS:

The METRO NASHVILLE Board of Education opposes the use of TCAP data for any percentage of teacher and principal evaluations and student grades for school year 2016-2017 and urges Governor Haslam, Commissioner of Education Candice McQueen, the General Assembly and the State Board of Education to provide a one-year waiver.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

 

Knox County Takes a Stand

Last night, the Knox County School Board voted 6-3 in favor of a resolution calling on the General Assembly and State Board of Education to waive the use of TCAP/TNReady data in student grades and teacher evaluations this year.

The move comes as the state prepares to administer the tests this year with a new vendor following last year’s TNReady disaster. The lack of a complete testing cycle last year plus the addition of a new vendor means this year is the first year of the new test.

The Board passed the resolution in spite of Governor Haslam’s warning against taking such a step.

In his warning, Haslam said:

“The results we’ve seen are not by accident in Tennessee, and I think you have to be really careful about doing anything that could cause that to back up,” Haslam said.

He added:

Haslam attributed that progress to three things, including tying standardized tests to teacher evaluations.

“It’s about raising our standards and expectations, it’s about having year-end assessments that match those standards and then I think it’s about having assessments that are part of teachers’ evaluations,” Haslam said. “I think that you have to have all of those for a recipe for success.”

Haslam can present no evidence for his claim about the use of student assessment in teacher evaluation. In fact, it’s worth noting that prior to 2008, Tennessee students achieved at a high level according to what were then the state standards. While the standards themselves were determined to need improvement, the point is teachers were helping students hit the designated mark.

Teachers were moving students forward at that time without evaluations tied to student test results. Policymakers set a mark for student performance; teachers worked to hit that mark and succeeded. Standards were raised in 2008, and since then, Tennessee has seen detectable growth in overall results, including some exciting news when NAEP results were released.

To suggest that a year without the use of TVAAS scores in teacher evaluations will cause a setback is to insult Tennessee’s teachers. As if they’ll just relax and not teach as hard.

Another argument raised against the resolution is that it will somehow absolve teachers and students of accountability.

Joe Sullivan reports in the Knoxville Mercury:

In an email to board members, [Interim Director of Schools Buzz] Thomas asserted that, “We need a good standardized test each year to tell us how we are doing compared to others across the state and the nation. We will achieve greatness not by shying away from this accountability but by embracing it.” And he fretted that, “This resolution puts that at risk. In short, it will divide us. Once again we could find ourselves in two disputing camps. The pro-achievement folks on the one side and the pro-teacher folks on the other.”

Right now, we don’t know if we have a good standardized test. Taking a year to get it right is important, especially in light of the frustrations of last year’s TNReady experience.

Of course, there’s no need for pro-achievement and pro-teacher folks to be divided into two camps, either. Tennessee can have a good, solid test that is an accurate measure of student achievement and also treat teachers fairly in the evaluation process.

To be clear, teachers aren’t asking for a waiver from all evaluation. They are asking for a fair, transparent evaluation system. TVAAS has long been criticized as neither. Even under the best of circumstances, TVAAS provides a minimal level of useful information about teacher performance.

Now, we’re shifting to a new test. That shift alone makes it impossible to achieve a valid value-added score. In fact, researchers in the Journal of Educational Measurement have said:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.
These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005).

You get different results depending on the type of questions used to measure achievement.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured.

Changing to a new type of test creates value-added uncertainty. That means growth scores attributed to teachers based on a comparison of this year’s tests and the old tests will not be valid.

While insisting that districts use TVAAS in teacher evaluations this year, the state is also admitting it’s not quite sure how that will work.

From Sullivan’s story:

When asked how these determinations will be made, a spokesperson for the state Department of Education acknowledges that a different methodology will have to be employed and says that, “we are still working with various statisticians and experts to determine the exact methodology we will use this year.”

Why not take at least a year, be sure there’s a test that works, and then build a model based on that? What harm would come from giving teachers and students a year with a test that’s just a test? Moreover, the best education researchers have already warned that testing transitions create value-added bumps. Why not avoid the bumps and work to create an evaluation system that is fair and transparent?

Knox County has taken a stand. We’ll soon see if others follow suit. And if the state is listening.

For more on education politics and policy in Tennessee, follow @TNEdReport


 

 

Bias Confirmed

Last year, I wrote about a study of Tennessee TVAAS scores conducted by Jessica Holloway-Libell. She examined 10 Tennessee school districts and their TVAAS score distribution. Her findings suggest that ELA teachers are less likely than Math teachers to receive positive TVAAS scores, and that middle school teachers generally, and middle school ELA teachers in particular, are more likely to receive lower TVAAS scores.

The findings, based on a sampling of districts, suggest one of two things:

1) Tennessee’s ELA teachers are NOT as effective as Tennessee’s Math teachers and the middle school teachers are less effective than the high school teachers

OR

2) TVAAS scores are biased against ELA teachers (or in favor of Math teachers) due to the nature of the subjects being tested.

The second option actually has support from data analysis, as I indicated at the time and repeat here:

Holloway-Libell’s findings are consistent with those of Lockwood and McCaffrey (2007) published in the Journal of Educational Measurement:

The researchers tested various VAM models and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured.

That is, it’s totally consistent with VAM to have different estimates for math and ELA teachers, for example. Math questions are often asked in a different manner than ELA questions and the assessment is covering different subject matter.

Now, there’s even more evidence to suggest that TVAAS scores vary based on subject matter and grade level – which would limit their ability to provide meaningful information about teacher effectiveness.

A recently released study about effective teaching in Tennessee includes the following information:

The study used TVAAS scores alone to determine a student’s access to “effective teaching.” A teacher receiving a TVAAS score of 4 or 5 was deemed “highly effective” for the purposes of the study. The findings indicate that Math teachers are more likely to be rated effective by TVAAS than ELA teachers, and that ELA teachers in grades 4-8 (mostly middle school grades) were the least likely to be rated effective. These findings support Holloway-Libell’s similar findings from a sample of districts, and they are particularly noteworthy because they are more comprehensive, including most districts in the state.

Here’s a breakdown of the findings, showing the percentage of teachers rated effective in each group and the number of districts used to determine each average:

4-8 Math:  47.5% effective  (126 districts)

HS Math:   38.9% effective  (94 districts)

4-8 ELA:   24.2% effective  (131 districts)

HS ELA:    31.1% effective  (100 districts)

So, TVAAS scores are more likely to result in math teachers being rated effective and middle school ELA teachers are the least likely to receive effective ratings.
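
To put those gaps in plain terms, here’s a quick bit of arithmetic on the percentages above; the only assumption is treating each figure as the share of teachers rated effective in that group.

```python
# Share of teachers rated "highly effective" (TVAAS 4 or 5), taken from
# the study's district averages quoted above.
effective_rates = {
    "4-8 Math": 0.475,
    "HS Math": 0.389,
    "4-8 ELA": 0.242,
    "HS ELA": 0.311,
}

# If teaching quality were similar across subjects, these rates should be
# similar too; instead the subject-matter gap is roughly two-to-one.
ratio = effective_rates["4-8 Math"] / effective_rates["4-8 ELA"]
print(f"4-8 Math vs. 4-8 ELA: {ratio:.1f}x as likely to be rated effective")
```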

Again, the question is: Are Tennessee’s ELA teachers really worse than our Math teachers? And, are middle school ELA teachers the worst teachers in Tennessee?

Alternatively, one might suppose that TVAAS, as data from other value-added models suggests, is susceptible to subject matter bias, and to a lesser extent, grade level bias.

That is, the data generated by TVAAS is not a reliable predictor of teacher performance.

For more on education politics and policy in Tennessee, follow @TNEdReport