As Flexible as a Brick Wall

Grace Tatter reports that officials at the Tennessee Department of Education are “perplexed” by concerns over using TNReady data in this year’s teacher evaluations.

While a number of districts have passed resolutions asking for a waiver from including TVAAS scores in this year’s teacher evaluations due to the transition to TNReady, a department spokesperson said:

“Districts have complete discretion to choose how they want to factor that data,” Ball said Thursday. “They don’t have to use TNReady or growth data in hiring, firing, retention or promotion.”

As Tatter’s story notes, however, data from TNReady will still be a part of a teacher’s TVAAS score — 10%. And that score becomes a part of a teacher’s overall evaluation score — a ranking from 1-5 that purports to measure a teacher’s relative effectiveness.

Ten percent is enough to move a ranking up or down a level, and that can have significant impacts on a teacher’s career, even if they are not fired and their pay is not affected. Of course, some districts may use this year’s data for those purposes, since doing so is not prohibited under the evaluation changes passed last year.

Dan Lawson outlines some of the impact teachers face based on that final number:

The statutorily revised “new tenure” requires five years of service (probationary period) as well as an overall score of “4” or “5” for two consecutive years preceding the recommendation to the Board of Education. Last year, no social studies assessment score was provided since it was field tested, and the teacher was compelled to select a school-wide measure of growth. He chose POORLY, and his observation score of “4.38,” paired with a school-wide growth score of “2” in the selected area, produced a sum teacher score of “3,” thereby making him ineligible for tenure nomination.
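
To make Lawson's arithmetic concrete, here's a rough sketch. The 50/50 split between observation and the growth measure is an assumption for illustration only; the actual TEAM formula uses component percentages set by state policy.

```python
def composite_level(observation, growth, obs_weight=0.5, growth_weight=0.5):
    """Illustrative composite: weighted average, rounded to the 1-5 scale.

    The 50/50 weights are assumed for this sketch; the real TEAM
    formula uses percentages set by the State Board of Education.
    """
    raw = observation * obs_weight + growth * growth_weight
    return max(1, min(5, round(raw)))

# Lawson's example: a 4.38 observation score paired with a
# school-wide growth score of 2 lands at an overall 3,
# below the 4 required for tenure eligibility.
print(composite_level(4.38, 2))
```

The point of the sketch is simply that a low growth number can drag an otherwise strong observation score down a full level once the composite is rounded.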

According to TCA 49-5-503, a teacher may not be awarded tenure unless she achieves a TEAM score of 4 or 5 in two consecutive years immediately prior to being tenure eligible. That means a TVAAS score that takes a teacher from a 4 to a 3 would render her ineligible.

Further, a tenured teacher who receives a TEAM score of a 1 or 2 in two consecutive years is returned to probationary status (TCA 49-5-504). So, that tenured teacher who was a 2 last year could be impacted by a TNReady-based TVAAS score that moves a TEAM score of a 3 down to a 2.
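
The two statutory rules just described can be captured in a small sketch (a simplification; the statutes contain conditions not modeled here):

```python
def tenure_eligible(scores):
    # TCA 49-5-503, as described above: eligibility requires a TEAM
    # score of 4 or 5 in the two consecutive years immediately prior.
    return len(scores) >= 2 and all(s >= 4 for s in scores[-2:])

def returned_to_probation(scores):
    # TCA 49-5-504: a tenured teacher scoring 1 or 2 in two
    # consecutive years is returned to probationary status.
    return len(scores) >= 2 and all(s <= 2 for s in scores[-2:])

# A TVAAS-driven drop from 4 to 3 in the most recent year blocks
# eligibility; a 2 following a 2 triggers a return to probation.
print(tenure_eligible([5, 4, 3]))      # False
print(returned_to_probation([2, 2]))   # True
```

Either way, a one-level swing driven by a single year of TNReady-based TVAAS data is enough to change a teacher's statutory status.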

Districts don’t have “complete discretion” to waive state law as TNDOE spokesperson Ashley Ball seems to imply.

Further, basing any part of a teacher’s evaluation on TVAAS scores derived from TNReady creates problems with validity. Why include a number in a teacher’s evaluation that is fundamentally invalid?

Teachers want an evaluation process that is fair and transparent. There’s nothing perplexing about that.

For more on education politics and policy in Tennessee, follow @TNEdReport

Still Not Ready

The MNPS Board of Education last night passed a resolution calling on the State of Tennessee to delay the use of TVAAS scores in teacher evaluations during the first year of the new TNReady test. The resolution is similar to one passed in Knox County last month.

Here’s the MNPS version:

A RESOLUTION OF THE METROPOLITAN NASHVILLE PUBLIC SCHOOLS BOARD OF EDUCATION IN OPPOSITION TO THE USE OF TNREADY DATA FOR TEACHER EVALUATIONS FOR THE SCHOOL YEAR 2015-2016

PROPOSED BY ANNA SHEPHERD

WHEREAS, Metropolitan Nashville Public Schools (MNPS) is responsible for providing a local system of public education, and
WHEREAS, The State of Tennessee through the work of the Tennessee General Assembly, the Tennessee Department of Education, the Tennessee Board of Education, and local boards of education, has established nationally recognized standards and measures for accountability in public education, and
WHEREAS, all public school systems in Tennessee have been granted a one-time pass in the 2015-2016 school year to not integrate TNReady scores into each student’s final grades due to an anticipated delay in assessment results, and
WHEREAS, teachers with at least five years of experience are eligible for tenure only if they receive an overall evaluation score above expectations or significantly above expectations for the prior two years, and
WHEREAS, this school year is the first year that the TNReady assessment will be administered, and
WHEREAS, the TNReady assessment is not a compatible assessment with the TCAP assessment, and
WHEREAS, the TNReady assessment requires the extensive use of technology and the State of Tennessee BEP funding formula, already inadequate, does not meet these technology needs or the needs of MNPS schools as a whole, and
WHEREAS, the Tennessee General Assembly and Tennessee Board of Education have already adopted the “Tennessee Teaching Evaluation Act” to lessen the evaluation score impact of TNReady in English/language arts and math, and
WHEREAS, over 70% of MNPS teachers, counselors, librarians, instructional coaches, and others do not produce individual TVAAS data, and
WHEREAS, MNPS seeks to recruit and retain excellent teachers to serve our students.
NOW, THEREFORE, BE IT RESOLVED BY METROPOLITAN NASHVILLE PUBLIC SCHOOLS BOARD OF EDUCATION AS FOLLOWS:
MNPS Board of Education strongly urges the Tennessee General Assembly and the Tennessee Board of Education to provide a waiver from utilizing the TNReady data for the use of teacher evaluations for the school year 2015-2016 or allow districts to only use observation data from evaluations to make decisions on hiring, placement, and compensation based strictly on the 2015-2016 TNReady data, and
BE IT FURTHER RESOLVED, that the Tennessee General Assembly and the Tennessee Board of Education consider the impact of the 2015-2016 TNReady data upon future years of teacher evaluations, and
BE IT FURTHER RESOLVED, that the Tennessee General Assembly and the Tennessee Board of Education consider allowing teachers to be eligible for tenure when they have received a composite score of four (4) or five (5) for two of any of the last five years, as opposed to the prior two years only.
ADOPTED BY THE MNPS BOARD OF EDUCATION AT ITS MEETING ON TUESDAY, JANUARY 12, 2016.

The resolution includes a few interesting notes:

  • 70% of MNPS teachers don’t have individual TVAAS data
  • There’s mention of the inadequacy of the BEP formula
  • There’s a call for further review of TVAAS after this year

According to prepared remarks by MNPS teacher Amanda Kail prior to the vote, four other counties have passed similar resolutions.

For more on education politics and policy in Tennessee, follow @TNEdReport

CAPE Flies into 2016

At the first MNPS Board meeting of 2016, advocacy group CAPE will again be encouraging teachers to raise their voices and speak out. CAPE member Amanda Kail previews the remarks she plans to make this evening:

Ladies and gentlemen of the school board — My name is Amanda Kail. I am an EL teacher at Margaret Allen Middle School.
First and foremost, I would like to wish all of you a happy new year. And in that vein, I would like all of us as a district to take a moment to reflect on what we have gotten right, and how we can improve in 2016.
First of all, you are to be commended in recognizing that over-testing has become a serious problem for our schools. Countless studies from leading experts in education, as well as the groundswell of parents around the country who are opting their children out of the tests, and even demands from students, such as the White Station High School students organizing in Shelby County, point to the same conclusion — high-stakes testing has been a colossal mistake, regardless of the intentions. Many of you have made statements recognizing the need to rein in the testing as a priority. Thank you. Now let’s make 2016 the year that happens.
How can we do that? First, let’s end testing where we can. DISTRICT benchmarks take up SIGNIFICANT instructional time, and are often given so close to other tests as to be redundant. Getting rid of them would mean three fewer weeks of testing (and three more weeks of instruction).
Second, make instructional time THE FOCUS of school days again so teachers can teach and students can learn. Cap building-level testing to no more than once per semester. Remember that assessments are now given on-line, and that most schools at MNPS do not have enough computers to give these assessments in one day, meaning that a single whole-school assessment can drag on for one or two weeks in order to accommodate all students and grade levels.
Third, join Knox County, Blount County, Washington County and Anderson County schools by supporting Board Member Shepherd’s proposal to postpone using TN Ready scores on teacher evaluations this year. Tell Nashville teachers you respect our profession enough to not evaluate us on something that is so much beyond our control. Then tell the Tennessee legislature that it is time to reexamine the trust we have placed in high-stakes testing to tell us anything besides which schools are rich and which are poor.
Finally, let’s find a director of schools who truly has ALL of our schools at heart. MNPS needs someone who will ask our legislature to end high-stakes testing and who will demand full funding for our district. Someone who will spend their time getting struggling schools more resources, like the wrap-around services from the Community Achieves program, and who will implement a fair and fully-supported discipline policy grounded in restorative justice. Someone who recognizes that threatening and punishing schools that are serving students with the highest needs is not nearly as useful as finding those schools the resources they need.
We have much work to do, but if we work together, this can be the year our system truly shines. Thank you.
For more on education politics and policy in Tennessee, follow @TNEdReport

A Matter of Fairness

A coalition of education advocacy groups released an online petition today calling for a one year waiver from using student test scores in teacher evaluations in Tennessee.

Here’s the press release:

A coalition of groups supporting public education today launched an online petition asking the Tennessee General Assembly and Governor Bill Haslam to grant teachers a grace period from the use of student test scores in their evaluations in the first year of new TNReady tests. The petition tracks language adopted unanimously by the Knox County School Board, which passed a resolution last week opposing the use of student test scores in teacher evaluation for this academic year.

“The state has granted waivers so that TNReady scores aren’t required to be counted in student grades for this year,” said Lyn Hoyt, president of Tennesseans Reclaiming Educational Excellence (TREE). “If TNReady won’t count in student grades, it’s only fair that it shouldn’t count for teacher evaluation.” Hoyt noted that the transition to the new test means entering uncharted territory in terms of student scores and impact on teacher evaluation scores. As such, she said, there should be a one year or more grace period to allow for adjustment to the new testing regime.

“TNReady is different than the standardized tests we’ve had in the past,” Hoyt said. “Our students and teachers both deserve a reasonable transition period. We support the Knox County resolution and we are calling on the General Assembly to take notice and take action. Taking a thoughtful path transitioning to the new test can also build confidence and trust in the process.”

Hoyt also cited a recent policy statement by the American Educational Research Association that cautions against using value-added data in teacher evaluations and for high-stakes purposes. “Researchers who study value-added data are urging states to be cautious in how it is used to evaluate teachers,” Hoyt said. “The transition to TNReady is the perfect time to take a closer look at how test scores are used in teacher evaluations. Let’s take a year off, and give our students and teachers time to adjust. It’s a matter of fundamental fairness.”

Groups supporting the petition include:

Strong Schools (Sumner County)
Williamson Strong (Williamson County)
SPEAK (Students, Parents, Educators Across Knox County)
SOCM (Statewide Organizing for Community eMpowerment)

Middle TN CAPE (Coalition Advocating for Public Education)
Momma Bears Blog
Advocates for Change in Education (Hamilton County)
Concerned Parents of Franklin County (Franklin County)
Parents of Wilson County, TN, Schools
Friends of Oak Ridge Schools (City of Oak Ridge Schools)
TNBATs (State branch of National BATs)
TREE (Tennesseans Reclaiming Educational Excellence)
TEA (Tennessee Education Association)

For more on education politics and policy in Tennessee, follow @TNEdReport

It All Comes Down to a Number

Dan Lawson is the Director of Schools for Tullahoma City Schools. He sent this message and the American Educational Research Association press release to a group of Tennessee lawmakers.

I am the superintendent of Tullahoma City Schools, and in light of the media coverage associated with Representative Holt and a dialogue with teachers in west Tennessee, I wanted to share a few thoughts with each of you who represent teachers in other districts in Tennessee. I am thankful that each of you has a commitment to service and works to cultivate a great relationship with the teachers and communities that you represent.

While it is certainly troubling that the standards taught are disconcerting in that developmental appropriateness is in question by many, and that the actual test administration may be a considerable challenge due to hardware, software and capacity concerns, I think one of the major issues has been overlooked and is one that could easily address many concerns and restore a sense of confidence in many of our teachers.

Earlier this week the American Educational Research Association released a statement (see below) cautioning states “against the use of VAM for high-stakes decisions regarding educators.” It seems to me that no matter what counsel I provide, what resources I bring to assist and how much I share our corporate school district priorities, we boil our work and worth as a teacher down to a number. And for many that number is a product of how well they guess on what a school-wide number could be since they don’t have a tested area.

Our teachers are tasked with a tremendous responsibility and our principals who provide direct supervision assign teachers to areas where they are most needed. The excessive reliance on production of a “teacher number” produces stress, a lack of confidence and a drive to first protect oneself rather than best educate the child. As an example, one of my principals joined me in meeting with an exceptional middle school math teacher, Trent Stout. Trent expressed great concerns about the order in which the standards were presented (grade level) and advised that our math department was confident that a different order would better serve our students developmentally and better prepare them for higher level math courses offered in our community. He went on to opine that while he thought we (and he) would take a “hit” on our eighth grade assessment it would serve our students better to adopt the proposed timeline. I agreed. It is important to note that I was able to dialogue with this professional out of a sense of joint respect and trust and with knowledge that his status with our district was solely controlled by local decision makers. He is a recipient of “old tenure.” However, don’t mishear me, I am not requesting the restoration of “old tenure,” simply a modification of the newly enacted statute. I propose that a great deal of confidence in “listening and valuing” teachers could be restored by amending the tenure statute to allow local control rather than state eligibility.

I have teachers in my employ with no test data who guess well and are eligible for the tenure status, while I have others who guess poorly and are not eligible. Certainly, the final decision to award tenure is a local one, but local based on state produced data that may be flawed or based on teachers other than the potential nominee. Furthermore, if we opine that tenure does indeed have value, I am absolutely lost when I attempt to explain to new teachers that if they are not eligible for tenure I may employ them for an unlimited number of added contracts but if they are eligible based on their number and our BOE decides that they will not award tenure to anyone I am compelled to non-renew those who may be highly effective teachers. The thought that statute allows me to reemploy a level 1 teacher while compelling me to non-renew a level 5 teacher seems more than a bit ironic and ridiculous.

I greatly appreciate your service to our state and our future and would love to see an extensive dialogue associated with the adoption of Common Sense.

The American Educational Research Association Statement on Value-Added Modeling:

In a statement released today, the American Educational Research Association (AERA) advises those using or considering use of value-added models (VAM) about the scientific and technical limitations of these measures for evaluating educators and programs that prepare teachers. The statement, approved by AERA Council, cautions against the use of VAM for high-stakes decisions regarding educators.

In recent years, many states and districts have attempted to use VAM to determine the contributions of educators, or the programs in which they were trained, to student learning outcomes, as captured by standardized student tests. The AERA statement speaks to the formidable statistical and methodological issues involved in isolating either the effects of educators or teacher preparation programs from a complex set of factors that shape student performance.

“This statement draws on the leading testing, statistical, and methodological expertise in the field of education research and related sciences, and on the highest standards that guide education research and its applications in policy and practice,” said AERA Executive Director Felice J. Levine.

The statement addresses the challenges facing the validity of inferences from VAM, as well as specifies eight technical requirements that must be met for the use of VAM to be accurate, reliable, and valid. It cautions that these requirements cannot be met in most evaluative contexts.

The statement notes that, while VAM may be superior to some other models of measuring teacher impacts on student learning outcomes, “it does not mean that they are ready for use in educator or program evaluation. There are potentially serious negative consequences in the context of evaluation that can result from the use of VAM based on incomplete or flawed data, as well as from the misinterpretation or misuse of the VAM results.”

The statement also notes that there are promising alternatives to VAM currently in use in the United States that merit attention, including the use of teacher observation data and peer assistance and review models that provide formative and summative assessments of teaching and honor teachers’ due process rights.

The statement concludes: “The value of high-quality, research-based evidence cannot be over-emphasized. Ultimately, only rigorously supported inferences about the quality and effectiveness of teachers, educational leaders, and preparation programs can contribute to improved student learning.” Thus, the statement also calls for substantial investment in research on VAM and on alternative methods and models of educator and educator preparation program evaluation.

The AERA Statement includes 8 technical requirements for the use of VAM:

  1. “VAM scores must only be derived from students’ scores on assessments that meet professional standards of reliability and validity for the purpose to be served…Relevant evidence should be reported in the documentation supporting the claims and proposed uses of VAM results, including evidence that the tests used are a valid measure of growth [emphasis added] by measuring the actual subject matter being taught and the full range of student achievement represented in teachers’ classrooms” (p. 3).
  2. “VAM scores must be accompanied by separate lines of evidence of reliability and validity that support each [and every] claim and interpretative argument” (p. 3).
  3. “VAM scores must be based on multiple years of data from sufficient numbers of students…[Related,] VAM scores should always be accompanied by estimates of uncertainty to guard against [simplistic] overinterpretation[s] of [simple] differences” (p. 3).
  4. “VAM scores must only be calculated from scores on tests that are comparable over time…[In addition,] VAM scores should generally not be employed across transitions [to new, albeit different tests over time]” (AERA Council, 2015, p. 3).
  5. “VAM scores must not be calculated in grades or for subjects where there are not standardized assessments that are accompanied by evidence of their reliability and validity…When standardized assessment data are not available across all grades (K–12) and subjects (e.g., health, social studies) in a state or district, alternative measures (e.g., locally developed assessments, proxy measures, observational ratings) are often employed in those grades and subjects to implement VAM. Such alternative assessments should not be used unless they are accompanied by evidence of reliability and validity as required by the AERA, APA, and NCME Standards for Educational and Psychological Testing” (p. 3).
  6. “VAM scores must never be used alone or in isolation in educator or program evaluation systems…Other measures of practice and student outcomes should always be integrated into judgments about overall teacher effectiveness” (p. 3).
  7. “Evaluation systems using VAM must include ongoing monitoring for technical quality and validity of use…Ongoing monitoring is essential to any educator evaluation program and especially important for those incorporating indicators based on VAM that have only recently been employed widely. If authorizing bodies mandate the use of VAM, they, together with the organizations that implement and report results, are responsible for conducting the ongoing evaluation of both intended and unintended consequences. The monitoring should be of sufficient scope and extent to provide evidence to document the technical quality of the VAM application and the validity of its use within a given evaluation system” (AERA Council, 2015, p. 3).
  8. “Evaluation reports and determinations based on VAM must include statistical estimates of error associated with student growth measures and any ratings or measures derived from them…There should be transparency with respect to VAM uses and the overall evaluation systems in which they are embedded. Reporting should include the rationale and methods used to estimate error and the precision associated with different VAM scores. Also, their reliability from year to year and course to course should be reported. Additionally, when cut scores or performance levels are established for the purpose of evaluative decisions, the methods used, as well as estimates of classification accuracy, should be documented and reported. Justification should [also] be provided for the inclusion of each indicator and the weight accorded to it in the evaluation process…Dissemination should [also] include accessible formats that are widely available to the public, as well as to professionals” ( p. 3-4).

The bottom line:  Tennessee’s use of TVAAS in teacher evaluations is highly problematic.

More on TVAAS:

Not Yet TNReady

The Worst Teachers

Validating the Invalid

More on Peer Assistance and Review:

Is PAR a Worthy Investment?

For more on education politics and policy in Tennessee, follow @TNEdReport

The Worst Teachers?

“There is a decently large percentage of teachers who are saying that they feel evaluation isn’t fair,” he (state data guru Nate Schwartz) said. “That’s something we need to think about in the process we use to evaluate teachers … and what we can do to make clear to teachers how this process works so they feel more secure about it.”

This from a story about the recently released 2015 Educator Survey regarding teacher attitudes in Tennessee.

One reason teachers might feel the evaluation is unfair is the continued push to align observation scores with TVAAS (Tennessee Value-Added Assessment System) data – data that purportedly captures student growth and thereby represents an indicator of teacher performance.

From WPLN:

Classroom observation scores calculated by principals should roughly line up with how a teacher’s students do on standardized tests. That’s what state education officials believe. But the numbers on the state’s five point scale don’t match up well.

“The gap between observation and individual growth largely exists because we see so few evaluators giving 1s or 2s on observation,” the report states.

“The goal is not perfect alignment,” Department of Education assistant commissioner Paul Fleming says, acknowledging that a teacher could be doing many of the right things at the front of the class and still not get the test results to show for it. But the two figures should be close.

In order to be better at aligning observation scores with TVAAS scores, principals could start by assigning lower scores to sixth and seventh grade teachers. At least, that’s what the findings of a study by Jessica Holloway-Libell published in June in the Teachers College Record suggest.

Holloway-Libell studied value-added scores assigned to individual schools in 10 Tennessee districts — urban and suburban — and found:

In ELA in 2013, schools were, across the board, much more likely to receive positive value-added scores for ELA in fourth and eighth grades than in other grades (see Table 1). Simultaneously, districts struggled to yield positive value-added scores for their sixth and seventh grades in the same subject-areas. Fifth grade scores fell consistently in the middle range, while the third-grade scores varied across districts.

Table 1. Percent of Schools That Had Positive Value-Added Scores in English/Language Arts by Grade and District (2013). (Districts in which fewer than 25% of schools showed positive growth are in bold.)

District       Third   Fourth   Fifth   Sixth   Seventh   Eighth
Memphis        41%     43%      45%     19%     14%       76%
Nashville      NA      43%      28%     16%     15%       74%
Knox           72%     79%      47%     14%     7%        73%
Hamilton       38%     64%      48%     33%     29%       81%
Shelby         97%     76%      61%     6%      50%       69%
Sumner         77%     85%      42%     17%     33%       83%
Montgomery     NA      71%      62%     0%      0%        71%
Rutherford     83%     92%      63%     15%     23%       85%
Williamson     NA      88%      58%     11%     33%       100%
Murfreesboro   NA      90%      50%     30%     NA        NA

SOURCE: Teachers College Record, Date Published: June 08, 2015
http://www.tcrecord.org ID Number: 17987, Date Accessed: 7/27/2015
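
Since the bolding referenced in the table note doesn't survive plain text, the below-25% cells can be recovered directly from the values in Table 1; a short sketch:

```python
# Table 1 values (percent of schools with positive ELA value-added, 2013);
# None marks the table's "NA" entries.
table1 = {
    "Memphis":      {"3rd": 41,   "4th": 43, "5th": 45, "6th": 19, "7th": 14,   "8th": 76},
    "Nashville":    {"3rd": None, "4th": 43, "5th": 28, "6th": 16, "7th": 15,   "8th": 74},
    "Knox":         {"3rd": 72,   "4th": 79, "5th": 47, "6th": 14, "7th": 7,    "8th": 73},
    "Hamilton":     {"3rd": 38,   "4th": 64, "5th": 48, "6th": 33, "7th": 29,   "8th": 81},
    "Shelby":       {"3rd": 97,   "4th": 76, "5th": 61, "6th": 6,  "7th": 50,   "8th": 69},
    "Sumner":       {"3rd": 77,   "4th": 85, "5th": 42, "6th": 17, "7th": 33,   "8th": 83},
    "Montgomery":   {"3rd": None, "4th": 71, "5th": 62, "6th": 0,  "7th": 0,    "8th": 71},
    "Rutherford":   {"3rd": 83,   "4th": 92, "5th": 63, "6th": 15, "7th": 23,   "8th": 85},
    "Williamson":   {"3rd": None, "4th": 88, "5th": 58, "6th": 11, "7th": 33,   "8th": 100},
    "Murfreesboro": {"3rd": None, "4th": 90, "5th": 50, "6th": 30, "7th": None, "8th": None},
}

# Grades in each district where fewer than 25% of schools posted
# positive value-added scores -- overwhelmingly sixth and seventh.
low = {district: [grade for grade, pct in row.items() if pct is not None and pct < 25]
       for district, row in table1.items()}
for district, grades in low.items():
    print(district, grades)
```

Running this flags sixth grade in nine of the ten districts and seventh grade in five, while no district falls below 25% in fourth or eighth grade, which is exactly the grade-level pattern Holloway-Libell describes.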

In examining three-year averages, Holloway-Libell found:

The three-year composite scores were similar except even more schools received positive value-added scores for the fifth and eighth grades. In fact, in each of the nine districts that had a composite score for eighth grade, at least 86% of their schools received positive value-added scores at the eighth-grade level.

By contrast, results in math were consistently positive across grade level and district type:

In particular, the fourth and seventh grade-level scores were consistently higher than those of the third, fifth, sixth, and eighth grades, which illustrated much greater variation across districts. The three-year composite scores were similar. In fact, a majority of schools across the state received positive value-added scores in mathematics across all grade levels.

So, what does this mean?

Well, it could mean that Tennessee’s 6th and 7th grade ELA teachers are the worst in the state. Or, it could mean that math teachers in Tennessee are better teachers than ELA teachers. Or, it could mean that 8th grade ELA teachers are rock stars.

Alternatively, one might suspect that the results of Holloway-Libell’s analysis suggest both grade level and subject matter bias in TVAAS.

In short, TVAAS is an unreliable predictor of teacher performance. Or, teaching 6th and 7th grade students reading is really hard.

Holloway-Libell’s findings are consistent with those of Lockwood and McCaffrey (2007) published in the Journal of Educational Measurement:

The researchers tested various VAM models and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured.

That is, it’s totally consistent with VAM to have different estimates for math and ELA teachers, for example. Math questions are often asked in a different manner than ELA questions and the assessment is covering different subject matter.

So, TVAAS is like other VAM models in this respect. Which means, as Lockwood and McCaffrey suggest, “caution is needed when interpreting estimated teacher effects” when using VAM models (like TVAAS).

In other words: TVAAS is not a reliable predictor of teacher performance.

Which raises the question: Why is the Tennessee Department of Education attempting to force correlation between observed teacher behavior and a flawed, unreliable measure of teacher performance? More importantly, why is such an unreliable measure being used to evaluate (and in some districts, reward with salary increases) teachers?

Don’t Tennessee’s students and parents deserve a teacher evaluation system that actually reveals strong teaching and provides support for teachers who need improvement?

Aren’t Tennessee’s teachers deserving of meaningful evaluation based on sound evidence instead of a system that is consistent only in its unreliability?

The American Statistical Association has said value-added models generally are unreliable as predictors of teacher performance. Now, there’s Tennessee-specific evidence that suggests strongly that TVAAS is biased, unreliable, and not effective as a predictor of teacher performance.

Unless, that is, you believe that 6th and 7th grade ELA teachers are our state’s worst.

For more on education politics and policy in Tennessee, follow @TNEdReport

Reform is Working

That’s the message from the Tennessee Department of Education based on recently released TCAP results and an analysis of the data over time.

You can see for yourself here and here.

The one area of concern is reading, but overall, students are performing better than they were when the new TCAP tests were introduced and standards were raised.

Here’s the interesting thing: This is true across school districts and demographic subgroups. The trend is positive.

Here’s something else: A similar trend could be seen in results before the change in the test in 2009.

Tennessee students were steadily making gains. Teachers and schools were hitting the mark set for them by policymakers. This in an age of collective bargaining for teachers and no TVAAS-based evaluation or pay schemes.

When the standards were made higher — certainly a welcome change — teachers again hit the mark.

Of course, since the standards change, lots of other reforms have taken place. Most of these have centered around teachers and the incorporation of TVAAS in teacher evaluation and even pay schemes. The State Board of Education even gutted the old state salary schedule to promote pay differentiation, ostensibly based on TVAAS scores.

But does pay for TVAAS actually lead to improved student outcomes as measured by TVAAS?

Consider this comparison of Putnam County and Cumberland County. Putnam was one of the original TIF recipients and among the first to develop a pay scheme based on teacher evaluations and TVAAS.

Putnam’s 2014 TVAAS results are positive, to be sure. But neighboring Cumberland County (a district that is demographically similar and has a similar assortment of schools) also shows positive TVAAS results.  Cumberland relies on the traditional teacher pay scale. From 2012-13 to 2013-14, Putnam saw a 50% increase in the number of categories (all schools included) in which they earned TVAAS scores of 5. So did Cumberland County.

Likewise, from 2012-13 to 2013-14, Putnam saw a 13% decline in the number of categories in which they earned TVAAS scores below a 3. In Cumberland County, the number was cut by 11%.

This is one example over a two-year cycle. New district-level results for 2015 will soon be available and will warrant an update. But it's also worth noting that these results track those seen in Denver in analyses of its ProComp pay system. Specifically, the University of Colorado's Denver ProComp Evaluation Report (2010-2012) finds little impact of ProComp on student achievement, or on teachers' professional practices, including their teaching practices or retention.

The Putnam-Cumberland initial analysis tracks with that of the ProComp studies: Teacher performance pay, even if devised in conjunction with teacher groups, cannot be said to have a significant impact on student performance over time.

So, prior to 2008, student academic achievement as measured by Tennessee standardized tests showed steady improvement over time. This occurred in an environment with no performance pay. Again from 2009-2015, across districts and demographic groups, student achievement is improving. Only a small number of Tennessee districts have performance pay schemes — so, that alone would indicate that performance pay is not driving improved student outcomes.  Then, a preliminary comparison of two districts suggests that both performance pay and non-performance pay districts see significant (and similar) TVAAS gains.

Reform may be working — but it may not be the reform the reformers want to push.

For more on education politics and policy in Tennessee, follow @TNEdReport

Is John Oliver Reading TN Ed Report?

John Oliver recently took on the issue of standardized testing and it sounds like he’s been reading Tennessee Education Report. In 18 brilliant minutes, he hits on a number of topics covered here time and again.

Oliver discussed teacher merit pay, the recruiting tactics of testing companies, value-added assessment, and testing transparency.

Back in 2013, Tennessee’s State Board of Education moved toward merit pay based on value-added data.

This year, while adding nearly $100 million to the pot for teacher compensation, Governor Haslam continued a push for merit pay.

While Oliver noted that Pearson recruits test scorers on Craigslist, Tennessee's new testing vendor, Measurement, Inc., uses the same practice.

And of course, there’s the issue of value-added assessment — in Tennessee, called TVAAS. While it yields some interesting information, it’s not a reliable predictor of teacher performance and it’s going to be even more unreliable going forward, due to the shift from TCAP to TNReady. Here’s what we’ve learned from TVAAS in Tennessee:

In fact, this analysis demonstrates that the difference between a value-added identified “great” teacher and a value-added identified “average” teacher is about $300 in earnings per year per student.  So, not that much at all.  Statistically speaking, we’d call that insignificant.  That’s not to say that teachers don’t impact students.  It IS to say that TVAAS data tells us very little about HOW teachers impact students.

Surprisingly, Tennessee has spent roughly $326 million on TVAAS and attendant assessment over the past 20 years. That’s $16 million a year on a system that is not yielding much useful information.

And then there’s testing transparency. Oliver points out that it’s difficult if not impossible to get access to the actual test questions. In fact, Tennessee’s testing vendor, Measurement, Inc., has a contract with Utah’s testing vendor that involves a fine if test questions are revealed — $5000 per question:

The contract further notes that any release of the questions either by accident or as required by law, will result in a fee of $5000 per test item released. That means if Tennessee wants to release a bank of questions generated from the Utah test and used for Tennessee’s assessment, the state would pay $5000 per question.

Here’s the clip from John Oliver:


For more on education politics and policy in Tennessee, follow @TNEdReport


The End of an Era

Over at Bluff City Ed, Jon Alfuth celebrates the end of the EOC testing era. Those tests will be replaced with TNReady next year.

Alfuth notes that there are many challenges with the current testing regime, including gaming the system and misalignment with current standards.

Here’s what he says he hopes the new tests provide:

First, I’d personally like to see aligned pre- and formative assessments to allow teachers to track tests throughout the year. These could be given to the districts and used to develop a benchmark for where students are starting and track their progress throughout the year. These should be designed by Measurement Inc. to ensure close alignment to the actual test.

Second, we need to see shorter tests. Asking students to sit for two to four three-hour assessments in a four-day period is a lot, and it does stress kids out. I'd like to see the number of questions on the new TNReady assessments reduced to reflect this reality.

Third, we need better special education and special needs accommodations. I’m not a special education teacher myself, but from talking to some of my colleagues my understanding is that the accommodations for the EOC regime aren’t the greatest. Hopefully a technologically advanced test like TNReady (it can be given on paper or on a computer) could include better accommodations for kids with special needs. I also hope it makes automatic adjustments for students who, say, speak English as a second language.

Fourth, we need to see a substantial increase in resources aligned to the new assessments, and SOON. Teachers need time to internalize the format and the types of questions that students will be asked to complete on the new assessments. That was one of the failings of PARCC, and one reason I believe we no longer have it in Tennessee: teachers didn't have enough supporting resources and backed off support for the assessment. Let's hope that TNReady doesn't make the same mistake.

More on TNReady:

TNReady to Borrow Questions from Utah

Transition to TNReady Creates TVAAS Problems

For more on education politics and policy, follow @TNEdReport

A Little Less Bad

From a story in Chalkbeat:

Tennessee’s teacher evaluation system is more accurate than ever in measuring teacher quality…

That’s the conclusion drawn from a report on the state’s teacher evaluation system conducted by the State Department of Education.

The idea is that the system is improving.

Here’s the evidence the report uses to justify the claim of an improving evaluation system:

1) Teacher observation scores now more closely align with teacher TVAAS scores — TVAAS is the value-added modeling system used to determine a teacher’s impact on student growth

2) More teachers in untested subjects are now being evaluated using the portfolio system rather than TVAAS data from students they never taught

On the second item, I'd note that previously, 3 districts were using the portfolio model and now 11 districts use it. This model allows related-arts teachers and those in other untested subjects to present a portfolio of student work to demonstrate their impact on growth. The model is generally applauded by teachers who have a chance to use it.

However, there are 141 districts in Tennessee, and only 11 use this model. Part of the reason is the time it takes to assess portfolios well; another is the cost of having trained evaluators assess them. Since the state has not (yet) provided funding for the use of portfolios, it's no surprise more districts haven't adopted the model. If the state wants the evaluation model to truly improve (and thereby improve teaching practice), it should support districts in their efforts to provide meaningful evaluation to teachers.

A portfolio system could work well for all teachers, by the way. The state could move to a system of project-based learning and thus provide a rich source of material for both evaluating student mastery of concepts AND teacher ability to impact student learning.

On to the issue of TVAAS and observation alignment. Here’s what the report noted:

Among the findings, state education leaders are touting the higher correlation between a teacher’s value-added score (TVAAS), which estimates how much teachers contribute to students’ growth on statewide assessments, and observation scores conducted primarily by administrators.

First, the purpose of using multiple measures of teacher performance is not to find perfect alignment, or even strong correlation, but to utilize multiple inputs to assess performance. Pushing for alignment suggests that the department is actually looking for a way to make TVAAS the central input driving teacher evaluation.

Advocates of this approach will suggest that student growth can be determined accurately by TVAAS and that TVAAS is a reliable predictor of teacher performance.

I would suggest that TVAAS, like most value-added models, is not a significant differentiator of teacher performance. I’ve written before about the need for caution when using value-added data to evaluate teachers.

More recently, I wrote about the problems inherent in attempting to assign growth scores when shifting to a new testing regime, as Tennessee will do next year when it moves from TCAP to TNReady. In short, it’s not possible to assign valid growth scores when comparing two entirely different tests.  Researchers at RAND noted:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.
These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005): you get different results depending on the type of question you're measuring.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured. 

And they concluded:

Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests.

So, even if you buy the idea that TVAAS is a significant differentiator of teacher performance, growth estimates based on next year's TNReady simply won't be reliable.

The state is touting improvement in a flawed system that may now be a little less bad. And because it insists on estimating growth across two different tests with differing methodologies, the growth estimates in 2016 will be unreliable at best. If the state wanted to improve the system, it would take two to three years to build growth data based on TNReady — that would mean two to three years of NO TVAAS data in teacher evaluation.

Alternatively, the state could move to a system of project-based learning and teacher evaluation and professional development based on a Peer Assistance and Review Model. Such an approach would be both student-centered and result in giving teachers the professional respect they deserve. It also carries a price tag — but our students are worth doing the work of both reallocating existing education dollars and finding new ways to invest in our schools.

For more on education politics and policy in Tennessee, follow @TNEdReport