The End of an Era

Over at Bluff City Ed, Jon Alfuth celebrates the end of the EOC testing era. Those tests will be replaced with TNReady next year.

Alfuth notes that there are many challenges with the current testing regime, including gaming the system and misalignment with current standards.

Here’s what he says he hopes the new tests provide:

First, I’d personally like to see aligned pre- and formative assessments to allow teachers to track student progress throughout the year. These could be given to the districts and used to develop a benchmark for where students are starting and to track their progress throughout the year. These should be designed by Measurement Inc. to ensure close alignment to the actual test.

Second, we need to see shorter tests. Asking students to sit for two to four three-hour assessments in a four-day period is a lot, and it does stress kids out. I’d like to see the number of questions reduced on the new TNReady assessments to reflect this reality.

Third, we need better special education and special needs accommodations. I’m not a special education teacher myself, but from talking to some of my colleagues my understanding is that the accommodations for the EOC regime aren’t the greatest. Hopefully a technologically advanced test like TNReady (it can be given on paper or on a computer) could include better accommodations for kids with special needs. I also hope it makes automatic adjustments for students who, say, speak English as a second language.

Fourth, we need to see a substantial increase of resources aligned to the new assessments, and SOON. Teachers need time to internalize the format and the types of questions that students will be asked to complete on the new assessments. That was one of the failings of PARCC and one reason I believe we no longer have it in Tennessee – teachers didn’t have enough supporting resources and backed off support for the assessment. Let’s hope that TNReady doesn’t make the same mistake.

More on TNReady:

TNReady to Borrow Questions from Utah

Transition to TNReady Creates TVAAS Problems

For more on education politics and policy, follow @TNEdReport

A Little Less Bad

From a story in Chalkbeat:

Tennessee’s teacher evaluation system is more accurate than ever in measuring teacher quality…

That’s the conclusion drawn from a report on the state’s teacher evaluation system conducted by the State Department of Education.

The idea is that the system is improving.

Here’s the evidence the report uses to justify the claim of an improving evaluation system:

1) Teacher observation scores now more closely align with teacher TVAAS scores — TVAAS is the value-added modeling system used to determine a teacher’s impact on student growth

2) More teachers in untested subjects are now being evaluated using the portfolio system rather than TVAAS data from students they never taught

On the second item, I’d note that previously, 3 districts were using a portfolio model and now 11 districts use it. This model allows related-arts teachers and those in other untested subjects to present a portfolio of student work to demonstrate that teacher’s impact on growth. The model is generally applauded by teachers who have a chance to use it.

However, there are 141 districts in Tennessee, and only 11 use this model. Part of the reason is the time it takes to assess portfolios well; another is the cost of having trained evaluators assess them. Since the state has not (yet) provided funding for the use of portfolios, it’s no surprise more districts haven’t adopted the model. If the state wants the evaluation system to really improve (and thereby improve teaching practice), it should support districts in their efforts to provide meaningful evaluation to teachers.

A portfolio system could work well for all teachers, by the way. The state could move to a system of project-based learning and thus provide a rich source of material for both evaluating student mastery of concepts AND teacher ability to impact student learning.

On to the issue of TVAAS and observation alignment. Here’s what the report noted:

Among the findings, state education leaders are touting the higher correlation between a teacher’s value-added score (TVAAS), which estimates how much teachers contribute to students’ growth on statewide assessments, and observation scores conducted primarily by administrators.

First, the purpose of using multiple measures of teacher performance is not to find perfect alignment, or even strong correlation, but to utilize multiple inputs to assess performance. Pushing for alignment suggests that the department is actually looking for a way to make TVAAS the central input driving teacher evaluation.

Advocates of this approach will suggest that student growth can be determined accurately by TVAAS and that TVAAS is a reliable predictor of teacher performance.

I would suggest that TVAAS, like most value-added models, is not a significant differentiator of teacher performance. I’ve written before about the need for caution when using value-added data to evaluate teachers.

More recently, I wrote about the problems inherent in attempting to assign growth scores when shifting to a new testing regime, as Tennessee will do next year when it moves from TCAP to TNReady. In short, it’s not possible to assign valid growth scores when comparing two entirely different tests.  Researchers at RAND noted:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.
These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005): you get different results depending on the types of questions being asked.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured. 

And they concluded:

Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests.

So, even if you buy the idea that TVAAS is a significant differentiator of teacher performance, conclusions drawn from next year’s TNReady results simply won’t be reliable.
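To see the mechanism concretely, here’s a toy simulation. This is not TVAAS itself (the actual model is a proprietary mixed-model methodology run by SAS), and every number in it is made up for illustration; it just sketches the effect the researchers describe: when two tests only partially overlap in the skills they measure, naive value-added estimates for the same teachers can rank them quite differently.

```python
import random

def rank(xs):
    """Return the rank (0 = lowest) of each element of xs."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0] * len(xs)
    for position, idx in enumerate(order):
        ranks[idx] = position
    return ranks

def simulate_teacher_effects(n_teachers=50, n_students=30, overlap=0.5, seed=1):
    """Toy value-added estimates for the same teachers under two tests
    that only partially overlap in the skills they measure."""
    rng = random.Random(seed)
    est_a, est_b = [], []
    for _ in range(n_teachers):
        true_effect = rng.gauss(0, 1)  # the teacher's "real" contribution
        gains_a, gains_b = [], []
        for _ in range(n_students):
            taught = rng.gauss(true_effect, 1)        # what the teacher imparted
            gains_a.append(taught + rng.gauss(0, 1))  # test A captures it, plus noise
            # test B only partially samples the taught skills
            gains_b.append(overlap * taught + rng.gauss(0, 1))
        est_a.append(sum(gains_a) / n_students)  # naive growth estimate on test A
        est_b.append(sum(gains_b) / n_students)  # naive growth estimate on test B
    return est_a, est_b

est_a, est_b = simulate_teacher_effects()
ranks_a, ranks_b = rank(est_a), rank(est_b)
# How many teachers land in a different quintile depending on which test is used?
quintile = lambda r, n: r * 5 // n
moved = sum(quintile(a, 50) != quintile(b, 50) for a, b in zip(ranks_a, ranks_b))
print(f"{moved} of 50 teachers change quintile when the test changes")
```

Nothing about a teacher changes between the two columns of estimates; only the test does. That is the core of the TCAP-to-TNReady problem.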

The state is touting improvement in a flawed system that may now be a little less bad. And because they insist on estimating growth from two different tests with differing methodologies, the growth estimates in 2016 will be unreliable at best. If they wanted to improve the system, they would take two to three years to build growth data based on TNReady — that would mean two to three years of NO TVAAS data in teacher evaluation.

Alternatively, the state could move to a system of project-based learning and teacher evaluation and professional development based on a Peer Assistance and Review Model. Such an approach would be both student-centered and result in giving teachers the professional respect they deserve. It also carries a price tag — but our students are worth doing the work of both reallocating existing education dollars and finding new ways to invest in our schools.

For more on education politics and policy in Tennessee, follow @TNEdReport

Validating the Invalid?

The Tennessee House of Representatives passed legislation today (HB 108) that makes changes to current practice in teacher evaluation as Tennessee transitions to its new testing regime, TNReady.

The changes adjust the percentage of a teacher’s evaluation that is dependent on TVAAS scores to 10% next year, 20% the following year, and back to the current 35% by the 2017-18 academic year.
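To make those shifting percentages concrete, here’s a quick sketch of a weighted composite score. The 15% achievement share mirrors the standard TEAM breakdown, but how the weight removed from growth gets redistributed during the transition is an assumption made here for illustration, not the statutory formula.

```python
def composite_score(observation, growth, achievement, growth_weight):
    """Weighted composite on a 1-5 scale. Achievement is held at 15%;
    the weight removed from growth is assumed, for illustration only,
    to shift to observation."""
    observation_weight = 0.85 - growth_weight
    return (observation_weight * observation
            + growth_weight * growth
            + 0.15 * achievement)

# A hypothetical teacher with strong observations but a weak TVAAS estimate:
teacher = {"observation": 4.0, "growth": 2.0, "achievement": 3.5}
for year, weight in [("2015-16", 0.10), ("2016-17", 0.20), ("2017-18", 0.35)]:
    print(year, round(composite_score(**teacher, growth_weight=weight), 2))
```

Even at the reduced 10% weight, the growth number still moves this hypothetical teacher’s composite score — which is exactly the concern raised below about using any share of an invalid estimate.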

This plan is designed to allow for a transition period to the new TNReady tests, which will include constructed-response questions and be aligned to the so-called Tennessee standards, which match up with the Common Core State Standards.

Here’s the problem: There is no statistically valid way to predict expected growth on a new test based on the historic results of TCAP. First, the new test has (supposedly) not been fully designed. Second, the test is in a different format. It is both computer-based and contains constructed-response questions. That is, students must write out answers and/or demonstrate their work.

Since Tennessee has never had a test like this, it’s impossible to predict growth at all. Not even with 10% confidence. Not with any confidence. It is the textbook definition of comparing apples to oranges.

Clearly, legislators feel that, at the very least, this is an improvement: a reasonable accommodation to teachers as our state makes a transition.

But, how is using 10% of an invalid number a good thing? Should any part of a teacher’s evaluation be made up of a number that reveals nothing at all about that teacher’s performance?

While value-added data alone is a relatively poor predictor of teacher performance, the value-added estimate used next year is especially poor because it is not at all valid.

But, don’t just take my word for it. Researchers studying the validity of value-added measures asked whether value-added models gave different results depending on the type of question asked. That question is particularly relevant now, because Tennessee is shifting to a new test with different types of questions.

Here’s what Lockwood and McCaffrey (2007) had to say in the Journal of Educational Measurement:

We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers’ performance based on value-added models can be sensitive to the ways in which student achievement is measured.
These findings align with similar findings by Martineau (2006) and Schmidt et al. (2005): you get different results depending on the types of questions being asked.

The researchers tested various VAM models (including the type used in TVAAS) and found that teacher effect estimates changed significantly based on both what was being measured AND how it was measured. 

And they concluded:

Our results provide a clear example that caution is needed when interpreting estimated teacher effects because there is the potential for teacher performance to depend on the skills that are measured by the achievement tests.

If you measure different skills, you get different results. That decreases (or eliminates) the reliability of those results. TNReady is measuring different skills in a different format than TCAP. It’s BOTH a different type of test AND a test on different standards. Any value-added comparison between the two tests is statistically suspect, at best. In the first year, such a comparison is invalid and unreliable. As more years of data become available, it may be possible to establish some correlation between past TCAP results and TNReady scores.

Or, if the state is determined to use growth scores (and wants to use them with accuracy), it will wait several years and build completely new growth models based on TNReady alone. At least three years of data would be needed in order to build such a model.

It seems likely that the Senate will follow the House’s lead on Monday and overwhelmingly support the proposed evaluation changes. But in doing so, they should be asking themselves if it’s really ok to base any part of a teacher’s evaluation on numbers that reliably predict nothing.

More on Value-Added:

Real World Harms of Value-Added Data

Struggles with Value-Added Data

 

TNReady … Already?

Back in November, the State of Tennessee awarded a contract to Measurement Inc. to develop the new assessment that would replace TCAP.

This assessment is to be aligned to state standards (largely based on Common Core State Standards) and should take into account feedback from Tennesseans.

Measurement Inc. will be paid $108 million for the contract.

Chalkbeat noted at the time the contract was awarded:

Measurement Inc. is subcontracting to AIR, a much larger player in the country’s testing market. AIR already has contracts with Utah and Florida, so Tennessee educators will be able to compare scores of Tennessee students with students from those states “with certainty and immediately.” AIR is also working with Smarter Balanced, one of two federally funded consortia charged with developing Common Core-aligned exams. That means that educators in Tennessee will also likely be able to measure their students’ progress with students in the 16 states in the Smarter Balanced Consortium.

The Department of Education notes on its website:

Comparability: While the assessments will be unique to Tennessee, TNReady will allow Tennesseans to compare our student progress to that of other states. Through a partnership between Measurement Inc. and American Institutes for Research, TNReady will offer Tennessee a comparison of student performance with other states, likely to include Florida and Utah.

While Measurement Inc. has an interesting approach to recruiting test graders, another item about the contract is also noteworthy.

The Department and Chalkbeat both noted the ability to compare Tennessee test scores with other states, including Utah and Florida.

Here’s why that’s possible. On December 5th, the Utah Board of Education approved the use of revenue from test licensing agreements with Florida, Arizona, and Tennessee. Those agreements are based on contracts with AIR, the organization with which Measurement Inc. has a contract, as noted by Chalkbeat.

The contract notes that Utah’s expected arrangement in Tennessee is worth $2.3 million per year (running from 2015-2017) and that Tennessee will use questions licensed for the Utah assessment in Math and ELA in its 2015-16 assessment.

So, Tennessee’s new test will use questions developed for Utah’s assessment and also licensed to Florida and Arizona.

The contract further notes that any release of the questions, whether by accident or as required by law, will result in a fee of $5,000 per test item released. That means that if Tennessee wants to release a bank of questions generated from the Utah test and used for Tennessee’s assessment, the state would pay $5,000 per question.

While Tennessee has said it may change or adapt the test going forward, it seems that the 2016 edition of the test may be well underway in terms of its development.

For more on education politics and policy in Tennessee, follow @TNEdReport

Ready to Grade?

Measurement, Inc. has been hired by the State of Tennessee to design new standardized tests to replace TCAP. The new test is to be aligned to Tennessee’s new standards and will include constructed-response questions in addition to multiple choice. This means students will write answers or demonstrate work as part of the test. The idea is to demonstrate understanding of a subject, rather than simply guessing on a multiple choice test.

Typically, grading a constructed response test is costly, because evaluators have to read and consider the answers and then rate them based on a rubric. Fortunately for Tennessee taxpayers, Measurement, Inc. has found a way to keep these costs low.

Here’s an ad from Measurement seeking Evaluators/Readers for tests:

Thank you for your interest in employment with Measurement Incorporated. We are a diverse company engaged in educational research, test development, and the scoring of tests administered throughout the world. Our company has grown to be the largest of its kind by providing consistent and reliable results to our clients. We are able to do so through the efforts of a professional and flexible staff, and we welcome your interest in becoming a member.

Measurement Incorporated Reader/Evaluator Position

Recruiting for projects starting in March of 2015 for both day and evening shift at the Ypsilanti Scoring Center. If you qualify as a reader/evaluator, you will be eligible to work on a number of our projects. Many projects require readers to score essays for content, organization, grammatical convention, and/or the student’s ability to communicate and to respond to a specific directive. Other projects involve scoring test items in reading, math, science, social studies, or other subject areas. The tests you will score come from many different states and from students at all grade levels, elementary through college, depending on the project.

LOCATION
Measurement Incorporated Ypsilanti Scoring Center
1057 Emerick
Ypsilanti, MI 48198
(734) 544-7686

REQUIREMENTS
Bachelor’s degree in any field
Ability to perform adequately on a placement assessment
Completion of a successful interview
Access to a home computer with high speed internet in a secure work area for telecommuters

HOURS
Readers are hired on a temporary basis by project but are expected to work five days per week, Monday through Friday. Hours vary by shift. Attendance during training (usually the first few days of a project) is mandatory.

PAY
The starting pay is $10.70 per hour. After successful completion of three major scoring projects (or a minimum of 450 hours), readers who meet the minimum standards of production, accuracy and attendance will receive an increase to $11.45 per hour.

APPLICATION PROCEDURE
To apply, please go to http://www.measurementinc.com/Employment/ and select the Reader/Evaluator position. Select Ypsilanti as your location and click on the “Apply Online” tab. Qualified applicants will be contacted to complete an online placement assessment, schedule an interview, and provide proof of degree. If invited to work on a scoring project, proof of employment eligibility in order to complete a federal I-9 form will be required within three days of employment.

Apparently, scorers at the Nashville scoring center can earn starting pay of $11.20 an hour.

Certainly, quality scorers for TNReady can be found for $10.70-$11.20 an hour via ads posted on Craigslist. I’m sure parents in the state are happy to know this may be the pool of scorers determining their child’s test score. And teachers, whose evaluations are based on growth estimates from these tests, are also sure to be encouraged by the validity of results obtained in this fashion.

So, if you have a Bachelor’s degree and want to make around $11 an hour on a temporary, contract basis, by all means get in touch with the developers of Tennessee’s new standardized tests.

For more on education politics and policy in Tennessee, follow @TNEdReport

Replacing TCAP

Measurement, Inc. has been hired by the State of Tennessee to design new assessments in ELA and Math.

The contract came about because the General Assembly passed legislation calling on the state to open bidding for new assessments rather than continue as planned with administration of the PARCC tests.

Here’s an email sent to educators today explaining the upcoming changes:

Over the past several months, Gov. Haslam and his administration, including the state department of education, have participated in a number of ongoing conversations with you and your colleagues about K-12 education in Tennessee. These conversations have reflected both the historic progress Tennessee has made through your work as well as your concerns and recommendations for improvement. One emerging theme from these discussions has been the challenges experienced by educators due to the uncertainty of the state’s assessments in English language arts (ELA) and math and the impact of administering the existing TCAP exams while meeting the current ELA and math academic standards.
We are excited to report to you that this week the state of Tennessee completed the process to replace the state’s current TCAP assessments in ELA and math. The new measurements of learning for ELA and math will be called Tennessee Ready (TNReady). These assessments, to be administered by Measurement Inc., were selected through a fair, thorough and transparent process established by the General Assembly and administered by the state’s Central Procurement Office.
TNReady will be administered beginning in the 2015-16 school year and will assess our state standards in ELA and math. These standards are located on the department of education website (ELA is here and math is here).
You’ll find additional information about TNReady below:

  • By Tennessee, For Tennessee: Tennessee educators – both at the K-12 and higher education levels – were significantly involved in the selection process and chose an assessment that is not only fully aligned to the state’s academic standards but also adaptable to future improvements. Tennessee will make decisions about item selection, test length and composition, and scoring. In the future, Tennessee will decide on changes to the test based on changes to standards, and Tennesseans will be engaged in item development and review.
  • Higher Expectations and Critical Thinking:  TNReady will expand beyond just multiple choice questions to include: writing that requires students to cite text evidence at all grade levels; questions that measure math fluency without a calculator; and questions that ask students to show their work in math with partial credit available.
  • Resources for Parents and Teachers:  Online tools will be available for schools and teachers to develop practice tests that can provide students, teachers, and parents with valuable and immediate feedback. These resources will be available before the end of the 2014-15 school year.
  • Comparability:  While the assessments will be unique to Tennessee, TNReady will allow Tennesseans to compare our student progress to that of other states. Through a partnership between Measurement Inc. and American Institutes for Research, TNReady will offer Tennessee a comparison of student performance with other states, likely to include Florida and Utah.
  • Training:  The Tennessee Department of Education will provide training for educators across the state during the summer of 2015.
  • Test Administration & Scoring: TNReady will have two parts. The first portion, which will replace the state’s current comprehensive writing assessment, will require extended written responses in ELA and math and will be administered in February/March. The second portion will include selected responses, such as multiple choice and drag-and-drop items, and will be administered in April/May.
  • Technology: TNReady will be administered online and available for use on multiple devices with minimal bandwidth. As most states move their tests for all grade levels online, we must ensure Tennessee students do not fall behind their peers in other states. However, all districts will have the option of administering paper-pencil exams.

We look forward to sharing additional details about the new assessments in the coming months.  Additional information will be posted on the new TNReady page of our website.
Finally, as previously noted, Tennessee will make appropriate revisions to assessments in the future to reflect any change in the academic standards. Recently, Gov. Haslam and the State Board of Education announced a public review process in which all Tennesseans will have an opportunity to provide input on our ELA and math standards. These public comments will then be reviewed by committees of Tennessee educators, which will make recommendations to the state board. We encourage all of you to be engaged in this process in an effort to ensure our academic standards continue to reflect higher expectations for our students. To participate in the standards review process, visit https://apps.tn.gov/tcas/.

We want to thank you for your patience and acknowledge the tremendous dedication you have shown in improving the life outcomes for Tennessee students and their families. Thank you for what you do every day.

For more on education policy and politics in Tennessee, follow @TNEdReport