A Letter of Concern

Prior to the latest TNReady debacle, the Director of Schools in Oak Ridge sent a letter outlining some concerns about this year’s testing to Commissioner of Education Candice McQueen. Her response includes the questions he posed and makes for some interesting reading regarding the challenges faced by districts this year.

McQueen’s response is published here in its entirety:

April 11, 2016
Dr. Borchers,
Thank you for sharing your request and thoughts about TNReady and testing this year. I know this has been a tremendous transition for our families and schools, and I do not take these concerns lightly.
I want to address each of the issues you raised, but first I want to make sure you and your educators are aware of the new flexibility we have offered for accountability for the 2015-16 year, in part because of the unexpected issues we experienced on Part I.
First, both teachers and school leaders will not have results from this year’s tests included in their student growth (TVAAS) score unless it benefits them to do so. In other words, if results from this year give a teacher a higher score, they will be included, but if they hurt a teacher’s evaluation, they will be excluded. Educators will automatically receive the best option. You can read more information by clicking here.
In addition, you as a director can provide educators with the option to select a new achievement measure, and those who had originally chosen a schoolwide TVAAS measure can switch to a non-TVAAS option. Also, per the Tennessee Teaching Evaluation Enhancement Act, districts have complete discretion in how they choose to factor test data into employment decisions like promotion, retention, termination, and compensation. And as we had stated earlier, because the scores will be back later this year, districts do not have to include students’ scores in their grades.
Schools also have flexibility for accountability. When we run the Priority School list next year, we will provide a safe harbor for schools who may have seen a decline in performance in 2015-16 that would have resulted in being placed on the list. Instead, we will also look at school performance excluding 2015-16 data, and if that removes the school from being in the bottom 5 percent, they will not be considered a Priority School.
We have already taken steps through our ESEA waiver to revise district accountability this year. For 2015-16, districts will receive the better of two options for purposes of the achievement and gap closure statuses: a one-year growth measure or their relative rank in the state. If a district’s achievement scores decline, but their peers across the state decline in tandem, a district’s relative rank will remain stable. Similar to the governor’s proposal for teachers, districts will automatically receive the option that yields the higher score.

We still believe in the important role state assessment plays in accountability, and this year’s results will provide a baseline from which we can grow. We have a responsibility as a state to make sure all of our students are making progress toward college and career, and state tests give us the best and fairest measure of how all of our children, in all subgroups, are performing. We also have a responsibility to tell taxpayers about how our children are performing given their investment in our education system. No one test is ever a perfect measure of a child’s readiness or full demonstration of everything they have learned, but each feedback loop provides one angle or piece of data that can be considered within a broader context. That is what we hope TNReady will do – and we are equally committed to our responsibility to continue to improve the test and strengthen the data it provides you each year.
To address your specific concerns:
1. Students who were in the middle of testing on the day of the crash saw the exact same questions and prompts when they took the paper-based version. This gave those students a substantial advantage over their peers. 
There were approximately 20,000 students who successfully completed a Part I assessment online on Feb. 8. Those students did not retake the Part I test on paper. There were also 28,000 students who began an online assessment and were not able to complete the ELA, math, or social studies exam. We believe it would have been unfair to penalize those students because of the system disruptions. The department felt it was critical and fair to provide these students another opportunity.
It is highly unlikely that any of the students that attempted to take their Part I assessment online would have encountered the same writing prompt (or math and social studies items) when they took the paper test. There were 1.8 million tests submitted for Part I, compared to the 28,000 students who had logged in but not completed their Part I test. Because of multiple forms and versions being created for both the online and paper versions of the test, only about 16,000 of those students could have possibly been exposed to items on the paper-based test that were on the online versions, and all of those students were ones who likely experienced significant technical interruptions that may have prevented them from moving through or even seeing much or all of the test.
In addition, the students did not receive any feedback on what they may have previously completed, so they had no idea if their response was on track or not. Also, because the prompts for ELA were specific to the passages provided, students were not able to do additional research or practice composing their answer, since they would have needed to reference specific examples in the text to address a particular prompt. Simply seeing a question would not give a student any more of an advantage than a student who has practiced with the test questions on MICA or MIST.
Overall, this means that less than 1% of students may have been exposed to items 2-3 weeks prior to actual administration, received no feedback on their responses, had no access to items or passages until the paper administration, and experienced severe technical disruptions. Therefore, we don’t believe that those students had any advantage. In contrast, we believe these students would have been at a substantial disadvantage if they had not been allowed to complete the assessment via the paper version.
Finally, we will conduct a test mode effect study to determine if students who completed the assessment online in the fall or on Feb. 8 had any significant difference in performance from those who completed the paper-based version this spring.  If we find such differences, then we will make adjustments in the scoring, as is best practice for large-scale assessments (like ACT) that are administered both via paper and online.
2. The online assessment for geometry given in the fall included a reference sheet. In addition, the TNReady blueprint states that a reference sheet would be provided for all high school math exams; however, the students who tested this February did not receive a reference sheet with the paper/pencil assessment. This led to a great deal of concern from the students and will lead to inconsistent results. 
The reference sheet is intended for algebra I & II, and in the proctor script for Part I this spring, there is a reminder to give the math reference sheet to students for algebra I and algebra II only. It would not have benefited students in geometry, as there are no items on the geometry Part I assessment that the reference sheet will help a student answer.
The reference sheets were printed and shipped to districts along with test booklets and answer documents. We did not hear from Oak Ridge if they did not receive these, but let us know if they did not arrive.
3. In secondary math, students have reported questions that did not match the major work of the grade and item types that did not match the percent distribution that we were given with the blueprints. Despite many requests to the Department of Education for accurate blueprints providing accurate item type breakdowns for parts 1 and 2 of the TNReady, we have not been given updated blueprints. This has led to confusion about what students will be tested on and what item types to prepare for on the assessments.
Apologies if you reached out to our team and we were not responsive. We developed these blueprints for the first time this year to try to help educators understand how to pace their teaching over the course of the year and give them a sense of what standards would be covered on which parts of TNReady. We are learning from our educators about how to better support them in that vision, so we are going to be making some changes in the design of the blueprints for next year.
However, to address your concerns about this year’s blueprints, I want to provide context about what we shared with all districts and what students experienced on Part I. In March 2015, the department held regional assessment meetings introducing the test design for the 2015-16 school year. During those meetings, we included the following slides to highlight the content differences for grades 3-8 versus high school:
There is no language in the high school summary that should have indicated only major work of the grade would be covered in Part I for high school math courses. Moreover, the blueprint for geometry indicates that there are standards outside of major work of the grade that are assessed in all high school courses (see below). These blueprints were released in April 2015 and updated in September 2015, both times including standards beyond the major work of the grade in Part I. Those clusters that are not major work of the grade are highlighted below. (See full letter for graphics)
There are no item type distributions in any of the mathematics blueprints. We shared some very preliminary projections last spring on item type distribution in the regional assessment meetings to give districts a sense of the mix of items. At that time, we emphasized that there would still be multiple choice and multiple select items, and students would have seen a variety of question types if they practiced on MICA or MIST over the fall and winter.
Just as we did for Part I, we have also shared a document with examples of how math questions will appear in the test booklet and on corresponding answer document, which you can view by clicking here. This illustrates the variety of item types which students may see on Part II.
4. On one of the High School EOCs, we were shipped two different answer documents. That normally wouldn’t be a problem except that we were shipped only one test form. Thus many of our students had an answer document that did not match the test on one of the questions. This not only invalidates that question but may also invalidate the responses immediately after that question because students may have started putting their answers in the wrong place so that it better matched the answer document. 
There was a minor printer error found on one of the geometry answer documents, and we appreciate the notification from Jim Hundertmark, the assessment director at Oak Ridge. We advised him that we followed up with the department’s assessment design team and Measurement Inc. This issue has been flagged for scorers who will complete the hand-scoring process for geometry Part I, so they will be aware as they score students’ responses.
This issue was not widespread and was limited to one printing batch from one of the eight vendors who supplied Part I answer documents. As always, any item that creates irregularity in scoring may ultimately be excluded from student scores such that there is no impact on final performance results.
5. The test document and answer document did not match. As examples, on one test a grid was numbered by ones in the test booklet, but the grid on the answer document was numbered by another scale. On another 3-8 test the answer document had a box for the answer but the question in the book showed multiple choice. This was misleading for students who transferred work from test booklet to answer booklet and caused a great deal of confusion.
We are aware of only one issue with a table in 7th grade math. The table in the answer document included the variable “p” on one of the math terms, while there was no “p” in the test booklet.  This was not a widespread issue, and, as with the geometry item as noted above, this is a hand-scored component that scorers have been made aware of.
This table is the only issue we are aware of in which the answer document and test booklet did not match. As with any assessment, items that cause irregularity in scoring will ultimately be excluded from student scores such that there is no impact on final performance.
6. Students repeatedly reported that boxes for some short answer responses were too small and students were not able to fit the entire answer in the box. This led some students to believe that their answers were not correct, causing them to rework problems, wasting precious time, and quite possibly changing correct work to incorrect work. 
Student responses on math answer documents required only numerical answers. If student handwriting was larger than the box, this is not an issue, as the items are hand scored.
7. On the first day of testing for grades 3 – 8, the scripts that the proctors were supposed to read did not match the students’ test booklets. Specifically, the students’ test booklets had sample questions; however, the proctor scripts said nothing about sample questions. Some students caught this, many did not. This alone could invalidate each math test for grades 3-8 because students were looking at sample questions and the proctor’s instructions to them were to begin testing, thus resulting in the answers for the sample questions being put in the place of non-sample questions in the answer book.

We were made aware of this and provided a supplemental document for proctors to address the sample. However, it is important to note that the design of the test booklet would have made it extremely difficult for students to confuse the sample questions with actual test items:

The sample items were not numbered. They were labeled “Sample A” and “Sample B” immediately after the directions. Each sample item had an answer block immediately below it. On the following page, the correct answers for Sample A and Sample B were shown, with the correct method of completing the answer block. There was a clear STOP sign at the bottom of the page. The following page noted that there was “no test material on this page.” The actual assessment began two pages after the sample questions and then started with number 1, as did the answer document. There was no Sample A or Sample B on the answer document.
Please see graphics below. It is unlikely that the students answered the sample questions on their test documents given all the visual cues in the test booklet that distinguished sample questions from actual test items.

8. Students have reported to their parents that prompts were confusing, using words such as “at” and “by” in inconsistent ways so that students did not know what they were being asked to do. One parent said, “My concern is that some students are dealing with the stress of thinking about the faulty test instead of being able to focus on the actual questions.” 
It may be helpful to remember that all of our questions are vetted by hundreds of Tennessee teachers each year, and that every test question that is operational – or in other words, scored – is field tested with students in the same grade and subject prior to being made an operational test question. Those teachers approve and edit each question for content, appropriateness, bias, and sensitivity, and after students take field test items, the results are thoroughly vetted to ensure the question was understood and is appropriate for students to take.
Certainly, though, the rigor of this year’s test was higher than we have had in the past, and we understand that some students have had anxiety about this increased level of expectation.
9.  In the practice tests, the answers were written in the same sheets as the questions but the actual test had separate sheets. We have had many students report that they were unsure about where they were supposed to answer their questions. This was a major cause of confusion, especially with our elementary school students. 
We know the paper management has been challenging for some of our younger students. When students did not transfer responses from test booklets to answer documents, teachers, proctors, or other adults transferred answers under the same test security provisions as we have for student transcription.
10. Because the test was originally online, the students in our elementary schools were not taught to bubble correctly throughout the year. We had students who circled answers or put checks in the bubbles, or who were not even sure how to answer. This put the students who did not have knowledge of how to properly bubble at a disadvantage when compared to their peers.
We did not receive any reports from Oak Ridge or any other districts regarding students not understanding how to bubble their answers. Only our third grade students would have never taken a paper-based TCAP assessment in a prior year, and for those students, districts may have provided the opportunity to complete a sample answer form prior to the assessment if they needed to practice.
Additionally, all 3-8 students take the science TCAP on paper each year, and students are expected to fill out their form for that assessment each spring.
11. For the MSAA alternative assessment, questions were written in such a way as to ignore the student’s current achievement levels. 
While this is the first year that Tennessee has given the MSAA, it is the second year for the operational MSAA, which has been given in many other states. After the assessment last year, the tests were not only scored, but the questions were again reviewed to determine if they are appropriate and accessible for students who qualify for the alternate assessment. This reviewer group includes special education teachers, parents, speech language pathologists, directors, and test design specialists. There may be questions that feel too difficult because this assessment is designed for all students who qualify for the alternate, including those who are not impacted by their disability as much as others.
Our hope, as we have shared, is for the entire test to be adjusted for student level based on the Learner Characteristics Inventory, and this is not yet entirely possible. This means that Part I will include questions for all levels of students.  After they complete Part I, Part II will adjust and be more reflective of the student’s current skill level. There will still be challenging questions because that is important exposure for all students, but there will be fewer that are a challenge and far more at their level.

The TCAP-Alt Portfolio design was very different and in that model, teachers selected an API that they were confident the student would master. With MSAA, students will see questions they may not know the answer to, and that is not only okay, but expected. This is the same experience all other students have in school. That is part of learning. We expect results ranging from all questions missed to all or almost all correct, and everything in between. This is expected and appropriate. You may have a student that misses all the items, and that is okay because that reflects their current understanding and mastery. That is just an honest reflection of them at this point in time. Congratulate them for trying. With another year of meaningful and rigorous core instruction, they might get more right next year and that will be an awesome celebration.
I want to close by stressing that TNReady is still a valid test. We take that responsibility very seriously because we know if we want parents and teachers – along with the broader education community – to be able to use this data, it needs to be reliable.
The paper forms that were produced contained items and questions that had undergone a rigorous review process – led by Tennessee teachers – and the forms were constructed in advance, as we had always planned a paper back-up option. Though the switch from online to paper-based testing created a number of logistical challenges for administration – and we know those challenges were great – the student experience of paper-based testing was similar to our historical experience.
In our historical technical reports, as well as in this year’s report, we will conduct tests of content and construct validity to ensure the test is statistically sound. In addition, we perform tests of reliability and produce a comparability analysis. Our decision to move to a paper-based assessment was, in part, to ensure that the overwhelming majority of our students experienced the same test conditions, as opposed to the variability that would have come with technical disruptions. We have two full-time psychometricians on our staff to ensure we maintain the integrity of our testing program, and we are confident that the psychometrics, logistics, and design processes we have completed will allow the prudent use of student assessment results from the 2015-16 school year.
I hope this has helped to address some of your concerns, but I also want to reiterate that we are committed to improving our TCAP tests, including TNReady, each year, and I look forward to continuing to work with you and your educators in this work.
Thank you again for your thoughts and for your commitment to high expectations for our kids. Thank you as well for your and your educators’ efforts during this transition. I continue to be proud and grateful to see our educators and leaders go above and beyond every single day.
Dr. Candice McQueen

Commissioner of Education



For more on education politics and policy in Tennessee, follow @TNEdReport