More than measuring sticks, tests can be powerful tools for improving learning.
Extended school closures due to COVID-19 have renewed calls to reduce or eliminate testing on the grounds that tests displace learning. While these calls have frequently targeted federally required assessments, in recent years various groups and organizations have criticized a wide range of tests, including local assessments. As a reminder of how precious instructional time is, the COVID-19 crisis revitalizes concerns about the value of all types of tests.
As a novice teacher, I too was skeptical about testing. Like most teachers, I received little assessment training before or after entering the classroom. But that didn’t matter: I knew my students. What else could a test offer? It turns out, quite a bit. I have spent the better part of the last two decades working with students, teachers, and state departments of education; conducting research; and thinking hard about how tests are used. Today, I believe that improving student outcomes requires not just better tests, but better use of tests.
In educational settings, tests are mostly used as assessment tools following learning, typically to (1) evaluate learning or (2) inform subsequent learning and teaching. Substantial evidence shows that tests used for these purposes are associated with improved student outcomes. In addition, a considerable body of research demonstrates the benefits of tests for strengthening memory when used as learning events. In the following sections, I provide an overview of the different purposes for which tests may be used, namely, to evaluate, inform, or promote learning, and address the challenges, benefits, and research evidence associated with each. This overview is followed by a set of recommendations for how to make better use of tests given each purpose.
Testing to Evaluate Learning
The most high-profile (and polarizing) tests are those used to evaluate learning. These tests are ubiquitous in the form of standards-based summative assessments administered at the end of a course, program, or school year. The No Child Left Behind (NCLB) Act (2001) significantly increased the use of tests to evaluate learning, mandating that states assess students’ reading and math achievement annually in grades 3–8 and once in high school, and disaggregate the results by race and wealth. NCLB simultaneously increased the stakes associated with tests of learning, subjecting low-performing schools to a series of sanctions up to the loss of federal funding and school closure. NCLB was replaced by the Every Student Succeeds Act (ESSA) in the 2017-18 school year; ESSA maintains the requirements for annual testing to evaluate learning and subgroup reporting, removes the more punitive consequences, and offers some flexibility for states willing to pilot “innovative” assessments.
Educational equity has been a cornerstone of standardized assessments since the Elementary and Secondary Education Act (1965) was passed with the goal of providing additional resources to underserved student populations. In addition to promoting educational equity and opportunity, results of summative tests can be used to assess return on investments in education and compare the performance of students and schools. To inform such high-stakes decisions, these tests must meet exacting technical standards. Further, they are designed to sample a wide range of content in order to generalize about students’ performance over an extended period of instruction (e.g., a course). This means that while summative tests can inform reliable and valid achievement-based decisions about students or systems, they are too removed from daily teaching and learning to provide diagnostic information or inform instruction.
Critics of standardized tests point to research showing that test-based accountability programs have been associated with several unintended consequences, including narrowing curriculum and instruction to focus on tested content, delaying or denying special education to eligible students, “strategic” teacher placement, and teacher turnover. However, focusing solely on the misuse of tests of learning ignores what they get right. In addition to highlighting achievement gaps, summative tests have produced modest improvements in student achievement, particularly as part of school-based accountability programs. Recently, researchers have even found direct learning gains attributed to the exercise of responding to test questions on large-scale tests.
Testing to Inform Learning
Tests to inform learning are commonly referred to as formative assessments. Formative assessment engages teachers and students in a process in which evidence of learning is first collected and then applied to improve teaching and learning. This process has two objectives. The first is to provide feedback to reduce the gap between what students currently know and can do and a desired goal. The second is to help students self-regulate their learning. Self-regulated learning (SRL) is a process whereby students establish learning goals and then monitor, regulate, and control their cognition, motivation, and behavior to pursue those goals.
Feedback is often thought to be teachers’ responsibility. However, the most valuable formative assessment practices position students to generate (and consume) their own feedback. For example, self-assessment is a formative assessment process in which students reflect on and evaluate (based on goals and/or criteria) the products and processes of their learning. Similarly, peer assessment is an arrangement in which students consider one another’s products or performances and provide feedback. Self- and peer-assessment practices both increase feedback and scaffold, or support, students’ self-regulated learning, helping students become more aware of how they think and learn.
The biggest challenge associated with formative assessment is developing tests that accord with the objectives of providing effective feedback and engaging students in their learning. Classroom formative assessments are also subject to various threats to validity, such as scoring bias. As formative assessment historically has been a teacher-centric activity, teachers may be reluctant to shift their practices to promote students’ involvement in assessment. However, there are promising solutions to these challenges (supporting teachers in professional learning communities is one), and if they are addressed, formative assessments can increase the frequency with which feedback is available to students (for adjusting strategies) and teachers (for adjusting instruction), and allow students to assume greater responsibility for their learning. Self- and peer assessment in particular afford more timely and individualized feedback than teachers can provide. These practices also enhance students’ monitoring and regulation of what they know and don’t know; self-direction; and learning through social interaction, all needed for success outside school.
Several meta-analyses have shown an overall positive impact of formative assessment on student achievement. This is unsurprising given the evidence showing feedback to be one of the most powerful influencers of student achievement. Researchers have further found that using formative assessment to support self-assessment enhances its effectiveness. Self-assessment interventions have shown positive effects on students’ use of SRL strategies, and both self- and peer-assessment are associated with improved performance.
Testing to Promote Learning
The most underutilized and arguably underappreciated use of tests is to promote learning. Regular practice tests, quizzes, and other activities that require students to bring content to mind strengthen memory and improve learning. This is because the very act of bringing information to mind (retrieving it from memory) makes that information more likely to be remembered in the future and improves retention of related information. Using tests as practice tools aligns with constructivist views of learning, which emphasize the learner’s role in building and transforming knowledge through individual and social activities. But what differentiates testing used in this manner from other learning strategies? Cognitive science research has shown that the more you make complex and difficult decisions when thinking about a topic, and the more you are able to mentally connect the topic to what you already know, the more likely the information is to be stored in long-term memory. Tests are ideal for providing students with retrieval practice opportunities to strengthen memory and foster long-term learning.
There are some possible pitfalls when using testing to promote learning. Effective retrieval practice is not as simple as questioning individual students during a lesson. It requires effort on the part of teachers to plan activities that engage all students in the class. In some situations, tests can impede learning (this can be counteracted by providing students with explanatory feedback). Also, not all students will benefit from retrieval practice. If these pitfalls are minimized or avoided, tests used as learning events will help students learn more and forget less than other types of study opportunities. The act of retrieval practice stimulates mental connections to what students already know. In addition, students can use cues based on how readily information comes to mind when retrieving to better judge their knowledge. In these ways, frequent testing can improve students’ ability to monitor and regulate their knowledge.
Rigorous research studies investigating how tests used to strengthen memory affect long-term learning have reported positive results. Compared to techniques like lecturing, note taking, and restudying, retrieval practice produces greater transfer of learning across different contexts. Since retrieval practice increases understanding, its benefits are not limited to memory of factual information; it can improve knowledge and application of complex content. A recent meta-analysis confirmed that the benefits of retrieval practice apply across multiple-choice and short-answer practice tests; to primary, secondary, and postsecondary students; and in laboratory and classroom settings. As an added benefit, students report that classroom-based retrieval practice programs even reduce test anxiety.
Making Better Use of Tests
By all accounts, U.S. students are subject to excessive testing. Ironically, federally required assessments have received the brunt of criticism regarding testing time, when, in fact, students spend the most time taking locally mandated (and often lower-quality) interim and benchmark assessments. At the state level, improving the use of tests to evaluate learning begins with ensuring that standards-based summative assessments maintain high technical standards for quality and assess deeper learning. Though the “next generation” of summative assessments offered the promise of assessing higher-level skills using essays and other performance tasks, most states have since adopted shorter, narrower versions of these tests that eliminate many of the questions requiring students to construct responses. This approach trades short-term gain (shaving a few minutes off testing time) for long-term pain (using tests divorced from good teaching and learning); therefore, state decision makers should revisit the costs and benefits of shorter, narrower tests, particularly as they relate to the teaching of higher-level skills critical for college and careers. At the district level, policy makers can improve the use of tests to evaluate learning by reviewing the portfolio of tests administered to students annually in each grade. Tests that are poorly aligned to standards, redundant, and/or technically inadequate should be eliminated, so that state and local assessments form a cohesive, balanced assessment system.
Many tests currently being sold and administered in the name of formative assessment are unsuited to provide formative feedback or foster self-regulated learning. Such tests, often called interim, benchmark, or diagnostic assessments, are used to inform decisions at the class, school, or district level, and are administered periodically following a school- or district-controlled schedule. While these tests can provide some formative information, they are less instructionally relevant than true formative assessment practices, which can be tailored to individual students’ needs and occur regularly and seamlessly with instruction. Making better use of tests to inform learning starts with ensuring that teachers’ assessment practices increase feedback and promote student involvement in assessment. Here are some specific recommendations:
Teachers use a variety of activities to help students practice what they are learning, such as questioning during lessons, homework assignments, and quizzes and tests. However, these activities are often used as assessment tools rather than learning strategies. More effective use of tests to promote learning requires increasing the frequency of retrieval practice opportunities, providing feedback after each, and removing any stakes (e.g., grades) associated with these activities. Here are some specific recommendations:
Tests can be powerful tools for learning. Used appropriately, they can ensure access to educational opportunities, improve achievement, empower students to regulate their learning, and strengthen memory. It’s time to stop contrasting testing with learning and viewing tests as assessment tools exclusively, when testing can be one of the most powerful learning activities. There is no more important time than the present to better leverage tests to ensure the needs of all students are being met and to implement evidence-based practices to improve learning.
--By Corey Palermo. Palermo is Executive Vice President & Chief Strategy Officer at Measurement Incorporated
The Learning Curve publishes articles about how people learn. Please reach out with any ideas for articles on the research on learning