A program that will distinguish where -- and when -- education practice works.
We interviewed Ben Motz and Emily Fyfe about their new program, ManyClasses, which they call a "collaborative research project investigating the generalizability of educational interventions in real classrooms."
Why Did You Start ManyClasses?
About five years ago, we ran an extensive research study on student learning at IU. The study was embedded in 18 sections of Introductory Psychology across three semesters, with over 2000 undergraduate student participants. It was a huge undertaking, and to our surprise, the results conflicted with conventional wisdom (exam performance was better when students chose blocked study sequences instead of interleaved sequences).
This conflict revealed a paradox. Anytime a research study is embedded in a single course, and the results conflict with current theory, there are two possible explanations for the conflict: Either the theory is wrong, or the theory just doesn’t apply to the idiosyncratic context where the study was embedded (which, in our case, was a lesson about measures of central tendency in Introductory Psychology at Indiana University Bloomington).
There’s no way of disentangling these two possibilities when a study is embedded in only one context. Unless the study is embedded in many contexts (or in other words: in many classes), there’s no way to make generalizable inferences from an embedded experiment.
This isn’t just an issue for researchers — generalizability also presents a challenge for those who want to promote the science of learning, or to translate learning theory into practice.
'What works in education' is not a universal truth. Even though our brains all work in very similar ways, learning in practice is context-dependent. A learning strategy might be beneficial in some situations, but not in others.
While we were working through these issues, we became inspired by the scale and success of the first Many Labs project for measuring variability in laboratory research outcomes. We started thinking that classroom research on the science of learning should also scale up in a way that captures variability in instructional practices, student populations, learning materials, and so on.
What’s the Goal of ManyClasses?
Our overarching goal is to observe something that's never been directly observed before: How consistent are the benefits of educational practice across different classes? And if there’s variability in these benefits, can we identify characteristics of the students, the class, or the institution that might systematically explain this variability?
In this way, the results of a ManyClasses study will not only tell us whether a practice works but also where it works. We hope our research contributes to more precise theories in the science of learning and more effective recommendations for translating theory into classrooms.
The first ManyClasses study is being executed right now across three dozen college classes spanning five institutions. It focuses on the timing of feedback, and specifically, whether immediate feedback (seeing what you got right and wrong immediately after an assessment is submitted) versus delayed feedback (seeing what you got right and wrong several days after an assessment is submitted) improves authentic student learning outcomes.
Most people would probably assume that immediate feedback is better, but recent psychology research suggests otherwise. This is an ideal contrast for our first study because feedback is common to practically all learning environments, instructors must (at least implicitly) decide when to give feedback to their students, and there are opposing theoretical recommendations for what works best.
How Does the Program Work, Briefly?
ManyClasses is an embedded learning experiment with several unique features. First, the experiment is carried out in dozens of classes at the same time. Second, the classes are vastly different from each other. And third, the experiment involves manipulation of original class materials and measurement of authentic learning outcomes, for more realistic estimates of the experimental effects in each class. Combined, these features mean that there is a common experiment happening in many classes, but the ways it gets implemented and the context in which it occurs are vastly different.
We began by gaining approval from several universities (all members of the Unizin Consortium) and distributing a call for applications from instructors. We met individually with each applicant to make sure that the experiment would work within the normal conduct of their class. Next, we implemented the experiment in each instructor’s Canvas course site by randomly assigning some quizzes to have feedback released immediately and other quizzes to have feedback released several days later.
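The random assignment described above can be illustrated with a minimal sketch. This is not the ManyClasses team's actual procedure: the function name, the quiz identifiers, and the choice of an even split between conditions are all assumptions made for illustration, and the interview does not say whether assignment was balanced or how it was carried out inside Canvas.

```python
import random


def assign_feedback_timing(quiz_ids, seed=None):
    """Randomly split quizzes between two feedback conditions.

    Hypothetical helper for illustration only. It assumes an even split:
    half the quizzes get "immediate" feedback and the rest get "delayed".
    With an odd number of quizzes, the extra one lands in "delayed".
    """
    rng = random.Random(seed)      # a fixed seed makes the assignment reproducible
    shuffled = list(quiz_ids)      # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        quiz: ("immediate" if position < half else "delayed")
        for position, quiz in enumerate(shuffled)
    }


if __name__ == "__main__":
    # Hypothetical quiz names for a single course site.
    quizzes = [f"quiz_{n}" for n in range(1, 9)]
    assignment = assign_feedback_timing(quizzes, seed=1)
    for quiz in quizzes:
        print(quiz, assignment[quiz])
```

Shuffling and then splitting keeps the two conditions the same size within a class, which is one common way to randomize. Assigning each quiz independently by coin flip would be another, at the cost of possibly uneven groups.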
From the students’ perspective, nothing changes about how they complete these quizzes: they take them as they usually would, and they receive immediate feedback on some quizzes and delayed feedback on others. Students were informed (in class and in their syllabus) about the research study, and they had the option to consent to share their data with the researchers. At the end of the semester, we will gather information about student outcomes (e.g., their performance on in-class exams) and run our preregistered data analyses on pooled data.
In addition to the two of us (Ben and Emily), the ManyClasses research team also includes Josh de Leeuw, Paulo Carvalho, and Rob Goldstone.
What’s Been the Biggest Surprise So Far?
Ben: When we were getting ready to launch the project, we reached out to David Mellor (at the Center for Open Science) for advice. He replied:
“Plan twice as much time as you think you'll need, and make sure there are folks dedicated to coordinating and communicating.”
At the time, I remember rolling my eyes, thinking, “How hard can it be? We just implement an experiment in a few dozen course sites!” But holy moly, David was right.
We were lucky to have top-level contacts at other Unizin institutions who were eager to support the study. Still, the approval process was more time- and labor-intensive than anyone on the research team could’ve anticipated. We’re talking dozens of meetings, including lawyers, privacy and security officers, and IT administrators. Unlike clinical trials, for example, where there are transparent processes for multi-institutional distributed research, no established procedure exists for approving research across institutional boundaries in college classrooms. We hope that our efforts with ManyClasses help pave the way for future collaborative research initiatives on the science of learning to be more streamlined.
Emily: Similar to Ben, I’ve been most surprised by the hurdles we’ve encountered in getting this project off the ground. But on a more positive note, I’ve been pleasantly surprised by the enthusiasm and gratitude of the participating instructors.
As researchers, it’s easy for us to get excited about our project, to believe in its merits, and to assume that the contributions to theory and practice will far outweigh the current difficulties of running the experiment. However, these difficulties are not trivial.
And we realize that may be especially true for the participating instructors, who are modifying their course materials, troubleshooting technical challenges as they arise, and addressing student questions and concerns.
Despite these issues, the instructors have expressed remarkable support for the research team, gratitude for being included in this new large-scale operation, and avid interest in the research findings. Their enthusiasm is contagious and certainly keeps us motivated to continue conducting high-quality collaborative research on the science of learning.
The Learning Curve publishes articles about how people learn. Please reach out with any ideas for articles on the research on learning.