Review article opening a special issue of Educational Psychology Review on the limitations of the Testing effect with high Element interactivity material. With John Sweller.
The authors point out that almost all the literature on the Testing effect has used quite low-complexity material, even when testing with students in classrooms. But there are studies dating back to 1914 (Kühn) and 1917 (Gates) which show diminished or absent testing effects for material with higher Element interactivity.
The authors suggest a few possible explanations, informed by some of the relevant experiments.
Note that the authors aren’t claiming that testing doesn’t work: few of the cited studies find inverted (negative) testing effects. It’s just that the advantage over re-studying is diminished or absent.
The special issue includes two commentaries criticizing van Gog and Sweller’s conclusions.
Karpicke, J. D., & Aue, W. R. (2015). The testing effect is alive and well with complex materials. Educational Psychology Review, 27, 317–326. https://doi.org/10.1007/s10648-015-9309-3
Karpicke and Aue reply in the same issue with their disagreement:
Finally, they argue that the studies VG+S included as high-element-interactivity are mostly flawed on methodological grounds, but that they did in fact demonstrate small positive testing effects.
Rawson, K. A. (2015). The status of the testing effect for complex materials: Still a winner. Educational Psychology Review, 27. https://doi.org/10.1007/s10648-015-9308-4
She echoes K+A’s criticism that the articles which VG+S cite actually show a small positive effect, not a null effect.
Regarding the prior literature, her conclusions are more moderate than those of K+A: she expresses concern that complex problem solving isn’t well represented in the literature, and notes that the experiments which do exist report somewhat conflicting results. But for text material, she argues, the prior literature is on firmer footing (with feedback or high learning performance, g = 0.73!).
Roelle and Berthold mention van Gog and Sweller’s concerns here. They explain the lack of a testing effect in the cited experiments by pointing out that the restudy and retrieval conditions didn’t involve the same kind of knowledge construction. The restudy learners “might have had an advantage because their task (rereading) made it possible for them to make connections between the target content items, whereas the learners in the retrieval groups, who were instructed to retrieve keywords missing from the content, probably did not engage in this relational processing.” In R+B’s experiments, the tasks are the same, and retrieval is controlled by making study open- vs. closed-book. And they do indeed find a Testing effect.