Spaced repetition memory systems make you feel like your memory is worse than it really is

One challenge for the adoption and day-to-day experience of a Spaced repetition memory system is that you will spend almost all of your review time on material you find difficult to remember. Easily remembered material rapidly accelerates to intervals of many months, so you’ll spend little time on it compared to a prompt that’s stuck at intervals of days or weeks. Your expected accuracy rate across your entire collection may be 95+%, but your accuracy rate in a given session is likely to be much lower.
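
To make the selection effect concrete, here’s a minimal simulation sketch. It assumes a toy “double the interval on success, reset on failure” schedule and made-up per-review recall probabilities; neither is drawn from any real system’s scheduler. The point is just that hard prompts come due far more often, so the accuracy you experience trial-by-trial sits well below the collection-wide expectation:

```python
import random

random.seed(0)

# A hypothetical collection: the exact numbers are assumptions, chosen only
# to illustrate the selection effect. Most prompts are easy (97% recall per
# review); a minority are hard (70% recall per review).
prompts = (
    [{"p_recall": 0.97, "interval": 1, "due": 0} for _ in range(180)]
    + [{"p_recall": 0.70, "interval": 1, "due": 0} for _ in range(20)]
)

trials = []
for day in range(365):
    for prompt in prompts:
        if prompt["due"] > day:
            continue  # not scheduled today
        success = random.random() < prompt["p_recall"]
        trials.append(success)
        # Toy schedule: double the interval on success, reset on failure.
        prompt["interval"] = prompt["interval"] * 2 if success else 1
        prompt["due"] = day + prompt["interval"]

collection_rate = sum(p["p_recall"] for p in prompts) / len(prompts)
session_rate = sum(trials) / len(trials)
print(f"expected accuracy across the collection: {collection_rate:.0%}")
print(f"accuracy across the trials you actually see: {session_rate:.0%}")
```

With these assumptions, the twenty hard prompts come due over and over while the easy ones recede to long intervals, so they account for a disproportionate share of all review trials, dragging the per-trial accuracy well below the collection-wide figure.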

The effect is magnified if you add many questions in a batch, then don’t add any more for a while. You’re likely to have several sessions composed almost exclusively of questions from that batch which you’ve struggled to answer. (This is an issue in user onboarding for the Mnemonic medium.)

This phenomenon leads to the mistaken perception that spaced repetition doesn’t “work” very well. And sensibly so! The only direct feedback students get about the efficacy of the process is their ability (or lack thereof) to answer the questions asked, and in a spaced repetition system, that feedback will be fairly mixed.

Perhaps this can be partially resolved through information presented at the end of review sessions, but I worry that venue will have a less potent impact: it’s (presumably) just something people will read, as opposed to feedback they experience.

Another solution may lie in regularly adding new material, which will at least shift the composition of review sessions toward material with probably-higher accuracy rates. Related: Spaced repetition review sessions often become boring and detached without a steady stream of new prompts

Another approach altogether would be to add a “hints” functionality or something similar, which could allow users to succeed more often (hopefully without sacrificing impact on retention); see Vaughn and Kornell (2019):

In other words, difficult test trials were like black coffee (revolting) while test trials with hints were like coffee with milk and sweetener (heavenly).

See Desirable difficulties, after Bjork

Some empirical examples

Robert A. Bjork (1994) writes:

Baddeley and Longman (1978), for example, found that British postal workers who were taught a keyboard skill under massed-practice (and less efficient) conditions actually were more satisfied with their training than were workers taught under spaced-practice (and more efficient) conditions.

Nate Kornell (2009) found in a study of GRE-type vocabulary:

Across experiments, spacing was more effective than massing for 90% of the participants, yet after the first study session, 72% of the participants believed that massing had been more effective than spacing.
…participants predicted that final test accuracy—which was 31 percentage points higher in the spaced condition than the massed condition—would be 14 percentage points lower in the spaced condition than the massed condition.
… Thus spacing can reduce performance levels during learning and simultaneously enhance long-term learning. When people make the mistake of assuming that short-term performance equals long-term learning—which they often do (Bjork, 1994, 1999)—they may convince themselves that as a study strategy, massing is more effective than spacing.


References

Bjork, R. A. (1994). Memory and Metamemory Considerations in the Training of Human Beings. In J. Metcalfe & A. Shimamura (Eds.), Metacognition: Knowing about Knowing (pp. 185–205). MIT Press.

Kornell, N. (2009). Optimising learning using flashcards: Spacing is more effective than cramming. Applied Cognitive Psychology, 23(9), 1297–1317. https://doi.org/10.1002/acp.1537

Vaughn, K. E., & Kornell, N. (2019). How to activate students’ natural desire to test themselves. Cognitive Research: Principles and Implications, 4(1), 35.

Last updated 2023-07-13.