Desirable difficulties, after Bjork

Training activities can often be made more effective by introducing difficulties.

Some examples which Robert A. Bjork (1994) cites as improving long-term performance in experiments:

  • Shuffling heterogeneous tasks rather than grouping repetitions by task type
  • Varying the parameters of a task
  • Varying the environmental context of learning sessions
  • Creating “contextual interference” which requires more attention: e.g. by providing resources which differ in structural organization
  • Reducing the frequency of feedback (e.g. by providing aggregate feedback every N trials or by reducing the frequency of feedback over time)
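
To make two of these manipulations concrete, here is a minimal sketch of how a practice session might be scheduled. It is my own illustration, not anything from Bjork (1994): the function names (`blocked_schedule`, `interleaved_schedule`, `summary_feedback`) and the representation of trial results as booleans are assumptions chosen for the example.

```python
import random

def blocked_schedule(task_types, reps):
    """Baseline: group all repetitions of each task type together."""
    return [task for task in task_types for _ in range(reps)]

def interleaved_schedule(task_types, reps, seed=0):
    """Same trials as the blocked schedule, but shuffled so task types mix.

    The seed is only there to make the illustration reproducible.
    """
    schedule = blocked_schedule(task_types, reps)
    random.Random(seed).shuffle(schedule)
    return schedule

def summary_feedback(results, every_n):
    """Aggregate feedback: one fraction-correct score per block of
    `every_n` trials, rather than feedback after every trial.

    `results` is a list of booleans (True = correct response).
    """
    scores = []
    for start in range(0, len(results), every_n):
        block = results[start:start + every_n]
        scores.append(sum(block) / len(block))
    return scores
```

For example, `interleaved_schedule(["scales", "chords", "sight-reading"], 4)` contains the same twelve trials as the blocked version, only in mixed order, and `summary_feedback(results, 5)` gives the learner one score per five trials.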

One consistent theme across these manipulations is that they reduce performance during the training activity itself but yield better long-term outcomes. This reversal probably makes some of these manipulations hard to adopt in learning activities, since learners will feel like they aren’t learning as well. Trainers respond to the same signal, as Bjork (1994) notes:

“Such a conditioning process, over time, can act to shift the trainer toward manipulations that increase the rate of correct responding — that make the trainee’s life easier, so to speak. Doing that, of course, will move the trainer away from introducing the types of desirable difficulties summarized in the preceding section.”

Worse, there may be institutional barriers: trainers may be evaluated on immediate (not long-term) performance, or they may never have a chance to observe long-term performance.

Besides this short-term reversal, students may avoid “desirable difficulties” because they believe their memory/understanding to be stronger than it really is. Bjork suggests that “training is frequently non-optimal because it fails to incorporate the variability, delays, uncertainties, and other challenges the learner can be expected to face in a real-world job setting of some kind” (see also Transfer learning).

Q. What counter-intuitive effect does increased difficulty have on performance in training tasks?
A. It typically harms performance during training but improves it in long-term measures.

Q. Why might learners believe that introducing difficulties into practice sessions makes them learn less well?
A. Added difficulties will often harm performance during practice (while increasing long-term performance).

Q. Why might coaches be institutionally disincentivized from adding desirable difficulties to practice activities?
A. They might be evaluated by students’ short-term performance.

Q. Why might trainers be unable to see the long-term impact of difficulties added to training exercises?
A. The training timeline may be too short to show the improvements that more difficult training activities can produce. Alternatively, trainers may never get to see the real-world performance of their pupils.


Bjork, R. A. (1994). Memory and Metamemory Considerations in the Training of Human Beings. In J. Metcalfe & A. Shimamura (Eds.), Metacognition: Knowing about Knowing (pp. 185–205). MIT Press.

Pyc, M. A., & Rawson, K. A. (2009). Testing the retrieval effort hypothesis: Does greater difficulty correctly recalling information lead to higher levels of memory? Journal of Memory and Language, 60(4), 437–447.

* Or is the best study format about retrieval effort (Desirable difficulties, after Bjork)? Is the additional effort involved in the short answer questions the cause of increased memory performance? The data seem to suggest this interpretation, since studying via short answer produced the most reliable retention in any kind of post-test.
* Various experiments suggest that increased retrieval effort enhances later retention; this suggests that the retrieval process itself is at play in the testing effect. See also Desirable difficulties, after Bjork.

  • Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning & Memory, 4(6), 592–604.
* In the discussion, the authors note that the benefit can’t be attributed solely to a depth of processing argument (e.g. Desirable difficulties, after Bjork), since the rhyme rule doesn’t exhibit the effect less strongly than other much more complex rules.