2023-06-30 Patreon letter - Reading comprehension and memory systems

Private copy; not to be shared publicly; part of Patron letters on memory system experiments

One surprising difficulty in “making it easy for people to remember what they read” is that often, for large swaths of the text, people never really knew what was said in the first place.

Long-term memory doesn’t enter the picture here. The problem is that the reader’s eyes skipped like stones across the surface of the page. They never processed those words beyond visual decoding, if that. The ideas were never represented in working memory. We could roll our eyes and chalk this up to “poor reading skills”, but I wouldn’t want to be too dismissive: I suspect most highly-paid knowledge workers routinely fail at basic reading comprehension in this way.

My reading has improved quite a lot as I’ve investigated reading as a research problem. Still, I’m humbled by basic failures, surprisingly often. I enrolled last year in the University of Chicago’s four-year program on the Great Books. It’s great fun: we meet weekly for deep discussions of challenging texts. But in most classes, at least one of the facilitator’s questions will make me realize that I simply hadn’t comprehended what the text said in some important passage—and I hadn’t noticed.

Getting the gist

Can it really be that educated adults so routinely encounter basic reading comprehension problems? Maybe the trouble is that the scenarios I’m thinking about are unrealistically demanding. In my research, I’ve been observing readers’ comprehension with a goal of helping people internalize a text in great detail. My University of Chicago course covers unusually challenging texts from antiquity, and perhaps the discussion questions are more probing than the ones you might ask of “normal” reading.

Besides, some say, most people are just “reading for the gist” anyway. They want the takeaways. They don’t feel the need to really understand—much less memorize—all the fine details of an author’s explanation. Fine; say I accept that for the moment. Well: do people “get the gist”? Are they aware of whether they got it?

In one of the seminal experiments on adult reading comprehension, Michael Pressley and colleagues asked university students to read short SAT-style text passages (188–520 words). They were instructed to read at a pace which would allow them to answer a question about each passage’s contents. Some of the questions were about “the gist”—e.g. to state the main idea, or its primary purpose; others were about details. After students finished reading each passage, they were given a question to answer. Then students were given the opportunity to re-read the passage and change their response, if they thought it might be wrong.

Now here’s the central finding: when readers gave an incorrect answer to a question about “the gist”, they were very unlikely to choose to re-read and try again—they did so only 20%/27% of the time (there were two relevant conditions). And even in those instances where students re-read, only 50%/57% ended up with a correct answer.

Maybe students are just lazy, and they’re choosing not to re-read because this is an artificial experiment? Perhaps, but when the question was about a specific detail in the passage, students with wrong answers did re-read and change their answer 60%/75% of the time. That difference is hard to square with the hypothesis that students just didn’t care about getting the right answer. To explore that hypothesis further, the authors ran a second experiment. In this round, students just rated their confidence in their answers; they didn’t get a chance to re-read and alter them. Students reported high confidence in 60%/64% of wrong answers to “gist” questions! In fact, they were almost as confident in their wrong answers as in their right answers.

The authors also wondered: is comprehension awareness just a matter of high “verbal ability”? Not quite, it turns out. They administered an abbreviated verbal SAT to the same subjects and found a moderate correlation between verbal performance and response accuracy, but no significant correlation with the accuracy of students’ confidence ratings. The authors’ interpretation here is that comprehension awareness is at least partially separate from traditional performance measures. Students who score well on SATs do so because their first attempt is more likely to be right; when they do give a wrong answer, they’re not more likely to notice that than their lower-scoring peers.

The same authors published a different study[1] with a charming title: “Being really, really certain you know the main idea doesn’t mean you do”. They ran another experiment like the ones we’ve discussed, this time asking one group of students to read and re-read each passage as many times as necessary to confidently answer a question about it. Compared to a control group which just read each passage once, the “high-certainty” group took more time and had more confidence… but didn’t perform significantly better.

Now, we shouldn’t conclude that no one “gets the gist” from what they read. Skilled readers do exist—e.g. actively publishing professors are often quite sophisticated—and we can learn plenty by studying their behavior. But I’d guess that most knowledge workers would often exhibit the same comprehension problems that we repeatedly observe[2] in university students.

From “what it says” to “what it means and why”

I’ve been thinking about the problem of reading comprehension since my work with Alex a few months ago, but another recent experience forced it to the front of my mind.

Dwarkesh Patel is the host of The Lunar Society, a podcast focused on interviews with scientists and domain experts. Dwarkesh differentiates himself by asking probing, well-researched questions of his guests, going beyond the usual shallow conversations. He had a few physicists scheduled to come on the show in the coming months, so he decided to embark on a serious study of physics to prepare himself. In conversation about how he might do that most effectively, I suggested that Dwarkesh might enjoy watching me study the quantum mechanics book he’d already started reading. I’d verbalize my thought process, and he could pester me with questions[3].

Dwarkesh was quite surprised by my approach to the book. I moved at a pace of about fifteen minutes per page, while he had been spending a few minutes or less on each page. More importantly, I was constantly asking questions of the text and of myself. Some examples:

  • What does this sentence mean? Can I explain it in my own words?
  • Which ideas are particularly important here?
  • The author clearly thinks I should see why this claim is true—so why is it true?
  • The author’s emphasizing this detail—so why is it important?
  • The author seems to be setting up a contrast here—so what is it, exactly?
  • How does this detail relate to my prior knowledge in physics?
  • If I hide all but the beginning of this worked example, can I produce the rest myself?
  • I made a mistake a moment ago—do I understand why? Can I explain my misapprehension?
  • And of course: can I simply recall what was said on the previous page?

These questions will sound familiar to scholarly readers. In How to Read a Book, Adler and van Doren suggest that the essence of reading for understanding is asking questions of the book, and trying to answer them. An undemanding reader “asks no questions—and gets no answers.” I was being demanding, in fairly ordinary ways.

Now, Dwarkesh is a sophisticated, motivated thinker with a university education and a job which demands piles of careful reading. Yet these strategies—these ways of interrogating a text—were startlingly new to him. Not only had he not asked these questions while he read the text, and not only had he not fully understood the meaning of many of the phrases I was interrogating, but he hadn’t realized that he hadn’t understood what the author meant in those phrases. He wasn’t making a conscious choice not to dig deeper into those sentences (say, for the sake of time). Rather, it just wasn’t salient that he was making an implicit choice as a reader about how deep to go.

I don’t think this is unusual: I’d guess that most knowledge workers (particularly in STEM) read this way most of the time, unaware of the tradeoffs they’re making. I certainly did before I started my research on learning.

Adler and van Doren give us one over-simplified contrast between “reading for information” and “reading for understanding”: it’s the difference between being able to say what the author says (information), and being able to say what he means and why he says it (understanding). In the previous section, we focused mostly on problems with knowing what the author says. We looked at experiments where students were asked to state the text’s main idea, not to make sophisticated inferences. By contrast, Dwarkesh was missing details around meaning and implication.

These deeper levels are tougher to access because they demand that readers go beyond what’s printed on the page. If a sentence makes no sense to you, the words themselves will trip you up, so long as you’re paying attention. On the other hand, if you understand what a sentence says but don’t grasp its implications for the author’s explanation, the literal words won’t necessarily trigger confusion. You’ll only notice if some voice in your head is continuously demanding answers to questions like “how does this part fit into the whole?” The question isn’t on the page. The answer usually isn’t, either, at least overtly.

Interactions between reading comprehension and memory systems

So: reading comprehension is a bigger problem than many people expect, particularly at deeper levels of understanding. One implication is that if my goal is to help people reliably internalize difficult texts—not just what they say but what they mean and why—a direct focus on memory may put the cart before the horse.

I tried to help Alex by giving him a memory system stocked with all the important details from the physics chapters he was studying. But his review sessions were often unpleasant ordeals: for many prompts, he found the expected response confusing, or didn’t see why it mattered, or felt like he was parroting the answer without really understanding. Review sessions took longer, felt harder, and delivered less benefit than I’d expected. Understandably, Alex developed a somewhat aversive relationship with memory practice.

I’m increasingly inclined to see these issues as rooted in reading comprehension. When Alex found a prompt’s answer confusing, I think he also would have found it confusing immediately after reading the relevant explanation in the text. So these problems were probably not caused by forgetting, except insofar as Alex might have understood the reading less well because weak long-term memory of prerequisite concepts produced excess cognitive load as he read. That is, if you have unreliable recall for foundational details about electric fields, you’ll have trouble understanding explanations of Gauss’s law in the next chapter. But we had problems with prompts about the first chapter, so this can’t fully explain what’s going on.

Very naively, we might say: first, understand the material; then, we’ll ensure you remember it. These are separate problems. Piotr Wozniak, the creator of SuperMemo, suggests as much. This is a good simplifying heuristic, but memory system prompts have—or can have—a more complicated relationship with the process of understanding.

In-text questions promote understanding

In mnemonic texts like Quantum Country, we interleave retrieval practice directly into the text, so that every few minutes of reading, you pause to answer questions about what you just read. Readers told us that this embedded practice dramatically altered the way they read. Prior research on “adjunct questions” in texts has isolated four distinct effects:

  1. specific backward effects: actively recalling information makes it more likely that you’ll be able to recall that information in the future
  2. general backward effects: the questions induce mental review and deeper processing of the surrounding text and related ideas; poor performance may cause you to re-read the text[4]
  3. specific forward effects: in the subsequent text, you’ll be more attentive to the kinds of things the questions ask about; you’ll have better recall for related later material[5]
  4. general forward effects: you’ll pay more attention in general, including to unrelated material; you may read more slowly and carefully if you learn you performed poorly

My understanding is that all of these effects occur, at least to some extent, even if you didn’t understand the material very well. In fact, encountering the embedded questions may reveal to you that you don’t understand the material and hence cause you to understand the material (e.g. by re-reading more carefully). These effects suggest a more complex model than a linear “understand, then remember” process.

Specific strategies aside, reading comprehension is largely about self-regulation. How quickly should you read? What should you focus on? What kinds of questions should you be asking of the text? How well is your current behavior producing the results you want? It’s hard to answer metacognitive questions like these while your mind is occupied with difficult material, especially if that’s not a habit you’ve already built. Embedded practice partially outsources the asking and answering of these questions. The prompts model (a certain type of) “successful” reading behavior, offer feedback, and create a natural pause for reflection and integration.

Review sessions promote understanding

Let’s look at the review sessions which occur over the days and weeks following your initial reading.

If you find a prompt utterly baffling at this stage, then recall practice is probably not going to help your understanding, except insofar as it causes you to re-read the relevant text. But this is an unpleasant way to discover that you didn’t understand. You’re not sitting in front of the book anymore; it may not be easily accessible at all; you must either interrupt review to re-read (awkward), or flag the concept for later study (unreliable). It’s too tempting to just mark the prompt as “forgotten” and move on. (There’s no button for “I don’t understand this”!) When I was working with Alex, I hadn’t embedded the questions into the text, as we did in the mnemonic medium, so this was often his experience—stumbling on comprehension issues days later.

But for less extreme examples, review sessions offer a good opportunity to understand more deeply. The first time you answered the question, while reading the text, the idea was still raw in your mind. But when you answer it again a few days or weeks later, you’ll probably look at it somewhat differently. Maybe you’ve read more material which depended on this idea, or you’ve used it to solve problems, or it’s come up in a conversation. Some of those experiences will re-surface alongside the original detail. They may help you notice new connections. Even when I’m the one writing the prompts, I often realize some important aspect of what the author means only after several rounds of review.

Retrieval enables activities which promote understanding

Say that you’ve just read about Gauss’s law. You feel you understand what the author is saying. You can explain it to another person. You follow how it’s used in the examples given, and see how it relates to Coulomb’s law. In other words, you know “what the author means and (some of) why he is saying it.” In the linear understand-then-memorize model, you’ve finished part one.

But then you try to use Gauss’s law in a simple practice exercise, and you find that you struggle. You’re constantly flipping back to look at the definition and examples. If you could solve this problem, and half a dozen more, you’d understand Gauss’s law much more deeply. For example, you’d viscerally grasp the consequences of the dot product inside the surface integral. But your wobbly memory of the material is making it hard to solve the exercise.
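
For concreteness, here’s the standard integral form of Gauss’s law, included just for reference (this is the textbook-standard statement, not a quotation from the particular book in question):

    \oint_S \mathbf{E} \cdot d\mathbf{A} = \frac{Q_{\text{enc}}}{\varepsilon_0}

The dot product inside that surface integral is exactly the kind of detail I mean: it says that only the component of the field normal to the surface contributes to the flux, a consequence you mostly come to feel by using the law in problems.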

If you reviewed the relevant prompts over a few days, you’d consolidate those details into higher-level chunks, and you’d build relevant connections that would help you retrieve the right details at the right time during problem solving. After you solved a handful of practice problems, you’d build automaticity for some of the problem-solving processes surrounding Gauss’s law, which would make it easier to solve more problems, which in turn would facilitate more understanding.

In this story, we invert the understand-then-memorize model: memorizing helps you create certain kinds of understanding.

Prompt-writing promotes understanding

So far, I’ve adopted the frame of the mnemonic medium, in which an expert writes memory system prompts for you, the reader. But if you’re willing and able to write your own memory prompts, that process can have a profound effect on your understanding.

The process is very demanding. To write good prompts, you must constantly ask: Which parts of this text are crucial, and which not? Can I restate this idea in my own words? How does this idea connect to other things I know about? How does this connect to my interests? Can I find boundary conditions I should write prompts about? Can I generate examples to use in a practice question? Can I observe anything important about the author’s mental model or motivations? These prompt-generating questions overlap substantially with the set of questions one must ask to read for understanding. And in both cases, you’ll often find that you can’t answer the questions, because you haven’t read carefully enough. That gives you some of the feedback you need to regulate your reading.

The questions in the last paragraph are ones you’d ask of the text, but as you better understand the natural “grain” of the prompt-writing medium, you’ll notice that you also ask questions of your prompts. You’ll look at a prompt you’ve drafted and think: does this distill the heart of the idea? You’ll try to polish prompts: are any of these details unnecessary, removable? You’ll wonder: have I captured all the important connections? You’ve written a prompt stating that something is the case; so now you naturally ask: can you write one about why it is the case? And in answering these questions, you’ll end up with a sharper picture of the material.

What’s odd about all this is that ostensibly, you’re making these prompts so that you can remember these details later. But in many cases, the process of creating the prompts may have much more impact on your learning experience than any subsequent memory practice. Maybe you could just throw out the prompts afterwards—treat them like a structured note-taking method. But Michael Nielsen points out that the downstream practice has a powerful motivational role. Note-taking can sometimes feel like abstract homework. You know you’re probably never going to look at the notes again, and that can erode your motivation. But if you’ve had experiences where memory practice has helped you effortlessly retain ideas in great detail, you start to attach a kind of automatic value to writing new prompts about interesting ideas. You adopt the belief: “Prompts do good things for me.” You know you’re going to see these things again. You have confidence that you’re going to durably remember all the details you’re writing about, so the process feels more real, more purposeful[6]. That’s potentially true even if many of the prompts are actually just “chaff”, kindling you needed to write and later discard. You used them to understand the material, and to compose a few prompts that you actually care to review.

Of course, this is a tremendously effortful and difficult process. That’s roughly why the mnemonic medium automates the prompt-writing process away completely. Some scaffolded middle ground might be interesting territory to explore.

Implications for my research

I opened with one simple way to frame my goal: “making it easy for people to remember what they read.” But in the context of learning and explanatory texts, I’m really more interested in “making it easy for people to internalize and make use of complex ideas.” It’s certainly a more interesting goal. But it’s also scope creep. Last month I wrote about how memory systems might want to expand into problem-solving practice. Here I’m writing about the possible need to expand into scaffolding reading comprehension.

It’s not obvious that a broader scope is a good idea. In fact, it’s probably a terrible idea. Countless entire careers have been spent on recall, problem-solving, and reading comprehension individually. So I’m not really planning to bite off that whole problem. But there are obvious interactions between these problems. By thinking hard at the unusual point of their intersection, I may end up with a more powerful solution to some subset of that space than I would by fixating on, say, memory alone.

Comprehension-centric in-text questions

Earlier, we discussed how the mnemonic medium’s embedded questions can help improve reading comprehension. But it’s worth noting that we weren’t exactly trying to do that. Speaking just for myself—Michael’s design goals may have differed—I was trying to test and reinforce readers’ memory for the specific material in the prompts, and I was creating an on-ramp for subsequent practice. We talked about the notion of giving people “feedback” while they read, and of modeling what good memory prompts looked like, but we didn’t adopt an overt frame of systematically facilitating reading comprehension.

It’s interesting to ask: if your primary goal were to enhance reading comprehension, what design would you produce? For example, you might not need to ask nearly so many questions to get the same metacognitive benefits. Maybe it would be better to save the detailed retrieval practice for the next day’s review session; instead, you might ask a higher-level interpretation question or two. The point wouldn’t be to remember the answer—the answer wouldn’t be in the text at all. Instead, such questions would aim to promote deeper processing of the text and reflection on it. For recall prompts, our design discourages looking back at the source text; but for these comprehension prompts, we’d want the interface to encourage scrolling back up for another read.

There’s evidence that these kinds of questions can help reading comprehension in controlled experimental settings. But I’m not terribly excited about this direction. Lots of books have interpretive questions like this already; they don’t seem to have a powerful impact; what’s different about my proposal? For that matter, lots of textbooks have retrieval-practice-like review questions embedded in the text. What makes the mnemonic medium different is that when you answer a question which appears in a mnemonic essay, you’re setting yourself up to remember that answer forever.

I find that I mostly hate answering comprehension-oriented questions in books. They often feel condescending, boring; like unpleasant homework. I don’t care about answering them, and (rightly or wrongly) I generally don’t feel that doing so will help me in a way I care about.

Discussion questions

Interpretive questions in textbook exercise listings? Nope; boring; I don’t care. But when I show up at my University of Chicago class, the facilitator asks me interpretive questions, and I find I want to answer them. Sometimes that’s because they’re unusually interesting, but often they’re simple questions like: “What justification does Aristotle give for X? Do you believe it?”

I think the difference is mostly about social context. A real person I respect is asking the question; they’re going to genuinely engage with my answer; other students might build on my answer or have interestingly different answers; the facilitator will connect our answers to subsequent questions; etc. It’s also partially about the framing of my activity. I’ve shown up to have a discussion about the book, so that’s what I’m going to do. I’m “discussing”, not “answering boring comprehension questions.”

Of course, these simple “discussion” questions routinely reveal that I’d failed to understand what the author was saying. I want to make clear: I have trouble with reading comprehension too! In the scenario with Dwarkesh earlier, he saw me in “super careful expert reader mode”. I was “on” because I knew that’s what he wanted to see, and the social setting sharpened my engagement. And when I’m in “carefully processing a text into prompts mode”, my comprehension is solid. But in less extreme scenarios, I slip up pretty regularly. I’d love to have reading augmentation which helps me make sure that I’m understanding a text as deeply as I intend to—so long as that augmentation isn’t too burdensome.

These discussions really do reinforce my comprehension, albeit very slowly, and with spotty coverage. And I can’t easily orchestrate a well-facilitated discussion for everything I read. The technologist’s snap answer here is: use large language models! Have a bot ask me questions and give me feedback about my answers. Maybe it could work in some future where I have an ongoing relationship with the bot across time, but for the time being, I find these kinds of interactions leave me totally cold. I just don’t care about answering the bot’s question; I know that it doesn’t really “want” to know the answer; it doesn’t “care” about my answer; my answer isn’t “meaningful” to it; etc.

One more promising alternative might lie in something closer to “elaborative vocal rehearsal”. It’s a pretty simple practice: after you finish a passage, close your eyes and explain it back in your own words. This reinforces basic comprehension in a similar way to question-answering, but it has a very different emotional feel. In particular, it doesn’t make me feel like I’m answering a boring question I don’t care about. There are some interesting opportunities for augmentation here. For example, the text could highlight important details which you didn’t include in your explanation. It could scribble over a realtime transcript of your explanation, circling bits which seem to conflict with the text. Maybe lightweight badges could reflect that you captured what the text means but not why it matters.

Very pragmatically, there’s an important problem to be solved: it’s extremely unpleasant to be asked to remember details you never understood in the first place. One way to avoid that is to facilitate comprehension, as we’ve been discussing; another way is to avoid asking questions about non-comprehended material. Perhaps elaborative vocal rehearsals offer a way to orchestrate the latter: we could only ask you to remember details you included in your explanation.
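
To make the ideas in the last two paragraphs concrete, here’s a deliberately naive sketch of how a rehearsal-checking system might triage details. Everything in it is my illustration, not a description of any real system: I’m assuming the important details have been pre-marked as key phrases, that the reader’s explanation arrives as a plain transcript, and that crude word overlap can stand in for the semantic matching a real implementation would need (embeddings, or a language model):

    # Toy sketch: after an elaborative vocal rehearsal, decide which
    # author-marked details the reader's explanation covered. Missing
    # details could be highlighted in the text; covered details could
    # seed memory prompts. All names and heuristics here are hypothetical.

    def covered(detail: str, transcript: str, threshold: float = 0.5) -> bool:
        """Crude proxy for "the reader mentioned this detail": what fraction
        of the detail's content words appear in the transcript?"""
        stopwords = {"the", "a", "an", "of", "to", "is", "in", "and", "by"}
        words = [w for w in detail.lower().split() if w not in stopwords]
        transcript_words = set(transcript.lower().split())
        hits = sum(1 for w in words if w in transcript_words)
        return hits / max(len(words), 1) >= threshold

    def triage_rehearsal(key_details: list[str], transcript: str):
        """Split key details into those the explanation covered and those it missed."""
        included = [d for d in key_details if covered(d, transcript)]
        missing = [d for d in key_details if not covered(d, transcript)]
        return included, missing

    # Example: a reader explains a passage on Gauss's law in their own words.
    details = [
        "flux through a closed surface",
        "charge enclosed by the surface",
        "dot product selects the normal component of the field",
    ]
    transcript = ("so the total flux through any closed surface just "
                  "depends on the charge enclosed inside it")
    included, missing = triage_rehearsal(details, transcript)
    print("Eligible for memory prompts:", included)  # first two details
    print("Highlight for re-reading:", missing)      # the dot-product detail

In this sketch, the review system would then draw prompts only from the “included” list, so you’re never drilled on a detail your own explanation never touched.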

Open-book practice

My ideal memory system would not only reinforce my recall of an idea, but actually deepen my understanding of it over time. We’ve discussed a few ways it might do that: scaffolded problem-solving practice, reflection prompts, synthesis prompts, etc. These tasks aim to produce understanding from within—that is, by solving a certain kind of problem, you’ll acquire a certain kind of insight.

But I could probably also deepen my understanding by returning to the text two weeks later, with reinforced memory and a fresh perspective. I might better grasp the significance of one of the author’s points, or see some connection I missed the first time through. One limitation of the review session modality as it exists today is that it sits apart from the text; it basically assumes you “got” everything on your first read through the text. If you didn’t, it must be a simple failure of memory. And the review session can’t really include a probing question which would send you back through the text for a new interpretation.

I can imagine designing an alternative review session interface around “open-book” review. Imagine that a prompt floats along one edge of the screen, while the rest is given over to the original text. Structured, scaffolded, purposeful re-reading could be integrated directly into the experience. And it seems that open-book tests can produce effects on long-term memory comparable to those of closed-book tests.

On its surface, this suggestion seems quite similar to the integrated comprehension questions I suggested (and dismissed) earlier. But I have two important differences in mind. First, I’m imagining that these re-reading prompts wouldn’t be simple comprehension questions—they’d be probing questions which would encourage the reader to see the text in a new way, or to make connections with subsequent material. Second, I think it matters that you’d be encountering the question in the context of a review session. You’ve signed up to practice, and here’s a practice question. When you’re reading, you might feel that you’ve signed up to read, not answer review questions.


As should be obvious, I’m still figuring out what I’d like to do with my recent observations about reading comprehension and problem-solving practice. I suspect there’s some powerful synthesis with memory systems, waiting to be discovered.


Thanks to Michael Nielsen and Russel Simmons for helpful conversations on this topic. Thanks also to Dwarkesh Patel and Alex for sharing their learning experiences.

Footnotes

[1] Thanks to Gwern for finding the fulltext here and for documenting that search as a case study.

[2] For more, see the bibliographic litany in this review, page 9, first paragraph.

[3] Video of this should be available in the coming weeks, alongside a more traditional interview.

[4] The latter claim isn’t discussed in Hamaker’s review, but I’ve heard lots of reports like this from mnemonic medium readers. Interestingly, I haven’t yet found studies exhibiting this effect in the adjunct question or reading comprehension literature. Probably I just haven’t yet found the right term of art.

[5] The latter claim isn’t discussed in Hamaker’s review, and it’s not as well studied as some of the other effects I’ve mentioned, but it’s usually called “test-potentiated learning” or the “forward testing effect”. See e.g. Arnold and McDermott, “Test-Potentiated Learning: Distinguishing Between Direct and Indirect Effects of Tests” (2013).

[6] This belief-building process is part of why the experience of “just parroting back answers” is so harmful. It’s not just that it feels like you’re wasting time in that moment, or that you’re not understanding the thing you want to understand. It (rightly) undermines your belief in the value of memory practice.

Last updated 2023-07-13.