Recoding can increase chunk size

Human channel capacity increases with bits-per-chunk. But we don’t need to rely on the “intrinsic” chunk size of a stimulus. It’s possible to increase the effective chunk size of stimuli by recoding them—that is, mentally regrouping them into chunks representing larger patterns. These chunk schemas are also called Mental representations, after Ericsson and Pool.

For example, when trying to memorize a sequence of binary digits, you can instead memorize them as octal digits, one per group of three bits (e.g. 010 = 2, 101 = 5, etc.), which roughly triples your capacity (Miller, 1956). Chase and Ericsson (1982) used this technique to build a student’s digit span to 80 digits, via hierarchical recoding into 4-decimal-digit chunks.
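The binary-to-octal recoding can be sketched in a few lines. This is only an illustration of the regrouping idea; the function name and the choice to left-pad sequences whose length isn't a multiple of three are assumptions made for the sketch, not part of Miller's description.

```python
def recode_binary_to_octal(bits: str) -> str:
    """Recode a string of binary digits into octal digits, three bits per chunk."""
    if set(bits) - {"0", "1"}:
        raise ValueError("expected only the characters '0' and '1'")
    # Assumption for this sketch: left-pad with zeros to a multiple of three.
    width = -(-len(bits) // 3) * 3  # ceiling to the next multiple of 3
    padded = bits.zfill(width)
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    # Each three-bit group becomes one octal digit (010 -> 2, 101 -> 5, ...).
    return "".join(str(int(group, 2)) for group in groups)


binary = "010101110001"
octal = recode_binary_to_octal(binary)
print(octal)                    # 2561
print(len(binary), len(octal))  # 12 4
```

Twelve items to remember become four, which is where the roughly threefold gain comes from.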

This process is important not just because it helps us remember useful information, but because it’s likely the key to how anyone processes any kind of complex material (in particular, Expertise requires building sophisticated chunk recoding schemes). A pianist initially reads individual notes (C, E, G: ah, a C chord!) but later sees that shape as a single chord (ah, a C major triad). It’s not possible to sight-read music of any real complexity with the former approach.

These patterns (e.g. the shape of a major triad) can only be used as “chunks” once they’re stored in long-term memory.

Chase and Simon - Perception in chess record experimental data suggesting that chess masters use larger chunk sizes (and possibly hierarchical chunk configurations).

References

Chase, W. G., & Ericsson, K. A. (1982). Skill and Working Memory. In G. H. Bower (Ed.), Psychology of Learning and Motivation (Vol. 16, pp. 1–58). Academic Press. https://doi.org/10.1016/S0079-7421(08)60546-0

Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81–97. https://doi.org/10.1037/h0043158 Miller - The magical number seven, plus or minus two

Last updated 2023-07-13.

Expertise requires building sophisticated chunk recoding schemes

In many fields, experts become experts mostly by developing more sophisticated mental representations (Mental representations, after Ericsson and Pool), which amounts to increasing the size of their mental chunks (“Chunks” in human cognition). This increases their information processing capacity (Human channel capacity increases with bits-per-chunk, Recoding can increase chunk size). This happens through practice: Good practice encodes more effective chunk recoding schemes.

What sets expert performers apart from everyone else is the quality and quantity of their mental representations.

(Ericsson and Pool, 2016, p. 62, not well-cited)

For example, the model developed by Simon and Gilmartin (1973) suggests that chess masters have encoded on the order of tens of thousands of chunks. (See also Chase and Simon - Perception in chess.)

Knowledge work often requires solving search problems. Ericsson and Pool suggest that expert search performance comes from more complex chunk schemas (2016, pp. 70–72). The argument’s not made very strongly, but because Human channel capacity increases with bits-per-chunk, this would seem to explain superior culling and feedback-uptake performance.

References

Simon, H. A., & Gilmartin, K. (1973). A simulation of memory for chess positions. Cognitive Psychology, 5(1), 29–46. https://doi.org/10.1016/0010-0285(73)90024-8

Ericsson, A., & Pool, R. (2016). Peak: Secrets from the New Science of Expertise (1st ed.). Eamon Dolan/Houghton Mifflin Harcourt. Peak - Ericsson and Pool

Last updated 2023-07-13.

Human channel capacity increases with bits-per-chunk

One common workaround for Channel capacity of humans as information processors appears to be making a sequence of smaller observations, rather than a single complex absolute judgment. This only works if you can hold the sequence in your head, so it’s limited by your Span of working memory. Happily, Working memory span is mostly independent of item complexity. So you can increase your effective channel capacity by increasing the number of bits in each observed chunk (“Chunks” in human cognition).

In this figure depicting data from Pollack (1953), channel capacity expands almost linearly with bits-per-chunk (Miller, 1956, p. 92).

This effect is still limited by the Span of absolute judgment, so to expand bits-per-chunk beyond 5, you’ll need to make chunks multidimensional (Human channel capacity increases with stimulus dimensionality).
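As a rough illustration of the relationship described above, the sketch below multiplies a fixed working memory span by bits-per-chunk. The span of 7 chunks and the function name are assumptions chosen for illustration, not values taken from Pollack's data, and the idealized line ignores the fact that the real curve is only approximately linear.

```python
ASSUMED_SPAN_CHUNKS = 7  # illustrative constant, not a measured value


def sequence_capacity_bits(bits_per_chunk: float,
                           span_chunks: int = ASSUMED_SPAN_CHUNKS) -> float:
    """Idealized capacity: a constant number of chunks, each carrying more bits."""
    return span_chunks * bits_per_chunk


for bits in (1, 2, 3, 4, 5):
    print(bits, sequence_capacity_bits(bits))
```

If span really were constant, tripling bits-per-chunk would triple the total; the note's claim is that observed performance comes close to this.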

Q. How does human channel capacity vary for a sequence of elements, as the number of bits transmitted in each element increases?
A. It increases roughly linearly.

Q. Why does it matter that the span of working memory is roughly independent of the span of absolute judgment?
A. It suggests that we can hold more information in working memory by increasing the “chunk” size of the items held in memory.

References

Pollack, I. (1953). Assimilation of Sequentially Encoded Information. The American Journal of Psychology, 66(3), 421–435. https://doi.org/10.2307/1418237

Last updated 2023-07-13.

Human channel capacity increases with stimulus dimensionality

For unidimensional stimuli, the Channel capacity of humans as information processors is only a couple bits, but in everyday life, it seems that we routinely reproduce much more complex stimuli than that. One explanation for this discrepancy is that human channel capacity {increases} with the {dimensionality} of the stimulus.

For example, Miller’s analysis (1956, pp. 85–87) of data from Hake and Garner (1951) and Coonan and Klemmer (unpublished communication with Miller) suggests that human channel capacity for points on a line is between {3.2 and 3.9 bits (10-15 categories)}, whereas data from Klemmer and Frick (1953) suggest that human channel capacity for points in a square is about {4.6 bits (~24 categories)}.
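The category counts in parentheses follow from the bit figures, since n bits distinguish 2^n equally likely categories. A minimal check, with a hypothetical helper name:

```python
def categories_for_bits(bits: float) -> float:
    """Number of equally likely categories that `bits` of information can distinguish."""
    return 2 ** bits


for label, bits in [("line, low", 3.2), ("line, high", 3.9), ("square", 4.6)]:
    print(f"{label}: {bits} bits ~ {categories_for_bits(bits):.1f} categories")
```

The results (roughly 9, 15, and 24) are consistent with the rounded category counts quoted above.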

Miller’s figure (1956, p. 88) summarizing the data for independently-varying dimensions:

The channel capacity does not increase linearly with the number of dimensions. In fact, as dimensionality increases, channel capacity for any individual variable reliably {decreases} (Miller, 1956, p. 89), so long as {the number of categories to be judged is greater than the dimensionality} (Eriksen & Hake, 1955, pp. 327–329).

This effect appears to persist even when the added dimensions aren’t independent, e.g. when correlating size, brightness, and hue simultaneously to a single variable, the channel capacity was 4.1 bits, versus 2.7 bits for any isolated attribute (Eriksen, 1955, as aggregated by Miller, 1956, p. 88).

Miller conjectures (1956, p. 91) that this effect asymptotes around 10 dimensions, but there was no evidence at that time. Halford et al. (1998) review the intervening evidence and suggest that the limit is closer to {4}.

References

Eriksen, C. W., & Hake, H. W. (1955). Absolute judgments as a function of stimulus range and number of stimulus and response categories. Journal of Experimental Psychology, 49(5), 323–332. https://doi.org/10.1037/h0044211

Hake, H. W., & Garner, W. R. (1951). The effect of presenting various numbers of discrete steps on scale reading accuracy. Journal of Experimental Psychology, 42(5), 358–366. https://doi.org/10.1037/h0055485

Halford, G. S., Wilson, W. H., & Phillips, S. (1998). Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology. Behavioral and Brain Sciences, 21(6), 803–831. https://doi.org/10.1017/S0140525X98001769

Klemmer, E. T., & Frick, F. C. (1953). Assimilation of information from dot and matrix patterns. Journal of Experimental Psychology, 45(1), 15–19. https://doi.org/10.1037/h0060868

Last updated 2023-07-13.

Channel capacity of humans as information processors

One way to examine the limits of human information processing is to ask how much information a person can reproduce from some stimulus they observe. In this framing, we can model the observer as a communications channel using tools from information theory. This figure (Pollack, 1953, p. 422) depicts the model:

A perfect communications channel could reproduce any input you gave it. In practice, most channels (including humans) produce more errors as inputs contain more information. The behavior is usually asymptotic: a channel transmits its input perfectly until some threshold. Past that threshold, which we call the {channel capacity}, the correlation between outputs and inputs falls, and the total number of bits of transmitted information remains constant.
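One way to make "bits of transmitted information" concrete is to compute the mutual information between stimuli and responses from a table of judgment counts. The sketch below uses the standard information-theoretic definition; the function name and both count tables are made up for illustration and are not data from any of the cited experiments.

```python
from math import log2


def transmitted_information(counts):
    """Mutual information (bits) between stimulus (rows) and response (columns)."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n:  # empty cells contribute nothing
                joint = n / total
                # joint / (p_stimulus * p_response), simplified
                ratio = n * total / (row_sums[i] * col_sums[j])
                bits += joint * log2(ratio)
    return bits


# Hypothetical observer A: 4 stimuli, always identified correctly.
perfect_4 = [[10 if i == j else 0 for j in range(4)] for i in range(4)]

# Hypothetical observer B: 8 stimuli, but each adjacent pair is confused at chance.
confused_8 = [[5 if i // 2 == j // 2 else 0 for j in range(8)] for i in range(8)]

print(transmitted_information(perfect_4))   # 2.0
print(transmitted_information(confused_8))  # 2.0
```

The second table carries 3 bits of input but still transmits only 2, which is the plateau behavior described above: adding input information beyond capacity adds errors rather than transmitted bits.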

Experiments on human Span of absolute judgment can be used to model humans in this way. Miller’s review (1956) of the empirical data suggested that human channel capacity for unidimensional stimuli is about {2.6} bits.

For example, here’s a figure from Miller (1956, p. 83), using experimental data from Pollack (1952, 1953) on human absolute judgment of pitches, reframed with an information-theoretic approach.

Q. If you know a subject’s span of absolute judgment (for single-item, unidimensional magnitudes), how would you find their channel capacity?
A. channel capacity = log2(span of absolute judgment)

Q. Why is the span of absolute judgment related to human channel capacity by a log2 relation?
A. Channel capacity is expressed in bits. If the span of absolute judgment is 8 categories, you need log2(8) bits to represent every state.

References

Pollack, I. (1952). The Information of Elementary Auditory Displays. The Journal of the Acoustical Society of America, 24(6), 745–749. https://doi.org/10.1121/1.1906969

Pollack, I. (1953). Assimilation of Sequentially Encoded Information. The American Journal of Psychology, 66(3), 421–435. https://doi.org/10.2307/1418237

Last updated 2023-07-13.