Shuffling questions didn’t change reader accuracy (on first two repetitions)

We worried that by shuffling questions in Quantum Country review sessions, we would reveal that readers’ memories weren’t nearly as good as they seemed.

We now have enough data to compare the performance of shuffled vs. non-shuffled users on their first two post-initial-read repetitions. The percentage of questions answered correctly (excluding retries, and with no attempt to control for different users having answered different numbers of questions):

First repetition:
- Shuffled: 83±0.80% (N=8417)
- Control: 81±1.3% (N=3303)

Second repetition:
- Shuffled: 87±1.4% (N=2281)
- Control: 87±2.2% (N=871)
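
For reference, the ± figures above are consistent with a 95% normal-approximation interval for a proportion, 1.96 × sqrt(p(1−p)/N). That's an assumption about how they were computed, not something the query confirms, and it treats each of the N responses as independent, which ignores the per-user clustering noted above. A minimal sketch that reproduces the margins from the rounded percentages (`margin_95` and `results` are names made up for this example):

```python
import math

def margin_95(p, n):
    """Half-width of a 95% normal-approximation interval for a proportion.

    Assumes n independent responses; per-user clustering is ignored.
    """
    return 1.96 * math.sqrt(p * (1 - p) / n)

# (label, accuracy, N) copied from the rounded figures in this note
results = [
    ("First repetition, shuffled", 0.83, 8417),
    ("First repetition, control", 0.81, 3303),
    ("Second repetition, shuffled", 0.87, 2281),
    ("Second repetition, control", 0.87, 871),
]

for label, p, n in results:
    print(f"{label}: {p:.0%} ± {margin_95(p, n):.2%} (N={n})")
```

Because the inputs are already rounded to whole percentages, this can only confirm that the margins are plausible, not recover the exact underlying counts.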

These numbers actually suggest that readers got slightly more accurate with shuffling, but I don’t really believe that. There’s enough covariance here that I’ll abstain from any faux-serious analysis and interpret this as “no meaningful change.”

Having seen this, I’ll now end the experiment and convert all users to the new shuffled condition.