Does studying QCVC help readers understand the subsequent Search essay?

It’s hard to answer that question with the data we have, but I got curious on 2020-03-24 and took a simple swing.

I decided to compare readers who had reached only the in-text level in QCVC before starting Search with readers who had reached (at least) the 5-day level in QCVC before starting Search. For both groups, I looked at accuracy on the in-text questions while reading the Search essay.

  • 25th / 50th / 75th percentile accuracies on in-text Search questions (among readers who answered at least 25 of the 37 Search questions):
    • QCVC in-text level completed before Search: 78% / 87% / 94% (N=158)
    • QCVC 5-day level or higher completed before Search: 83% / 90% / 97% (N=25)
    • Query: reproduced at the end of this note
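
As a rough illustration of the summary above, here is a minimal Python sketch of the per-reader quartile computation. It is not the BigQuery query that produced these numbers. The function name, the input shape, and the example counts are all assumptions made up for illustration, and the 25-question filter is applied to a plain answered count rather than to distinct cards.

```python
from statistics import quantiles

MIN_ANSWERED = 25  # threshold from the note: at least 25 of the 37 Search questions


def accuracy_quartiles(readers):
    """Return ((p25, p50, p75), n) for per-reader accuracy.

    `readers` is a list of (remembered_count, answered_count) pairs,
    one pair per reader. Readers below MIN_ANSWERED are dropped.
    """
    accuracies = [
        remembered / answered
        for remembered, answered in readers
        if answered >= MIN_ANSWERED
    ]
    # n=4 yields the three cut points between quartiles.
    p25, p50, p75 = quantiles(accuracies, n=4, method="inclusive")
    return (p25, p50, p75), len(accuracies)


# Made-up counts, purely to exercise the function; these are NOT the study's data.
example_readers = [(20, 25), (27, 30), (33, 37), (35, 37), (10, 12)]
qs, n = accuracy_quartiles(example_readers)
print(f"{qs[0]:.0%} / {qs[1]:.0%} / {qs[2]:.0%} (N={n})")
```

Running this once per group (in-text level only, 5-day level or higher) would give one summary line per group in the same shape as the bullets above. Exact percentile values can differ slightly from BigQuery's `APPROX_QUANTILES`, which is approximate by design.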

This probably isn’t just a selection effect. Certainly, reaching the 5-day level selects for diligence, but in this case that criterion also slightly selects against enthusiasm. Readers who are super-thrilled about the material are likely to read the essays back-to-back, without waiting to reach the 5-day level in QCVC first.


==Update== on 2021-03-03:

  • QCVC in-text level completed before Search: 80% / 88% / 94% (N=192)
  • QCVC 5-day level or higher completed before Search: 82% / 89% / 97% (N=34)

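Eh. Not terribly persuasive: the gap between the groups is now only 1 to 3 points at each quartile, and the 5-day group is still small (N=34).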

Query:

WITH

-- All reviews of Search cards.
searchReviews AS (
  SELECT userID, cardID, reviewMarking, r.timestamp AS timestamp, essayName, sessionID, isRetry
  FROM `logs.reviews` AS r
  JOIN `logs.latestEssaysCards` USING (cardID)
  WHERE essayName = "search"),

-- Readers who answered at least 25 distinct Search cards, with the time of their first Search review.
searchReaders AS (
  SELECT userID, COUNT(DISTINCT cardID) AS cardCount, MIN(timestamp) AS firstSearchTimestamp
  FROM searchReviews
  GROUP BY userID
  HAVING cardCount >= 25),

-- Whether each Search reader's first QCVC review predates their first Search review.
-- (Not used by the final SELECT.)
usersHadReadQCVC AS (
  SELECT userID, ANY_VALUE(firstSearchTimestamp) AS firstSearchTimestamp, MIN(r.timestamp) < ANY_VALUE(s.firstSearchTimestamp) AS hadReadQCVC
  FROM searchReaders AS s
  JOIN `logs.reviews` AS r USING (userID)
  JOIN `logs.latestEssaysCards` USING (cardID)
  WHERE essayName = "qcvc"
  GROUP BY userID),

-- Highest QCVC level each Search reader attained before starting Search.
usersHadReached5DaysInQCVC AS (
  SELECT userID, MAX(level) AS maxLevel
  FROM `logs.levelAttainmentEvents`
  JOIN searchReaders AS s USING (userID)
  WHERE essayName = "qcvc" AND studyTimestamp < s.firstSearchTimestamp
  GROUP BY userID),

-- Highest session number logged before starting Search.
-- (Used only by the commented-out variant just below.)
sessionsCompletedBeforeSearch AS (
  SELECT userID, MAX(sessionNumber) AS sessionsBeforeSearch
  FROM `logs.compliance` AS compliance
  JOIN searchReaders USING (userID)
  WHERE studyTimestamp < firstSearchTimestamp
  GROUP BY userID
  HAVING MAX(compliance.cardCount) >= 70),

-- Variant: group by sessions completed before Search instead of by QCVC level.
-- accuracySamples AS (SELECT ANY_VALUE(sessionsBeforeSearch) AS sessionsBeforeSearch, COUNTIF(reviewMarking="remembered") AS rememberedCount, COUNT(*) AS answeredCount FROM searchReviews JOIN sessionsCompletedBeforeSearch USING (userID) WHERE sessionID IS NULL AND isRetry IS NOT TRUE GROUP BY userID)

-- SELECT APPROX_QUANTILES(rememberedCount/answeredCount, 4), sessionsBeforeSearch, COUNT(*) AS N FROM accuracySamples GROUP BY sessionsBeforeSearch ORDER BY sessionsBeforeSearch ASC

-- Per-reader accuracy on Search questions answered outside a review session
-- (sessionID IS NULL, presumably the in-text questions), excluding retries.
accuracySamples AS (
  SELECT ANY_VALUE(maxLevel) AS maxLevel, COUNTIF(reviewMarking = "remembered") AS rememberedCount, COUNT(*) AS answeredCount
  FROM searchReviews
  JOIN usersHadReached5DaysInQCVC USING (userID)
  WHERE sessionID IS NULL AND isRetry IS NOT TRUE
  GROUP BY userID)

-- APPROX_QUANTILES(x, 4) returns five values: min, 25th, 50th, and 75th percentiles, max.
SELECT APPROX_QUANTILES(rememberedCount/answeredCount, 4), maxLevel, COUNT(*) AS N
FROM accuracySamples
GROUP BY maxLevel
ORDER BY maxLevel ASC

/*accuracies AS (SELECT SUM(rememberedCount)/SUM(answeredCount) AS accuracy, maxLevel, COUNT(*) AS userCount, SUM(answeredCount) AS N FROM accuracySamples GROUP BY maxLevel HAVING userCount >= 10),

  cis AS (
  SELECT
    *,
    1.96 * SQRT((accuracy * (1 - accuracy)) / N) AS CI95
  FROM
    accuracies)
SELECT
  *,
  accuracy - CI95 AS lower,
  accuracy + CI95 AS upper
FROM
  cis
ORDER BY
maxLevel ASC

*/
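
The commented-out variant above pools all answers per level and attaches a normal-approximation 95% interval, `1.96 * SQRT(accuracy * (1 - accuracy) / N)`. A minimal Python sketch of that same arithmetic follows. The function name and the totals passed to it are hypothetical and chosen only to show the calculation, not taken from the logs.

```python
import math

Z_95 = 1.96  # same constant as in the commented-out query


def pooled_accuracy_ci(remembered_total, answered_total):
    """Pooled accuracy with a normal-approximation 95% interval.

    Mirrors the query's formula:
    1.96 * SQRT(accuracy * (1 - accuracy) / N), with N = total answers.
    Returns (accuracy, lower, upper).
    """
    accuracy = remembered_total / answered_total
    half_width = Z_95 * math.sqrt(accuracy * (1 - accuracy) / answered_total)
    return accuracy, accuracy - half_width, accuracy + half_width


# Hypothetical totals for one level, chosen only to illustrate the arithmetic.
accuracy, lower, upper = pooled_accuracy_ci(remembered_total=890, answered_total=1000)
print(f"accuracy={accuracy:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

One caveat about this approach: it treats every answer as an independent trial, but answers from the same reader are correlated, so the resulting intervals are probably narrower than they should be.
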
Last updated 2023-07-13.