Spaced Repetition in 2026: What Actually Works for Language Learners

Last updated: May 3, 2026

You've probably heard spaced repetition described as the closest thing language learning has to a cheat code. That framing sets the wrong expectation. Spaced repetition is a scheduling technique, and on its own it does not teach you a language. What it does, very well, is keep vocabulary and grammar patterns you've already encountered from falling out of your head. This article walks through what the research actually says, how modern algorithms like FSRS-6 and SM-20 differ from the 1980s-era SM-2, and how to fold all of it into an immersion-based routine without drowning in reviews.

What Spaced Repetition Actually Is

The core finding is almost a century old. In 1939, Herbert F. Spitzer tested 3,605 sixth-graders across Iowa on their retention of short factual articles about peanuts and bamboo, and published the results in the Journal of Educational Psychology. Students who reviewed the material at spaced intervals remembered far more than students who crammed. The effect has been replicated relentlessly since. Cepeda and colleagues' 2006 meta-analysis in Psychological Bulletin pulled together 839 effect-size contrasts from 317 experiments across 184 articles, and found the same thing every time: distributing practice beats massing it.

The mechanism behind this is the forgetting curve, first sketched by Hermann Ebbinghaus in the 1880s and replicated in 2015 by Murre and Dros at the University of Amsterdam (one subject, 70 hours of testing, curve closely matching the original, plus a small bump at 24 hours that the authors attribute to sleep consolidation). Memories decay predictably. Each time you successfully retrieve a memory just before it would have faded, the decay slope gets shallower. Spaced repetition software (SRS) is simply bookkeeping: it tracks when each fact is about to be forgotten and surfaces it at that moment.
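The bookkeeping rests on a simple model. Here is a minimal sketch of a forgetting curve; the exponential form and the "recall falls to 90% after `stability` days" convention are illustrative assumptions, not the exact formula any particular app uses:

```python
def recall_probability(days_elapsed: float, stability: float) -> float:
    # Exponential forgetting curve. "Stability" is defined here as
    # the number of days after which recall probability decays to 90%.
    return 0.9 ** (days_elapsed / stability)

# A successful retrieval raises stability, so the same elapsed time
# costs less recall afterwards: the decay slope gets shallower.
```

An SRS, in effect, schedules each card's next review for the day this value is about to cross the retention target.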

One related idea worth separating out is the testing effect. Roediger and Karpicke's 2006 study at Washington University in St. Louis found that after one week, students who practiced active retrieval recalled roughly 60% of material, versus roughly 40% for students who only re-read it. This is why SRS uses flashcards rather than passive re-reading. The retrieval itself is part of the medicine.

The Algorithms: SM-2, FSRS-6, and SM-20

Piotr Woźniak wrote the first computerized spaced repetition algorithm, SM-2, in Turbo Pascal 3.0 over sixteen evenings in late 1987. He placed the full description in the public domain as an appendix to his 1990 master's thesis, where he reported memorizing 10,255 English vocabulary items in the first year of using it, at 41 minutes per day, with 92% retention (excluding items in their first three weeks). That algorithm, essentially unchanged, is still what most SRS apps fall back to.

SM-2 has well-known problems. In Anki's implementation, every time you press "Again" on a card, its ease factor drops by 20 percentage points. Cards that you fail a few times in a row get pinned at the 130% ease floor, where their intervals barely grow. The Anki community calls this "ease hell." You end up seeing the same stubborn cards every other day forever.
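The dynamic is easy to see numerically. The 20-point penalty and 130% floor below match Anki's documented behavior; the helper function itself is just an illustration:

```python
def ease_after_failures(ease: float, failures: int) -> float:
    # Each "Again" press subtracts 20 percentage points from the
    # ease factor, floored at 130% (Anki's SM-2 behavior).
    for _ in range(failures):
        ease = max(1.30, ease - 0.20)
    return ease

# From the default 250% ease, six failures pin the card at the
# 130% floor, where intervals grow by only 1.3x per review.
```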

FSRS (Free Spaced Repetition Scheduler) addresses this directly. FSRS-6 shipped in late 2025 and was trained on roughly 700 million reviews contributed by about 20,000 volunteer Anki users. At a matched 90% retention target, it reduces the number of reviews required by 20 to 30% compared to SM-2. As of Anki 23.10 and later, FSRS is available as the scheduler, and it's the default for new profiles on recent installs. If you're still on SM-2 out of inertia, switching is one of the higher-leverage moves you can make.

SuperMemo has continued its own line of development. SM-20, announced in 2026, is the first SuperMemo version where all scheduling parameters are computed by machine learning rather than hand-tuned heuristics. The SuperMemo API launched in early access on March 31, 2026, exposing SM-20 to developers with free usage limits of 100 repetitions per day and a one-time import of up to 10,000 historical repetitions. Expect to see SM-20 showing up in third-party tools over the next year.

The practical takeaway for a language learner: you do not need to care deeply about which algorithm your app uses, as long as it's FSRS or something newer. The differences between FSRS and SM-20 are small compared to the difference between using any modern scheduler and not doing spaced review at all.

How FSRS Actually Differs Under the Hood

It helps to understand, at a high level, why FSRS behaves better than SM-2 in practice. SM-2 has two state variables per card: an interval and an ease factor. It updates them with a fixed formula regardless of your own review history or the specific card's difficulty. FSRS uses a three-component memory model borrowed from the cognitive science literature: stability (how slowly the memory decays), difficulty (how hard the item is for you specifically), and retrievability (the probability you'd recall it right now). These are updated with parameters fit to your own review log once you have around 1,000 reviews on the books.

The concrete consequences:

  • Cards you find easy graduate to multi-month intervals quickly. Under SM-2, an easy card might sit at 45-day intervals for a year before reaching 90 days. Under FSRS-6 at 90% retention, the same card could jump to 6 months after four or five successful reviews.
  • Cards you fail don't get catastrophically penalized. FSRS recalculates stability based on what the failure actually tells it, rather than slashing an ease factor by a fixed 20%.
  • The retention knob is honest. If you set FSRS to 90%, your measured long-run retention on mature cards will land within a couple of points of 90%. SM-2 has no equivalent dial; you get whatever retention its fixed heuristics produce, which is typically higher than needed and costs you extra reviews.

For most learners this translates into roughly a quarter fewer daily reviews for the same recall, which compounds over the years you'll actually be using the system.
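The three-component idea can be sketched in code. This is an illustrative toy, not the fitted FSRS-6 formulas: the exponential curve, the 90%-at-stability convention, and the interval solver are all assumptions for demonstration.

```python
import math
from dataclasses import dataclass

@dataclass
class CardState:
    stability: float   # days for recall to decay to ~90%
    difficulty: float  # 1 (easy) .. 10 (hard), fit per item

def retrievability(state: CardState, days_elapsed: float) -> float:
    # Probability of recalling the card right now.
    return 0.9 ** (days_elapsed / state.stability)

def next_interval(state: CardState, target_retention: float = 0.9) -> float:
    # Schedule the next review for when retrievability is predicted
    # to fall to the target. This is the "honest retention knob":
    # a lower target means a longer interval for the same card.
    return state.stability * math.log(target_retention) / math.log(0.9)
```

With a 90% target the interval equals the stability by construction; dropping the target to 85% stretches the same card's interval by about 54%.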

Optimal Intervals (and Why the Defaults Are Usually Fine)

The most useful concrete finding on intervals comes from Cepeda et al.'s 2008 follow-up, which tested over 1,350 participants and found that the optimal gap between study sessions is roughly 10 to 20% of the desired retention interval. If you want to remember a word for a year, the gap between successful reviews should eventually stretch to about 5 to 10 weeks. If you want to remember it for a week, gaps of a day or so are appropriate early on.
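As arithmetic, that rule of thumb looks like this (the helper and its default fraction are illustrative, not from the paper):

```python
def optimal_gap_days(retention_goal_days: float,
                     fraction: float = 0.15) -> float:
    # Cepeda et al. (2008): the optimal gap between sessions is
    # roughly 10-20% of the interval you want to retain over.
    return retention_goal_days * fraction

# Remember for a year -> gaps eventually around 36-73 days,
# i.e. roughly 5 to 10 weeks:
year_lo = optimal_gap_days(365, 0.10)
year_hi = optimal_gap_days(365, 0.20)
```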

Modern schedulers handle this math for you. What you control is the retention target. FSRS lets you pick it: 90% is the default, 95% means shorter intervals and more daily reviews, 85% means longer intervals and more forgetting. For a language learner reviewing 20 to 40 new cards a day, 90% is almost always the right setting. Pushing to 95% roughly doubles your review load for marginal retention gain. Dropping to 85% saves time but you'll fail more cards, which feels worse and can sap motivation.

A few patterns that show up in practice:

  • New cards per day matters more than interval tuning. Twenty new cards a day is sustainable long-term. Fifty is sustainable for maybe two months before the review queue buries you.
  • Leeches deserve surgery, not more reviews. If a card has failed eight or ten times, the card itself is the problem. Rewrite it, add an example sentence from real content, or delete it.
  • Missing a day is fine. Missing a week is expensive. FSRS handles a skipped day gracefully. A week off means a review queue three to four times its normal size when you come back.

Where SRS Fits in an Immersion Routine

Here's where the method gets misused. Learners hear about spaced repetition, download Anki, grind through a pre-made 10,000-word deck, and wonder why they still can't understand a podcast. The reason is that an SRS card of a word you've never heard in context is close to useless. You're memorizing a gloss, not a word.

The version that actually works looks more like this. You watch an episode of Terrace House or read a chapter of Kiki's Delivery Service. You hit a word you don't know, look it up, and if it seems useful you make a card from the exact sentence you saw it in. The card has the sentence on the front, the target word bolded, and the audio if available. The back has the definition and a screenshot or image. When the card comes up in review two days later, you're not recalling an abstract dictionary entry. You're recalling the scene.

This is the sentence-mining approach, and it's been the standard among serious immersion learners for over a decade. For a broader treatment of how to structure a full routine around this, How to Actually Learn a Language walks through the daily schedule. For Japanese specifically, which has its own quirks around kanji and pitch accent, How to Learn Japanese Practically covers what to mine and when.

A reasonable daily split for an intermediate learner:

  • 60 to 90 minutes of immersion (video, audio, reading) with lookups.
  • 15 to 25 minutes of SRS review, mostly cards mined from that immersion.
  • 5 to 10 new cards added per day, chosen from words that showed up in content you cared about.

Notice the ratio. SRS is maybe 20% of your time. The content is the meal. Flashcards are the vitamins.

Beyond Vocabulary: Spacing Grammar, Listening, and Output

Spaced repetition isn't only for words. Grammar patterns respond to the same treatment when you card them as example sentences. A card for the Japanese ~てしまう pattern might look like:

宿題を忘れてしまった。 (I ended up forgetting my homework.)

Front: the sentence with audio. Back: the English gloss and a one-line note on what ~てしまう conveys (completion, regret, unintended result). You're not memorizing a rule. You're memorizing a concrete instance, and the rule falls out of seeing enough instances.

Listening deserves its own spaced approach. One trick: when a card's audio gives you trouble, don't just grade it and move on. Shadow the sentence three or four times before clicking through. This layers pronunciation practice onto your review queue without adding a separate study block. For a deeper treatment of the technique, Shadowing as a Learning Technique gets into the mechanics.

Output (speaking and writing) is the one area where SRS helps least directly. You can card production prompts ("how do you say X?"), but the real gains come from conversation practice. Use SRS to keep vocabulary ready for retrieval, then use that vocabulary in actual exchanges.

Card Design: Small Choices That Compound

The difference between a deck that feels maintainable at year three and a deck you abandon at month four is almost entirely about card design. A few principles that hold up across languages:

  • One target per card. If the front contains more than one unknown word, the card is testing two things at once, and you can't tell which one caused the failure. Mine the first unknown word from the sentence, then revisit the sentence later if a second word is still blocking you.
  • Cloze deletion for grammar, recognition for vocabulary. For vocabulary in context, put the full sentence on the front and let your eye find the bolded word. For grammar patterns, blank out the pattern itself and force yourself to reconstruct it.
  • Audio on the front, not the back. If you're learning a spoken language, hearing the sentence should be the prompt, not the reward. This builds listening retrieval, which is what you actually need in conversation.
  • Images over translations when possible. A screenshot of the scene where a word appeared anchors the card in episodic memory. An English gloss anchors it in translation. The first transfers to real comprehension; the second often doesn't.
  • Keep the back short. Two lines maximum. If you need a paragraph of notes to explain the card, the card is too ambitious. Split it.

A small test: pick ten of your oldest cards and ask whether you remember where each sentence came from. If you can picture the scene or the page, the card is doing its job. If the sentence feels context-free, it probably is, and it will leech.

Cultural and Language-Specific Wrinkles

SRS advice tends to be written as if all languages were the same. They aren't, and a few common situations deserve calling out:

  • Japanese and Chinese kanji/hanzi. Some learners run a separate deck for individual characters with keyword mnemonics, alongside their vocabulary deck. This can work, but it frontloads abstract memorization before you've seen the characters in words. The alternative, learning characters as they appear inside mined sentences, is slower at first but tends to stick better because each character arrives with its real phonetic and semantic neighbors.
  • Tonal languages. For Mandarin, Thai, Vietnamese, and Cantonese, audio on the front of the card isn't optional. A word written without its tone is a different word than the same word heard with its tone. Learners who rely on pinyin or romanization end up with cards they can "read" but can't hear or say.
  • Languages with heavy inflection. In Russian, Finnish, or Turkish, a single dictionary form expands into dozens of surface forms. Mine the form you actually encountered, not the lemma. You'll see the other forms in other sentences, and the paradigm will assemble itself from examples.
  • Cultural references inside sentences. A mined card that depends on knowing who a Japanese comedian is, or what a specific holiday involves, will fail in a way that has nothing to do with vocabulary. Either add a one-line cultural note to the back, or pick a different sentence. The goal is a card that tests language, not trivia.
  • Script-learning stages. For Arabic, Korean, Hindi, or Georgian, the first few weeks of SRS should focus on the script itself before any vocabulary deck starts. Trying to learn new words in a script you can't yet decode means every card is really two cards, and the failure rate spikes.

Worked Example: A Week of Mining from One Show

To make this concrete, here's what a week of realistic SRS usage looks like for an upper-beginner Spanish learner watching La Casa de Papel.

Monday: 45 minutes of one episode. Seven lookups, four of which become cards. Two minutes per card to set up front, back, audio, and screenshot. Total SRS time: 8 minutes adding, 12 minutes reviewing older cards. 20 minutes of SRS, 45 minutes of content.

Wednesday: Another episode. Six new cards. The Monday cards reappear for their first review. Two of them are graded "Good," one "Hard," one "Again." The one you failed had two new words in it, which is the lesson: next time, pick a simpler sentence.

Friday: Lighter day. No new episode, just reviews. 25 minutes total. Monday's cards are now at 6-day intervals, Wednesday's at 3 days.

Sunday: One more episode, five new cards. Review queue is around 35 cards. Everything fits in 30 minutes.

Over the week: four episodes watched, 22 new cards added, roughly 140 reviews completed. Compare this to a learner doing 50 new pre-made cards a day, who ends the week with 350 cards and zero episodes watched. The mined-card learner has worse raw numbers and better actual comprehension.

Common Failure Modes

A few ways this goes wrong, ranked by how often we see them:

  • All deck, no content. Someone does 200 reviews a day for six months and has never watched a single episode of TV in their target language. The vocabulary slowly slides back out because it was never anchored to anything.
  • Cards too dense. Front of card has four new words, three grammar points, and a cultural reference. You fail it. You fail it again. You hate the card. One target word or pattern per card is the rule.
  • Ignoring leeches. See above. Kill or rewrite cards that fail repeatedly.
  • Chasing retention percentages. A 98% retention rate on 3,000 cards means nothing if you can't follow a conversation. The metric that matters is "can I understand more this month than last month."
  • Switching algorithms every two weeks. Pick FSRS with default settings, use it for three months, then evaluate. Algorithm tourism is a form of procrastination.
  • Importing a 10,000-card deck at the start. The cards aren't yours. The sentences came from someone else's life. Motivation collapses around week three, and the deck becomes a monument to intentions.
  • Grading dishonestly. Pressing "Good" when you actually paused and half-guessed corrupts the scheduler's model of your memory. If you hesitated, it was "Hard" at best. The algorithm is only as honest as your grading.
  • Reviewing in batches instead of daily. Doing all your reviews Sunday evening defeats the point. The schedule assumes cards come due on the day they come due. Batched reviewing means half your cards are reviewed too early and half too late.

The underlying point: SRS is scheduling. It schedules whatever you put into it. If you put in cards mined from content you love, it keeps that content accessible. If you put in a context-free word list, it keeps a context-free word list half-memorized.

Frequently Asked Questions

How many cards a day should I add as a beginner?

Five to ten. The review load compounds: ten new cards today means roughly forty reviews due in a week and seventy in a month once cards start reappearing at mixed intervals. Beginners routinely set thirty or fifty new cards a day, feel fine for two weeks, and then quit when the review queue hits 300. Start small, feel the load settle, and raise the number only when reviews fit comfortably into your day.
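The compounding is easy to model crudely. This sketch assumes every review succeeds and uses made-up fixed gaps (1, 3, 7, 14, 30 days), so it understates real queues, which also contain failed cards:

```python
def reviews_due_on(day: int, new_per_day: int = 10,
                   gaps=(1, 3, 7, 14, 30)) -> int:
    # Each card added on day d comes due after the cumulative gaps:
    # day d+1, d+4, d+11, d+25, d+55.
    offsets, total = [], 0
    for g in gaps:
        total += g
        offsets.append(total)
    return sum(new_per_day
               for start in range(day)
               for off in offsets
               if start + off == day)
```

Even in this optimistic model, day 30's queue holds reviews from four earlier batches at once, which is why a new-card rate that feels light in week one feels heavy in month two.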

Should I use a pre-made deck or mine my own cards?

A small pre-made frequency deck (the first 1,000 to 1,500 words) can be a reasonable shortcut to get you to a point where immersion is productive. After that, mined cards outperform pre-made ones because they're linked to scenes and sentences you actually remember. The hybrid pattern that works: pre-made deck for the first couple of months, then taper it off as your own mined deck takes over.

What do I do when my review queue gets out of control?

Stop adding new cards entirely until the queue is clear, which usually takes one to two weeks. Do not try to "catch up" by doing 400 reviews in one sitting; you'll grade sloppily and corrupt the scheduler's data about your actual recall. Cap reviews at your normal daily number plus 30%, and accept that clearing takes time. If the queue is over a thousand and the cards no longer feel like yours, it's often faster to reset mature-card intervals and treat them as relearning than to grind through.
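The "normal plus 30%" cap also yields a rough clearing timeline. This is back-of-envelope arithmetic, and it ignores failed cards re-entering the queue during the cleanup:

```python
import math

def days_to_clear_backlog(backlog: int, normal_daily: int) -> int:
    # Capping at normal + 30% leaves ~30% of normal capacity as
    # surplus each day; the surplus eats the backlog while the
    # rest absorbs cards coming due on schedule.
    surplus = max(1, round(normal_daily * 0.3))
    return math.ceil(backlog / surplus)
```

A 300-card backlog on top of a 100-review day clears in about ten days, which is where the one-to-two-week estimate comes from.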

Does spaced repetition work for adults as well as children?

Yes. The Cepeda meta-analysis covered participants from age 5 to over 70, and the spacing effect shows up at every age tested. What changes with age is raw encoding speed, not the benefit of spacing. Adults often do better with SRS than children because they're more consistent with daily review, which is the single biggest factor.

Can I use SRS for speaking practice?

Indirectly. SRS keeps vocabulary and set phrases available for retrieval, which is a precondition for fluent speech, but it doesn't build the motor and timing skills of actual conversation. A workable pattern: use SRS to maintain vocabulary, and spend separate time on output through tutoring, language exchange, or writing practice. Production cards ("how do you say X in your target language?") can bridge the gap but shouldn't replace real conversation.

How long until I notice results?

For vocabulary recognition in context, two to three months of consistent daily review (twenty minutes, mined cards, daily content). You'll notice it first when a word you carded three weeks ago shows up in a new episode and you understand it without stopping. For broader comprehension gains, six months is a more honest timeline. Anyone promising faster is selling something.

Is Anki still the best tool, or has something replaced it?

Anki is still the bedrock because it runs FSRS, is free, and has a massive ecosystem. The weakness is card creation: building good sentence cards by hand is slow enough that most learners give up. Tools that integrate lookup, audio capture, and card creation into the content-consumption step solve this friction, which is why the modern workflow looks more like "watch show, one-click card" than "open Anki, type fields."

Migaku was built around the sentence-mining loop described above: you watch or read native content, hover to look up anything you don't know, and turn useful sentences into SRS cards with one click, audio and screenshot included. If you want to see how the immersion side and the review side fit together in one workflow, take a look at how Migaku works.

Learn with Migaku