Spaced Repetition in 2026: How It Actually Works
Last updated: May 3, 2026

If you've been reviewing flashcards for more than a few weeks, you've probably sensed that something about the scheduling matters more than the cards themselves. That something is spaced repetition, the principle that memory is strengthened most efficiently when reviews happen just before you'd otherwise forget. This article walks through what the research actually says, how modern algorithms like FSRS-6 and SM-20 differ from the classic SM-2, and how to plug all of it into a language-learning routine built around native content.
- What Spaced Repetition Actually Is
- Why It Works (The Short Version)
- The Algorithms: SM-2, FSRS, SM-20
- A Worked Example: What a Week of Reviews Actually Looks Like
- Common Mistakes Learners Make With Spacing
- Cultural Context: Why This Matters Differently in Different Languages
- Where Spaced Repetition Fits in a Language Routine
- Making Good Cards (The Part That Actually Decides Whether This Works)
- Common Traps
- A Realistic Weekly Schedule
What Spaced Repetition Actually Is
Spaced repetition is the practical application of the spacing effect: the finding that information reviewed at expanding intervals is retained far better than information crammed in a single session. The idea predates computers by about a century. Hermann Ebbinghaus plotted the first forgetting curve in 1885 using himself as a subject on nonsense syllables. In 2015, Jaap Murre and Joeri Dros at the University of Amsterdam replicated the experiment (one subject logged roughly 70 hours of testing) and reproduced the curve almost exactly, with a small upward bump at the 24-hour mark that they attributed to sleep consolidation.
The classroom-scale evidence is older than most learners realize. Herbert F. Spitzer's 1939 study in the Journal of Educational Psychology tested 3,605 sixth-graders across Iowa on their retention of short passages about peanuts and bamboo, and found clear advantages for distributed review. The modern synthesis is Cepeda et al.'s 2006 meta-analysis in Psychological Bulletin, which pooled 839 effect-size contrasts across 317 experiments and 184 articles. The median effect size for distributed over massed practice was d = 0.60, which is large by behavioral-science standards.
A follow-up by Cepeda and colleagues in 2008, with over 1,350 participants, gave us a useful rule of thumb: the optimal gap between study sessions scales to roughly 10 to 20 percent of the retention interval you want. If you want to remember a word for a year, reviews roughly 5 to 10 weeks apart are in the right ballpark. If you want to remember it for a week, a gap of about a day is right.
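The gap rule is simple enough to compute directly. A back-of-the-envelope sketch using only the 10 to 20 percent figures from the study above (the helper name is mine):

```python
def optimal_gap_days(retention_days: float) -> tuple[float, float]:
    """Cepeda et al. (2008) rule of thumb: the gap between study
    sessions should be roughly 10-20% of the desired retention interval."""
    return 0.10 * retention_days, 0.20 * retention_days

# Remember a word for a year -> sessions roughly 5 to 10 weeks apart
lo, hi = optimal_gap_days(365)
print(f"{lo / 7:.0f} to {hi / 7:.0f} weeks")   # -> 5 to 10 weeks
```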
Why It Works (The Short Version)
The cognitive-psychology answer is that retrieval itself strengthens memory, and spacing forces your brain to actually retrieve rather than recognize. Roediger and Karpicke's 2006 study at Washington University in St. Louis is the clean demonstration: students who studied a passage once and then took three recall tests remembered about 60 percent of the content after a week, while students who studied the same passage four times without testing remembered only about 40 percent. Their 2008 follow-up in Science extended this directly to foreign vocabulary: once you can recall a word correctly, additional studying does almost nothing for long-term retention, while additional testing does a great deal.
There's also a biological story. Esteban Kramár and colleagues at UC Irvine showed in 2012 that theta-burst stimulation of rat hippocampal slices, when applied 60 minutes after an initial potentiation event, produced further synaptic strengthening, whereas stimulation within a few minutes did nothing additional. The synapse, in other words, has a refractory window during which repeated input is wasted. Heather Sisti, Anthony Glass, and Tracey Shors at Rutgers showed in 2007 that spaced training in rats actually rescued newborn neurons in the dentate gyrus from programmed cell death, with surviving-neuron counts correlating with memory performance two weeks later.
You don't need to believe the neural story to use spaced repetition effectively, but it's useful to know that the behavioral effect has a plausible substrate. Cramming feels productive because recognition feels like knowing. Spacing feels harder because it forces the retrieval your brain actually needs.
The Algorithms: SM-2, FSRS, SM-20
Most spaced-repetition software you've used is running a descendant of SM-2, the algorithm Piotr Woźniak first computerized in Turbo Pascal 3.0 in late 1987 and released into the public domain as an appendix to his 1990 master's thesis. SM-2 is simple enough to understand on a napkin: each card has an ease factor (initialized at 2.5) that gets multiplied into the current interval whenever you answer correctly, and reduced when you fail. About six parameters total, all hand-tuned.
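The napkin version fits in a dozen lines. A compressed sketch of the SM-2 update rule (interval and ease-factor handling as in Woźniak's published description; variable names and the compact structure are mine):

```python
def sm2_update(q: int, reps: int, interval: float, ef: float):
    """One SM-2 review. q: grade 0-5 (>= 3 is a pass).
    Returns the new (reps, interval_days, ease_factor)."""
    if q < 3:                      # failed: the card starts over
        return 0, 1, ef
    if reps == 0:
        interval = 1               # first successful review
    elif reps == 1:
        interval = 6               # second successful review
    else:
        interval = round(interval * ef)
    # ease factor drifts with grade quality, floored at 1.3
    ef = max(1.3, ef + 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    return reps + 1, interval, ef

# A card graded "good" (q=4) three times in a row: 1 -> 6 -> 15 days
state = (0, 0, 2.5)
for _ in range(3):
    state = sm2_update(4, *state)
print(state)   # -> (3, 15, 2.5)
```

Note that a grade of 4 leaves the ease factor untouched; only grades of 3 or below erode it, which is how "ease hell" develops on decks full of hard cards.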
FSRS (Free Spaced Repetition Scheduler) is the major algorithmic upgrade of the last few years. FSRS-6 shipped in late 2025 and was trained on roughly 700 million reviews contributed by about 20,000 volunteer Anki users. Instead of one ease factor, it models three per-card variables (stability, difficulty, and retrievability) and uses 21 trainable weights that get optimized against your personal review history. Anki has shipped FSRS as a scheduler since version 23.12 and makes it the default on new profiles; existing profiles stay on SM-2 unless you switch in deck options.
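The relationship between the three variables is a forgetting curve: stability S is defined as the interval at which retrievability falls to 90 percent, and the scheduler inverts the curve to find the next due date for your chosen retention target. A sketch using the fixed-decay power curve from FSRS-4.5 (FSRS-6 learns the decay per user, so treat this as illustrative, not the shipping formula):

```python
FACTOR, DECAY = 19 / 81, -0.5   # constants from the FSRS-4.5 curve

def retrievability(t_days: float, stability: float) -> float:
    """Probability of recall t days after the last review."""
    return (1 + FACTOR * t_days / stability) ** DECAY

def next_interval(stability: float, desired_retention: float) -> float:
    """Invert the curve: how long can we wait before recall
    probability drops to the desired retention?"""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)

print(retrievability(30, 30))      # ~0.9: stability *is* the 90% interval
print(next_interval(30, 0.90))     # ~30 days
print(next_interval(30, 0.97))     # ~8 days: higher retention, shorter gaps
```

The last two lines are the workload trade-off in miniature: raising the retention target from 90 to 97 percent cuts the interval on the same card by roughly a factor of four.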
On the SuperMemo side, SM-20 was announced in 2026 and is the first version in which all parameters are computed by machine learning rather than hand-tuned heuristics. SuperMemo also launched its SuperMemo API in early 2026, which as of March 31 is in early access with a free tier capped at 100 repetitions per day and a one-time import of up to 10,000 historical repetitions, running on SM-20.
The practical takeaway for a language learner: if you're on Anki and still running SM-2, switch to FSRS and let it optimize against your history after you've built up a few thousand reviews. You'll see intervals that feel surprisingly long for easy cards and surprisingly short for hard ones. That's the point.
A Worked Example: What a Week of Reviews Actually Looks Like
Abstract percentages and effect sizes are fine, but the way spacing plays out in practice is simpler than the research makes it sound. Imagine you added 20 new Japanese cards on Monday morning. Under FSRS with a default 90 percent retention target, the schedule for a single card might look roughly like this: you see it again that evening (a few hours later, in the learning queue), again on Wednesday, again the following Monday, again about three weeks later, and then the gap stretches to two months, four months, and out past a year if you keep answering correctly.
Now imagine one of those 20 cards is giving you trouble. You fail it on Wednesday. It drops back into the learning queue, you see it again in 10 minutes, then later that day, then tomorrow. Stability resets downward, and the card has to earn its long intervals back. Meanwhile the 19 cards you got right keep drifting further apart, freeing up minutes of your daily review budget for the next batch of new cards.
Multiply that across a deck of 5,000 mature cards and the rhythm becomes clear. On a typical day you might see 80 to 150 reviews, of which 90 percent are cards you haven't laid eyes on in weeks or months, plus 10 to 20 new cards. You are not re-studying the same material constantly. You are maintaining a slowly expanding library with a decreasing per-card cost.
Common Mistakes Learners Make With Spacing
Watch enough learners thrash with their decks and the same handful of mistakes comes up over and over. These are distinct from the traps listed later in this article; they are specifically about misreading what the algorithm is telling you.
Setting retention too high. FSRS lets you pick a desired retention rate. The default is around 90 percent. Setting it to 97 percent feels safe, but it roughly doubles your daily workload for a tiny gain in recall. For most learners, 85 to 90 percent is the sweet spot. Below 80 percent you start forgetting words before they consolidate; above 95 percent you are grinding for diminishing returns.
Optimizing FSRS too early. FSRS needs data to personalize. Running the optimizer on 200 reviews gives you weights that are barely better than the defaults. Wait until you have at least 1,000 reviews, ideally more, before you run it. Re-optimize every few months as your history grows.
Ignoring the "Again" button. A lot of learners press "Hard" on cards they actually failed, because "Again" feels like a penalty. FSRS uses your grade as a signal about the card's true difficulty. Lying to the algorithm means it schedules cards incorrectly, which means more forgetting later.
Burying mature cards under new ones. If you add 50 new cards a day for a month, your review queue will balloon six weeks later when those cards start hitting their medium-length intervals. A sustainable pace for most learners is 10 to 20 new cards per day, adjusted downward when the review queue grows past your daily budget.
Grading on feel instead of retrieval. A subtle one: if you glance at a card, feel a flicker of familiarity, and press "Good," you are grading recognition rather than recall. The fix is mechanical. Read the prompt, look away or cover the back, produce the answer out loud or in your head, and only then reveal the back. If the answer didn't come out cleanly, it's a fail.
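The queue-ballooning effect is easy to see in a toy simulation. Everything here is an assumption for illustration: a fixed, hypothetical interval ladder, no lapses, every card passed. Even under those generous conditions, the daily load keeps climbing for weeks after the new-card pace first felt fine:

```python
from collections import defaultdict

def simulate_queue(new_per_day: int, add_days: int, horizon: int,
                   intervals=(1, 3, 7, 16, 35, 80)) -> list[int]:
    """Toy model: every card is always answered correctly and climbs
    a fixed interval ladder (hypothetical numbers, no lapses).
    Returns the number of reviews due on each day."""
    due = defaultdict(int)
    for day in range(add_days):
        t = day
        for gap in intervals:       # schedule this batch's future reviews
            t += gap
            if t < horizon:
                due[t] += new_per_day
    return [due[d] for d in range(horizon)]

load = simulate_queue(new_per_day=50, add_days=90, horizon=90)
print(load[7], load[45], load[70])   # -> 100 200 250
```

At 50 new cards a day the review count more than doubles between week two and week seven, which is exactly the balloon the paragraph above warns about.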
Cultural Context: Why This Matters Differently in Different Languages
Spaced repetition as a mechanism is language-agnostic, but the work you ask it to do shifts substantially depending on what you're learning. A Spanish learner coming from English has maybe 3,000 to 5,000 cognates to lean on before any flashcards are needed; the spacing system is mostly for false friends, conjugation patterns, and genuinely novel vocabulary. A Japanese learner from English has almost no cognates and has to acquire a 2,000-plus character kanji system alongside vocabulary and grammar, so the deck grows faster and the scheduling load is heavier earlier.
This has practical consequences. A Mandarin learner might reasonably keep separate decks for character recognition, tone-plus-pronunciation, and sentence meaning, because those are distinct retrieval tasks. A German learner can usually get away with a single sentence-mining deck because the script and phonology are close enough to English that the cognitive load per card stays low. Neither approach is more correct; they reflect different information densities per flashcard.
The scheduling algorithm does not know any of this. FSRS treats a card about 手を引く the same way it treats a card about wegen plus genitive. That's fine, because the algorithm is optimizing retention, not meaning. Your job is to make sure the cards you feed it match the language's actual demands: more cards with richer context for distant languages, leaner cards for close ones.
Where Spaced Repetition Fits in a Language Routine
The failure mode most intermediate learners hit is treating the SRS as the main event. You fire up your deck, grind 200 reviews, feel virtuous, and call it language study. But flashcards are a compression format. They're only valuable to the extent that they connect back to actual comprehension in real content.
The sequence that works, roughly, is: encounter a word in something you're reading or watching, confirm you want to remember it, make a card that preserves the original sentence and context, review it in your SRS, and then encounter it again in more content. The SRS buys you faster decay resistance between natural encounters. It does not replace them. For a longer treatment of the full loop, see our guide on how to actually learn a language.
A concrete example. Say you're watching a Japanese drama and the line 「この件はもう手を引いたほうがいい」 comes up. You hover, confirm 手を引く means "to pull out of / wash your hands of," and add a card with the full sentence on the front, the reading and meaning on the back. Two days later the SRS shows it to you. Three weeks later it shows it again. Six months from now, when you meet 手を引く in a novel, you'll recognize it in a beat. That's the full loop, and each stage is doing work the others can't.
This is also why the difficulty of a language matters less than people think once you're in the loop. The core mechanic of encounter + capture + review + re-encounter generalizes across any language; the details change. Our piece on what makes a language easy to learn gets into which details actually matter (script, grammar distance, available content) versus which are folklore.
Making Good Cards (The Part That Actually Decides Whether This Works)
A perfectly scheduled review of a badly designed card is a waste of time. A few rules that hold up across languages:
- One unknown per card. If the front has two words you don't know, you'll never learn which one you're recalling. Split them.
- Keep the original sentence. Context gives the card a hook. "手を引く = withdraw" is a dictionary entry; 「この件はもう手を引いたほうがいい」 with the target word highlighted is a memory.
- Audio on the back, always, for spoken languages. You want to hear the word in a native voice, not just read it. Most content-capture tools pull the audio from the source clip automatically.
- Cloze-delete sentences for grammar. If you're learning the Japanese conditional ~ば, a card like 「時間が__ば、行きます」 with あれ on the back is far better than a grammar-point explanation card. You're practicing the pattern in the wild, which is how you'll meet it.
- Kill cards that fight you. If a card has failed five times in two weeks, the problem is usually the card, not your memory. Rewrite it or delete it.
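Cloze cards are also easy to generate programmatically. A minimal sketch of turning a mined sentence into Anki's `{{c1::…}}` cloze syntax (the helper function is my own illustration, not any particular tool's API):

```python
def make_cloze(sentence: str, target: str, hint: str = "") -> str:
    """Wrap the first occurrence of `target` in Anki-style cloze markup."""
    if target not in sentence:
        raise ValueError(f"{target!r} not found in sentence")
    markup = f"{{{{c1::{target}{'::' + hint if hint else ''}}}}}"
    return sentence.replace(target, markup, 1)

# The conditional example from the bullet above:
print(make_cloze("時間があれば、行きます", "あれ"))
# -> 時間が{{c1::あれ}}ば、行きます
```

Reviewed in Anki, that note renders as 「時間が[…]ば、行きます」 with あれ hidden, which is exactly the blank-the-morpheme pattern described above.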
This card-making discipline is the thing that separates learners who get real mileage out of spaced repetition from those who quit after a month of thrashing. If you want a language-specific version of this, our guide on what actually works for learning Japanese goes deeper into sentence mining for Japanese in particular.
Common Traps
A few patterns show up constantly in learners who aren't getting results:
- Reviewing but not immersing. If your only exposure to the language is your SRS, you're optimizing retention of words you will rarely meet in context. The cards atrophy into trivia.
- Chasing daily streaks instead of intervals. Spaced repetition is about the gap, not the streak. Missing a day is fine; FSRS will reschedule. Missing a month is fine too; the algorithm just recalculates stability downward.
- Making cards for every unknown. Be ruthless. If a word isn't worth five seconds of your attention now, it's not worth 30 seconds of review across the next two years. The first 1,000 most-frequent words in most languages cover around 70 to 80 percent of typical conversation; prioritize coverage before rare vocabulary.
- Fighting the algorithm. If FSRS pushes a card out 90 days, let it. The algorithm has more data about your retention than your gut does. Manually re-reviewing "just to be safe" destroys the spacing effect you paid for.
- Treating re-learning as failure. Forgetting a card and having it re-enter the learning queue is part of the mechanism, not a bug. Each retrieval (including failed ones followed by the answer) strengthens the trace.
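The frequency-coverage claim a few bullets up can be sanity-checked with a pure Zipf model, where word frequency is proportional to 1/rank. The model is an idealization (real corpora deviate, and the 50,000-word vocabulary size is an assumption I picked for illustration):

```python
def zipf_coverage(top_n: int, vocab_size: int) -> float:
    """Share of running text covered by the top_n most frequent words,
    assuming word frequency ~ 1/rank (pure Zipf's law)."""
    weights = [1 / rank for rank in range(1, vocab_size + 1)]
    return sum(weights[:top_n]) / sum(weights)

# Top 1,000 words of a 50,000-word vocabulary cover roughly two-thirds
# of tokens under pure Zipf; real conversational corpora run higher
# because speech leans harder on common words.
print(f"{zipf_coverage(1_000, 50_000):.0%}")   # -> 66%
```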
The mental shift that unlocks steady progress is to stop thinking of the SRS as a teacher and start thinking of it as a memory maintenance layer underneath your actual reading, listening, and watching. The content is where the learning happens. The spacing is what keeps it from leaking out.
A Realistic Weekly Schedule
If you want a template to adapt, here's what a sustainable week looks like for an intermediate learner juggling reviews and immersion. Monday through Friday: 20 to 30 minutes of reviews in the morning (roughly 100 to 150 cards), one hour of native content in the evening with 5 to 10 new cards mined from what you watched or read. Saturday: a longer immersion block of two to three hours with heavier mining, 15 to 20 new cards. Sunday: reviews only, no new cards, used as a buffer day to catch up if the queue grew mid-week.
The numbers are less important than the shape. Reviews are the baseline, mining produces new material, and one day per week is reserved for slack. Learners who skip the slack day tend to hit a wall around month three when a bad week spikes the queue and there is no recovery built into the routine.
Frequently Asked Questions
How many cards a day should I review?
There's no universal number, but a sustainable range for most language learners is 10 to 20 new cards per day with whatever reviews the algorithm surfaces, which usually lands between 80 and 200 reviews daily once a deck matures. If your reviews consistently exceed 250 a day, you are adding new cards faster than you can maintain them; reduce the new-card count for a few weeks and let the queue drain.
Is FSRS actually better than SM-2, or is it just newer?
Benchmarks by the Anki developers and independent community analyses of millions of reviews have consistently shown FSRS producing more accurate interval predictions than SM-2, meaning cards are less often shown too early (wasting your time) or too late (causing forgetting). The advantage is small for brand-new users because FSRS needs history to personalize, but grows substantially once you have a few thousand reviews of data.
Should I delete a card I keep failing?
Usually, yes, but rewrite it first. A card that keeps failing is almost always a design problem: ambiguous prompt, missing context, two unknowns on the front, or a word you just don't have enough exposure to yet. Rewrite the card with the full source sentence and audio; if it still fails a few more times, suspend it and let natural exposure do the work before you reintroduce it.
Can I use spaced repetition for grammar, not just vocabulary?
Yes, and cloze deletions are the mechanism. Instead of a card that says "explain the Japanese conditional," make a card with a real sentence that uses the grammar point with the key morpheme blanked out. You are practicing the pattern as it actually appears in speech and writing, which is what you need to produce and understand it later.
How long until I see results?
Reliable word-level retention shows up within a few weeks for any given card once the intervals stretch past a month. Noticeable improvement in your ability to read or watch native content typically takes 3 to 6 months of consistent daily use combined with immersion, assuming roughly 15 to 30 minutes of reviews plus an hour or more of input per day. The first month often feels slow because you're building the base; the curve steepens once the mature-card count crosses a few thousand.
What happens if I take a two-week break?
The queue will look alarming when you come back, but the damage is smaller than it appears. FSRS treats missed reviews as overdue, which actually provides useful data about true retention at longer intervals. Work through the backlog at your normal pace instead of cramming; expect lapse rates 10 to 20 percentage points higher than usual for the first few days, then a return to baseline within a week. Do not add new cards until the overdue pile is cleared.
Should I use pre-made decks or build my own?
For the first few hundred words in a completely new language, a small pre-made frequency deck gets you to the point where native content is parseable. After that, cards you make from content you actually consumed outperform pre-made decks by a wide margin, because the context is already anchored in your memory. The general rule: pre-made for bootstrapping, self-made from immersion for everything past the beginner stage.
If you want to apply all of this inside content you already want to consume, Migaku handles the hover lookup, card creation with source audio, and FSRS-compatible scheduling in one loop, so your reviews stay tied to the shows and articles where the words came from. See how Migaku works for the full picture.