Users report that audio for some radical cards does not match the displayed pinyin. For example:
- 阝 (side form of 阜): pinyin "fù" — audio reportedly played "shǎn" (陝).
- 宀 (roof radical): pinyin "mián" — audio reportedly did not match.
Character confusion (阝/陝): The deck data contained 陝 (U+965D, shǎn) instead of 阝 (U+961D, fù). This was a data-entry error: the wrong character was stored. The manifest correctly mapped the stored character to its UTF-8 filename, so the audio matched the stored character — but the card was labeled 阝 (阜) with pinyin fù. Fix applied: corrected the deck data to use 阝 (U+961D).
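Because 阝 and 陝 are easy to confuse by eye, a codepoint-level check catches this class of error mechanically. A small sketch (a suggestion, not existing tooling in this repo):

```javascript
// Render a character's Unicode codepoint in "U+XXXX" form.
function codePointHex(ch) {
  return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

// The look-alikes involved here:
//   阝 (side form of 阜, fù) → expected U+961D
//   陝 (shǎn)               → expected U+965D
console.log(codePointHex("阝"), codePointHex("陝"));
```

A deck-validation script could assert each radical entry's codepoint against an allowlist instead of relying on visual review.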
TTS mispronunciation of radicals: Edge-TTS (and TTS in general) may mispronounce radicals when they are spoken in isolation. Radicals like 宀, 讠, 氵 are rarely used alone in natural speech; the model may not have reliable pronunciation for them.
A `RADICAL_PRONUNCIATION_FALLBACK` map was added to `generate-audio.py`: when generating audio for certain radicals, a different character that shares the same pinyin (e.g. 棉 for 宀, 言 for 讠) is passed to TTS. The generated audio is saved under the radical's filename.
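The substitution amounts to a small lookup before the TTS call. An illustrative JS sketch (the real map lives in `generate-audio.py`; only the 宀→棉 and 讠→言 pairs come from these notes):

```javascript
// Radical → same-pinyin proxy character passed to TTS.
// Only 宀→棉 (mián) and 讠→言 (yán) are documented above; the real map
// covers more radicals.
const RADICAL_PRONUNCIATION_FALLBACK = {
  "宀": "棉",
  "讠": "言",
};

// Character actually sent to TTS; the resulting audio file is still
// saved under the original radical's filename.
function ttsCharFor(radical) {
  return RADICAL_PRONUNCIATION_FALLBACK[radical] ?? radical;
}
```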
Concerns:
- Display/audio mismatch: the user sees 宀 but hears 棉; we are playing audio for a different character than the one displayed. Pedagogically questionable.
- No verification: We have not listened to original vs. new audio. We assumed TTS mispronounces and that fallbacks fix it — unverified.
- Hidden behavior: The fallback map is opaque; the code does not make the substitution visible to readers.
- Scope creep: Extended to 11 radicals without evidence that all needed it.
Still unverified:
- Whether edge-tts actually mispronounces these radicals.
- Whether the fallback audio is correct.
- Whether the original audio (before fallbacks) was wrong.
- How to verify audio correctness programmatically (no automated check exists).
Constraints:
- Edge-TTS: does not support pinyin input for proper Chinese pronunciation. Passing "mián" as text yields a Latin-style pronunciation, not Mandarin.
- Pinyin source: Deck pinyin is curated in `deck-data.json`; there is no runtime hanzi→pinyin conversion for audio.
Next steps:
- Listen to the original and fallback audio; document which (if any) are actually wrong.
- Decide whether to revert `RADICAL_PRONUNCIATION_FALLBACK` and accept TTS limitations.
- Consider alternatives: human recordings for radicals, a different TTS engine/voice, or no audio for unreliable radicals.
- If keeping fallbacks: add tests or documentation that make the substitution explicit.
Design principles, quality rules, and tooling: `docs/mnemonic-curation.md`.
Run after any data edit:

```
node scripts/audit-mnemonics.js --mode all --fail-on-violations
node scripts/validate-anchor-stories.js
node scripts/test-mnemonic-curation.js
node scripts/test-hint-safety.js
```

- Expand `phoneticAnchorAliases` for anchors where exact token usage harms sentence quality.
- Add narrative quality snapshot fixtures in CI to catch accidental template drift.
- Pilot stricter anchor-placement cap (reduce max anchor-at-start from 60% to 40% if quality holds).
- Diversify remaining "family anchors" — SHE/ZOO/YOU/SHEER still cover 15–20 syllables each; further per-syllable splits need new anchor words or alias support.
- Improve anchor grammar gate heuristic to reduce false positives on natural pronoun/adverb subjects before enabling for vocab (125 false positives at current sensitivity).
- Multi-syllable phonetic hints — add more than one phonetic anchor to stories when the pinyin has multiple syllables (e.g. 电脑 → "Think of DEAN, NOW."). See `docs/pinyin-multi-syllable-hints-investigation.md`.
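For the alias item above, the intended mechanism might look like the following sketch (the map structure and whole-word matching rule are assumptions, not the current implementation):

```javascript
// Anchor → acceptable surface variants in story text.
const phoneticAnchorAliases = {
  SHE: ["SHELL"], // example pair; real aliases would live in curation data
};

// Whole-word match: the anchor or any alias must appear as its own word,
// so a natural variant can satisfy the check without the exact token.
function anchorSatisfied(story, anchor) {
  const tokens = [anchor, ...(phoneticAnchorAliases[anchor] ?? [])];
  const words = story.toUpperCase().match(/[A-Z]+/g) ?? [];
  return tokens.some((t) => words.includes(t));
}
```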
Improve quote quality/coverage now that tidbits are available across all decks.
- Expand tidbit corpus for more radical concepts with concrete semantic overlap.
- Curate/adjust relevance tags for better radical matches while preserving precision.
- Add spot-check fixtures for additional radicals beyond `木` and `水`.
- More radical cards surface high-quality tidbits where overlap is meaningful.
- Weak/abstract radicals still return `null` (no forced matches).
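The "no forced matches" contract can be pinned down in a fixture-style check. A hypothetical sketch (function shape and the relevance threshold are illustrative, not the real matcher):

```javascript
// Return the best-overlapping tidbit, or null when nothing clears the bar.
// The 0.8 relevance threshold is an illustrative value only.
function tidbitFor(radical, tidbits) {
  const candidates = tidbits
    .filter((t) => t.tags.includes(radical))
    .sort((a, b) => b.relevance - a.relevance);
  const best = candidates[0];
  return best && best.relevance >= 0.8 ? best : null;
}
```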
Strengthen guardrails for future data updates.
- Add stronger mnemonic fixture snapshots for larger HSK1 subsets.
- Add a CI gate for `scripts/audit-mnemonics.js --fail-on-violations`.
- CI fails on mnemonic audit regressions.
- Fixture tests catch narrative-quality regressions early.
- Study guidance for combining decks effectively.
- Tone imagery pilot (measure recall benefit vs added cognitive load).
- Post-lesson "Expand/Continue" option so learners can keep going after the daily queue: offer to start the next lesson immediately (same deck), with an optional small cap on extra new cards; include a simple replay of finished cards without timers.
- Quiz mode — see below.
Implemented. Quiz mode tests recall with multiple-choice questions, offered after a deck's daily queue is completed via a "Take a quiz" button (no prompt).
- Study flow: Manual Hard/Medium/Easy ratings replaced with a "Next" button. Study exposes cards (tap to reveal) but does not update SRS.
- Quiz flow: After deck completion, a "Take a quiz" button starts a multiple-choice quiz over the lesson cards. All answers are multiple choice.
- Performance-based difficulty: Quiz results drive SRS. Correct → easy; incorrect → hard. No timing; answer is either right or wrong.
- Self-rating removed: No manual difficulty buttons; quiz performance is the sole signal for scheduling.
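The scheduling rule is deliberately binary and fits in a few lines. A sketch (function and rating names are assumptions about the app's internals):

```javascript
// Quiz result → SRS rating: no timing, no partial credit.
// Correct answers schedule as "easy", incorrect as "hard".
function ratingFromQuizAnswer(selected, correctAnswer) {
  return selected === correctAnswer ? "easy" : "hard";
}
```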
Stories currently carry two hooks: sound (phonetic anchor) and meaning (scene). A third hook — visual shape of the character — could strengthen recall, especially for simple pictographic characters.
Approaches, ranked by feasibility:
| Approach | Difficulty | Coverage | Notes |
|---|---|---|---|
| Separate `shapeHint` field | Low | ~60–70 simple chars | Decoupled from story; UI renders it as a small visual note |
| Shape woven into story (pictographs) | Medium | ~20–30 chars | Tight 12-word budget when the story also needs anchor + meaning |
| Component layout in story | Hard | ~100+ compound chars | Conflicts with design rule: stories must not rely on radical knowledge |
| Stroke-level narratives | Very Hard | All chars | Essentially a different mnemonic system (Heisig-style) |
Recommended starting point: add an optional `shapeHint` field to `mnemonicData` for the 35 radicals where visual form ≈ meaning (口 is a square = mouth, 人 is two legs = person). Display it below the story without touching the story text. Evaluate whether it measurably helps recall before expanding to compound characters.
```js
mnemonicData: {
  soundAnchor: "Think of RUN.",
  story: "A lone traveler breaks into a RUN across the empty bridge.",
  shapeHint: "Two legs, one striding forward",
  components: []
}
```

Key challenges:
- Authoring quality shape descriptions requires knowing what each character looks like to a first-time learner — a different skill from writing phonetic stories. LLM generation would need character images or reliable pictographic etymology data.
- For compound characters, shape hints risk requiring radical knowledge the learner doesn't have yet (violating the existing design rule).
- For complex/abstract characters, forcing a visual description may hurt more than it helps — leave `shapeHint` empty for those.
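Rendering-wise, the optional field only needs a guard so cards without a hint are unchanged. A sketch (markup and class name are hypothetical):

```javascript
// Render the shape hint below the story; empty or missing hints
// produce no markup at all.
function shapeHintHtml(mnemonicData) {
  if (!mnemonicData.shapeHint) return "";
  return `<p class="shape-hint">${mnemonicData.shapeHint}</p>`;
}
```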
Add a fourth deck of full Chinese sentences to bridge the gap between single-word recall and reading comprehension.
Card shape:

```json
{
  "id": "tatoeba-1234",
  "hanzi": "我喜欢西瓜。",
  "pinyin": "wǒ xǐhuān xīguā.",
  "english": "I like watermelon.",
  "audioId": 1234,
  "vocabWords": ["我", "喜欢", "西瓜"]
}
```

Data sources (ranked):
| Source | License | Sentences | Audio | Notes |
|---|---|---|---|---|
| Tatoeba | CC BY 2.0 FR (some CC0) | ~40k+ Mandarin with English translations | 5,814 Mandarin recordings | Bulk TSV exports for sentences, translation links, audio IDs, and pinyin transcriptions |
| LLM-generated | Original (no license concern) | Unlimited | None (syllable fallback or TTS) | Full editorial control; can constrain vocab to HSK1/HSK2 words the learner already knows |
Recommended approach — hybrid:
- Write a build script (like `build-phonetic-hints.js`) that ingests Tatoeba exports, filters for sentences where all/most characters come from the HSK1 vocab set, and keeps only entries with audio recordings.
- Curate ~50–100 sentences into `data/sentence-data.json`.
- Supplement with LLM-generated sentences for vocab words that Tatoeba doesn't cover well.
- Sentence audio at runtime: `https://tatoeba.org/audio/download/{audioId}` (same CDN-at-runtime pattern as `hugolpz/audio-cmn`). Fall back to syllable-by-syllable `speakPinyin()` for sentences without recordings.
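The filtering step in the build script could start as a simple coverage check. A sketch (field names and the tiny sample vocab set are assumptions for illustration; the real set would be the full HSK1 inventory):

```javascript
// Sample vocab only — stands in for the full HSK1 character set.
const KNOWN_CHARS = new Set(["我", "喜", "欢", "西", "瓜"]);

// Keep a sentence only if every Han character is known and it has at
// least one audio recording. Punctuation is not Han script, so it is
// excluded from the coverage check.
function keepSentence(sentence, audioIds) {
  const hanChars = [...sentence.hanzi].filter((c) => /\p{Script=Han}/u.test(c));
  return hanChars.every((c) => KNOWN_CHARS.has(c)) && audioIds.length > 0;
}
```

Loosening "every" to "most" (e.g. ≥90% coverage) would admit more sentences at the cost of occasional unknown characters.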
UI/UX:
- Reuse the existing three-stage card flow: stage 0 shows the Chinese sentence, stage 1 adds pinyin + audio, stage 2 reveals the English translation plus word-by-word gloss chips (e.g. 我 I · 喜欢 like · 西瓜 watermelon).
- The gloss chips reuse the existing `.chip-link` styling.
- New deck entry in `DECKS` (`sentence_to_english`) + `DECK_STORAGE_KEYS`.
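The gloss chips could be assembled from `vocabWords` plus a per-word gloss lookup (the lookup source is an assumption here; in practice the glosses would come from the existing vocab deck data):

```javascript
// Build "word gloss" chip markup for stage 2, joined by middots.
// Unknown words fall back to "?" so a missing gloss is visible in review.
function glossChips(vocabWords, glossary) {
  return vocabWords
    .map((w) => `<span class="chip-link">${w} ${glossary[w] ?? "?"}</span>`)
    .join("·");
}
```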
Implementation cost:
| Task | Effort |
|---|---|
| Build script to filter/curate Tatoeba exports | Medium |
| New `data/sentence-data.json` with curated entries | Medium |
| Register deck in `DECKS` + storage key | Low |
| Sentence audio playback (Tatoeba URL or syllable fallback) | Low–Medium |
| Word-by-word gloss chips on reveal | Medium |
Open questions:
- Should sentences be graded by lesson (e.g. "Lesson 1 sentences use only the first 10 vocab words") or offered as a single flat pool?
- Is a reverse mode (`english_to_sentence`) valuable, or is comprehension-only enough to start?
- Minimum audio-coverage threshold: skip sentences without audio, or accept the syllable fallback?