Commit d97ae65
whisper: set no_context to prevent quality drift over a session (#79)
* whisper: set no_context to prevent quality drift over a session

Whisper transcription quality degrades progressively over a long
push-to-talk session: short clips get mis-recognized or returned
empty, and language detection sticks to the previous language
(e.g. RU→EN switches keep producing Russian). Reloading the model
restores quality.

The cause is whisper.cpp's default prompt_past behaviour: the last
decoded tokens are fed back as a prompt for the next decode. That's
the right thing for continuous speech (lectures, meetings) where
consecutive segments are connected, but the wrong thing for
push-to-talk and similar workloads where each call to transcribe
is an independent utterance: stale prompt tokens bias the next
decode. Short clips suffer most because they have less acoustic
evidence to overcome the stale prompt; language switches suffer
because the prompt is in the previous language and steers detection.

Set no_context = true so each decode starts from a clean prompt.
The user-supplied initial_prompt continues to work: it goes through
a different field and is unaffected.

* whisper: expose no_context as a configurable field

Per review, make no_context a configurable field on
WhisperInferenceParams so callers can set it to false for
continuous-speech use cases (lectures, meetings, streaming) where
carrying prompt_past across segments improves consistency. The
default stays true, which is the right choice for independent
utterances such as push-to-talk dictation, the case the previous
commit fixed.

* fmt
---------
Co-authored-by: CJ Pais <cj@cjpais.com>

1 parent 343768c commit d97ae65
2 files changed
Lines changed: 8 additions & 2 deletions
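
The behaviour described in the commit message can be sketched as a small, self-contained toy model. This is not the project's code and does not call whisper.cpp: the fields shown on `WhisperInferenceParams` are limited to the two the message mentions, and `ToySession` with its `transcribe` method is a hypothetical stand-in for whatever state carries prompt_past between calls. It only illustrates how decoded tokens leak into the next prompt unless `no_context` is set.

```rust
/// Hypothetical stand-in for the `WhisperInferenceParams` struct named in
/// the commit message; the real struct's other fields are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct WhisperInferenceParams {
    /// User-supplied prompt, applied on every call regardless of `no_context`.
    pub initial_prompt: Option<String>,
    /// When true, tokens decoded by earlier calls are not reused as a prompt.
    pub no_context: bool,
}

impl Default for WhisperInferenceParams {
    fn default() -> Self {
        // Default follows the commit message: treat calls as independent utterances.
        Self { initial_prompt: None, no_context: true }
    }
}

/// Toy model of decoder state that survives between calls (an assumption
/// made for illustration, not whisper.cpp's actual data structure).
#[derive(Debug, Default)]
pub struct ToySession {
    prompt_past: Vec<String>,
}

impl ToySession {
    /// Returns the prompt the next decode would be conditioned on, then
    /// records `decoded` as history for later calls.
    pub fn transcribe(
        &mut self,
        params: &WhisperInferenceParams,
        decoded: &[&str],
    ) -> Vec<String> {
        if params.no_context {
            // Drop history so stale tokens cannot bias this decode.
            self.prompt_past.clear();
        }
        // The initial prompt travels separately from the carried-over history.
        let mut prompt: Vec<String> = params.initial_prompt.iter().cloned().collect();
        prompt.extend(self.prompt_past.iter().cloned());
        self.prompt_past.extend(decoded.iter().map(|t| t.to_string()));
        prompt
    }
}
```

In this model a continuous-speech caller would build its parameters with `WhisperInferenceParams { no_context: false, ..Default::default() }`, while push-to-talk callers can rely on `Default::default()`.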