
Feature/ollama disable think #1

Open

Jian-Min-Huang wants to merge 1 commit into custom-main from feature/ollama-disable-think

Conversation

@Jian-Min-Huang (Member) commented Apr 5, 2026

While trying to switch my TranslateGemma setup over to the latest Gemma 4, I noticed that even though Gemma 4 emits more tokens per second and performs better, its responses were noticeably slow.

gemma4:26b-a4b-it-q8_0

total duration:       1.558639625s
load duration:        137.0625ms
prompt eval count:    256 token(s)
prompt eval duration: 764.7095ms
prompt eval rate:     334.77 tokens/s
eval count:           49 token(s)
eval duration:        602.880289ms
eval rate:            81.28 tokens/s

translategemma:12b

total duration:       1.554478125s
load duration:        136.367ms
prompt eval count:    254 token(s)
prompt eval duration: 566.28025ms
prompt eval rate:     448.54 tokens/s
eval count:           44 token(s)
eval duration:        827.462332ms
eval rate:            53.17 tokens/s
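As a quick sanity check on the metrics above: each reported rate is simply the token count divided by the duration in seconds. The snippet below recomputes both eval rates from the numbers in the two runs:

```python
# Recompute eval rates from the benchmark output above:
# rate (tokens/s) = eval count / eval duration (seconds).
runs = {
    "gemma4:26b-a4b-it-q8_0": (49, 0.602880289, 81.28),
    "translategemma:12b": (44, 0.827462332, 53.17),
}

for model, (tokens, seconds, reported) in runs.items():
    rate = tokens / seconds
    print(f"{model}: {rate:.2f} tokens/s (reported {reported})")
```

This confirms the per-token throughput numbers are internally consistent; the slowness the PR addresses comes from the extra thinking phase, not from the eval rate itself.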

Cross-testing revealed the cause: Gemma 4 is a thinking-capable model, while TranslateGemma is not. With the default think: true, Gemma 4 spends its first step thinking before emitting any output, so even though it generates tokens faster, the request's total duration ends up much longer.

The purpose of this PR is to extend the original OllamaClient and change the think default to false.

The whole point of running a local model is speed, so in my view VoiceInk's calls through OllamaClient don't need think enabled.

Add CustomOllamaClient that mirrors OllamaClient.generate() with an
additional `think: false` parameter to prevent thinking-capable models
from returning think blocks in enhancement responses.
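As a rough illustration of what the change amounts to on the request side, here is a minimal sketch in Python, assuming Ollama's /api/generate accepts a top-level boolean `think` field for thinking-capable models (as recent Ollama releases do). The helper name `build_generate_request` is illustrative only and is not VoiceInk's actual Swift code:

```python
import json

def build_generate_request(model: str, prompt: str, think: bool = False) -> str:
    """Build a JSON body for Ollama's /api/generate.

    `think` defaults to False here, mirroring this PR's intent: a
    thinking-capable model (e.g. Gemma 4) skips its reasoning phase,
    shortening total duration for latency-sensitive local use.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # The key change: explicitly disable thinking instead of
        # relying on the model's default of think=True.
        "think": think,
    }
    return json.dumps(payload)

body = build_generate_request("gemma4:26b-a4b-it-q8_0", "Translate: hello")
print(body)
```

The same idea in the actual Swift client is just one extra field on the request payload, with false as the default.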
Jian-Min-Huang force-pushed the feature/ollama-disable-think branch from 3f09344 to 362a50b on April 5, 2026, 17:40
