Commit 90c4a6f

Merge pull request #13 from grinev/release/0.7.0
Release/0.7.0
2 parents 03e9e69 + 0108ece commit 90c4a6f

22 files changed

Lines changed: 1262 additions & 282 deletions

.env.example

Lines changed: 13 additions & 0 deletions
@@ -36,6 +36,10 @@ OPENCODE_MODEL_ID=big-pickle
 # Bot Configuration (optional)
 # Maximum number of sessions shown in /sessions (default: 10)
 # SESSIONS_LIST_LIMIT=10
+
+# Maximum number of projects shown in /projects (default: 10)
+# PROJECTS_LIST_LIMIT=10
+
 # Bot locale: en or ru (default: en)
 # BOT_LOCALE=en

@@ -53,3 +57,12 @@ OPENCODE_MODEL_ID=big-pickle
 # Code File Settings (optional)
 # Maximum file size in KB to send as document (default: 100)
 # CODE_FILE_MAX_SIZE_KB=100
+
+# Speech-to-Text / Voice Recognition (optional)
+# Enable voice message transcription by setting a Whisper-compatible API URL.
+# Works with OpenAI, Groq, or any Whisper-compatible endpoint.
+# If STT_API_URL is not set, voice messages will get a "not configured" reply.
+# STT_API_URL=
+# STT_API_KEY=
+# STT_MODEL=
+# STT_LANGUAGE=
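
As a rough illustration of how the `STT_*` variables above could be consumed, the TypeScript sketch below reads them from an environment map. The `SttEnvSettings` shape and the `readSttEnvSettings` name are hypothetical and do not come from this commit. The sketch follows the `.env.example` comment in treating `STT_API_URL` alone as the switch (the README hunk in this commit mentions both `STT_API_URL` and `STT_API_KEY`), and it borrows the `whisper-large-v3-turbo` default from the README table in this commit.

```typescript
// Hypothetical settings shape; not taken from the commit.
interface SttEnvSettings {
  apiUrl: string;
  apiKey: string;
  model: string;
  language: string; // empty string means "let the provider auto-detect"
}

// Returns null when STT_API_URL is missing or blank, i.e. STT is not configured.
function readSttEnvSettings(env: Record<string, string | undefined>): SttEnvSettings | null {
  const apiUrl = (env.STT_API_URL ?? "").trim();
  if (!apiUrl) {
    return null;
  }
  return {
    apiUrl,
    apiKey: (env.STT_API_KEY ?? "").trim(),
    // Default model name as listed in the README table of this commit.
    model: (env.STT_MODEL ?? "").trim() || "whisper-large-v3-turbo",
    language: (env.STT_LANGUAGE ?? "").trim(),
  };
}
```

A caller could pass `process.env` and reply with a "not configured" message whenever the result is `null`.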

AGENTS.md

Lines changed: 2 additions & 0 deletions
@@ -255,6 +255,7 @@ OPENCODE_MODEL_ID=big-pickle

 # Bot options (optional)
 # SESSIONS_LIST_LIMIT=10
+# PROJECTS_LIST_LIMIT=10
 # BOT_LOCALE=en # en or ru

 # File output options (optional)
@@ -275,6 +276,7 @@ OPENCODE_MODEL_ID=big-pickle
 | `OPENCODE_SERVER_PASSWORD` | OpenCode auth password | No | empty |
 | `LOG_LEVEL` | Logging level | No | `info` |
 | `SESSIONS_LIST_LIMIT` | Max sessions shown in `/sessions` | No | `10` |
+| `PROJECTS_LIST_LIMIT` | Max projects shown in `/projects` | No | `10` |
 | `BOT_LOCALE` | Bot locale (`en` or `ru`) | No | `en` |
 | `CODE_FILE_MAX_SIZE_KB` | Max code file size to send | No | `100` |

PRODUCT.md

Lines changed: 5 additions & 2 deletions
@@ -46,6 +46,7 @@ No public inbound ports are required for normal usage.
 ### Task handling

 - Send text prompts to OpenCode
+- Accept voice/audio messages, transcribe via Whisper-compatible STT API, and forward recognized text as prompts
 - Interrupt current task (ESC equivalent)
 - Handle OpenCode questions with inline options and custom text answers
 - Send selected/custom answers back to OpenCode (`question.reply`)
@@ -80,6 +81,7 @@ No public inbound ports are required for normal usage.
 - Configurable bot locale
 - Configurable visibility for service messages (thinking/tool calls)
 - Configurable max code file size in KB (default: 100)
+- Optional STT settings for voice transcription (`STT_API_URL`, `STT_API_KEY`, `STT_MODEL`, `STT_LANGUAGE`)

 ## Current Product Scope

@@ -99,7 +101,7 @@ Current command set:
 - [x] `/opencode_stop` - stop local OpenCode server
 - [x] `/help` - show command help

-Text messages (non-commands) are treated as prompts for OpenCode only when no blocking interaction is active.
+Text messages (non-commands) are treated as prompts for OpenCode only when no blocking interaction is active. Voice/audio messages are transcribed and then sent as prompts when STT is configured.

 Interaction routing rules:

@@ -123,6 +125,7 @@ Interaction routing rules:
 - [x] Sending code blocks as files when needed
 - [x] Configurable batching of service messages (thinking + tool updates): recommended `>=2` sec for Telegram rate limits; `0` = immediate
 - [x] Configurable service message visibility via env flags (`HIDE_THINKING_MESSAGES`, `HIDE_TOOL_CALL_MESSAGES`)
+- [x] Voice/audio transcription via Whisper-compatible APIs (OpenAI/Groq/Together and compatible providers)
 - [x] Single-user security model (allowed Telegram user ID)
 - [x] Persistent bot settings (`settings.json`) between restarts
 - [x] EN/RU localization structure via dedicated i18n files
@@ -138,7 +141,7 @@ Open tasks for upcoming iterations:
 - [ ] Improve Telegram-compatible message formatting for richer outputs
 - [ ] Support sending files from Telegram to OpenCode (screenshots, documents)
 - [ ] Provide a Docker image and basic container deployment guide
-- [ ] Add voice transcription
+- [x] Add voice transcription

 ## Possible Improvements

README.md

Lines changed: 48 additions & 26 deletions
@@ -25,7 +25,9 @@ Quick start: `npx @grinev/opencode-telegram-bot`
 - **Model switching** — pick any model from your OpenCode favorites directly in the chat
 - **Agent modes** — switch between Plan and Build modes on the fly
 - **Interactive Q&A** — answer agent questions and approve permissions via inline buttons
+- **Voice prompts** — send voice/audio messages, transcribe them via a Whisper-compatible API, then forward recognized text to OpenCode
 - **Context control** — compact context when it gets too large, right from the chat
+- **Input flow control** — when an interactive flow is active, the bot accepts only relevant input to keep context consistent and avoid accidental actions
 - **Security** — strict user ID whitelist; no one else can access your bot, even if they find it
 - **Localization** — English and Russian UI (`BOT_LOCALE=en|ru`)

@@ -102,15 +104,7 @@ opencode-telegram config
 | `/opencode_stop` | Stop the OpenCode server remotely |
 | `/help` | Show available commands |

-Any regular text message is sent as a prompt to the coding agent only when no blocking interaction is active.
-
-### Interaction Rules
-
-- Only one interactive flow can be active at a time (inline menus, permission request, question flow, rename)
-- While an interaction is active, the bot accepts only relevant input for that flow and blocks unrelated actions
-- Allowed utility commands remain available during active interactions: `/help`, `/status`, `/stop`
-- Unknown slash commands return an explicit fallback message instead of being silently ignored
-- Interaction flows do not expire automatically and wait until explicit completion (`answer`, `cancel`, `/stop`, or reset/cleanup)
+Any regular text message is sent as a prompt to the coding agent only when no blocking interaction is active. Voice/audio messages are transcribed and then sent as prompts when STT is configured.

 > `/opencode_start` and `/opencode_stop` are intended as emergency commands — for example, if you need to restart a stuck server while away from your computer. Under normal usage, start `opencode serve` yourself before launching the bot.

@@ -124,26 +118,54 @@ When installed via npm, the configuration wizard handles the initial setup. The
 - **Windows:** `%APPDATA%\opencode-telegram-bot\.env`
 - **Linux:** `~/.config/opencode-telegram-bot/.env`

-| Variable | Description | Required | Default |
-| ------------------------------- | ------------------------------------------------------------------------------------------------------------ | :------: | ----------------------- |
-| `TELEGRAM_BOT_TOKEN` | Bot token from @BotFather | Yes ||
-| `TELEGRAM_ALLOWED_USER_ID` | Your numeric Telegram user ID | Yes ||
-| `TELEGRAM_PROXY_URL` | Proxy URL for Telegram API (SOCKS5/HTTP) | No ||
-| `OPENCODE_API_URL` | OpenCode server URL | No | `http://localhost:4096` |
-| `OPENCODE_SERVER_USERNAME` | Server auth username | No | `opencode` |
-| `OPENCODE_SERVER_PASSWORD` | Server auth password | No ||
-| `OPENCODE_MODEL_PROVIDER` | Default model provider | Yes | `opencode` |
-| `OPENCODE_MODEL_ID` | Default model ID | Yes | `big-pickle` |
-| `BOT_LOCALE` | Bot UI language (`en` or `ru`) | No | `en` |
-| `SESSIONS_LIST_LIMIT` | Max sessions shown in `/sessions` | No | `10` |
-| `SERVICE_MESSAGES_INTERVAL_SEC` | Service messages interval (thinking + tool calls); keep `>=2` to avoid Telegram rate limits, `0` = immediate | No | `5` |
-| `HIDE_THINKING_MESSAGES` | Hide `💭 Thinking...` service messages | No | `false` |
-| `HIDE_TOOL_CALL_MESSAGES` | Hide tool-call service messages (`💻 bash ...`, `📖 read ...`, etc.) | No | `false` |
-| `CODE_FILE_MAX_SIZE_KB` | Max file size (KB) to send as document | No | `100` |
-| `LOG_LEVEL` | Log level (`debug`, `info`, `warn`, `error`) | No | `info` |
+| Variable | Description | Required | Default |
+| ------------------------------- | ------------------------------------------------------------------------------------------------------------ | :------: | ------------------------ |
+| `TELEGRAM_BOT_TOKEN` | Bot token from @BotFather | Yes ||
+| `TELEGRAM_ALLOWED_USER_ID` | Your numeric Telegram user ID | Yes ||
+| `TELEGRAM_PROXY_URL` | Proxy URL for Telegram API (SOCKS5/HTTP) | No ||
+| `OPENCODE_API_URL` | OpenCode server URL | No | `http://localhost:4096` |
+| `OPENCODE_SERVER_USERNAME` | Server auth username | No | `opencode` |
+| `OPENCODE_SERVER_PASSWORD` | Server auth password | No ||
+| `OPENCODE_MODEL_PROVIDER` | Default model provider | Yes | `opencode` |
+| `OPENCODE_MODEL_ID` | Default model ID | Yes | `big-pickle` |
+| `BOT_LOCALE` | Bot UI language (`en` or `ru`) | No | `en` |
+| `SESSIONS_LIST_LIMIT` | Max sessions shown in `/sessions` | No | `10` |
+| `PROJECTS_LIST_LIMIT` | Max projects shown in `/projects` | No | `10` |
+| `SERVICE_MESSAGES_INTERVAL_SEC` | Service messages interval (thinking + tool calls); keep `>=2` to avoid Telegram rate limits, `0` = immediate | No | `5` |
+| `HIDE_THINKING_MESSAGES` | Hide `💭 Thinking...` service messages | No | `false` |
+| `HIDE_TOOL_CALL_MESSAGES` | Hide tool-call service messages (`💻 bash ...`, `📖 read ...`, etc.) | No | `false` |
+| `CODE_FILE_MAX_SIZE_KB` | Max file size (KB) to send as document | No | `100` |
+| `STT_API_URL` | Whisper-compatible API base URL (enables voice/audio transcription) | No ||
+| `STT_API_KEY` | API key for your STT provider | No ||
+| `STT_MODEL` | STT model name passed to `/audio/transcriptions` | No | `whisper-large-v3-turbo` |
+| `STT_LANGUAGE` | Optional language hint (empty = provider auto-detect) | No ||
+| `LOG_LEVEL` | Log level (`debug`, `info`, `warn`, `error`) | No | `info` |

 > **Keep your `.env` file private.** It contains your bot token. Never commit it to version control.

+### Voice and Audio Transcription (Optional)
+
+If `STT_API_URL` and `STT_API_KEY` are set, the bot will:
+
+1. Accept `voice` and `audio` Telegram messages
+2. Transcribe them via `POST {STT_API_URL}/audio/transcriptions`
+3. Show recognized text in chat
+4. Send the recognized text to OpenCode as a normal prompt
+
+Supported provider examples (Whisper-compatible):
+
+- **OpenAI**
+  - `STT_API_URL=https://api.openai.com/v1`
+  - `STT_MODEL=whisper-1`
+- **Groq**
+  - `STT_API_URL=https://api.groq.com/openai/v1`
+  - `STT_MODEL=whisper-large-v3-turbo`
+- **Together**
+  - `STT_API_URL=https://api.together.xyz/v1`
+  - `STT_MODEL=openai/whisper-large-v3`
+
+If STT variables are not set, voice/audio transcription is disabled and the bot will ask you to configure STT.
+
 ### Model Configuration

 The bot picks up your **favorite models** from OpenCode. To add a model to favorites:
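
The transcription step described in the README hunk above (`POST {STT_API_URL}/audio/transcriptions`) might look roughly like the TypeScript sketch below. It is not the code from this commit: `SttRequestSettings`, `buildTranscriptionRequest`, and `transcribeAudio` are hypothetical names. It assumes the usual OpenAI-style contract for Whisper-compatible endpoints (multipart fields `file`, `model`, and optional `language`, a bearer token, and a JSON response with a `text` field), plus a runtime that provides global `fetch`, `FormData`, and `Blob` (for example Node 18 or later).

```typescript
// Hypothetical settings shape; field names mirror the STT_* variables.
interface SttRequestSettings {
  apiUrl: string;
  apiKey: string;
  model: string;
  language?: string;
}

interface TranscriptionRequest {
  url: string;
  headers: Record<string, string>;
  body: FormData;
}

// Build the multipart request without sending it, so it can be inspected offline.
function buildTranscriptionRequest(
  settings: SttRequestSettings,
  audio: Blob,
  filename: string,
): TranscriptionRequest {
  const body = new FormData();
  body.append("file", audio, filename);
  body.append("model", settings.model);
  if (settings.language) {
    // Omit the hint entirely so the provider can auto-detect the language.
    body.append("language", settings.language);
  }
  // Tolerate a trailing slash in STT_API_URL.
  const baseUrl = settings.apiUrl.replace(/\/+$/, "");
  return {
    url: `${baseUrl}/audio/transcriptions`,
    headers: { Authorization: `Bearer ${settings.apiKey}` },
    body,
  };
}

// Send the request and return the recognized text (assumes a `{ text }` JSON reply).
async function transcribeAudio(
  settings: SttRequestSettings,
  audio: Blob,
  filename: string,
): Promise<string> {
  const request = buildTranscriptionRequest(settings, audio, filename);
  const response = await fetch(request.url, {
    method: "POST",
    headers: request.headers,
    body: request.body,
  });
  if (!response.ok) {
    throw new Error(`STT request failed with status ${response.status}`);
  }
  const data = (await response.json()) as { text?: string };
  return data.text ?? "";
}
```

Keeping request construction separate from the network call is a design choice of this sketch: it lets the URL and form fields be checked without contacting a provider.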

package-lock.json

Lines changed: 2 additions & 2 deletions

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@grinev/opencode-telegram-bot",
-  "version": "0.6.1",
+  "version": "0.7.0",
   "description": "Telegram bot client for OpenCode to run and monitor coding tasks from chat.",
   "type": "module",
   "main": "./dist/index.js",
"main": "./dist/index.js",

src/bot/commands/projects.ts

Lines changed: 2 additions & 2 deletions
@@ -15,9 +15,9 @@ import { createMainKeyboard } from "../utils/keyboard.js";
 import { ensureActiveInlineMenu, replyWithInlineMenu } from "../handlers/inline-menu.js";
 import { logger } from "../../utils/logger.js";
 import { t } from "../../i18n/index.js";
+import { config } from "../../config.js";

 const MAX_INLINE_BUTTON_LABEL_LENGTH = 64;
-const MAX_PROJECTS_TO_SHOW = 10;

 function formatProjectButtonLabel(label: string, isActive: boolean): string {
   const prefix = isActive ? "✅ " : "";
@@ -34,7 +34,7 @@ export async function projectsCommand(ctx: CommandContext<Context>) {
   try {
     await syncSessionDirectoryCache();
     const projects = await getProjects();
-    const projectsToShow = projects.slice(0, MAX_PROJECTS_TO_SHOW);
+    const projectsToShow = projects.slice(0, config.bot.projectsListLimit);

     if (projectsToShow.length === 0) {
       await ctx.reply(t("projects.empty"));
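
The hunk above replaces the hard-coded `MAX_PROJECTS_TO_SHOW` with `config.bot.projectsListLimit`, but the corresponding change to `config.ts` is not shown here. The TypeScript sketch below is one plausible way such a limit could be derived from `PROJECTS_LIST_LIMIT`. The `parseListLimit` helper and the `exampleConfig` object are hypothetical, and the fallback of `10` mirrors the documented default.

```typescript
// Hypothetical helper: parse a positive integer limit, falling back to a default
// when the variable is unset, non-numeric, or not positive.
function parseListLimit(raw: string | undefined, fallback: number = 10): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Stand-in for the real config object; only the field used in the diff is modeled.
// A real implementation would presumably read process.env.PROJECTS_LIST_LIMIT.
const exampleConfig = {
  bot: { projectsListLimit: parseListLimit("3") },
};

const exampleProjects = ["alpha", "beta", "gamma", "delta"];

// Same slicing pattern as in projectsCommand: keep only the first N projects.
const exampleProjectsToShow = exampleProjects.slice(0, exampleConfig.bot.projectsListLimit);
console.log(exampleProjectsToShow); // the first three project names
```

Validating the value at load time means a typo such as `PROJECTS_LIST_LIMIT=abc` degrades to the default instead of producing an empty list from `slice(0, NaN)`.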
