# Kernel Architecture

Technical documentation for contributors and advanced users.

简体中文 | English

## Module Structure

```
kernel/
├── __init__.py
├── __main__.py                 # Entry point (asyncio.run → run_bot)
├── bot.py                      # Telegram Application setup, handler registration
├── bot_logging.py              # Logging configuration
├── bot_cleanup.py              # Periodic temp file cleanup
├── config.py                   # TOML config loader (with env var support)
├── config_load.py              # Config file resolution
├── config_types.py             # Dataclasses: Config, TelegramConfig, ProviderConfig, etc.
├── render.py                   # Markdown → Telegram HTML renderer (mistune-based)
├── agent.py                    # Agent (composed via mixins)
├── agent_chat.py               # Chat loop: streaming, tool dispatch, history
├── agent_tools.py              # Built-in tool registration, CLI delegation, MCP init
├── agent_sessions.py           # Session create/restore/switch
├── agent_history.py            # History truncation and slimming
├── agent_titles.py             # Auto-generate conversation titles
├── agent_content.py            # Content block serialization
├── tg_common.py                # BotState, user check, send helpers
├── tg_message.py               # Main message handler (text/photo/file/voice)
├── tg_message_utils.py         # File type detection, TTS text extraction
├── tg_commands_sessions.py     # /new, /history, /resume, /retitle, /del_history
├── tg_commands_memory.py       # /remember, /memory, /forget
├── tg_commands_settings.py     # /provider, /model, /cancel, /status
├── models/
│   ├── base.py                 # ABC: LLM, Message, ToolDef, StreamChunk, ContentBlock
│   ├── claude.py               # Anthropic Messages API implementation
│   └── openai_compat.py        # OpenAI-compatible API implementation
├── memory/
│   ├── store.py                # SQLite store (sessions, messages, settings, memories)
│   ├── memories.py             # FTS5 full-text search (jieba tokenizer), LIKE fallback
│   └── slim.py                 # Content slimming for DB storage
├── cli/
│   ├── base.py                 # CLIAgent ABC, subprocess lifecycle, timeout, output truncation
│   ├── claude_code.py          # Claude Code CLI adapter
│   └── codex.py                # Codex CLI adapter
├── mcp/
│   └── client.py               # MCP client: stdio/HTTP transport, tool registration, reconnect
├── tools/
│   └── registry.py             # Decorator-based tool registry, auto schema from type hints
└── voice/
    ├── stt.py                  # Speech-to-text (OpenAI Whisper API)
    └── tts.py                  # Text-to-speech (Edge TTS → ffmpeg → Opus)
```

## Execution Flow

```mermaid
sequenceDiagram
    participant User as Telegram User
    participant TG as Telegram Handler
    participant Agent
    participant LLM
    participant Tools

    User->>TG: Message (text / photo / file / voice)
    TG->>TG: Auth check (allowed_user)
    TG->>TG: Preprocess (STT / base64 / file extract)
    TG->>Agent: chat(content_blocks)
    Agent->>Agent: Build system prompt (SOUL + time + memory hints)
    Agent->>LLM: chat_stream(history, system, tools)
    LLM-->>Agent: StreamChunks (text + tool_use)

    loop Tool rounds (max 25)
        Agent->>Tools: Execute tool (memory / CLI / MCP)
        Tools-->>Agent: Tool result
        Agent->>LLM: Continue with tool results
        LLM-->>Agent: StreamChunks
    end

    Agent-->>TG: Final text
    TG->>TG: Render markdown → Telegram HTML
    TG->>User: Reply (text or voice)
```

## Agent Composition

The `Agent` class is assembled from five mixins:

| Mixin | Responsibility |
|---|---|
| `AgentToolsMixin` | Built-in tools, CLI delegation, MCP tool registration |
| `AgentSessionsMixin` | Session lifecycle (create, restore, switch) |
| `AgentHistoryMixin` | Context window management, history truncation/slimming |
| `AgentChatMixin` | Streaming chat loop, tool dispatch (up to 25 rounds) |
| `AgentTitlesMixin` | Auto-generate session titles via a lightweight LLM |

```python
class Agent(AgentToolsMixin, AgentSessionsMixin, AgentHistoryMixin, AgentChatMixin, AgentTitlesMixin):
    ...
```

## LLM Abstraction

Location: `models/base.py`

All providers implement the `LLM` abstract class:

| Method | Purpose |
|---|---|
| `chat()` | Single-shot request → `LLMResponse` |
| `chat_stream()` | Streaming request → `AsyncIterator[StreamChunk]` |
| `close()` | Cleanup |

Content is modeled as typed blocks: `TextContent`, `ImageContent`, `ToolUseContent`, `ToolResultContent`.
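The shape of this interface, sketched from the table above; the exact signatures are assumptions, so see `models/base.py` for the real definitions:

```python
# Sketch only: names and fields are inferred from the table above.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class LLMResponse:
    text: str  # tool calls, usage, etc. elided


@dataclass
class StreamChunk:
    type: str       # e.g. "text" or "tool_use"
    text: str = ""


class LLM(ABC):
    """Provider-agnostic interface; claude.py and openai_compat.py implement it."""

    @abstractmethod
    async def chat(self, messages, system: str, tools) -> LLMResponse:
        """Single-shot request returning a complete response."""

    @abstractmethod
    def chat_stream(self, messages, system: str, tools) -> AsyncIterator[StreamChunk]:
        """Streaming request yielding StreamChunks as they arrive."""

    @abstractmethod
    async def close(self) -> None:
        """Release underlying HTTP clients."""
```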

### Providers

| Provider | Module | API |
|---|---|---|
| Anthropic Claude | `models/claude.py` | Anthropic Messages API (native) |
| OpenAI-compatible | `models/openai_compat.py` | OpenAI Chat Completions (works with OpenAI, DeepSeek, etc.) |

Provider selection can be switched at runtime via the `/provider` command.


## Memory System

Location: `memory/store.py`, `memory/memories.py`

### Storage

SQLite database (`data/kernel.db`) with WAL mode. The schema version is tracked via `PRAGMA user_version`.
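A minimal sketch of how such a store might be opened with `aiosqlite` (listed under Dependencies); the helper name, version constant, and migration step are illustrative:

```python
# Illustrative only: WAL mode + PRAGMA user_version, as described above.
import aiosqlite

SCHEMA_VERSION = 1  # hypothetical current version

async def open_store(path: str = "data/kernel.db") -> aiosqlite.Connection:
    db = await aiosqlite.connect(path)
    await db.execute("PRAGMA journal_mode=WAL")    # write-ahead logging
    cur = await db.execute("PRAGMA user_version")  # stored schema version
    (version,) = await cur.fetchone()
    if version < SCHEMA_VERSION:
        # ...apply migrations here, then bump the recorded version
        await db.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
        await db.commit()
    return db
```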

### Tables

| Table | Purpose |
|---|---|
| `sessions` | Conversation metadata (title, timestamps, archived flag) |
| `messages` | Chat messages (JSON content, linked to session) |
| `settings` | Key-value store (current provider, model, etc.) |
| `memories` | Long-term memory entries |
| `memories_fts` | FTS5 virtual table for full-text search (if available) |

### Full-Text Search

Location: `memory/memories.py`

- Tokenizer: jieba (Chinese + English segmentation)
- FTS5 index built on startup; falls back to `LIKE` if FTS5 is unavailable
- Search strategy: exact match → OR query → `LIKE` fallback
- Terms are filtered (minimum length 2, stop words removed, at most 8 terms)
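A condensed sketch of this layered strategy; the stop-word list, SQL, and helper names below are assumptions, not the actual `memories.py` code:

```python
# Sketch: jieba term filtering, then exact phrase → OR query → LIKE fallback.
import jieba

STOP_WORDS = {"the", "a", "an", "的", "了"}  # illustrative subset

def extract_terms(query: str, max_terms: int = 8) -> list[str]:
    tokens = (t.strip() for t in jieba.cut_for_search(query))
    kept = [t for t in tokens if len(t) >= 2 and t.lower() not in STOP_WORDS]
    return kept[:max_terms]

async def fts_query(db, match: str):
    cur = await db.execute(
        "SELECT m.id, m.text FROM memories m "
        "JOIN memories_fts ON memories_fts.rowid = m.id "
        "WHERE memories_fts MATCH ?",
        (match,),
    )
    return await cur.fetchall()

async def search(db, query: str):
    # 1. Exact phrase against the FTS5 index
    rows = await fts_query(db, f'"{query}"')
    # 2. Relax to an OR query over the filtered terms
    if not rows:
        terms = extract_terms(query)
        if terms:
            rows = await fts_query(db, " OR ".join(terms))
    # 3. LIKE fallback (also the path when FTS5 itself is unavailable)
    if not rows:
        cur = await db.execute(
            "SELECT id, text FROM memories WHERE text LIKE ?", (f"%{query}%",))
        rows = await cur.fetchall()
    return rows
```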

### AI-Driven Memory

The system prompt instructs the LLM to autonomously call `memory_search` when a question may depend on user preferences or history. The LLM decides when to store and recall — no hardcoded triggers.


## Tool System

Location: `tools/registry.py`

### Registry

`ToolRegistry` uses a decorator pattern. Parameter schemas are auto-generated from Python type hints:

```python
@registry.tool("memory_add", description="Store to long-term memory")
async def memory_add(text: str) -> dict:
    ...
```

Type mapping: `str` → string, `int` → integer, `float` → number, `bool` → boolean, `list[T]` → array, `Literal[...]` → enum.
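A sketch of how such a schema can be derived from type hints with the mapping above; `registry.py`'s real implementation may handle more cases:

```python
# Sketch: build a JSON Schema for a tool function from its type hints.
import inspect
import typing

PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_for(func) -> dict:
    hints = typing.get_type_hints(func)
    props, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        hint = hints.get(name, str)
        origin = typing.get_origin(hint)
        if origin is typing.Literal:
            props[name] = {"enum": list(typing.get_args(hint))}
        elif origin is list:
            (item,) = typing.get_args(hint)
            props[name] = {"type": "array",
                           "items": {"type": PRIMITIVES.get(item, "string")}}
        else:
            props[name] = {"type": PRIMITIVES.get(hint, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}
```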

### Built-in Tools

| Tool | Purpose |
|---|---|
| `delegate_to_cli` | Delegate tasks to CLI agents (Claude Code / Codex) |
| `memory_add` | Store a memory |
| `memory_search` | Search memories |
| `memory_list` | List all memories |
| `memory_delete` | Delete a memory by ID |

### MCP Tools

MCP tools are dynamically registered at startup. Tool names are sanitized to the `mcp_{server}__{tool}` format (max 64 chars, SHA1 suffix if truncated). Both stdio and HTTP (Streamable HTTP) transports are supported, with auto-reconnect on failure (1 retry).
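A sketch of that sanitization rule; the charset filter and the 8-character digest length are assumptions about `mcp/client.py`, not confirmed details:

```python
# Sketch: mcp_{server}__{tool}, capped at 64 chars with a SHA1 suffix.
import hashlib
import re

MAX_LEN = 64

def sanitize_tool_name(server: str, tool: str) -> str:
    name = f"mcp_{server}__{tool}"
    name = re.sub(r"[^a-zA-Z0-9_-]", "_", name)   # keep a provider-safe charset
    if len(name) > MAX_LEN:
        digest = hashlib.sha1(name.encode()).hexdigest()[:8]
        name = f"{name[:MAX_LEN - 9]}_{digest}"   # truncate, then append digest
    return name
```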


## CLI Delegation

Location: `cli/base.py`, `cli/claude_code.py`, `cli/codex.py`

### Process Lifecycle

1. Resolve the command via `shutil.which`
2. Spawn the subprocess (`asyncio.create_subprocess_exec`, or `_shell` on Windows)
3. Wait with a timeout (default 600s)
4. Capture stdout/stderr, truncate output (max 50K chars)
5. Save raw output to `data/cli_outputs/` (see the sketch below)
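A condensed sketch of steps 1–4; `cli/base.py` additionally saves the raw output and handles Windows shells, and the names here (`run_cli`, `MAX_OUTPUT`) are assumptions:

```python
# Sketch: resolve → spawn → wait with timeout → truncate.
import asyncio
import shutil

MAX_OUTPUT = 50_000  # chars

async def run_cli(cmd: str, args: list[str], cwd: str, timeout: float = 600) -> str:
    path = shutil.which(cmd)                      # 1. resolve the command
    if path is None:
        raise FileNotFoundError(f"{cmd} not found on PATH")
    proc = await asyncio.create_subprocess_exec(  # 2. spawn the subprocess
        path, *args, cwd=cwd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:                                          # 3. wait with timeout
        out, _ = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()
        raise
    return out.decode(errors="replace")[:MAX_OUTPUT]  # 4. truncate output
```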

### Work Directory

- If `cwd` is provided: use it directly
- Otherwise: create a unique subdirectory under `tasks/` (`s{session}_{cli}_{timestamp}_{uid}`)

### Cancellation

`/cancel` command → `Agent.cancel()` → `proc.kill()` on the active CLI process.


## Voice Pipeline

### Speech-to-Text

Location: `voice/stt.py`

Uses the OpenAI Whisper API (or a compatible endpoint). The voice message is downloaded as `.ogg`, transcribed, and the temp file is deleted.
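A sketch of the transcription step using the `openai` client listed under Dependencies; the helper signature and model name are illustrative, not the actual `stt.py` code:

```python
# Sketch: transcribe a downloaded .ogg and clean up the temp file.
from pathlib import Path
from openai import AsyncOpenAI

async def transcribe(ogg_path: Path, client: AsyncOpenAI) -> str:
    try:
        with ogg_path.open("rb") as f:
            result = await client.audio.transcriptions.create(
                model="whisper-1", file=f)
        return result.text
    finally:
        ogg_path.unlink(missing_ok=True)  # delete the temp download
```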

### Text-to-Speech

Location: `voice/tts.py`

1. Edge TTS generates an MP3 (free, no API key)
2. ffmpeg converts MP3 → Opus/OGG (48 kbps)
3. The result is sent as a Telegram voice message

ffmpeg is bundled via `static-ffmpeg` (no system install needed).
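A sketch of the three steps using `edge-tts` and `static-ffmpeg`; the voice name and helper signature are examples, not the actual `tts.py` code:

```python
# Sketch: Edge TTS → MP3, then ffmpeg → Opus/OGG at 48 kbps.
import asyncio
import edge_tts
import static_ffmpeg

static_ffmpeg.add_paths()  # puts the bundled ffmpeg on PATH

async def synthesize(text: str, mp3_path: str, ogg_path: str,
                     voice: str = "en-US-AriaNeural") -> str:
    await edge_tts.Communicate(text, voice).save(mp3_path)   # 1. MP3
    proc = await asyncio.create_subprocess_exec(             # 2. → Opus/OGG
        "ffmpeg", "-y", "-i", mp3_path,
        "-c:a", "libopus", "-b:a", "48k", ogg_path)
    await proc.wait()
    return ogg_path  # 3. hand to the Telegram send-voice helper
```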


## Telegram Message Rendering

Location: `render.py`

A custom mistune renderer (`TelegramHTMLRenderer`) converts Markdown to Telegram-compatible HTML:

| Markdown | Telegram HTML |
|---|---|
| `**bold**` | `<b>bold</b>` |
| `*italic*` | `<i>italic</i>` |
| `` `code` `` | `<code>code</code>` |
| Code blocks | `<pre><code>...</code></pre>` |
| `~~strike~~` | `<s>strike</s>` |
| Lists | `• item` |
| Blockquotes | `▍ quoted` |

Messages exceeding 4096 chars are split at paragraph/line boundaries with proper HTML tag repair.
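A minimal illustration of the renderer approach using mistune 3's `HTMLRenderer` hooks; the real `TelegramHTMLRenderer` covers far more node types plus the tag-repairing splitter, and the class name below is just for the sketch:

```python
# Sketch: override a few HTMLRenderer hooks to emit Telegram's tag set.
import html
import mistune

class TelegramHTML(mistune.HTMLRenderer):
    def paragraph(self, text):
        return text + "\n\n"  # Telegram HTML has no <p> tag

    def strong(self, text):
        return f"<b>{text}</b>"

    def emphasis(self, text):
        return f"<i>{text}</i>"

    def codespan(self, text):
        return f"<code>{html.escape(text)}</code>"

render = mistune.create_markdown(renderer=TelegramHTML())
print(render("**bold** and `code`"))  # <b>bold</b> and <code>code</code>
```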


## Configuration

Location: `config.py`, `config_types.py`, `config_load.py`

TOML-based configuration. API keys support environment variable override (env: `KERNEL_*`).

### Config Sections

| Section | Required | Purpose |
|---|---|---|
| `[telegram]` | Yes | Bot token, allowed user ID |
| `[general]` | Yes | Timezone, default provider, context rounds, memory settings |
| `[providers.*]` | Yes (≥1) | LLM provider configs (type, API key, models) |
| `[titles]` | No | Lightweight LLM for auto-generating session titles |
| `[stt]` | No | Speech-to-text (Whisper API endpoint) |
| `[tts]` | No | Text-to-speech (Edge TTS voice name) |
| `[cli.*]` | No | CLI agent commands and args |
| `[[mcp.servers]]` | No | MCP server connections |
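A minimal config might look like the following; the section names match the table above, but the key names inside each section are illustrative, so check `config_types.py` for the authoritative fields:

```toml
# Illustrative minimal config; key names are examples, not authoritative.
[telegram]
token = "123456:ABC..."      # bot token
allowed_user = 123456789     # the single permitted Telegram user ID

[general]
timezone = "UTC"
default_provider = "claude"

[providers.claude]
type = "claude"
api_key = "sk-..."           # or supplied via a KERNEL_* env var override
models = ["..."]
```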

### Data Directory

All runtime data lives under `data/` (configurable via `general.data_dir`):

```
data/
├── kernel.db           # SQLite database (sessions, messages, memories, settings)
├── cli_outputs/        # Raw CLI agent output files
├── downloads/          # Temporary file downloads (auto-cleaned after 7 days)
└── voice_replies/      # Temporary TTS output (auto-cleaned after 7 days)
```

Periodic cleanup runs every 24 hours, removing files older than 7 days from temp directories.
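A sketch of that cleanup loop (the directory names come from the tree above; the helper name and exact scheduling are assumptions about `bot_cleanup.py`):

```python
# Sketch: every 24h, delete temp files older than 7 days.
import asyncio
import time
from pathlib import Path

TEMP_DIRS = ["data/downloads", "data/voice_replies"]
MAX_AGE = 7 * 24 * 3600      # 7 days, in seconds
INTERVAL = 24 * 3600         # run once per day

async def cleanup_loop() -> None:
    while True:
        cutoff = time.time() - MAX_AGE
        for d in TEMP_DIRS:
            for f in Path(d).glob("*"):
                if f.is_file() and f.stat().st_mtime < cutoff:
                    f.unlink(missing_ok=True)
        await asyncio.sleep(INTERVAL)
```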


## Security

| Mechanism | Detail |
|---|---|
| Single-user auth | Every message checked against `telegram.allowed_user` |
| No inbound ports | Long-polling only, no webhook exposure |
| API key isolation | Keys in config file, env var override supported |
| Sensitive masking | Error messages mask API keys before sending to Telegram |
| CLI sandboxing | CLI agents run as subprocesses with configurable flags |

## Dependencies

| Package | Purpose |
|---|---|
| `python-telegram-bot` | Telegram Bot API |
| `anthropic` | Anthropic Claude API |
| `openai` | OpenAI-compatible APIs |
| `aiosqlite` | Async SQLite |
| `jieba` | Chinese text segmentation (for FTS) |
| `mistune` | Markdown parsing |
| `mcp` | Model Context Protocol client |
| `httpx` | HTTP client (for MCP) |
| `edge-tts` | Free text-to-speech |
| `static-ffmpeg` | Bundled ffmpeg binary |
| `tomli` | TOML parsing (Python < 3.12) |
| `tzdata` | Timezone data |