docs: update documentation for semantic markers and change intent detection

Sephyi · Sephyi · commit a6c22fbe106a · 2026-03-28T00:55:51.000+01:00
Document semantic marker detection (unsafe, derive, decorators, export,
mutability, generic constraints) and diff-based change intent patterns
(error handling, test, logging, dependency updates) in CHANGELOG, PRD,
README, and DOCS. Update the test count from 410 to 424.
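
For reviewers unfamiliar with the intent scan these docs describe, here is a
minimal sketch of the idea: look only at added diff lines, match them against
per-intent substrings, and report a score. Everything in it is an assumption
made for illustration. The `Intent` enum, the three-pattern lists, and the
"fraction of added lines that matched" score are hypothetical stand-ins, and
this `detect_intents` is not the implementation referenced in PRD FR-072.

```rust
// Hypothetical sketch: the names, pattern lists, and scoring here are
// illustrative assumptions, not CommitBee's actual code.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Intent {
    ErrorHandling,
    Test,
    Logging,
}

const ALL_INTENTS: [Intent; 3] = [Intent::ErrorHandling, Intent::Test, Intent::Logging];

/// Substrings that hint at each intent (a small assumed subset).
fn needles(intent: Intent) -> &'static [&'static str] {
    match intent {
        Intent::ErrorHandling => &["Result<", "Err(", ".map_err("],
        Intent::Test => &["#[test]", "assert!", "assert_eq!"],
        Intent::Logging => &["tracing::", "debug!", "info!"],
    }
}

/// Returns each detected intent with the fraction of added lines that matched
/// it. That fraction is an assumed stand-in for a confidence score.
pub fn detect_intents(diff: &str) -> Vec<(Intent, f64)> {
    // Keep only added lines: they start with '+', but "+++" is a file header.
    let added: Vec<&str> = diff
        .lines()
        .filter(|line| !line.starts_with("+++"))
        .filter_map(|line| line.strip_prefix('+'))
        .collect();
    if added.is_empty() {
        return Vec::new();
    }

    ALL_INTENTS
        .iter()
        .filter_map(|&intent| {
            // Count added lines containing at least one needle for this intent.
            let hits = added
                .iter()
                .filter(|line| needles(intent).iter().any(|n| line.contains(*n)))
                .count();
            (hits > 0).then(|| (intent, hits as f64 / added.len() as f64))
        })
        .collect()
}
```

Calling `detect_intents` on a diff whose three added lines include one
`Result<` line and one `tracing::debug!` line would report `ErrorHandling` and
`Logging`, each with a score of one third. Removed and context lines are
ignored, which matches the "scans added lines" wording in the CHANGELOG entry.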
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -19,18 +19,25 @@ All notable changes to CommitBee are documented here.
 - **Structural AST diffs** — `AstDiffer` compares old and new tree-sitter nodes for modified symbols, producing structured `SymbolDiff` descriptions (parameter added, return type changed, visibility changed, async toggled, body modified). Shown as `STRUCTURED CHANGES:` section in the prompt.
 - **Whitespace-aware body comparison** — Body diff uses character-stream stripping so reformatting doesn't produce false `BodyModified` results.
 - **Structured changes in prompt** — New `STRUCTURED CHANGES:` section in the LLM prompt shows concise one-line descriptions of what changed per symbol (e.g., `CommitValidator::validate(): +param strict: bool, return bool → Result<()>, body modified`). Omitted when no structural diffs exist.
+- **Semantic markers** — `AstDiffer` now detects `unsafe` added/removed, `#[derive()]` changes, decorator additions/removals, export changes, mutability changes, and generic constraint changes. Shown as `+unsafe`, `+derive(Debug, Clone)`, etc. in the STRUCTURED CHANGES section.
 
 ### Type Inference
 
 - **Test-to-code ratio** — When >80% of additions are in test files, suggests `test` type even with source files present. Uses cross-multiplication to avoid integer truncation.
 
+### Change Intent Detection
+
+- **Diff-based intent patterns** — Scans added lines for error handling (`Result`, `?`, `Err()`), test additions (`#[test]`, `assert!`), logging (`tracing::`, `debug!`), and dependency updates. Shown as `INTENT:` section in the prompt with confidence scores.
+- **Conservative type refinement** — High-confidence performance optimization patterns can override the base type to `perf`.
+
 ### Prompt Quality
 
 - **Token budget rebalance** — Symbol budget reduced from 30% to 20% when structural diffs are available, freeing space for the raw diff. SYSTEM_PROMPT updated to guide the LLM to prefer STRUCTURED CHANGES for signature details.
+- **Unsafe constraint rule** — When `unsafe` is added to a function, a CONSTRAINTS rule instructs the LLM to mention safety justification in the commit body.
 
 ### Testing
 
-- **410 tests** total (up from 367 at v0.5.0).
+- **424 tests** total (up from 367 at v0.5.0).
 
 ## `v0.5.0` — Beyond the Diff
 
diff --git a/DOCS.md b/DOCS.md
@@ -86,11 +86,11 @@ Here's what each step actually does:
 
 **1. Git Service** reads your staged changes using `gix` for repo discovery and the git CLI for diffs. Paths are parsed with NUL-delimited output (`-z` flag) so filenames with spaces or special characters work correctly.
 
-**2. Tree-sitter Analyzer** parses both the staged version and the HEAD version of every changed file — in parallel, using `rayon` across CPU cores. It extracts **full signatures** (e.g., `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`) by taking the definition node text before the body child. Methods include their **parent scope** (enclosing impl, class, or trait — e.g., `CommitValidator::validate`). Modified symbols show old → new signature diffs, with **structural AST diffs** that describe exactly what changed (parameters added/removed, return type changed, visibility changed, etc.). Cross-file connections are detected (caller+callee both changed). Symbols are tracked in three states: added, removed, or modified-signature, with a **doc-vs-code distinction** indicating whether changes were documentation-only, code-only, or mixed.
+**2. Tree-sitter Analyzer** parses both the staged version and the HEAD version of every changed file — in parallel, using `rayon` across CPU cores. It extracts **full signatures** (e.g., `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`) by taking the definition node text before the body child. Methods include their **parent scope** (enclosing impl, class, or trait — e.g., `CommitValidator::validate`). Modified symbols show old → new signature diffs, with **structural AST diffs** that describe exactly what changed (parameters added/removed, return type changed, visibility changed, semantic markers like `unsafe`, `derive`, decorators, `export`, mutability, generic constraints, etc.). Cross-file connections are detected (caller+callee both changed). Symbols are tracked in three states: added, removed, or modified-signature, with a **doc-vs-code distinction** indicating whether changes were documentation-only, code-only, or mixed.
 
 **3. Commit Splitter** looks at your staged changes and decides whether they contain logically independent work. It uses diff-shape fingerprinting (what kind of changes — additions, deletions, modifications) combined with Jaccard similarity on content vocabulary to group files. If it finds multiple concerns, it offers to split them into separate commits.
 
-**4. Context Builder** assembles a budget-aware prompt. It classifies modified symbols as whitespace-only or semantic (via character-stream comparison), computes evidence flags (mechanical change? public APIs removed? bug-fix evidence?), detects cross-file connections, identifies import changes and test file correlations, calculates the character budget for the subject line, and packs context within the token limit (~6K tokens). The token budget adapts: when structural AST diffs are available, symbols get 20% of the budget (diffs carry more detail); when only signatures are available, symbols get 30%.
+**4. Context Builder** assembles a budget-aware prompt. It classifies modified symbols as whitespace-only or semantic (via character-stream comparison), computes evidence flags (mechanical change? public APIs removed? bug-fix evidence?), detects **change intent** (error handling, test, logging, dependency update patterns) for the `INTENT:` prompt section, detects cross-file connections, identifies import changes and test file correlations, calculates the character budget for the subject line, and packs context within the token limit (~6K tokens). The token budget adapts: when structural AST diffs are available, symbols get 20% of the budget (diffs carry more detail); when only signatures are available, symbols get 30%.
 
 **5. LLM Provider** streams the prompt to your chosen model (Ollama, OpenAI, or Anthropic) and collects the response token by token.
 
@@ -107,12 +107,13 @@ CommitBee doesn't just send a diff. The prompt includes:
 - **Evidence flags** telling the LLM deterministic facts about the change
 - **Symbol changes with full signatures** — `[+] pub fn connect(host: &str) -> Result<()>`, not just "Function connect"
 - **Signature diffs** — `[~] old_sig → new_sig` for modified symbols
-- **Structured AST diffs** — `CommitValidator::validate(): +param timeout, return Result<()> → Result<Error>` (precise semantic changes from AST comparison)
+- **Structured AST diffs** — `CommitValidator::validate(): +param timeout, return Result<()> → Result<Error>` (precise semantic changes from AST comparison, including semantic markers like `+unsafe`, `+derive(Clone)`, `export added`, `mutability changed`)
 - **Import changes** — `analyzer: added use crate::domain::DiffHunk` (tracked per file)
 - **Test file correlations** — `src/services/context.rs <-> tests/context.rs (test file)`
 - **Doc-vs-code annotations** — modified symbols tagged `[docs only]` or `[docs + code]` when change is documentation-only or mixed
 - **Cross-file connections** — `validator calls parse() — both changed`
 - **Primary change detection** — which file has the most significant changes
+- **Change intent** — detected patterns like error handling, test additions, logging, or dependency updates, surfaced as an `INTENT:` section
 - **Constraints** — rules the LLM must follow based on evidence (e.g., "no bug-fix comments found, prefer refactor over fix")
 - **Character budget** — exact number of chars available for the subject line
 - **Group rationale** — when splitting, why these files are grouped together
@@ -657,7 +658,7 @@ src/
 ├── domain/
 │   ├── change.rs        # FileChange, StagedChanges, ChangeStatus
 │   ├── symbol.rs        # CodeSymbol, SymbolKind, SpanChangeKind
-│   ├── diff.rs          # SymbolDiff, ChangeDetail (structural AST diffs)
+│   ├── diff.rs          # SymbolDiff, ChangeDetail (structural AST diffs + 10 semantic marker variants)
 │   ├── context.rs       # PromptContext — assembles the LLM prompt
 │   └── commit.rs        # CommitType enum (single source of truth)
 └── services/
@@ -702,7 +703,7 @@ No panics in user-facing code paths. The sanitizer and validator are tested with
 
 ### Testing Strategy
 
-CommitBee has 410 tests across multiple strategies:
+CommitBee has 424 tests across multiple strategies:
 
 | Strategy | What It Covers |
 | --- | --- |
@@ -715,7 +716,7 @@ CommitBee has 410 tests across multiple strategies:
 Run them:
 
 ```bash
-cargo test                    # All 410 tests
+cargo test                    # All 424 tests
 cargo test --test sanitizer   # Just sanitizer tests
 cargo test --test integration # LLM provider mocks
 COMMITBEE_LOG=debug cargo test -- --nocapture  # With logging
diff --git a/PRD.md b/PRD.md
@@ -18,7 +18,7 @@ SPDX-License-Identifier: AGPL-3.0-only OR LicenseRef-Commercial
 
 | Version | Date       | Summary |
 |---------|------------|---------|
-| 4.3     | 2026-03-27 | v0.6.0-rc.1 deep semantic understanding: parent scope, import detection, doc-vs-code classification, structural AST diffs (AstDiffer + SymbolDiff), STRUCTURED CHANGES prompt section, token budget rebalance. 410 tests. |
+| 4.3     | 2026-03-27 | v0.6.0-rc.1 deep semantic understanding: parent scope, import detection, doc-vs-code classification, structural AST diffs (AstDiffer + SymbolDiff), STRUCTURED CHANGES prompt section, token budget rebalance, T3 semantic markers (FR-071), change intent detection (FR-072). 424 tests. |
 | 4.2     | 2026-03-22 | v0.5.0 hardening: security fixes (SSRF prevention, streaming caps), prompt optimization (budget fix, evidence omission, emoji removal), eval harness (36 fixtures, per-type reporting), test coverage (15+ new tests), API hygiene (pub(crate) demotions), 5 fuzz targets. 359 tests. |
 | 4.1     | 2026-03-22 | AST context overhaul (v0.5.0): full signature extraction from tree-sitter nodes, semantic change classification (whitespace vs body vs signature), old→new signature diffs, cross-file connection detection, formatting auto-detection via symbols. 359 tests. |
 | 4.0     | 2026-03-13 | PRD normalization: aligned phases with shipped versions (v0.2.0/v0.3.x/v0.4.0), collapsed revision history, unified status markers, resolved stale critical issues, canonicalized test count to 308, removed dead cross-references. FR-031 (Exclude Files) and FR-033 (Copy to Clipboard) shipped. |
@@ -95,7 +95,7 @@ CommitBee is a Rust-native CLI tool that uses tree-sitter semantic analysis and
 | Multiple message generation (pick from N)          | Common (aicommits, aicommit2) | ✅ v0.2.0       |
 | Commit splitting (multi-concern detection)         | No competitor has this        | ✅ v0.2.0       |
 | Custom prompt/instruction files                    | Growing (Copilot, aicommit2)  | ✅ v0.4.0       |
-| Unit/integration tests                             | Non-negotiable for quality    | ✅ 410 tests    |
+| Unit/integration tests                             | Non-negotiable for quality    | ✅ 424 tests    |
 
 ## 3. Architecture
 
@@ -511,6 +511,14 @@ In `infer_commit_type`, when >80% of additions are in `FileCategory::Test` files
 
 `STRUCTURED CHANGES:` section in LLM prompt renders `SymbolDiff::format_oneline()` descriptions (e.g., `CommitValidator::validate(): +param strict: bool, return bool → Result<()>, body modified (+5 -2)`). Omitted when no structural diffs exist. Token budget rebalanced: symbol budget reduced from 30% to 20% when structural diffs available, freeing space for raw diff. SYSTEM_PROMPT updated to guide LLM to prefer structured changes for signature details. 3 tests.
 
+#### FR-071: Semantic Marker Detection ✅
+
+`AstDiffer` is extended with 10 marker variants in `ChangeDetail`: `UnsafeAdded`/`UnsafeRemoved`, `DeriveAdded`/`DeriveRemoved`, `DecoratorAdded`/`DecoratorRemoved`, `ExportAdded`/`ExportRemoved`, `MutabilityChanged`, and `GenericConstraintChanged`. It extracts the `unsafe` keyword, derive attributes, and mutability from tree-sitter nodes during function comparison. Unsafe additions set the `has_unsafe_addition` evidence flag and trigger a CONSTRAINTS rule requiring safety justification in the commit body. 4 unit tests.
+
+#### FR-072: Change Intent Detection ✅
+
+`detect_intents()` scans added diff lines for error handling patterns (9 patterns including `Result<>`, `?`, `Err()`, `.map_err()`), test patterns (6 patterns including `#[test]`, `assert!`), logging patterns (9 patterns including `tracing::`, `debug!()`, `info!()`), and dependency updates (version changes in manifests). `INTENT:` prompt section shows detected patterns with confidence scores. `refine_type_with_intents()` conservatively overrides base type only for high-confidence performance optimization. 7 tests.
+
 ### 4.7 Future — v0.7.0+ (Market Leadership)
 
 #### FR-050: MCP Server Mode
@@ -693,7 +701,7 @@ commitbee eval                         # Run evaluation harness (dev, feature-ga
 
 ## 8. Testing Requirements
 
-**Current test count: 410**
+**Current test count: 424**
 
 ### TR-001: Unit Tests
 
@@ -852,7 +860,7 @@ Invalid JSON → retry once with repair prompt. Second failure → heuristic ext
 | 3 | v0.4.0 | ✅ Shipped | Feature completion — templates, languages, rename, history, eval, fuzzing |
 | 4 | v0.4.x | ✅ Shipped | Remaining polish — exclude files (FR-031), clipboard (FR-033) |
 | 5 | v0.5.0 | ✅ Shipped | AST context overhaul — full signatures, semantic change classification, cross-file connections. 367 tests. |
-| 6 | v0.6.0-rc.1 | ✅ Shipped | Deep semantic understanding — parent scope, import detection, doc-vs-code classification, structural AST diffs, structured changes prompt section. 410 tests. |
+| 6 | v0.6.0-rc.1 | ✅ Shipped | Deep semantic understanding — parent scope, import detection, doc-vs-code classification, structural AST diffs, structured changes prompt section, semantic markers, change intent detection. 424 tests. |
 | 7 | v0.7.0+ | 📋 Planned | Market leadership — MCP server, changelog, monorepo, version bumping, GitHub Action |
 
 ## 12. Success Metrics
@@ -867,7 +875,7 @@ Invalid JSON → retry once with repair prompt. Second failure → heuristic ext
 | Commit message quality | > 80% "good enough" first try | Manual evaluation + `commitbee eval` |
 | Secret leak rate | 0 | Integration tests with known patterns |
 | MSRV | Rust 1.94 (edition 2024) | CI matrix (stable + 1.94) |
-| Test count | ≥ 308 | `cargo test` (current: 410) |
+| Test count | ≥ 308 | `cargo test` (current: 424) |
 
 ## 13. Non-Goals
 
diff --git a/README.md b/README.md
@@ -99,7 +99,7 @@ When your staged changes mix independent work (a bugfix in one module + a refact
 - **🐚 Shell completions** — bash, zsh, fish, powershell via `commitbee completions`.
 - **⚙️ 5-level config** — Defaults → project `.commitbee.toml` → user config → env vars → CLI flags.
 - **🦀 Single binary** — ~18K lines of Rust. Compiles to one static binary with LTO. No runtime dependencies.
-- **🧪 410 tests** — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
+- **🧪 424 tests** — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
 
 ## 📦 Installation
 
@@ -222,7 +222,7 @@ The default provider (Ollama) runs entirely on your machine. No data leaves your
 ## 🧪 Testing
 
 ```bash
-cargo test   # 410 tests — unit, snapshot (insta), property (proptest), integration (wiremock)
+cargo test   # 424 tests — unit, snapshot (insta), property (proptest), integration (wiremock)
 ```
 
 See [Testing Strategy](DOCS.md#testing-strategy) for the full breakdown.
@@ -231,7 +231,7 @@ See [Testing Strategy](DOCS.md#testing-strategy) for the full breakdown.
 
 See [`CHANGELOG.md`](CHANGELOG.md) for the full version history.
 
-**Current:** `v0.6.0-rc.1` *Deep Understanding* — Parent scope extraction, structural AST diffs, import change detection, doc-vs-code distinction, test file correlation, test-to-code ratio inference, and adaptive token budgeting.
+**Current:** `v0.6.0-rc.1` *Deep Understanding* — Parent scope extraction, structural AST diffs, import change detection, doc-vs-code distinction, test file correlation, test-to-code ratio inference, adaptive token budgeting, semantic markers (unsafe, derive, decorator, export detection in AST diffs), and change intent detection (error handling, test, logging patterns with confidence scoring).
 
 ## 🤝 Contributing