Commit a6c22fb

docs: update documentation for semantic markers and change intent detection
Add semantic marker detection (unsafe, derive, decorators, export, mutability) and diff-based change intent patterns (error handling, test, logging, dependency updates) across CHANGELOG, PRD, README, and DOCS. 424 tests.
1 parent 1a1c3ff commit a6c22fb

4 files changed

Lines changed: 31 additions & 15 deletions

CHANGELOG.md

Lines changed: 8 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -19,18 +19,25 @@ All notable changes to CommitBee are documented here.
1919
- **Structural AST diffs** — `AstDiffer` compares old and new tree-sitter nodes for modified symbols, producing structured `SymbolDiff` descriptions (parameter added, return type changed, visibility changed, async toggled, body modified). Shown as `STRUCTURED CHANGES:` section in the prompt.
2020
- **Whitespace-aware body comparison** — Body diff uses character-stream stripping so reformatting doesn't produce false `BodyModified` results.
2121
- **Structured changes in prompt** — New `STRUCTURED CHANGES:` section in the LLM prompt shows concise one-line descriptions of what changed per symbol (e.g., `CommitValidator::validate(): +param strict: bool, return bool → Result<()>, body modified`). Omitted when no structural diffs exist.
22+
- **Semantic markers** — `AstDiffer` now detects `unsafe` added/removed, `#[derive()]` changes, decorator additions/removals, export changes, mutability changes, and generic constraint changes. Shown as `+unsafe`, `+derive(Debug, Clone)`, etc. in the STRUCTURED CHANGES section.
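
As an illustration of the whitespace-aware body comparison listed above, here is a minimal sketch of a character-stream comparison. The function name `bodies_equal_ignoring_whitespace` is hypothetical and not taken from CommitBee's source; the actual implementation may handle string literals or comments differently.

```rust
/// Hypothetical helper: returns true when two bodies differ only in whitespace.
/// Both inputs are reduced to their stream of non-whitespace characters and
/// compared lazily, so no intermediate strings are allocated.
fn bodies_equal_ignoring_whitespace(old: &str, new: &str) -> bool {
    let old_chars = old.chars().filter(|c| !c.is_whitespace());
    let new_chars = new.chars().filter(|c| !c.is_whitespace());
    old_chars.eq(new_chars)
}
```

Under this sketch, `{ 1 + 2 }` and a reformatted `{\n    1+2\n}` compare as equal, so a pure reformat would not be reported as a body change. One caveat of the simplification: whitespace inside string literals is ignored as well.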
2223

2324
### Type Inference
2425

2526
- **Test-to-code ratio** — When >80% of additions are in test files, suggests `test` type even with source files present. Uses cross-multiplication to avoid integer truncation.
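
The cross-multiplication mentioned above can be sketched as follows. The function name and parameters are hypothetical; only the 80% threshold and the idea of avoiding integer division come from the changelog entry.

```rust
/// Hypothetical helper: true when strictly more than 80% of added lines
/// are in test files.
///
/// Instead of computing `test * 100 / total > 80` (which truncates),
/// both sides are multiplied out: test / total > 80 / 100
/// is equivalent to test * 100 > total * 80 for total > 0.
fn is_mostly_tests(test_additions: usize, total_additions: usize) -> bool {
    total_additions > 0 && test_additions * 100 > total_additions * 80
}
```

For example, 161 test additions out of 200 is 80.5%. Integer division gives `161 * 100 / 200 == 80`, which fails a `> 80` check, whereas the cross-multiplied form compares `16100 > 16000` and passes.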
2627

28+
### Change Intent Detection
29+
30+
- **Diff-based intent patterns** — Scans added lines for error handling (`Result`, `?`, `Err()`), test additions (`#[test]`, `assert!`), logging (`tracing::`, `debug!`), and dependency updates. Shown as `INTENT:` section in the prompt with confidence scores.
31+
- **Conservative type refinement** — High-confidence performance optimization patterns can override the base type to `perf`.
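
A rough sketch of how diff-based intent scanning might work is shown below. Everything here is an assumption for illustration: the `Intent` enum, the short pattern lists, and especially the confidence formula (matching added lines divided by all added lines) are not taken from CommitBee, whose `detect_intents()` uses longer pattern lists and its own scoring.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Intent {
    ErrorHandling,
    Test,
    Logging,
}

// Illustrative subset of patterns; the real lists are longer.
const PATTERNS: &[(Intent, &[&str])] = &[
    (Intent::ErrorHandling, &["Result<", "Err(", ".map_err("]),
    (Intent::Test, &["#[test]", "assert!", "assert_eq!"]),
    (Intent::Logging, &["tracing::", "debug!", "info!"]),
];

/// Scans only added lines of a unified diff (leading '+', excluding the
/// "+++" file header) and returns each matched intent with an assumed
/// confidence: the fraction of added lines that hit one of its patterns.
fn detect_intents(diff: &str) -> Vec<(Intent, f32)> {
    let added: Vec<&str> = diff
        .lines()
        .filter(|l| l.starts_with('+') && !l.starts_with("+++"))
        .collect();
    if added.is_empty() {
        return Vec::new();
    }
    PATTERNS
        .iter()
        .filter_map(|(intent, needles)| {
            let hits = added
                .iter()
                .filter(|l| needles.iter().any(|n| l.contains(n)))
                .count();
            (hits > 0).then(|| (*intent, hits as f32 / added.len() as f32))
        })
        .collect()
}
```

Restricting the scan to added lines keeps removed or context lines from suggesting an intent the commit does not actually introduce.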
32+
2733
### Prompt Quality
2834

2935
- **Token budget rebalance** — Symbol budget reduced from 30% to 20% when structural diffs are available, freeing space for the raw diff. SYSTEM_PROMPT updated to guide the LLM to prefer STRUCTURED CHANGES for signature details.
36+
- **Unsafe constraint rule** — When `unsafe` is added to a function, a CONSTRAINTS rule instructs the LLM to mention safety justification in the commit body.
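
The token budget rebalance above amounts to a small piece of arithmetic, sketched here. The function name is hypothetical and the 6,000-token total in the usage note is only an example figure; the 30% and 20% shares are the ones stated in the entry.

```rust
/// Hypothetical helper: tokens reserved for the symbol section of the prompt.
/// Symbols get 20% of the budget when structural AST diffs are available
/// (the diffs already carry signature detail) and 30% otherwise.
fn symbol_budget(total_tokens: usize, has_structural_diffs: bool) -> usize {
    let percent = if has_structural_diffs { 20 } else { 30 };
    total_tokens * percent / 100
}
```

With a 6,000-token budget this yields 1,200 tokens for symbols when structural diffs exist and 1,800 when they do not, leaving the difference for the raw diff.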
3037

3138
### Testing
3239

33-
- **410 tests** total (up from 367 at v0.5.0).
40+
- **424 tests** total (up from 367 at v0.5.0).
3441

3542
## `v0.5.0` — Beyond the Diff
3643

DOCS.md

Lines changed: 7 additions & 6 deletions
@@ -86,11 +86,11 @@ Here's what each step actually does:
8686

8787
**1. Git Service** reads your staged changes using `gix` for repo discovery and the git CLI for diffs. Paths are parsed with NUL-delimited output (`-z` flag) so filenames with spaces or special characters work correctly.
8888

89-
**2. Tree-sitter Analyzer** parses both the staged version and the HEAD version of every changed file — in parallel, using `rayon` across CPU cores. It extracts **full signatures** (e.g., `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`) by taking the definition node text before the body child. Methods include their **parent scope** (enclosing impl, class, or trait — e.g., `CommitValidator::validate`). Modified symbols show old → new signature diffs, with **structural AST diffs** that describe exactly what changed (parameters added/removed, return type changed, visibility changed, etc.). Cross-file connections are detected (caller+callee both changed). Symbols are tracked in three states: added, removed, or modified-signature, with a **doc-vs-code distinction** indicating whether changes were documentation-only, code-only, or mixed.
89+
**2. Tree-sitter Analyzer** parses both the staged version and the HEAD version of every changed file — in parallel, using `rayon` across CPU cores. It extracts **full signatures** (e.g., `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`) by taking the definition node text before the body child. Methods include their **parent scope** (enclosing impl, class, or trait — e.g., `CommitValidator::validate`). Modified symbols show old → new signature diffs, with **structural AST diffs** that describe exactly what changed (parameters added/removed, return type changed, visibility changed, semantic markers like `unsafe`, `derive`, decorators, `export`, mutability, generic constraints, etc.). Cross-file connections are detected (caller+callee both changed). Symbols are tracked in three states: added, removed, or modified-signature, with a **doc-vs-code distinction** indicating whether changes were documentation-only, code-only, or mixed.
9090

9191
**3. Commit Splitter** looks at your staged changes and decides whether they contain logically independent work. It uses diff-shape fingerprinting (what kind of changes — additions, deletions, modifications) combined with Jaccard similarity on content vocabulary to group files. If it finds multiple concerns, it offers to split them into separate commits.
9292

93-
**4. Context Builder** assembles a budget-aware prompt. It classifies modified symbols as whitespace-only or semantic (via character-stream comparison), computes evidence flags (mechanical change? public APIs removed? bug-fix evidence?), detects cross-file connections, identifies import changes and test file correlations, calculates the character budget for the subject line, and packs context within the token limit (~6K tokens). The token budget adapts: when structural AST diffs are available, symbols get 20% of the budget (diffs carry more detail); when only signatures are available, symbols get 30%.
93+
**4. Context Builder** assembles a budget-aware prompt. It classifies modified symbols as whitespace-only or semantic (via character-stream comparison), computes evidence flags (mechanical change? public APIs removed? bug-fix evidence?), detects **change intent** (error handling, test, logging, dependency update patterns) for the `INTENT:` prompt section, detects cross-file connections, identifies import changes and test file correlations, calculates the character budget for the subject line, and packs context within the token limit (~6K tokens). The token budget adapts: when structural AST diffs are available, symbols get 20% of the budget (diffs carry more detail); when only signatures are available, symbols get 30%.
9494

9595
**5. LLM Provider** streams the prompt to your chosen model (Ollama, OpenAI, or Anthropic) and collects the response token by token.
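
To make the Jaccard similarity used by the Commit Splitter (step 3 above) concrete, here is a minimal sketch. The tokenizer and both function names are assumptions for illustration; CommitBee's actual vocabulary extraction and grouping thresholds are not shown in this document.

```rust
use std::collections::HashSet;

/// Hypothetical tokenizer: splits text into identifier-like words.
fn vocabulary(text: &str) -> HashSet<&str> {
    text.split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|t| !t.is_empty())
        .collect()
}

/// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range 0.0..=1.0.
/// Two empty sets are treated as dissimilar (0.0) in this sketch.
fn jaccard<'a>(a: &HashSet<&'a str>, b: &HashSet<&'a str>) -> f64 {
    let intersection = a.intersection(b).count();
    let union = a.len() + b.len() - intersection;
    if union == 0 {
        return 0.0;
    }
    intersection as f64 / union as f64
}
```

Files whose changed content shares most of its vocabulary score close to 1.0 and are candidates for the same commit group; files with little overlap score near 0.0 and may be split out.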
9696

@@ -107,12 +107,13 @@ CommitBee doesn't just send a diff. The prompt includes:
107107
- **Evidence flags** telling the LLM deterministic facts about the change
108108
- **Symbol changes with full signatures** — `[+] pub fn connect(host: &str) -> Result<()>`, not just "Function connect"
109109
- **Signature diffs** — `[~] old_sig → new_sig` for modified symbols
110-
- **Structured AST diffs** — `CommitValidator::validate(): +param timeout, return Result<()> → Result<Error>` (precise semantic changes from AST comparison)
110+
- **Structured AST diffs** — `CommitValidator::validate(): +param timeout, return Result<()> → Result<Error>` (precise semantic changes from AST comparison, including semantic markers like `+unsafe`, `+derive(Clone)`, `export added`, `mutability changed`)
111111
- **Import changes** — `analyzer: added use crate::domain::DiffHunk` (tracked per file)
112112
- **Test file correlations** — `src/services/context.rs <-> tests/context.rs (test file)`
113113
- **Doc-vs-code annotations** — modified symbols tagged `[docs only]` or `[docs + code]` when change is documentation-only or mixed
114114
- **Cross-file connections** — `validator calls parse() — both changed`
115115
- **Primary change detection** — which file has the most significant changes
116+
- **Change intent** — detected patterns like error handling, test additions, logging, or dependency updates, surfaced as an `INTENT:` section
116117
- **Constraints** — rules the LLM must follow based on evidence (e.g., "no bug-fix comments found, prefer refactor over fix")
117118
- **Character budget** — exact number of chars available for the subject line
118119
- **Group rationale** — when splitting, why these files are grouped together
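
The structured one-line descriptions in the list above could be rendered along these lines. This is a sketch under assumptions: the variant names follow those mentioned for `ChangeDetail` in the PRD, but the payload types, the `-unsafe` spelling, and this free-standing `format_oneline` signature are hypothetical and only approximate `SymbolDiff::format_oneline()`.

```rust
use std::fmt;

// Illustrative subset of marker variants; payload shapes are assumed.
enum ChangeDetail {
    UnsafeAdded,
    UnsafeRemoved,
    DeriveAdded(Vec<String>),
    ExportAdded,
    MutabilityChanged,
}

impl fmt::Display for ChangeDetail {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnsafeAdded => write!(f, "+unsafe"),
            Self::UnsafeRemoved => write!(f, "-unsafe"), // assumed spelling
            Self::DeriveAdded(traits) => write!(f, "+derive({})", traits.join(", ")),
            Self::ExportAdded => write!(f, "export added"),
            Self::MutabilityChanged => write!(f, "mutability changed"),
        }
    }
}

/// Joins all details for one symbol into a single prompt line.
fn format_oneline(symbol: &str, details: &[ChangeDetail]) -> String {
    let parts: Vec<String> = details.iter().map(ToString::to_string).collect();
    format!("{symbol}: {}", parts.join(", "))
}
```

Keeping each change as an enum variant and formatting it only at prompt-assembly time means the same data can also drive evidence flags, such as the unsafe constraint rule, without parsing strings.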
@@ -657,7 +658,7 @@ src/
657658
├── domain/
658659
│ ├── change.rs # FileChange, StagedChanges, ChangeStatus
659660
│ ├── symbol.rs # CodeSymbol, SymbolKind, SpanChangeKind
660-
│ ├── diff.rs # SymbolDiff, ChangeDetail (structural AST diffs)
661+
│ ├── diff.rs # SymbolDiff, ChangeDetail (structural AST diffs + 10 semantic marker variants)
661662
│ ├── context.rs # PromptContext — assembles the LLM prompt
662663
│ └── commit.rs # CommitType enum (single source of truth)
663664
└── services/
@@ -702,7 +703,7 @@ No panics in user-facing code paths. The sanitizer and validator are tested with
702703

703704
### Testing Strategy
704705

705-
CommitBee has 410 tests across multiple strategies:
706+
CommitBee has 424 tests across multiple strategies:
706707

707708
| Strategy | What It Covers |
708709
| --- | --- |
@@ -715,7 +716,7 @@ CommitBee has 410 tests across multiple strategies:
715716
Run them:
716717

717718
```bash
718-
cargo test # All 410 tests
719+
cargo test # All 424 tests
719720
cargo test --test sanitizer # Just sanitizer tests
720721
cargo test --test integration # LLM provider mocks
721722
COMMITBEE_LOG=debug cargo test -- --nocapture # With logging

PRD.md

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ SPDX-License-Identifier: AGPL-3.0-only OR LicenseRef-Commercial
1818

1919
| Version | Date | Summary |
2020
|---------|------------|---------|
21-
| 4.3 | 2026-03-27 | v0.6.0-rc.1 deep semantic understanding: parent scope, import detection, doc-vs-code classification, structural AST diffs (AstDiffer + SymbolDiff), STRUCTURED CHANGES prompt section, token budget rebalance. 410 tests. |
21+
| 4.3 | 2026-03-27 | v0.6.0-rc.1 deep semantic understanding: parent scope, import detection, doc-vs-code classification, structural AST diffs (AstDiffer + SymbolDiff), STRUCTURED CHANGES prompt section, token budget rebalance, T3 semantic markers (FR-071), change intent detection (FR-072). 424 tests. |
2222
| 4.2 | 2026-03-22 | v0.5.0 hardening: security fixes (SSRF prevention, streaming caps), prompt optimization (budget fix, evidence omission, emoji removal), eval harness (36 fixtures, per-type reporting), test coverage (15+ new tests), API hygiene (pub(crate) demotions), 5 fuzz targets. 359 tests. |
2323
| 4.1 | 2026-03-22 | AST context overhaul (v0.5.0): full signature extraction from tree-sitter nodes, semantic change classification (whitespace vs body vs signature), old→new signature diffs, cross-file connection detection, formatting auto-detection via symbols. 359 tests. |
2424
| 4.0 | 2026-03-13 | PRD normalization: aligned phases with shipped versions (v0.2.0/v0.3.x/v0.4.0), collapsed revision history, unified status markers, resolved stale critical issues, canonicalized test count to 308, removed dead cross-references. FR-031 (Exclude Files) and FR-033 (Copy to Clipboard) shipped. |
@@ -95,7 +95,7 @@ CommitBee is a Rust-native CLI tool that uses tree-sitter semantic analysis and
9595
| Multiple message generation (pick from N) | Common (aicommits, aicommit2) | ✅ v0.2.0 |
9696
| Commit splitting (multi-concern detection) | No competitor has this | ✅ v0.2.0 |
9797
| Custom prompt/instruction files | Growing (Copilot, aicommit2) | ✅ v0.4.0 |
98-
| Unit/integration tests | Non-negotiable for quality |410 tests |
98+
| Unit/integration tests | Non-negotiable for quality |424 tests |
9999

100100
## 3. Architecture
101101

@@ -511,6 +511,14 @@ In `infer_commit_type`, when >80% of additions are in `FileCategory::Test` files
511511

512512
`STRUCTURED CHANGES:` section in LLM prompt renders `SymbolDiff::format_oneline()` descriptions (e.g., `CommitValidator::validate(): +param strict: bool, return bool → Result<()>, body modified (+5 -2)`). Omitted when no structural diffs exist. Token budget rebalanced: symbol budget reduced from 30% to 20% when structural diffs available, freeing space for raw diff. SYSTEM_PROMPT updated to guide LLM to prefer structured changes for signature details. 3 tests.
513513

514+
#### FR-071: Semantic Marker Detection ✅
515+
516+
`AstDiffer` extended with 10 marker variants in `ChangeDetail`: `UnsafeAdded`/`Removed`, `DeriveAdded`/`Removed`, `DecoratorAdded`/`Removed`, `ExportAdded`/`Removed`, `MutabilityChanged`, `GenericConstraintChanged`. Extracts unsafe keyword, derive attributes, and mutability from tree-sitter nodes during function comparison. Unsafe additions set `has_unsafe_addition` evidence flag and trigger a CONSTRAINTS rule requiring safety justification in the commit body. 4 unit tests.
517+
518+
#### FR-072: Change Intent Detection ✅
519+
520+
`detect_intents()` scans added diff lines for error handling patterns (9 patterns including `Result<>`, `?`, `Err()`, `.map_err()`), test patterns (6 patterns including `#[test]`, `assert!`), logging patterns (9 patterns including `tracing::`, `debug!()`, `info!()`), and dependency updates (version changes in manifests). `INTENT:` prompt section shows detected patterns with confidence scores. `refine_type_with_intents()` conservatively overrides base type only for high-confidence performance optimization. 7 tests.
521+
514522
### 4.7 Future — v0.7.0+ (Market Leadership)
515523

516524
#### FR-050: MCP Server Mode
@@ -693,7 +701,7 @@ commitbee eval # Run evaluation harness (dev, feature-ga
693701

694702
## 8. Testing Requirements
695703

696-
**Current test count: 410**
704+
**Current test count: 424**
697705

698706
### TR-001: Unit Tests
699707

@@ -852,7 +860,7 @@ Invalid JSON → retry once with repair prompt. Second failure → heuristic ext
852860
| 3 | v0.4.0 | ✅ Shipped | Feature completion — templates, languages, rename, history, eval, fuzzing |
853861
| 4 | v0.4.x | ✅ Shipped | Remaining polish — exclude files (FR-031), clipboard (FR-033) |
854862
| 5 | v0.5.0 | ✅ Shipped | AST context overhaul — full signatures, semantic change classification, cross-file connections. 367 tests. |
855-
| 6 | v0.6.0-rc.1 | ✅ Shipped | Deep semantic understanding — parent scope, import detection, doc-vs-code classification, structural AST diffs, structured changes prompt section. 410 tests. |
863+
| 6 | v0.6.0-rc.1 | ✅ Shipped | Deep semantic understanding — parent scope, import detection, doc-vs-code classification, structural AST diffs, structured changes prompt section, semantic markers, change intent detection. 424 tests. |
856864
| 7 | v0.7.0+ | 📋 Planned | Market leadership — MCP server, changelog, monorepo, version bumping, GitHub Action |
857865

858866
## 12. Success Metrics
@@ -867,7 +875,7 @@ Invalid JSON → retry once with repair prompt. Second failure → heuristic ext
867875
| Commit message quality | > 80% "good enough" first try | Manual evaluation + `commitbee eval` |
868876
| Secret leak rate | 0 | Integration tests with known patterns |
869877
| MSRV | Rust 1.94 (edition 2024) | CI matrix (stable + 1.94) |
870-
| Test count | ≥ 308 | `cargo test` (current: 410) |
878+
| Test count | ≥ 308 | `cargo test` (current: 424) |
871879

872880
## 13. Non-Goals
873881

README.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -99,7 +99,7 @@ When your staged changes mix independent work (a bugfix in one module + a refact
9999
- **🐚 Shell completions** — bash, zsh, fish, powershell via `commitbee completions`.
100100
- **⚙️ 5-level config** — Defaults → project `.commitbee.toml` → user config → env vars → CLI flags.
101101
- **🦀 Single binary** — ~18K lines of Rust. Compiles to one static binary with LTO. No runtime dependencies.
102-
- **🧪 410 tests** — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
102+
- **🧪 424 tests** — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
103103

104104
## 📦 Installation
105105

@@ -222,7 +222,7 @@ The default provider (Ollama) runs entirely on your machine. No data leaves your
222222
## 🧪 Testing
223223

224224
```bash
225-
cargo test # 410 tests — unit, snapshot (insta), property (proptest), integration (wiremock)
225+
cargo test # 424 tests — unit, snapshot (insta), property (proptest), integration (wiremock)
226226
```
227227

228228
See [Testing Strategy](DOCS.md#testing-strategy) for the full breakdown.
@@ -231,7 +231,7 @@ See [Testing Strategy](DOCS.md#testing-strategy) for the full breakdown.
231231

232232
See [`CHANGELOG.md`](CHANGELOG.md) for the full version history.
233233

234-
**Current:** `v0.6.0-rc.1` *Deep Understanding* — Parent scope extraction, structural AST diffs, import change detection, doc-vs-code distinction, test file correlation, test-to-code ratio inference, and adaptive token budgeting.
234+
**Current:** `v0.6.0-rc.1` *Deep Understanding* — Parent scope extraction, structural AST diffs, import change detection, doc-vs-code distinction, test file correlation, test-to-code ratio inference, adaptive token budgeting, semantic markers (unsafe, derive, decorator, export detection in AST diffs), and change intent detection (error handling, test, logging patterns with confidence scoring).
235235

236236
## 🤝 Contributing
237237
