feat: add FixedChunked transcriber by humblemuzzu · Pull Request #72 · cjpais/transcribe-rs

humblemuzzu · 2026-03-27T23:45:15Z

Reworked from #71 per your feedback — chunking now lives in the Transcriber layer, not inside the model.

What

FixedChunked — a new Transcriber implementation that splits audio into fixed-duration chunks with configurable overlap. Sits alongside VadChunked and EnergyAdaptiveChunked.

No VAD model, no energy analysis. The simplest chunking strategy for models with hard sequence-length limits.

Why overlap

EnergyAdaptiveChunked with search_window=0 gives you fixed-duration chunks, but with hard cuts. FixedChunked keeps a configurable tail from each chunk in the buffer so the next chunk starts with shared audio context. This prevents the model from seeing a hard cut mid-word at chunk boundaries.
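
To make the overlap behaviour concrete, here is a minimal sketch of how fixed-duration chunk boundaries with a retained tail could be computed. It is not the code from this PR: `chunk_ranges` is a hypothetical helper, it works on sample counts rather than seconds, and it assumes a remainder is emitted only when it contains audio beyond the retained tail.

```rust
/// Hypothetical helper (not part of transcribe-rs): returns the
/// (start, end) sample range of each chunk. Every chunk after the first
/// begins `overlap` samples before the previous chunk ended.
fn chunk_ranges(total: usize, chunk: usize, overlap: usize) -> Vec<(usize, usize)> {
    assert!(chunk > 0 && overlap < chunk, "overlap must be smaller than chunk");
    let stride = chunk - overlap; // new audio consumed per full chunk
    let mut ranges = Vec::new();
    let mut start = 0;
    let mut retained = 0; // samples carried over from the previous chunk

    // Emit full chunks while a whole chunk still fits in the input.
    while start + chunk <= total {
        ranges.push((start, start + chunk));
        start += stride;
        retained = overlap;
    }

    // Assumption: the remainder is worth emitting only if it holds
    // something beyond the tail that was already transcribed.
    if total - start > retained {
        ranges.push((start, total));
    }
    ranges
}
```

With 10 samples, a chunk size of 4, and an overlap of 1, this yields `(0, 4)`, `(3, 7)`, `(6, 10)`: each chunk shares one sample with its predecessor. Setting the overlap to 0 reproduces the hard-cut behaviour described above.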

Usage

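use transcribe_rs::transcriber::{FixedChunked, FixedChunkedConfig, Transcriber};
// TranscribeOptions must also be in scope (its import is not shown here).

// `model` is an already-loaded speech model; `samples` holds the input audio.
let config = FixedChunkedConfig::default(); // 30s chunks, 1s overlap
let mut chunker = FixedChunked::new(config, TranscribeOptions::default());
let result = chunker.transcribe(&mut model, &samples)?;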

Config

pub struct FixedChunkedConfig {
    pub chunk_duration_secs: f32,  // default 30.0
    pub overlap_secs: f32,         // default 1.0 (0.0 for hard cuts)
    pub padding_secs: f32,         // default 0.0
    pub min_chunk_secs: f32,       // default 0.0
    pub merge_separator: String,   // default " "
}
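
As an illustration of the defaults listed in the comments above, the following sketch uses a stand-in copy of the struct with a `Default` impl and shows how a hard-cut variant could be built with struct update syntax. The stand-in struct and the `hard_cut_config` helper are assumptions for this example; the real definition is in src/transcriber/fixed_chunked.rs and may differ in derives or details.

```rust
// Stand-in copy for illustration only; not the definition from the PR.
#[derive(Debug, Clone, PartialEq)]
pub struct FixedChunkedConfig {
    pub chunk_duration_secs: f32,
    pub overlap_secs: f32,
    pub padding_secs: f32,
    pub min_chunk_secs: f32,
    pub merge_separator: String,
}

impl Default for FixedChunkedConfig {
    // Values taken from the "default" comments in the PR description.
    fn default() -> Self {
        Self {
            chunk_duration_secs: 30.0,
            overlap_secs: 1.0,
            padding_secs: 0.0,
            min_chunk_secs: 0.0,
            merge_separator: " ".to_string(),
        }
    }
}

/// Hypothetical helper: same defaults, but no shared audio between chunks.
pub fn hard_cut_config() -> FixedChunkedConfig {
    FixedChunkedConfig {
        overlap_secs: 0.0,
        ..Default::default()
    }
}
```

Overriding a single field this way keeps the remaining defaults in one place, which is convenient if only the overlap or the chunk length needs tuning for a given model.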

What changed

New file: src/transcriber/fixed_chunked.rs (413 lines)
src/transcriber/mod.rs: +3 lines (register, re-export, doc)

Uses existing transcribe_padded() and merge_sequential_with_separator(). No new dependencies. Parakeet model internals untouched.

Tests

12 unit tests using MockModel/FailOnNthModel, same patterns as the existing transcriber tests:

Splitting at chunk duration
Overlap retains tail correctly
Remainder handled in finish()
min_chunk_secs skips short remainders
Timestamps correct with and without overlap
Timestamp clamping with padding
Empty input
Object safety (Box<dyn Transcriber>)
Reusable after error
Short audio single pass
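
The timestamp cases in the list above can be pictured with a small sketch. This is not the PR's implementation: `Segment` and `to_absolute` are hypothetical, and the sketch assumes that padding is silence placed before the chunk audio, so model timestamps are shifted back by the padding, moved to the chunk's position in the recording, and clamped to the chunk's real span.

```rust
/// Hypothetical segment type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Segment {
    start: f32,
    end: f32,
    text: String,
}

/// Converts chunk-local timestamps (in seconds) to absolute ones.
/// `chunk_start` is where the chunk begins in the full recording,
/// `chunk_len` is its unpadded duration, and `padding_secs` is the
/// amount of silence assumed to precede the chunk audio.
fn to_absolute(seg: &Segment, chunk_start: f32, chunk_len: f32, padding_secs: f32) -> Segment {
    let lo = chunk_start;
    let hi = chunk_start + chunk_len;
    // Remove the padding offset, move to absolute time, then clamp so a
    // timestamp that landed inside the padding cannot leave the chunk.
    let shift = |t: f32| (t - padding_secs + chunk_start).clamp(lo, hi);
    Segment {
        start: shift(seg.start),
        end: shift(seg.end),
        text: seg.text.clone(),
    }
}
```

With an overlap, `chunk_start` for the second chunk is the first chunk's end minus the overlap, so segments from neighbouring chunks may cover the same absolute time; how the merge step handles that is not described here.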

Validated

Tested with an ~8 minute recording on parakeet-tdt-0.6b-v3-int8. 17 chunks, 8.9s total, clean output with no garbled words at boundaries.
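
As a sanity check on the chunk count, the arithmetic can be sketched as follows. `expected_chunks` is a hypothetical helper, and it assumes each full chunk advances by `chunk_secs - overlap_secs`, that the remainder is always emitted (`min_chunk_secs` of 0.0), and that a remainder consisting only of the retained tail is skipped.

```rust
/// Hypothetical helper: estimates how many chunks a recording produces.
fn expected_chunks(total_secs: f32, chunk_secs: f32, overlap_secs: f32) -> usize {
    if total_secs <= 0.0 {
        return 0;
    }
    if total_secs <= chunk_secs {
        return 1; // short audio: single pass
    }
    // Each chunk contributes `stride` seconds of new audio, plus one
    // overlap's worth that is counted only once at the very start.
    let stride = chunk_secs - overlap_secs;
    ((total_secs - overlap_secs) / stride).ceil() as usize
}
```

Under these assumptions, a 480-second recording with the default 30s chunks and 1s overlap gives ceil(479 / 29) = 17 chunks, matching the reported figure. Since the recording length is only approximate, note that any length above 465s and up to 494s gives the same count.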

Adds a new Transcriber implementation that splits audio into fixed-duration chunks with configurable overlap. No VAD model or energy analysis needed: this is the simplest chunking strategy for models with hard sequence-length limits (e.g. Conformer encoders).

The overlap keeps a small tail from each chunk in the buffer so the next chunk starts with shared audio context, preventing garbled words at chunk boundaries.

Uses the existing transcribe_padded() and merge_sequential_with_separator() infrastructure. Includes 12 unit tests covering splitting, overlap, timestamps, remainders, error recovery, and object safety.

Defaults: 30s chunks, 1s overlap.

Session-Id: 2695c041-2969-46cd-b749-61636e27d352

cjpais · 2026-03-27T23:55:01Z

@humblemuzzu would you mind sharing the audio file you have? Or uploading it as part of your commit? Just curious

humblemuzzu · 2026-03-28T11:19:05Z

That was just my long prompt to Claude, pretty unhinged, but happy to share on X. DMed you.

humblemuzzu wants to merge 1 commit into cjpais:main from humblemuzzu:fix/fixed-chunked-transcriber
