Ananke turns AI code generation from probabilistic text completion into search through valid program spaces. It extracts constraints from source code, compiles them into token-level masks and grammars, and ships those artifacts to an inference backend that enforces them during decoding.
This document describes the machinery that makes that happen. Every
number below was verified against the source tree at commit ae119b3.
```
┌──────────────────────────────────────────────┐
│ CLI (src/cli/)                               │
│ extract │ compile │ generate │ validate      │
│ export-spec │ init │ version │ help          │
├──────────────────────────────────────────────┤
│ Ariadne (src/ariadne/)                       │
│ Constraint DSL parsing                       │
├──────────────────────────────────────────────┤
│ Braid (src/braid/)                           │
│ CLaSH algebra │ Domain fusion │ Feasibility  │
│ Type inhabitation │ FIM │ Salience │ Temporal│
├──────────────────────────────────────────────┤
│ Clew (src/clew/)                             │
│ 14 languages │ tree-sitter │ patterns        │
│ Scope context │ Call graph │ Conventions     │
├──────────────────────────────────────────────┤
│ Maze (maze/)                                 │
│ Rust FFI │ Modal client │ sglang integration │
├──────────────────────────────────────────────┤
│ Eval (eval/)                                 │
│ pass@k │ quality scoring │ statistical tests │
└──────────────────────────────────────────────┘
```
Data flows bottom-up at extraction time and top-down at generation time. Clew mines constraints from source; Braid compiles them to an IR that Maze forwards to sglang or Modal for constrained decoding. The eval harness measures whether any of this actually helps.
| File (under `src/clew/`) | Purpose |
|---|---|
| `clew.zig` | Main extraction engine. The `Ananke` struct lives here. |
| `extractors.zig` | Extractor registry and dispatch |
| `extractors/` | 15 files: `base.zig` + one per language (`c.zig`, `cpp.zig`, `csharp.zig`, `go.zig`, `java.zig`, `javascript.zig`, `kotlin.zig`, `php.zig`, `python.zig`, `ruby.zig`, `rust.zig`, `swift.zig`, `typescript.zig`, `zig_lang.zig`) |
| `patterns.zig` | 383 pattern rules across 14 languages, 8 categories per language |
| `hybrid_extractor.zig` | tree-sitter primary + pattern fallback |
| `tree_sitter.zig` | tree-sitter FFI orchestration |
| `tree_sitter/` | C FFI bindings for tree-sitter |
| `scope_context.zig` | Homer scope graph integration: `ScopeBinding`, `CanonicalImport`, `ScopeContext` (11 tests) |
| `call_graph_context.zig` | InlineCoder-style caller/callee context (7 tests) |
| `conventions.zig` | Convention mining, produces soft CLaSH constraints (5 tests) |
| `hole_detector.zig` | Typed hole detection (`TODO`, `pass`, `unimplemented!`, etc.) |
| `semantic_hole_detector.zig` | Semantic hole detection via AST analysis |
| `parsers/` | Language-specific parser helpers |
| File (under `src/braid/`) | Purpose |
|---|---|
| `braid.zig` | Main compiler: `Constraint[]` to `ConstraintIR`. Cache, incremental, conflict resolution. |
| `domain_fusion.zig` | CLaSH 5-domain fusion: hard mask intersection + soft additive reweighting + CRANE phase switching (13 tests) |
| `feasibility.zig` | Conflict detection, tightness scoring, community-aware tension (7 tests) |
| `salience.zig` | Homer quadrant to intensity + confidence mapping (10 tests) |
| `temporal.zig` | Stability classes, co-change patterns, confidence adjustment (7 tests) |
| `fim.zig` | FIM context analysis: `PrefixAnalysis`, `SuffixAnalysis`, `HoleScale` (12 tests) |
| `types/type_system.zig` | `TypeArena`, `Type` union (12 variants), `PrimitiveKind` (20 variants), `Language` (10 variants) (5 tests) |
| `types/parser.zig` | `TypeParser`: string signatures to unified types, 10 languages (7 tests) |
| `types/inhabitation.zig` | `InhabitationGraph`: BFS reachability, 9 `EdgeKind`s, per-language builtins (4 tests) |
| `types/mask_generator.zig` | `MaskGenerator`, `TypeInhabitationState`, `TypeInhabitationBuilder` (8 tests) |
| `types/types.zig` | Type module root |
| `json_schema_builder.zig` | JSON Schema Draft 7 generation |
| `regex_analyzer.zig` | Regex pattern extraction + pathology filtering (catastrophic backtracking detection) |
| `string_interner.zig` | `GrammarInterner` + `RegexPatternPool` for string deduplication |
| `hole_compiler.zig` | Typed holes to IR compilation |
| `sanitizer.zig` | Constraint injection prevention (name length limits, input validation) |
| File (under `src/ariadne/`) | Purpose |
|---|---|
| `ariadne.zig` | DSL parser (parsing complete, type checking deferred) |
| `test_parser.zig` | Parser tests |
| File (under `src/cli/`) | Purpose |
|---|---|
| `extract.zig` | Constraint extraction from source files |
| `compile.zig` | Constraint compilation to IR |
| `generate.zig` | Code generation via sglang/Modal, FIM support (11 tests) |
| `validate.zig` | Code validation against constraints |
| `export_spec.zig` | One-shot extract+compile+context to `ConstraintSpec` JSON |
| `init.zig` | `.ananke.toml` initialization |
| `version.zig` | Version information |
| `help.zig` | Help display |
| File (under `maze/`) | Purpose |
|---|---|
| `src/lib.rs` | Crate root. `MazeOrchestrator`, `GenerationRequest`, `GenerationResponse`. |
| `src/ffi.rs` | FFI bridge to Zig core (constraint extraction and compilation) |
| `src/modal_client.rs` | Modal HTTP client |
| `src/model_router.rs` | Model routing and selection logic |
| `src/model_selector.rs` | Strategy-based model selection |
| `src/adaptive_selector.rs` | Adaptive model selection based on task characteristics |
| `src/progressive_refinement.rs` | Iterative generation refinement |
| `src/diffusion.rs` | Diffusion-based generation strategy |
| `src/strategy_stats.rs` | Generation strategy statistics |
| `src/telemetry.rs` | Inference telemetry collection |
| `src/python.rs` | Python bindings |
| `tests/` | `ffi_tests.rs`, `orchestrator_tests.rs`, `zig_integration_test.rs`, `modal_client_tests.rs`, `end_to_end_tests.rs` |
| `modal_inference/inference.py` | Modal deployment: Qwen2.5-Coder-32B-Instruct on A100-80GB |
| File (under `eval/`) | Purpose |
|---|---|
| `evaluator.zig` | `Evaluator`, `MultiSampleEvaluator`, `BatchEvaluationResult` |
| `task_spec.zig` | `TaskSpec`, 24 `TaskCategory` variants, 4 `DifficultyLevel`s |
| `quality_scorer.zig` | 5-axis quality scoring |
| `metrics/pass_at_k.zig` | pass@k statistics |
| `metrics/statistical_tests.zig` | Paired t-test |
| `metrics/constraint_metrics.zig` | CodeIF constraint satisfaction metrics |
| `modal_client.zig` | Eval-specific Modal client |
| `eval_constraint_compiler.zig` | Constraint compilation for eval tasks |
| `test_runner.zig` | Test execution harness |
| `prompt_normalizer.zig` | Prompt normalization |
| `failure_analyzer.zig` | Failure classification and analysis |
`ConstraintKind` has six variants, mapping to analysis domains:

```zig
pub const ConstraintKind = enum {
    syntactic, // Code structure, formatting, naming
    type_safety, // Type annotations, null safety, generics
    semantic, // Data flow, control flow, side effects
    architectural, // Module boundaries, dependencies, layering
    operational, // Performance, memory, concurrency
    security, // Input validation, auth, dangerous ops
};
```

`ConstraintIR` is the compiled form of constraints. This is the artifact that crosses the FFI boundary and travels to the inference backend.
```zig
pub const ConstraintIR = struct {
    json_schema: ?JsonSchema = null,
    grammar: ?Grammar = null,
    regex_patterns: []const Regex = &.{},
    token_masks: ?TokenMaskRules = null,
    type_inhabitation: ?TypeInhabitationData = null,
    priority: u32 = 0,
    hole_specs: []const HoleSpec = &.{},
    rich_context: ?RichContext = null,
    feasibility_score: f32 = 0.0,
    is_feasible: bool = true,
    // ...
};
```

The `json_schema` field carries structural metadata for the sglang
backend. Only the `grammar` field (EBNF) goes to llguidance for
actual token masking.
The rich_context field carries the full CLaSH decomposition as
serialized JSON -- function signatures, type bindings, class
definitions, imports, control flow patterns, semantic constraints,
scope bindings, and call graph context. Eight JSON blobs, each
independently nullable. This structure lets backends consume whichever
context they support without requiring all-or-nothing.
ConstraintIR also carries owns_grammar_strings -- a flag
distinguishing borrowed (interned) grammar strings from owned (cloned)
ones. Cache hits return cloned IRs that own their strings; the cache
itself holds interned originals. Getting this wrong means either a
double-free or a use-after-free, which is why the flag exists.
braid/sanitizer.zig prevents constraint injection attacks. Constraint
names are capped at 64 bytes. Descriptions are scrubbed for control
characters. The concern is that untrusted constraint sources (user
config, telemetry, LLM-generated) could inject malformed data that
corrupts the IR or escapes into grammar rules. The sanitizer runs
before any constraint enters the compilation pipeline.
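
The two checks described above can be sketched as follows. This is a minimal illustration written in Rust rather than Zig, not the code in `braid/sanitizer.zig`: the names `check_name`, `scrub_description`, and `SanitizeError` are hypothetical, and both the empty-name rejection and the choice to keep newlines and tabs are assumptions.

```rust
/// Maximum constraint name length in bytes (the 64-byte cap described above).
const MAX_NAME_BYTES: usize = 64;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum SanitizeError {
    EmptyName, // assumption: empty names are rejected
    NameTooLong { len: usize },
}

/// Reject names that are empty or longer than the byte cap.
pub fn check_name(name: &str) -> Result<&str, SanitizeError> {
    if name.is_empty() {
        return Err(SanitizeError::EmptyName);
    }
    if name.len() > MAX_NAME_BYTES {
        return Err(SanitizeError::NameTooLong { len: name.len() });
    }
    Ok(name)
}

/// Drop control characters from a description.
/// Assumption: newlines and tabs are kept as harmless formatting.
pub fn scrub_description(desc: &str) -> String {
    desc.chars()
        .filter(|c| !c.is_control() || *c == '\n' || *c == '\t')
        .collect()
}
```

Running both functions on every constraint before compilation mirrors the "sanitize first" ordering the paragraph describes, so nothing downstream has to re-validate names or descriptions.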
Clew answers one question: given source code, what constraints does it imply?
Extraction runs in two tiers:
- tree-sitter AST walk (primary). Per-language extractors in `extractors/` walk the concrete syntax tree and emit structured constraints: function signatures, type annotations, class hierarchies, import maps, error handling patterns.
- Pattern matching (fallback). When tree-sitter parsing fails or a language grammar isn't loaded, `patterns.zig` applies 383 string-match rules organized into 8 categories per language: `function_decl`, `type_annotation`, `async_pattern`, `error_handling`, `imports`, `class_struct`, `metadata`, `memory_management`.
The hybrid_extractor.zig orchestrates this: try tree-sitter first,
fall back to patterns, merge results. The pattern matcher
(scanPatterns in patterns.zig) is a linear scan: for each byte
position in the source, it checks all pattern sets in a nested loop.
This is O(n * p) where n is source length and p is total pattern count
-- acceptable because p is bounded (383) and patterns are short string
matches, not regex.
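
The O(n * p) scan can be illustrated with a short Rust sketch. It is not the `scanPatterns` implementation: `PatternRule`, `PatternMatch`, and `scan_patterns` are hypothetical names, and the rule shape (a category plus a literal needle) is an assumption based on the description above.

```rust
/// One string-match rule (hypothetical shape for this sketch).
pub struct PatternRule {
    pub category: &'static str,
    pub needle: &'static str,
}

/// A rule that matched at a byte offset in the source.
#[derive(Debug, PartialEq)]
pub struct PatternMatch {
    pub category: &'static str,
    pub offset: usize,
}

/// For each byte position, test every rule: n positions times p rules.
pub fn scan_patterns(source: &str, rules: &[PatternRule]) -> Vec<PatternMatch> {
    let bytes = source.as_bytes();
    let mut matches = Vec::new();
    for offset in 0..bytes.len() {
        for rule in rules {
            // Skip empty needles, which would otherwise match everywhere.
            if !rule.needle.is_empty() && bytes[offset..].starts_with(rule.needle.as_bytes()) {
                matches.push(PatternMatch { category: rule.category, offset });
            }
        }
    }
    matches
}
```

Because each comparison is a short literal prefix check, the inner loop stays cheap even though it runs for every byte of the source.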
When Homer is available, Clew enriches extraction with cross-file intelligence:
- Scope context (`scope_context.zig`): queries Homer's scope graph for bindings visible at the cursor position. Produces `ScopeBinding` records (name, kind, qualified type, definition file) and `CanonicalImport` records. This feeds the Imports domain in CLaSH -- the vocabulary subset constraint ensures the model only references symbols that are actually in scope.
- Call graph context (`call_graph_context.zig`): InlineCoder-style analysis. For a function being generated, retrieves upstream callers (what calls this function, with what arguments, how is the result used) and downstream callees (what this function calls, with what signatures). This gives the model concrete usage patterns rather than abstract type signatures.
- Conventions (`conventions.zig`): mines repository-wide coding conventions (naming patterns, error handling idioms, import styles) and produces soft CLaSH constraints. If 95% of the codebase uses `camelCase`, the model gets a soft nudge toward `camelCase`.
C, C++, C#, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, Rust,
Swift, TypeScript, Zig. Each gets a dedicated extractor in
extractors/ and a pattern set in patterns.zig.
Extraction produces more than flat constraint lists. The RichContext
struct carries eight JSON blobs covering the CLaSH domain decomposition:
- `function_signatures_json`: name, parameters (with types), return type, async flag
- `type_bindings_json`: name, kind, fields
- `class_definitions_json`: name, methods, fields
- `imports_json`: module, items, wildcard flag
- `control_flow_json`: async patterns, generators, error handling style, recursion
- `semantic_constraints_json`: kind, expression, source
- `scope_bindings_json`: Homer scope graph bindings (cross-file)
- `call_graph_json`: Homer call graph (callers, callees, argument usage)
The last two require a running Homer instance. Without it, those fields are null and everything else works fine.
Braid transforms a bag of constraints into a ConstraintIR suitable
for token-level enforcement. The pipeline has eleven stages.
From braid.zig, method compile():
- Cache key: `computeCacheKey` hashes canonically-sorted constraints with Wyhash.
- Cache lookup: LRU cache of copy-on-write `SharedConstraintIR` entries (reference-counted). The lookup itself is O(1); on a hit, `compile()` returns a clone of the cached IR.
- Dependency graph: `buildDependencyGraph` creates edges so that syntactic constraints come before type_safety, which come before semantic, giving the graph a topological order.
- Conflict detection: `detectConflicts` groups constraints by `ConstraintKind` and checks pairs within each group. That is roughly O(n^2/k) when the n constraints spread evenly over k kinds.
- Conflict resolution: Claude API if an LLM client is configured; otherwise a default heuristic (higher severity wins).
- Graph optimization: `optimizeGraph` does a topological sort and boosts priority based on severity.
- IR generation (`compileToIR`):
  - Feasibility analysis via `FeasibilityAnalyzer` (tightness scoring, feasibility flag)
  - JSON Schema generation from type constraints
  - Type inhabitation data (parses "must return type: T" from descriptions, builds reachability graph)
  - Grammar construction from syntactic constraints (EBNF rules, string-interned)
  - Regex pattern extraction with pathology filtering via `RegexAnalyzer`
  - Token mask generation from security constraints
After `compileToIR` returns, the caller layers on additional analyses:
- Salience scoring: Homer quadrant mapped to intensity level and confidence.
- Temporal analysis: stability classification, co-change decay, confidence adjustment.
- Domain fusion: ASAp-style hard mask intersection + soft additive reweighting, with CRANE phase switching between reasoning and structured output.
- FIM analysis: if `--fim` mode is active, `PrefixAnalysis` and `SuffixAnalysis` determine hole scale and context boundaries.
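
The first stage, the cache key, relies on the key being independent of constraint order. A minimal Rust sketch of that idea follows. It is not `computeCacheKey`: the `Constraint` fields are placeholders, and the standard library's `DefaultHasher` stands in for Wyhash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Placeholder constraint for this sketch; field names are assumptions.
#[derive(Clone, Hash, PartialEq, Eq, PartialOrd, Ord)]
pub struct Constraint {
    pub kind: u8, // stand-in for ConstraintKind
    pub name: String,
    pub description: String,
}

/// Sort into a canonical order first, then hash, so that the same set of
/// constraints yields the same key regardless of input order.
pub fn compute_cache_key(constraints: &[Constraint]) -> u64 {
    let mut sorted: Vec<&Constraint> = constraints.iter().collect();
    sorted.sort();
    let mut hasher = DefaultHasher::new(); // assumption: replaces Wyhash
    for constraint in sorted {
        constraint.hash(&mut hasher);
    }
    hasher.finish()
}
```

Without the sort, two callers passing the same constraints in different orders would miss each other's cache entries.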
Braid supports incremental recompilation via fingerprint-based change
detection (IncrementalState). On recompile:
- Unchanged constraints skip reprocessing entirely.
- Changed constraints propagate through `getAffectedSubgraph`.
- If more than 80% of the graph is affected, Braid falls back to a full rebuild (it's faster than selective patching at that point).
CLaSH organizes constraints into 5 domains across 2 tiers:
| Domain | Tier | Enforcement |
|---|---|---|
| Syntax | Hard | Earley parser / PDA |
| Types | Hard | Prefix automata |
| Imports | Hard | Vocabulary subset |
| ControlFlow | Soft | Graded 0.0--1.0 |
| Semantics | Soft | Graded 0.0--1.0 |
Hard constraints define the feasible token set (binary pass/fail). Soft constraints rank candidates within that set. Domain fusion intersects hard masks, then applies additive reweighting from soft scores.
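
A minimal Rust sketch of that fusion step, under stated assumptions: masks are plain boolean vectors over the vocabulary, soft scores are per-token values in 0.0--1.0, and `soft_weight` is a hypothetical scaling factor. The real logic lives in `domain_fusion.zig` and may differ in representation.

```rust
/// Fuse hard masks and soft scores into adjusted logits.
/// Every mask and score vector is assumed to cover the full vocabulary;
/// shorter vectors would panic on indexing.
pub fn fuse(
    logits: &[f32],
    hard_masks: &[Vec<bool>],
    soft_scores: &[Vec<f32>],
    soft_weight: f32,
) -> Vec<f32> {
    logits
        .iter()
        .enumerate()
        .map(|(token, &logit)| {
            // Hard tier: a token survives only if every hard domain allows it.
            let allowed = hard_masks.iter().all(|mask| mask[token]);
            if !allowed {
                return f32::NEG_INFINITY;
            }
            // Soft tier: add the weighted sum of soft scores to the logit.
            let bonus: f32 = soft_scores.iter().map(|scores| scores[token]).sum();
            logit + soft_weight * bonus
        })
        .collect()
}
```

Setting a rejected token's logit to negative infinity gives it zero probability after softmax, while the additive bonus only reorders tokens that already passed every hard mask.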
CRANE-style phase switching relaxes constraints during reasoning tokens
and tightens them during structured output, preventing constraint
enforcement from interfering with chain-of-thought. During reasoning
phase with adaptive switching enabled, only the Syntax domain stays
active. During structured_output, all domains at the current
intensity level participate.
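
The phase rule can be written down as a small selection function. This Rust sketch is illustrative only: `Domain`, `Phase`, and `active_domains` are hypothetical names, and the `enabled` slice stands for whatever domains the current intensity level turns on.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Domain {
    Syntax,
    Types,
    Imports,
    ControlFlow,
    Semantics,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Reasoning,
    StructuredOutput,
}

/// Pick the domains that participate in enforcement for this phase.
pub fn active_domains(phase: Phase, adaptive_switching: bool, enabled: &[Domain]) -> Vec<Domain> {
    match phase {
        // Reasoning with adaptive switching: keep only Syntax (if enabled).
        Phase::Reasoning if adaptive_switching => enabled
            .iter()
            .copied()
            .filter(|domain| *domain == Domain::Syntax)
            .collect(),
        // Structured output, or switching disabled: everything enabled applies.
        _ => enabled.to_vec(),
    }
}
```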
Salience scoring (salience.zig) maps Homer's repository analysis into
constraint intensity levels. Homer produces a composite score (weighted
blend of PageRank 30%, betweenness 15%, HITS 15%, churn 15%, bus
factor 10%, code size 5%, test presence 10%) and a four-quadrant
classification:
| Quadrant | Centrality | Churn | Intensity | Confidence |
|---|---|---|---|---|
| FoundationalStable | High | Low | `full_hard` | High |
| ActiveHotspot | High | High | `full` | Medium |
| PeripheralActive | Low | High | `standard` | -- |
| QuietLeaf | Low | Low | `syntax_only` | -- |
Intensity levels form a lattice from none (no constraints) through
syntax_only, standard (Syntax + Types), full_hard (all 3 hard
domains), full (all 5 domains), to exhaustive (all domains plus
verification hooks). Each level carries a per-token latency budget:
50us for syntax-only up to 5000us for exhaustive.
The practical effect: foundational, stable code gets all three hard domains enforced, an active hotspot gets all five domains, and a rarely-touched leaf file might only get grammar checking. This avoids the "constrain everything equally" failure mode where enforcement cost swamps generation speed on code that doesn't need it.
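
The weights and the quadrant table translate into a few lines of Rust. This is a sketch, not `salience.zig`: the `Signals` struct, the assumption that every signal is normalized to 0.0--1.0, and the 0.5 cut-offs in `classify` are all placeholders. Only the weights and the quadrant-to-intensity mapping come from the text above.

```rust
/// Per-file signals, each assumed to be normalized to 0.0..=1.0.
pub struct Signals {
    pub pagerank: f64,
    pub betweenness: f64,
    pub hits: f64,
    pub churn: f64,
    pub bus_factor: f64,
    pub code_size: f64,
    pub test_presence: f64,
}

/// Weighted blend using the quoted percentages (they sum to 100%).
pub fn composite_score(s: &Signals) -> f64 {
    0.30 * s.pagerank + 0.15 * s.betweenness + 0.15 * s.hits + 0.15 * s.churn
        + 0.10 * s.bus_factor + 0.05 * s.code_size + 0.10 * s.test_presence
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quadrant { FoundationalStable, ActiveHotspot, PeripheralActive, QuietLeaf }

/// Declared in lattice order so that the derived ordering matches it.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Intensity { None, SyntaxOnly, Standard, FullHard, Full, Exhaustive }

/// Split on centrality and churn; the 0.5 thresholds are placeholders.
pub fn classify(centrality: f64, churn: f64) -> Quadrant {
    match (centrality >= 0.5, churn >= 0.5) {
        (true, false) => Quadrant::FoundationalStable,
        (true, true) => Quadrant::ActiveHotspot,
        (false, true) => Quadrant::PeripheralActive,
        (false, false) => Quadrant::QuietLeaf,
    }
}

/// Quadrant to intensity, following the table above.
pub fn intensity_for(quadrant: Quadrant) -> Intensity {
    match quadrant {
        Quadrant::FoundationalStable => Intensity::FullHard,
        Quadrant::ActiveHotspot => Intensity::Full,
        Quadrant::PeripheralActive => Intensity::Standard,
        Quadrant::QuietLeaf => Intensity::SyntaxOnly,
    }
}
```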
Temporal analysis (temporal.zig) adjusts constraint confidence based
on code stability over time. It classifies files by modification
frequency, applies co-change decay (recently-changed files get reduced
confidence), and modulates the salience-derived intensity accordingly.
A file that was stable for months but just got a major refactor should
temporarily have its constraint confidence reduced until the new
patterns settle.
When --fim mode is active, fim.zig analyzes the code surrounding a
cursor position. PrefixAnalysis examines what comes before the hole:
function context, type expectations, variable bindings in scope.
SuffixAnalysis examines what comes after: expected return types,
closing delimiters, downstream usage. Together they determine the
HoleScale -- whether the model needs to fill a single expression, a
statement, a block, or an entire function body. This scale determines
which constraint domains are relevant (a single expression needs type
constraints; a full function body needs all five domains).
The type system in braid/types/ implements cross-language type
reasoning:
- TypeArena allocates all types in a single arena. One `deinit` frees everything.
- Type is a tagged union with 12 variants: `primitive`, `array`, `tuple`, `object`, `function`, `union_type`, `intersection`, `optional`, `named`, `generic`, `reference`, `error_union`.
- PrimitiveKind has 20 variants spanning Zig's integer types, floats, JS/TS specials (`number`, `any`, `unknown`, `never`), and universals (`string`, `char`, `boolean`, `void_type`, `null_type`, `undefined`).
- TypeParser maps string type signatures from 10 languages into the unified `Type` representation.
- InhabitationGraph does BFS reachability over 9 edge kinds (`coercion`, `binary_op`, `property`, `method`, `application`, `indexing`, `construction`, `template`, `assertion`) to determine which types are constructible from available bindings.
- MaskGenerator converts inhabitation results into token masks.
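
The reachability question ("can a value of the goal type be built from what is in scope?") is a plain breadth-first search. The Rust sketch below is a simplified stand-in for `InhabitationGraph`: types and edge kinds are strings, and the `Graph` alias and `reachable` function are hypothetical names.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Adjacency list: source type -> list of (edge kind, target type).
pub type Graph = HashMap<&'static str, Vec<(&'static str, &'static str)>>;

/// Breadth-first search from the types of the available bindings to the goal type.
pub fn reachable(graph: &Graph, sources: &[&'static str], goal: &str) -> bool {
    let mut seen: HashSet<&str> = sources.iter().copied().collect();
    let mut queue: VecDeque<&str> = sources.iter().copied().collect();
    while let Some(ty) = queue.pop_front() {
        if ty == goal {
            return true;
        }
        // Follow every outgoing edge, whatever its kind, to unseen types.
        for &(_kind, next) in graph.get(ty).map(|edges| edges.as_slice()).unwrap_or(&[]) {
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    false
}
```

For example, with an edge `User --property--> string` and an edge `string --property--> number`, a `User` binding in scope makes `number` inhabitable, while a `number` binding alone does not make `User` inhabitable.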
Maze is the Rust crate that bridges Zig's constraint engine to GPU inference.
```
Zig (Ananke core) → C ABI → Rust FFI (maze/src/ffi.rs)
                                ↓
                        MazeOrchestrator
                         ↓            ↓
                   Modal client   sglang client
                         ↓            ↓
                     A100-80GB    vLLM + llguidance
```
The Zig core is compiled as a C-compatible library. Rust calls it via
FFI for constraint extraction and compilation, receiving a
ConstraintIR that it serializes and forwards to the inference
backend.
sglang: OpenAI-compatible HTTP with a constraint_spec extension
field. The grammar goes to llguidance for token masking; JSON schema
travels as structural metadata.
Modal: Custom /generate endpoint. Currently runs
Qwen2.5-Coder-32B-Instruct on an A100-80GB. Deployed via
modal deploy maze/modal_inference/inference.py.
The orchestrator (MazeOrchestrator in lib.rs) handles the full
generation lifecycle. It accepts a GenerationRequest (prompt,
constraints IR, max tokens, temperature, optional context) and returns
a GenerationResponse with the generated code, validation results,
and provenance metadata (which model, which constraints were active).
Supporting modules:
- `model_router.rs` + `model_selector.rs` + `adaptive_selector.rs`: pick the right backend and model for a given task. The adaptive selector adjusts based on task characteristics (code complexity, constraint density, language).
- `progressive_refinement.rs`: iterative generation -- generate, validate against constraints, refine, repeat until satisfied or budget exhausted.
- `diffusion.rs`: diffusion-based generation strategy (experimental).
- `strategy_stats.rs`: tracks per-strategy success rates and latencies.
- `telemetry.rs`: collects inference metrics for observability.
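
The generate, validate, refine loop can be sketched generically. This is not the API of `progressive_refinement.rs`: `Outcome` and `refine` are hypothetical, and the generator and validator are passed in as closures so the sketch stays independent of any backend.

```rust
/// Result of the last refinement round (hypothetical type for this sketch).
pub struct Outcome {
    pub code: String,
    pub violations: Vec<String>,
    pub rounds: usize,
}

/// Generate, validate, and feed violations back until the output is clean
/// or the round budget runs out. Returns None when the budget is zero.
pub fn refine(
    mut generate: impl FnMut(&[String]) -> String,
    validate: impl Fn(&str) -> Vec<String>,
    max_rounds: usize,
) -> Option<Outcome> {
    let mut feedback: Vec<String> = Vec::new();
    let mut last = None;
    for round in 1..=max_rounds {
        let code = generate(&feedback);
        let violations = validate(&code);
        let satisfied = violations.is_empty();
        // The next round sees exactly what this round got wrong.
        feedback = violations.clone();
        last = Some(Outcome { code, violations, rounds: round });
        if satisfied {
            break;
        }
    }
    last
}
```

Returning the last attempt even when violations remain lets the caller decide whether a partially compliant result is still useful.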
Zig's explicit allocator passing is used throughout. No hidden globals. The pattern is consistent:
```zig
var thing = try allocate(allocator);
errdefer thing.deinit(allocator);
// ... use thing ...
return thing; // caller owns it
```

- TypeArena (`braid/types/type_system.zig`): Single `ArenaAllocator` for all type allocations during inhabitation analysis. One `deinit` frees every `Type` node, every field slice, every string. No individual frees needed.
- GrammarInterner (`braid/string_interner.zig`): Deduplicates grammar rule strings and regex patterns. Grammar rules for common constructs (e.g., identifier patterns) appear hundreds of times across constraints; interning them saves meaningful memory.
- RingQueue: Fixed-size circular buffer used for LRU cache eviction ordering.
- SharedConstraintIR: Reference-counted wrapper around `ConstraintIR`. Cache stores the original; callers get either a clone (via `compile()`) or an acquired reference (via `compileShared()`). The shared path avoids cloning for read-only access -- important when the same constraint set is compiled repeatedly during iterative generation. `compileShared()` returns an acquired reference that the caller must `release()` when done; `compile()` returns an owned clone the caller must `deinit()`. Two APIs because the performance characteristics differ enough to matter: the shared path is O(1), the clone path is O(n) in IR size.
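
The difference between the two access paths can be shown with a Rust analogy. The Zig code manages reference counts by hand; in this sketch `Arc` plays that role, and `ConstraintIr`, `IrCache`, `compile`, and `compile_shared` are simplified stand-ins rather than the real API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified stand-in for ConstraintIR.
#[derive(Clone, Debug, PartialEq)]
pub struct ConstraintIr {
    pub grammar: String,
    pub priority: u32,
}

/// Cache that owns one reference-counted original per key.
#[derive(Default)]
pub struct IrCache {
    entries: HashMap<u64, Arc<ConstraintIr>>,
}

impl IrCache {
    pub fn insert(&mut self, key: u64, ir: ConstraintIr) {
        self.entries.insert(key, Arc::new(ir));
    }

    /// Shared path: bump the reference count, copy nothing. O(1).
    pub fn compile_shared(&self, key: u64) -> Option<Arc<ConstraintIr>> {
        self.entries.get(&key).cloned()
    }

    /// Clone path: deep-copy the IR so the caller owns it. O(n) in IR size.
    pub fn compile(&self, key: u64) -> Option<ConstraintIr> {
        self.entries.get(&key).map(|ir| (**ir).clone())
    }
}
```

In Rust the release happens automatically when the `Arc` is dropped; the Zig API makes the same step explicit through `release()`.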
Every allocation that could be abandoned on error gets an errdefer.
This is enforced by convention and caught by Zig's leak-detecting test
allocator, which fails the test if any allocation is not freed.
At last measurement: 370 Zig tests + 86 Rust tests = 456 total, zero
failures, zero memory leaks (Zig's std.testing.allocator detects
leaks as test failures).
Zig: Inline test "name" { ... } blocks colocated with the code
they test. This is idiomatic Zig -- tests live next to the functions
they exercise, share the same file scope, and run with
zig build test --summary all.
Rust: maze/tests/ for integration tests (ffi_tests.rs,
orchestrator_tests.rs, zig_integration_test.rs,
modal_client_tests.rs, end_to_end_tests.rs). Unit tests via
#[cfg(test)] modules inside maze/src/ files.
Eval fixtures: eval/tasks/fixtures/ contains Zig programs used
as evaluation targets.
Modules with the highest test density (test count in parentheses):
- `domain_fusion.zig` (13): CLaSH fusion correctness
- `fim.zig` (12): FIM prefix/suffix analysis
- `scope_context.zig` (11): scope graph integration
- `generate.zig` (11): CLI generation command
- `salience.zig` (10): salience scoring
- `mask_generator.zig` (8): type inhabitation masks
- `feasibility.zig` (7): conflict detection
- `temporal.zig` (7): temporal analysis
- `parser.zig` (7): type parser
- `call_graph_context.zig` (7): call graph context
The eval harness supports multi-sample pass@k evaluation with paired
constrained-vs-unconstrained comparison. MultiSampleEvaluator
generates n samples per task, pass_at_k.zig computes the unbiased
estimator, and statistical_tests.zig runs a paired t-test (p < 0.05
threshold) to determine whether constraints actually improve output
quality.
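
For reference, the unbiased pass@k estimator for a task with n samples, c of which pass, is 1 - C(n-c, k) / C(n, k). The Rust sketch below uses the usual product form to avoid large binomials. It is not taken from `pass_at_k.zig`, and the function name and signature are assumptions.

```rust
/// Unbiased pass@k: probability that at least one of k samples drawn
/// without replacement from n generated samples (c of them correct) passes.
pub fn pass_at_k(n: u64, c: u64, k: u64) -> f64 {
    assert!(c <= n, "correct samples cannot exceed total samples");
    assert!(k >= 1 && k <= n, "k must be in 1..=n");
    // Fewer than k failures: every draw of k samples contains a pass.
    if n - c < k {
        return 1.0;
    }
    // C(n-c, k) / C(n, k) equals the product over i in (n-c+1)..=n of (1 - k/i).
    let mut all_fail = 1.0_f64;
    for i in (n - c + 1)..=n {
        all_fail *= 1.0 - k as f64 / i as f64;
    }
    1.0 - all_fail
}
```

As a check, with n = 4, c = 2, k = 2 the formula gives 1 - C(2,2)/C(4,2) = 1 - 1/6, and with k = 1 it reduces to c/n.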
Task specs cover 24 categories (algorithms, API, async, caching, concurrency, data processing, data structures, database, error handling, file I/O, mathematics, memory management, messaging, parsing, patterns, performance, resilience, security, string processing, system utilities, type system, utilities, validation, web components) across 4 difficulty levels (simple, medium, moderate, complex).
```bash
# Build and test (Zig)
zig build test --summary all
# Build release binary
zig build -Doptimize=ReleaseSafe -p /tmp/ananke-build
# Rust tests (from maze/)
cargo test
# Deploy Modal inference
modal deploy maze/modal_inference/inference.py
```

CI runs 7 jobs: security, lint, coverage, ubuntu, macos, integration,
and gate. `zig fmt --check .` enforces formatting at the repository
root (including build.zig). `cargo fmt --check` enforces Rust
formatting in maze/.
One quirk: the tree-sitter-zig submodule always shows as dirty
(m vendor/tree-sitter-zig in git status). This is harmless -- a
generated file differs from what git expects. Do not commit it.
tree-sitter-swift is pinned to the 0.7.1-with-generated-files tag
of the alex-pinkus fork (the upstream main branch lacks parser.c).
| Document | What it covers |
|---|---|
| `docs/CLASH_ALGEBRA.md` | Formal CLaSH domain definitions, tier semantics |
| `docs/DOMAIN_FUSION.md` | ASAp + CRANE fusion algorithm details |
| `docs/TYPE_INHABITATION.md` | Type system, inhabitation graph, mask generation |
| `docs/FIM_GUIDE.md` | Fill-in-the-middle mode usage and internals |
| `docs/HOMER_INTEGRATION.md` | Homer MCP integration, scope graph, call graph |
| `docs/FFI_GUIDE.md` | Zig/Rust FFI boundary details |
| `docs/EVAL_GUIDE.md` | Evaluation framework usage |
| `docs/spec/` SPEC-01 through SPEC-05 | Feature specifications |
| `docs/adr/` ADR-001 through ADR-007 | Architectural decision records |