
Commit 02fce3c
improve skill
1 parent 3d4fb79

5 files changed

Lines changed: 655 additions & 11 deletions

File tree

.codex/skills/high-performance-java/SKILL.md

Lines changed: 54 additions & 9 deletions
@@ -1,28 +1,33 @@
 ---
 name: high-performance-java
-description: Use when writing, reviewing, or reshaping HotSpot Java where throughput, latency, allocation rate, zero-copy, lazy evaluation, non-materialization, intrinsics, SuperWord auto-vectorization, or C2 assembly matter. Bias toward specialized hot-path code, then ground claims in benchmarks and JIT evidence.
+description: Use when writing, reviewing, or reshaping HotSpot Java where algorithmic complexity, data-structure choice, throughput, latency, allocation rate, zero-copy, lazy evaluation, non-materialization, primitive collections, performance libraries, intrinsics, SuperWord auto-vectorization, or C2 assembly matter. Also use for advanced algorithmic problem solving in Java, including dynamic programming, graph/range techniques, and cache-aware code shape. Bias toward asymptotic wins first, then specialized hot-path code, then benchmark and JIT evidence.
 ---
 
 # High-Performance Java
 
-Use this skill for Java hot paths. Default bias: fewer allocations, fewer copies, less polymorphism, narrower code shape, stronger evidence.
+Use this skill for Java hot paths and algorithm-heavy Java. Default bias: asymptotic win first, then fewer allocations, fewer copies, less polymorphism, narrower code shape, stronger evidence.
 
 HotSpot-only v1. Baseline assumptions:
 - repo baseline: JDK 21
 - current local runtime may be newer
 - low-level claims stay provisional until benchmark + JIT evidence agree
+- algorithm/data-structure claims stay provisional until they match the actual workload constraints
 
 ## Core loop
 
-1. Identify the workload shape.
-2. Find the hot loop or hot call chain.
-3. Write the narrow fast path first.
-4. Push generic abstraction, materialization, and dispatch out of the loop.
-5. Benchmark before claiming improvement.
-6. Inspect HotSpot decisions before claiming JVM-level reasons.
+1. Identify the workload shape and constraints.
+2. Pick the algorithm and data structure that change the slope.
+3. Find the hot loop or hot call chain.
+4. Write the narrow fast path first.
+5. Push generic abstraction, materialization, and dispatch out of the loop.
+6. Benchmark before claiming improvement.
+7. Inspect HotSpot decisions before claiming JVM-level reasons.
 
 ## Default coding bias
 
+- Prefer an algorithmic win over a micro win.
+- Prefer data structures that fit the operation mix, memory budget, and key domain.
+- Prefer primitive-friendly layouts before boxed object graphs.
 - Prefer zero-copy over copy-transform-copy.
 - Prefer reuse over per-item allocation.
 - Prefer lazy traversal over full materialization.
@@ -33,15 +38,23 @@ HotSpot-only v1. Baseline assumptions:
 
 ## Hard rules
 
+- Do not micro-optimize a fundamentally wrong algorithm.
 - Do not defend a perf change with style arguments alone.
 - Do not claim “faster” without a measurement path.
 - Do not claim “JIT will optimize this” without checking inlining / compilation evidence.
+- Do not add a specialized library until you know what property it buys: fewer allocations, fewer copies, lower contention, off-heap layout, better primitive support, or a stronger algorithm.
 - Do not keep elegant-but-generic stream pipelines in verified hot loops.
 - Do not pay interface / visitor / wrapper overhead inside the hottest loop unless evidence shows it disappears.
+- Do not default to boxed `Map<K, V>` / `Set<T>` / `List<T>` shapes when primitive collections or flat arrays better fit the dominant path.
 
 ## Design checklist
 
 Ask these first:
+- What are `N`, `Q`, the update/query ratio, and the memory budget?
+- Is the main problem asymptotic complexity, cache locality, allocation pressure, branchiness, contention, or I/O?
+- What operation dominates: membership, counting, top-k, range query, join, shortest path, DP transition, parsing, encoding?
+- Can the key/value/state space stay primitive or bit-packed?
+- Can the workload become offline, batched, sorted, prefix-based, or compressed?
 - What allocates on the steady-state path?
 - What copies bytes, chars, arrays, or collections?
 - What materializes intermediate state that could stay streamed or cursor-based?
@@ -51,6 +64,18 @@ Ask these first:
 
 ## Workflow
 
+### 0) Pick the algorithmic shape
+
+- Estimate the real workload: input size, query count, mutation pattern, latency target, and memory ceiling.
+- Choose the algorithm and data structure before tuning loop syntax.
+- Favor contiguous, cache-friendly, primitive-heavy representations when semantics allow.
+- For dynamic programming, define state, transition cost, base case, iteration order, and whether state compression is possible.
+- For graph/range/string problems, look for offline transforms, prefix structures, monotonic structures, or specialized search before hand-tuning.
+
+Read these only when relevant:
+- [references/algorithms-data-structures.md](references/algorithms-data-structures.md) for algorithm and data-structure selection.
+- [references/advanced-coding-techniques.md](references/advanced-coding-techniques.md) for dynamic programming and advanced problem-solving patterns.
+
 ### 1) Shape the code for HotSpot
 
 - Split hot and cold paths.
@@ -84,10 +109,20 @@ When a benchmark moves, inspect what HotSpot actually did:
 
 Use sibling skill [hotspot-jit-forensics](../hotspot-jit-forensics/SKILL.md) for method-scoped C2 evidence. Use `async-profiler-java-macos` when wall/cpu/alloc evidence is needed on macOS.
 
-### 4) Report honestly
+### 4) Use libraries intentionally
+
+- Prefer the JDK first when it is close enough and operationally simpler.
+- Reach for specialized libraries when they remove boxing, copies, parser overhead, contention, or off-heap indirection the JDK cannot.
+- Check dependency health before adding a new library.
+- Benchmark the library choice against the simplest credible in-repo baseline.
+
+Library reference: [references/high-performance-java-libraries.md](references/high-performance-java-libraries.md).
+
+### 5) Report honestly
 
 Frame conclusions as:
 - hypothesis
+- algorithm/data-structure choice
 - benchmark result
 - JIT/profile evidence
 - confidence
@@ -99,21 +134,31 @@ If assembly is unavailable, say so and fall back to compilation logs, inlining d
 Use this skill when the user asks to:
 - remove allocation pressure from a parser, iterator, encoder, decoder, or query loop
 - make a Java path zero-copy or lazy
+- choose the right data structure for a Java workload
+- solve a dynamic programming, graph, interval, ranking, or range-query problem in Java under performance constraints
+- replace boxed collections with primitive or cache-friendly structures
+- choose between the JDK and specialized Java performance libraries
 - specialize code for one workload instead of many
 - explain whether a HotSpot optimization actually happened
 - ground a Java perf change in benchmark + C2 evidence
 
 ## Reference map
 
+- Algorithms and data structures: [references/algorithms-data-structures.md](references/algorithms-data-structures.md)
+- Advanced coding techniques: [references/advanced-coding-techniques.md](references/advanced-coding-techniques.md)
+- High-performance Java libraries: [references/high-performance-java-libraries.md](references/high-performance-java-libraries.md)
 - Coding rules: [references/coding-rules.md](references/coding-rules.md)
 - Evidence workflow: [references/evidence-workflow.md](references/evidence-workflow.md)
 - JDK version guardrails: [references/jdk-21-26-notes.md](references/jdk-21-26-notes.md)
 
 ## Output contract
 
 When you use this skill, the answer should usually include:
+- workload model and asymptotic bottleneck
+- algorithm and data-structure recommendation
 - hot-path hypothesis
 - concrete code-shape recommendation
+- library recommendation when a library meaningfully changes the design
 - benchmark command or benchmark evidence
 - JIT/profile evidence or the missing prerequisite
 - a confidence statement tied to the active JDK
Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
 interface:
   display_name: "High-Performance Java"
-  short_description: "Concise hot-path Java coding skill"
-  default_prompt: "Use $high-performance-java to write or review a Java hot path with benchmark and HotSpot evidence."
+  short_description: "Hot-path Java plus algorithm/perf-library guidance"
+  default_prompt: "Use $high-performance-java to choose the right algorithm, data structure, library, and HotSpot-friendly code shape for a high-performance Java task."
Lines changed: 220 additions & 0 deletions
@@ -0,0 +1,220 @@
# Advanced Coding Techniques

Use this reference when the problem needs more than basic loops and collections: dynamic programming, advanced search, state compression, offline transforms, or optimization patterns that materially change runtime.

## Dynamic programming checklist

Before writing code, define:
- state: the minimum information needed to continue
- transition: how one state moves to the next
- base case: the smallest solved states
- order: top-down memoization or bottom-up tabulation
- objective: min, max, count, feasibility, reconstruction
- memory plan: full table, rolling rows, bitset, or sparse map

If any of those are fuzzy, the DP is not ready.

## DP implementation bias in Java

- Prefer flat primitive arrays over nested object graphs.
- Flatten `dp[row][col]` into one array when locality matters.
- Use sentinel values (`INF`, `-1`, impossible masks) instead of wrapper objects.
- Compress dimensions aggressively when a transition only needs prior rows or prior prefixes.
- Use iterative tabulation when recursion depth or call overhead is risky.
- Use memoization when the reachable state space is sparse or pruning is strong.

## Common DP families

### 1D DP

Use for:
- linear decisions
- prefix optimization
- classic knapsack-style transitions

Java notes:
- Often compresses to one array.
- Direction matters: reverse iterate for 0/1 knapsack; forward iterate for unbounded knapsack.
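The direction rule can be sketched with a one-array knapsack; the class and method names here are illustrative, not part of the skill:

```java
// One dp array serves both knapsack variants; iteration direction is the only difference.
final class Knapsack {
    // 0/1 knapsack: iterate capacity downward so each item is counted at most once.
    static long maxValue01(int[] w, int[] v, int cap) {
        long[] dp = new long[cap + 1];
        for (int i = 0; i < w.length; i++)
            for (int c = cap; c >= w[i]; c--)
                dp[c] = Math.max(dp[c], dp[c - w[i]] + v[i]);
        return dp[cap];
    }

    // Unbounded knapsack: iterate upward so an item may be reused within the same pass.
    static long maxValueUnbounded(int[] w, int[] v, int cap) {
        long[] dp = new long[cap + 1];
        for (int i = 0; i < w.length; i++)
            for (int c = w[i]; c <= cap; c++)
                dp[c] = Math.max(dp[c], dp[c - w[i]] + v[i]);
        return dp[cap];
    }
}
```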

### 2D grid / sequence DP

Use for:
- edit distance
- LCS variants
- path counting
- interval composition

Java notes:
- Two rolling rows often replace the full matrix.
- Keep row-major iteration consistent with memory layout.
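A minimal rolling-row sketch for edit distance (names are illustrative): two reusable `int[]` rows replace the full `(n+1) x (m+1)` matrix.

```java
// Edit distance with two rolling primitive rows instead of a full matrix.
final class RollingRows {
    static int editDistance(String a, String b) {
        int m = b.length();
        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) prev[j] = j;   // distance from the empty prefix
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= m; j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) curr[j] = prev[j - 1];
                else curr[j] = 1 + Math.min(prev[j - 1], Math.min(prev[j], curr[j - 1]));
            }
            int[] t = prev; prev = curr; curr = t;  // swap rows, reuse buffers
        }
        return prev[m];
    }
}
```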

### Interval DP

Use for:
- merge cost
- matrix chain multiplication
- optimal parenthesization
- palindrome partitioning

Heuristic:
- Try increasing interval length order.
- Precompute reusable range costs.

### Tree DP

Use for:
- subtree aggregation
- rerooting
- independent set / matching variants on trees

Java notes:
- Iterative traversal can avoid stack overflow.
- Store parent/index arrays once; reuse buffers for passes.

### DAG DP

Use for:
- longest path in DAG
- path counts
- dependency-ordered optimization

Heuristic:
- Topological order first, transitions second.
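The order-then-transition heuristic, sketched for longest path in a DAG (class name and edge encoding are illustrative): a Kahn topological pass establishes the order, and edges are relaxed as vertices are popped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Longest path (in edges) of a DAG: Kahn topological order, then relax in order.
final class DagDp {
    static int longestPath(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        int[] indeg = new int[n];
        for (int v = 0; v < n; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); indeg[e[1]]++; }
        ArrayDeque<Integer> q = new ArrayDeque<>();
        for (int v = 0; v < n; v++) if (indeg[v] == 0) q.add(v);
        int[] dist = new int[n];
        int best = 0;
        while (!q.isEmpty()) {
            int u = q.poll();
            best = Math.max(best, dist[u]);
            for (int v : adj.get(u)) {
                dist[v] = Math.max(dist[v], dist[u] + 1);  // transition after ordering
                if (--indeg[v] == 0) q.add(v);
            }
        }
        return best;
    }
}
```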

### Bitmask DP

Use for:
- small `n` subset problems
- travelling-salesman-style state
- assignment and partition variants

Java notes:
- Use `int` masks up to 31 bits, `long` masks up to 63.
- Precompute subset transitions when reused heavily.
- Beware exponential memory growth; consider meet-in-the-middle.
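An assignment-variant sketch using `int` masks (illustrative names): `dp[mask]` is the cheapest way to hand the tasks in `mask` to the first `bitCount(mask)` workers.

```java
import java.util.Arrays;

// Bitmask DP over assigned tasks: dp[mask] = min cost of giving the tasks in
// `mask` to the first Integer.bitCount(mask) workers.
final class Assignment {
    static int minCost(int[][] cost) {
        int n = cost.length;
        int[] dp = new int[1 << n];
        Arrays.fill(dp, Integer.MAX_VALUE);
        dp[0] = 0;
        for (int mask = 0; mask < (1 << n); mask++) {
            if (dp[mask] == Integer.MAX_VALUE) continue;  // unreachable state
            int worker = Integer.bitCount(mask);
            if (worker == n) continue;
            for (int task = 0; task < n; task++) {
                if ((mask & (1 << task)) == 0) {
                    int next = mask | (1 << task);
                    dp[next] = Math.min(dp[next], dp[mask] + cost[worker][task]);
                }
            }
        }
        return dp[(1 << n) - 1];
    }
}
```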

### Digit DP

Use for:
- counting numbers with digit constraints
- lexicographic numeric constraints

State usually includes:
- position
- tight/limited flag
- started/leading-zero flag
- problem-specific accumulator

## DP optimization patterns

### Prefix/suffix acceleration

If a transition scans prior states, ask whether prefix minima/maxima/sums can reduce it from `O(n^2)` to `O(n)`.

### Monotonic queue optimization

Use when transitions need min/max over a sliding window.
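A sketch under an assumed transition `dp[i] = a[i] + min(dp[i-k] .. dp[i-1])` (names illustrative): a deque of indices with increasing `dp` values keeps the window minimum at the head, for amortized O(1) per step.

```java
import java.util.ArrayDeque;

// Monotonic-deque DP: dp[i] = a[i] + min over the trailing window of size k.
final class MonotonicQueueDp {
    static long minPathCost(long[] a, int k) {
        int n = a.length;
        long[] dp = new long[n];
        ArrayDeque<Integer> dq = new ArrayDeque<>();
        dp[0] = a[0];
        dq.addLast(0);
        for (int i = 1; i < n; i++) {
            while (dq.peekFirst() < i - k) dq.pollFirst();          // drop expired indices
            dp[i] = a[i] + dp[dq.peekFirst()];                      // window minimum at head
            while (!dq.isEmpty() && dp[dq.peekLast()] >= dp[i]) dq.pollLast();
            dq.addLast(i);
        }
        return dp[n - 1];
    }
}
```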

### Divide-and-conquer DP optimization

Use when the optimal split point is monotonic across rows or columns.

### Convex hull trick / Li Chao tree

Use when transitions are of the form:
- `dp[i] = min_j(m[j] * x[i] + b[j])`
- `max` variant of the same

Only use when the algebra really matches.

### Bitset DP

Use when boolean subset transitions can become word-parallel bit operations.

Examples:
- subset sum
- knapsack feasibility
- reachability layers
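The subset-sum case sketched with raw `long` words (illustrative names; `java.util.BitSet` has no shift, so the word array is manual): bit `s` of `bits` is set iff some subset sums to `s`, and each item costs one `O(target/64)` pass.

```java
// Subset-sum feasibility as word-parallel bit operations: bits |= bits << x per item.
final class BitsetSubsetSum {
    static boolean reachable(int[] nums, int target) {
        long[] bits = new long[target / 64 + 1];
        bits[0] = 1L;                                   // empty subset sums to 0
        for (int x : nums) {
            int w = x / 64, b = x % 64;
            for (int i = bits.length - 1; i >= 0; i--) {  // high-to-low: each item used once
                long shifted = (i - w >= 0) ? bits[i - w] << b : 0L;
                // carry bits from the next-lower word; guard b != 0 (shifts are mod 64)
                if (b != 0 && i - w - 1 >= 0) shifted |= bits[i - w - 1] >>> (64 - b);
                bits[i] |= shifted;
            }
        }
        return (bits[target / 64] >>> (target % 64) & 1L) == 1L;
    }
}
```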

### State compression

Reduce dimensions by:
- keeping only prior row/column
- encoding booleans into bits
- coordinate-compressing sparse values
- using ids instead of objects

## Search and optimization patterns

### Binary search on answer

Use when:
- feasibility is monotonic
- exact objective is hard but checking a threshold is easier
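A standard shape for this pattern, sketched on the ship-packages-within-days problem (illustrative names): the exact minimum capacity is hard directly, but checking one capacity is a linear scan, and feasibility is monotonic in capacity.

```java
// Binary search on the answer: smallest capacity that ships all weights within `days`.
final class ShipWithinDays {
    static int minCapacity(int[] weights, int days) {
        int lo = 0, hi = 0;
        for (int w : weights) { lo = Math.max(lo, w); hi += w; }   // answer lies in [max, sum]
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (feasible(weights, mid, days)) hi = mid; else lo = mid + 1;
        }
        return lo;
    }

    // Greedy check: can all weights ship in order within `days` at capacity `cap`?
    static boolean feasible(int[] weights, int cap, int days) {
        int used = 1, load = 0;
        for (int w : weights) {
            if (load + w > cap) { used++; load = 0; }
            load += w;
        }
        return used <= days;
    }
}
```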

### Meet-in-the-middle

Use when:
- brute force is `2^n`
- `n` is small enough to split into two `2^(n/2)` halves

### Branch and bound

Use when:
- you can compute tight upper/lower bounds
- a good heuristic ordering prunes much of the tree

### Iterative deepening

Use when:
- memory is tight
- solution depth is unknown but usually shallow

### Offline query processing

Use when:
- query order is irrelevant
- sorting queries/events lets you reuse structure updates

## Greedy and exchange-thinking

Before building DP or search, test whether a greedy proof exists:
- local choice stays globally optimal
- exchange argument repairs any non-greedy optimal solution
- matroid-like or interval-scheduling structure is present

If greedy works, it often beats DP both asymptotically and operationally.

## Range and sequence patterns

- Sliding window: monotonic boundary expansion or contraction.
- Two pointers: sorted arrays, pair/triple sums, dedup, partitioning.
- Monotonic stack: next greater/smaller, histogram, span problems.
- Difference arrays: batch range updates.
- Prefix sums / xor / hashes: cheap repeated range queries.
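The difference-array entry above can be sketched as follows (illustrative names): each range addition costs O(1), and one prefix pass materializes the final values.

```java
// Difference array: batch range adds in O(updates + n) instead of O(updates * range).
final class RangeAdds {
    // Each update is {from, toInclusive, delta}.
    static long[] apply(int n, int[][] updates) {
        long[] diff = new long[n + 1];
        for (int[] u : updates) {
            diff[u[0]] += u[2];        // start the delta at `from`
            diff[u[1] + 1] -= u[2];    // cancel it just past `toInclusive`
        }
        long[] out = new long[n];
        long run = 0;
        for (int i = 0; i < n; i++) {  // prefix pass materializes the values
            run += diff[i];
            out[i] = run;
        }
        return out;
    }
}
```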

## Java-specific implementation notes

- Avoid recursion for deep graphs, trees, or DP unless the depth bound is small.
- Replace tuple objects with parallel arrays or packed longs in hot paths.
- Pre-size arrays and reusable buffers for repeated test cases.
- Be explicit about overflow; use `long` for counts/costs unless `int` is proven safe.
- Separate correctness code from hot code paths once the algorithm is clear.
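The packed-long note above reduces to three one-liners (illustrative names): two ints share one `long`, avoiding a tuple allocation per element; the mask on `b` keeps negative low halves from clobbering the high half.

```java
// Pack an (int, int) pair into one long; no per-element object allocation.
final class Packed {
    static long pack(int a, int b) { return ((long) a << 32) | (b & 0xFFFFFFFFL); }
    static int first(long p)  { return (int) (p >>> 32); }   // high 32 bits
    static int second(long p) { return (int) p; }             // low 32 bits
}
```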

## Problem-solving ladder

When stuck, try this order:
1. Can I sort or batch the work?
2. Can I precompute prefix, suffix, or compressed state?
3. Can a different data structure remove a nested loop?
4. Is the problem actually graph, interval, or DP in disguise?
5. Can the state shrink to primitives or bits?
6. Can I prove greedy, monotonicity, or convexity?

## Red flags

- DP state includes fields that do not affect future transitions.
- Memoization key is a heavyweight object when a few ints suffice.
- Full `O(n^2)` table retained even though only one frontier is used.
- Search explores symmetric states repeatedly.
- A library data structure is used where a flat array plus sort is enough.
