Use this workflow before making strong performance claims.
- Reproduce with the local benchmark wrapper:

```
scripts/run-single-benchmark.sh --module <module> --class <fqcn> --method <benchmarkMethod>
```

- If the benchmark moves but the cause is unclear:
  - use `--enable-jfr` for benchmark-side JFR capture (a manual capture sketch follows this list)
  - or use `async-profiler-java-macos` for CPU / allocation / wall-clock evidence on macOS
- If code shape or JIT behavior is the question:
  - use `hotspot-jit-forensics`
  - capture compilation tier, inlining decisions, and method-scoped C2 evidence
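If the wrapper's `--enable-jfr` path cannot be used, a recording can also be captured from inside the process with the standard `jdk.jfr` API. A minimal sketch, where the hypothetical `runWorkload()` stands in for the code under test:

```java
import java.nio.file.Path;
import jdk.jfr.Recording;

public class JfrCapture {
    public static void main(String[] args) throws Exception {
        // Record only the measured region, not JVM startup or warmup noise.
        try (Recording recording = new Recording()) {
            recording.setName("benchmark-evidence");
            recording.start();
            runWorkload(); // hypothetical stand-in for the benchmark body
            recording.stop();
            // Dump to disk for inspection with `jfr print` or JDK Mission Control.
            recording.dump(Path.of("benchmark.jfr"));
        }
    }

    private static void runWorkload() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;
        }
        System.out.println(sum); // consume the result so the loop stays live
    }
}
```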
- Build the smallest reproducible JMH or app-level benchmark (a minimal sketch appears after this list).
- Capture baseline result.
- Change code shape.
- Capture candidate result with the same JVM, flags, input size, and warmup assumptions.
- If the delta matters, inspect JIT evidence:
```
java \
  -XX:+UnlockDiagnosticVMOptions \
  -XX:+LogCompilation \
  -XX:LogFile=jit.xml \
  -XX:+PrintCompilation \
  -jar app.jar
```

If assembly or per-method diagnostics are needed, move to focused compiler directives and the `hotspot-jit-forensics` workflow.
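For the “smallest reproducible benchmark” step above, a minimal JMH shape looks like the sketch below; the class name and summing workload are illustrative assumptions, not code from this repo:

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Illustrative benchmark shape only; replace sumLoop with the code under test.
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 5, time = 1)
@Fork(2)
public class SumLoopBench {
    long[] data;

    @Setup
    public void setUp() {
        data = new long[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
    }

    @Benchmark
    public long sumLoop() {
        long sum = 0;
        for (long v : data) {
            sum += v;
        }
        return sum; // return the result so the JIT cannot dead-code-eliminate the loop
    }
}
```

Run baseline and candidate through the same wrapper invocation so JVM, flags, input size, and warmup stay identical.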
Report these five items:
- benchmark delta: throughput/latency before vs after
- allocation delta: lower / unchanged / unknown
- JIT evidence: inline success/failure, tier, bailout, intrinsic, vectorization clue, or “not inspected”
- exact command or benchmark selector
- confidence: high / medium / low
  - High: repeatable benchmark delta plus matching profile/JIT evidence
  - Medium: repeatable benchmark delta without definitive low-level proof
  - Low: a single run, a noisy run, or an unverified JVM explanation
Do not stop at “assembly unavailable”. Still collect:
- `jit.xml`
- compiler directives output
- `PrintCompilation` / inlining diagnostics
- async-profiler or JFR evidence

Then say the exact missing piece: for example, hsdis not installed, or assembly printing not enabled (`-XX:+PrintAssembly` requires the hsdis disassembler library).
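Even without assembly, `jit.xml` is usable on its own. A hedged sketch that tallies C2 inline-failure reasons from a cleanly closed LogCompilation file; it assumes the log parses as XML and relies on the `inline_fail` elements with a `reason` attribute that LogCompilation emits:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class InlineFailSummary {
    public static void main(String[] args) throws Exception {
        Path log = Path.of(args.length > 0 ? args[0] : "jit.xml");
        Map<String, Integer> reasons = new TreeMap<>();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try (var in = Files.newInputStream(log)) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                // Count every <inline_fail reason='...'/> element in the log.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "inline_fail".equals(reader.getLocalName())) {
                    String reason = reader.getAttributeValue(null, "reason");
                    reasons.merge(reason == null ? "(unspecified)" : reason, 1, Integer::sum);
                }
            }
        }
        reasons.forEach((reason, count) -> System.out.printf("%6d  %s%n", count, reason));
    }
}
```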