65 commits
65aa608
initial commit
hmottestad Apr 17, 2026
a12aec7
benchmark
hmottestad Apr 17, 2026
f3306e0
everything else
hmottestad Apr 17, 2026
b065b37
wip
hmottestad Apr 17, 2026
385f650
wip
hmottestad Apr 17, 2026
507069b
wip
hmottestad Apr 17, 2026
eab1486
wip
hmottestad Apr 17, 2026
e461fdd
wip
hmottestad Apr 17, 2026
b6122c4
wip
hmottestad Apr 18, 2026
db7ccb4
wip
hmottestad Apr 18, 2026
54862c4
learned
hmottestad Apr 18, 2026
965c62f
learned
hmottestad Apr 18, 2026
84325b1
learned
hmottestad Apr 18, 2026
1b8ba70
skill
hmottestad Apr 18, 2026
d01550d
fixes
hmottestad Apr 18, 2026
fb7d7e7
fixes
hmottestad Apr 19, 2026
cb1c672
improved loading to 1 million statements per second
hmottestad Apr 19, 2026
99a9de2
improved loading to 1 million statements per second
hmottestad Apr 19, 2026
a6db343
improved loading to 1 million statements per second
hmottestad Apr 19, 2026
a820100
improvef query explanation
hmottestad Apr 19, 2026
23d1373
improvef query explanation
hmottestad Apr 19, 2026
9a8e503
wip
hmottestad Apr 20, 2026
e5ddae0
wip
hmottestad Apr 20, 2026
fb7c471
wip
hmottestad Apr 20, 2026
adbd443
slower and faster
hmottestad Apr 20, 2026
be83a0f
slower and faster
hmottestad Apr 20, 2026
0de0f79
slower and faster
hmottestad Apr 20, 2026
8201da8
queries are faster, but optimization takes the most time
hmottestad Apr 20, 2026
39b97e0
wip
hmottestad Apr 21, 2026
3f0d4c4
remove some skills
hmottestad Apr 21, 2026
ada255c
remove some skills
hmottestad Apr 21, 2026
0675c7c
good query plans, but large overhead for optimizing the query
hmottestad Apr 21, 2026
6bf9c57
lower query optimizer overhead
hmottestad Apr 21, 2026
c430058
wip
hmottestad Apr 22, 2026
3d8f152
very good results with few regressions
hmottestad Apr 22, 2026
b389762
very good results with few regressions
hmottestad Apr 22, 2026
67b9888
very good results with few regressions
hmottestad Apr 22, 2026
71430df
very good results with few regressions
hmottestad Apr 22, 2026
45c455b
Merge remote-tracking branch 'origin/develop' into sketch-based-optim…
hmottestad Apr 22, 2026
d00c92e
big rewrite
hmottestad Apr 23, 2026
e04feb4
big rewrite
hmottestad Apr 23, 2026
e4f50e3
fine tune rewrite
hmottestad Apr 23, 2026
fe4130b
fine tune rewrite
hmottestad Apr 23, 2026
e97c4ed
fine tune rewrite
hmottestad Apr 23, 2026
8c82b2e
fine tune rewrite
hmottestad Apr 24, 2026
cc889b0
fine tune rewrite
hmottestad Apr 24, 2026
2acb7ca
lmdb store gets its own optimizer pipeline and custom optimizers
hmottestad Apr 24, 2026
774b116
lmdb store gets its own optimizer pipeline and custom optimizers
hmottestad Apr 24, 2026
7d1e223
wip
hmottestad Apr 24, 2026
26473bf
wip
hmottestad Apr 24, 2026
d968ff5
skills and scripts to help diagnose regressions
hmottestad Apr 24, 2026
8618d7a
keep fine tuning
hmottestad Apr 24, 2026
d39bef1
revert query join optimizer to the old implementation
hmottestad Apr 24, 2026
c06f66c
continue
hmottestad Apr 24, 2026
c010e4b
continue
hmottestad Apr 24, 2026
49b1538
memory management
hmottestad Apr 24, 2026
dc0c9f5
mapped everything
hmottestad Apr 24, 2026
58101f7
wip
hmottestad Apr 25, 2026
6937938
Merge branch 'refs/heads/develop' into sketch-based-optimizer-4-new-u…
hmottestad Apr 25, 2026
d97d5f1
better memory management
hmottestad Apr 25, 2026
f2b45fe
maybe better in some cases but also mostly worse
hmottestad Apr 26, 2026
6d2f28f
maybe better in some cases but also mostly worse
hmottestad Apr 26, 2026
3d6cb91
maybe better in some cases but also mostly worse
hmottestad Apr 26, 2026
c8f67e3
SOCIAL_MEDIA +152.4%
hmottestad Apr 26, 2026
e7fe55c
best overall
hmottestad Apr 27, 2026
24 changes: 20 additions & 4 deletions .codex/skills/jmh-benchmark-compare/scripts/jmh_compare_core.py
@@ -17,6 +17,11 @@
     re.IGNORECASE,
 )
 METRIC_COLUMNS = {"Score", "Error", "Cnt"}
+JMH_MODES = {"thrpt", "avgt", "sample", "ss", "all"}
+STRICT_NUM_RE = re.compile(
+    r"[-+]?(?:(?:\d+(?:,\d{3})*|\d+)(?:\.\d+)?|\.\d+)(?:[eE][-+]?\d+)?|[-+]?(?:inf|nan)",
+    re.IGNORECASE,
+)
 DATE_PATTERNS = (
     re.compile(
         r"(20\d{2})[-_]?([01]\d)[-_]?([0-3]\d)[Tt _-]?([0-2]\d)[-_:]?([0-5]\d)(?:[-_:]?([0-5]\d))?"
@@ -123,17 +128,26 @@ def is_int_token(text: str) -> bool:
     return bool(re.fullmatch(r"[+-]?\d+", text or ""))


+def is_numeric_metric_token(text: str) -> bool:
+    value = (text or "").strip()
+    if value.endswith("±"):
+        value = value[:-1].strip()
+    return bool(STRICT_NUM_RE.fullmatch(value))
+
+
 def has_valid_metric_values(row: Dict[str, str], columns: Sequence[str]) -> bool:
+    if "Mode" in columns and row.get("Mode", "").strip().lower() not in JMH_MODES:
+        return False
     for col in columns:
         value = row.get(col, "")
         if col == "Score":
-            if extract_numeric(value) is None:
+            if not is_numeric_metric_token(value):
                 return False
         elif col == "Cnt" and value:
             if not is_int_token(value):
                 return False
         elif col == "Error" and value:
-            if extract_numeric(value) is None:
+            if not is_numeric_metric_token(value):
                 return False
     return True

@@ -227,8 +241,6 @@ def parse_file(path: Path, label: str, id_columns: Optional[str], timestamp_sour
     for line in lines[header_idx + 1 :]:
         stripped = line.strip()
         if not stripped:
-            if saw_data:
-                break
             continue
         if stripped.startswith("#"):
             continue
@@ -244,6 +256,10 @@ def parse_file(path: Path, label: str, id_columns: Optional[str], timestamp_sour
             if saw_data:
                 break
             continue
+        if not has_valid_metric_values(row, columns):
+            if saw_data:
+                break
+            continue
         score = extract_numeric(row.get("Score", ""))
         if score is None:
             if saw_data and (stripped.startswith("Result") or stripped.startswith("Secondary result")):
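The stricter token check added to `jmh_compare_core.py` can be exercised on its own. The sketch below copies `STRICT_NUM_RE` and `is_numeric_metric_token` from the diff and runs them on a few tokens. The sample tokens and the `SAMPLES` and `results` names are illustrative assumptions, not values taken from the benchmark result files.

```python
import re

# Copied from the diff: accepts plain, comma-grouped, and scientific-notation
# numbers, plus inf/nan, and rejects everything else.
STRICT_NUM_RE = re.compile(
    r"[-+]?(?:(?:\d+(?:,\d{3})*|\d+)(?:\.\d+)?|\.\d+)(?:[eE][-+]?\d+)?|[-+]?(?:inf|nan)",
    re.IGNORECASE,
)


def is_numeric_metric_token(text: str) -> bool:
    value = (text or "").strip()
    if value.endswith("±"):  # drop a trailing plus-minus sign, as the diff does
        value = value[:-1].strip()
    return bool(STRICT_NUM_RE.fullmatch(value))


# Hypothetical sample tokens, chosen only to illustrate the behaviour.
SAMPLES = ["10.0", "1,234.5", "1e-3", "NaN", "0.5 ±", "k=64,", "subjectBuckets=4096", ""]

results = {token: is_numeric_metric_token(token) for token in SAMPLES}
for token, ok in results.items():
    print(f"{token!r:>24} -> {ok}")
```

Because the whole token must match, log fragments such as `k=64,` are rejected even though they contain digits, which is presumably why the diff replaces `extract_numeric` with this check for the `Score` and `Error` columns.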
@@ -20,7 +20,7 @@ def test_missing_cnt_and_error_values_do_not_shift_score(self) -> None:
         repo_root = SCRIPT_DIR.parents[3]
         result_file = (
             repo_root
-            / "core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/results-2026-03-01.md"
+            / "core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/results-2026-03-01.md"
         )

         parsed = core.parse_file(result_file, "results-2026-03-01", None, "mtime")
@@ -33,7 +33,7 @@ def test_plus_minus_error_rows_keep_score_numeric(self) -> None:
         repo_root = SCRIPT_DIR.parents[3]
         result_file = (
             repo_root
-            / "core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/results-2026-03-04.md"
+            / "core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/results-2026-03-04.md"
         )

         parsed = core.parse_file(result_file, "results-2026-03-04", None, "mtime")
@@ -74,6 +74,58 @@ def test_compare_uses_column_names_when_key_order_differs(self) -> None:
         self.assertAlmostEqual(row["Score [right]"], 20.0, places=3)
         self.assertAlmostEqual(row["Diff % [right - left]"], 100.0, places=3)

+    def test_blank_lines_between_jmh_rows_do_not_end_table(self) -> None:
+        results = "\n".join(
+            [
+                "Benchmark (themeName) (z_queryIndex) Mode Score Units",
+                "ThemeQueryBenchmark.executeQuery MEDICAL_RECORDS 0 avgt 10.0 ms/op",
+                "",
+                "ThemeQueryBenchmark.executeQuery SOCIAL_MEDIA 8 avgt 20.0 ms/op",
+            ]
+        )
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            result_file = Path(tmpdir) / "results.txt"
+            result_file.write_text(results, encoding="utf-8")
+
+            parsed = core.parse_file(result_file, "results", None, "mtime")
+
+        self.assertEqual(len(parsed.rows), 2)
+        key = ("ThemeQueryBenchmark.executeQuery", "SOCIAL_MEDIA", "8", "avgt", "ms/op")
+        self.assertIn(key, parsed.score_by_key)
+        self.assertAlmostEqual(parsed.score_by_key[key], 20.0, places=3)
+
+    def test_non_jmh_text_after_blank_does_not_parse_as_rows(self) -> None:
+        results = "\n".join(
+            [
+                "Benchmark (themeName) (z_queryIndex) Mode Cnt Score Error Units",
+                "ThemeQueryBenchmark.executeQuery MEDICAL_RECORDS 0 avgt 10.0 ms/op",
+                "",
+                "ThemeQueryBenchmark.executeQuery SOCIAL_MEDIA 8 avgt 20.0 ms/op",
+                "",
+                "Initializing state: k=64, subjectBuckets=4096, predicateBuckets=64, "
+                "objectBuckets=4096, contextBuckets=16, contextPairSketchesEnabled=false",
+                "Projection (resultSizeActual=1, hasNextCallCountActual=2)",
+            ]
+        )
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            result_file = Path(tmpdir) / "results.txt"
+            result_file.write_text(results, encoding="utf-8")
+
+            parsed = core.parse_file(result_file, "results", None, "mtime")
+
+        self.assertEqual(
+            [
+                row["Benchmark"]
+                for row in parsed.rows
+            ],
+            [
+                "ThemeQueryBenchmark.executeQuery",
+                "ThemeQueryBenchmark.executeQuery",
+            ],
+        )
+

 if __name__ == "__main__":
     unittest.main()
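Read together, the parser change and the two new tests describe a scanning rule: blank lines are skipped, and the table ends only at the first non-JMH row that follows real data. A minimal sketch of that rule follows, assuming a fixed six-token row layout. `scan_rows`, `SIMPLE_NUM_RE`, and the whitespace splitting are hypothetical simplifications and do not reproduce the column handling in `parse_file`.

```python
import re
from typing import List

JMH_MODES = {"thrpt", "avgt", "sample", "ss", "all"}
SIMPLE_NUM_RE = re.compile(r"[-+]?\d+(?:\.\d+)?")  # simplified stand-in for STRICT_NUM_RE


def scan_rows(lines: List[str]) -> List[List[str]]:
    """Collect JMH-looking rows and stop at the first invalid row after data."""
    rows: List[List[str]] = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # a blank line no longer ends the table
        # Assumed layout: benchmark, theme, query index, mode, score, units.
        valid = (
            len(tokens) == 6
            and tokens[3].lower() in JMH_MODES
            and SIMPLE_NUM_RE.fullmatch(tokens[4]) is not None
        )
        if not valid:
            if rows:
                break  # trailing log output ends the table
            continue  # header or preamble before the first data row
        rows.append(tokens)
    return rows


sample = [
    "Benchmark (themeName) (z_queryIndex) Mode Score Units",
    "ThemeQueryBenchmark.executeQuery MEDICAL_RECORDS 0 avgt 10.0 ms/op",
    "",
    "ThemeQueryBenchmark.executeQuery SOCIAL_MEDIA 8 avgt 20.0 ms/op",
    "",
    "Initializing state: k=64, subjectBuckets=4096",
    "Projection (resultSizeActual=1, hasNextCallCountActual=2)",
]
parsed_rows = scan_rows(sample)
print(len(parsed_rows))  # prints 2
```

The header line is skipped rather than treated as a terminator because no data row has been seen yet; the `Initializing state` line arrives after data and therefore ends the scan.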
11 changes: 11 additions & 0 deletions .codex/skills/mvnf/SKILL.md
@@ -30,3 +30,14 @@ If the test run fails, it prints the list of Surefire/Failsafe report files unde
 - `--module <path>`: Force the module when the test class name exists in multiple modules.
 - `--it`: Treat the selector as an integration test and pass it via `-Dit.test=...`.
 - `--no-offline`: Run Maven commands without `-o` (useful if offline resolution fails).
+
+## LMDB regression speedup note
+
+For LMDB theme regression/snapshot tests, enable persistent prepared stores to skip repeated dataset rebuilds:
+
+- `-Drdf4j.lmdb.themeRegression.persistentStore.enabled=true`
+- Optional root override: `-Drdf4j.lmdb.themeRegression.persistentStore.root=persistent-lmdb-theme-store`
+
+`mvnf.py` does not forward arbitrary `-D` flags today, so use direct Maven for this mode, for example:
+
+- `mvn -o -Dmaven.repo.local=.m2_repo -pl core/sail/lmdb -Dtest=LmdbThemeQueryRegressionTest#socialMediaFiveCycleInterleavesValuesWithFollowsEdges -Drdf4j.lmdb.themeRegression.persistentStore.enabled=true test`
74 changes: 73 additions & 1 deletion .codex/skills/query-plan-snapshot-cli/SKILL.md
@@ -5,10 +5,82 @@ description: Use QueryPlanSnapshotCli to capture and compare RDF4J query plans,

 # query-plan-snapshot-cli

-Use this skill to run reproducible query-plan captures and classify likely regression/improvement signals.
+Use this skill to run reproducible query-plan captures, triage historical theme-query benchmark results, and classify likely regression/improvement signals.

 ## Fast workflow

+1. Capture raw benchmark output into a normalized result file when needed.
+2. Analyze the newest dated run against historical results.
+3. Drill into the fastest known runs for a specific theme/query.
+4. If needed, capture baseline/candidate plan snapshots and diff them semantically.
+
+## History triage
+
+Result files live in:
+
+- `core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results`
+
+Normalize raw JMH output into a new result file:
+
+- `pbpaste | scripts/theme-query-benchmark-results.sh capture`
+- `scripts/theme-query-benchmark-results.sh capture raw-jmh.txt`
+
+Analyze only the queries that are more than 20% slower than history:
+
+- `core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/analyze-theme-query-history.sh`
+
+Sort regressions from biggest to smallest:
+
+- `core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/analyze-theme-query-history.sh --sort-regressions`
+
+Only print the top N regressions:
+
+- `core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/analyze-theme-query-history.sh --top 10`
+
+Analyze every latest query, including current-run wins over previous best:
+
+- `core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/analyze-theme-query-history.sh --all`
+
+Drill into the three fastest known runs for one theme/query and print optimized plan/query when present:
+
+- `core/sail/lmdb/src/test/java/org/eclipse/rdf4j/sail/lmdb/benchmark/theme-query-benchmark-results/analyze-theme-query-history.sh --theme PHARMA --query-index 10`
+
+Interpretation:
+
+- Default mode: newest dated file only for the “latest” baseline; compares against all other `results-*.md`, including `results-develop.md` and `results-main-branch.md`, but prints only queries where latest is more than 20% slower than historical best.
+- `--sort-regressions`: flat regression list, biggest slowdown first.
+- `--top N`: top N regressions only; implies regression sorting.
+- `--all`: prints every latest query; if latest is a new best it prints how much faster it is than the previous best.
+- Query detail mode: top three runs sorted by score ascending; ties prefer richer files with plan/query content.
+- `plan no | query yes`: optimized query rendered, no physical plan block in that result file.
+- `plan no | query no`: summary-only run or no per-query capture in that file.
+
+Use this path when the goal is optimizer-loop work: find the fastest known plan/query for a theme/query, then compare new runs back to that history before touching production logic.
+
+## Fast regression test loop (persistent LMDB theme stores)
+
+Theme regression/snapshot tests in `core/sail/lmdb` now support reusing a prepared LMDB store across runs.
+
+- Enable persistent reuse:
+  - `-Drdf4j.lmdb.themeRegression.persistentStore.enabled=true`
+- Optional custom root directory:
+  - `-Drdf4j.lmdb.themeRegression.persistentStore.root=persistent-lmdb-theme-store`
+- Default root directory:
+  - `persistent-lmdb-theme-store`
+
+Behavior:
+
+- If the store has expected `triples/data.mdb` and `values/data.mdb` sizes (from `expected-db-file-sizes.properties`), tests reuse it and skip rebuild/ingest.
+- If sizes mismatch or the marker file is missing/invalid, tests rebuild the store, then refresh the expected-size file.
+
+Example focused run:
+
+- `mvn -o -Dmaven.repo.local=.m2_repo -pl core/sail/lmdb -Dtest=LmdbThemeQueryRegressionTest#socialMediaFiveCycleInterleavesValuesWithFollowsEdges -Drdf4j.lmdb.themeRegression.persistentStore.enabled=true test`
+
+## Snapshot diff workflow
+
+Use this when you need semantic plan diffs between two controlled captures of the same query.
+
 1. Capture baseline run (main/reference commit).
 2. Capture candidate run (changed commit) with same query selector + `--query-id`.
 3. Produce semantic diff (`--compare-existing`).
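The interpretation rules in the History triage section come down to a little arithmetic. The sketch below shows one way to express the default filter (latest more than 20% slower than the historical best) together with `--sort-regressions` and `--top N`. It is not the implementation of `analyze-theme-query-history.sh`: the function names, the percentage formula, and the sample scores are assumptions, and scores are treated as `avgt` times where lower is better.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (theme, query index)


def slowdown_pct(latest: float, best: float) -> float:
    """Percent by which latest is slower than best; negative means faster."""
    return (latest - best) / best * 100.0


def regressions(
    latest: Dict[Key, float],
    history_best: Dict[Key, float],
    threshold_pct: float = 20.0,
    top: Optional[int] = None,
) -> List[Tuple[Key, float]]:
    """List queries slower than the threshold, biggest slowdown first."""
    found = [
        (key, slowdown_pct(score, history_best[key]))
        for key, score in latest.items()
        if key in history_best
    ]
    found = [item for item in found if item[1] > threshold_pct]
    found.sort(key=lambda item: item[1], reverse=True)
    return found if top is None else found[:top]


# Made-up scores in ms/op, for illustration only.
latest_scores = {("PHARMA", 10): 30.0, ("SOCIAL_MEDIA", 8): 12.5, ("MEDICAL_RECORDS", 0): 9.0}
best_scores = {("PHARMA", 10): 20.0, ("SOCIAL_MEDIA", 8): 10.0, ("MEDICAL_RECORDS", 0): 10.0}

for key, pct in regressions(latest_scores, best_scores):
    print(key, f"{pct:+.1f}%")
```

With these sample values, `MEDICAL_RECORDS` is 10% faster than its previous best and is filtered out, which corresponds to the kind of row that only `--all` would report.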
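The persistent-store behaviour described in the skill (reuse when `triples/data.mdb` and `values/data.mdb` match the recorded sizes, otherwise rebuild) can be pictured as a size comparison. The real check lives in the Java test code and is not shown in this diff, so the function name `can_reuse_store`, the dictionary of expected sizes, and the temporary directory layout below are assumptions for illustration only.

```python
import tempfile
from pathlib import Path
from typing import Dict

DB_FILES = ("triples/data.mdb", "values/data.mdb")


def can_reuse_store(root: Path, expected_sizes: Dict[str, int]) -> bool:
    """Return True only if every database file exists with its recorded size."""
    for rel in DB_FILES:
        path = root / rel
        if rel not in expected_sizes or not path.is_file():
            return False  # missing record or missing file: rebuild
        if path.stat().st_size != expected_sizes[rel]:
            return False  # size mismatch: rebuild
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Create two tiny stand-in database files of 3 and 5 bytes.
    for rel, payload in zip(DB_FILES, (b"abc", b"defgh")):
        (root / rel).parent.mkdir(parents=True)
        (root / rel).write_bytes(payload)

    recorded = {"triples/data.mdb": 3, "values/data.mdb": 5}
    reuse_ok = can_reuse_store(root, recorded)
    stale = can_reuse_store(root, {**recorded, "values/data.mdb": 6})
    missing = can_reuse_store(root, {"triples/data.mdb": 3})
    print(reuse_ok, stale, missing)  # prints True False False
```

A size comparison is cheap but coarse: it cannot detect a store whose contents changed without changing file size, so it is best read as a fast-path heuristic for local test loops.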
5 changes: 4 additions & 1 deletion .gitignore
@@ -56,4 +56,7 @@ e2e/test-results
 .serena/
 .vscode
 /.codex/environments/environment.toml
-/.m2_repo_linux_j25
+improved-optimizers-query-rewrite-sketch-based-lmdb-page-walking/
+/.m2_repo_linux_j25/
+/core/sail/lmdb/persistent-lmdb-theme-store/
+core/sail/lmdb/persistent-lmdb-theme-store
9 changes: 4 additions & 5 deletions AGENTS.md
@@ -8,7 +8,7 @@ Before taking any action (either tool calls *or* responses to the user), you mus

 1.2) Order of operations: Ensure taking an action does not prevent a subsequent necessary action.

-1.2.1) The user may request actions in a random order, but you may need to reorder operations to maximize successful completion of the task.
+1.2.1) The user may request actions in a random order, but you may need to reorder operations to maximize successful completion of the task.

 1.3) Other prerequisites (information and/or actions needed).

@@ -48,9 +48,9 @@ Before taking any action (either tool calls *or* responses to the user), you mus

 7.2) Avoid premature conclusions: There may be multiple relevant options for a given situation.

-7.2.1) To check for whether an option is relevant, reason about all information sources from #5.
+7.2.1) To check for whether an option is relevant, reason about all information sources from #5.

-7.2.2) You may need to consult the user to even know whether something is applicable. Do not assume it is not applicable without checking.
+7.2.2) You may need to consult the user to even know whether something is applicable. Do not assume it is not applicable without checking.

 7.3) Review applicable sources of information from #5 to confirm which are relevant to the current state.

@@ -288,7 +288,7 @@ Plan
 1. **Compile deps fast (skip tests):**
 `mvn -o -Dmaven.repo.local=.m2_repo -pl <module> -am -Pquick clean install`
 2. **Run tests:**
-`mvn -o -Dmaven.repo.local=.m2_repo -pl <module> verify | tail -500`
+`python3 .codex/skills/mvnf/scripts/mvnf.py <module> --retain-logs --stream` or `mvn -o -Dmaven.repo.local=.m2_repo -pl <module> verify | tail -500`

 It is illegal to `-am` when running tests!
 It is illegal to `-q` when running tests!
@@ -677,7 +677,6 @@ Immediately after creating any new Java source file, add the signature comment (
 * Slow tests (by module):
 `mvn -o -Dmaven.repo.local=.m2_repo -pl <module> verify -PslowTestsOnly,-skipSlowTests | tail -500`
 * Slow tests (specific test):
-
 * `mvn -o -Dmaven.repo.local=.m2_repo -pl core/sail/shacl -PslowTestsOnly,-skipSlowTests -Dtest=ClassName#method verify | tail -500`
 * Integration tests (entire repo):
 `mvn -o -Dmaven.repo.local=.m2_repo verify -PskipUnitTests | tail -500`
@@ -8,6 +8,7 @@
  *
  * SPDX-License-Identifier: BSD-3-Clause
  ******************************************************************************/
+// Some portions generated by Codex

 package org.eclipse.rdf4j.model.impl;
