Commit 4e47eba

jgm and claude committed
scanners: avoid transient write when sentinel already NUL.
_scan_at patched the byte at ptr[c->len] with '\0' around every scanner invocation so that the re2c scanners (built with yyfill:enable = 0) could use it as a stopping sentinel. In the common case the chunk is backed by a cmark_strbuf whose data[size] is already '\0', so the write was an idempotent mutation of shared backing storage.

Skip it when the byte is already '\0' to preserve const-correctness and reentrancy on that path; retain the patch fallback for chunks that are not already NUL-terminated.

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 8c9dcdd commit 4e47eba
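
The behaviour described in the commit message can be sketched in isolation. The following is a minimal illustration, not the cmark source: `chunk`, `scan_digits`, `scan_at`, and `patch_count` are hypothetical stand-ins for `cmark_chunk`, a re2c-generated scanner, `_scan_at`, and a counter added purely to make the skipped write observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for cmark_chunk. */
typedef struct {
  unsigned char *data;
  int len;
} chunk;

/* Illustration only: counts how often the fallback patched the buffer. */
static int patch_count = 0;

/* Hypothetical scanner with no bounds check: counts leading ASCII digits
   and stops at the first non-digit byte, which includes a NUL sentinel. */
static int scan_digits(const unsigned char *p) {
  int n = 0;
  while (*p >= '0' && *p <= '9') {
    p++;
    n++;
  }
  return n;
}

/* Sketch of the guarded wrapper: write the sentinel only when missing. */
static int scan_at(int (*scanner)(const unsigned char *), chunk *c,
                   int offset) {
  assert(c != NULL);
  unsigned char *ptr = c->data;
  int res;

  if (ptr == NULL || offset > c->len) {
    return 0;
  }

  unsigned char lim = ptr[c->len];
  if (lim == '\0') {
    /* Sentinel already present: the buffer is only read. */
    res = scanner(ptr + offset);
  } else {
    /* Sentinel missing: patch, scan, then restore the original byte. */
    ptr[c->len] = '\0';
    patch_count++;
    res = scanner(ptr + offset);
    ptr[c->len] = lim;
  }
  return res;
}
```

With a buffer holding "123456789", a chunk covering all nine bytes is scanned without any write, while a chunk covering only the first three bytes takes the fallback path, since otherwise the scanner would run on into the digits that follow the chunk.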

2 files changed

Lines changed: 28 additions & 6 deletions

src/scanners.c

Lines changed: 14 additions & 3 deletions
@@ -13,9 +13,20 @@ bufsize_t _scan_at(bufsize_t (*scanner)(const unsigned char *), cmark_chunk *c,
   } else {
     unsigned char lim = ptr[c->len];

-    ptr[c->len] = '\0';
-    res = scanner(ptr + offset);
-    ptr[c->len] = lim;
+    // The re2c scanners are built with yyfill:enable = 0, so they
+    // require a NUL sentinel at ptr[c->len]. In the common case the
+    // chunk is backed by a cmark_strbuf which is already NUL-terminated
+    // at that position, so we avoid the transient write entirely (it
+    // would otherwise mutate shared backing storage and break both
+    // const-correctness and reentrancy). Only when the sentinel is
+    // missing do we fall back to patching the byte around the call.
+    if (lim == '\0') {
+      res = scanner(ptr + offset);
+    } else {
+      ptr[c->len] = '\0';
+      res = scanner(ptr + offset);
+      ptr[c->len] = lim;
+    }
   }

   return res;

src/scanners.re

Lines changed: 14 additions & 3 deletions
@@ -12,9 +12,20 @@ bufsize_t _scan_at(bufsize_t (*scanner)(const unsigned char *), cmark_chunk *c,
   } else {
     unsigned char lim = ptr[c->len];

-    ptr[c->len] = '\0';
-    res = scanner(ptr + offset);
-    ptr[c->len] = lim;
+    // The re2c scanners are built with yyfill:enable = 0, so they
+    // require a NUL sentinel at ptr[c->len]. In the common case the
+    // chunk is backed by a cmark_strbuf which is already NUL-terminated
+    // at that position, so we avoid the transient write entirely (it
+    // would otherwise mutate shared backing storage and break both
+    // const-correctness and reentrancy). Only when the sentinel is
+    // missing do we fall back to patching the byte around the call.
+    if (lim == '\0') {
+      res = scanner(ptr + offset);
+    } else {
+      ptr[c->len] = '\0';
+      res = scanner(ptr + offset);
+      ptr[c->len] = lim;
+    }
   }

   return res;
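
To make the "mutate shared backing storage" concern in the comment concrete, here is a small sketch in which the scanner itself plays the role of a second reader of the same buffer. All names (`view`, `shared`, `seen_len`, `observer`, `scan_unconditional`, `scan_guarded`) are hypothetical and exist only for this illustration; the two wrappers are assumed to mirror the old and new bodies of `_scan_at` in simplified form.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for cmark_chunk. */
typedef struct {
  unsigned char *data;
  int len;
} view;

static const unsigned char *shared; /* the full shared buffer */
static size_t seen_len;             /* length another reader sees mid-scan */

/* Length up to the first NUL byte. */
static size_t nul_len(const unsigned char *p) {
  size_t n = 0;
  while (p[n] != '\0') {
    n++;
  }
  return n;
}

/* Hypothetical scanner that records what a concurrent reader of the
   shared buffer would observe while the scan is in progress. */
static int observer(const unsigned char *p) {
  (void)p;
  seen_len = nul_len(shared);
  return 0;
}

/* Old behaviour (simplified): always patch the sentinel byte. */
static int scan_unconditional(int (*scanner)(const unsigned char *), view *v,
                              int offset) {
  unsigned char lim = v->data[v->len];
  v->data[v->len] = '\0';
  int res = scanner(v->data + offset);
  v->data[v->len] = lim;
  return res;
}

/* New behaviour (simplified): patch only when the sentinel is missing. */
static int scan_guarded(int (*scanner)(const unsigned char *), view *v,
                        int offset) {
  unsigned char lim = v->data[v->len];
  if (lim == '\0') {
    return scanner(v->data + offset);
  }
  v->data[v->len] = '\0';
  int res = scanner(v->data + offset);
  v->data[v->len] = lim;
  return res;
}
```

For a buffer holding "hello world" and a view over its first five bytes, the patched path makes the shared buffer look five bytes long while the scanner runs, and the original byte is restored afterwards. For a view over the whole buffer, the guarded wrapper leaves the storage untouched throughout, which is the path the commit keeps read-only.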
