feat: schema generation #1178
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## develop #1178 +/- ##
===========================================
- Coverage 82.12% 82.03% -0.09%
===========================================
Files 33 32 -1
Lines 3149 3090 -59
Branches 734 714 -20
===========================================
- Hits 2586 2535 -51
+ Misses 387 385 -2
+ Partials 176 170 -6
An automated preview of the documentation is available at https://1178.mrdocs.prtest2.cppalliance.org/index.html
If more commits are pushed to the pull request, the docs will rebuild at the same URL.
2026-05-06 10:24:06 UTC
cdda298 to b9fac7c
Path coverage is OK, isn't it? I added the PR description.
alandefreitas left a comment
What about documentation? How do I use the feature? What about the schema in the documentation? Do we have no schema files in the repository? How do we check if the files are up to date in CI? Some of the changes in the golden files are also kind of weird (ids changing and things like that), and some groups are just empty, which could probably be removed in the schema, though I'm not sure.
The PR is great though. The only reason I left many comments is that the PR is huge. 😅
 */
inline
std::string
toJson(dom::Value const& v, int indent = 0)
Don't we already have this functionality defined somewhere else in the Dom library?
Yup, I missed it! Thanks for pointing that out.
 */
inline
std::string
toJson(dom::Value const& v, int indent = 0)
Why is this file in Schemas if it's a general JSON emitter (not JSON schema emitter)?
Right, it landed in Schemas/ only because the schema writer was its first user.
 */
inline
std::string
toJson(dom::Value const& v, int indent = 0)
Why would the JSON schema go through dom::Value anyway?
Hmm, right. That's not necessary.
@@ -0,0 +1,761 @@
//
I don't think we need a public mrdocs/Schemas at all. Each generator (or category of generator) has an associated schema (or not). If even the generators aren't public, I see no reason for the schema logic to be public.
            return entry.text;
        }
    }
    return {};
How do we ensure we never get here?
/** Build the complete DOM JSON Schema document.
 */
inline dom::Value
buildDomSchema()
DOM Schema means JSON schema?
Yes. "DOM Schema" here is the JSON Schema describing the DOM that Handlebars templates see. I'll rename buildDomSchema to buildDomJsonSchema.
// -------------------------------------------------------
// Header
// -------------------------------------------------------
line("#");
This main function is kind of hard for a human to read. Is this one of these cases where instead of using reflection to improve the schema, we make the function very complex to match the bad pattern we used to have before?
Indeed. The function is long because the schema is currently describing what XMLWriter happens to emit. That's bad.
    output, suitable for writing directly to `mrdocs.rnc`.
 */
inline std::string
buildRncSchema()
Should we be using RNC for XML at all? If we have reflection, why not already output the final schema?
Great idea! :-)
properties.set("class", withDesc(std::move(classSchema), "class"));

dom::Object boolSchema;
boolSchema.set("type", "boolean");
Unfortunately, no. Those aren't C++ members of Symbol. They are synthesized at serialization time by tag_invoke:
io.map("class", std::string("symbol"));
io.map("isRegular", I.Extraction == ExtractionMode::Regular);
...
} // (anon)

TEST_SUITE(
No tests for the other ones?
Oops, I'll add them.
a6d5fab to 4340b1b
Add a --schemas[=<dir>] option that writes a JSON Schema file (mrdocs-dom-schema.json) describing every object and field available to Handlebars templates. The schema is derived from the same compile-time reflection metadata used by MapReflectedType.hpp, so it stays in sync with the code automatically. The option requires no config file or source files — it writes the schema and exits immediately.
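As a rough illustration of why a reflection-derived schema cannot drift from the code, here is a minimal sketch. `MemberInfo`, `TypeInfo`, and `emitTypeSchema` are hypothetical stand-ins, not the MrDocs reflection API, and the example type is invented; the point is only that the schema is computed from the same member list the serializer would walk.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for reflection metadata: one entry per described member.
struct MemberInfo
{
    std::string_view name;     // member name as templates would see it
    std::string_view jsonType; // JSON Schema "type" value, e.g. "string"
};

// Hypothetical described type: a name plus its described members.
struct TypeInfo
{
    std::string_view name;
    std::vector<MemberInfo> members;
};

// Emit a compact JSON Schema object definition for one type.
// Names are assumed to need no JSON escaping in this sketch.
std::string
emitTypeSchema(TypeInfo const& t)
{
    assert(!t.name.empty());
    std::string out = "{\"type\":\"object\",\"properties\":{";
    bool first = true;
    for (MemberInfo const& m : t.members)
    {
        if (!first)
            out += ',';
        first = false;
        out += '"';
        out += m.name;
        out += "\":{\"type\":\"";
        out += m.jsonType;
        out += "\"}";
    }
    out += "}}";
    return out;
}
```

Adding a member to the hypothetical `TypeInfo` changes the emitted schema with no separate hand-written file to update, which is the property the commit message relies on.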
OperatorKind was the only enum serialized as a raw integer. All other enums serialize as human-readable strings. Change tag_invoke to use getOperatorName, consistent with the rest.
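A small sketch of the difference, using a toy enum rather than the real `OperatorKind`. `ToyOperatorKind`, `toyOperatorName`, and the strings it returns are assumptions made for illustration; the actual names produced by `getOperatorName` are not shown in this PR text.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Toy enum standing in for OperatorKind (enumerators are illustrative only).
enum class ToyOperatorKind { None, Plus, Minus, Star };

// Hypothetical analogue of a name lookup such as getOperatorName.
constexpr std::string_view
toyOperatorName(ToyOperatorKind k)
{
    switch (k)
    {
    case ToyOperatorKind::None:  return "none";
    case ToyOperatorKind::Plus:  return "plus";
    case ToyOperatorKind::Minus: return "minus";
    case ToyOperatorKind::Star:  return "star";
    }
    return "unknown";
}

// Old style: the raw integer value leaks into the output.
std::string
serializeRaw(ToyOperatorKind k)
{
    return std::to_string(static_cast<int>(k));
}

// New style: a human-readable string, consistent with other enums.
std::string
serializeNamed(ToyOperatorKind k)
{
    return std::string(toyOperatorName(k));
}
```

With the raw form, a template author has to know the numeric value of each enumerator; with the named form, the output is self-describing and a schema can list the allowed strings.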
--schemas now writes both mrdocs-dom-schema.json (Handlebars DOM) and mrdocs.rnc (XML output). The XML schema mirrors XMLWriter.cpp's serialization.
…ClassKind
Replace manual toString and tag_invoke overloads with
MRDOCS_DESCRIBE_ENUM for four enums whose kebab-case names match the
existing string representations.
The XML writer now emits these fields (e.g. <access>public</access>,
<constexpr>constexpr</constexpr>) where they were previously silently
skipped. None/none sentinel values are suppressed via a generic
has_none_enumerator check.
TypeKind stays manual because toKebabCase("LValueReference") produces
"l-value-reference", not the established "lvalue-reference".
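To make the TypeKind exception concrete, here is a minimal sketch of a kebab-case converter that inserts a hyphen before every uppercase letter except the first. `toKebabCaseSketch` is a hypothetical reimplementation; the real `toKebabCase` may differ in detail, but the commit message confirms the result for this input.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Naive kebab-case conversion: every uppercase letter starts a new word.
std::string
toKebabCaseSketch(std::string_view s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        unsigned char const c = static_cast<unsigned char>(s[i]);
        if (std::isupper(c))
        {
            // No leading hyphen for the first character.
            if (i != 0)
                out += '-';
            out += static_cast<char>(std::tolower(c));
        }
        else
        {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Under this rule, the leading `L` and the following `V` are treated as separate words, so a name like `LValueReference` cannot map to the established `lvalue-reference` without a special case.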
… --schemas
This guarantees the RELAX NG schema stays in sync with the C++ type definitions. Every CI run now validates all golden test XML files against a schema derived from the same reflection metadata that produces the XML.
This adds a small lookup table keyed by (typeName, memberName) carrying hand-written descriptions for the DOM seen by Handlebars templates. These descriptions become "description" fields in the JSON schema.
The newly-added JsonEmitter.hpp duplicated functionality already provided by `dom::JSON::stringify`. This drops the new header and uses the existing `stringify`.
The new function name is more descriptive. Contextually, this expands the doc-comment to spell out what the function returns.
The schema headers are only used by ToolMain and the unit tests. They don't need to live under include/mrdocs/.
4340b1b to 73561af
…output
XMLWriter silently dropped three kinds of data:
- SpecializationName::TemplateArgs. Polymorphic<Name> fell into writePolymorphic's generic branch with T deduced to the base Name (Polymorphic::operator* returns a base reference), so only Name's own described members were emitted. Template specializations rendered as <name>SmallVector</name> rather than <specialization-name>SmallVector<...></specialization-name>.
- NoexceptInfo. No MRDOCS_DESCRIBE_STRUCT: it serializes to a string via tag_invoke, which writeElement had no path for. Functions with noexcept-specifications produced no <noexcept> element.
- ExplicitInfo. Same pattern. explicit constructors and conversion operators produced no <explicit> element.
This adds a NameKind branch in writePolymorphic (using a "-name" suffix on the kind tag to disambiguate from the Name::Identifier field), and adds a NoexceptInfo/ExplicitInfo branch in writeElement that emits the toString() value as text, skipping the empty case.
Also update RncSchemaWriter to match: Polymorphic<Name> -> AnyName, drop NoexceptInfo/ExplicitInfo from the omit list.
Most XML golden fixtures regenerate to include the now-emitted elements.
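The first failure mode can be reproduced in miniature. The types and functions below are simplified, hypothetical stand-ins (not the MrDocs `Name`, `Polymorphic`, or `writePolymorphic`), and the element layout, including the `<template-args>` child, is invented for illustration. The sketch only shows how writing through a base reference loses derived members, and how a branch on the kind recovers them.

```cpp
#include <cassert>
#include <string>

enum class NameKind { Identifier, Specialization };

// Simplified base type: only these members are visible through a base reference.
struct Name
{
    NameKind kind = NameKind::Identifier;
    std::string identifier;
    virtual ~Name() = default;
};

// Simplified derived type carrying extra data.
struct SpecializationName : Name
{
    std::string templateArgs; // real code would hold structured arguments
    SpecializationName() { kind = NameKind::Specialization; }
};

// Generic path: sees only the base members, so templateArgs is dropped.
std::string
writeAsBase(Name const& n)
{
    return "<name>" + n.identifier + "</name>";
}

// Kind-aware path: dispatch on the kind and downcast before writing.
// The toy keeps <name> for plain identifiers, which may not match the real writer.
std::string
writeByKind(Name const& n)
{
    if (n.kind == NameKind::Specialization)
    {
        auto const& s = static_cast<SpecializationName const&>(n);
        return "<specialization-name>" + s.identifier +
               "<template-args>" + s.templateArgs + "</template-args>" +
               "</specialization-name>";
    }
    return writeAsBase(n);
}
```

Passing a `SpecializationName` through `writeAsBase` compiles and runs without complaint, which is why this kind of data loss is silent until the output is checked against a schema or golden file.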
The XMLTags helper in src/lib/Gen/Xml/ contained two pieces of machinery that aren't specific to XML doc generation: an XML escaper (xmlEscape) and a tag/indent stream emitter. A RELAX NG schema writer, which will be introduced with the next commit, also needs them. So, we factored them out.
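For reference, an XML escaper of the kind described is small and has no dependency on the documentation generator, which is what makes it easy to share. The function below is a generic sketch named `xmlEscapeSketch`; it is not the MrDocs `xmlEscape`, whose exact behavior is not shown here.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Replace the five characters that XML reserves with their predefined entities.
std::string
xmlEscapeSketch(std::string_view s)
{
    std::string out;
    out.reserve(s.size());
    for (char c : s)
    {
        switch (c)
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}
```

Both an XML document writer and a RELAX NG writer emit XML text, so both need exactly this step for attribute values and character data.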
The --schemas option now writes a RELAX NG XML document directly. This gets rid of the trang RNC->RNG conversion step, which in turn means we no longer need Java. The bootstrap script dependency on Java will be removed with the next commit.
The bootstrap script checked for Java because the build needed it to run trang.jar, which converted the RNC schema to RNG. But trang.jar is no longer used (the --schemas option directly emits a .rng), so Java is no longer needed.
The schema writer emits a `description` field for every type and every described member it touches, looking it up from DomDescriptions.hpp. When the lookup failed, it silently returned an empty string, so forgetting a description caused an undocumented `$defs` entry to be silently emitted. This adds an assert that fires when the lookup of a type or a member finds no entry. This allowed finding many missing entries, which have been added. Any future described type added to the schema trips the assert at build time until descriptions for it and its members are provided.
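The fail-fast lookup might look roughly like the sketch below. The table contents, `DescriptionEntry`, `findDescription`, and `requireDescription` are all hypothetical; the real table lives in DomDescriptions.hpp and is not reproduced here. An empty member name is assumed to denote the description of the type itself.

```cpp
#include <cassert>
#include <string_view>

// One hand-written description, keyed by (typeName, memberName).
struct DescriptionEntry
{
    std::string_view typeName;
    std::string_view memberName; // empty: describes the type itself (assumption)
    std::string_view text;
};

// Toy table; the entries are invented for illustration.
constexpr DescriptionEntry kDescriptions[] = {
    {"Symbol", "", "A documented C++ entity."},
    {"Symbol", "name", "The unqualified name of the symbol."},
};

// Plain lookup: returns an empty view when no entry exists.
constexpr std::string_view
findDescription(std::string_view typeName, std::string_view memberName)
{
    for (DescriptionEntry const& e : kDescriptions)
    {
        if (e.typeName == typeName && e.memberName == memberName)
            return e.text;
    }
    return {};
}

// Fail-fast lookup: a missing entry trips the assert instead of
// letting an undocumented field slip into the schema.
std::string_view
requireDescription(std::string_view typeName, std::string_view memberName)
{
    std::string_view const text = findDescription(typeName, memberName);
    assert(!text.empty() && "missing description for described type or member");
    return text;
}

// Because the lookup is constexpr, individual entries can also be checked
// by the compiler.
static_assert(!findDescription("Symbol", "name").empty());
```

Keeping the silent `findDescription` separate from the asserting wrapper lets tests probe for missing entries without aborting.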
…ation targets
Both schema files are now committed under docs/, parallel to the existing docs/mrdocs.schema.json (the YAML config schema) and are exposed to the Antora docs site as downloadable attachments.
Two new CTest targets, `rng-schema-check` and `dom-schema-check`, run `cmake -E compare_files` between the freshly-generated schemas in the build tree and the checked-in copies; drift fails the test. The schemas custom_command is lifted out of the LibXml2 conditional so the freshness checks run independently of whether libxml2 is available.
.gitattributes pins the two schema files to LF line endings, because --schemas emits LF line endings and we do a byte-for-byte comparison.
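The line-ending pin matters because a byte-for-byte comparison treats `\n` and `\r\n` as different content. The following sketch shows such a comparison in C++; `sameBytes` is a hypothetical helper written for illustration and is not how `cmake -E compare_files` is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Compare two streams byte for byte; any difference, including length, fails.
bool
sameBytes(std::istream& a, std::istream& b)
{
    std::istreambuf_iterator<char> ia(a), ib(b), end;
    return std::equal(ia, end, ib, end);
}

// Convenience overload for in-memory content.
bool
sameBytes(std::string const& a, std::string const& b)
{
    std::istringstream sa(a), sb(b);
    return sameBytes(sa, sb);
}
```

A checked-in file that a Git client rewrote with CRLF endings would fail this check against freshly generated LF output even though the two are textually identical, hence the .gitattributes entry.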
This replaces the hand-written DOM reference with one generated from mrdocs-dom-schema.json. A new Antora extension walks the file's `$defs` in source order and emits one section per type.
73561af to f9d5f63


This PR adds a --schemas[=<dir>] option that generates mrdocs-dom-schema.json (Handlebars DOM) and mrdocs.rnc (XML output). Both schemas are generated from the reflection metadata. The old, hand-written mrdocs.rnc is replaced entirely in CI.