feat: schema generation by gennaroprota · Pull Request #1178 · cppalliance/mrdocs

gennaroprota · 2026-04-16T15:07:10Z

This adds a --schemas[=<dir>] option that emits two schemas derived from the reflection metadata in the source tree: mrdocs-dom-schema.json, which describes the Handlebars DOM produced by the generator, and mrdocs.rng (RelaxNG), which describes the XML output. The hand-written mrdocs.rnc is removed and replaced by the generated .rng everywhere CI used it, eliminating drift between the schema and the actual output.

The schemas are produced by walking the reflection types and member descriptions in the source tree, so they stay in sync with the code automatically as new metadata types are added. The DOM reference documentation is switched from a hand-maintained page to one generated by an Antora extension that reads the JSON schema.

A generic XML emitter is factored out of the existing XMLTags / XMLWriter code into src/lib/Support/Xml.{cpp,hpp}, and several reflection types (AccessKind, ConstexprKind, OperatorKind, StorageClassKind, ParamDirection, TypeKind) are made describable through new MapReflectedType / MergeReflectedType / TypeTraits headers in include/mrdocs/Support/. Three small fixes ride along: OperatorKind is serialized as a string in the DOM (not the underlying enum), the XML emitter no longer drops template args / noexcept / explicit specifiers, and asserts fire when a reflection type or member lacks a description.

Changes

Source: New src/lib/Schemas/ with DomDescriptions.hpp, DomSchemaWriter.hpp, RngSchemaWriter.{cpp,hpp}. Generic XML emitter extracted into src/lib/Support/Xml.{cpp,hpp} and consumed by Gen/xml/XMLTags and XMLWriter (which shrink correspondingly). --schemas wired through tool/ToolArgs.{cpp,hpp} and tool/ToolMain.cpp. Several specifier headers in include/mrdocs/Metadata/ refactored alongside new MapReflectedType, MergeReflectedType, and TypeTraits headers in include/mrdocs/Support/. Schema-writer headers live in src/lib/Schemas/ (deliberately kept out of the public API).
Tests: New src/test/Schemas/DomSchemaWriter.cpp covering the JSON schema writer end-to-end.
Golden tests: ~286 XML fixture diffs reflecting the newly emitted attributes (template args, noexcept, explicit) that the old XML writer was silently dropping. These are intentional output corrections.
Docs: New docs/modules/ROOT/pages/schemas.adoc documents --schemas. docs/extensions/dom-reference.js renders the DOM reference page from the JSON schema; the corresponding hand-written section was removed from generators.adoc. mrdocs-dom-schema.json and mrdocs.rng are also linked as downloadable attachments.
Build: mrdocs.rnc removed; CMakeLists.txt updated so mrdocs.rng and mrdocs-dom-schema.json are emitted and verified during the build. The Java prerequisite (previously needed to process mrdocs.rnc) is removed.
Breaking changes: The XML schema artifact distributed with mrdocs is now a .rng instead of .rnc: downstream tooling that consumes the schema file by name will need to update. The generated XML output now includes template args, noexcept, and explicit attributes that were silently omitted before; consumers parsing the XML and not expecting those attributes may need to adjust.

Testing

src/test/Schemas/DomSchemaWriter.cpp exercises the JSON schema writer against a fixed set of reflection types.
The committed mrdocs.rng and mrdocs-dom-schema.json files are paired with two CI verification steps that regenerate them from the current source and fail if the on-disk version drifts from the regenerated one. This keeps the schemas in sync with the code on every build rather than relying on a contributor remembering to regenerate.
The ~286 golden-test XML diffs serve as regression coverage for the newly emitted attributes: any future omission of template args / noexcept / explicit will cause the golden tests to fail.
An assert fires when a reflection type or member lacks a description, so any future addition to the reflection metadata that ships without schema metadata trips immediately in CI rather than silently producing an under-specified schema.

Documentation

Two pieces:

docs/modules/ROOT/pages/schemas.adoc documents the new --schemas option.
docs/extensions/dom-reference.js is an Antora extension that renders the DOM reference from mrdocs-dom-schema.json; the hand-written DOM section in generators.adoc was removed in favor of this generated page, so DOM-reference docs now stay in sync with the code automatically.

github-actions · 2026-04-16T15:07:34Z

✨ Highlights

🧪 Existing golden tests changed (behavior likely shifted)

🧾 Changes by Scope

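| Scope | Lines Δ% | Lines Δ | Lines + | Lines - | Files Δ | Files + | Files ~ | Files ↔ | Files - |
|---|---|---|---|---|---|---|---|---|---|
| 🥇 Golden Tests | 52% | 16008 | 9896 | 6112 | 254 | - | 254 | - | - |
| 📄 Docs | 31% | 9460 | 8577 | 883 | 12 | 7 | 5 | - | - |
| 🛠️ Source | 11% | 3362 | 3016 | 346 | 30 | 6 | 24 | - | - |
| 🏗️ Build | 3% | 1034 | 26 | 1008 | 2 | - | 1 | - | 1 |
| 🧪 Unit Tests | 1% | 320 | 320 | - | 1 | 1 | - | - | - |
| 🔧 Toolchain | 1% | 250 | 60 | 190 | 9 | - | 8 | - | 1 |
| 🔧 Toolchain Tests | 1% | 222 | 20 | 202 | 7 | - | 7 | - | - |
| 🧰 Tooling | <1% | 36 | 2 | 34 | 3 | - | 1 | - | 2 |
| ⚙️ CI | <1% | 2 | 2 | - | 1 | - | 1 | - | - |
| Total | 100% | 30694 | 21919 | 8775 | 319 | 14 | 301 | - | 4 |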

Legend: Files + (added), Files ~ (modified), Files ↔ (renamed), Files - (removed)

🔝 Top Files

docs/mrdocs-dom-schema.json (Docs): 5591 lines Δ (+5591 / -0)
test-files/golden-tests/symbols/record/class-template-specializations-1.xml (Golden Tests): 3903 lines Δ (+2718 / -1185)
docs/mrdocs.rng (Docs): 2757 lines Δ (+2757 / -0)

Generated by 🚫 dangerJS against 21452d4

codecov · 2026-04-16T15:08:19Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.10%. Comparing base (8fe79f8) to head (21452d4).

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1178      +/-   ##
===========================================
- Coverage    83.16%   83.10%   -0.06%     
===========================================
  Files           35       34       -1     
  Lines         3658     3599      -59     
  Branches       843      823      -20     
===========================================
- Hits          3042     2991      -51     
+ Misses         409      407       -2     
+ Partials       207      201       -6

| Flag | Coverage Δ | |
|---|---|---|
| bootstrap | `83.10% <ø> (-0.06%)` | ⬇️ |

cppalliance-bot · 2026-04-16T15:14:09Z

An automated preview of the documentation is available at https://1178.mrdocs.prtest2.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-06-23 16:25:51 UTC

alandefreitas · 2026-04-21T05:49:58Z

:)

gennaroprota · 2026-04-21T08:38:08Z

Path coverage is OK, isn't it? I added the PR description.

alandefreitas

What about documentation? How do I use the feature? What about the schema in the documentation? Do we have no schema files in the repository? How do we check if the files are up to date in CI? Some of the changes in the golden files are also kind of weird (IDs changing, for example, and some groups are just empty, which could probably be removed in the schema; I'm not sure).

The PR is great though. The only reason I left many comments is because the PR is huge. 😅

alandefreitas · 2026-05-02T00:01:35Z

+        // -------------------------------------------------------
+        // Header
+        // -------------------------------------------------------
+        line("#");


This main function is kind of hard for a human to read. Is this one of these cases where instead of using reflection to improve the schema, we make the function very complex to match the bad pattern we used to have before?

gennaroprota · 2026-05-04

Indeed. The function is long because the schema is currently describing what XMLWriter happens to emit. That's bad.

alandefreitas · 2026-05-13

So has this one been resolved?

gennaroprota · 2026-05-13

RncSchemaWriter.hpp no longer exists, because we emit RNG directly now. RngSchemaWriter.cpp should be relatively easy to read.

`Symbol`'s `tag_invoke` overload added four convenience booleans -- `isRegular`, `isSeeBelow`, `isImplementationDefined`, `isDependency` -- outside the described struct. Because reflection couldn't see them, the JSON schema writer had to mirror the same hardcoded list.

This removes those booleans and lets templates compare the described enum directly; e.g.:

    {{#if isRegular}} -> {{#if (eq extraction "regular")}}

For the two `filter_by` / `any_of_by` sites that previously keyed on the booleans, the helper family gains variadic siblings `filter_by_eq` and `any_of_by_eq` -- signature `(container, key, value1, value2, ...)`.

This addresses review feedback on PR cppalliance#1178.
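To make the variadic signature concrete, here is a minimal C++ sketch of the assumed semantics of `filter_by_eq` and `any_of_by_eq`: an element matches when its `key` field equals any of the listed values. The real helpers are Handlebars helpers operating on DOM values; the `Item` alias and the names `matchesAny`, `filterByEq`, and `anyOfByEq` are hypothetical stand-ins, and the matching rule is an assumption inferred from the signature, not taken from the MrDocs source.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM object: a flat map of string fields.
using Item = std::map<std::string, std::string>;

// True if item[key] exists and equals at least one of the given values.
// With an empty value list the fold expression yields false.
template <class... Values>
bool matchesAny(Item const& item, std::string const& key, Values const&... values)
{
    assert(!key.empty() && "key must name a field");
    auto it = item.find(key);
    if (it == item.end())
        return false;
    return ((it->second == values) || ...);
}

// Assumed semantics of filter_by_eq: keep the elements that match.
template <class... Values>
std::vector<Item> filterByEq(std::vector<Item> const& items,
                             std::string const& key, Values const&... values)
{
    std::vector<Item> out;
    for (auto const& item : items)
        if (matchesAny(item, key, values...))
            out.push_back(item);
    return out;
}

// Assumed semantics of any_of_by_eq: true if at least one element matches.
template <class... Values>
bool anyOfByEq(std::vector<Item> const& items,
               std::string const& key, Values const&... values)
{
    for (auto const& item : items)
        if (matchesAny(item, key, values...))
            return true;
    return false;
}
```

Under these assumptions, a call such as `anyOfByEq(symbols, "extraction", "regular")` plays the role that keying on the removed `isRegular` boolean used to play.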

alandefreitas · 2026-06-04T16:16:14Z

Thanks, removing the isXxx booleans is a good change.

Is there a way to let reflection drive the missing parts that are manually written right now? If that's not possible right now, could we open an issue to track it? This is exactly what reflection was meant to solve.

Separately, as I mentioned in the previous comment, a few of the golden XML changes still looked odd to me.

gennaroprota · 2026-06-18T16:01:27Z

> Is there a way to let reflection drive the missing parts that are manually written right now? If that's not possible right now, could we open an issue to track it? This is exactly what reflection was meant to solve.

I opened issue #1215 for this.

> Separately, as I mentioned in the previous comment, a few of the golden XML changes still looked odd to me.

I think you are referring to the IDs of global overload sets? Those change because the hashed string changes: toString(access) now returns none, whereas it returned "" before.

alandefreitas · 2026-06-18T18:03:39Z

> > Is there a way to let reflection drive the missing parts that are manually written right now? If that's not possible right now, could we open an issue to track it? This is exactly what reflection was meant to solve.

> I opened issue #1215 for this.

So it is not possible now? Why?

> > Separately, as I mentioned in the previous comment, a few of the golden XML changes still looked odd to me.

> I think you are referring to the IDs of global overload sets? Those change because the hashed string changes: toString(access) now returns none, whereas it returned "" before.

Well... I exemplified it with too many ID changes that seemed arbitrary (you just explained it) and new empty groups, but I was referring to the sheer number of changes. In other words, the categories of changes (the examples I mentioned, but really any recurring changes) should be determined, understood, and evaluated individually.

> hashed string changes

What's in the hashed string again? I thought the access, let alone the serialization of toString(access), wouldn't affect the individual strings getting hashed.

gennaroprota · 2026-06-23T09:02:57Z

> > > Is there a way to let reflection drive the missing parts that are manually written right now? If that's not possible right now, could we open an issue to track it? This is exactly what reflection was meant to solve.

> > I opened issue #1215 for this.

> So it is not possible now? Why?

By "now" you mean "in this PR"? That's possible but requires implementing MRDOCS_DESCRIBE_COMPUTED_PROPERTIES, which seems out of scope for this PR. What do you think?

> > > Separately, as I mentioned in the previous comment, a few of the golden XML changes still looked odd to me.

> > I think you are referring to the IDs of global overload sets? Those change because the hashed string changes: toString(access) now returns none, whereas it returned "" before.

> Well... I exemplified it with too many ID changes that seemed arbitrary (you just explained it) and new empty groups, but I was referring to the sheer number of changes. In other words, the categories of changes (the examples I mentioned, but really any recurring changes) should be determined, understood, and evaluated individually.

| # | Category | Source commit |
|---|---|---|
| 1 | Schema-location header `mrdocs.rnc` -> `mrdocs.rng` | feat: have --schemas emit a .rng, not a .rnc schema file |
| 2 | Structured name rename `<name>` -> `<identifier-name>` / `<specialization-name>` | fix: emit template args, noexcept, and explicit specifiers in XML output |
| 3 | Previously-dropped data now emitted (template arguments, `<noexcept>`, `<explicit>`) | fix: emit template args, noexcept, and explicit specifiers in XML output |
| 4 | Overload-set `<id>` re-hash (and the `<functions>` / `<shadow-declarations>` references that point at them) | refactor: describe AccessKind, ConstexprKind, ParamDirection, StorageClassKind |
| 5 | Newly-described enums now serialized (`<access>`, `<constexpr>`, `<storage-class>`, `<direction>`) | refactor: describe AccessKind, ConstexprKind, ParamDirection, StorageClassKind |
> > hashed string changes

> What's in the hashed string again? I thought the access, let alone the serialization of toString(access), wouldn't affect the individual strings getting hashed.

There are two "kinds" of IDs: For "real" symbols (functions, records, etc.), the string that gets hashed is the Clang USR. The access isn't part of the USR, so those IDs are unaffected by this PR's changes. Overload sets, instead, have no USR, and are hashed from a composed key that does include the access. From src/lib/Metadata/Symbol/Overloads.cpp:

OverloadsSymbol::OverloadsSymbol(SymbolID const &Parent, std::string_view Name,
                             AccessKind access, bool isStatic) noexcept
    : SymbolCommonBase(SymbolID::createFromString(std::format(
          "{}-{}-{}-{}", toBase16(Parent), Name, toString(access), isStatic))) {
  this->Parent = Parent;
}

When AccessKind got MRDOCS_DESCRIBE_ENUM, toString(AccessKind::None) went from "" to "none"; namespace-scope overload sets have access == None, so, for them, the ID changes.
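The effect can be reproduced with a small sketch that composes the key in the same `"{}-{}-{}-{}"` shape as the constructor quoted above. `overloadKey` and `demoOverloadKeys` are hypothetical helpers written for this illustration, `parentHex` stands in for `toBase16(Parent)`, and the digest step is left out because only the input string matters here.

```cpp
#include <cassert>
#include <string>

// Compose the overload-set key: parent, name, access text, and the static
// flag, joined by '-'. A bool formatted with "{}" prints as true/false,
// which is mirrored here by hand.
std::string overloadKey(std::string const& parentHex, std::string const& name,
                        std::string const& accessText, bool isStatic)
{
    return parentHex + "-" + name + "-" + accessText + "-" +
           (isStatic ? "true" : "false");
}

// The same namespace-scope overload set before and after the change to
// toString(AccessKind::None).
void demoOverloadKeys()
{
    std::string before = overloadKey("ab12", "f", "", false);      // access printed as ""
    std::string after  = overloadKey("ab12", "f", "none", false);  // access printed as "none"
    assert(before == "ab12-f--false");
    assert(after == "ab12-f-none-false");
    assert(before != after);  // different input string, hence a different ID
}
```

Since the ID is a digest of this string, any change in how one component is rendered changes the ID even though the symbol itself is unchanged.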

P.S.: I think the doc-comment attached to SymbolID should be fixed:

/** A unique identifier for a symbol.

    This is calculated as the SHA1 digest of the
    USR. A USRs is a string that provides an
    unambiguous reference to a symbol.
*/

Add a --schemas[=<dir>] option that writes a JSON Schema file (mrdocs-dom-schema.json) describing every object and field available to Handlebars templates. The schema is derived from the same compile-time reflection metadata used by MapReflectedType.hpp, so it stays in sync with the code automatically. The option requires no config file or source files — it writes the schema and exits immediately.

…ClassKind

Replace manual toString and tag_invoke overloads with MRDOCS_DESCRIBE_ENUM for four enums whose kebab-case names match the existing string representations. The XML writer now emits these fields (e.g. <access>public</access>, <constexpr>constexpr</constexpr>) where they were previously silently skipped. None/none sentinel values are suppressed via a generic has_none_enumerator check.

TypeKind stays manual because toKebabCase("LValueReference") produces "l-value-reference", not the established "lvalue-reference".
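The TypeKind exception is easy to see with a naive PascalCase-to-kebab-case conversion. The function below is a hypothetical re-implementation written for this illustration, not the MrDocs `toKebabCase`; it assumes the rule is "insert a hyphen before every uppercase letter except the first, then lowercase", which reproduces the result quoted above.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical sketch: split on every uppercase letter.
std::string toKebabCaseSketch(std::string_view pascal)
{
    std::string out;
    for (std::size_t i = 0; i < pascal.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(pascal[i]);
        if (std::isupper(c)) {
            if (i != 0)
                out.push_back('-');  // word boundary before each capital
            out.push_back(static_cast<char>(std::tolower(c)));
        } else {
            out.push_back(static_cast<char>(c));
        }
    }
    return out;
}

void demoKebabCase()
{
    // Enumerator names that are single words map cleanly.
    assert(toKebabCaseSketch("Public") == "public");
    assert(toKebabCaseSketch("None") == "none");
    // "LValue" is treated as two words, so the established spelling is lost.
    assert(toKebabCaseSketch("LValueReference") == "l-value-reference");
}
```

A mechanical rule of this kind cannot know that "LValue" is meant to be one word, which is why an enum with such enumerators keeps its hand-written string mapping.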

… --schemas

This guarantees the RELAX NG schema stays in sync with the C++ type definitions. Every CI run now validates all golden test XML files against a schema derived from the same reflection metadata that produces the XML.

XMLWriter silently dropped three kinds of data:

- `SpecializationName::TemplateArgs`: `Polymorphic<Name>` fell into `writePolymorphic`'s generic branch with `T` deduced to the base Name (`Polymorphic::operator*` returns a base reference), so only `Name`'s own described members were emitted. Template specializations rendered as `<name>SmallVector</name>` rather than `<specialization-name>SmallVector<...></specialization-name>`.
- `NoexceptInfo`: no `MRDOCS_DESCRIBE_STRUCT` - it serializes to a string via `tag_invoke`, which `writeElement` had no path for. Functions with noexcept-specifications produced no `<noexcept>` element.
- `ExplicitInfo`: same pattern. explicit constructors and conversion operators produced no `<explicit>` element.

This adds a `NameKind` branch in `writePolymorphic` (using a "-name" suffix on the kind tag to disambiguate from the `Name::Identifier` field), and adds a `NoexceptInfo`/`ExplicitInfo` branch in `writeElement` that emits the `toString()` value as text, skipping the empty case.

Also update `RncSchemaWriter` to match: `Polymorphic<Name>` -> `AnyName`, drop `NoexceptInfo`/`ExplicitInfo` from the omit list.

Most XML golden fixtures regenerate to include the now-emitted elements.
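The first failure mode can be shown in isolation. The sketch below uses simplified, hypothetical `Name` / `SpecializationName` types and writer functions (not the MrDocs classes), and the `<arg>` element is invented for the illustration. It contrasts a writer that only sees the static base type with one that branches on the kind tag first.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class NameKind { Identifier, Specialization };

// Simplified stand-ins for the metadata types.
struct Name {
    NameKind kind = NameKind::Identifier;
    std::string identifier;
    virtual ~Name() = default;
};

struct SpecializationName : Name {
    std::vector<std::string> templateArgs;
    SpecializationName() { kind = NameKind::Specialization; }
};

// Generic path: the static type is Name, so the derived class's
// templateArgs member is invisible and silently omitted.
std::string writeGeneric(Name const& n)
{
    return "<name>" + n.identifier + "</name>";
}

// Kind-aware path: branch on the runtime kind, then downcast to reach
// the derived members. The "-name" suffix follows the commit message.
std::string writeByKind(Name const& n)
{
    if (n.kind == NameKind::Specialization) {
        auto const* s = dynamic_cast<SpecializationName const*>(&n);
        assert(s && "kind tag and dynamic type must agree");
        std::string out = "<specialization-name>" + s->identifier;
        for (auto const& arg : s->templateArgs)
            out += "<arg>" + arg + "</arg>";  // hypothetical element name
        return out + "</specialization-name>";
    }
    return "<identifier-name>" + n.identifier + "</identifier-name>";
}
```

Passing a `SpecializationName` through a `Name const&` to `writeGeneric` loses the arguments, while `writeByKind` emits them; that difference is the category of golden-file change described here.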

The XMLTags helper in src/lib/Gen/Xml/ contained two pieces of machinery that aren't specific to XML doc generation: an XML escaper (xmlEscape) and a tag/indent stream emitter. A RELAX NG schema writer, which will be introduced with the next commit, also needs them. So, we factored them out.

The --schemas option now writes a RELAX NG XML document directly. This gets rid of the trang RNC->RNG conversion step. Which, in turn, means we no longer need Java. The bootstrap script dependency on Java will be removed with the next commit.

The schema writer emits a `description` field for every type and every described member it touches, looking it up from DomDescriptions.hpp. When the lookup failed, it silently returned an empty string, so forgetting a description caused an undocumented `$defs` entry to be silently emitted. This adds an assert that fires when the lookup of a type or a member finds no entry. This allowed finding many missing entries, which have been added. Any future described type added to the schema trips the assert at build time until descriptions for it and its members are provided.
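A minimal sketch of the pattern, assuming a flat table keyed by (type, member): the table contents and the names `DescriptionEntry`, `findDescription`, and `requireDescription` are hypothetical and do not reproduce DomDescriptions.hpp.

```cpp
#include <cassert>
#include <string_view>

struct DescriptionEntry {
    std::string_view type;
    std::string_view member;  // empty when the entry describes the type itself
    std::string_view text;
};

// Hypothetical table contents, for illustration only.
inline constexpr DescriptionEntry kDescriptions[] = {
    {"Symbol", "", "A documented entity."},
    {"Symbol", "name", "The unqualified name of the symbol."},
};

// Old behavior: a failed lookup quietly yields an empty description.
std::string_view findDescription(std::string_view type, std::string_view member)
{
    for (auto const& e : kDescriptions)
        if (e.type == type && e.member == member)
            return e.text;
    return {};
}

// New behavior: a failed lookup is a programming error and trips an assert
// in builds where asserts are enabled.
std::string_view requireDescription(std::string_view type, std::string_view member)
{
    std::string_view text = findDescription(type, member);
    assert(!text.empty() && "missing description for a described type or member");
    return text;
}
```

The strict variant turns a forgotten table entry into an immediate failure when the schema is generated, instead of an undocumented entry in the output.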

…ation targets

Both schema files are now committed under docs/, parallel to the existing docs/mrdocs.schema.json (the YAML config schema) and are exposed to the Antora docs site as downloadable attachments.

Two new CTest targets, `rng-schema-check` and `dom-schema-check`, run `cmake -E compare_files` between the freshly-generated schemas in the build tree and the checked-in copies; drift fails the test. The schemas custom_command is lifted out of the LibXml2 conditional so the freshness checks run independently of whether libxml2 is available.

.gitattributes pins the two schema files to LF line endings, because --schemas emits LF line endings and we do a byte-for-byte comparison.
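The reason for pinning LF can be seen with a small sketch of a byte-for-byte comparison. `byteIdentical`, `withCrlf`, and `demoLineEndings` are hypothetical helpers written for this illustration; the actual check uses `cmake -E compare_files` as described above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Byte-for-byte equality: no normalization of any kind.
bool byteIdentical(std::string_view a, std::string_view b)
{
    return a == b;
}

// Simulate a checkout that converts LF line endings to CRLF.
std::string withCrlf(std::string_view lf)
{
    std::string out;
    for (char c : lf) {
        if (c == '\n')
            out += "\r\n";
        else
            out.push_back(c);
    }
    return out;
}

void demoLineEndings()
{
    std::string generated = "{\n}\n";                // what the tool emits (LF)
    std::string checkedOut = withCrlf(generated);    // same file after CRLF conversion
    assert(byteIdentical(generated, generated));
    assert(!byteIdentical(generated, checkedOut));   // would fail the freshness check
}
```

Without the line-ending pin, a checkout that rewrites line endings would make an unchanged schema look stale.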

The page's Schema section still pointed at the old hand-written mrdocs.rnc. The --schemas option now emits mrdocs.rng directly, so update the root-element example and the canonical-schema reference to mrdocs.rng and point readers at the new Output Schemas page.

Rebasing onto develop brought two model changes the schema generators did not yet account for. Attributes became a polymorphic family (a base attribute plus one type per kind, some carrying an argument) instead of a flat string list, and records and functions gained the specialization-placement fields: whether a specialization is listed on its primary, the list of specializations, and a record's deduction guides. This teaches both schema writers about them.

The committed mrdocs.rng and mrdocs-dom-schema.json now cover the attribute family. The golden XML files that could not be merged during the rebase are regenerated to match this branch's output: structured names, the `noexcept` and `explicit` specifiers, the described-enum elements, and the RELAX NG schema location.

alandefreitas · 2026-06-23T20:09:56Z

> namespace-scope overload sets have access == None, so, for them, the ID changes

Interesting. Thanks

> I think the doc-comment attached to SymbolID should be fixed

Agree. For overloads (or any entity clang doesn't have a USR for), we use a variant that's just a hash of something else.

Something like "This is calculated as the SHA1 digest of the USR or other SymbolIDs" or something like that.

> which seems out of scope for this PR. What do you think?

Mmmm... This is a hard question. Because if the scope as defined by the PR is "feat: schema generation", it's maybe out of scope because MRDOCS_DESCRIBE_COMPUTED_PROPERTIES is about "improving schema generation" rather than "schema generation". An improvement is about how well you do the thing, but it's definitely still within the scope of the thing. But it can also be left out because things don't have to be perfect.

Or maybe what you really mean is "isn't there already too much code in here" rather than "is the scope 'improving schema generation' inside the scope 'schema generation'", to which the answer would be yes. But from that point of view, this PR became a proof of concept for something else that is the much bigger problem: if we're replacing a hand-curated adoc version of the reference with a ~30k LOC diff of "Hand-curated descriptions for the JSON Schema produced by @ref DomSchemaWriter" that comes with an application that converts it back to adoc, then I'm not sure we have good motivation for the work here. If we had to literally list every member in the cpp file and it was still hand-curated, I'm not sure we'd have achieved anything, since we'd have the same result but much more code to maintain. Maybe the difference is that now we get compiler errors? Maybe I thought we would at least get computed properties, which diverge a little more, but these would also be a lot of work and we got into the question about scope.

So... while the scope question is valid, I don't think this is a scope issue at all. I see it as an issue of basic usefulness / making sense / motivation. Because I don't see this PR as "feat: schema generation", as the title describes. Because "feat: schema generation" is a feature that already exists. At best, it's a refactor from one manual strategy to another. The schema feature already exists and is, unfortunately, done manually. The one done here is also manual. So I originally saw this PR as a refactor to improve the schema generation by using reflection to automate it and keeping it up to date in the future (although I wasn't sure what solution we could have for the brief and descriptions).

In other words, if the PR is about making it automatic in a smarter way rather than manual, and it still needs to hard-code not only the exceptions but all fields of each type, then at best, this is a proof of concept that we don't have the technology to make it automatic yet. Because we can keep things hard-coded by not doing anything. It means reflection doesn't help us with that.

To be honest, I don't have a good solution for any of that. If anything, it seems like MRDOCS_DESCRIBE_COMPUTED_PROPERTIES is what should ship first (because it's what's actually useful for reflection, so the objects finally have everything without relying on anything manual) and automatic schema generation would only ship after that if and only if we still find a solution to how to describe these things.

Another thing that makes this issue quite low ROI is that the real unknown is the script generator, and now that everything is described, we could just adjust the reference section to say the thing is a cpp:mrdocs::Symbol[] with the key/value pairs reflected exactly as documented in the reference section. Any divergence, such as kebab/pascal/camel casing, could be manually described, and the proper description of each field is already perfectly described there. The same could be done for the XML. This solves almost all problems described here with the exception of maybe a low ROI aesthetic benefit of having these fields already presented with the correct casing for the user. And after we have the generators, another way to make this have a single source of truth is to extract the description of each field directly with mrdocs. The transform extension would filter anything outside the metadata and the generator extension would generate the reference page. The comments wouldn't have to be manually maintained in another location because they would come directly from the header files.

gennaroprota · 2026-06-24T08:52:43Z

Makes sense. I'll split MRDOCS_DESCRIBE_COMPUTED_PROPERTIES out into its own PR and land it first.

On the descriptions: I agree the single source of truth should be the header doc-comments. I'll look at sourcing the field descriptions from the headers rather than keeping a hand-written table, leaving only the projection divergences (casing, computed properties, $meta) described by hand.

alandefreitas · 2026-06-24T20:02:44Z

> On the descriptions: I agree the single source of truth should be the header doc-comments. I'll look at sourcing the field descriptions from the headers rather than keeping a hand-written table, leaving only the projection divergences (casing, computed properties, $meta) described by hand.

On a technical level, I agree. We should first reach a level where everything can be done automatically. At best, I would just mention that these other properties can also be made automatic. We also have the doc-comments for the computed properties, so this can be done automatically. And casing obviously can be done automatically. And if there's any other divergence, it could also be resolved before we attempt this (for instance, we have an issue for the "is-*" parameters).

In other words, this refactor should be a low-hanging-fruit celebration that reflection made everything consistent across our application. If we haven't reached this point yet, we can just make it consistent before we try to improve this. Otherwise, the refactor doesn't give us much ROI. We can already point users to the reference section today; showing it to them with the projections is nice, but no one is clamoring for that, and the investment to create and maintain code with exceptions to match the old behavior is very high.

gennaroprota changed the title ~~Feat/schema generation~~ feat: schema generation Apr 17, 2026

gennaroprota force-pushed the feat/schema_generation branch 6 times, most recently from cdda298 to b9fac7c Compare April 17, 2026 16:31

alandefreitas reviewed May 2, 2026

gennaroprota force-pushed the feat/schema_generation branch 7 times, most recently from 73561af to f9d5f63 Compare May 6, 2026 10:14

gennaroprota force-pushed the feat/schema_generation branch from f9d5f63 to fb9c795 Compare May 13, 2026 14:41

gennaroprota force-pushed the feat/schema_generation branch from fb9c795 to 0ae30ec Compare May 13, 2026 16:08

gennaroprota force-pushed the feat/schema_generation branch from 0ae30ec to 903033b Compare May 14, 2026 06:19

gennaroprota force-pushed the feat/schema_generation branch from 903033b to 019d019 Compare May 14, 2026 07:37

gennaroprota force-pushed the feat/schema_generation branch from 019d019 to 8cbb13b Compare June 23, 2026 07:30

gennaroprota added 12 commits June 23, 2026 11:21

fix: serialize OperatorKind as a string in the DOM

80955ec

OperatorKind was the only enum serialized as a raw integer. All other enums serialize as human-readable strings. Change tag_invoke to use getOperatorName, consistent with the rest.

feat: have --schemas also generate an XML schema

d50c237

--schemas now writes both mrdocs-dom-schema.json (Handlebars DOM) and mrdocs.rnc (XML output). The XML schema mirrors XMLWriter.cpp's serialization.

feat(schemas): describe DOM types and members in the JSON schema

69293c5

This adds a small lookup table keyed by (typeName, memberName) carrying hand-written descriptions for the DOM seen by Handlebars templates which become "description" fields in the JSON schema.

refactor: use dom::JSON::stringify instead of a duplicate emitter

d2a3d39

The newly-added JsonEmitter.hpp duplicated functionality already provided by `dom::JSON::stringify`. This drops the new header and uses the existing `stringify`.

refactor: rename buildDomSchema to buildDomJsonSchema

bef0949

The new function name is more descriptive. Contextually, this expands the doc-comment to spell out what the function returns.

refactor: move the schema headers out of the public API

0d29dbc

The schema headers are only used by ToolMain and the unit tests. They don't need to live under include/mrdocs/.

gennaroprota force-pushed the feat/schema_generation branch from 8cbb13b to 1716719 Compare June 23, 2026 15:04

gennaroprota added 9 commits June 23, 2026 18:04

build: drop the Java prerequisite

f4c962d

The bootstrap script checked for Java because the build needed it to run trang.jar, which converted the RNC schema to RNG. But trang.jar is no longer used (the --schemas option directly emits a .rng), so Java is no longer needed.

docs: add documentation for the --schemas option

1b2a79d

docs: replace the manual DOM reference with a generated one

b29106b

This replaces the hand-written DOM reference with one generated from mrdocs-dom-schema.json. A new Antora extension walks the file's `$defs` in source order and emits one section per type.

gennaroprota force-pushed the feat/schema_generation branch from 1716719 to 21452d4 Compare June 23, 2026 16:15
