Codegen model unification analysis
Summary
ORE Studio's ores.codegen architecture generates C++, SQL, and Qt
artefacts from org-mode model files. Today the same logical entity is
described by two parallel org-mode model types: a domain_entity file
(ores.refdata.<name>.org) and, for some entities, a separate table
file (ores.refdata.<name>_table.org). The two types are governed by
the same conceptual schema (see the org entity meta model and the
codegen input org-file schema reference) but are detected, parsed, and
rendered through entirely separate code paths.
This split creates three problems: confusion (two files of record per entity), redundancy (overlapping column and identity declarations), and a live overwrite risk (both files can target the identical SQL output path, and the entity pathway produces a structurally incomplete file).
This document establishes the current state, inventories every model in
projects/ores.refdata/modeling/, classifies each by type, enumerates
the six concrete blockers to unification, and recommends a migration to a
single org file per entity that carries all SQL, C++, and Qt sections.
Current State
Model type detection
generator.py determines a model's type from its filename, in a fixed
priority order. get_model_type() and load_model() are both keyed off
filename suffixes:
| Priority | Filename pattern | Detected type | Loader | Root key |
|---|---|---|---|---|
| 1 | ores.*.org (no recognised suffix) | domain_entity | load_org_model() | {"domain_entity": {...}} |
| 2 | *_table.org | table | load_org_table_model() | {"table": {...}} |
| 3 | *_junction.org | junction | (table-family loader) | {"junction": {...}} |
| 4 | *_field_group.org | field_group | | |
| 5 | *_lookup_entity.org / *_entity.json | schema | | |
| 6 | service_registry, component, enum, data | (various) | | |
The first matching rule wins. A file named ores.refdata.party_status.org
is therefore a domain_entity, while ores.refdata.party_status_table.org
is a table — purely because of the _table suffix.
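The suffix rules above can be sketched in a few lines. This is an illustration only, not the real get_model_type(): the name detect_model_type, the SUFFIX_RULES list, and the "unknown" return value are hypothetical, and the sixth row of the table (service_registry, component, enum, data) is omitted. Suffixes are tested first, which is equivalent to the table's "no recognised suffix" condition for domain_entity.

```python
from pathlib import Path

# Illustrative suffix rules taken from the table above.
# This list is a stand-in; the real generator.py may organise them differently.
SUFFIX_RULES = [
    ("_table.org", "table"),
    ("_junction.org", "junction"),
    ("_field_group.org", "field_group"),
    ("_lookup_entity.org", "schema"),
    ("_entity.json", "schema"),
]


def detect_model_type(path: str) -> str:
    """Classify a model file from its filename alone (hypothetical helper)."""
    name = Path(path).name
    for suffix, model_type in SUFFIX_RULES:
        if name.endswith(suffix):
            return model_type
    # No recognised suffix: a plain ores.*.org file is a domain entity.
    if name.startswith("ores.") and name.endswith(".org"):
        return "domain_entity"
    return "unknown"
```

Under these assumptions, the two party_status files resolve to different types even though they describe the same logical entity, which is the behaviour the rest of this document argues against.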
Profile dispatch by model type
Which facets fire is a function of the detected model type. The profile × model_type matrix, drawn from the facet catalogue:
| Profile | domain_entity | table | junction | schema | Notes |
|---|---|---|---|---|---|
| sql | yes (4 tmpl) | yes | yes | yes | 4 templates on entity; 1 each on table/junction/schema |
| domain | yes | no | no | yes | skips table and junction |
| generator | yes | no | no | yes | |
| repository | yes | no | no | yes | |
| service | yes | no | no | yes | |
| protocol | yes | no | no | yes | |
| nats-eventing | yes | no | no | yes | |
| nats-handler | yes | no | no | yes | |
| qt | yes (12 tmpl) | no | no | no | fires on domain_entity ONLY |
| all (composite) | yes | no | no | no | explicitly guarded: refuses table and schema |
The sql profile fires on every type. The C++ and messaging profiles
(domain, generator, repository, service, protocol,
nats-eventing, nats-handler) fire only on domain_entity and
schema, skipping table and junction entirely. The qt profile
fires on domain_entity only. The composite --profile all is
explicitly guarded to refuse table and schema types.
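The matrix reduces to three rules, which the following sketch encodes. It is a restatement of the table for illustration, assuming nothing about how generator.py actually stores the facet catalogue; profile_fires and the constant names are hypothetical.

```python
# Profiles grouped by the set of model types they fire on, per the matrix above.
CPP_AND_MESSAGING = {
    "domain", "generator", "repository", "service",
    "protocol", "nats-eventing", "nats-handler",
}
ENTITY_ONLY = {"qt", "all"}
ALL_TYPES = {"domain_entity", "table", "junction", "schema"}


def profile_fires(profile: str, model_type: str) -> bool:
    """Return True if the profile would fire for the model type (hypothetical)."""
    if profile == "sql":
        return model_type in ALL_TYPES
    if profile in CPP_AND_MESSAGING:
        return model_type in {"domain_entity", "schema"}
    if profile in ENTITY_ONLY:
        return model_type == "domain_entity"
    return False
```

For example, profile_fires("domain", "table") is False, which is why a table-only entity such as currency cannot gain a C++ layer without changing its model type.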
Component split
The two file types are discovered by two different components:
- refdata: discovers *_table.org and *_junction.org files, then dispatches SQL-only profiles.
- refdata-cpp: discovers ores.refdata.*.org entity files, then dispatches C++ and messaging profiles.
This split exists only because the two file types are physically separate. The discovery globs are disjoint by construction.
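A sketch of how two disjoint discovery sets could be derived from filenames. The actual glob patterns in the component catalogue are not shown in this document, so this assumes that refdata-cpp excludes files carrying the _table or _junction suffix; component_for and SQL_ONLY_PATTERNS are hypothetical names.

```python
from fnmatch import fnmatch
from typing import Optional

# Assumed patterns; the real component catalogue entries may differ.
SQL_ONLY_PATTERNS = ("*_table.org", "*_junction.org")
ENTITY_PATTERN = "ores.refdata.*.org"


def component_for(filename: str) -> Optional[str]:
    """Assign a model file to exactly one component, or None if undiscovered."""
    # Suffixed files are claimed first, so the two sets cannot overlap.
    if any(fnmatch(filename, pattern) for pattern in SQL_ONLY_PATTERNS):
        return "refdata"
    if fnmatch(filename, ENTITY_PATTERN):
        return "refdata-cpp"
    return None
```

Note that a bare ores.refdata.*.org pattern would also match the _table.org files, so some form of exclusion is needed for the sets to stay disjoint; the ordering above is one way to express it.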
Component field migration (partial)
Entity files declare which API component they belong to via a field in
the * C++ / ** Flags property drawer. The schema for this field has
changed:
| Era | Fields | Status |
|---|---|---|
| Legacy | :component_include: and :component_core: | Deprecated |
| Current | :subcomponent: api | Active |
generator.py now requires :subcomponent: to be present in order to
dispatch messaging profiles. On the main branch, at least one entity
(book_status) is in a transitional state where all three fields coexist
(the deprecated pair and the new field), because separate PRs applied
the migration inconsistently. The generator silently ignores the
deprecated fields once :subcomponent: is present, but they are schema
debt and should be removed.
A scan for remaining legacy fields:
grep -rl "component_include\|component_core" projects/ores.refdata/modeling/
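The grep lists every file that still contains a legacy field. To single out the transitional files, where the deprecated pair and the new field coexist, a small classifier over the file text may help. This is a minimal sketch, assuming the fields appear as org property lines of the form :name: value; field_state and its return labels are hypothetical.

```python
import re

# Match the property keys at the start of a (possibly indented) line.
LEGACY_FIELD = re.compile(r"^\s*:component_(?:include|core):", re.MULTILINE)
CURRENT_FIELD = re.compile(r"^\s*:subcomponent:", re.MULTILINE)


def field_state(org_text: str) -> str:
    """Classify an entity file by which component fields it declares."""
    has_legacy = bool(LEGACY_FIELD.search(org_text))
    has_current = bool(CURRENT_FIELD.search(org_text))
    if has_legacy and has_current:
        return "transitional"  # schema debt: the legacy fields can be removed
    if has_legacy:
        return "legacy"        # messaging profiles would not dispatch
    if has_current:
        return "current"
    return "none"
```

Applied to each file under projects/ores.refdata/modeling/, a "transitional" result marks a file whose legacy fields are safe to delete, while "legacy" marks one that still needs :subcomponent: added.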
Legacy JSON models
No JSON models remain in projects/ores.refdata/. The models_dir
config entries in the component catalogue for both refdata and
refdata-cpp point to non-existent directories. They are vestigial
after the org migration and are dead code.
Model Inventory
Complete inventory of projects/ores.refdata/modeling/, classified by
model type. See the entity coverage matrix for the corresponding
generation coverage per layer.
ores.codegen.entity — 27 files (C++ + SQL + Qt capable)
Dual-file entities — also have a _table.org counterpart, so SQL
generation is split across two files:
| book_status | contact_type | country |
| currency_market_tier | monetary_nature | party_id_scheme |
| party_status | party_type | rounding_type |
Entity-only — no _table.org; SQL is generated by
sql_schema_domain_entity_create.mustache from this file, or SQL is not
yet generated:
| book | business_unit | cds_convention |
| counterparty | counterparty_contact_information | counterparty_identifier |
| deposit_convention | fra_convention | fx_convention |
| ibor_index_convention | ois_convention | overnight_index_convention |
| party | party_contact_information | party_identifier |
| portfolio | swap_convention | zero_convention |
ores.codegen.table — 11 files (SQL-only)
Dual-file — paired with an entity file:
| book_status | contact_type | country |
| currency_market_tier | monetary_nature | party_id_scheme |
| party_status | party_type | rounding_type |
Table-only — no entity counterpart; SQL-only, no C++ or Qt:
| currency | purpose_type |
ores.codegen.junction — 3 files
| party_counterparty_junction | party_country_junction | party_currency_junction |
ores.codegen.module — 1 file
ores.refdata.module — index only; skipped by the generator.
Not processed by the generator — 1 file
component_overview.org.
The Duplication Problem
The SQL overwrite risk
Both ores.refdata.party_status.org (a domain_entity) and
ores.refdata.party_status_table.org (a table) target the identical
output path:
projects/ores.sql/create/refdata/refdata_party_statuses_create.sql
The SQL files on disk carry Template: sql_schema_create.mustache, which
shows they were produced by the table pathway. Running --profile sql on
the refdata-cpp component, which discovers entity org files, would
overwrite this output with the structurally incomplete output of
sql_schema_domain_entity_create.mustache. The entity file's SQL template
cannot generate the validation function body; it emits only the drop
stub, via paste-marker
5E47F108-1350-4540-B3C2-E83DD5379B2D.
The result is a silent regression: a valid SQL artefact replaced with one that cannot enforce the entity's validation contract.
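A collision of this kind can be detected before anything is written by grouping planned outputs by path. The sketch below assumes the generator can enumerate (model file, output path) pairs ahead of rendering; find_output_collisions is a hypothetical helper, not an existing function in ores.codegen.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_output_collisions(
    targets: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each output path claimed by more than one model to its claimants."""
    by_output: Dict[str, List[str]] = defaultdict(list)
    for model_file, output_path in targets:
        by_output[output_path].append(model_file)
    return {
        output: sorted(models)
        for output, models in by_output.items()
        if len(models) > 1
    }


planned = [
    ("ores.refdata.party_status.org",
     "projects/ores.sql/create/refdata/refdata_party_statuses_create.sql"),
    ("ores.refdata.party_status_table.org",
     "projects/ores.sql/create/refdata/refdata_party_statuses_create.sql"),
]
collisions = find_output_collisions(planned)
```

Here collisions contains one entry, keyed by the shared SQL path and listing both party_status model files. Failing the run on a non-empty result would turn the silent regression into an explicit error.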
Why the two templates are not interchangeable
| Concern | sql_schema_create.mustache (table) | sql_schema_domain_entity_create.mustache (entity) |
|---|---|---|
| Root key | table | domain_entity |
| Validation function | full body (table.validation_fn.tenant_scope) | drop stub only (paste-marker) |
| Insert trigger | table.insert_trigger.validations[] | not represented |
| Coding scheme | table.coding_scheme | not represented |
| Tenancy | table.has_tenant_id | derived, partial |
The table template consumes a richer data shape than the entity template can supply. No single template currently handles both richness levels.
Blockers to Unification
1. Two incompatible SQL templates with divergent data shapes. sql_schema_create.mustache consumes table.validation_fn.tenant_scope, table.insert_trigger.validations[], table.coding_scheme, and table.has_tenant_id. sql_schema_domain_entity_create.mustache consumes the domain_entity root key and paste-marker fragments. No single template currently handles both richness levels.
2. get_model_type() and load_model() are hard-coupled to filename suffixes. Merging requires detection based on #+type: frontmatter rather than filename patterns, plus a unified parse path.
3. Entirely separate parsers. load_org_table_model() and load_org_model() share no code. The * Validation function and * Insert trigger sections are not parsed by load_org_model() at all.
4. The refdata vs refdata-cpp component split relies on the two file types being separate. Unification collapses this into one component with one discovery glob.
5. SQL-only entities must remain expressible. currency and purpose_type have no C++ layer. A unified schema needs #+sql_only: true (or equivalent) so the generator suppresses C++ profile dispatch for those entities.
6. The * C++ / ** Qt subsection is mandatory in the current load_org_model() preprocessing path. Roughly 200 lines of field derivation assume a C++ context. Making all C++ fields optional requires significant defensive template logic, or an explicit #+has_qt: false guard.
Target State
A single ores.codegen.entity file per entity, containing all SQL, C++,
and Qt sections. Concretely:
- The table-specific sections move into the entity file schema: * Validation function, * Insert trigger / ** Validations, #+has_tenant_id, #+coding_scheme, and #+image_id.
- The refdata and refdata-cpp component entries in the component catalogue merge into one component with one discovery glob.
- The ores.codegen.table type is retired. The 11 *_table.org files are migrated into their entity counterparts (or, for table-only entities, into new sql_only entity files).
- A #+sql_only: true flag handles SQL-only entities (currency, purpose_type), suppressing C++ and Qt profile dispatch.
- The SQL overwrite risk is eliminated: one file, one SQL output path, one template capable of the full validation-function and insert-trigger body.
The unified file schema is governed by the org entity meta model and
the variability (profile) metamodel, with sql_only and has_qt as
new variability points.
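How the two new variability points might gate profile dispatch can be sketched as follows. Neither flag exists yet, so everything here is an assumption: the #+key: value syntax, the defaults (sql_only false, has_qt true), and the helper names parse_flags and allowed_profiles.

```python
from typing import Dict, List


def parse_flags(org_text: str) -> Dict[str, str]:
    """Collect '#+key: value' frontmatter lines into a lower-cased dict."""
    flags: Dict[str, str] = {}
    for line in org_text.splitlines():
        if line.startswith("#+") and ":" in line:
            key, _, value = line[2:].partition(":")
            flags[key.strip().lower()] = value.strip().lower()
    return flags


def allowed_profiles(flags: Dict[str, str], requested: List[str]) -> List[str]:
    """Filter requested profiles by the proposed sql_only / has_qt guards."""
    # Assumed defaults: entities are full-stack unless they opt out.
    sql_only = flags.get("sql_only", "false") == "true"
    has_qt = flags.get("has_qt", "true") == "true"
    allowed = []
    for profile in requested:
        if sql_only and profile != "sql":
            continue  # SQL-only entity: suppress every non-SQL profile
        if profile == "qt" and not has_qt:
            continue  # C++ without Qt: suppress only the qt profile
        allowed.append(profile)
    return allowed
```

With these assumptions, a currency file that declares #+sql_only: true yields only the sql profile, while an entity that declares #+has_qt: false keeps its C++ and messaging profiles and drops qt alone.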
Migration Path
The recommended order minimises overwrite risk during the transition and keeps the generator runnable at each step.
1. Detection by frontmatter. Change get_model_type() to read #+type: rather than the filename suffix. Keep the filename-suffix fallback temporarily so existing files still resolve. This unblocks blocker 2 without moving any content.
2. Unify the SQL template. Extend sql_schema_domain_entity_create.mustache (or fold both into a single template) so the domain_entity data shape carries the full validation-function body and insert-trigger validations, sourced from the migrated sections, not the paste-marker stub. This retires the distinction at the template layer (blocker 1).
3. Merge the parsers. Teach load_org_model() to parse the * Validation function and * Insert trigger sections, and to populate has_tenant_id, coding_scheme, and image_id. Reuse the logic from load_org_table_model(); then delete the table loader (blocker 3).
4. Add the variability guards. Introduce #+sql_only: true and #+has_qt: false. Make C++ and Qt field derivation conditional on these flags so SQL-only entities and C++-without-Qt entities are expressible (blockers 5 and 6).
5. Migrate content, one entity at a time. For each of the 9 dual-file entities, move the _table.org sections into the entity file, verify the generated SQL is byte-identical to the table-pathway output, then delete the _table.org file. For the 2 table-only entities (currency, purpose_type), create sql_only entity files and delete the table files. Use the domain entity evaluation checklist and the entity commissioning process to validate each migrated entity.
6. Collapse the components. Merge refdata and refdata-cpp into one component with a single discovery glob over ores.refdata.*.org (excluding junctions and the module index). Remove the dead models_dir entries (blocker 4).
7. Retire the table type. Once no *_table.org files remain, remove the table branch from get_model_type(), load_model(), and the profile matrix. Junctions remain a distinct type and are out of scope for this merge.
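The first step, frontmatter detection with a temporary filename fallback, could look roughly like this. It is a sketch under stated assumptions: the #+type: values are taken from the inventory headings (ores.codegen.entity, ores.codegen.table, ores.codegen.junction), their mapping onto the existing type names is a guess, and detect_type is a hypothetical stand-in for the revised get_model_type().

```python
import re

# Assumed mapping from #+type: values to the generator's existing type names.
FRONTMATTER_TYPES = {
    "ores.codegen.entity": "domain_entity",
    "ores.codegen.table": "table",
    "ores.codegen.junction": "junction",
}
TYPE_LINE = re.compile(r"^#\+type:\s*(\S+)\s*$", re.MULTILINE | re.IGNORECASE)


def detect_type(filename: str, org_text: str) -> str:
    """Prefer the #+type: frontmatter; fall back to the filename suffix."""
    match = TYPE_LINE.search(org_text)
    if match:
        declared = match.group(1)
        if declared not in FRONTMATTER_TYPES:
            raise ValueError(f"unrecognised #+type: {declared!r} in {filename}")
        return FRONTMATTER_TYPES[declared]
    # Temporary fallback so files without frontmatter still resolve.
    if filename.endswith("_table.org"):
        return "table"
    if filename.endswith("_junction.org"):
        return "junction"
    return "domain_entity"
```

Because the declared type wins over the suffix, a file can be migrated in place before it is renamed, which keeps the generator runnable between steps. Raising on an unrecognised value is a design choice here; silently falling back would hide typos in the frontmatter.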