ores.codegen architecture
Summary
This document records how ores.codegen is built and how its parts
fit together: where its code and data live, how the generator picks
templates for models, how it handles modelines and licence headers,
and the conventions it enforces.
For what the component is and how to operate it, see ores.codegen (the component model) and the recipes it links to.
Detail
Directory structure
projects/ores.codegen/
├── library/
│   ├── component_catalogue.org   # Component → models-dir/glob/modeling-dir (read by manifest.py)
│   ├── data/                     # Static data (licences, modelines)
│   │   ├── licence-GPL-v3.txt
│   │   └── modeline.org          # Editor modelines: MASD-style org, read directly by generator.py
│   ├── facet_catalogue.org       # Profile → model-type → template mapping (read by generator.py)
│   └── templates/                # Mustache templates
│       ├── sql_*.mustache
│       └── doc_*.org.mustache
├── models/                       # JSON model files driving generation
│   └── <component>/              # e.g. refdata/, trade/, dq/, iam/
├── output/                       # Default destination for generated files
├── scripts/                      # Shell scripts for common operations
├── src/                          # Python source
│   ├── codegen/                  # CLI package (codegen.py entry point)
│   │   ├── __init__.py
│   │   ├── generate.py           # generate / regenerate subcommands
│   │   ├── manifest.py           # named component registry
│   │   └── diff.py               # diff subcommand
│   ├── generator.py              # Core generator: JSON models + templates → output
│   ├── doc_generate.py           # v2 information-architecture doc generator
│   ├── fpml_parser.py            # FPML Genericode XML → JSON models
│   ├── images_generate_sql.py
│   ├── lei_extract_subset.py
│   └── iso_generate_metadata_sql.py
├── modeling/
│   └── ores.codegen.org          # Component model (the entry point)
├── docs/
│   ├── architecture.org          # This document
│   ├── cpp_generation_analysis.md
│   └── doc_generator.md
├── requirements.txt
├── codegen.sh                    # Wrapper: activates venv and calls codegen.py
├── run_generator.sh              # Legacy wrapper (kept for non-CLI uses)
├── generate_doc.sh               # Wrapper for doc_generate.py
└── generate_*_schema.sh          # Legacy per-component schema scripts (being retired)
Internal modules
| File | Purpose |
|---|---|
| src/generator.py | Core generator: JSON models + Mustache templates → SQL / C++ / etc. |
| src/codegen/generate.py | generate and regenerate subcommand implementations |
| src/codegen/manifest.py | Named component registry (maps component name → models dir + glob) |
| src/codegen/diff.py | diff subcommand implementation |
| src/doc_generate.py | v2 information-architecture document generator (task/story/sprint/…) |
| src/fpml_parser.py | FPML Genericode XML → JSON models |
| src/iso_generate_metadata_sql.py | ISO standards → SQL |
| src/images_generate_sql.py | Image artefacts (flags, crypto icons) → SQL |
| src/lei_extract_subset.py | LEI dataset subset extractor |
| library/facet_catalogue.org | Declares which templates each profile runs per model type; read directly by generator.py |
| library/component_catalogue.org | Maps component names to discovery roots; read directly by manifest.py |
| library/data/modeline.org | Editor modeline strings per language; read directly by generator.py |
| library/data/ | Static data files (licences, modelines) |
| library/templates/ | Mustache templates |
| models/ | JSON model files |
| output/ | Default destination for generated files |
Main generator functions
In src/generator.py:
- is_table_model(filename): returns True for *_table.json filenames.
- get_model_type(filename): maps the filename suffix to a model type string (table, schema, domain_entity, junction, …).
- load_profiles(base_dir): parses library/facet_catalogue.org directly via _load_profiles_from_org().
- resolve_profile_templates(profile, profiles, model_type): returns the template list for a given profile and model type.
- resolve_output_path(pattern, model_data, model_type): expands an output path pattern using model fields.
- load_data(data_dir): loads JSON and text files from a data directory.
- render_template(template_path, data): renders a Mustache template with the given data.
- generate_from_model(): orchestrates generation for one model file.
Template system
- Mustache via the pystache library.
- Templates live in library/templates/. The .mustache files are generated artefacts, tangled from the literate facet docs in the same directory; see the Codegen template library overview for the hierarchy, tangle workflow, and drift checks.
- Output is SQL, C++, or org-mode depending on the template family.
Data files
- library/data/licence-GPL-v3.txt: full GPL v3 licence text used in generated headers.
- library/data/modeline.org: per-language editor modeline strings (MASD-style org hierarchy; read directly by generator.py via _load_modelines_from_org()).
Model types and file naming
The generator detects a model's type from its filename suffix. The root
JSON key usually matches the type name (the legacy schema type uses
entity, and junction varies), and each type is associated with one or
more templates through library/facet_catalogue.org.
| Filename suffix | Model type | Root JSON key | Primary template |
|---|---|---|---|
| *_table.json | table | table | sql_schema_create.mustache (profile: sql) |
| *_entity.json | schema | entity | sql_schema_table_create.mustache (legacy) |
| *_domain_entity.json | domain_entity | domain_entity | C++ templates + sql_schema_domain_entity_create.mustache |
| *_junction.json | junction | (varies) | sql_schema_junction_create.mustache |
The _table.json format is the current standard for SQL-only entity
tables (refdata and similar). It replaces the legacy _entity.json →
sql_schema_table_create.mustache path.
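The suffix-to-type mapping in the table above can be sketched as follows. This is an illustration only, not the actual body of get_model_type(): the function name detect_model_type, the lookup table, and the None return for unrecognised files are all assumptions. The one point worth noting is ordering, since a *_domain_entity.json filename also ends in _entity.json.

```python
from pathlib import Path
from typing import Optional

# Suffix -> model type, following the table above. The most specific suffix
# comes first: "party_domain_entity.json" also ends in "_entity.json", so
# checking "_entity.json" first would misclassify it as "schema".
SUFFIX_TO_MODEL_TYPE = (
    ("_domain_entity.json", "domain_entity"),
    ("_junction.json", "junction"),
    ("_table.json", "table"),
    ("_entity.json", "schema"),
)


def detect_model_type(filename: str) -> Optional[str]:
    """Return the model type for a model filename, or None if unrecognised.

    Hypothetical helper; the real get_model_type() may behave differently
    for files that match none of the suffixes.
    """
    name = Path(filename).name
    for suffix, model_type in SUFFIX_TO_MODEL_TYPE:
        if name.endswith(suffix):
            return model_type
    return None
```

Keeping the mapping in a single ordered tuple means that adding a model type is a one-line change, provided the new suffix is placed ahead of any shorter suffix it contains.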
Model-template mapping
The active mapping is declared in library/facet_catalogue.org, not hard-coded
in Python. Each profile entry lists which templates apply to which model
types and the output path pattern.
Table schema mappings (*_table.json files)
These files drive unified SQL generation via sql_schema_create.mustache.
Run with codegen.sh regenerate --component <name> --profile sql.
| Template | Output file pattern |
|---|---|
| sql_schema_create.mustache | projects/ores.sql/create/{component}/{component}_{entity_plural}_create.sql |
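As a rough illustration of how such a pattern could be expanded, the sketch below fills the {placeholders} from the model's fields with Python's built-in string formatting. The helper name expand_output_pattern and the sample field values are hypothetical; the real resolve_output_path() also receives the model type and may apply further rules.

```python
def expand_output_pattern(pattern: str, table: dict) -> str:
    """Fill {placeholders} in an output path pattern from model fields.

    Hypothetical helper: raises ValueError when the pattern names a field
    that the model does not provide.
    """
    try:
        return pattern.format_map(table)
    except KeyError as missing:
        raise ValueError(
            f"model has no field {missing} required by pattern {pattern!r}"
        ) from None


# Sample values are invented for illustration.
PATTERN = "projects/ores.sql/create/{component}/{component}_{entity_plural}_create.sql"
EXAMPLE_TABLE = {"component": "refdata", "entity_plural": "currencies"}
EXAMPLE_PATH = expand_output_pattern(PATTERN, EXAMPLE_TABLE)
```

With these sample values the pattern expands to projects/ores.sql/create/refdata/refdata_currencies_create.sql.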
The *_table.json model root key is table with the following fields:
| Field | Type | Description |
|---|---|---|
| schema, product, component | string | Namespace identifiers |
| entity_singular, entity_plural | string | Used in table name and output filename |
| description | string | Appears in the SQL header comment |
| has_tenant_id | boolean | Whether the table has a tenant_id column |
| primary_key.column | string | Primary key column name |
| primary_key.type | string | SQL type (e.g. text, uuid) |
| primary_key.is_text | boolean | Controls empty-string check constraint and default quoting |
| columns[] | list | Non-PK columns (name, type, nullable, default) |
| coding_scheme | none / required / nullable | Whether to add a coding_scheme_code FK column |
| image_id | boolean | Whether to add an image_id UUID FK column |
| validation_fn.tenant_scope | system / both / tenant | Which tenants the validation function queries |
| validation_fn.default | string / absent | Return value when input is null or empty |
| validation_fn.order_by | string / absent | Column for ORDER BY in error messages (defaults to PK) |
| insert_trigger.validations[] | list | Per-column validation function calls in the insert trigger |
| check_constraints[] | list | Additional SQL check constraints (expression strings) |
| indexes[] | list | Extra indexes beyond the standard ones |
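Two of the fields above take a value from a fixed set. A small check along the following lines could catch typos before generation. It is a sketch only: the helper names are hypothetical, the source does not say whether the generator validates these fields, and treating an absent field as acceptable is an assumption.

```python
# Allowed values, copied from the field table above.
ALLOWED_VALUES = {
    "coding_scheme": {"none", "required", "nullable"},
    "validation_fn.tenant_scope": {"system", "both", "tenant"},
}


def lookup(table: dict, dotted_key: str):
    """Follow a dotted key such as 'validation_fn.tenant_scope'; None if absent."""
    value = table
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def enum_errors(table: dict) -> list:
    """Return one message per enumerated field whose value is outside its set.

    Assumption: a field that is absent is not reported as an error.
    """
    errors = []
    for key, allowed in ALLOWED_VALUES.items():
        value = lookup(table, key)
        if value is not None and value not in allowed:
            errors.append(f"{key}: {value!r} is not one of {sorted(allowed)}")
    return errors
```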
The generator preprocesses boolean flags (has_coding_scheme,
has_any_coding_scheme, scope_system, scope_both, scope_tenant,
etc.) and pre-renders sql_check_constraints as a single string before
passing data to Mustache, to avoid pystache whitespace issues with
adjacent section tags.
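The preprocessing step described above might look roughly like this. The sketch covers only the scope_* flags and the pre-rendered sql_check_constraints string; the coding-scheme flags are left out because this document does not define their exact semantics. The function name preprocess_table and the layout of the joined constraint string are assumptions, not the generator's actual output format.

```python
def preprocess_table(table: dict) -> dict:
    """Return a copy of the model with template-friendly derived fields.

    Hypothetical sketch of the preprocessing described above.
    """
    context = dict(table)

    # One boolean per tenant scope, so a template can use a plain section
    # such as {{#scope_both}}...{{/scope_both}} instead of comparing strings.
    scope = table.get("validation_fn", {}).get("tenant_scope")
    for candidate in ("system", "both", "tenant"):
        context[f"scope_{candidate}"] = scope == candidate

    # Join the constraints in Python so the template emits one value rather
    # than iterating over a section. The separator layout is an assumption.
    constraints = table.get("check_constraints", [])
    context["sql_check_constraints"] = "".join(
        f",\n    check ({expression})" for expression in constraints
    )
    return context
```

Doing the joining in Python keeps the template free of adjacent section tags, which is the whitespace problem the paragraph above refers to.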
Domain entity schema mappings (*_domain_entity.json files)
These files drive both C++ generation and SQL for domain entities (party, book, counterparty, etc.). The SQL portion is handled separately from the C++ portion.
| Template | Output file |
|---|---|
| sql_schema_domain_entity_create.mustache | {component}_{entity}_create.sql |
| sql_schema_notify_trigger.mustache | {component}_{entity}_notify_trigger.sql |
| sql_schema_artefact_create.mustache | dq_{entity}_artefact_create.sql |
| C++ templates (via --profile all-cpp) | .hpp / .cpp files |
Standard data mappings
| Model file | Template(s) |
|---|---|
| model.json | sql_batch_execute.mustache |
| catalogs.json | sql_catalog_populate.mustache |
| country_currency.json | sql_flag_populate.mustache, sql_currency_populate.mustache, sql_country_populate.mustache |
| datasets.json | sql_dataset_populate.mustache, sql_dataset_dependency_populate.mustache |
| methodologies.json | sql_methodology_populate.mustache |
| tags.json | sql_tag_populate.mustache |
Entity populate mappings (*_data.json files)
| Template | Output file |
|---|---|
| sql_populate_refdata.mustache | {component}_{entity}_populate.sql |
Profile system
Profiles are declared in library/facet_catalogue.org. Each profile maps to a
list of template entries; each entry specifies the template name, the
output path pattern, and the model types it applies to.
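To make the shape of a profile concrete, the sketch below models profiles as a dictionary of template entries and filters them by model type. The in-memory layout, the helper name templates_for, and the rule that an entry without model_types applies to every model type are assumptions; the real data comes from library/facet_catalogue.org and is resolved by resolve_profile_templates().

```python
# Hypothetical in-memory form of two entries from the sql profile.
EXAMPLE_PROFILES = {
    "sql": [
        {
            "template": "sql_schema_create.mustache",
            "output": "projects/ores.sql/create/{component}/{component}_{entity_plural}_create.sql",
            "model_types": ["table"],
        },
        {
            "template": "sql_schema_domain_entity_create.mustache",
            "output": "{component}_{entity}_create.sql",
            "model_types": ["domain_entity"],
        },
    ],
}


def templates_for(profile: str, profiles: dict, model_type: str) -> list:
    """Return the entries of a profile that apply to the given model type.

    Assumption: an entry with no model_types applies to every model type.
    """
    if profile not in profiles:
        raise KeyError(f"unknown profile: {profile}")
    return [
        entry
        for entry in profiles[profile]
        if not entry.get("model_types") or model_type in entry["model_types"]
    ]
```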
Built-in profiles:
| Profile | Description |
|---|---|
| sql | SQL DDL only (table create / domain entity create) |
| all-cpp | C++ headers, implementations, JSON and table I/O |
| all | sql + all-cpp; not allowed for _table.json or _entity.json models (see below) |
The --profile all guard: running all on a SQL model (table or
schema type) is refused by the CLI because a matching
_domain_entity.json may exist for the same entity, and running all
would silently overwrite the production SQL with the wrong template.
Use --profile sql or --profile all-cpp explicitly.
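The guard can be pictured as a small pre-flight check. This is a sketch under stated assumptions rather than the CLI's actual code: the exception class, function name, and error wording are all hypothetical.

```python
# Model types that hold SQL-only models, per the guard described above.
SQL_ONLY_MODEL_TYPES = {"table", "schema"}


class ProfileNotAllowed(ValueError):
    """Raised when a profile must not be run against a given model type."""


def check_profile(profile: str, model_type: str) -> None:
    """Refuse the 'all' profile for table and schema models.

    Hypothetical helper: returns None when the combination is acceptable.
    """
    if profile == "all" and model_type in SQL_ONLY_MODEL_TYPES:
        raise ProfileNotAllowed(
            f"--profile all is not allowed for {model_type} models; "
            "use --profile sql or --profile all-cpp explicitly"
        )
```

Failing early, before any file is written, is what prevents the silent overwrite.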
Component registry
Named components are declared in library/component_catalogue.org, read directly
by src/codegen/manifest.py at startup. See component_catalogue.org for the full
16-component table (name, models_dir, entity_glob, exclude_suffix, modeling_dir).
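One way to picture a registry row is as a small record plus a filter over candidate filenames. The sketch below is illustrative only: the Component class and select_models helper are hypothetical, the sample values are invented, and the reading of exclude_suffix (filenames ending in it are skipped) is an assumption about its meaning.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Component:
    """One row of the component catalogue (field names from the table above)."""
    name: str
    models_dir: str
    entity_glob: str
    exclude_suffix: str
    modeling_dir: str


def select_models(component: Component, filenames: list) -> list:
    """Return the filenames that match the glob and lack the excluded suffix.

    Assumption: an empty exclude_suffix excludes nothing.
    """
    selected = []
    for filename in sorted(filenames):
        if not fnmatch(filename, component.entity_glob):
            continue
        if component.exclude_suffix and filename.endswith(component.exclude_suffix):
            continue
        selected.append(filename)
    return selected


# Invented sample row; not taken from component_catalogue.org.
EXAMPLE_COMPONENT = Component(
    name="example",
    models_dir="models/example",
    entity_glob="*_entity.json",
    exclude_suffix="_domain_entity.json",
    modeling_dir="modeling/example",
)
```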
Retired scripts
The following shell scripts have been deleted and replaced by codegen.sh:
| Deleted script | Replacement command |
|---|---|
| generate_refdata_schema.sh | codegen.sh regenerate --component refdata --profile sql |
Modeline configurations
From library/data/modeline.org:
| Language | Modeline |
|---|---|
| SQL | sql-product: postgres; indent-tabs-mode: nil |
| C++ | mode: c++; indent-tabs-mode: nil; c-basic-offset: 4 |
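The sketch below shows how a modeline from the table above could be combined with a copyright line and licence text into a block comment for SQL or C++. It is not the generator's actual layout: the function name, the -*- ... -*- wrapper (the usual Emacs file-variable convention), the copyright wording, and the placeholder holder name are assumptions.

```python
from datetime import date
from typing import Optional

# Modeline strings copied from the table above.
MODELINES = {
    "sql": "sql-product: postgres; indent-tabs-mode: nil",
    "cpp": "mode: c++; indent-tabs-mode: nil; c-basic-offset: 4",
}


def licence_header(language: str, holder: str, licence_text: str,
                   year: Optional[int] = None) -> str:
    """Build a /* ... */ header with a ' * ' prefix on every line.

    Hypothetical layout: modeline, blank line, copyright, blank line, licence.
    The year defaults to the current year.
    """
    year = year if year is not None else date.today().year
    lines = [f"-*- {MODELINES[language]} -*-", "", f"Copyright (C) {year} {holder}", ""]
    lines += licence_text.splitlines()
    # rstrip() keeps blank lines as a bare " *" with no trailing space.
    body = "\n".join(f" * {line}".rstrip() for line in lines)
    return f"/*\n{body}\n */\n"
```

Passing the year explicitly keeps the output reproducible in tests, while the default follows the current-year behaviour.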
Features
- Licence generation. Generated files carry a licence header with editor modelines, a copyright with the current year, and the appropriate per-language comment formatting.
- Multi-language comment support. SQL and C++ use /* ... */ with a * line prefix; Python uses """; JavaScript uses /** */.
- Flexible output. Default output directory is output/; overridable via the second positional argument; created automatically if absent.
- Overall models. A model.json can orchestrate generation of multiple artefacts in dependency order.
- Dynamic prefixing. A model_name property on an overall model prefixes every output file (for example solvaris_).
- Automatic sibling loading. JSON models in the same directory are loaded together so a template can cross-reference them.
- Enhanced data context. Subject-area datasets (such as currencies_dataset, countries_dataset) are surfaced as named variables to templates for direct access.
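Sibling loading can be sketched as reading every JSON file next to the model being generated. This is an illustration, not the generator's load_data() logic: the helper name load_siblings and the choice of the file stem as the context key are assumptions, and the real generator may name context variables differently.

```python
import json
from pathlib import Path


def load_siblings(model_path: str) -> dict:
    """Load every JSON file in the model's directory, keyed by file stem.

    Hypothetical helper: 'catalogs.json' becomes the key 'catalogs'.
    """
    directory = Path(model_path).parent
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(directory.glob("*.json"))
    }
```

A template context built this way lets one template read values from another model in the same directory without a second pass over the filesystem.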
Example model structure
From models/slovaris/catalogs.json:
[
  {
    "name": "Slovaris",
    "description": "Imaginary world to test all system functions.",
    "owner": "Testing Team"
  }
]
The sql_catalog_populate.mustache template generates SQL that:
- Includes the enhanced licence header.
- Sets the schema to ores.
- Generates SQL calls to metadata.upsert_dq_catalogs().
- Includes summary queries.
Extending
To add a new _table.json model for an existing component:
- Create the *_table.json file in the component's models/ directory.
- Run codegen.sh regenerate --component <name> --profile sql to generate the SQL.
To add a new profile or template:
- Add the Mustache template to library/templates/.
- Add an entry to library/facet_catalogue.org under the relevant profile heading, adding a row to the templates table with template, output pattern, and optionally model_types.
- If adding a new model type, update get_model_type() in src/generator.py and add any preprocessing logic in the generate_from_model() dispatch block.
To add a new named component:
- Add a row to library/component_catalogue.org with the component's models_dir, entity_glob, exclude_suffix, and modeling_dir.
- The component is then available as codegen.sh regenerate --component <name>.
See also
- ores.codegen: the component model (entry point).
- Codegen input org-file schema reference: schema for every org-file type consumed by the generator.
- How do I run codegen?: operational recipe.
- How do I create a new doc?: operational recipe for the org-document generator.
- projects/ores.codegen/docs/cpp_generation_analysis.md: analysis of the C++ code generation paths.
- projects/ores.codegen/docs/doc_generator.md: full CLI reference for generate_doc.sh.