UML Model Refresh
Bring all component class diagrams up to date; introduce a generation script
Problem Statement
The per-component PlantUML class diagrams in projects/*/modeling/*.puml are
significantly out of date. They were last systematically updated during the
early scaffolding phase and have not tracked subsequent domain-model evolution.
Current state
| Category | Count |
|---|---|
| Projects with no modeling/*.puml at all | 27 |
| Projects with an empty placeholder .puml | 4 |
| Projects with a stub model ("Core types to be added") | ~20 |
| Projects with a substantive but stale model | ~10 |
| Total components needing work | ~61 |
Projects missing entirely:
ores.codegen, ores.compute.api, ores.hpp, ores.http, ores.http.core,
ores.iam.api, ores.iam.client, ores.lisp, ores.marketdata.api,
ores.marketdata.core, ores.qt.admin, ores.qt.analytics, ores.qt.api,
ores.qt.compute, ores.qt.data_transfer, ores.qt.mktdata, ores.qt.party,
ores.qt.refdata, ores.qt.scheduler, ores.qt.trading, ores.qt.workflow,
ores.refdata.api, ores.reporting.api, ores.reporting.core,
ores.scheduler.core, ores.trading.api, ores.workflow.api.
Projects with empty .puml:
ores.assets.api, ores.scheduler.api, ores.synthetic.api,
ores.variability.api.
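The counts above could be reproduced with a small classifier along these lines. This is only a sketch: the function names, the rule that component directories start with "ores.", and the treatment of a whitespace-only file as "empty" are assumptions, and the sketch cannot tell a stale substantive model from a current one.

```python
from pathlib import Path

# Marker text quoted in the table above for stub models.
STUB_MARKER = "Core types to be added"


def classify(project_dir: Path) -> str:
    """Return 'missing', 'empty', 'stub' or 'substantive' for one project."""
    pumls = sorted((project_dir / "modeling").glob("*.puml"))
    if not pumls:
        return "missing"
    text = "\n".join(p.read_text(encoding="utf-8") for p in pumls)
    if not text.strip():
        return "empty"  # assumption: "empty placeholder" means no content
    if STUB_MARKER in text:
        return "stub"
    return "substantive"


def tally(projects_root: Path) -> dict:
    """Count projects per category under the given projects/ directory."""
    counts = {"missing": 0, "empty": 0, "stub": 0, "substantive": 0}
    for child in sorted(projects_root.iterdir()):
        # Assumption: only directories named ores.* are components.
        if child.is_dir() and child.name.startswith("ores."):
            counts[classify(child)] += 1
    return counts
```

Calling tally(Path("projects")) from the repository root would then return one count per category.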
No generation script
The only automation is projects/ores.codegen/plantuml_er_generate.sh, which
generates the SQL schema ER diagram from ores.sql definitions. There is no
equivalent tool for component class diagrams. Each .puml was hand-authored,
which makes the diagrams expensive to create and easy to forget to update.
Goals
- Create a script (build/scripts/generate_component_puml.py) that reads the C++ headers of each component and emits a skeleton PlantUML class diagram.
- Run the script to regenerate all existing stubs and create files for the 31 components that currently have none (27 with no .puml and 4 with an empty placeholder).
- Manually enrich the generated diagrams for the API and domain-heavy components (relationships, layout hints, notes).
- Keep the script in-tree so future model refreshes are a single command, not a multi-day manual exercise.
Non-goals
- Full relationship extraction from C++ (only fields and class membership are scraped; relationships are added by hand in the enrichment phase).
- Rendering PNG files as part of CI — PNG generation is a local-tooling concern.
- Updating the top-level projects/modeling/ores.puml system diagram in this plan (tracked separately).
Approach
Why a script
At 61+ components with dozens of domain types each, hand-authoring is not sustainable. The pattern is formulaic:
- Walk projects/<name>/include/ for *.hpp files.
- Parse namespace declarations and struct/class definitions with their public fields and types.
- Emit a @startuml file following the project's established conventions (set namespaceSeparator ::, namespace ores #F2F2F2 {...}, class blocks with #F7E5FF fill, the standard GPL header).
Enrichment (notes, relationships, layout hints) stays hand-authored and is preserved in a dedicated section the script never touches.
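The emit step could look roughly like the sketch below. ClassModel and render_auto_section are hypothetical names, the GPL header is replaced by a placeholder comment, and nested namespaces are left out; only the directives quoted above (set namespaceSeparator ::, namespace ores #F2F2F2, #F7E5FF class fill) come from this plan.

```python
from dataclasses import dataclass, field


@dataclass
class ClassModel:
    """Hypothetical in-memory form of one scraped struct or class."""
    name: str
    fields: list = field(default_factory=list)  # list of (type, name) pairs


def render_auto_section(component: str, classes: list) -> str:
    """Render the auto-generated part of one component diagram."""
    out = [
        "@startuml",
        "' (standard GPL header would go here; omitted in this sketch)",
        f"' component: {component}",
        "set namespaceSeparator ::",
        "",
        "namespace ores #F2F2F2 {",
    ]
    # Sorting keeps the output stable, so a second run yields the same text.
    for cls in sorted(classes, key=lambda c: c.name):
        out.append(f"    class {cls.name} #F7E5FF {{")
        for type_name, field_name in cls.fields:
            out.append(f"        +{type_name} {field_name}")
        out.append("    }")
    out.append("}")
    return "\n".join(out) + "\n"
```

The closing @enduml is deliberately not emitted here, on the assumption that it lives after the manual section.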
Script design
File: build/scripts/generate_component_puml.py
usage: generate_component_puml.py [--project <name>] [--all] [--dry-run]
- Input: projects/<name>/include/ directory tree.
- Output: projects/<name>/modeling/<name>.puml.
- Parse strategy: regex-based pass over header files; does not require a full C++ parser. Targets struct and class blocks at namespace scope, extracts public field declarations. Template specialisations and anonymous types are skipped.
- Idempotent: if a .puml already exists, the script regenerates only the auto-generated section (between @startuml and the first ' --- manual sentinel) and leaves everything after the sentinel untouched.
- Missing modeling/ dir: created automatically.
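A minimal sketch of the regex-based parse strategy follows. The patterns, the function names and the sample header are all illustrative assumptions, not the script's actual rules; multi-line declarations and many other C++ forms are not handled.

```python
import re

NAMESPACE_RE = re.compile(r"\bnamespace\s+([\w:]+)\s*\{")
# The optional "enum" group lets us recognise and skip "enum class".
CLASS_RE = re.compile(
    r"(?:\b(enum)\s+)?\b(struct|class)\s+(\w+)\s*(?:final\s*)?(?::[^{;]*)?\{")
ACCESS_RE = re.compile(r"^\s*(public|protected|private)\s*:\s*$")
FIELD_RE = re.compile(
    r"^\s*([\w:<>,\s*&]+?)\s+(\w+)\s*(?:=[^;]*|\{[^}]*\})?;\s*$")
SKIP_PREFIXES = ("using ", "typedef ", "friend ", "static ")


def _body(text: str, start: int) -> str:
    """Return the text between an opening brace (just before start) and its match."""
    depth = 1
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start:i]
    return text[start:]


def parse_header(text: str):
    """Return (first namespace, {class name: [(type, field name), ...]})."""
    ns = NAMESPACE_RE.search(text)
    classes = {}
    for m in CLASS_RE.finditer(text):
        if m.group(1):  # enum class / enum struct
            continue
        kind, name = m.group(2), m.group(3)
        access = "public" if kind == "struct" else "private"
        depth, fields = 0, []
        for raw in _body(text, m.end()).splitlines():
            line = raw.split("//", 1)[0]
            at_top = depth == 0
            depth += line.count("{") - line.count("}")
            if not at_top:
                continue  # inside a nested type or inline function body
            acc = ACCESS_RE.match(line)
            if acc:
                access = acc.group(1)
                continue
            if access != "public" or "(" in line:
                continue  # non-public member or a function declaration
            if line.lstrip().startswith(SKIP_PREFIXES):
                continue
            fm = FIELD_RE.match(line)
            if fm:
                fields.append((" ".join(fm.group(1).split()), fm.group(2)))
        classes[name] = fields
    return (ns.group(1) if ns else ""), classes


# Hypothetical header content, for illustration only.
SAMPLE = """
namespace ores::iam::domain {
struct account final {
    std::string username;
    std::optional<std::string> email;
    int version = 0;
    bool is_valid() const;
};
class session {
public:
    std::string token;
private:
    int secret_;
};
enum class status { active, locked };
}
"""
```

Because the class pattern requires an opening brace right after the name (or after a base-class list), forward declarations and specialisations such as struct foo<int> are skipped without special handling.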
Sentinel convention
' --- manual: everything below this line is hand-authored; the script preserves it ---
Anything before the sentinel is regenerated. Anything after is preserved verbatim. New files start with an empty manual section so enrichment can be added incrementally.
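The sentinel handling could be sketched as below. merge_with_manual is a hypothetical name, and two details are assumptions rather than decisions made in this plan: the closing @enduml is kept after the sentinel, and an existing file without a sentinel is treated as having no manual section (so its content is replaced).

```python
from typing import Optional

SENTINEL = ("' --- manual: everything below this line is hand-authored; "
            "the script preserves it ---")
SENTINEL_PREFIX = "' --- manual"
# Assumption: a new file's manual section holds only the closing directive.
DEFAULT_MANUAL = "@enduml\n"


def merge_with_manual(generated: str, existing: Optional[str]) -> str:
    """Combine freshly generated text with the preserved manual section."""
    manual = DEFAULT_MANUAL
    if existing is not None:
        lines = existing.splitlines(keepends=True)
        for i, line in enumerate(lines):
            # Only the first sentinel counts; everything after it is kept verbatim.
            if line.startswith(SENTINEL_PREFIX):
                manual = "".join(lines[i + 1:])
                break
    return generated.rstrip("\n") + "\n\n" + SENTINEL + "\n" + manual
```

Feeding the function its own output with the same generated text returns an identical string, which is the property the idempotency requirement relies on.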
Phase Plan
Phase 1 — Script
Write and test build/scripts/generate_component_puml.py.
Acceptance:
- --dry-run --all prints a diff of what would change without writing files.
- --project ores.iam.core regenerates the auto-generated section of that component's .puml without touching the existing hand-authored notes.
- Running the script twice produces no diff (idempotent).
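One way to satisfy the dry-run and idempotency criteria is to compute a unified diff before deciding whether to write. The helper below is a sketch with a hypothetical name, apply_or_diff; it uses only difflib and pathlib from the standard library.

```python
import difflib
from pathlib import Path


def apply_or_diff(path: Path, new_text: str, dry_run: bool) -> str:
    """Return a unified diff for path; write new_text unless dry_run is set."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=str(path),
        tofile=str(path) + " (generated)",
    ))
    if diff and not dry_run:
        # Covers the "missing modeling/ dir" case as well.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(new_text, encoding="utf-8")
    return diff
```

An empty return value means the file is already up to date, so a second run over unchanged headers writes nothing and prints no diff.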
Phase 2 — Bulk generation
Run generate_component_puml.py --all to:
- Create modeling/*.puml for the 27 projects that have none.
- Populate the 4 empty placeholder files.
- Regenerate the ~20 stub files.
- Refresh the auto-generated section of the ~10 substantive existing models.
Commit the result as a single "bulk regeneration" commit.
Phase 3 — Enrichment (API and domain components)
Hand-enrich the diagrams for the components where relationships and layout matter most for documentation purposes. Priority order:
- Trading domain: ores.trading.api (trade sub-structs, instrument variant, protocol types).
- IAM: ores.iam.api / ores.iam.core (account, session, party graph).
- Refdata: ores.refdata.api / ores.refdata.core (party hierarchy, country).
- Workflow: ores.workflow.api / ores.workflow.core (job, saga, step FSM).
- Scheduler: ores.scheduler.api / ores.scheduler.core (job definition, instance, lifecycle).
- Reporting: ores.reporting.api / ores.reporting.core (report definition, instance, execution status).
- Qt plugin layer: ores.qt.trading, ores.qt.compute, ores.qt.scheduler (controller/window pairs, form registry).
Each enrichment is a separate commit per component. The script is never re-run after enrichment without checking that the manual section is intact.
Phase 4 — PNG render script (optional)
Add build/scripts/render_puml.sh that calls plantuml on every
projects/*/modeling/*.puml in one pass. Useful for a local documentation
build. Not wired into CI.
Files
| File | Action |
|---|---|
| build/scripts/generate_component_puml.py | New: generation script |
| build/scripts/render_puml.sh | New (Phase 4, optional): bulk PNG renderer |
| projects/*/modeling/*.puml | Created or updated: ~61 components |
Effort and Risk
| Item | Effort | Risk |
|---|---|---|
| Phase 1: script | M | Low — regex parsing of well-formed C++ headers |
| Phase 2: bulk generation | S | Low — script-driven, one commit |
| Phase 3: enrichment | L | Low — each commit is independent |
| Phase 4: render script | S | None |
The main risk is the regex parser missing complex template or multi-line declarations. These are logged as warnings and left as stubs; they do not block the bulk generation commit.