Sprint Backlog 16
Sprint Mission
- TBD.
Stories
Active
| Tags | Headline | Time | % |
|---|---|---|---|
| | Total time | 0:00 | 100.0 |
TODO Distributed tracing: OpenTelemetry identifiers across Qt and NATS analysis
Background
OreStudio currently uses a single Nats-Correlation-Id header (UUID v4) that
is generated at the workflow service entry point and propagated to downstream
NATS calls. This lets you grep logs for a single workflow invocation, but it
does not answer two important questions:
- Which user session originated this request? A single session may trigger dozens of top-level requests (bundle list, bundle install, party provision, bootstrap clear); there is no way to group them under one session identity.
- Which specific operation within the session is this? Each Qt wizard step is a distinct user action; the current ID is generated server-side so the client cannot know it before the response arrives.
Furthermore, the relationship between the Qt client subsystem and the NATS
microservices subsystem is opaque: a log line in ores.refdata has no link
back to the Qt button click that ultimately caused it.
OpenTelemetry trace model
The W3C traceparent standard defines the identifiers involved at every hop (four fields travel on the wire; the span-id is generated locally by each service):
| Field | Size | Generated by | Meaning |
|---|---|---|---|
| version | 1 byte | fixed 00 | Format version |
| trace-id | 16 bytes | originating client | One ID for the entire top-level user action |
| parent-id | 8 bytes | caller of this hop | The span ID of the caller, linking parent↔child |
| span-id | 8 bytes | this service | Unique ID for this service's handling of the call |
| flags | 1 byte | originating client | Sampling decision |
Wire format: traceparent: 00-<trace-id>-<parent-id>-<flags>
Example call tree for workflow.v1.parties.provision:
Qt client trace=aabbcc span=0001 parent=0000 (root span, generated by Qt)
→ workflow trace=aabbcc span=0002 parent=0001
→ refdata.save trace=aabbcc span=0003 parent=0002
→ iam.save trace=aabbcc span=0004 parent=0002
→ iam.add-party trace=aabbcc span=0005 parent=0002
This allows reconstruction of the exact call tree from logs alone.
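The per-hop logic can be sketched in plain C++. This is a minimal illustration only, assuming headers are carried as strings; the helper names (make_root_traceparent, make_child_traceparent) are hypothetical, not existing OreStudio code:

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <random>
#include <sstream>
#include <string>

// Hypothetical helper: n random bytes rendered as lowercase hex.
std::string random_hex_id(std::size_t bytes) {
    static std::mt19937_64 rng{std::random_device{}()};
    std::uniform_int_distribution<int> byte(0, 255);
    std::ostringstream os;
    for (std::size_t i = 0; i < bytes; ++i)
        os << std::hex << std::setw(2) << std::setfill('0') << byte(rng);
    return os.str();
}

// Root traceparent, as the Qt client would mint it at the point of user
// action: fresh trace-id (16 bytes), fresh span-id (8 bytes), sampled flag.
std::string make_root_traceparent() {
    return "00-" + random_hex_id(16) + "-" + random_hex_id(8) + "-01";
}

// Child traceparent for the next outbound hop: keep the trace-id and flags,
// replace the parent-id with this service's freshly minted span-id.
std::string make_child_traceparent(const std::string& incoming) {
    const std::string trace_id = incoming.substr(3, 32);       // after "00-"
    const std::string flags = incoming.substr(incoming.size() - 2);
    return "00-" + trace_id + "-" + random_hex_id(8) + "-" + flags;
}
```

Each handler would log the incoming parent-id (the caller's span) before forwarding the updated header, which is exactly what makes the call tree reconstructable from logs.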
What needs to change
Two subsystems need to be joined:
- Qt client — generates the root trace-id and the first span-id at the point of user action (button click / wizard Next). Passes both in every outgoing NATS message. Currently generates nothing.
- NATS microservices — each handler extracts traceparent, records the parent-id (caller's span), generates a new span-id for itself, and forwards an updated traceparent (same trace-id, its own span-id as new parent-id) on every outbound NATS call.
In addition, a session ID (separate from trace/span) should be generated once on login and attached to every request for the lifetime of that session, enabling queries like "show all requests from user alice during session X".
Proposed NATS header set:
| Header | Meaning | Generated by |
|---|---|---|
| Nats-Correlation-Id | Current per-operation ID (keep for compat) | First handler |
| traceparent | W3C OpenTelemetry trace-id + span-id + parent | Qt client / handler |
| Nats-Session-Id | UUID for the user's login session | Qt client on login |
Scope of this story
This story covers analysis and design only. Output: a plan document under
doc/plans/ that specifies:
- Which headers to carry on every NATS message.
- How the Qt ClientManager generates and propagates traceparent and Nats-Session-Id.
- How handler_helpers.hpp is extended so log_handler_entry extracts, logs, and forwards all three headers in one call.
- Whether to retain Nats-Correlation-Id (backward compat) or replace it.
- What a Nats-Session-Id should be tied to (login token? UI session UUID?).
- The migration path: handlers currently log correlation_id; how to move to traceparent without breaking existing log grep patterns.
- Whether to integrate with an external OpenTelemetry collector (Jaeger / Tempo) or keep tracing in-process via log aggregation.
Implementation can follow as a separate story once the design is agreed.
- Tasks
  - [ ] Research W3C traceparent/tracestate spec and NATS header limits
  - [ ] Identify all Qt entry points that should generate a root span (wizard steps, dialog confirms, background refresh calls)
  - [ ] Identify all NATS handler hops that need span generation and forwarding
  - [ ] Define the session ID lifecycle: created on login, invalidated on logout, stored in ClientManager
  - [ ] Decide: replace Nats-Correlation-Id or keep alongside traceparent
  - [ ] Decide: structured log fields vs free-text grep-friendly format
  - [ ] Write doc/plans/ document with the agreed design and migration steps
TODO Three-level provisioning: end-to-end testing code
End-to-end test of the complete three-level provisioning flow implemented
across PRs #582, #611, #614, #619, and the correlation ID / summary page
work on feature/three-level-provisioning-e2e. Branch:
feature/three-level-provisioning-e2e.
- Tasks
  - [ ] Provision a fresh tenant from scratch (recreate DB, start services)
  - [ ] Log in as system admin; verify TenantProvisioningWizard fires
  - [ ] Step through bundle selection and install; verify DQ bundle published
  - [ ] Step through PartyProvisionPage; verify party + account created in DB with status = 'Inactive'
  - [ ] Verify correlation ID appears on the summary page and matches ores_workflow_workflow_instances_tbl.correlation_id
  - [ ] Verify bootstrap flag cleared; wizard does not reappear on next login
  - [ ] Log in as the provisioned party admin; verify PartyProvisioningWizard fires
  - [ ] Complete party wizard; verify party status set to Active in DB
  - [ ] Verify PartyProvisioningWizard does not reappear on subsequent logins
  - [ ] Verify compensation: force failure in each of the 3 workflow steps and confirm rolled-back state in DB for party, account, account_party
  - [ ] Verify Nats-Correlation-Id header propagated to refdata.v1.parties.save and iam.v1.accounts.save (check server logs)
TODO Provisioned accounts: force password reset on first login code
When workflow.v1.parties.provision creates an account it should set
password_reset_required = true so the party admin is forced to change the
initial password on first login. See Phase 4 in
plan.
- Tasks
  - [ ] Add password_reset_required column to ores_iam_accounts_tbl (or reuse existing field if present)
  - [ ] provision_parties_workflow: set flag when creating account
  - [ ] auth_handler.hpp: return password_reset_required in login_response if account flag is set, and reject login with a specific error code
  - [ ] Qt: show password-change dialog on login when flag is set
  - [ ] Clear flag on successful password change
TODO Multi-select LEI picker for PartyProvisionPage code
LeiEntityPicker currently supports single selection. Extend to multi-select
so the tenant admin can pick the full GLEIF hierarchy (root + subsidiaries)
in one pass, creating one provision_party_input entry per selected LEI.
See Phase 4 in
plan.
- Tasks
  - [ ] Extend LeiEntityPicker to support multi-select mode
  - [ ] PartyProvisionPage: iterate selected LEIs, build one input row per LEI
  - [ ] Derive principal per party (username_base + "_" + short_code)
  - [ ] Show per-party rows in the page with optional credential override fields
  - [ ] Summary page: list all provisioned usernames and their party names
TODO Async workflow progress for large party hierarchies code
For tenants with more than ~20 parties the synchronous provision-parties
endpoint will time out. Add an async path: return workflow_id immediately
and poll workflow.v1.status from a progress page. See Phase 4 in
plan.
- Tasks
  - [ ] Add workflow.v1.status NATS subject + request/response types
  - [ ] workflow_handler: implement status query by workflow ID
  - [ ] Add async variant of provision_parties_response (returns workflow_id only)
  - [ ] Qt: add BundleInstallPage-style async progress page in TenantProvisioningWizard with polling timer
  - [ ] Threshold: use async path when parties.size() > 5 (configurable)
TODO IAM/Refdata service boundary cleanup code
ores.iam.core currently crosses the service boundary in two places. These
are pre-existing violations noted in the plan and must be fixed to ensure
correct RLS enforcement and clean service ownership. See "Known pre-existing
violations" in
plan.
- Tasks
  - [ ] bootstrap_handler.hpp: replace direct ores_refdata_parties_tbl write with refdata.v1.parties.save NATS call
  - [ ] auth_handler.hpp: replace direct ores_refdata_parties_tbl query (auth_lookup_party) with refdata.v1.parties.get-by-principal NATS call (add endpoint to ores.refdata if missing)
  - [ ] Verify RLS policies still enforced end-to-end after refactor
  - [ ] Remove cross-schema table includes from ores.iam.core CMake deps
TODO DQ/Refdata service boundary cleanup code
DQ bundle publication currently writes directly to ores_refdata_* tables.
This must be routed via refdata.v1.* NATS endpoints. See "Open Questions"
in plan.
- Tasks
  - [ ] Identify all direct ores_refdata_* writes inside ores.dq publication pipeline
  - [ ] Add any missing refdata.v1.* NATS endpoints needed by DQ
  - [ ] Rewrite DQ publication to use NATS calls instead of direct DB writes
  - [ ] Verify bundle publish end-to-end after refactor
  - [ ] Remove cross-schema table includes from ores.dq CMake deps
TODO Extend ores.workflow: trade-expiry workflow code
First financial workflow in ores.workflow: expire a trade and cascade to
positions, P&L reporting and scheduler cleanup. See Phase 5 in
plan.
- Tasks
  - [ ] Add workflow.v1.trade-expiry NATS subject + request/response types
  - [ ] Implement trade_expiry_workflow executor (4 steps + compensation)
    - Step 1: trading.v1.trades.expire
    - Step 2: risk.v1.positions.update
    - Step 3: reporting.v1.runs.trigger-pnl
    - Step 4: scheduler.v1.jobs.remove
  - [ ] Register in workflow_handler and registrar.cpp
  - [ ] Qt: trigger from trade blotter context menu ("Expire trade")
  - [ ] Integration test: verify all 4 steps and compensation
TODO Extend ores.workflow: barrier-event workflow code
Second financial workflow: apply a knock-in/out barrier event to a trade and cascade to Greeks recomputation and reporting. See Phase 5 in plan.
- Tasks
  - [ ] Add workflow.v1.barrier-event NATS subject + request/response types
  - [ ] Implement barrier_event_workflow executor (3 steps + compensation)
    - Step 1: trading.v1.trades.apply-barrier-event
    - Step 2: risk.v1.greeks.recompute
    - Step 3: reporting.v1.runs.trigger
  - [ ] Register in workflow_handler and registrar.cpp
  - [ ] Qt: trigger from trade detail dialog when barrier condition is met
  - [ ] Integration test: verify all 3 steps and compensation
TODO Positions domain model code
Implement the positions domain model. A position aggregates the net exposure for a given instrument and book combination, derived from the trade blotter. See plan for context.
Covers long/short positions across all instrument families, backed by one new
temporal table: ores_trading_positions_tbl (book_id, instrument_id,
trade_type_code, quantity, notional, currency, as_of_date, and standard
temporal/audit fields).
- Tasks
  - [ ] SQL: ores_trading_positions_tbl + notify trigger + drop files
  - [ ] SQL: register in trading_create.sql, drop_trading.sql
  - [ ] Domain: position struct, JSON I/O, table I/O, protocol messages
  - [ ] Repository: position entity, mapper, repository
  - [ ] Service: position_service
  - [ ] Server: messaging handler + registrar registration
  - [ ] Qt UI: ClientPositionModel, PositionMdiWindow, PositionDetailDialog, PositionHistoryDialog, PositionController, MainWindow integration
  - [ ] Database: recreate to pick up new table
TODO Source report definitions from DQ instead of hardcoded C++ code
Background
Report definitions in PartyProvisioningWizard are currently hardcoded as a
constexpr std::array in C++ (PartyProvisioningWizard.cpp, ~line 427). This
is architecturally inconsistent with how all other seedable reference data is
handled in OreStudio, which uses the DQ artefact pipeline:
- A staging table (dq_*_artefact_tbl) holds the source data
- An artefact type entry in ores_dq_artefact_types_tbl maps staging → target via a publish function
- A dataset (e.g. ore.report_definitions) references the staging table
- Bundles (organisation, or a new ore_analytics bundle) group datasets
- The publish function copies approved rows into the target table
Business units, portfolios, and books already follow this pattern. Report definitions do not, causing two problems:
- Off-by-one fragility: the array was declared std::array<ReportEntry, 28> with only 27 entries, producing a zero-initialised trailing entry with name = "". This triggered a DB check constraint violation during party provisioning (fixed as a stopgap by changing the array size to 27).
- Non-evolvable: adding, renaming, or adjusting a report definition requires a C++ recompile and new release. There is no way to update defaults via data tooling or deliver them as part of a bundle update.
Target architecture
SQL seed data (populate script)
→ ores_dq_report_definitions_artefact_tbl (staging)
→ ores_dq_report_definitions_publish_fn (publish function)
→ ores_reporting_report_definitions_tbl (target, party-scoped)
The PartyProvisioningWizard loads candidate definitions by querying the DQ
artefact table for the selected bundle, presents them with checkboxes (same UX
as today), and on confirmation calls the existing
reporting.v1.report-definitions.save endpoint for each selected entry. The
hardcoded k_default_reports array is deleted.
Scope
This story covers the full stack end-to-end: SQL schema, DQ pipeline, seed data, and Qt wizard refactor.
- Tasks
  - [ ] SQL: create ores_dq_report_definitions_artefact_tbl with columns name, description, report_type, schedule_expression, concurrency_policy, display_order plus standard DQ audit/temporal fields; add to dq_create.sql and drop_dq.sql
  - [ ] SQL: write ores_dq_report_definitions_publish_fn that inserts approved artefact rows into ores_reporting_report_definitions_tbl scoped to the calling party (mirrors ores_dq_books_publish_fn pattern)
  - [ ] SQL: register the new artefact type report_definitions in dq_artefact_types_populate.sql pointing at the staging table, target table, and publish function
  - [ ] SQL: create seed populate script populate/reporting/reporting_report_definitions_populate.sql with the 27 standard ORE analytics definitions (migrate from k_default_reports) and wire it into the reporting populate orchestrator
  - [ ] SQL: add a new bundle ore_analytics (or extend organisation) in dq_dataset_bundle_populate.sql and register the ore.report_definitions dataset as a member
  - [ ] API: add get_report_definition_templates_request/response to ores.reporting.api (or reuse the DQ artefact list endpoint) so the Qt client can fetch candidates without a direct DB query
  - [ ] Qt: refactor PartyReportSetupPage to load report definition candidates from the API call above on initializePage() instead of iterating k_default_reports; remove the constexpr array entirely
  - [ ] Qt: handle async load in PartyReportSetupPage: show a spinner while fetching, populate the list widget on success, show an error label on failure
  - [ ] SQL: recreate database to pick up new tables and seed data; verify 27 artefact rows present in staging table after populate run
  - [ ] End-to-end test: run party provisioning wizard, confirm report definitions created in ores_reporting_report_definitions_tbl match the selected artefacts, confirm no check constraint violations
DONE Rename instrument_family to product_type across trading domain code
The field instrument_family in ores_trading_trades_tbl is a DB routing
discriminator — it tells the system which product-specific extension table to
use, not which asset class the trade belongs to. The name "family" implies a
risk taxonomy, which caused it to be confused with asset_class.
FpML and the ISDA CDM both use product terminology: the abstract base type
is Product; the structural classification is productType. Renaming to
product_type makes the field's role explicit and aligns with industry-standard
language.
This is a pure rename — values (swap, fx, bond, credit, equity,
commodity, composite, scripted) are unchanged. No behaviour changes.
Scope: SQL DDL, C++ domain structs, NATS message types, trade handler, repository mapper, Qt trade window.
- Tasks
  - [X] Rename PG enum type: instrument_family_t → product_type_t in trading_instrument_family_type_create.sql; rename file to trading_product_type_create.sql
  - [X] Rename column and index in trading_trades_create.sql
  - [X] Update trading_trades_functions_create.sql and trading_trades_bu_functions_create.sql
  - [X] Update trading_create.sql \ir include for renamed file
  - [X] Rename field in ores.trading.api/domain/trade.hpp
  - [X] Rename field in ores.trading.api/messaging/instrument_protocol.hpp
  - [X] Rename field in ores.trading.core/include/.../trade_entity.hpp
  - [X] Update trade_mapper.cpp, trade_repository.cpp, instrument_handler.hpp, trade_handler.hpp
  - [X] Update TradeMdiWindow.cpp, TradeController.cpp, ImportTradeDialog.cpp, importer.cpp, xml_trade_import_tests.cpp
  - [X] Update plan document and sprint backlog to reflect product_type decision
IN-PROGRESS Unify asset class modelling across trading and market data analysis
Background
Asset classes are currently modelled independently — and inconsistently — in four places, with no shared source of truth:
- Trading domain (ores.trading): instrument_family_t is a hard-coded PostgreSQL CREATE TYPE … AS ENUM with eight values (swap, fx, bond, credit, equity, commodity, composite, scripted). Extending it requires a DDL ALTER TYPE migration and a C++ recompile. File: projects/ores.sql/create/trading/trading_instrument_family_type_create.sql
- Market data domain (ores.marketdata): asset_class is a hard-coded C++ enum (fx, rates, credit, equity, commodity, inflation, bond, cross_asset) serialised as an unconstrained TEXT column in ores_marketdata_series_tbl. No database-level validation exists; any string is accepted. Files: projects/ores.marketdata.api/include/ores.marketdata.api/domain/asset_class.hpp, projects/ores.marketdata.core/src/repository/market_series_mapper.cpp
- Refdata domain (ores_refdata_asset_classes_tbl): a proper bitemporal reference table already exists, complete with a validation function (ores_refdata_validate_asset_class_fn) and a DQ publish pipeline. However it is seeded with FpML PascalCase codes ("Commodity", "ForeignExchange", etc.) — a different namespace from both the trading enum and the market data enum — so it cannot serve as a shared source of truth without first seeding the ORE codes. Files: projects/ores.sql/create/refdata/refdata_asset_classes_create.sql, projects/ores.sql/populate/fpml/fpml_asset_class_artefact_populate.sql
- Qt client (ores.qt): ClientMarketSeriesModel hard-codes the eight display labels in C++; MarketSeriesMdiWindow hard-codes filter combo entries. Any new asset class requires changes in at least four files across two subsystems. Files: projects/ores.qt/src/ClientMarketSeriesModel.cpp, projects/ores.qt/src/MarketSeriesMdiWindow.cpp
Complete reconciliation audit
The following table captures every representation of the asset class concept currently in the codebase:
| ORE concept | C++ enum | DB stored as | FpML code | Qt label |
|---|---|---|---|---|
| FX | asset_class::fx | "fx" | ForeignExchange | "FX" |
| Rates | asset_class::rates | "rates" | InterestRate | "Rates" |
| Credit | asset_class::credit | "credit" | Credit | "Credit" |
| Equity | asset_class::equity | "equity" | Equity | "Equity" |
| Commodity | asset_class::commodity | "commodity" | Commodity | "Commodity" |
| Inflation | asset_class::inflation | "inflation" | (not seeded) | "Inflation" |
| Bond | asset_class::bond | "bond" | (not seeded) | "Bond" |
| Cross Asset | asset_class::cross_asset | "cross_asset" | (not seeded) | "Cross Asset" |
Trading-only instrument_family_t values that have no asset class counterpart:
| DB ENUM value | Meaning |
|---|---|
| swap | Instrument structure / routing discriminator |
| composite | Instrument architecture |
| scripted | Instrument architecture |
Key finding: instrument_family and asset_class are two orthogonal concepts
Two distinct concepts must be kept separate:
- Asset class (asset_class) — risk taxonomy: answers "what underlying economic risk does this expose you to?" Applies equally to market data series and to trades. Corresponds to FpML's assetClass element and the CDM's top-level product category. Values: fx, rates, credit, equity, commodity, inflation, bond, cross_asset.
- Instrument type (instrument_family, to be renamed product_type) — product structure and DB routing discriminator: answers "what kind of financial instrument is this structurally?" Applies only to trades. Corresponds to FpML's productType and CDM's concrete product type. Values: swap, fx, bond, credit, equity, commodity, composite, scripted.
The concepts are many-to-many: a swap (instrument type) can be rates,
credit, equity, inflation, or fx (asset class) depending on the
specific trade. composite and scripted have no single implied asset class
at all. The name overlap (fx, bond, credit, equity, commodity appear
in both) is coincidental: in the trading schema they identify DB extension
tables and routing paths; they are not risk buckets.
The goal is not to end up with a single field; it is to give each concept a precise, non-overlapping definition and ensure both are present and validated where applicable. After this work:
- Market data series: has asset_class, no product_type.
- Trades: has both product_type (routing) AND asset_class (risk).
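The two orthogonal fields can be sketched as follows. This is illustrative only — the real definitions live in ores.trading.api and the database (where asset_class will be a validated TEXT column, not a C++ enum); the struct shape here is an assumption:

```cpp
#include <cassert>

// Structural routing discriminator: which extension table / code path.
enum class product_type { swap, fx, bond, credit, equity, commodity,
                          composite, scripted };

// Risk taxonomy: which underlying economic risk the trade exposes.
enum class asset_class { fx, rates, credit, equity, commodity, inflation,
                         bond, cross_asset };

// Illustrative trade carrying both classifications as separate fields.
struct trade {
    product_type product;  // routing
    asset_class  risk;     // risk bucket
};

// Many-to-many in action: the same product_type (swap) carries different
// asset classes depending on the specific trade.
constexpr trade inflation_swap{product_type::swap, asset_class::inflation};
constexpr trade rates_swap{product_type::swap, asset_class::rates};
```

Note the deliberate name overlap (fx appears in both enums) mirrors the coincidental overlap described above; keeping the types distinct prevents the two concepts from being conflated in code.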
See doc/plans/2026-04-01-asset-class-unification.org for full design.
Infrastructure that already exists
The refdata infrastructure was clearly designed for exactly this unification:
- ores_refdata_asset_classes_tbl — bitemporal table with code + coding_scheme_code PK, allowing multiple namespaces (ORE codes and FpML codes) to coexist.
- ores_refdata_validate_asset_class_fn(tenant_id, value) — validation function, fully implemented but not called from the market data insert trigger.
- ores_dq_asset_classes_publish_fn — DQ publish function that copies approved artefact rows into the refdata table.
- Notify trigger on ores_refdata_asset_classes_tbl — already wired for event-driven cache invalidation.
All that is missing is: (a) seeding the ORE codes into the DQ artefact table,
(b) calling the validation function from the market_series insert trigger,
and (c) a NATS endpoint so the Qt client can fetch the list dynamically.
Why this matters
- No single source of truth. The same concept is expressed four different ways (C++ enum, DB text, FpML string, Qt label) with no enforcement linking them.
- No database enforcement. The TEXT column in market_series accepts any string; correctness depends entirely on C++ serialisation code.
- FpML seeding is incomplete. inflation, bond, and cross_asset have no FpML counterpart seeded; they would fail validation if the function were called today.
- Hard to extend. Adding a new asset class requires DDL, a C++ enum change, serialiser changes, and Qt label updates with no data-driven path.
- Client/UI fragility. Combo entries are duplicated string literals that can silently diverge from server-side values.
Target architecture
ores_refdata_asset_classes_tbl              ← single source of truth
  coding_scheme_code = 'ORE'                   8 ORE codes (lowercase snake_case)
  coding_scheme_code = 'FPML_ASSET_CLASS'      8 FpML codes (PascalCase)

ores_marketdata_series_tbl
  asset_class TEXT       ← validated by ores_refdata_validate_asset_class_fn
  series_subclass TEXT   ← lightweight check constraint (no refdata table)

ores_trading_trades_tbl
  product_type product_type_t   (renamed from instrument_family)
  asset_class TEXT              (new column, same validation fn)

NATS: refdata.v1.asset-classes.list → Qt loads combo dynamically; no hardcoded labels remain
Decisions
- ORE codes are lowercase snake_case (fx, rates, cross_asset) — consistent with all other ORE codes in the system.
- instrument_family is renamed product_type everywhere: PG enum type, column name, C++ field name. Values are unchanged.
- series_subclass is validated via a DB check constraint (not a refdata table); it is too tightly coupled to the ORE key structure to be worth a full DQ pipeline.
- Full design and implementation phases in doc/plans/2026-04-01-asset-class-unification.org.
- Tasks
  - [X] Audit all places where asset class / instrument family values are hard-coded: SQL enums, C++ enums, serialisers, Qt combo entries, seed data (done — see reconciliation table above)
  - [X] Establish that instrument_family and asset_class are different concepts and should not be merged; define both precisely against FpML/CDM (asset_class = risk taxonomy, product_type = structural routing discriminator)
  - [X] Confirm that ores_refdata_asset_classes_tbl + validation function already provide the required infrastructure
  - [X] Decide canonical ORE code set string format: lowercase snake_case, consistent with all other ORE codes
  - [X] Decide instrument_family → product_type rename (values unchanged)
  - [X] Assess series_subclass: DB check constraint sufficient, no refdata table
  - [X] Write plan document: doc/plans/2026-04-01-asset-class-unification.org
  - [ ] Phase 1: Rename instrument_family → product_type throughout (DDL, C++, NATS messages, Qt)
  - [ ] Phase 2: Seed ORE codes + missing FpML codes into artefact table; publish to refdata
  - [ ] Phase 3: Enforce asset_class validation in marketdata_series trigger; add series_subclass check constraint
  - [ ] Phase 4: Add asset_class column to ores_trading_trades_tbl + C++ + NATS messages + Qt trades view
  - [ ] Phase 5: refdata.v1.asset-classes.list NATS endpoint + Qt data-driven combo
BLOCKED Add missing party isolation RLS policies code
Several tables have a party_id column and tenant-level RLS enabled, but are
missing the required AS RESTRICTIVE party isolation policy. These were
identified by the RLS_002 check in validate_schemas.sh and added to
validation_ignore.txt as TODOs. Until the policies are added, a tenant admin
can query rows belonging to any party in the tenant.
For each table the fix is the same pattern: add an AS RESTRICTIVE FOR SELECT
policy using party_id = ANY(ores_iam_visible_party_ids_fn()) in the
corresponding *_rls_policies_create.sql file, then remove the RLS_002
ignore entry from validation_ignore.txt.
Also covers hardening the trading instrument subtables (currently accessed
only via the parent ores_trading_instruments_tbl which has RLS, but direct
queries bypass it).
- Tasks
  - [X] ores_iam_account_parties_tbl: add AS RESTRICTIVE party isolation policy in iam_rls_policies_create.sql
  - [X] ores_refdata_business_units_tbl: add AS RESTRICTIVE party isolation policy in refdata_rls_policies_create.sql
  - [X] ores_refdata_party_contact_informations_tbl: add AS RESTRICTIVE party isolation policy in refdata_rls_policies_create.sql
  - [X] ores_refdata_party_countries_tbl: add AS RESTRICTIVE party isolation policy in refdata_rls_policies_create.sql
  - [X] ores_refdata_party_currencies_tbl: add AS RESTRICTIVE party isolation policy in refdata_rls_policies_create.sql
  - [X] ores_refdata_party_identifiers_tbl: add AS RESTRICTIVE party isolation policy in refdata_rls_policies_create.sql
  - [X] ores_scheduler_job_instances_tbl: add AS RESTRICTIVE party isolation policy in scheduler_rls_policies_create.sql
  - [X] Trading instrument subtables: add ENABLE ROW LEVEL SECURITY + tenant isolation policies for ores_trading_instruments_tbl and all subtables (bond, commodity, equity, fx, credit, scripted, composite, composite_legs, swap_legs) in trading_rls_policies_create.sql
  - [X] For each fixed table: remove the corresponding RLS_001/RLS_002 ignore entry from utility/validation_ignore.txt
  - [X] Run validate_schemas.sh and confirm zero warnings (0 warnings)
TODO Automate new service registration infra
Background
When ores.marketdata.service was added to the running system, the following
manual steps were required across multiple files, with no single checklist and
no tooling to enforce completeness. Several were discovered only at runtime
(missing NATS cert, missing DB user, missing IAM role, missing IAM
permissions), requiring iterative recreate_database runs.
Manual steps identified (in discovery order)
1. build/scripts/generate_nats_certs.sh — add service name to SERVICES array so that an mTLS client certificate is generated. Without this the service crashes immediately on NATS connect with Connection Closed.
2. projects/ores.codegen/models/services/ores_services_service_registry.json — add service entry (name, psql_var, env_key, iam_role, dml_prefixes, select_tables). Then re-run the service-registry code-gen profile to regenerate five files:
   - projects/ores.sql/service_vars.sh
   - projects/ores.sql/create/iam/service_users_create.sql
   - projects/ores.sql/create/iam/iam_service_db_grants_create.sql
   - projects/ores.sql/populate/iam/iam_service_accounts_populate.sql
   - projects/ores.sql/populate/iam/iam_service_account_roles_populate.sql
3. projects/ores.sql/teardown_all.sql — add drop role if exists <env>_<service>_service; entry (not covered by code-gen).
4. projects/ores.sql/populate/iam/iam_permissions_populate.sql — register all domain permissions (<service>::<resource>:<action> and <service>::*). These must exist before any role can reference them.
5. projects/ores.sql/populate/iam/iam_roles_populate.sql — create the IAM role and assign permissions. Fails at runtime if permissions from step 4 are absent.
6. .env — add ORES_<SERVICE>_SERVICE_DB_USER and ORES_<SERVICE>_SERVICE_DB_PASSWORD variables (currently done by init-environment.sh, but only if the service is known to that script).
Proposed improvements
- Extend generate_nats_certs.sh to derive its service list from service_vars.sh (SERVICE_NAMES array) rather than a hard-coded SERVICES array, so adding to the registry automatically covers cert generation.
- Extend the service-registry code-gen profile (or add a new template) to also emit the IAM permissions and IAM role seed SQL for each service, driven by the dml_prefixes and select_prefixes fields already in the registry. This would cover steps 4 and 5 automatically.
- Extend the service-registry code-gen profile to emit the teardown_all.sql fragment for each service (or generate a separate teardown_services.sql that is included), covering step 3.
- Add a new-service checklist to CLAUDE.md or a dedicated doc/how-to/add-a-new-service.md covering all manual steps, so that until full automation is in place nothing is missed.
- Add a validation step to recreate_database.sh (or validate_schemas.sh) that cross-checks: every service in service_vars.sh has a NATS cert, an IAM role, and at least one registered permission.
- Tasks
  - [ ] Drive generate_nats_certs.sh from SERVICE_NAMES in service_vars.sh instead of a hard-coded array
  - [ ] Add service-registry code-gen template for IAM permissions seed SQL
  - [ ] Add service-registry code-gen template for IAM role seed SQL
  - [ ] Add service-registry code-gen template (or include fragment) for teardown_all.sql service role drops
  - [ ] Add cross-check to recreate_database.sh / validate_schemas.sh: every service has NATS cert + IAM role + ≥1 permission
  - [ ] Document the remaining manual steps in doc/how-to/add-a-new-service.md as an interim checklist
TODO Party wizard UX improvements code
Follow-on UX polish for the three-level provisioning wizard, deferred from the three-level provisioning plan (Phase 4).
- Tasks
  - [ ] Force password reset: add password_reset_required = true flag to accounts created by provision-parties; auth_handler.hpp already handles this flag
  - [ ] Multi-select LEI picker: extend LeiEntityPicker from single- to multi-select so PartyProvisionPage can select a full GLEIF hierarchy in one pass
  - [ ] Per-party credential override: allow each PartyProvisionPage entry to override the shared username/password with per-party credentials
  - [ ] Async wizard progress: switch provision-parties to async for hierarchies > 20 parties — return workflow_id immediately and poll workflow.v1.status from the wizard progress page
TODO Financial workflows: trade-expiry and barrier-event code
Extend ores.workflow with the first two financial workflow types, deferred
from the three-level provisioning plan (Phase 5).
- Tasks
- [ ] Implement `trade-expiry` workflow executor:
  - Step 1: `trading.v1.trades.expire` `{ trade_id }`
  - Step 2: `risk.v1.positions.update` `{ trade_id }`
  - Step 3: `reporting.v1.runs.trigger-pnl` `{ party_id, date }`
  - Step 4: `scheduler.v1.jobs.remove` `{ trade_id }`
  - Compensation: `trading.v1.trades.reinstate` on Step 1 failure
- [ ] Implement `barrier-event` workflow executor:
  - Step 1: `trading.v1.trades.apply-barrier-event` `{ trade_id, event_type }`
  - Step 2: `risk.v1.greeks.recompute` `{ trade_id }`
  - Step 3: `reporting.v1.runs.trigger` `{ party_id }`
- [ ] Register both executors in workflow service wiring
- [ ] Integration tests for both workflows (happy path + compensation)
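The compensation semantics above (undo completed steps when a later one fails) can be sketched as a generic saga-style step runner. This is a minimal sketch only: `workflow_step` and `run_with_compensation` are illustrative names, not the actual ores.workflow executor API.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Illustrative saga-style step: a NATS subject, an action, and an
// optional compensating action that undoes it if a later step fails.
struct workflow_step {
    std::string subject;              // e.g. "trading.v1.trades.expire"
    std::function<bool()> action;     // returns false on failure
    std::function<void()> compensate; // empty when the step has no undo
};

// Runs steps in order. On failure, runs the compensations of the already
// completed steps in reverse order and returns the failing subject;
// returns std::nullopt on success.
inline std::optional<std::string>
run_with_compensation(const std::vector<workflow_step>& steps) {
    std::vector<const workflow_step*> done;
    for (const auto& s : steps) {
        if (!s.action()) {
            for (auto it = done.rbegin(); it != done.rend(); ++it)
                if ((*it)->compensate) (*it)->compensate();
            return s.subject;
        }
        done.push_back(&s);
    }
    return std::nullopt;
}
```

For `trade-expiry`, Step 1 would carry `trading.v1.trades.reinstate` as its compensation, so a failure in Steps 2–4 reinstates the trade.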
TODO Qt: instrument creation in TradeDetailDialog code
When a trade is created in TradeDetailDialog, the instrument tabs are hidden
and no instrument can be entered. This is because instruments are currently only
loaded asynchronously from an existing server record. The standalone per-family
instrument dialogs (which previously allowed direct instrument creation) have
been removed as part of the unified-dialog plan.
The fix is to reveal the instrument tabs in create mode once the user selects a
`product_type` (via the trade type field or a new family selector combo), and to
send a `save_*_instrument_request` for the new instrument before the
`save_trade_request` on first save.
Affects all instrument families: FX (PR 1), Swap/Rates (PR 2), Bond + Credit (PR 3), Equity + Commodity (PR 4), Composite (PR 5), Scripted (PR 6).
- Tasks
- [ ] Add a "Product Type" selector (`QComboBox`) to the General tab that maps to `trade.product_type`; pre-populate with all known families
- [ ] In create mode: show the relevant instrument tabs when a product type is selected (hide them when none is selected)
- [ ] On save in create mode: if instrument tabs are visible, send the appropriate `save_*_instrument_request` first, then use the returned instrument ID to populate `trade.instrument_id` before sending `save_trade_request`
- [ ] Handle save failure for the new instrument (surface the error, do not save the trade)
- [ ] Apply to all six instrument families as each PR is merged
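The create-mode save ordering can be sketched as a small decision function. Everything here is hypothetical: `trade_draft` and `save_in_create_mode` are sketch names, and the two callables stand in for the real `save_*_instrument_request` / `save_trade_request` round-trips.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical model of the dialog state relevant to first save.
struct trade_draft {
    std::optional<std::string> product_type;  // set via the new selector
    std::optional<std::string> instrument_id; // filled from the instrument save
};

// Returns true when the trade was saved. If the instrument tabs are
// visible (a product type is selected), the instrument is saved first and
// the returned ID is wired into the trade before it is sent. An
// instrument save failure aborts the whole operation: the trade is never
// sent, matching the "do not save the trade" task above.
inline bool save_in_create_mode(
    trade_draft& t,
    const std::function<std::optional<std::string>()>& send_instrument,
    const std::function<bool(const trade_draft&)>& send_trade) {
    if (t.product_type) {
        auto id = send_instrument();
        if (!id) return false; // surface error, skip trade save
        t.instrument_id = *id;
    }
    return send_trade(t);
}
```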
TODO NATS-based health for all services: HTTP, WT and compute wrapper code
Currently ores.http.server, ores.wt.service, and ores.compute.wrapper
do not publish NATS heartbeats, so the Service Dashboard can only show them
as "Running" (process-alive via the controller) rather than "Online"
(confirmed healthy via NATS telemetry). All services should be first-class
NATS participants so that a single, uniform health signal is used across the
board.
- Background
The three services that are not NATS-based:
- `ores.http.server` — REST gateway; no NATS messaging today.
- `ores.wt.service` — Wt web-UI server; no NATS messaging today.
- `ores.compute.wrapper` — grid worker node; communicates directly with the compute service over NATS work/result subjects, but does not publish a standard heartbeat and has no service identity in the NATS namespace.
- Scope
- `ores.http.server`: add NATS connectivity and publish a standard `service_heartbeat_message` every 15 s via `heartbeat_publisher`. Register the service in the NATS subject namespace under `ores.http.server`.
- `ores.wt.service`: same as above — connect to NATS and publish heartbeats.
- `ores.compute.wrapper`: register each replica as a named NATS participant. Publish heartbeats including the replica index so the controller can track each worker individually. The existing compute work/result subjects are unaffected.
- Tasks
- [ ] `ores.http.server`: add NATS client + `heartbeat_publisher` (mirrors the pattern in all domain services)
- [ ] `ores.wt.service`: same
- [ ] `ores.compute.wrapper`: add NATS heartbeat per replica; include replica index and host-id in heartbeat payload
- [ ] Update Service Dashboard: remove the "Running" fallback path once all services send heartbeats; "Online" becomes the single source of truth
- [ ] Update `controller_service_definitions_populate.sql`: remove custom args_templates for http/wt/wrapper once they no longer need to diverge from the default
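The per-replica heartbeat payload could look roughly like this. This is a sketch under stated assumptions: a JSON-over-NATS `service_heartbeat_message` with the field names below is assumed, not taken from the real protocol headers.

```cpp
#include <sstream>
#include <string>

// Assumed heartbeat payload shape; field names are illustrative.
struct heartbeat {
    std::string service;   // e.g. "ores.compute.wrapper"
    std::string host_id;
    int replica_index = 0; // distinguishes wrapper replicas on one host
};

// Subject the controller would watch, one per replica,
// e.g. "ores.compute.wrapper.2.heartbeat".
inline std::string heartbeat_subject(const heartbeat& h) {
    std::ostringstream os;
    os << h.service << "." << h.replica_index << ".heartbeat";
    return os.str();
}

// Minimal hand-rolled JSON serialisation for the sketch.
inline std::string to_json(const heartbeat& h) {
    std::ostringstream os;
    os << "{\"service\":\"" << h.service
       << "\",\"host_id\":\"" << h.host_id
       << "\",\"replica_index\":" << h.replica_index << "}";
    return os.str();
}
```

A real `heartbeat_publisher` would publish this on a 15 s timer; single-replica services such as `ores.http.server` and `ores.wt.service` would simply fix `replica_index` at 0.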
TODO Trade-aware market data filtering for report execution code
The report execution workflow currently fetches all market data series for the tenant when gathering market data (Phase 3.5). This is correct but wasteful: a portfolio of interest rate swaps only needs yield curves and fixings, not FX vol surfaces or commodity curves.
Implement trade-aware market data derivation: given the trade portfolio gathered in step 0, determine which market data series are actually required for the ORE computation. This requires inspecting trade types, currencies, underliers, and the analytics flags in risk_report_config to build a precise list of required series types, currencies, and tenors.
- Background
The report execution pipeline gathers data in stages. Step 0 (gather_trades) produces a MsgPack blob in object storage containing all trades and instruments in scope. Step 1 (gather_market_data) currently fetches all tenant market data because we lack the logic to derive requirements from the trade portfolio.
ORE (Open Source Risk Engine) requires specific market data for each trade type:
- Interest rate swaps: discount curves, forecast curves, fixings
- FX forwards/options: FX spot rates, FX vol surfaces
- Credit default swaps: default probability curves, recovery rates
- Equity options: equity spot prices, equity vol surfaces
- Commodities: commodity price curves
The mapping from trade type → required market data is ORE-domain knowledge that belongs in
`ores.ore` (not in `ores.reporting`).
- Tasks
- [ ] Design a `market_data_requirements` struct that captures the set of required series types, currency pairs, and tenors
- [ ] Implement `derive_market_data_requirements(trades, risk_report_config)` in `ores.ore` that inspects the trade portfolio and analytics flags
- [ ] Add a NATS request/reply API for market data derivation so the reporting service can call it without pulling in ORE domain logic
- [ ] Update the `gather_market_data` step to use derived requirements instead of fetching everything
- [ ] Add unit tests with representative trade portfolios (IR swaps, FX, credit, equity, commodity) verifying correct market data derivation
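The trade-type → market-data mapping from the Background section can be sketched directly. The struct and function names follow the task list, but this is only an illustration: the real derivation would also use currencies, underliers, and the `risk_report_config` analytics flags, and the trade-type strings here are assumptions.

```cpp
#include <set>
#include <string>
#include <vector>

// Sketch of the requirements set; the real struct would also carry
// currency pairs and tenors per the task list.
struct market_data_requirements {
    std::set<std::string> series_types;
};

// Maps each trade type in the portfolio to the series it needs,
// following the Background section's table.
inline market_data_requirements
derive_market_data_requirements(const std::vector<std::string>& trade_types) {
    market_data_requirements r;
    for (const auto& t : trade_types) {
        if (t == "ir_swap") {
            r.series_types.insert({"discount_curve", "forecast_curve", "fixings"});
        } else if (t == "fx_forward" || t == "fx_option") {
            r.series_types.insert({"fx_spot", "fx_vol_surface"});
        } else if (t == "cds") {
            r.series_types.insert({"default_curve", "recovery_rate"});
        } else if (t == "equity_option") {
            r.series_types.insert({"equity_spot", "equity_vol_surface"});
        } else if (t == "commodity") {
            r.series_types.insert({"commodity_curve"});
        }
    }
    return r;
}
```

An all-IR-swap portfolio would thus pull only curves and fixings, never FX vol surfaces — exactly the waste the story targets.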
TODO Scheduler: complete Qt UI and promote to top-level menu code
Background
The scheduler backend (ores.scheduler.*) is fully implemented: job_definition,
job_instance, job_status, cron_expression, cron_scheduler, and
scheduler_loop are all present. On the Qt side only JobDefinitionController
exists, and it is wired into Reporting > &Job Definitions.
This is architecturally wrong on two counts:
- Wrong menu placement: scheduling is a cross-domain concern. It will drive report runs, trade expiry, housekeeping jobs, data archival, cache invalidation, and any future scheduled workflow. Placing it under Reporting implies it only exists to schedule reports.
- Missing UI: `job_instance` (the execution record) has no Qt controller, window, or detail dialog. Users can define schedules but cannot see what ran, when, whether it succeeded, or inspect its output.
Target state
A dedicated top-level &Scheduler menu replaces the Reporting > &Job Definitions
entry:
&Scheduler
├── &Job Definitions ← move from Reporting > Job Definitions
├── &Job Instances ← new: execution history per job
├── ─────
└── &Monitor ← new: live view of running/queued jobs and
next-fire times (analogous to Service Dashboard)
Job Definitions and Job Instances follow the same controller/MDI window/detail
dialog/history dialog pattern used throughout the application.
Scheduler Monitor is a singleton MDI window (similar to ServiceDashboardMdiWindow)
showing: currently running jobs, queue depth, next scheduled fire time per job,
and last execution status per job. Auto-refreshes on a configurable timer.
What already exists
| Layer | Exists | Missing |
|---|---|---|
| Backend API | `job_definition`, `job_instance`, `job_status`, `cron_expression` domain types + JSON/table I/O | |
| Backend service | `cron_scheduler`, `scheduler_loop`, `job_definition_service`, `job_definition_handler`, `job_instance_repository` | |
| NATS protocol | `scheduler_protocol.hpp` | `list_job_instances` endpoint, `get_scheduler_status` endpoint |
| Qt | `JobDefinitionController`, `JobDefinitionMdiWindow`, `JobDefinitionDetailDialog`, `JobDefinitionHistoryDialog` | `JobInstanceController`, `JobInstanceMdiWindow`, `JobInstanceDetailDialog`, `SchedulerMonitorMdiWindow`, `SchedulerPlugin` (new plugin) |
Scope
This story covers the full stack: any missing NATS endpoints, all missing Qt
widgets, relocation of JobDefinitionController from ComputePlugin to the new
SchedulerPlugin, and the new &Scheduler top-level menu.
- Tasks
- [ ] NATS: add `scheduler.v1.job-instances.list` request/response to `ores.scheduler.api/messaging/scheduler_protocol.hpp`; implement handler in `ores.scheduler.core`
- [ ] NATS: add `scheduler.v1.status` request/response (queue depth, running count, next-fire map per job); implement handler
- [ ] Qt: `ClientJobInstanceModel` — table model wrapping the `job_instance` list
- [ ] Qt: `JobInstanceMdiWindow` — list view of instances with status badges, start time, duration, exit code
- [ ] Qt: `JobInstanceDetailDialog` — detail view of a single instance (cron expression, trigger time, log excerpt if available)
- [ ] Qt: `SchedulerMonitorMdiWindow` — singleton live view; auto-refresh timer; shows per job: last run status, next fire time, running count
- [ ] Qt: `SchedulerPlugin` — new plugin in `ores.qt.compute`; owns `JobDefinitionController`, `JobInstanceController`, `SchedulerMonitorController`; creates the `&Scheduler` menu; contributes no toolbar actions (scheduler is not a daily-ops toolbar item)
- [ ] Qt: remove `Job Definitions` from `ComputePlugin::create_menus` (`Reporting` menu); delete the entry from `Reporting`
- [ ] Qt: register `SchedulerPlugin` in `PluginRegistry` in the correct position so `&Scheduler` appears between `&Reporting` and `&System`
- [ ] Update `doc/analysis/qt-menu-analysis.org` to reflect the new structure
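The proposed `scheduler.v1.status` reply could take roughly this shape. Field names here are assumptions drawn from the task list (queue depth, running count, next-fire map per job), not the contents of the real `scheduler_protocol.hpp`.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Assumed shape of a scheduler.v1.status reply.
struct scheduler_status_response {
    std::uint32_t queue_depth = 0;
    std::uint32_t running_count = 0;
    // job_definition id -> next scheduled fire time (epoch seconds)
    std::map<std::string, std::int64_t> next_fire;
};

// What SchedulerMonitorMdiWindow would render per job on each refresh
// tick; returns nullopt for jobs with no pending schedule.
inline std::optional<std::int64_t>
next_fire_for(const scheduler_status_response& s, const std::string& job_id) {
    auto it = s.next_fire.find(job_id);
    if (it == s.next_fire.end()) return std::nullopt;
    return it->second;
}
```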
Footer
| Previous: Sprint Backlog 15 | Next: TBD |