ORE Instrument Mappers — Phase 9

Overview

Phase 9 adds forward and reverse instrument mappers to ores.ore for each product family established in Phases 0–8, and puts in place the testing infrastructure that will govern correctness of ORE XML round-trips going forward.

The existing trade_mapper in ores.ore already maps ORE XML to ores::trading::domain::trade (the envelope). It handles metadata only: external_id, trade_type, netting_set_id. Phase 9 extends this to cover instrument economics — the product-specific data stored in swap_instrument, fx_instrument, bond_instrument, etc.

Motivation

Grid job valuation uses ORE XML as the execution format. ORES stores trades in a relational domain model. For ORES to produce correct valuations it must be able to reconstruct faithful ORE XML from its own domain types. Any missing field is a silent valuation error: the grid job will run and produce a number, but the number will be wrong.

This means completeness and fidelity of mappers are hard dependencies, not aspirational quality attributes.

What We Are Testing — Three Distinct Requirements

Three separate testing concerns emerged from analysis. They are related but independent and must not be conflated.

Thing 1 — ores.ore Serialization Fidelity (Golden File Regression)

Question: When ores.ore reads ORE XML and writes it back, does the output stay identical across code changes?

Method:

  1. Read ORE XML using ores.ore XSD-generated domain types.
  2. Serialize back to XML (canonical form — fixed element order, stable date and float formatting, no optional whitespace variation).
  3. Diff against a committed golden file in assets/test_data/golden_dataset/.
  4. Any diff fails the test.
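
When a golden diff fails, it helps to know where the two files diverge. The following is a minimal sketch and not part of ores.ore: first_differing_line is a hypothetical helper that locates the first line at which the serializer output and the golden file differ. The pass/fail decision should still rest on a whole-string comparison, because this helper does not notice a difference that consists only of a missing final newline.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper for failure messages.
// Returns the 1-based number of the first line that differs, or 0 if the two
// texts are identical when compared line by line.
std::size_t first_differing_line(const std::string& expected,
                                 const std::string& actual) {
    std::istringstream expected_in(expected);
    std::istringstream actual_in(actual);
    std::string expected_line;
    std::string actual_line;
    std::size_t line_number = 0;
    for (;;) {
        const bool has_expected =
            static_cast<bool>(std::getline(expected_in, expected_line));
        const bool has_actual =
            static_cast<bool>(std::getline(actual_in, actual_line));
        ++line_number;
        if (!has_expected && !has_actual)
            return 0;  // both texts ended without a mismatch
        if (has_expected != has_actual || expected_line != actual_line)
            return line_number;  // one text is shorter, or the lines differ
    }
}
```

A test could call this only after the byte-for-byte comparison has failed and include the returned line number in the failure message.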

Owner: ores.ore — this is purely a test of the XSD-generated serialization layer. It does not involve ORES domain types or mappers.

Key requirement: Serialization must be deterministic. The serializer must produce byte-identical output for the same input so that the golden diff is meaningful; the reverse-mapper path in Thing 3 relies on the same property. This means: fixed element ordering (matching the XSD sequence), ISO 8601 dates (YYYY-MM-DD), fixed decimal precision for rates and notionals, and no locale-sensitive number formatting.

Thing 2 — ORE Coverage Completeness (Python Gap Check)

Question: Does our ORES golden dataset cover all fields and trade types present in the upstream ORE example library?

Method: A Python script scripts/ore_coverage_check.py compares assets/test_data/golden_dataset/ against external/ore/examples/Products/Example_Trades/. It reports:

  • Trade types in external/ not present in golden_dataset/
  • Fields present in external/ example XML not present in the corresponding ORES-generated golden file
  • XSD validation errors: each golden file is validated against the ORE XSD schemas (external/ore/xsd/) using lxml; any schema violation is reported as a gap (and treated as a blocking error once the family is mandatory)
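
The field-gap part of the report reduces to a set difference once each XML file has been flattened to a set of element paths. The script itself is Python; the sketch below expresses only that set logic, in C++ to match the other code in this plan. The function name missing_fields and the path strings are assumptions made for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical: returns the element paths that occur in the upstream example
// but not in the corresponding golden file, in sorted order.
std::vector<std::string> missing_fields(
    const std::set<std::string>& upstream_paths,
    const std::set<std::string>& golden_paths) {
    std::vector<std::string> missing;
    // std::set iterates in sorted order, which std::set_difference requires.
    std::set_difference(upstream_paths.begin(), upstream_paths.end(),
                        golden_paths.begin(), golden_paths.end(),
                        std::back_inserter(missing));
    return missing;
}
```

An empty result for every file in a product family is the condition under which that family has no field gaps.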

Progression:

  • Initially non-blocking: the script runs as a GitHub Action step and produces a coverage report. Build passes regardless.
  • Becomes mandatory (blocking) once all gaps reach zero.
  • The blocking threshold is per product family — a family can be made mandatory once its own gaps are resolved, before the full inventory is complete.

Owner: Cross-component — the script compares against external/ore/examples/ and generates a report. It does not own any component's code.

Key insight: This is separate from Things 1 and 3. Even if round-trips are perfectly stable (Thing 1) and mappers lose no data (Thing 3) for the files we do cover, coverage can still be near zero if most trade types have no golden file or mapper yet. Coverage completeness is about breadth; round-trip fidelity is about depth.

Thing 3 — Mapper Fidelity (No Data Loss)

Question: When ORES maps ORE XML → ORES domain types and back → ORE XML, are all fields preserved?

Method:

  1. Read ORE XML using ores.ore XSD types (forward parse — same as Thing 1 step 1).
  2. Map ORE XSD types → ORES relational domain types using the forward mapper (ores::ore::domain::trade_mapper extended to call instrument mappers).
  3. Map ORES domain types → ORE XSD types using the reverse mapper (new in Phase 9).
  4. Serialize the reconstructed ORE type to XML.
  5. Diff against the original golden file (which was generated by Thing 1 step 2 from the same source).

If there is a diff, a field was lost or corrupted somewhere in steps 2–3: either the forward mapper dropped it or the reverse mapper failed to restore it. This is a mapper fidelity failure.
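
The check can be illustrated with toy types. This is a sketch under stated assumptions: ore_swap and domain_swap are stand-ins invented here, not the XSD-generated or ores::trading types, and the element names in serialize are simplified. It shows how a forward mapper that forgets one field surfaces as a diff after the round-trip.

```cpp
#include <string>

// Toy stand-in for an XSD-generated ORE type (hypothetical).
struct ore_swap {
    std::string currency;
    std::string notional;  // kept as text to leave number formatting aside
    std::string day_counter;
};

// Toy stand-in for an ORES relational domain type (hypothetical).
struct domain_swap {
    std::string currency;
    std::string notional;
    std::string day_counter;
};

// Simplified canonical serializer with a fixed element order.
std::string serialize(const ore_swap& s) {
    return "<SwapData><Currency>" + s.currency + "</Currency><Notional>" +
           s.notional + "</Notional><DayCounter>" + s.day_counter +
           "</DayCounter></SwapData>";
}

// Complete forward mapper: copies every field.
domain_swap forward_map(const ore_swap& s) {
    return {s.currency, s.notional, s.day_counter};
}

// Deliberately incomplete forward mapper: drops day_counter.
domain_swap forward_map_lossy(const ore_swap& s) {
    return {s.currency, s.notional, ""};
}

ore_swap reverse_map(const domain_swap& d) {
    return {d.currency, d.notional, d.day_counter};
}

// True when forward mapping followed by reverse mapping reproduces the
// canonical XML of the original.
bool round_trip_is_lossless(const ore_swap& original,
                            domain_swap (*forward)(const ore_swap&)) {
    return serialize(reverse_map(forward(original))) == serialize(original);
}
```

With an input such as {"EUR", "10000000", "A360"}, the complete mapper passes the check and the lossy one fails it, which is the failure mode Thing 3 exists to catch.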

Owner: Per component — ores.trading owns mapper fidelity for trade economics; ores.market_data will own it for curves, vol surfaces, fixings.

Key insight: Thing 3 depends on Thing 1 (golden files must exist first). But Thing 1 does not depend on Thing 3. The right order is: implement Thing 1 first, generate golden files, then implement mappers and verify with Thing 3.

Golden Dataset

Location and Structure

assets/test_data/golden_dataset/
  Products/
    Example_Trades/
      IR_Swap_Vanilla.xml
      IR_FRA.xml
      IRFX_Cross_Currency_Swap_rebalancing.xml
      IR_Cap_on_IBOR.xml
      ...

Structure mirrors external/ore/examples/ exactly. When ores.market_data adds market data golden files (curves, fixings, etc.), they go into the same top-level structure under their respective subdirectory (Input/, etc.).

How Golden Files Are Created

Golden files are generated by ORES, not copied from external/ore/examples/. The workflow is:

  1. Read the corresponding ORE example XML using ores.ore XSD types.
  2. Serialize back to canonical XML using the ores.ore serializer.
  3. Write the result to assets/test_data/golden_dataset/.
  4. Commit the file to git.

This means the golden file represents what ORES currently produces for a given input. It is the baseline. If the code changes and the output changes, the test fails and the developer must either fix the regression or update the golden file with intent.
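
The create-or-compare behaviour can be sketched as a small helper. This is an illustration only: check_golden and golden_result are hypothetical names, and returning a distinct created result (instead of silently passing) is a design choice assumed here so that a newly written baseline is visible and can be reviewed before it is committed.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

enum class golden_result { created, matched, mismatched };

// Hypothetical helper: writes the golden file if it does not exist yet,
// otherwise compares the canonical XML against it byte for byte.
golden_result check_golden(const std::filesystem::path& golden_path,
                           const std::string& canonical_xml) {
    namespace fs = std::filesystem;
    if (!fs::exists(golden_path)) {
        if (golden_path.has_parent_path())
            fs::create_directories(golden_path.parent_path());
        // Binary mode avoids newline translation on any platform.
        std::ofstream out(golden_path, std::ios::binary);
        out << canonical_xml;
        return golden_result::created;
    }
    std::ifstream in(golden_path, std::ios::binary);
    const std::string on_disk((std::istreambuf_iterator<char>(in)),
                              std::istreambuf_iterator<char>());
    return on_disk == canonical_xml ? golden_result::matched
                                    : golden_result::mismatched;
}
```

A test would treat matched as a pass and mismatched as a failure; how to treat created (pass with a warning, or fail in CI) is left open here.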

Why Not Copy ORE Example Files Directly

ORE example XML files are written by humans for readability. They may have inconsistent whitespace, optional elements absent, attribute ordering that differs from schema sequence, etc. ORES's serializer will produce canonical output that may differ in formatting even when semantically identical. We need the golden file to match what ORES produces, not what ORE upstream wrote.

Implementation Plan

Batch Strategy

Each batch covers one product family. Within a batch:

  1. Write Thing 1 test harness (XSD round-trip) for the family's example files.
  2. Generate and commit golden files for the family.
  3. Write forward mapper (ORE XSD → ORES domain types) for the family.
  4. Write reverse mapper (ORES domain types → ORE XSD) for the family.
  5. Write Thing 3 mapper fidelity tests.
  6. Run Thing 2 Python script; document gaps.

Definition of done per batch:

  • Thing 1: all golden diffs are zero.
  • Thing 3: no fields lost in mapper round-trip.
  • Thing 2: gap report produced; blocking enabled for this family once its gaps reach zero.

Batch 0+1 — Infrastructure + Rates Core (merged)

Batch 0 was originally planned as infrastructure only. It is merged with Batch 1 (Rates core) because there is no value in infrastructure without at least one product family to exercise it. The infrastructure is verified by Batch 1 passing.

ORE example files:

File                                            ORE Type
IR_Swap_Vanilla.xml                             Swap
IR_OIS_Swap.xml                                 Swap (OIS)
IR_Swap_Amortising.xml                          Swap
IR_Swap_Amortising_Notionals.xml                Swap
IR_Swap_Amortising_Amortizations.xml            Swap
IR_Swap_Amortising_Amortizations_Accretion.xml  Swap
IR_Swap_CMS.xml                                 Swap (CMS leg)
IR_Swap_BMA.xml                                 Swap (BMA)
IR_Swap_CNY-REPOFIX.xml                         Swap
IR_Swap_Custom_Fixings.xml                      Swap
IR_Swap_zero_coupon_fixed_leg.xml               Swap
IR_Swap_zero_coupon_floating_leg.xml            Swap
IR_Basis_Swap_Single_Currency.xml               Swap
IRFX_Cross_Currency_Swap_rebalancing.xml        CrossCurrencySwap
IRFX_Cross_Currency_Swap_non_rebalancing.xml    CrossCurrencySwap
IRFX_Cross_Currency_Swap_NDIRS.xml              CrossCurrencySwap
IR_FRA.xml                                      FRA
IR_Cap_on_IBOR.xml                              CapFloor
IR_Cap_on_IBOR_amortising.xml                   CapFloor
IR_Collar_on_IBOR.xml                           CapFloor
IR_Cap_on_IBOR_MXNTIIE.xml                      CapFloor
IR_CapFloor_on_FedFunds_amortising.xml          CapFloor
IR_Cap_on_CMS.xml                               CapFloor (CMS cap)
IR_OIS_capped_floored.xml                       CapFloor (OIS)

Infrastructure delivered:

  • assets/test_data/golden_dataset/ directory with Products/Example_Trades/ scaffold.
  • scripts/ore_coverage_check.py — Python gap check script.
  • .github/workflows/ore_coverage.yml — non-blocking GitHub Action.
  • Thing 1 test fixture in projects/ores.ore/tests/ (parametric over file list).
  • Thing 3 test fixture in projects/ores.ore/tests/ (mapper round-trip).

Mappers delivered:

  • ores::ore::domain::swap_instrument_mapper — forward + reverse for Swap and FRA.
  • ores::ore::domain::capfloor_instrument_mapper — forward + reverse for CapFloor.
  • ores::ore::domain::ccs_instrument_mapper — forward + reverse for CrossCurrencySwap.
  • Extend trade_mapper to call the appropriate instrument mapper after mapping the envelope.

Batch 2 — Rates Options (Swaption, Inflation)

File                                       ORE Type
IR_Swaption_European.xml                   Swaption
IR_Swaption_Bermudan.xml                   Swaption
IR_Callable_Swap_Bermudan.xml              CallableSwap
IR_DurAdjusted_CMS.xml                     Swap (DurAdjusted)
IR_Swap_CMS_Spread_Option.xml              CMS Spread Option
IR_Swap_Digital_CMS_Spread_Option.xml      CMS Spread Option
IR_Flexi_Swap.xml                          FlexiSwap
IR_BalanceGuaranteedSwap.xml               BalanceGuaranteedSwap
IR_Swap_Capped_Amortising_.xml             Swap
IR_Swap_NDIRS_nonXCCY.xml                  Swap
Inflation_CPI_Swap.xml                     InflationSwap
Inflation_CPI_Swap_ZeroCoupon.xml          InflationSwap
Inflation_CPI_Swap_fixed_non_infl_leg.xml  InflationSwap
Inflation_CPI_Cap.xml                      InflationCapFloor
Inflation_CPI_Floor.xml                    InflationCapFloor
Inflation_CPI_Cap_Leg.xml                  InflationCapFloor
Inflation_YoY_Cap.xml                      YoYInflationCapFloor
Inflation_YoY_Floor_AUCPI.xml              YoYInflationCapFloor
Inflation_ZC_Cap_AUCPI.xml                 YoYInflationCapFloor

Batch 3 — FX

File                                ORE Type
FX_Swap.xml                         FxSwap
FX_Variance_Swap.xml                FxVarianceSwap
(others from Products/ FX_* files)  FxForward, FxOption, etc.

Full list from ls external/ore/examples/Products/Example_Trades/FX_*.

Batch 4 — Fixed Income

File                                    ORE Type
Cash_Bonds.xml                          Bond
Cash_BondRepo_and_Bond.xml              BondRepo
Cash_ConvertibleBond.xml                ConvertibleBond
Cash_Ascot.xml                          ASCOT
BondOption_StrikePrice_StrikeYield.xml  BondOption
(others from Cash_* and Bond* files)

Batch 5 — Credit

File                                                   ORE Type
Credit_Default_Swap.xml                                CreditDefaultSwap
Credit_Index_Credit_Default_Swap.xml                   IndexCDS
Credit_Index_Credit_Default_Swap_Bespoke_Basket.xml    IndexCDS
Credit_BondOption.xml                                  CreditBondOption
Credit_Bond_Forward.xml                                CreditBondForward
Credit_CreditLinkedSwap.xml                            CreditLinkedSwap
Credit_RiskParticipationAgreement_on_Vanilla_Swap.xml  RPA
Credit_RiskParticipationAgreement_on_CallableSwap.xml  RPA
Credit_Synthetic_CDO_refdata.xml                       SyntheticCDO
Credit_Bond_TRS.xml                                    TotalReturnSwap
Credit_Bond_TRS_with_Indexings.xml                     TotalReturnSwap
Credit_Index_CDS_Option.xml                            IndexCDSOption

Batch 6 — Equity

Full list from ls external/ore/examples/Products/Example_Trades/Equity_*.

Batch 7 — Commodity

Full list from ls external/ore/examples/Products/Example_Trades/Commodity_*.

Batch 8 — Hybrid and Composite

File                             ORE Type
Hybrid_CompositeTrade.xml        CompositeTrade
Hybrid_GenericTRS_with_Bond.xml  TotalReturnSwap
Hybrid_CFD.xml                   CFD
(other Hybrid_* files)

Key Design Decisions

Where the Golden Dataset Lives

Placed at assets/test_data/golden_dataset/ rather than inside projects/ores.ore/ because it is cross-component. Future components (ores.market_data, etc.) will add market data golden files to the same tree. Keeping it outside any single project directory avoids coupling one component's test directory to another component's data.

Python Script for Thing 2 vs C++ Test

ORE XML contains fields in many different formats and namespaces. A C++ test comparing XML field presence would require parsing and deeply traversing two XML trees in a component-agnostic way — effectively reimplementing a general-purpose XML differ. Python with lxml or ElementTree is better suited for this structural analysis. The script is a build-time check, not a runtime component, so there is no language consistency constraint.

Non-blocking Initially

Making the coverage check blocking from day one would break CI immediately because we start at near-zero coverage. The plan is:

  1. Non-blocking phase: script runs, report is visible in CI output.
  2. Per-family blocking: each family's check becomes mandatory once its gaps reach zero (enforced by the script, controlled by a config list).
  3. Global mandatory: once all families have zero gaps, the entire check blocks.
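
The three stages share one decision rule: the build fails only when a family on the mandatory list still has open gaps. The check itself lives in the Python script; the sketch below states the rule in C++ to stay consistent with the other code in this plan. The function name, the family names, and the map/set representation of the config list are assumptions.

```cpp
#include <map>
#include <set>
#include <string>

// gaps:      family name -> number of open gaps reported for that family
// mandatory: families whose gap count must be zero for the build to pass
// Returns a process exit code: 0 = pass, 1 = fail.
int coverage_exit_code(const std::map<std::string, int>& gaps,
                       const std::set<std::string>& mandatory) {
    for (const auto& [family, count] : gaps) {
        if (count > 0 && mandatory.count(family) > 0)
            return 1;  // a blocking family has regressed
    }
    return 0;  // gaps in non-mandatory families are reported, not enforced
}
```

An empty mandatory set corresponds to the non-blocking phase, and a set containing every family corresponds to the global mandatory phase.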

Population Reconciliation Framework — Deferred

A richer C++ reconciliation framework (field_diff, trade_diff, population_diff types; NATS protocol for diff requests; Qt DiffReportWindow) was discussed but is deferred. The reasons:

  • The immediate dependency is mapper correctness, not user-facing diff reports.
  • The three-thing test framework covers mapper correctness without needing a general population rec framework.
  • Population rec is more relevant once ORES is processing live deal populations (not just ORE sample files).

The golden file mechanism in Thing 1 and the mapper fidelity test in Thing 3 cover the regression prevention use case. The population rec framework can be built on top of this later as a user-facing feature.

Deterministic Serialization Requirement

The ores.ore XML serializer must produce the same byte sequence for equivalent inputs. Requirements:

  • Element ordering: follow XSD sequence declarations; do not vary order between runs.
  • Date format: ISO 8601 (YYYY-MM-DD), no timezone suffix for date-only fields.
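  • Floating point: fixed decimal notation with no locale-sensitive formatting, and enough precision to round-trip. For double, 15 significant digits are enough for a decimal value read from XML to survive text → double → text; 17 are needed for an arbitrary double to survive double → text → double.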
  • Optional elements: emit only when value is non-default (mirrors what ORE does).

Without determinism, golden diffs will be noisy and the regression test is useless.
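
The date and number rules can be met with standard streams pinned to the classic "C" locale. The following is a minimal sketch, not the ores.ore serializer: format_decimal and format_date are hypothetical helpers, and the number of decimals per field is an assumption that the real serializer would have to fix for each field type.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: fixed-notation decimal that does not depend on the
// process-wide locale (always '.' as decimal point, no digit grouping).
std::string format_decimal(double value, int decimals) {
    std::ostringstream os;
    os.imbue(std::locale::classic());
    os << std::fixed << std::setprecision(decimals) << value;
    return os.str();
}

// Hypothetical helper: ISO 8601 calendar date, zero-padded, no time zone.
std::string format_date(int year, int month, int day) {
    std::ostringstream os;
    os.imbue(std::locale::classic());
    os << std::setfill('0') << std::setw(4) << year << '-'
       << std::setw(2) << month << '-' << std::setw(2) << day;
    return os.str();
}
```

For example, format_decimal(0.05, 6) yields "0.050000" and format_date(2024, 3, 7) yields "2024-03-07", whatever locale the host process has set.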

Mapper Architecture

Forward Mapper (ORE XSD → ORES Domain)

Extension of the existing trade_mapper in ores.ore.

After mapping the trade envelope (external_id, trade_type, netting_set_id), the trade mapper dispatches to a family-specific instrument mapper:

// Pseudocode
auto trade = map_envelope(ore_trade);
if (is_swap_family(trade.trade_type_code))
    trade.instrument = swap_instrument_mapper::forward(ore_trade.tradeData.swapData);
else if (is_capfloor_family(trade.trade_type_code))
    trade.instrument = capfloor_instrument_mapper::forward(ore_trade.tradeData.capFloorData);
// ...

Each family mapper is in its own file under projects/ores.ore/src/domain/.
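
One way to keep the dispatch readable as families are added is to classify the trade type once and switch on the result, so that unsupported types are explicit. This is a sketch only: product_family and family_of are hypothetical names, and the trade type labels are taken from the Batch 0+1 table in this plan; the exact TradeType strings in ORE XML may differ.

```cpp
#include <string_view>

enum class product_family { swap, capfloor, cross_currency_swap, unsupported };

// Hypothetical classifier. Swap and FRA share a family because the plan
// assigns both to swap_instrument_mapper.
constexpr product_family family_of(std::string_view trade_type) {
    if (trade_type == "Swap" || trade_type == "FRA")
        return product_family::swap;
    if (trade_type == "CapFloor")
        return product_family::capfloor;
    if (trade_type == "CrossCurrencySwap")
        return product_family::cross_currency_swap;
    return product_family::unsupported;  // no instrument mapper yet
}
```

The trade mapper could then switch on family_of(...) and call the matching instrument mapper, leaving the envelope-only behaviour in place for unsupported types.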

Reverse Mapper (ORES Domain → ORE XSD)

New direction. Given ORES domain types, produce ORE XSD types ready for XML serialization:

// Pseudocode — new in Phase 9
ores::ore::domain::Trade trade_mapper::reverse(
    const trading::domain::trade& envelope,
    const trading::domain::swap_instrument& instrument,
    const std::vector<trading::domain::swap_leg>& legs);

The reverse mapper is the path from ORES → ORE XML needed for grid job submission.

Test Fixture Structure

Thing 1 (golden diff) and Thing 3 (mapper round-trip) use standalone TEST_CASE macros in projects/ores.ore/tests/, one per example file:

// Pseudocode
TEST_CASE("golden_roundtrip_ir_swap_vanilla", tags) {
    auto original_xml = read_file("external/ore/examples/Products/Example_Trades/IR_Swap_Vanilla.xml");
    auto ore_type = ores::ore::parse_trade(original_xml);         // XSD parse
    auto canonical_xml = ores::ore::serialize_trade(ore_type);    // canonical serialize

    // Thing 1: golden diff (bootstrap on first run, compare thereafter)
    auto golden_path = "assets/test_data/golden_dataset/Products/Example_Trades/IR_Swap_Vanilla.xml";
    if (!exists(golden_path)) { write_file(golden_path, canonical_xml); SUCCEED(); return; }
    REQUIRE(canonical_xml == read_file(golden_path));

    // Thing 3: mapper round-trip (when mapper exists for this type)
    auto ores_domain = swap_instrument_mapper::forward_swap(ore_type);
    auto reconstructed = swap_instrument_mapper::reverse_swap(ores_domain.instrument, ores_domain.legs);
    auto reconstructed_xml = ores::ore::serialize_trade(reconstructed);
    REQUIRE(reconstructed_xml == read_file(golden_path));
}

Later batches add test cases for their own example files. Thing 3 assertions are gated on mapper availability, so early batches may have Thing 1 only for some types.

Relationship to Previous Phases

Phase  Deliverable                                          Status
0      Reference data tables                                Complete
1      Rates instruments (SQL + domain + repo + Qt + NATS)  Complete
2      FX instruments                                       Complete
3      Bond instruments                                     Complete
4      Credit instruments                                   Complete
5      Equity instruments                                   Complete
6      Commodity instruments                                Complete
7      Extensions (options across asset classes)            Complete
8      Composite and scripted instruments                   Complete
9      ORE instrument mappers + test infrastructure         This plan

Phase 9 does not add new domain model tables or NATS endpoints. It adds the ORE serialization layer on top of the existing domain model.

Files to Create (Batch 0+1)

File                                                                      Purpose
assets/test_data/golden_dataset/Products/Example_Trades/*.xml             Golden files for IR/CCS/FRA/CapFloor batch
scripts/ore_coverage_check.py                                             Thing 2 Python gap check
.github/workflows/ore_coverage.yml                                        Non-blocking CI step for gap check
projects/ores.ore/tests/ore_golden_tests.cpp                              Thing 1 parametric test fixture
projects/ores.ore/tests/ore_mapper_roundtrip_tests.cpp                    Thing 3 mapper fidelity fixture
projects/ores.ore/src/domain/swap_instrument_mapper.cpp                   Swap + FRA forward + reverse mapper
projects/ores.ore/src/domain/capfloor_instrument_mapper.cpp               CapFloor forward + reverse mapper
projects/ores.ore/src/domain/ccs_instrument_mapper.cpp                    CrossCurrencySwap forward + reverse mapper
projects/ores.ore/include/ores.ore/domain/swap_instrument_mapper.hpp      Header
projects/ores.ore/include/ores.ore/domain/capfloor_instrument_mapper.hpp  Header
projects/ores.ore/include/ores.ore/domain/ccs_instrument_mapper.hpp       Header

Files to Modify (Batch 0+1)

File                                                        Change
projects/ores.ore/src/domain/trade_mapper.cpp               Dispatch to instrument mappers after envelope mapping
projects/ores.ore/include/ores.ore/domain/trade_mapper.hpp  Add reverse mapper declaration
projects/ores.ore/tests/CMakeLists.txt                      Register new test files
projects/ores.ore/src/CMakeLists.txt                        Register new mapper source files