ORE Market Data File Types — Phase 1 (ores.ore)

Overview

Phase 1 introduces lightweight, faithful representations of ORE's text-based market data files into ores.ore, along with a parser and serializer that round-trip every data line of those files without loss.

Strong typing of the quote key fields (instrument type, currency, tenor, etc.) is explicitly deferred to ores.marketdata (Phase 2). ores.ore is responsible only for: parse file → structs → write file, preserving every date, key, and value exactly. Comments and column spacing are not preserved.

Background

ORE consumes two categories of text market data files. Both use the same three-column line structure, with fields separated by whitespace or commas:

DATE  KEY  VALUE

File pattern   Date formats             Delimiter             Key semantics
market*.txt    YYYYMMDD or YYYY-MM-DD   whitespace or comma   Instrument quote key
fixings*.txt   YYYYMMDD or YYYY-MM-DD   whitespace or comma   Index name

Both file types appear in both date formats in the ORE example library. Some files use comma as the field delimiter instead of whitespace; some files (e.g. Legacy/Example_45) mix both delimiters within the same file. The parser auto-detects the date format and delimiter per line.

ORE silently skips blank lines and comment lines (lines beginning with #). The parser must skip them as well; they are not carried through to the serialized output.
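The per-line handling described above can be sketched as follows. This is a minimal illustration rather than the ores.ore implementation: is_skippable and split_fields are hypothetical names, the sketch tolerates leading whitespace before #, and it assumes a comma never appears inside a key or value, so commas can be treated like whitespace.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: true for blank lines and for lines whose first
// non-blank character is '#'.
bool is_skippable(const std::string& line) {
    for (unsigned char c : line) {
        if (std::isspace(c)) continue;
        return c == '#';
    }
    return true; // empty or whitespace-only line
}

// Hypothetical helper: splits on runs of whitespace and/or commas, so
// whitespace-delimited, comma-delimited, and mixed lines all yield the
// same fields. Assumes commas never occur inside a field.
std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    for (unsigned char c : line) {
        if (std::isspace(c) || c == ',') {
            if (!current.empty()) {
                fields.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(static_cast<char>(c));
        }
    }
    if (!current.empty()) fields.push_back(current);
    return fields;
}
```

Treating both delimiters uniformly means a file that mixes them needs no special case. The trade-off is that empty fields (for example a,,b) collapse silently, which a stricter tokenizer might prefer to report.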

Market data quote types

There are 38 distinct quote-key prefixes across the ORE example library, covering interest rates, FX, volatilities, credit, equity, commodity, inflation, and miscellaneous types. Representative examples:

20160205 ZERO/RATE/EUR/BANK_EUR_BORROW/A365/2Y            0.0024
20160205 IR_SWAP/RATE/CHF/2D/1D/1M                        0.0035
20160205 FX/RATE/EUR/CHF                                  1.0947
20160205 SWAPTION/RATE_LNVOL/CHF/25Y/10Y/ATM              0.3820
20160205 CAPFLOOR/RATE_LNVOL/CHF/20Y/6M/0/0/0.025         0.5410
20160205 EQUITY/PRICE/SP5/USD                             2023.81
20160205 CDS/CREDIT_SPREAD/BANK/SR/USD/1Y                 0.0050
20160205 COMMODITY_FWD/PRICE/GOLD/USD/2016-02-29         1207.40

Fixings

2021-09-30 EUR-EONIA          0.0074600000
2016-01-28 EQ-SP5             2244.2
2020-09-01 UKRPI              293.3

Design Decisions

Minimal struct surface

ores.ore needs at most two structs:

#include <chrono>  // std::chrono::year_month_day requires C++20
#include <string>

struct market_datum {
    std::chrono::year_month_day date;
    std::string key;    // opaque; not decomposed at this layer
    std::string value;  // raw string: exact round-trip, no float precision loss
};

struct fixing {
    std::chrono::year_month_day date;
    std::string index_name;
    std::string value;  // raw string
};

market_datum and fixing are structurally identical but semantically distinct, so they are separate types for clarity at the call site.

Value as std::string

Values are stored as std::string rather than double or a decimal type. This is intentional:

  • ORE files contain values with varying precision (e.g. 0.0074600000, 2244.2, -0.006119). Binary floating point cannot represent most decimal fractions exactly and does not record trailing zeros, so a value parsed into a double will generally not serialize back to the original text.
  • ores.ore is a file-format layer, not a computation layer. Callers that need arithmetic work with values through ores.marketdata types (Phase 2), which will use a decimal type.
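The first point can be illustrated with a small experiment. The helper via_double is a hypothetical name used only for this sketch; it parses a value into a double and prints it back with default stream formatting.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: round-trips a textual value through double using
// the default std::ostream formatting (six significant digits).
std::string via_double(const std::string& text) {
    const double parsed = std::stod(text);
    std::ostringstream out;
    out << parsed;
    return out.str();
}
```

Under these assumptions via_double("0.0074600000") yields "0.00746": the numeric value is close, but the original text is gone. Raising the output precision does not help, because the number of trailing zeros is simply not stored in a double. Keeping the value as std::string avoids the question entirely.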

One parser, two serializers

Both file types parse the same three-column line format. A single market_data_line_parser handles tokenization and date parsing for both, auto-detecting the delimiter (whitespace or comma) and the date format (YYYYMMDD vs YYYY-MM-DD) per line.
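Date format auto-detection could look roughly like the sketch below. It is an illustration under stated assumptions: parse_date and simple_date are hypothetical names, and simple_date is a plain stand-in for std::chrono::year_month_day so that the sketch compiles without C++20. Only the shape of the input and basic ranges are checked; full calendar validation is left out.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Stand-in for std::chrono::year_month_day (an assumption of this sketch).
struct simple_date {
    int year;
    unsigned month;
    unsigned day;
};

// Hypothetical helper: accepts "YYYYMMDD" or "YYYY-MM-DD" and throws
// std::invalid_argument for anything else.
simple_date parse_date(const std::string& text) {
    std::string digits;
    if (text.size() == 8) {
        digits = text; // compact form
    } else if (text.size() == 10 && text[4] == '-' && text[7] == '-') {
        digits = text.substr(0, 4) + text.substr(5, 2) + text.substr(8, 2);
    } else {
        throw std::invalid_argument("unsupported date format: " + text);
    }
    for (unsigned char c : digits) {
        if (!std::isdigit(c))
            throw std::invalid_argument("non-digit in date: " + text);
    }
    simple_date d;
    d.year = std::stoi(digits.substr(0, 4));
    d.month = static_cast<unsigned>(std::stoi(digits.substr(4, 2)));
    d.day = static_cast<unsigned>(std::stoi(digits.substr(6, 2)));
    if (d.month < 1 || d.month > 12 || d.day < 1 || d.day > 31)
        throw std::invalid_argument("date out of range: " + text);
    return d;
}
```

Because the two accepted formats differ in length, the length check alone is enough to tell them apart, and the decision can be made independently for every line.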

Serialization writes the canonical date format per file type:

  • market_data_serializer — outputs YYYYMMDD
  • fixings_serializer — outputs YYYY-MM-DD
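The two canonical output formats differ only in their separators. A minimal sketch, assuming the date has already been broken into integer components; format_compact and format_iso are hypothetical names, not the ores.ore API:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: YYYYMMDD, the form written for market data files.
std::string format_compact(int year, unsigned month, unsigned day) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04d%02u%02u", year, month, day);
    return buf;
}

// Hypothetical helper: YYYY-MM-DD, the form written for fixings files.
std::string format_iso(int year, unsigned month, unsigned day) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04d-%02u-%02u", year, month, day);
    return buf;
}
```

A consequence of writing one canonical format per file type is that a market file read with YYYY-MM-DD dates is written back with YYYYMMDD dates, so the date text itself is not preserved even though the date value is.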

No codegen

The structs are trivially simple and handwritten. Codegen adds overhead without benefit here. The existing XSD-driven codegen in ores.ore applies only to XML configuration files.

File Inventory

A content-deduplication pass across all external/ore/examples identifies 54 unique market data text files (many example directories share identical file content). All 54 must parse without error. The roundtrip test suite runs each unique file through parse → serialize → re-parse and asserts field-by-field equivalence.

Unique market data files (38 files)

Date fmt    Path (relative to examples/)
YYYYMMDD    Input/market_20160205.txt
YYYYMMDD    Input/market_20160205_flat.txt
YYYYMMDD    CurveBuilding/Input/market_20160205.txt
YYYYMMDD    Legacy/Example_19/Input/market_20160205_smile.txt
YYYYMMDD    AmericanMonteCarlo/Input/market_20160205_flat_fixed_fxfwd.txt
YYYYMMDD    Exposure/Input/market_flipview.txt
YYYYMMDD    InitialMargin/Input/Simm/simm_market.txt
YYYYMMDD    Legacy/Example_56/Input/market.txt
YYYYMMDD    Academy/TA002_IR_Swap/Input/market.txt
YYYYMMDD    ORE-Python/Notebooks/Example_2/Input/market.txt
YYYYMMDD    ORE-Python/Notebooks/Example_3/Input/market.txt
YYYYMMDD    ORE-Python/Notebooks/Example_6/Input/market.txt
YYYYMMDD    ORE-Python/Notebooks/Example_7/Input/market.txt
YYYYMMDD    ORE-Python/Notebooks/Example_9/Input/market_20160205.txt
YYYYMMDD    ORE-Python/Notebooks/Example_9/Input/market_20160205_flat.txt
YYYYMMDD    XvaRisk/Input/market_20160205_eonia_200bp_up.txt
YYYYMMDD    XvaRisk/Input/market_20160205_eur6m_200bp_up.txt
YYYY-MM-DD  AmericanMonteCarlo/Input/market.txt
YYYY-MM-DD  Academy/FC003_Reporting_Currency/Input/marketdata.txt
YYYY-MM-DD  Academy/TA001_Equity_Option/Input/marketdata.txt
YYYY-MM-DD  Exposure/Input/market_inflation.txt
YYYY-MM-DD  Legacy/Example_45/Input/market.txt
YYYY-MM-DD  Legacy/Example_46/Input/market.txt
YYYY-MM-DD  Legacy/Example_47/Input/market.txt
YYYY-MM-DD  Legacy/Example_48/Input/market.txt
YYYY-MM-DD  Legacy/Example_55/Input/market.txt
YYYY-MM-DD  Legacy/Example_73/Input/market.txt
YYYY-MM-DD  MarketRisk/Input/Curvealgebra/market.txt
YYYY-MM-DD  MarketRisk/Input/HistSimVar/market.txt
YYYY-MM-DD  MarketRisk/Input/Pnl/market.txt
YYYY-MM-DD  ORE-Python/Notebooks/Example_5/Input/market.txt
YYYY-MM-DD  ORE-Python/Notebooks/Example_8/Input/Example_62/market.txt
YYYY-MM-DD  Legacy/Example_61/Input/market.txt (comma-delimited)

Unique fixings files (15 files)

Date fmt    Path (relative to examples/)
YYYYMMDD    Legacy/Example_54/Input/fixings_20160205.txt
YYYYMMDD    ORE-Python/Notebooks/Example_9/Input/fixings_20160205.txt
YYYY-MM-DD  Input/fixings_20160205.txt
YYYY-MM-DD  AmericanMonteCarlo/Input/fixings.txt
YYYY-MM-DD  Exposure/Input/fixings_inflation.txt
YYYY-MM-DD  Legacy/Example_45/Input/fixings.txt
YYYY-MM-DD  Legacy/Example_46/Input/fixings.txt
YYYY-MM-DD  Legacy/Example_47/Input/fixings.txt
YYYY-MM-DD  Legacy/Example_48/Input/fixings.txt
YYYY-MM-DD  MarketRisk/Input/HistSimVar/fixings.txt
YYYY-MM-DD  MarketRisk/Input/Pnl/fixings.txt
YYYY-MM-DD  ORE-API/Input/fixings_20160205.txt
YYYY-MM-DD  ORE-Python/Notebooks/Example_3/Input/fixings.txt
YYYY-MM-DD  ORE-Python/Notebooks/Example_5/Input/fixings.txt
YYYY-MM-DD  Legacy/Example_61/Input/fixings.txt (comma-delimited)

Implementation Plan

Step 1 — Structs

New headers in projects/ores.ore/include/ores.ore/market/:

  • market_datum.hpp: market_datum struct
  • fixing.hpp: fixing struct

No implementation files needed; structs are value types only.

Step 2 — Parser

New files:

  • include/ores.ore/market/market_data_parser.hpp
  • src/market/market_data_parser.cpp

Interface:

namespace ores::ore::market {

/**
 * @brief Parses an ORE market data quote file (market*.txt).
 *
 * Skips blank lines and comment lines (starting with #).
 * Accepts both YYYYMMDD and YYYY-MM-DD date formats.
 */
std::vector<market_datum>
parse_market_data(std::istream& in);

/**
 * @brief Parses an ORE fixings file (fixings*.txt).
 *
 * Skips blank lines and comment lines (starting with #).
 * Accepts both YYYYMMDD and YYYY-MM-DD date formats.
 */
std::vector<fixing>
parse_fixings(std::istream& in);

} // namespace ores::ore::market

Both functions delegate to a shared internal line tokenizer. Error handling: a malformed line causes the parser to throw std::invalid_argument, with the line number and the offending content in the message.
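The error-handling contract can be sketched as below. This is a simplified illustration, not the actual parser: raw_entry and parse_lines are hypothetical names, the date is kept as text instead of being converted, and commas are normalized to spaces on the assumption that they never occur inside a field. How lines with more than three tokens should be treated is not stated here, so the sketch ignores any extra tokens.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for market_datum / fixing: all fields kept as text.
struct raw_entry {
    std::string date;
    std::string key;
    std::string value;
};

// Hypothetical sketch of the shared line loop with line-numbered errors.
std::vector<raw_entry> parse_lines(std::istream& in) {
    std::vector<raw_entry> entries;
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::string normalized = line;
        std::replace(normalized.begin(), normalized.end(), ',', ' ');
        std::istringstream tokens(normalized);
        raw_entry e;
        if (!(tokens >> e.date)) continue;   // blank line
        if (e.date[0] == '#') continue;      // comment line
        if (!(tokens >> e.key >> e.value)) { // fewer than three fields
            throw std::invalid_argument(
                "line " + std::to_string(line_no) +
                ": expected DATE KEY VALUE, got '" + line + "'");
        }
        entries.push_back(e);
    }
    return entries;
}
```

Counting every physical line, including skipped comments and blanks, keeps the reported number aligned with what an editor shows for the file.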

Step 3 — Serializer

New files:

  • include/ores.ore/market/market_data_serializer.hpp
  • src/market/market_data_serializer.cpp

Interface:

namespace ores::ore::market {

/**
 * @brief Serializes market data quotes to ORE text format (YYYYMMDD dates).
 */
void serialize_market_data(std::ostream& out,
                           const std::vector<market_datum>& data);

/**
 * @brief Serializes fixings to ORE text format (YYYY-MM-DD dates).
 */
void serialize_fixings(std::ostream& out,
                       const std::vector<fixing>& fixings);

} // namespace ores::ore::market

Output format:

  • One entry per line: DATE<TAB>KEY<TAB>VALUE
  • Values are written exactly as stored (no reformatting)
  • No comment lines are emitted (comments are not preserved through parse/serialize)
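The output rules above amount to very little code. A minimal sketch, assuming the date has already been formatted into its canonical text; text_entry, write_entries, and to_text are hypothetical names used only for illustration:

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for a datum whose date is already formatted.
struct text_entry {
    std::string date;
    std::string key;
    std::string value;
};

// Writes one DATE<TAB>KEY<TAB>VALUE line per entry; the value is emitted
// exactly as stored, with no numeric reformatting.
void write_entries(std::ostream& out, const std::vector<text_entry>& entries) {
    for (const text_entry& e : entries) {
        out << e.date << '\t' << e.key << '\t' << e.value << '\n';
    }
}

// Convenience wrapper that serializes into a string.
std::string to_text(const std::vector<text_entry>& entries) {
    std::ostringstream out;
    write_entries(out, entries);
    return out.str();
}
```

Since the value is streamed as a string, an input such as 0.0074600000 comes out with its trailing zeros intact.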

Step 4 — Round-trip Tests

New test file: projects/ores.ore/tests/market_roundtrip_tests.cpp

Test strategy: for each fixture file in the File Inventory tables:

  1. Read the file content from external/ore/examples/...
  2. Parse into structs via the parser
  3. Serialize back to a string via the serializer
  4. Re-parse the serialized string
  5. Compare the entry count and each date / key / value triple of the re-parse against the original parse

The comparison is structural (field-by-field) rather than a byte-for-byte string diff, because:

  • Comment lines are dropped on serialization
  • Column spacing may differ
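The structural comparison can be sketched in a self-contained form. Everything here is a simplified stand-in rather than the real test code: entry, parse_entries, serialize_entries, and survives_round_trip are hypothetical names, only whitespace delimiters are handled, dates are kept as text, and short lines are silently skipped instead of raising an error.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct entry {
    std::string date;
    std::string key;
    std::string value;
};

bool operator==(const entry& a, const entry& b) {
    return a.date == b.date && a.key == b.key && a.value == b.value;
}

// Simplified parser: skips blank lines, comments, and short lines.
std::vector<entry> parse_entries(const std::string& text) {
    std::vector<entry> out;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream tokens(line);
        entry e;
        if (!(tokens >> e.date >> e.key >> e.value)) continue;
        if (e.date[0] == '#') continue;
        out.push_back(e);
    }
    return out;
}

// Simplified serializer: tab-separated, one entry per line.
std::string serialize_entries(const std::vector<entry>& entries) {
    std::ostringstream out;
    for (const entry& e : entries) {
        out << e.date << '\t' << e.key << '\t' << e.value << '\n';
    }
    return out.str();
}

// parse -> serialize -> re-parse, then compare field by field.
bool survives_round_trip(const std::string& original_text) {
    const std::vector<entry> first = parse_entries(original_text);
    const std::vector<entry> second = parse_entries(serialize_entries(first));
    return first == second; // compares size and every element
}
```

For an input that contains a comment line and irregular column spacing, survives_round_trip returns true even though the serialized text differs from the input, which is why a byte-level diff would be the wrong assertion.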

A separate market_data_parser_tests.cpp covers unit-level cases: malformed lines, unsupported date formats, comment-only files, empty files.

Acceptance Criteria

  • [X] All fixture files listed in the File Inventory parse without error
  • [X] Roundtrip test: every datum in the parsed output of market_20160205.txt survives serialize → re-parse with identical field values
  • [X] Roundtrip test: same for each fixings fixture file
  • [X] Parser rejects lines with fewer than 3 fields (whitespace- or comma-delimited) with a descriptive error including the line number
  • [X] Value strings are preserved exactly (0.0074600000 stays 0.0074600000)
  • [X] All tests pass on Linux, macOS, and Windows CI

Out of Scope (Phase 2)

  • Decomposition of quote keys into typed fields (ZERO/RATE/EUR/... → zero_rate_datum struct with typed currency, day_count, tenor fields)
  • CDM Observable / Price mapping
  • ores.marketdata project, domain types, and NATS messaging
  • Database persistence of market data quotes
  • UI / Qt views for market data