Sprint Backlog 15

Sprint Mission

  • Implement compute grid.

Stories

Active

Table 1: Clock summary at [2026-03-26 Thu 08:08], for Thursday, March 26, 2026.
Tags  Headline                        Time      %
      Total time                      2:23  100.0
      Stories                         2:23  100.0
      Active                          2:23  100.0
code  End to end fixes for grid work  2:23  100.0
Table 2: Clock summary at [2026-03-26 Thu 08:08]
Tags   Headline                                                   Time      %
       Total time                                                66:46  100.0
       Stories                                                   66:46  100.0
       Active                                                    66:46  100.0
agile  Sprint and product backlog refinement                      0:54    1.3
code   Add BOINC-inspired distributed compute grid                7:03   10.6
code   Fix ORE import bugs                                        6:35    9.9
code   Add loading indicator to all entity list windows           2:49    4.2
code   Replace feature_flags with unified system_settings         5:59    9.0
code   Implement JWT refresh                                      3:16    4.9
code   Add compute grid telemetry pipeline                        2:10    3.2
code   Add service dashboard with live RAG status                 2:49    4.2
code   E2E compute testing fixes and loading lifecycle refactor   8:43   13.1
code   Add NATS service discovery for HTTP base URL               7:10   10.7
code   Extract API from main component                            8:13   12.3
code   Fix missing tenant_id in system settings tests             0:30    0.7
code   Fix sccache cache bloat causing slow/cold builds           0:30    0.7
code   Move generators to api layer and split http.server         2:55    4.4
code   Implement full environment isolation for database roles    1:53    2.8
code   Add ORES_PRESET to .env; add start-client.sh               0:30    0.7
code   End to end fixes for grid work                             4:47    7.2

STARTED Sprint and product backlog refinement   agile

Updates to sprint and product backlog.

sprint_backlog_15_stories_pie_sorted.png

sprint_backlog_15_stories.png

sprint_backlog_15_tags.png

COMPLETED Add BOINC-inspired distributed compute grid   code

This pull request implements a comprehensive distributed compute grid inspired by BOINC, integrating it deeply into the ORE Studio ecosystem. It introduces new domain entities, services, and executables, leveraging modern messaging and database technologies for efficient and scalable computation.

Highlights:

  • Compute Grid Implementation: This PR introduces a BOINC-inspired distributed compute grid across all layers of the ORE Studio stack, including domain library, SQL schema, service executable, messaging protocol, wrapper executable, HTTP endpoints, CLI commands, and Qt list windows.
  • Key Components: Key components include ores.compute domain library, ores.compute.service executable, ores.compute.wrapper executable, and integration with NATS JetStream for message handling.
  • Event-Driven Architecture: The architecture is primarily event-driven, using NATS JetStream for lifecycle transitions and PostgreSQL triggers for eventing, replacing legacy PGMQ and pg_cron implementations.
  • Documentation Updates: The PR updates UML diagrams and removes legacy plan documents, providing comprehensive documentation for the new compute grid.

COMPLETED Fix ORE import bugs   code

This pull request significantly improves the robustness and usability of the application by addressing several critical bugs across ORE import, trading services, and the Qt UI. Key changes include enhanced data mapping for ORE currency types, better handling of trade statuses, and dynamic loading of activity types in the UI. Additionally, UI components now retain their layout, and the project version has been updated to reflect the new sprint cycle.

Highlights:

  • ORE Import Bug Fixes: Resolved an Invalid monetary_nature: Crypto error by implementing a mapping for ORE CurrencyType values to appropriate database codes (e.g., Metal to commodity, Crypto to synthetic, others to fiat).
  • UI Enhancements (Qt): Introduced a subject_prefix field to the AddItemDialog for environments (editable) and connections (read-only from linked environments), improving configuration clarity. Fixed an issue where ConnectionBrowserMdiWindow failed to restore its size and splitter position upon reopening.
  • Trading Service Stability: Addressed an Invalid status_id: nil UUID bug by ensuring trade_status_service::resolve_status is called before every trade write operation, guaranteeing valid status IDs.
  • Dynamic Activity Type Loading: Added a new trading.v1.activity_types.list NATS endpoint, allowing clients to dynamically fetch valid activity type codes from the database. This change also fixes Unknown activity type code: New import errors in the ORE Import Wizard by replacing hardcoded lifecycle events with an async database fetch.
  • Sprint Management & Version Bump: Opened Sprint 15 and updated the project version to 0.0.15, reflecting ongoing development and release cycles.
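The currency-type mapping described above is small enough to sketch. A minimal C++ illustration, assuming a hypothetical CurrencyType enum; the real ORE type, its enumerators, and the database code strings may differ:

```cpp
#include <string>

// Hypothetical sketch of the CurrencyType -> monetary_nature mapping the PR
// describes. Only the Metal->commodity, Crypto->synthetic, default->fiat
// rules are from the PR; the enum itself is illustrative.
enum class CurrencyType { Major, Minor, Metal, Crypto };

std::string to_monetary_nature(CurrencyType type) {
    switch (type) {
    case CurrencyType::Metal:  return "commodity";  // precious metals
    case CurrencyType::Crypto: return "synthetic";  // crypto-currencies
    default:                   return "fiat";       // everything else
    }
}
```

This removes the "Invalid monetary_nature: Crypto" class of error by guaranteeing every CurrencyType lands on a code the database accepts.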

COMPLETED Add loading indicator to all entity list windows   code

This pull request significantly improves the user experience across all entity list windows by introducing a consistent loading indicator and centralizing the data reloading logic. By moving common loading state management to the base class, it reduces code duplication and simplifies future maintenance, ensuring that users receive clear visual feedback during data fetches.

Highlights:

  • Centralized Reload Logic: Refactored the data reload lifecycle in the base EntityListMdiWindow class, centralizing common steps like clearing stale indicators and managing loading states.
  • Loading Indicator Added: Introduced an indeterminate 4px progress bar (browser-style) to all entity list windows to visually indicate data loading.
  • Method Renaming and Delegation: Renamed the reload() override to doReload() in all 45 subclasses of EntityListMdiWindow, with the base class now handling the reload() call and delegating to doReload().
  • Explorer Window Integration: Applied the new loading indicator and reload logic to OrgExplorerMdiWindow and PortfolioExplorerMdiWindow, ensuring consistent loading feedback.
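The reload()/doReload() split is a classic template-method pattern. A minimal sketch, with the loading indicator reduced to a flag; the real base class drives a Qt progress bar and the subclass fetch is asynchronous:

```cpp
// Sketch of the centralized reload lifecycle: the base class owns the common
// steps and delegates the subclass-specific fetch to doReload(). Names mirror
// the PR (EntityListMdiWindow, reload, doReload); internals are stand-ins.
class EntityListMdiWindow {
public:
    virtual ~EntityListMdiWindow() = default;

    void reload() {
        beginLoading();     // show the indeterminate progress bar
        clearIndicators();  // drop stale per-row indicators
        doReload();         // subclass-specific data fetch
    }

    bool isLoading() const { return loading_; }
    void endLoading() { loading_ = false; }  // called when data arrives

protected:
    virtual void doReload() = 0;

private:
    void beginLoading() { loading_ = true; }
    void clearIndicators() { /* reset per-row state */ }
    bool loading_ = false;
};

class CurrencyListWindow : public EntityListMdiWindow {
protected:
    void doReload() override {
        // issue the async request; endLoading() runs when the reply lands
    }
};
```

The 45 subclasses then only implement doReload(), so the loading feedback cannot drift between windows.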

COMPLETED Replace feature_flags with unified system_settings   code

Summary:

Replaces the old boolean-only `ores_variability_feature_flags_tbl` (and its `system_flags_service` typed wrapper) with a single unified `ores_variability_system_settings_tbl` that supports boolean, integer, string, and JSON value types. All code across the stack — database, service, messaging, CLI, shell, HTTP, Qt UI — has been migrated to the new type. All legacy `feature_flags` and `system_flags` files and names have been removed.

Changes:

  • Phase 1–5 (variability): New `ores_variability_system_settings_tbl` SQL schema, domain type, JSON/table I/O, entity/mapper/repository, service with typed accessors (`get_bool`, `get_int`, `get_string`, `get_json`), NATS handler/protocol (`variability.v1.settings.*`), eventing (`system_setting_changed_event`)
  • Phase 6 (shell): Migrated `variability_commands` from `list-flags`/`add-flag` to `list-settings`/`save-setting`
  • Phase 7 (CLI): New `add_system_setting_options`, `system_settings_parser`; updated `entity` enum, `add_options` variant, application dispatch
  • Phase 8 (cross-module): Updated `iam_routes`, `iam` auth handler, `variability_routes`, `http.server` application, `wt.service` application context, Qt currency windows, `TenantProvisioningWizard`, event viewer
  • Phase 9 (deletion): Removed all legacy files — 29 `feature_flags`/`system_flags` source files, 5 test files, 4 CLI files; updated `registrar.cpp` and `variability_routes` to use new protocol/endpoint
  • Rename pass: 14 Qt files renamed (`FeatureFlag*` → `SystemSetting*`); all string literals, log messages, SQL function names, ORG docs, JSON codegen models updated throughout
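A rough sketch of the typed-accessor idea behind the unified settings type, using std::variant and an in-memory map as a stand-in for ores_variability_system_settings_tbl and its repository. Accessor names follow the changes above; signatures are assumptions (the real service also supports JSON values):

```cpp
#include <map>
#include <optional>
#include <string>
#include <variant>

// Minimal stand-in for the unified system_settings service: one store for
// boolean, integer, and string values with typed accessors. A wrong-typed or
// missing key yields nullopt rather than a coerced value.
class system_settings_service {
public:
    using value_type = std::variant<bool, long, std::string>;

    void save(const std::string& key, value_type v) { values_[key] = std::move(v); }

    std::optional<bool> get_bool(const std::string& key) const { return get<bool>(key); }
    std::optional<long> get_int(const std::string& key) const { return get<long>(key); }
    std::optional<std::string> get_string(const std::string& key) const {
        return get<std::string>(key);
    }

private:
    template <typename T>
    std::optional<T> get(const std::string& key) const {
        auto it = values_.find(key);
        if (it == values_.end()) return std::nullopt;
        if (auto p = std::get_if<T>(&it->second)) return *p;
        return std::nullopt;  // present but holding a different type
    }

    std::map<std::string, value_type> values_;
};
```

This is what makes the migration from boolean-only feature flags worthwhile: one table, one protocol, per-type safety at the call site.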

COMPLETED Implement JWT refresh   code

Problem:

2026-03-20 10:04:27.471965 [DEBUG] [ores.trading.messaging.trade_handler] Completed ores.dev.local1.trading.v1.trades.list
2026-03-20 10:05:17.241863 [DEBUG] [ores.trading.messaging.trade_handler] Handling ores.dev.local1.trading.v1.trades.list
2026-03-20 10:05:17.241962 [TRACE] [ores.security.jwt.jwt_authenticator] Validating JWT token
2026-03-20 10:05:17.242485 [WARN] [ores.security.jwt.jwt_authenticator] JWT verification failed: token expired
2026-03-20 10:05:17.242569 [DEBUG] [ores.trading.service.trade_service] Listing trades with offset=0, limit=100

Add JWT auth telemetry hypertable:

This pull request significantly enhances the authentication system by introducing comprehensive telemetry for JWT-based authentication events and improving client-side session management. It establishes a TimescaleDB hypertable to meticulously log all authentication-related activities, providing valuable data for monitoring and analysis. Concurrently, the client application now proactively refreshes JWTs and gracefully handles session expirations, leading to a more resilient and user-friendly authentication experience.

Highlights:

  • Authentication Telemetry Database: Introduced ores_iam_auth_events_tbl TimescaleDB hypertable to record various authentication events, including login success/failure, logout, token refresh, max session exceeded, and signup success/failure.
  • Aggregated Views and Retention Policies: Created hourly and daily continuous aggregate views (ores_iam_auth_events_hourly_vw, ores_iam_auth_events_daily_vw) with retention policies set to 90 days for raw events and hourly views, and 3 years for daily views.
  • Robust Event Recording: Ensured the auth_event_repository is insert-only and that telemetry failures do not interrupt the core authentication response flow by wrapping calls in try/catch blocks.
  • Proactive JWT Refresh (Client-side): Implemented a QTimer in ClientManager to proactively refresh JWTs before they expire, improving user experience by preventing unexpected logouts.
  • Automatic JWT Refresh (Server-side): Modified nats_session to automatically attempt JWT refresh when a token_expired error is received from the server, transparently handling token renewal.
  • Session Expiry Handling: Added client-side logic in ClientManager and MainWindow to detect and notify users when a session has reached its maximum allowed duration (max_session_exceeded), prompting re-login.
  • Error Reply Refactoring: Updated various *handler.hpp files in ores.compute to use error_reply for handling failed request context creation, standardizing error responses.
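The "telemetry failures do not interrupt the core authentication response flow" rule amounts to a try/catch around the insert. A hedged sketch, with record_fn standing in for the real auth_event_repository insert:

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Sketch of the non-fatal telemetry write: the insert-only event record is
// wrapped in try/catch so a failing hypertable insert never propagates into
// the login/logout reply path. record_fn is a stand-in for the repository.
void record_auth_event(const std::function<void(const std::string&)>& record_fn,
                       const std::string& event_type) {
    try {
        record_fn(event_type);  // insert-only telemetry write
    } catch (const std::exception&) {
        // Swallow and (in the real code) log: the authentication response
        // must still be sent even if telemetry storage is unavailable.
    }
}
```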

JWT token refresh: configurable lifetimes, refresh subject, token_expired propagation:

This pull request significantly refines the Identity and Access Management (IAM) system by introducing flexible JWT token management and robust error handling. It allows administrators to configure token lifetimes, implements a secure token refresh mechanism, and ensures that authentication-related errors are explicitly communicated to clients rather than silently failing. These changes improve the system's security, configurability, and overall reliability for token-based authentication.

Highlights:

  • Configurable JWT Token Lifetimes: Introduced new system settings (iam.token.access_lifetime_seconds, iam.token.party_selection_lifetime_seconds, iam.token.max_session_seconds, iam.token.refresh_threshold_pct) to allow dynamic configuration of JWT access token, party selection token, and maximum session durations. Hardcoded values are replaced with these configurable settings.
  • JWT Token Refresh Mechanism: Implemented a new token refresh endpoint (iam.v1.auth.refresh) and a validate_allow_expired() method in the JWT authenticator. This allows clients to obtain new access tokens using an expired but otherwise valid token, subject to a maximum session ceiling.
  • Enhanced Error Propagation: Fixed a silent fallback bug in make_request_context(), which now explicitly returns error_code::token_expired for expired tokens and error_code::unauthorized for invalid/missing tokens. An error_reply() helper was added to send these errors via X-Error NATS headers.
  • Widespread Error Handling Update: Updated all 48 domain handler files across various services to correctly handle and propagate the new std::expected return type from make_request_context(), ensuring consistent client-side error feedback.
  • Hot-Reloading of Token Settings: Enabled account_handler and auth_handler to hot-reload token settings dynamically upon receiving ores.variability.system_setting_changed events, eliminating the need for service restarts when these settings are updated.
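The "expired but otherwise valid, subject to a maximum session ceiling" refresh rule can be sketched as a pure time check. The setting name is from the list above; the real authenticator also verifies the token signature and claims before allowing the refresh:

```cpp
#include <chrono>

// Sketch of the refresh-eligibility ceiling: an expired access token may be
// exchanged for a new one only while the session age is below the configured
// iam.token.max_session_seconds. Beyond that, the client must re-login.
bool can_refresh(std::chrono::system_clock::time_point session_start,
                 std::chrono::system_clock::time_point now,
                 std::chrono::seconds max_session) {
    return (now - session_start) < max_session;
}
```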

JWT refresh: shell reactive retry and Qt proactive timer (Phases 4-5):

This pull request implements JWT token refresh functionality in both the shell and Qt clients. It introduces reactive retries in the shell upon token expiration and proactive refresh timers in the Qt client to prevent session expiry. The changes ensure a smoother user experience by automatically refreshing tokens and prompting re-login only when necessary.

Highlights:

  • JWT Refresh - Shell: The nats_session::authenticated_request() function now validates the X-Error header in each reply. If a token_expired error is detected, it triggers a refresh, retrieves an updated JWT, and retries the original request once. A max_session_exceeded error will throw an exception, prompting the user to re-login.
  • JWT Refresh - Qt: The ClientManager now uses a QTimer that activates after each successful login and party selection. The timer fires at 80% of the token lifetime, triggering an iam.v1.auth.refresh call. A sessionExpired() signal is emitted if the refresh fails due to session expiry, which MainWindow connects to, displaying a warning dialog and reopening the login dialog.
  • Completion of JWT Refresh Plan: This pull request finalizes the JWT token refresh plan, incorporating all five phases: system settings registration, configurable lifetimes, token_expired error propagation, shell reactive retry, and Qt proactive timer with session-expired dialog.
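The shell's reactive path reduces to "refresh once, retry once". A simplified sketch, with std::function stand-ins for the NATS request and the iam.v1.auth.refresh call; the real nats_session reads the error from the X-Error header:

```cpp
#include <functional>
#include <string>

// Stand-in for a NATS reply carrying an optional X-Error value.
struct reply {
    std::string error;
    std::string body;
};

// Sketch of authenticated_request(): send, and if the reply reports an
// expired token, refresh the JWT and retry the original request exactly once.
reply authenticated_request(const std::function<reply()>& send,
                            const std::function<bool()>& refresh) {
    reply r = send();
    if (r.error == "token_expired" && refresh())
        r = send();  // single retry; a second expiry is returned to the caller
    return r;
}
```

A max_session_exceeded error would instead surface to the caller (the real shell throws, prompting re-login).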

COMPLETED Add compute grid telemetry pipeline   code

This pull request significantly enhances the compute grid's observability by implementing a dedicated telemetry pipeline. It transitions the dashboard's data sourcing from direct, ad-hoc database queries to a centralized, time-series-based system using TimescaleDB. This change improves performance, scalability, and consistency of monitoring data by introducing distinct server-side and node-side samplers that collect and persist metrics, and a unified NATS endpoint for dashboard consumption.

Highlights:

  • Compute Grid Telemetry System: Introduced a comprehensive telemetry system for the compute grid, moving dashboard statistics from ad-hoc live queries to a robust TimescaleDB time-series storage.
  • New TimescaleDB Hypertables: Added two new TimescaleDB hypertables, ores_compute_grid_samples_tbl for server-side metrics and ores_compute_node_samples_tbl for per-node execution statistics, both with 30-day retention policies.
  • Server-Side Poller (compute_grid_poller): Implemented a new compute_grid_poller in ores.compute.service that runs as an async Boost.ASIO coroutine, sampling and storing grid-wide metrics every 30 seconds.
  • Node-Side Reporter (node_stats_reporter): Developed a node_stats_reporter in ores.compute.wrapper to accumulate per-task timing and byte-transfer metrics, publishing node_sample_message to NATS every 30 seconds for persistence by the compute service.
  • NATS Request/Reply for Dashboard: Created a get_grid_stats NATS request/reply handler to serve the latest telemetry snapshot to the dashboard, replacing multiple ad-hoc NATS queries with a single, efficient call.
  • Dashboard Integration: Updated the ComputeDashboardMdiWindow in the Qt client to consume data from the new get_grid_stats endpoint, populating all six statistical boxes from the unified response.

COMPLETED Add service dashboard with live RAG status   code

This pull request introduces a service health monitoring system that provides real-time insights into the status of all running services. It includes a new TimescaleDB hypertable for storing service heartbeats, a reusable heartbeat publisher coroutine, and a Qt UI dashboard for visualizing service health using RAG status indicators. This enhancement improves operational visibility and enables proactive issue detection.

Highlights:

  • Service Health Monitoring: Introduces a comprehensive service health monitoring system using heartbeats and a RAG status dashboard.
  • New TimescaleDB Hypertable: Adds a new TimescaleDB hypertable for persistent storage of service heartbeat samples.
  • Heartbeat Publisher: Implements a reusable header-only coroutine for publishing service heartbeats via NATS.
  • Qt UI Dashboard: Creates a new Qt UI dashboard to display service status with RAG indicators.
  • Service Integration: Integrates heartbeat publishing into all domain services to ensure complete visibility.

COMPLETED E2E compute testing fixes and loading lifecycle refactor   code

This pull request significantly enhances the robustness and usability of the compute grid and related data management features. It introduces a standardized loading lifecycle for client-side models, streamlines change reason handling across various detail dialogs, and improves data integrity by fixing UUID generation and refining platform management for application versions. These changes collectively contribute to a more stable and user-friendly application, particularly for end-to-end testing and data entry workflows.

Highlights:

  • Change Reason Symmetry: Introduced applies_to_new to the change reason domain type and SQL schema, ensuring that 'new record' operations can also be associated with specific, data-driven change reasons. This centralizes change reason prompting into a single helper method in DetailDialogBase for all operation types (create, amend, delete), eliminating duplication across detail dialogs.
  • Compute RLS Policies: Fixed Row-Level Security (RLS) policies for compute applications and app versions, making them visible to all tenants as system-owned global registries.
  • UUID Generation: Replaced hardcoded c0ffee00 UUIDs in the ORE app seed script with gen_random_uuid() and name-based idempotency, improving data integrity and consistency.
  • ReflectCPP Compatibility: Removed a .response<list_settings_response>() call in variability routes that was causing a Clang consteval char/signed char deduction bug in reflectcpp.
  • Loading Lifecycle Refactor: Addressed missing endLoading() calls in all six compute grid MdiWindows, which previously caused the reload button to be permanently disabled after the first load. This was achieved by introducing an AbstractClientModel base class with standard dataLoaded() and loadError() signals, and wiring these signals to endLoading() automatically via EntityListMdiWindow::connectModel().
  • App Version Platform Management: Refactored app_version to support multiple platforms via a new junction table (ores_compute_app_version_platforms_tbl), replacing the previous single-string platform field. This includes updates to the domain, repository, CLI, and Qt UI to manage platform selections using a multi-select widget.
  • Improved UI for App and Workunit Selection: Enhanced AppVersionDetailDialog with a QComboBox for selecting parent applications and WorkunitDetailDialog with QComboBoxes for selecting batches and app versions, replacing plain UUID text boxes for better user experience.
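The endLoading() fix can be sketched in plain C++, with std::function callbacks standing in for the Qt signals. Names mirror the PR (AbstractClientModel, connectModel); the mechanics are heavily simplified:

```cpp
#include <functional>

// Stand-in for the new model base class: standard dataLoaded/loadError
// notifications that every client model emits when a fetch finishes.
struct AbstractClientModel {
    std::function<void()> dataLoaded;
    std::function<void()> loadError;

    void finishLoad(bool ok) { ok ? dataLoaded() : loadError(); }
};

// Stand-in for the window side: connectModel() wires BOTH outcomes to
// endLoading(), so a failed load can no longer leave the reload button
// permanently disabled.
struct EntityListMdiWindow {
    bool loading = false;

    void connectModel(AbstractClientModel& m) {
        auto end = [this] { loading = false; };  // endLoading()
        m.dataLoaded = end;
        m.loadError = end;  // failures must also re-enable reload
    }
};
```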

COMPLETED Add NATS service discovery for HTTP base URL   code

This pull request introduces NATS service discovery for the HTTP base URL, refactors code for better dependency management, and includes an architecture migration plan. The changes aim to improve the client's ability to automatically configure itself and lay the groundwork for a cleaner service architecture.

Highlights:

  • NATS Service Discovery: Qt client now automatically discovers the HTTP server's base URL after login, eliminating manual http_port configuration.
  • Code Reorganization: Protocol types are moved to ores.http (shared utility lib) so ores.qt can include them without linking ores.http.server.
  • Architecture Migration Plan: An architecture migration plan is added for splitting service libraries into *.types (shared contract) and *.core (implementation) components.

COMPLETED Extract API from main component   code

Extract ores.iam.api and ores.compute.api:

This pull request implements the first phase of a significant architecture migration, separating the API contracts from the core logic for the Identity and Access Management (IAM) and Compute domains. This refactoring enhances the system's modularity by clearly defining interfaces and reducing tight coupling between components. Consumers now interact with dedicated API layers for data structures and messaging, while business logic and persistence are encapsulated within the core modules. This change paves the way for a more maintainable and scalable codebase.

Highlights:

  • Architectural Refactoring: The ores.iam and ores.compute modules have been refactored into distinct API and core components to improve modularity and dependency management.
  • New API Modules: Introduced ores.iam.api and ores.compute.api as shared contract layers, containing domain POCOs, IO helpers, eventing types, and protocol/messaging headers.
  • Core Module Renaming: Existing ores.iam and ores.compute modules were renamed to ores.iam.core and ores.compute.core respectively, now housing handlers, service logic, repositories, and registrars.
  • Dependency Updates: All consuming modules, including ores.qt, ores.cli, ores.shell, ores.http.server, ores.wt.service, ores.synthetic, ores.eventing/tests, ores.dq, ores.iam.service, ores.compute.service, and ores.compute.wrapper, have been updated to link against the correct new API or core layers.
  • Test Configuration Fixes: Corrected CMakeLists files in the new API components to properly link ores.testing.lib.

Split ores.refdata into ores.refdata.api and ores.refdata.core:

This pull request significantly improves the modularity and separation of concerns within the ores.refdata component by splitting it into an API-focused module and a core implementation module. This architectural change enhances maintainability, reduces coupling, and clarifies dependencies across the codebase. Additionally, it includes a quality-of-life improvement for local development by automating the loading of test database credentials.

Highlights:

  • Refdata Module Split: The ores.refdata component has been refactored into two distinct modules: ores.refdata.api for public contracts (domain types, eventing, messaging, CSV support) and ores.refdata.core for internal logic (handlers, repositories, generators, services).
  • Consumer Updates: All dependent modules (ores.cli, ores.qt, ores.ore, ores.http.server, ores.wt.service, ores.iam.core, ores.eventing) have been updated to correctly reference the new ores.refdata.api module for their public contract needs or ores.refdata.core for internal logic.
  • Local Development Setup: Added functionality to load ORES_TEST_DB_* credentials from the .env file during CMake configuration, streamlining local development testing.

Split ores.scheduler, ores.assets, ores.variability and ores.synthetic into api and core layers:

This pull request implements a significant architectural refactoring by splitting several key modules into separate API and core layers. This change enhances modularity, clarifies dependencies, and aligns the codebase with a standardized three-layer service pattern. The API layers now exclusively define interfaces and data structures, while the core layers encapsulate the business logic and implementation details. This separation improves maintainability and prepares the system for future development and scaling.

Highlights:

  • Architectural Split: The ores.scheduler, ores.assets, ores.variability, ores.synthetic, ores.trading, and ores.reporting modules have been refactored into distinct *.api and *.core layers. The *.api layers now contain domain types, eventing, and messaging protocol headers, while the *.core layers retain handlers, repositories, generators, and service logic.
  • New API Modules: New ores.assets.api, ores.reporting.api, and ores.scheduler.api modules were added, including their respective source, test, and modeling CMake configurations. This establishes clear boundaries for API definitions.
  • Module Renaming: Existing ores.assets, ores.reporting, and ores.scheduler modules were renamed to ores.assets.core, ores.reporting.core, and ores.scheduler.core respectively, to house their core implementation logic.
  • Dependency Updates: Numerous files across the codebase, including CMakeLists.txt files and C++ source/header files, were updated to reflect the new *.api and *.core naming conventions and adjust include paths and library linkages accordingly.
  • Migration Progress: This change completes Phase 2 of the service architecture migration, ensuring all ten service libraries now adhere to the three-layer api/core/service pattern.

COMPLETED Fix missing tenant_id in system settings tests   code

This pull request addresses a critical issue in the system settings repository tests where the tenant_id was not being properly set when creating system setting objects. By modifying the test helper function and its invocations, the tests now correctly provide a tenant_id, preventing database errors and ensuring the reliability of repository operations related to system settings.

Highlights:

  • Test Function Signature Update: The make_system_setting() function in repository_system_settings_repository_tests.cpp was updated to include tenant_id as a required parameter.
  • Tenant ID Assignment: The tenant_id is now correctly assigned within the make_system_setting() function, addressing a previous omission.
  • Call Site Corrections: All seven call sites of make_system_setting() across the test file have been modified to pass the h.tenant_id().to_string() value, ensuring proper tenant_id population.
  • PostgreSQL Error Resolution: This change resolves an issue where PostgreSQL rejected INSERT statements due to an 'invalid input syntax for type uuid: ""' error, caused by the missing tenant_id.

COMPLETED Fix sccache cache bloat causing slow/cold builds   code

CI builds were taking 2+ hours even for trivial single-file changes. Investigation revealed the GitHub Actions cache was at 11.5 GB — over the 10 GB repo limit, causing LRU evictions of sccache entries and forcing cold full rebuilds.

Root causes

  • Per-PR sccache saves: every PR saved its own ~1 GB sccache entry; 12 entries across concurrent PRs amounted to 8.5 GB of sccache alone.
  • sccache cache too small: the 1000M limit caused within-build LRU eviction across 1,362 source files, so each build started from a partial cache.
  • No cross-PR cache sharing: GitHub cache scoping means each PR can only read from its own scope or from main, so Qt (6 × ~450 MB = 2.7 GB) and sccache entries were duplicated per PR.

Evidence

Step timings from a recent failed canary run (run 23465294275):

  • All pre-build steps (Qt, vcpkg, sccache restore): ~2 minutes
  • Run CTest workflow: 7,149 seconds (119 minutes)
  • Post sccache save: 7s — saved successfully, then evicted by next concurrent PR
  • The Cache CMake build output step added in some PR branches always completes in 1s (cache miss — the debug build output is 8.3 GB locally, far beyond GitHub's per-entry limits).

COMPLETED Move generators to api layer and split http.server   code

This pull request implements a significant architectural refactoring by relocating synthetic data generators from core to API layers across multiple components and streamlining the ores.http.server module. The changes enhance modularity and reduce tight coupling by moving domain-specific HTTP route handlers and an in-memory authentication service to their appropriate domain core or API layers. This effort aligns the codebase with a previously documented architectural plan, improving maintainability and clarity of component responsibilities.

Highlights:

  • Generator Migration: Moved synthetic generators from core to api layers across ores.trading, ores.reporting, ores.iam, ores.refdata, and ores.compute components, along with their associated tests.
  • Dependency Refinement: Corrected stale link dependencies in ores.dq.core and ores.ore to point to the appropriate api libraries, improving architectural clarity.
  • Service Relocation: Relocated the auth_session_service from ores.iam.core to ores.iam.api, as it has no core dependencies and is in-memory only.
  • HTTP Server Split: Restructured ores.http.server by moving domain-specific HTTP route handlers (iam, refdata, variability, assets) into their respective domain core modules, updating namespaces and dependencies.
  • HTTP Server Simplification: Ensured ores.http.server now exclusively manages generic server infrastructure, with compute_routes being the only remaining domain-specific route.
  • Directory Standardization: Standardized generator directory naming from generator/ (singular) to generators/ (plural) within the ores.trading component.
  • New Dependency: Added faker-cxx::faker-cxx as a private dependency to the CMakeLists of affected api libraries to support generator functionality.

COMPLETED Implement full environment isolation for database roles   code

This pull request significantly enhances the system's robustness and maintainability by introducing full environment isolation for PostgreSQL database roles and users. This change prevents cross-environment contamination during development and testing. Alongside this, the PR includes substantial architectural refactoring, moving data generators and core HTTP components to more appropriate API layers, and relocating a key authentication service. These structural improvements streamline development workflows and ensure more reliable testing across different environments.

Highlights:

  • Full Database Environment Isolation: Implemented environment-prefixed PostgreSQL roles and users, ensuring that database operations for one environment (e.g., local1) do not interfere with others (e.g., local2).
  • PostgreSQL Scripting Enhancements: Resolved issues with psql variable substitution within DO blocks by utilizing set_config and current_setting for robust scripting.
  • Improved Test Environment Setup: Introduced ORES_TEST_ENV_CMD to correctly pass environment variables to custom test runners (make rat), ensuring consistent test execution.
  • Architectural Refactoring of Generators: Relocated various data generators from core to api layers across multiple services (IAM, Refdata, Trading, Reporting, Compute), promoting better modularity and dependency management.
  • HTTP Core Extraction: Created a new ores.http.core module to house domain-layer HTTP route handlers, separating concerns from the ores.http.api module.
  • Service Relocation: Moved the auth_session_service from iam.core to iam.api to resolve cyclic dependencies and improve the IAM module's structure.
  • Test Fixes: Addressed variability core tests that were failing due to missing system tenant UUIDs.

COMPLETED Add ORES_PRESET to .env; add start-client.sh   code

This pull request significantly refines the development environment setup by centralizing the build preset configuration within the .env file, ensuring consistency across various operational scripts. It also introduces a new, flexible script for launching the Qt client, enhancing the developer experience with customizable instance colors and names. These changes collectively improve the robustness and usability of the build and runtime environment for ORE Studio.

Highlights:

  • Centralized Preset Configuration: The init-environment.sh script now requires a --preset argument, which is then stored as ORES_PRESET in the .env file, establishing a single source of truth for the build preset across all tools.
  • Service Script Enhancements: The start-services.sh, stop-services.sh, and status-services.sh scripts have been updated to source the ORES_PRESET from the .env file by default, with an option to override it via a command-line argument for ad-hoc use, removing hardcoded defaults.
  • New Qt Client Launch Script: A new start-client.sh script has been added to launch the ORE Studio Qt client, offering options for specifying the build preset, custom instance colors (named or hex), and display names, enabling multiple colored instances to run simultaneously.
  • Emacs Lisp Integration: The ores-prodigy.el Emacs Lisp file now automatically registers services based on the ORES_PRESET value loaded from the .env file, streamlining the setup process for Emacs users.

COMPLETED End to end fixes for grid work   code

This pull request delivers a comprehensive set of fixes and enhancements across the compute grid system, addressing critical end-to-end issues. The changes improve the reliability of service status reporting, ensure accurate result processing and data persistence, and refine the user interface for a more intuitive experience. Key areas of improvement include data flow integrity, authentication handling for internal services, and robust error management within the compute infrastructure.

Highlights:

  • Service Dashboard Improvements: Fixed services showing 'Red' on shutdown by raising the offline threshold to 120 seconds. Corrected version display from '1.0' to the actual version using the ORES_VERSION macro. Enhanced status-services.sh to distinguish between node and service counts.
  • Compute Host Heartbeat and Timestamp Fixes: Addressed the 'Online hosts always 0' issue by implementing an idle heartbeat sender in the wrapper. Resolved a silent timestamp bug where std::format produced nanoseconds that sqlgen::Timestamp couldn't parse, causing last_rpc_time to be NULL. Migrated timestamp conversion to ores.platform::datetime for cross-platform compatibility.
  • Result Processing and Authentication: Fixed results getting stuck 'InProgress' by allowing result_handler::submit to use the service context directly, as the wrapper sends unauthenticated requests.
  • Outcome Code Correction: Corrected the mapping of outcome codes, translating the wrapper's internal 0/1 codes to the domain's 1=Success and 3=ClientError.
  • Host ID Persistence: Resolved 'host_id all-zeros' by adding host_id to submit_result_request, ensuring the wrapper sends its host ID on submission, and the handler persists it to the result record.
  • File Extension Preservation in URIs: Ensured original file extensions (.tar.gz, .csv, .xml) are preserved in package and workunit artifact URIs. Generalized HTTP compute routes from fixed /input and /config segments to a flexible /{artifact} parameter.
  • UI Bug Fixes: Fixed a double-slash URL bug in AppVersionDetailDialog during re-upload URL construction. Resolved a SIGSEGV issue caused by a dangling window pointer in onWindowMenuAboutToShow.
  • Result UI Enhancements: Improved the Result UI by rendering 'State' and 'Outcome' columns as colored pill badges (matching accounts table style) via ClientResultItemDelegate. Mapped labels to human-readable strings (e.g., Done, Running, Failed).
  • Main Window Geometry Persistence: Implemented saving and restoring the main window's geometry across sessions.
  • Compute Wrapper Robustness: Enhanced the compute wrapper with RAII thread cleanup and a terminate() guard on startup exceptions.

Add publication_params_schema to artefact types   code

The GLEIF librarian integration plan (2026-02-08) requires artefact types to declare what parameters they accept so that the publish wizard can render per-dataset configuration pages dynamically. The PublishBundleWizard and its LeiPartyConfigPage are already in place, but the database column that drives them is missing.

Tasks:

  • [ ] Add publication_params_schema jsonb nullable column to ores_dq_artefact_types_tbl (new migration/alter script under projects/ores.sql/create/dq/).
  • [ ] Populate the column for the two LEI artefact types (gleif.lei_parties.small and gleif.lei_counterparties.small) in projects/ores.sql/populate/ with the JSON schema describing their accepted parameters (e.g. target_party_ids for counterparties).
  • [ ] Update the artefact_type domain type and IO in ores.dq.api to include the new nullable field.
  • [ ] Expose the schema via the dq.v1.artefact_types.list response so the Qt client can read it.
  • [ ] Update LeiPartyConfigPage in PublishBundleWizard to read the schema and drive its UI from it rather than hard-coding the fields.

Add counterparty party scoping (librarian Phase 2)   code

Phase 2 of the librarian party/counterparty publication plan (2026-02-11). Phase 1 (dataset dependencies, optional bundle members, party RLS on party_counterparties_tbl) is complete. Phase 2 scopes counterparties to the publishing party, enabling per-party counterparty namespaces.

Prerequisite: the publication_params_schema story above must be merged first.

Tasks:

  • [ ] Add party_id uuid NOT NULL column to ores_refdata_counterparties_tbl (migration script; back-fill existing rows with the tenant root party).
  • [ ] Update counterparty domain type codegen model and regenerate counterparty.hpp / IO helpers to include party_id.
  • [ ] Add party-scoped RLS policy on counterparties_tbl using ores_iam_visible_party_ids_fn().
  • [ ] Update ores_dq_lei_counterparties_publish_fn() to accept and use target_party_ids from the p_params argument.
  • [ ] Populate publication_params_schema for gleif.lei_counterparties.small with the target_party_ids parameter definition.
  • [ ] Update the counterparty repository write path to stamp party_id.
  • [ ] Add LeiCounterpartyConfigPage to PublishBundleWizard (or extend the existing LeiPartyConfigPage) with a party picker for target_party_ids.

Unify compute wrapper nodes as replicated services   code

Currently compute wrapper nodes are a special case: hardcoded instance names (node1, node2, …), pattern-matched separately in status-services.sh, and given a dedicated "nodes" section in the service dashboard. All other services are singletons and need no such handling.

The goal is a uniform service model where any service type has a replica_count (defaulting to 1). Wrappers set theirs to N. Scripts and the dashboard treat all services identically — showing running/desired — with no special-casing for nodes.

The one non-trivial aspect is --host-id assignment: each wrapper replica needs a stable, unique UUID for heartbeat tracking and telemetry. Options are static UUIDs assigned per replica slot in a config file, or UUIDs generated once at first start and persisted to a state file.

Tasks:

  • [ ] Define a service manifest format (e.g. services.toml) with a replicas field per service type.
  • [ ] Rewrite start-services.sh and status-services.sh to drive from the manifest rather than hardcoded names; remove *node* pattern matching.
  • [ ] Generate or persist per-replica host-id values so restarts reuse the same UUID (avoids host churn in ores_compute_hosts_tbl).
  • [ ] Update the service dashboard to show running/desired uniformly for all service types; remove the separate "nodes" counter.
  • [ ] Remove the nodes_running / nodes_stopped special cases from the summary output.

Add support for running ORE samples   code

It would be nice to point at an ORE directory and have ORES automatically pack the samples, send them to the grid, and retrieve the results.

End to end fixes for grid work   code

DONE CDM Phase 1: Rates instrument domain model   code

Implement the rates instrument domain model (Phase 1 of the CDM-inspired instrument design). See plan for architecture.

Covers Swap, CrossCurrencySwap, CapFloor, and Swaption, backed by two new temporal tables: ores_trading_instruments_tbl and ores_trading_swap_legs_tbl.

  • Tasks
    • [X] SQL: ores_trading_instruments_tbl + notify trigger + drop files
    • [X] SQL: ores_trading_swap_legs_tbl + notify trigger + drop files
    • [X] SQL: Register in trading_create.sql, drop_trading.sql, populate_trading.sql + seed data
    • [X] Domain: instrument struct, JSON I/O, table I/O, generator, protocol
    • [X] Domain: swap_leg struct, JSON I/O, table I/O, generator, protocol
    • [X] Repository: instrument entity, mapper, repository, service
    • [X] Repository: swap_leg entity, mapper, repository, service
    • [X] Server: messaging handler + registrar registration
    • [X] Qt UI: ClientInstrumentModel, InstrumentMdiWindow, InstrumentDetailDialog, InstrumentHistoryDialog, InstrumentController, MainWindow integration
    • [ ] CLI: instruments list, instruments add, instruments delete (deferred to follow-up)
  • Notes

    Story is BLOCKED pending PR review. All implementation is complete except CLI commands which are deferred to a follow-up story. Builds cleanly.

    Pull Request: OreStudio/OreStudio#569

DONE CDM Phase 2: FX instrument domain model   code

Implement the FX instrument domain model (Phase 2 of the CDM-inspired instrument design). See plan for architecture.

Covers FxForward, FxSwap, and FxOption, backed by one new temporal table: ores_trading_fx_instruments_tbl (bought/sold currencies and amounts, value_date, settlement, and option fields for FxOption).

  • Tasks
    • [X] SQL: ores_trading_fx_instruments_tbl + notify trigger + drop files
    • [X] SQL: Register in trading_create.sql, drop_trading.sql
    • [X] Domain: fx_instrument struct, JSON I/O, table I/O, generator, protocol
    • [X] Repository: fx_instrument entity, mapper, repository
    • [X] Service: fx_instrument_service
    • [X] Server: messaging handler + registrar registration
    • [X] Qt UI: ClientFxInstrumentModel, FxInstrumentMdiWindow, FxInstrumentDetailDialog, FxInstrumentHistoryDialog, FxInstrumentController, MainWindow integration
    • [ ] Database: recreate to pick up new table

DONE CDM Phase 4: Credit instrument domain model   code

Implement the credit instrument domain model (Phase 4 of the CDM-inspired instrument design). See plan for architecture.

Covers CreditDefaultSwap, CDSIndex, and SyntheticCDO, backed by one new temporal table: ores_trading_credit_instruments_tbl (reference_entity, currency, notional, spread, recovery_rate, tenor, start_date, maturity_date, day_count_code, payment_frequency_code, and optional fields for index_name, index_series, seniority, restructuring, description).

  • Tasks
    • [X] SQL: ores_trading_credit_instruments_tbl + notify trigger + drop files
    • [X] SQL: Register in trading_create.sql, drop_trading.sql
    • [X] Domain: credit_instrument struct, JSON I/O, table I/O, protocol messages
    • [X] Repository: credit_instrument entity, mapper, repository
    • [X] Service: credit_instrument_service
    • [X] Server: messaging handler + registrar registration
    • [X] Qt UI: ClientCreditInstrumentModel, CreditInstrumentMdiWindow, CreditInstrumentDetailDialog, CreditInstrumentHistoryDialog, CreditInstrumentController, MainWindow integration
    • [ ] Database: recreate to pick up new table

DONE CDM Phase 3: Bond instrument domain model   code

Implement the bond instrument domain model (Phase 3 of the CDM-inspired instrument design). See plan for architecture.

Covers Bond, ForwardBond, CallableBond, ConvertibleBond, and BondRepo, backed by one new temporal table: ores_trading_bond_instruments_tbl (issuer, currency, face_value, coupon_rate, coupon_frequency_code, day_count_code, issue_date, maturity_date, and optional fields for settlement_days, call_date, conversion_ratio, description).

  • Tasks
    • [X] SQL: ores_trading_bond_instruments_tbl + notify trigger + drop files
    • [X] SQL: Register in trading_create.sql, drop_trading.sql
    • [X] Domain: bond_instrument struct, JSON I/O, table I/O, protocol messages
    • [X] Repository: bond_instrument entity, mapper, repository
    • [X] Service: bond_instrument_service
    • [X] Server: messaging handler + registrar registration
    • [X] Qt UI: ClientBondInstrumentModel, BondInstrumentMdiWindow, BondInstrumentDetailDialog, BondInstrumentHistoryDialog, BondInstrumentController, MainWindow integration
    • [ ] Database: recreate to pick up new table

TODO Refactor trading reference type boilerplate into shared templates   code

The five new trading instrument reference data types (DayCountFractionType, BusinessDayConventionType, FloatingIndexType, PaymentFrequencyType, LegType) introduced in Phase 0 of the CDM instrument domain model follow the established per-type file convention used throughout the codebase. However, because all five types share the same structure (code + description + provenance), there is an opportunity to reduce boilerplate in several layers without changing the established pattern elsewhere.

Acceptance criteria:

  • CLI add_*_options structs: Replace the five identical add_*_type_options structs and their operator<< implementations with a single add_trading_ref_type_options struct shared across all five types, reducing 10 files to 1.
  • CLI application.cpp template helpers: Extract the repeated export_*, delete_*, and add_* function bodies into private template helpers (export_trading_ref_type, delete_trading_ref_type, add_trading_ref_type) so each of the 15 per-type methods becomes a one-line call.
  • Repository entity structs: Replace the five identical *_entity structs in ores.trading.core with a single trading_ref_type_entity base struct, and type-alias or inherit per type.

Scope: ores.cli and ores.trading.core only. The Qt UI layer (Controllers, MdiWindows, DetailDialogs, HistoryDialogs, ClientModels) follows the same per-type pattern as all other entity types in ores.qt and is out of scope for this story.

STARTED Move badges to database   code

At present the Qt UI relies on some hackery to display badges. Ideally the database should hold badge metadata and use it to drive the UI, so the same data can be reused for Wt. See design.

Phase 1 — Infrastructure

  • Tasks
    • [ ] Create codegen model: badge_severity_domain_entity.json
    • [ ] Create codegen model: code_domain_domain_entity.json
    • [ ] Create codegen model: badge_definition_domain_entity.json
    • [ ] Create codegen model: badge_mapping_junction.json
    • [ ] Run codegen and integrate generated SQL artefacts
    • [ ] Run codegen and integrate generated C++ artefacts
    • [ ] Create population seed data from hardcoded colours
    • [ ] Implement BadgeCache in ores.qt
    • [ ] Wire BadgeCache into client startup sequence

Footer

Previous: Sprint Backlog 14 Next: TBD