Story: NATS migration

This page documents a story in Sprint 14. It captures the goal, current status, acceptance criteria, and the tasks that compose it.

Goal

Replace the entire binary-protocol messaging stack with NATS; this is the sprint pivot. Use cnats for request/reply, JetStream for queues, and subject-based environment isolation. Migrate all 10 domain services, rename ores.comms.shell to ores.shell, and remove the pgmq-based ores.mq.

Status

| Field         | Value                 |
|---------------+-----------------------|
| State         | DONE                  |
| Parent sprint | Sprint 14             |
| Now           | Completed 2026-03-18. |
| Waiting on    | None.                 |
| Next          | None.                 |
| Last touched  | 2026-03-18            |

Acceptance

  • Binary protocol stack removed (~107k lines).
  • ores.nats component added, exposing the nats::service::client interface.
  • 10 domain services migrated.
  • ores.comms.shell renamed to ores.shell.
  • NATS telemetry sampling in place.
  • Tenant provisioning, RBAC, LEI handler, audit attribution, wt-service follow-ups landed.

Tasks

| Task | State | Start | End | Description |
|------+-------+-------+-----+-------------|
| Implement NATS support | DONE | 2026-05-20 | 2026-03-15 | Replace the entire binary protocol stack (SSL/ASIO transport + NNG broker) with cnats request/reply: ~107k lines removed, ~20k lines added; new ores.nats component with nats::service::client interface; 10 domain services wired to NATS (iam, variability, refdata, assets, trading, dq, reporting, scheduler, synthetic, telemetry); ores.comms.shell, ores.http.server, and ores.qt all migrated; subject prefix ores.{tier}.{instance} for environment isolation; _INBOX.* reply-subject prefix bug fixed; ores.mq removed in favour of JetStream via jetstream_admin. |
| Rename ores.comms.shell to ores.shell | DONE | 2026-05-20 | 2026-03-15 | With NATS replacing the binary comms protocol, the shell no longer needs the .comms namespace; the rename completes the cleanup; NATS documentation rewritten across all 10 services with subject-based references. |
| Add NATS telemetry sampling | DONE | 2026-05-20 | 2026-03-16 | Adds a nats.service that stores NATS samples for observability; the question of renaming it to messaging-service or moving it to telemetry is left open. |
| Remaining NATS changes | DONE | 2026-05-20 | 2026-03-18 | Tenant provisioning auto-creates the system party and seeds the bootstrap-mode flag; server-side UUID generation in tenant_handler::save(); account_handler uses make_request_context for tenant scoping; lei_entity_handler::summary() implemented, calling new SQL functions; PQsetNoticeProcessor routes PL/pgSQL notices through Boost.Log; provisioning wizard parity restored (depth + contact + seed); performed_by audit field attributed to the service account; ores.wt.service startup fix. |
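
The ores.{tier}.{instance} subject prefix mentioned in the first task can be illustrated with a small helper. This is a minimal sketch, not the code in ores.nats: the sketch namespace, the function names, the tier and instance values ("dev", "local1") and the subject suffix ("iam.accounts.list") are all hypothetical. The only NATS facts it relies on are that subject tokens are separated by '.' and that '*' and '>' are wildcards, so none of these may appear inside a tier or instance name.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helpers for illustration; not taken from ores.nats.
namespace sketch {

// A single subject token must be non-empty and must not contain the
// token separator '.', the wildcards '*' and '>', or whitespace.
inline bool is_valid_token(std::string_view token) {
    if (token.empty()) return false;
    for (char c : token) {
        if (c == '.' || c == '*' || c == '>' ||
            c == ' ' || c == '\t' || c == '\r' || c == '\n')
            return false;
    }
    return true;
}

// Builds "ores.{tier}.{instance}"; throws if either part is not a
// single valid token.
inline std::string make_prefix(std::string_view tier,
                               std::string_view instance) {
    if (!is_valid_token(tier) || !is_valid_token(instance))
        throw std::invalid_argument(
            "tier and instance must be single subject tokens");
    std::string out = "ores.";
    out.append(tier);
    out.push_back('.');
    out.append(instance);
    return out;
}

// Appends a service-specific suffix (assumed to be a valid subject
// already) to an environment prefix.
inline std::string make_subject(std::string_view prefix,
                                std::string_view suffix) {
    std::string out(prefix);
    out.push_back('.');
    out.append(suffix);
    return out;
}

// Small self-check showing the intended use.
inline void self_check() {
    const std::string prefix = make_prefix("dev", "local1");
    assert(prefix == "ores.dev.local1");
    assert(make_subject(prefix, "iam.accounts.list") ==
           "ores.dev.local1.iam.accounts.list");
}

} // namespace sketch
```

Rejecting '.' and wildcard characters in the tier and instance keeps the prefix at exactly three tokens, which is what lets one environment's subjects be told apart from another's.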

Decisions

NATS over NNG
  A standard, scalable, microservice-friendly base; much less plumbing than the NNG broker for the same outcome.
JetStream replaces ores.mq
  The custom MQ built earlier in this sprint becomes redundant once JetStream is available.
Subject-based environment isolation
  An ores.{tier}.{instance} prefix; cleaner than running a separate broker instance per environment.
ores.shell, not ores.messaging.shell
  With NATS replacing the binary comms protocol, the .comms namespace no longer applies to the shell.
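
To show why a subject prefix is enough to isolate environments on a shared broker, the sketch below reimplements NATS wildcard matching in plain C++: '*' matches exactly one token, and '>' matches one or more trailing tokens and must be the last token of the pattern. It is an illustration only; in practice the NATS server does this matching. The sketch namespace, the function names and the example environments ("dev"/"local1", "prod"/"main") are assumptions, not names from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helpers for illustration; the real matching is done by
// the NATS server, not by application code.
namespace sketch {

// Splits a subject or pattern into its '.'-delimited tokens.
inline std::vector<std::string_view> split_tokens(std::string_view s) {
    std::vector<std::string_view> out;
    std::size_t start = 0;
    while (true) {
        const std::size_t dot = s.find('.', start);
        if (dot == std::string_view::npos) {
            out.push_back(s.substr(start));
            break;
        }
        out.push_back(s.substr(start, dot - start));
        start = dot + 1;
    }
    return out;
}

// Returns true if `subject` would be delivered to a subscription on
// `pattern` under NATS wildcard rules.
inline bool subject_matches(std::string_view pattern,
                            std::string_view subject) {
    const auto p = split_tokens(pattern);
    const auto s = split_tokens(subject);
    for (std::size_t i = 0; i < p.size(); ++i) {
        if (p[i] == ">") {
            // '>' must be the final pattern token and needs at least
            // one remaining subject token to match.
            return i + 1 == p.size() && i < s.size();
        }
        if (i >= s.size()) return false;            // subject too short
        if (p[i] != "*" && p[i] != s[i]) return false;
    }
    return p.size() == s.size();                    // no leftover tokens
}

// Small self-check: a dev subscription never sees prod traffic.
inline void isolation_check() {
    assert(subject_matches("ores.dev.local1.>",
                           "ores.dev.local1.iam.accounts.list"));
    assert(!subject_matches("ores.dev.local1.>",
                            "ores.prod.main.iam.accounts.list"));
}

} // namespace sketch
```

A service that subscribes only under its own ores.{tier}.{instance}.> subtree cannot receive messages published under another environment's prefix, which is the property the decision relies on.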

Out of scope

  • NATS clustering / replication topology.
  • Multi-region NATS deployment.
