Automate new service registration
This page is a capture in the next bucket of the product backlog — a pre-sprint idea, not yet pulled into a sprint as a story.
Background
When ores.marketdata.service was added to the running system, the following
manual steps were required across multiple files, with no single checklist and
no tooling to enforce completeness. Several were discovered only at runtime
(missing NATS cert, missing DB user, missing IAM role, missing IAM
permissions), requiring iterative recreate_database runs.
Manual steps identified (in discovery order)
1. `build/scripts/generate_nats_certs.sh`: add the service name to the `SERVICES` array so that an mTLS client certificate is generated. Without this, the service crashes immediately on NATS connect with `Connection Closed`.
2. `projects/ores.codegen/models/services/ores_services_service_registry.json`: add the service entry (name, psql_var, env_key, iam_role, dml_prefixes, select_tables). Then re-run the `service-registry` code-gen profile to regenerate five files.
3. `projects/ores.sql/teardown_all.sql`: add a `drop role if exists <env>_<service>_service;` entry (not covered by code-gen).
4. `projects/ores.sql/populate/iam/iam_permissions_populate.sql`: register all domain permissions (`<service>::<resource>:<action>` and `<service>::*`). These must exist before any role can reference them.
5. `projects/ores.sql/populate/iam/iam_roles_populate.sql`: create the IAM role and assign permissions. Fails at runtime if permissions from step 4 are absent.
6. `.env`: add `ORES_<SERVICE>_SERVICE_DB_USER` and `ORES_<SERVICE>_SERVICE_DB_PASSWORD` variables (currently done by `init-environment.sh`, but only if the service is known to that script).
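As a rough illustration of how mechanical steps 3 and 6 are, the sketch below derives the role-drop statement and the `.env` variable names from a service name, using only the patterns quoted in the steps above. The function names, the environment name `local`, and the assumption that `<service>` for `ores.marketdata.service` is `marketdata` are all hypothetical; this is not the project's actual tooling.

```shell
#!/usr/bin/env bash
# Sketch only: derives per-service names from the patterns in steps 3 and 6.
# Function names and the example arguments are hypothetical.

# Print the teardown statement for one service role
# (pattern from step 3: <env>_<service>_service).
drop_role_stmt() {
    local env_name="$1" service="$2"
    printf 'drop role if exists %s_%s_service;\n' "$env_name" "$service"
}

# Print the two .env variable names for one service
# (pattern from step 6: ORES_<SERVICE>_SERVICE_DB_USER / _DB_PASSWORD).
env_var_names() {
    local service_upper
    service_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf 'ORES_%s_SERVICE_DB_USER\n' "$service_upper"
    printf 'ORES_%s_SERVICE_DB_PASSWORD\n' "$service_upper"
}

# Example: an assumed environment "local" and service "marketdata".
drop_role_stmt local marketdata
env_var_names marketdata
```

Because both outputs depend only on the service name, they are natural candidates for generation from the registry rather than hand editing.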
Proposed improvements
- Extend `generate_nats_certs.sh` to derive its service list from `service_vars.sh` (`SERVICE_NAMES` array) rather than a hard-coded `SERVICES` array, so adding to the registry automatically covers cert generation.
- Extend the `service-registry` code-gen profile (or add a new template) to also emit the IAM permissions and IAM role seed SQL for each service, driven by the `dml_prefixes` and `select_prefixes` fields already in the registry. This would cover steps 4 and 5 automatically.
- Extend the `service-registry` code-gen profile to emit the `teardown_all.sql` fragment for each service (or generate a separate `teardown_services.sql` that is included), covering step 3.
- Add a new-service checklist to `CLAUDE.md` or a dedicated `doc/how-to/add-a-new-service.md` covering all manual steps, so that until full automation is in place nothing is missed.
- Add a validation step to `recreate_database.sh` (or `validate_schemas.sh`) that cross-checks: every service in `service_vars.sh` has a NATS cert, an IAM role, and at least one registered permission.
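The cross-check in the last bullet could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the real script: the function name `check_services`, the cert layout `<cert_dir>/<service>.crt`, and the idea that the roles seed file references the same `<service>::` prefix as the permissions file are all hypothetical. The demo uses throwaway fixture files and a made-up second service, `refdata`.

```shell
#!/usr/bin/env bash
# Sketch only. File layout and matching rules are assumptions for illustration.

# check_services CERT_DIR PERMS_SQL ROLES_SQL SERVICE...
# Prints one line per problem; returns non-zero if any service is incomplete.
check_services() {
    local cert_dir="$1" perms_sql="$2" roles_sql="$3"
    shift 3
    local svc failed=0
    for svc in "$@"; do
        # Assumed cert naming: <cert_dir>/<service>.crt
        if [ ! -f "$cert_dir/$svc.crt" ]; then
            echo "$svc: missing NATS cert"
            failed=1
        fi
        # Permissions follow the <service>::<resource>:<action> pattern.
        # Note: this is a loose substring match, not a SQL parser.
        if ! grep -q -F -- "${svc}::" "$perms_sql"; then
            echo "$svc: no registered permission"
            failed=1
        fi
        # Assumption: the roles seed references the same <service>:: prefix.
        if ! grep -q -F -- "${svc}::" "$roles_sql"; then
            echo "$svc: no IAM role"
            failed=1
        fi
    done
    return "$failed"
}

# Demo with throwaway fixtures. A real caller would source service_vars.sh
# and pass "${SERVICE_NAMES[@]}" instead of literal names.
tmp=$(mktemp -d)
touch "$tmp/marketdata.crt"
echo "marketdata::feeds:read" > "$tmp/perms.sql"
echo "marketdata::*" > "$tmp/roles.sql"
check_services "$tmp" "$tmp/perms.sql" "$tmp/roles.sql" marketdata refdata \
    || echo "validation failed"
rm -rf "$tmp"
```

Reporting every gap before failing, rather than stopping at the first one, is a deliberate choice here: it avoids the iterative discover-and-rerun cycle described in the background.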
Tasks
- [ ] Drive `generate_nats_certs.sh` from `SERVICE_NAMES` in `service_vars.sh` instead of a hard-coded array
- [ ] Add service-registry code-gen template for IAM permissions seed SQL
- [ ] Add service-registry code-gen template for IAM role seed SQL
- [ ] Add service-registry code-gen template (or include fragment) for `teardown_all.sql` service role drops
- [ ] Add cross-check to `recreate_database.sh` / `validate_schemas.sh`: every service has NATS cert + IAM role + ≥1 permission
- [ ] Document the remaining manual steps in `doc/how-to/add-a-new-service.md` as an interim checklist