Bootstrap, Role Lifecycle and Architecture Violations
Table of Contents
This document is a follow-up to WSL DB Permission Errors: Cross-Service Grant Investigation. That investigation correctly identified two missing GRANTs and applied targeted fixes, but left the root cause unresolved — why an "old" Linux box did not see the same errors as a freshly-provisioned WSL box, given that both run identical code.
The answer is not about WSL or socket-vs-TCP authentication. It is about
the role lifecycle in recreate_database.sh:
recreate_database.sqldrops only the database. Cluster-level roles and users are created idempotently — they are never dropped or revoked. Any membership granted to a service user in a previous era of the schema persists across every subsequentrecreate_database.shrun.
On the old Linux box, the domain service users (iam_service,
refdata_service, compute_service, …) were originally created as members
of the broad rw_role. The 2026-03 least-privilege change removed
in role rw_role from the create paths but did not add any code to revoke
the prior membership. Result: the old box's service users keep their
historical rw_role grant — which carries
SELECT, INSERT, UPDATE, DELETE, TRUNCATE on all tables in schema public
(see setup_schema.sql:73-86) — and so missing per-service GRANTs never
trigger a failure there. A fresh box creates the users for the first time
under the strict model and immediately hits the missing GRANTs.
A second, distinct concern surfaced during the same investigation: the IAM
service today reads from and writes to tables it does not own
(ores_refdata_parties_tbl, ores_variability_system_settings_tbl), in
direct violation of the
Service Account RBAC
Design. The cross-service GRANTs added to plug the WSL outage encode that
violation into the permission matrix; they are a stop-gap, not a target
state.
This document covers:
- The cluster-level role lifecycle bug.
- The bootstrap sequence and what makes it special.
- The cross-service direct-access violations in IAM (and a smaller one in
compute). - A remediation plan that separates "must fix to make recreate-db reproducible" from "should fix to honour the RBAC architecture".
Part 1 — The Role Lifecycle Bug
What recreate_database.sh actually does
The shell wrapper validates environment, then runs two phases:
- As
postgressuperuser,recreate_database.sql:- DROPs the target database
(
drop database if exists :db_nameatrecreate_database.sql:55). - Creates/updates the group roles idempotently
(
recreate_database.sql:73-92—create role ... nologinguarded bynot exists). - Creates/updates the login users idempotently
(lines 110-189, with the domain service users delegated to the
generated file
create/iam/service_users_create.sql).
- DROPs the target database
(
- As the DDL user,
setup_database.shrunssetup_schema.sqlwhich recreates tables, runs population, then appliescreate/iam/iam_service_db_grants_create.sql.
The per-user blocks in (1) all follow the same pattern:
select set_config('ores.cur_user', :'iam_service_user', false); do $$ begin if not exists (select 1 from pg_roles where rolname = current_setting('ores.cur_user')) then execute format('create user %I', current_setting('ores.cur_user')); end if; end $$; alter user :iam_service_user with password :'iam_service_password'; grant :service_role to :iam_service_user; alter role :iam_service_user set search_path to public;
The block can only add state to an existing user: a password reset, a
new grant service_role to ..., a new search_path. It never:
- revokes prior group memberships,
- drops the user before recreating it,
- alters non-listed session settings (e.g.
app.current_tenant_idleft over from an earlier era), - removes object-level GRANTs the user accumulated previously.
Importantly, the script's docstring claims the operation is
"idempotent and scoped to this environment" (recreate_database.sql:25-27).
This is only true for fresh clusters. On a cluster that has carried
service users across schema epochs, "idempotent" means "monotonically
accumulating".
Why the two machines diverged
A standalone, never-touched drop_roles.sql exists at the top of
projects/ores.sql/ and does drop every user and group role for an
environment label. It is documented as
"The environment database must be dropped BEFORE running this script." and
"USAGE: psql -U postgres -v env_label=local2 -f drop_roles.sql".
recreate_database.sh never calls it. Nothing in the docs tells a developer
to call it before recreate_database.sh. As a result:
| Machine | Service users existed before strict-grant era? | Effect |
|---|---|---|
| Old Debian Linux | Yes — created as in role rw_role |
Carry rw_role membership ⇒ DML on every table in public. Missing per-service GRANTs are invisible. |
| Fresh WSL | No | Created under strict model with only service_role. Missing GRANTs surface immediately as permission denied. |
This is sufficient to explain every symptom in
wsl-db-permission-investigation.org. The TCP-vs-socket hypothesis at the
bottom of that document is a red herring: pg_hba.conf controls
authentication, not authorisation. Once a connection is authenticated as
iam_service, the same per-role GRANTs apply regardless of transport.
Evidence
The grant matrix file iam_service_db_grants_create.sql has only three
commits in its history (git log):
e0a707b5— initial introduction (merge offeature/party-isolation-rls-policies).c6e6d5aa— addsgrant select on ores_variability_system_settings_tbl.79e2db2e— addsores_refdata_partiesDML for IAM andores_dq_change_reasons_tblSELECT for compute.
For every commit prior to c6e6d5aa, the file lacked the IAM grants. A
recreate_database.sh run on any machine whose iam_service user was
created before the file existed would still successfully access those
tables — through the leftover rw_role membership.
The pre-existing setup_user.sql (one-shot script for new clusters,
setup_user.sql:357-368) creates service users with no group membership
and only adds them to service_role. The "old box" almost certainly
predates this version, because at that time setup_user.sql used
in role rw_role for every user; the change to strip rw_role from
service users is what motivated iam_service_db_grants_create.sql in the
first place.
Recommended fix
Three options, in increasing order of intrusiveness:
- (Minimum) Document the requirement: a clear note in the
recreate_database.shhelp text and inCLAUDE.mdthatdrop_roles.sqlmust be run once after pulling any schema change that touches role membership. This is unsatisfying — it relies on humans noticing. - (Recommended) Add a
--reset-roles(or default-on) phase torecreate_database.shthat invokesdrop_roles.sqlbeforerecreate_database.sql. The destructive nature warrants a flag, but it should be on by default and require--keep-rolesto opt out. (Stronger) Make the create blocks converge rather than accumulate. Every user block should explicitly:
revoke <every_other_group_role> from <user>before granting the intended one,alter role <user> reset allto clear stale session settings,reassign owned by ... to ... ; drop owned by ...patterns where object-level state could leak across epochs.
This is harder to get right (must enumerate all group roles, must avoid nuking the database we just created) but means
recreate_database.shbecomes truly idempotent regardless of cluster history.
Option (2) is the right pragmatic choice: drop_roles.sql already exists,
already handles all dependency ordering, and the cost of a flag is
trivial. Option (3) can follow if and when we hit a second class of
"persistent state" bug.
The drop_roles.sql list of users is hard-coded
(drop_roles.sql:59-79). It must be regenerated from the service registry
when new services are added; today this is manual.
Related: populate files diverge from the service registry
Already noted in the WSL investigation — http_user, wt_user,
compute_wrapper_user exist in
populate/iam/iam_service_accounts_populate.sql and
populate/iam/iam_service_account_roles_populate.sql but are not in the
service registry model. A full codegen regeneration would silently strip
these entries, breaking IAM accounts for HTTP, Wt and the compute wrapper
on a clean rebuild. Either:
- Move the static infrastructure-user inserts into a non-generated
populate/iam/iam_infrastructure_service_accounts_populate.sql, or - Add
"category": "infrastructure"entries to the service registry model so codegen emits them deterministically.
Part 2 — The Bootstrap Sequence
Bootstrap is the only operating mode in which the system runs before IAM
has a usable account or party graph. It is therefore the only place where
the normal RBAC story (every NATS call carries a verified JWT, every DB
write is authorised by check_permission()) cannot fully apply. The rules
of engagement must be stated explicitly so we don't accidentally relax
them outside of bootstrap.
Phases
The flow, from a cold cluster to a working tenant admin, has five phases:
Phase 0 — Cluster preparation (postgres superuser)
Run once per OS install (or once per WSL reset). Performed by
recreate_database.sh:
- Drop the target database.
- Create/update group roles:
owner,rw,ro,service(env-scoped, e.g.ores_local1_owner). - Create/update login users:
ddl_user,cli_user,wt_user,shell_user,http_user,readonly_user, test users, and one<service>_serviceper entry in the service registry.
Bootstrap-specific note: there is no notion of tenant or authentication here. The script is run by a privileged operator (DBA / developer). This is acceptable because the operation is purely DDL on the cluster.
Phase 1 — Schema and data setup (DDL user)
Performed by setup_database.sh → setup_schema.sql as the DDL user
(member of owner_role):
- Create all tables/functions/triggers under
publicviacreate/create.sql. - Populate foundation data (
foundation_populate.sql). - Populate all other lookup/dimension data.
- Apply broad role GRANTs:
rw_rolegets DML on all tables,ro_rolegets SELECT. - Apply per-service GRANTs via
create/iam/iam_service_db_grants_create.sql(the "least-privilege matrix"). - Initialise instance-level feature flags
(
instance/init_instance.sql).
A key artefact of phase 1 is the system tenant (UUID
ffffffff-ffff-ffff-ffff-ffffffffffff) with:
- a "system" party owned by the system tenant,
- the service account rows (one per registered
<service>_service_user), - the bootstrap-mode flag set to
trueinores_variability_system_settings_tblfor the system tenant.
Phase 2 — Service startup (each service)
Every service (iam, refdata, compute, …) starts as its dedicated DB
user and:
- Connects to NATS.
- Logs in to IAM at
ores.<env>.<inst>.iam.v1.auth.service-loginusing its service-account credentials, receiving a JWT. - Caches the JWT, attaches it as
Authorization: Bearer ...on every outbound request, and renews on expiry.
Bootstrap-specific note: the IAM service itself cannot use the same login path; it would need to talk to itself. It uses a self-issued JWT signed with the same private key it uses to verify others.
Phase 3 — Initial admin (bootstrap mode only)
While system.bootstrap_mode = true, IAM exposes
ores.<env>.<inst>.iam.v1.bootstrap.create-admin which:
- Hashes the password in C++ (scrypt).
- Calls SQL function
ores_iam_create_initial_admin_fn(...)asSECURITY DEFINER. This function creates the account, assigns theSuperAdminrole, associates it with the system party, and flipssystem.bootstrap_modetofalse.
The handler at bootstrap_handler.hpp:90 guards on
bootstrap_mode_service::is_in_bootstrap_mode(), so this subject becomes a
no-op once the flag is cleared. After this point, no JWT-less subject can
mutate identity data.
Phase 4 — Tenant provisioning (admin only)
ores.<env>.<inst>.iam.v1.bootstrap.provision-tenant creates a new tenant
with its own admin. The handler at bootstrap_handler.hpp:153:
- Switches to system-tenant context.
- Calls
ores_iam_provision_tenant_fn(...)asSECURITY DEFINER— this creates the tenant row, copies system-tenant reference data (roles, permissions, lookups), seeds the WRLD business centre, creates the new tenant's system party, and sets that tenant'sbootstrap_modeflag. - Switches to the new tenant's context.
- Calls
account_service::create_account(...)for the tenant admin as the IAM service DB user, directly — no SQL function, no security definer. This needsINSERTonores_iam_accounts_tbl— which the IAM service already has via itsores_iam_DML prefix grant. - Calls
refdata::repository::party_repository::read_system_party(...)to look up the new tenant's system party, as the IAM service DB user. This needsSELECTonores_refdata_parties_tbl— which is the architecture violation. - Calls
account_party_service::save_account_party(...)to link the new admin to the system party, as the IAM service DB user. This needsINSERTonores_iam_account_parties_tbl(own table) andSELECTonores_refdata_parties_tblfor FK validation, depending on how the constraint is defined.
Why split: steps 1-3 absolutely need SECURITY DEFINER (they cross tenant
boundaries and write to system-tenant rows that the IAM service must not be
able to mutate at will). Steps 4-6 do not strictly need it, but the
ergonomic price is the cross-service GRANT for ores_refdata_parties_*.
Why bootstrap is special
Three structural reasons the normal RBAC story can't fully apply during bootstrap:
- No accounts yet. JWT issuance requires an authenticated account. The first account is the very thing we are creating.
- No party graph yet. Many cross-service operations rely on the tenant having a "system party". Provisioning is what creates it.
- Service-to-service auth on the boundary. The IAM service issues JWTs; it can't validate its own callers via the same loop without an infinite recursion at startup.
The accepted relaxations during bootstrap are:
- The two
bootstrapsubjects are reachable without a JWT, gated by thesystem.bootstrap_modeflag. SECURITY DEFINERSQL functions run as the DDL user for the operations that cross schemas (tenant provisioning, initial admin creation).- Inside the bootstrap handler, the IAM C++ code runs as the IAM service DB user and performs the residual writes/reads it needs to "stitch" the new admin to the new tenant.
The relaxations must end the moment bootstrap_mode flips off. Today
they do — but the cross-service GRANTs added to make Phase 4 work do
not automatically expire, so the IAM service retains DML on
ores_refdata_parties_tbl forever. That is the bridge between bootstrap
and the next section.
Part 3 — Architecture Violations in IAM
The Service Account RBAC Design is explicit about cross-component access (excerpt):
| Service | Own schema | Shared SELECT |
|---|---|---|
ores_iam |
iam |
— (none) |
In practice, the IAM service today reaches into two other components' tables, and the GRANTs file encodes the violation:
-- iam_service_db_grants_create.sql:122-126 select _ores_grant_dml_fn('ores_iam_', :'iam_service_user'); select _ores_grant_dml_fn('ores_refdata_parties', :'iam_service_user'); -- VIOLATION grant select on ores_variability_system_settings_tbl to :iam_service_user;-- VIOLATION
Violation 1 — variability.system_settings
Code paths:
auth_handler.hpp:139-155(auth_is_tenant_bootstrap_mode): every login checks the target tenant's bootstrap flag.auth_handler.hpp:173-183(reload_token_settings): at IAM startup, read JWT TTL settings.account_handler.hpp: same pattern (signup / account create flows consult signup-policy flags).bootstrap_mode_service.hpp:112: linked directly,variability::service::system_settings_serviceis a member.
Why it exists. ores.variability owns runtime configuration — feature
flags, TTLs, signup policy. The IAM service uses it as a settings store.
Linking the variability core was the path of least resistance: it is the
same C++ binary, the SQL is already written, no NATS hop in the hot path
of login.
Architectural intent. These reads should be cached. They are read-mostly: settings change at human cadence (minutes/hours), login happens at subsecond cadence. Two clean options:
- IAM caches the settings in-process and refreshes them on a NATS "system-settings-changed" notification published by the variability service.
- IAM owns its own settings table for IAM-relevant flags
(
iam.token_settings,iam.signup_policy,iam.bootstrap_state) and stops reading the cross-tenantores_variability_system_settings_tblentirely. This is the strictest interpretation of the RBAC plan, and probably the right end state, because the bootstrap flag and JWT settings are really IAM concerns, not generic "variability".
Option (2) is cleaner. Option (1) is a smaller intermediate step.
Violation 2 — refdata.parties
Code paths:
bootstrap_handler.hpp:203:party_repository.read_system_party(...)during tenant provisioning.auth_handler.hpp:75-106(auth_compute_visible_party_ids,auth_lookup_party): every login enriches the JWT with the party graph the user can see (read_descendantson the party hierarchy).account_handler.hpp: same lookups when servicing account CRUD that needs to display "associated party".
Why it exists.
- Provisioning: the IAM service needs the system-party id of the
newly-created tenant to link the first admin. Could be returned from the
SECURITY DEFINERprovisioning function as an extra OUT parameter. - Login enrichment: visible-party computation is a hot path; doing a NATS round-trip to refdata on every login adds latency and a hard availability coupling.
Architectural intent.
- The provisioning case is easy to remove: extend
ores_iam_provision_tenant_fnto return(tenant_id, system_party_id)and remove the directparty_repositorycall. Thesave_account_partyinsert intoores_iam_account_parties_tblstays (that table belongs to IAM), and the FK toores_refdata_parties_tblcan be enforced by an internal cached id rather than a runtime SELECT. - The login case is harder. Options:
- Refdata publishes a "party graph snapshot per tenant" via NATS
JetStream; IAM consumes it and answers
visible_party_idslocally. - Refdata exposes
refdata.v1.parties.read-descendantsandrefdata.v1.parties.read-latestas NATS request/reply subjects, IAM calls them with a short timeout and a fallback ("user sees only their own party"). - The visible-party set is precomputed and cached in
ores_iam_account_parties_tblextension columns, kept in sync by an IAM-side subscriber to refdata party-change notifications.
- Refdata publishes a "party graph snapshot per tenant" via NATS
JetStream; IAM consumes it and answers
(2) is the smallest change. (3) is the best for hot-path latency. (1) is the most "right" architecturally but requires snapshot semantics in JetStream we don't have today.
Smaller violation in compute
The same GRANTs file gives compute_service SELECT on
ores_refdata_parties_tbl and ores_dq_change_reasons_tbl. These are
similarly hot-path reads (every pricing job needs the party / change
reason) and should follow the same remediation pattern as the IAM
visible-party case.
Part 4 — Remediation Plan
Two independent tracks; both are needed but they have very different risk profiles.
Track A — Make recreate_database.sh deterministic
Goal: identical results on any cluster, regardless of history.
- Add
drop_roles.shwrapper that:- Generates the user list from the service registry (so new services
are picked up automatically), parameterised by
env_label. - Calls
drop_roles.sqlwith that list (replacing the current hard-coded array).
- Generates the user list from the service registry (so new services
are picked up automatically), parameterised by
- Have
recreate_database.shcalldrop_roles.shafterdrop databaseand beforerecreate_database.sql. Gate with--keep-rolesfor the rare case where role membership must be preserved (CI parallelism using the same role across DBs, etc.). - Regenerate
drop_roles.sqlfrom the service registry as part of codegen (or replace it with a generated filecreate/iam/service_users_drop.sqlthat complementsservice_users_create.sql). - Update
recreate_database.sqldocstring to drop the "idempotent" wording until step (1) lands. - Move static infrastructure-service accounts (
http_user,wt_user,compute_wrapper_user) out of the generated populate files into a non-generated counterpart, or add a category to the registry. (Already noted in the WSL doc.) - Add a smoke test:
ci/scripts/test_recreate_idempotent.shthat runsrecreate_database.shtwice and asserts both runs produce identicalpg_roles/pg_auth_members/has_table_privilegesnapshots for a representative table set.
Effort: small. Risk: low (everything is local DDL). This should land before any further work on Track B.
Track B — Honour the RBAC architecture
Goal: remove cross-component DB GRANTs from the IAM (and compute) service users. Done in this order to minimise breakage:
- Move IAM-owned settings out of variability. Create
ores_iam_token_settings_tblandores_iam_bootstrap_state_tbl, migrate the relevant keys, delete the IAM ⇒ variability GRANT. Keep the variability service for genuinely cross-cutting flags. Smallest change with the highest ratio of mess removed. - Return system_party_id from provisioning. Modify
ores_iam_provision_tenant_fnto return a record, drop theparty_repository.read_system_partycall inbootstrap_handler.hpp:204. Re-evaluate whethersave_account_partystill works without DML onores_refdata_parties_tbl(it should — that table is read-only from IAM's perspective once the id is in hand, and the IAM-sideaccount_partiestable is what we write). - Introduce refdata NATS read API for parties. Add
refdata.v1.parties.read-latestandrefdata.v1.parties.read-descendantsas authenticated request/reply subjects. Switchauth_compute_visible_party_ids/auth_lookup_partyto use them. This is a real refactor with testing and a latency-budget review, but it's the heart of the RBAC plan. - Drop the IAM ⇒ refdata DML grant once (2) and (3) ship. Tighten the
compute service the same way: replace its
grant select on ores_refdata_parties_tblwith a NATS call to the new subjects. - Cache aggressively in IAM and compute. Hot paths cannot survive an uncached NATS hop. Put the cached values behind a clear refresh contract (TTL + invalidation on refdata change notifications).
Effort: medium-to-large. Risk: medium (touches login latency and the provisioning flow). Sequence the steps so each one independently de-risks the next.
Out of scope but worth recording
drop_roles.sql's hard-coded user list will rot. Track A item 3 addresses it.- The IAM service core links
ores.variability.coreat the CMake level. Once Track B step 1 lands, that link should be removed entirely (withiam.core/CMakeLists.txtlosingores.variability.core). If we keep it around for any reason, it must be flagged as deprecated to prevent new code from adding more cross-component reads. - Bootstrap-mode flag location: today it lives in
ores_variability_system_settings_tbl. Moving it toores_iam_bootstrap_state_tblalso removes the need for variability to know about IAM concepts.
Part 5 — Resolution (2026-05-13)
A code audit completed in May 2026 clarified the true root cause of the cross-component SELECT grants and identified a much simpler resolution than the cache-table approach originally proposed.
Root cause of most SELECT grants
The SELECT ON ores_iam_tenants_tbl grants on 16 service users and the
SELECT ON ores_dq_change_reasons_tbl grants on 4 service users existed
because the shared trigger validator functions:
ores_iam_validate_tenant_fn(iniam_tenant_functions_create.sql)ores_dq_validate_change_reason_fn(indq_change_reason_functions_create.sql)
were defined as SECURITY INVOKER (the PostgreSQL default). Every
service component's INSERT/UPDATE triggers call these functions. Because
they execute as the triggering session user, that user needed SELECT on
the tables the functions read.
Note: PostgreSQL FK constraint enforcement does not require SELECT on
the referenced table — the check is done internally by the kernel. These
grants were not for FK enforcement; they existed solely because of the
SECURITY INVOKER trigger functions.
Fix applied
Both validator functions were changed to SECURITY DEFINER with
SET search_path = public, pg_temp. They now run as their DDL-user
definer; no service user requires cross-component SELECT for trigger
validation. The 16 tenant SELECT grants and the 4+3 change-reason
SELECT grants were removed from the service registry and regenerated out
of iam_service_db_grants_create.sql.
Two additional grants had no code justification at all:
SELECT ON ores_refdata_parties_tblforcompute_service_user— no trigger or C++ code in compute queries this table.SELECT ON ores_scheduler_*(prefix) forreporting_service_user—scheduler_job_idin the reporting tables is a soft UUID column; no trigger or C++ code in reporting queries scheduler tables.
Both were removed.
What remains
After the fix, the only cross-component grants that survive are genuine architectural coupling documented in the isolation plan (strict-service-table-isolation.org):
- IAM reads
ores_variability_system_settings_tbl(Phase 4.1) - IAM has DML on
ores_refdata_parties(Phase 4.2/4.3) - Workflow has DML on
ores_iam_*andores_refdata_parties(Phase 5.2) - ORE has DML on
ores_workflow_*(Phase 5.3) - Trading holds SELECT on
ores_refdata_*prefix (Phase 5.1 — inline trigger validations that need their own SECURITY DEFINER functions)
Track A from Part 4 has been implemented (mandatory role drops in
recreate_database.sh). Track B is the isolation plan Phases 4–5.
Appendix — File Pointers
| Concern | File |
|---|---|
| Role lifecycle (drop) | projects/ores.sql/drop_roles.sql |
| Role lifecycle (create) | projects/ores.sql/recreate_database.sql |
| Role lifecycle (one-shot create) | projects/ores.sql/setup_user.sql |
| Service users (generated) | projects/ores.sql/create/iam/service_users_create.sql |
| Per-service GRANT matrix | projects/ores.sql/create/iam/iam_service_db_grants_create.sql |
| RW role broad GRANT | projects/ores.sql/setup_schema.sql:73-86 |
| Bootstrap handler | projects/ores.iam.core/include/ores.iam.core/messaging/bootstrap_handler.hpp |
| IAM auth handler (party / settings) | projects/ores.iam.core/include/ores.iam.core/messaging/auth_handler.hpp |
| IAM bootstrap-mode service | projects/ores.iam.core/src/service/bootstrap_mode_service.cpp |
| RBAC design | doc/plans/2026-03-28-service-rbac-design.org |
| WSL investigation (predecessor) | doc/analysis/wsl-db-permission-investigation.org |
See also
- System Bootstrapping Concepts and Provisioning — user-facing walkthrough of the bootstrap sequence and provisioning wizards.