ORE Studio SQL Schema

Database schema and management for ORE Studio. To understand the database lifecycle see Database Lifecycle. For recipes related to SQL see SQL Recipes.

Schema Mental Model

The ORES database uses a single public schema with an ores_ prefix on all table names. This flat structure simplifies queries, tooling, and debugging while maintaining clear table categorization through naming conventions.

Single Public Schema

┌─────────────────────────────────────────────────────────────────┐
│  public schema (all tables use ores_ prefix)                    │
│  ─────────────────────────────────────────────────────────────  │
│                                                                 │
│  Data Governance (ores_dq_* prefix):                            │
│  • Governance: dimensions, domains, change_reasons              │
│  • Registry: datasets, catalogs, bundles, methodologies         │
│  • Staging: *_artefact_tbl (source for publication)             │
│  • Audit: publications, bundle_publications                     │
│                                                                 │
│  Reference Data (ores_refdata_* prefix):                        │
│  • Countries, currencies, coding schemes, etc.                  │
│  • FPML financial products reference data                       │
│                                                                 │
│  IAM (ores_iam_* prefix):                                       │
│  • Accounts, roles, permissions                                 │
│  • Login tracking and sessions                                  │
│                                                                 │
│  Assets (ores_assets_* prefix):                                 │
│  • Images, tags, image-tag mappings                             │
│                                                                 │
│  Variability (ores_variability_* prefix):                       │
│  • System settings for runtime configuration                    │
│                                                                 │
│  Geo (ores_geo_* prefix):                                       │
│  • IP to country mappings                                       │
└─────────────────────────────────────────────────────────────────┘

Key Principles

Consistent Naming Convention

All ORES tables follow the pattern ores_<category>_<name>_tbl:

  • ores_dq_* - Data quality/governance tables
  • ores_refdata_* - Reference data tables
  • ores_iam_* - Identity and access management tables
  • ores_assets_* - Asset management tables
  • ores_variability_* - Feature flag tables
  • ores_geo_* - Geolocation tables

This prefix-based approach provides namespace isolation within the public schema while enabling simple, unqualified table references in queries.

Unidirectional Data Flow for Reference Data

Reference data flows in one direction: from staging artefact tables to production reference tables via the publication mechanism. This ensures:

  • Clear provenance: all reference data originates from catalogued datasets
  • Audit trail: publications are tracked with timestamps and record counts
  • Reproducibility: reference data state can be reconstructed from artefacts
ores_dq_*_artefact_tbl ──[publish]──► ores_refdata_*_tbl

Data Governance is Extensible

The data governance tables (ores_dq_*) are not read-only. Users can extend them with:

  • Custom change reasons for domain-specific audit trails
  • New methodologies for data derivation/transformation logic
  • Custom datasets (e.g., stress testing scenarios, derived market data)
  • New artefact tables for additional data types

This allows the same infrastructure to support both shipped reference data and user-generated derived data (e.g., stress test scenarios).

Multiple Write Sources

While reference data enters production tables exclusively via publication, the database also receives writes from normal operational processes:

  • Trading activity writing to transaction tables
  • Market data feeds updating prices/rates
  • IAM operations (user registration, role assignments)
  • Session tracking and audit logs

Clear Responsibility Boundaries

Question Answer
Where do I define a new dataset? ores_dq_datasets_tbl
Where do I stage data before publishing? ores_dq_*_artefact_tbl
Where does my application read from? ores_refdata_*, ores_iam_*, etc.
How does reference data get into production? Only via publication
Can I extend the governance model? Yes, add to ores_dq_*
Can I create custom derived datasets? Yes, same workflow as shipped data

Audit Trail Metadata Conventions

Every versioned table includes four audit metadata columns:

Column Type Set by Meaning
modified_by text Server (from auth->username) The end user who requested the change.
performed_by text DB trigger (current_user) The database service account that executed the write.
change_reason_code text Client (soft FK to change reasons) Categorises why the change was made.
change_commentary text Client Free-text explanation of the change.
  • modified_by: identifies who requested the change (the human or API account). The application server sets this from the authenticated session. For seed/bootstrap data where no session exists, the trigger defaults it to current_user (the DB role).
  • performed_by: identifies how the change was executed (which DB service account). Always set by the database trigger to current_user. Application code must never set this field; it should be left empty so the trigger fills it.
  • change_reason_code: soft foreign key to ores_dq_change_reasons_tbl. Used for governance audit trails to categorise modifications.
  • change_commentary: optional free-text explanation providing additional context for the change.

Future improvement: modified_by and performed_by should have a soft FK to ores_iam_accounts_tbl.username to prevent inconsistencies and enable referential auditing. This requires handling seed/bootstrap rows where no account exists yet (e.g. by using a sentinel system account).

Schema Contents Summary

Data Governance Tables (ores_dq_* prefix)

Category Tables Extensible
Change Control ores_dq_change_reason_categories_tbl Yes
  ores_dq_change_reasons_tbl Yes
Data Domains ores_dq_data_domains_tbl Rarely
  ores_dq_subject_areas_tbl Yes
Dimensions ores_dq_origin_dimensions_tbl Rarely
  ores_dq_nature_dimensions_tbl Rarely
  ores_dq_treatment_dimensions_tbl Rarely
Coding Schemes ores_dq_coding_scheme_authority_types_tbl Rarely
  ores_dq_coding_schemes_tbl Yes
Methodologies ores_dq_methodologies_tbl Yes
Artefact Types ores_dq_artefact_types_tbl Yes
Catalogs ores_dq_catalogs_tbl Yes
Datasets ores_dq_datasets_tbl Yes
  ores_dq_dataset_bundles_tbl Yes
  ores_dq_dataset_bundle_members_tbl Yes
  ores_dq_dataset_dependencies_tbl Yes
Staging ores_dq_*_artefact_tbl (20+ tables) Yes
Audit ores_dq_publications_tbl System
  ores_dq_bundle_publications_tbl System

Reference Data Tables (ores_refdata_* prefix)

Category Tables Write Source
Countries ores_refdata_countries_tbl Publication
Currencies ores_refdata_currencies_tbl Publication
FPML Data ores_refdata_*_tbl (various) Publication

IAM Tables (ores_iam_* prefix)

Tables Write Source
ores_iam_accounts_tbl Operational + Publication
ores_iam_roles_tbl Publication
ores_iam_permissions_tbl Publication
ores_iam_account_roles_tbl Operational
ores_iam_role_permissions_tbl Operational
ores_iam_login_info_tbl Operational

Assets Tables (ores_assets_* prefix)

Tables Write Source
ores_assets_images_tbl Publication
ores_assets_tags_tbl Publication
ores_assets_image_tags_tbl Publication

Other Tables

Category Tables Write Source
Geo ores_geo_ip2country_tbl Publication
Variability ores_variability_system_settings_tbl Operational

Schema Architecture

Diagram:

The ORES database is organised into four domains:

IAM (Identity & Access Management)

  • accounts (temporal): user accounts with authentication credentials
  • login_info: security tracking for login attempts and lock status
  • sessions (temporal): user session tracking with geolocation

RBAC (Role-Based Access Control)

  • roles (temporal): named roles for grouping permissions
  • permissions (temporal): fine-grained permission codes
  • account_roles (junction): many-to-many mapping of accounts to roles
  • role_permissions (junction): many-to-many mapping of roles to permissions

Assets (Images & Currencies)

  • images (temporal): SVG image storage with version tracking
  • tags (temporal): image categories (flag, currency, commodity)
  • image_tags (junction): many-to-many image-to-tag mapping
  • currencies (temporal): ISO 4217 currency definitions with formatting and optional image_id for flags

Variability (System Settings)

  • system_settings (temporal): typed system settings for runtime configuration

Schema Design Patterns

All temporal tables use valid_from and valid_to timestamptz fields for bitemporal data management. Key patterns include:

  • Junction tables enable soft deletes via validity period closure
  • EXCLUDE constraints prevent overlapping validity periods using gist indexes
  • Unique indexes filter on current records where valid_to = '9999-12-31 23:59:59'::timestamptz
  • Version fields provide optimistic locking (version 0 is "force overwrite" sentinel)
  • Triggers manage version auto-increment and temporal validity period transitions

For detailed implementation patterns and step-by-step instructions on creating new tables, see the SQL Schema Creator skill.

PostgreSQL Extensions

ORES uses the following PostgreSQL extensions:

btree_gist (Required)

Provides GiST index support for B-tree data types. Required for temporal exclusion constraints that prevent overlapping validity periods.

CREATE EXTENSION IF NOT EXISTS btree_gist;

TimescaleDB (Optional)

Time-series database extension for efficient session analytics. If installed, the sessions table becomes a hypertable with automatic time-based partitioning.

Installation

TimescaleDB must be installed at the system level:

# Debian/Ubuntu (official TimescaleDB repo recommended for full features)
sudo apt install timescaledb-2-postgresql-16

# macOS
brew install timescaledb

Add to postgresql.conf:

shared_preload_libraries = 'timescaledb'

Restart PostgreSQL after configuration changes.

License Editions

TimescaleDB has two license editions:

Edition License Features
Apache Apache 2.0 Hypertables, time-based partitioning, chunk pruning
Community Timescale Above + compression, retention policies, continuous aggregates

Debian/Ubuntu packages (+dfsg versions) are Apache-only. For full features, use the official TimescaleDB repository or build from source.

To check and enable the community license:

-- Check current license
SHOW timescaledb.license;

-- Enable community features (if installed)
ALTER SYSTEM SET timescaledb.license = 'timescale';
SELECT pg_reload_conf();

Feature Detection

The ORES schema scripts automatically detect TimescaleDB availability and license at runtime:

Condition Sessions Table Behaviour
No TimescaleDB Regular PostgreSQL table with standard indexes
Apache license Hypertable with 7-day chunks (no compression)
Timescale license Hypertable + compression + retention policies

Continuous aggregates (session_stats_daily, session_stats_hourly) are only created with the Timescale license.

Hypertable Benefits

Even with Apache-only, hypertables provide:

  • Automatic time-based partitioning (7-day chunks)
  • Efficient time-range query pruning
  • Parallel chunk processing
  • Simplified data lifecycle management

Data Population Layers

The ORES database uses a layered approach to data population within the single public schema. Each layer builds upon the previous:

┌─────────────────────────────────────────────────────────────────────────────┐
│                         public schema                                       │
│                                                                             │
│  Foundation → Data Governance → Data Catalogues → Published Reference Data  │
│  (prerequisites)  (rules/dimensions)   (datasets + artefacts)   (live data) │
│                                                                             │
│  ores_dq_* tables (staging)  ────[publish]────►  ores_refdata_* tables      │
└─────────────────────────────────────────────────────────────────────────────┘

All population scripts write to the ores_dq_* tables. The ores_refdata_* tables are populated via the publication mechanism (and operational processes).

For detailed execution ordering and ordering constraints, see the Database Lifecycle and Script Reference.

Foundation Layer

Essential lookup and configuration data required for schema integrity. This data is populated in the template database and must be present before other data can be inserted.

Category Tables Purpose
Change Control ores_dq_change_reason_categories_tbl Audit trail categorization
  ores_dq_change_reasons_tbl Audit trail reasons
Data Governance Framework ores_dq_data_domains_tbl Data domain classification
  ores_dq_subject_areas_tbl Subject area classification
  ores_dq_coding_scheme_authority_types_tbl Coding scheme authority types
  ores_dq_coding_schemes_tbl Party/entity identification schemes

Foundation layer is populated via populate/foundation/populate_foundation.sql.

Data Governance Layer

Metadata that defines the rules and classifications for how data is organized. This layer sets out the landscape for data quality management.

Category Tables Purpose
Dimensions ores_dq_origin_dimensions_tbl Data origin classification
  ores_dq_nature_dimensions_tbl Data nature classification
  ores_dq_treatment_dimensions_tbl Data treatment classification
Methodologies ores_dq_methodologies_tbl Data sourcing/processing methods
Artefact Types ores_dq_artefact_types_tbl Types of artefacts and mappings

Data governance layer is populated via populate/governance/populate_governance.sql.

Data Catalogues Layer

Catalogued reference data organized according to governance rules. This layer contains the actual data awaiting publication to reference data tables.

Category Tables Purpose
Catalogs ores_dq_catalogs_tbl Groupings of datasets by source
Datasets ores_dq_datasets_tbl Data collection definitions
Bundles ores_dq_dataset_bundles_tbl Named collections for publication
  ores_dq_dataset_bundle_members_tbl Bundle membership with ordering
Dependencies ores_dq_dataset_dependencies_tbl Relationships between datasets
Artefact Data ores_dq_*_artefact_tbl Staged reference data

Data sources include:

  • Flag Icons: Visual assets for countries
  • ISO Standards: Countries, currencies (ISO 3166, ISO 4217)
  • Solvaris: Extended currency metadata
  • IP to Country: IP geolocation mappings
  • FPML: Financial products reference data
  • Cryptocurrency: Digital currency data

Data catalogues layer is populated via populate/catalogues/populate_catalogues.sql.

Reference Data Layer

Live reference data in production tables, published from the data catalogues layer via the "Publish Datasets" feature in the Data Librarian window.

Category Tables Source Artefact Table
Countries ores_refdata_countries_tbl ores_dq_countries_artefact_tbl
Currencies ores_refdata_currencies_tbl ores_dq_currencies_artefact_tbl
Images ores_assets_images_tbl ores_dq_images_artefact_tbl
FPML Data ores_refdata_*_tbl ores_dq_*_artefact_tbl

Additional tables populated via operational processes (not publication):

Category Tables Write Source
IAM ores_iam_accounts_tbl User registration
  ores_iam_login_info_tbl Login tracking
Variability ores_variability_system_settings_tbl Runtime configuration

Multi-Environment Architecture

ORE Studio supports multiple isolated development environments. For full details on database layout, naming conventions, environment isolation, creation paths, roles, credentials, and management scripts, see Database Lifecycle and Script Reference.

Project Structure

For detailed directory structure, file organization, and naming conventions, see the SQL Schema Creator skill documentation.

SQL Recipes

For executable SQL query examples, see the SQL Recipes notebook. It contains ready-to-run queries for:

  • Account management and temporal history
  • RBAC queries (roles, permissions, account assignments)
  • Currency and image lookups
  • Feature flag inspection
  • Session and login tracking
  • Administrative queries (database size, table sizes)
Top: Documentation Previous: System Model