Story: Codegen CI zero-diff invariant

This page documents a story in Sprint 21. It captures the goal, current status, acceptance criteria, and the tasks that compose it.

Goal

The zero-diff invariant states that running codegen.sh regenerate produces no change to production files. It is currently verified manually, once, at the end of each codegen refactoring story. Without a CI gate, any later template change that is not paired with a production file update silently breaks the invariant, and the codebase can drift again within one sprint.

This story adds a CI job (GitHub Actions step or CMake test target) that machine-enforces the invariant on every relevant pull request:

  1. Bootstraps the ores.codegen Python venv.
  2. Runs codegen.sh regenerate --all --profile all-cpp and codegen.sh regenerate --all --profile sql.
  3. Runs git diff --exit-code on the affected output directories.
  4. Fails if any file differs; reports the diff in the job log.
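
The steps above can be sketched as a small Python driver. This is a minimal sketch under stated assumptions, not the project's implementation: it assumes the venv from step 1 is already active, that codegen.sh is reachable from the repository root, and that the caller supplies the output directories. The function name check_zero_diff and its parameters are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Regeneration commands copied from step 2. That codegen.sh is on PATH
# (or resolvable from repo_root) is an assumption of this sketch.
REGEN_COMMANDS = [
    ["codegen.sh", "regenerate", "--all", "--profile", "all-cpp"],
    ["codegen.sh", "regenerate", "--all", "--profile", "sql"],
]


def check_zero_diff(
    repo_root: str,
    output_dirs: Sequence[str],
    commands: Sequence[Sequence[str]] = REGEN_COMMANDS,
    run: Callable = subprocess.run,
) -> int:
    """Regenerate, then return 0 if output_dirs are unchanged, 1 if they drifted."""
    for cmd in commands:
        # check=True stops on a failed regeneration instead of diffing
        # partially written output.
        run(list(cmd), cwd=repo_root, check=True)

    # With --exit-code, git diff returns 1 when tracked files differ from the
    # index. The diff text goes to stdout, so it ends up in the job log.
    result = run(
        ["git", "diff", "--exit-code", "--", *output_dirs],
        cwd=repo_root,
        check=False,
    )
    return result.returncode
```

A CI step could then end with something like sys.exit(check_zero_diff(".", ["projects/ores.sql/"])). Note that git diff only compares tracked files, so a newly generated file that was never committed would not be flagged; git status --porcelain is one way to cover that case if it matters.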

The job runs on every PR that touches files under projects/ores.codegen/library/templates/, projects/ores.codegen/models/, projects/ores.codegen/src/, or any production C++ / SQL file under a registered component.
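
The trigger rule above amounts to a prefix and suffix match over the list of changed files. The sketch below illustrates that logic only. The component roots and file suffixes are hypothetical placeholders (the real list of registered components would come from the project's own manifest), and an actual GitHub Actions workflow would more likely express this with path filters than with a script.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Codegen inputs named in the trigger rule.
CODEGEN_PREFIXES = (
    "projects/ores.codegen/library/templates/",
    "projects/ores.codegen/models/",
    "projects/ores.codegen/src/",
)

# Hypothetical: roots of registered components. Only projects/ores.sql/ is
# named in this story; the other entry is a placeholder.
COMPONENT_ROOTS = (
    "projects/ores.refdata/",
    "projects/ores.sql/",
)

# Assumption: which extensions count as production C++ / SQL files.
PRODUCTION_SUFFIXES = {".cpp", ".hpp", ".sql"}


def should_run_check(changed_files: Iterable[str]) -> bool:
    """Return True if any changed path should trigger the zero-diff job."""
    for path in changed_files:
        if path.startswith(CODEGEN_PREFIXES):
            return True
        in_component = path.startswith(COMPONENT_ROOTS)
        if in_component and PurePosixPath(path).suffix in PRODUCTION_SUFFIXES:
            return True
    return False
```

For example, a PR that only edits documentation would skip the job, while a PR that edits a template or a generated .sql file under a registered component would run it.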

See Codegen architecture analysis and unified model roadmap for context.

Status

| Field         | Value |
|---------------+-------|
| State         | BACKLOG |
| Parent sprint | Sprint 21 |
| Now           | Not yet started. |
| Waiting on    | Phase 2: single model file per entity (stable model format before CI is worth adding). |
| Next          | Design CI check; choose GitHub Actions vs. CMake test target. |
| Last touched  | 2026-05-30 |

Acceptance

  • A CI job (GitHub Actions step or CMake test target) runs codegen.sh regenerate for each registered component.
  • The job fails with a non-zero exit code if git diff --exit-code detects any change to production files after regeneration.
  • The job is triggered on every PR that touches any file in projects/ores.codegen/, any production C++ file in a registered component, or any production SQL file in projects/ores.sql/.
  • The job passes on a clean checkout with zero drift.
  • The job runs successfully on Linux, macOS, and Windows CI runners.
  • A deliberate drift (changing a template without updating production files) is caught and reported within one CI run.
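
The clean-checkout and deliberate-drift criteria rest on the exit-code behaviour of git diff --exit-code: 0 when the working tree matches the index, 1 when a tracked file differs. The sketch below demonstrates that contract in a throwaway repository. It assumes git is installed on the runner; the file name and contents are made up for illustration and stand in for a generated production file.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> subprocess.CompletedProcess:
    """Run a git command inside repo and capture its output."""
    return subprocess.run(
        ["git", *args], cwd=repo, capture_output=True, text=True
    )


def drift_exit_codes() -> tuple:
    """Return (exit code on a clean tree, exit code after simulated drift)."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")

        # Stand-in for a generated production file (hypothetical name).
        generated = repo / "generated.sql"
        generated.write_text("select 1;\n")

        # git diff compares the working tree against the index, so staging
        # the file is enough to establish a baseline.
        git(repo, "add", "generated.sql")
        clean = git(repo, "diff", "--exit-code").returncode

        # Simulate a template change that altered the regenerated output.
        generated.write_text("select 2; -- drifted output\n")
        drifted = git(repo, "diff", "--exit-code").returncode

        return clean, drifted
```

A pilot task could use a check of this shape to confirm the exit-code contract on each CI platform before wiring in the real regeneration step.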

Tasks

| Task | State | Start | End | Description |
|------+-------+-------+-----+-------------|
| Task: Design the CI check | BACKLOG | | | Decide GitHub Actions step vs. CMake test target; map out venv bootstrap on all three CI platforms; define path-based triggers; document the exit-code contract. |
| Task: Implement CI check for refdata (pilot) | BACKLOG | | | Write the CI step or CMake target; run against refdata-cpp and refdata SQL; verify it catches a deliberate template drift; verify it passes on a clean checkout. |
| Task: Expand CI check to all registered components | BACKLOG | | | Enable the check for all components in manifest.py; validate on Linux, macOS, and Windows; document trigger paths and expected runtime. |

Decisions

  • Prefer CMake test target over a standalone Actions step. A CMake add_test() target can be run locally with ctest -R codegen-zero-diff, making it easier for engineers to reproduce CI failures without pushing.
  • Trigger on affected paths only. Running regeneration on every PR is expensive (~1,746 files). Path-based triggers limit the job to PRs where templates, models, or production output files actually changed.
  • SQL-only pilot can start before Phase 2. The SQL zero-diff check (refdata _table.json models only) is self-contained and can be implemented and enabled as soon as the C++ audit achieves a clean baseline. It does not require the unified _entity.json format from Phase 2 to be complete.

Out of scope

  • Checking Qt template output in the same CI step — Qt requires a separate --profile qt run; add as a follow-on task once the main check is stable.
  • Windows venv bootstrap issues — if Windows CI requires special Python handling, that is a separate infrastructure task.
