Story: GLEIF data integration
This page documents a story in Sprint 12. It captures the goal, current status, acceptance criteria, and the tasks that compose it.
Goal
Turn the GLEIF CSV subsets into proper datasets; expand anchor coverage with central banks; add LEI-to-BIC mapping.
Status
| Field | Value |
|---|---|
| State | DONE |
| Parent sprint | Sprint 12 |
| Now | Completed 2026-02-13. |
| Waiting on | None. |
| Next | None. |
| Last touched | 2026-02-13 |
Continued from: Party schemes and FPML reference data (sprint 10). That story shipped the GLEIF download-and-split script; this one turns the resulting CSVs into proper datasets and populate scripts.
Acceptance
- GLEIF artefact tables + 4 datasets registered via codegen.
- Python + shell pipeline automates SQL populate generation.
- Central-bank LEIs in the subset; 3× sampling priority.
- LEI-to-BIC mapping dataset published.
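As a rough illustration of the populate-generation step, the sketch below turns CSV rows into upsert statements that can be re-run without truncating the target table. It is not the contents of lei_generate_metadata_sql.py: the column names `lei` and `legal_name`, the helper names, and the use of plain `INSERT ... ON CONFLICT` (rather than a PL/pgSQL wrapper) are all assumptions, and the statement requires a unique constraint on `lei`.

```python
import csv
import io


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def generate_upserts(csv_text: str, table: str = "lei_entities") -> list[str]:
    """Build one idempotent upsert statement per CSV row.

    Column names (lei, legal_name) are hypothetical placeholders.
    """
    statements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        statements.append(
            f"INSERT INTO {table} (lei, legal_name) "
            f"VALUES ({sql_literal(row['lei'])}, {sql_literal(row['legal_name'])}) "
            "ON CONFLICT (lei) DO UPDATE SET legal_name = EXCLUDED.legal_name;"
        )
    return statements


# Made-up sample row; the LEI below is not a real identifier.
sample = "lei,legal_name\nEXAMPLELEI0000000001,Banque d'Exemple\n"
statements = generate_upserts(sample)
print(statements[0])
```

Running the generated script twice leaves the table in the same state, which is the property the populate scripts are meant to have.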
Tasks
| Task | State | Start | End | Description |
|---|---|---|---|---|
| Add GLEIF data to datasets | DONE | 2026-05-19 | 2026-02-09 | Codegen for lei_entities + lei_relationships artefact tables; Python lei_generate_metadata_sql.py + shell wrapper; GLEIF catalog + methodology + 4 datasets; methodology.txt updated. |
| Add central-bank-related LEIs | DONE | 2026-05-19 | 2026-02-11 | ~93 GLEIF-verified LEIs (33 sovereign issuers, 51 central banks, 9 supranationals); CENTRAL_BANK sector keyword detection with multilingual support; 3x financial-priority sampling. |
| Add LEI-to-BIC dataset | DONE | 2026-05-19 | 2026-02-13 | Mapping dataset for BIC settlements; CSV in external/, dataset + populate scripts + publisher; download from GLEIF mapping site. |
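The 3x financial-priority sampling mentioned in the central-bank task could work along the following lines. This is a minimal sketch only: the story does not say how the priority is applied, so the weighting scheme (Efraimidis-Spirakis weighted sampling without replacement), the `financial` flag, and the function name are assumptions.

```python
import random

# The factor 3 comes from the story; how it is applied is assumed here.
FINANCIAL_WEIGHT = 3.0


def weighted_sample(entities, k, seed=0):
    """Sample k LEIs without replacement, weighting financial entities 3x.

    Each entity draws a key u ** (1 / weight) with u uniform in [0, 1);
    the k largest keys are selected (Efraimidis-Spirakis method).
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    keyed = []
    for entity in entities:
        weight = FINANCIAL_WEIGHT if entity["financial"] else 1.0
        keyed.append((rng.random() ** (1.0 / weight), entity["lei"]))
    keyed.sort(reverse=True)
    return [lei for _, lei in keyed[:k]]


# Made-up entities: F* are financial, N* are not.
entities = [{"lei": f"F{i}", "financial": True} for i in range(50)]
entities += [{"lei": f"N{i}", "financial": False} for i in range(50)]
subset = weighted_sample(entities, k=10, seed=1)
```

Seeding the generator means regenerating the subset yields the same LEIs, which fits a pipeline whose output is checked in as CSV.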
Decisions
- Idempotent PL/pgSQL populate: regenerate without truncate.
- Multilingual sector keyword detection: CENTRAL_BANK must out-rank generic BANK across languages.
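One way to make CENTRAL_BANK out-rank generic BANK is to check sectors in priority order, most specific first, and return on the first match. The sketch below assumes that approach; the keyword lists, the accent-stripping step, and the function names are illustrative and not taken from the project.

```python
import unicodedata

# Ordered most specific first, so CENTRAL_BANK is tested before BANK.
# Keyword lists are illustrative placeholders, not the project's.
SECTOR_KEYWORDS = [
    ("CENTRAL_BANK", ["central bank", "banque centrale", "banco central",
                      "zentralbank", "reserve bank"]),
    ("BANK", ["bank", "banque", "banco"]),
]


def normalise(name):
    """Case-fold and strip accents so matching is accent-insensitive."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def detect_sector(legal_name):
    """Return the first sector whose keywords appear in the name, else None."""
    text = normalise(legal_name)
    for sector, keywords in SECTOR_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return sector
    return None


print(detect_sector("Banco Central do Brasil"))
print(detect_sector("Deutsche Bank AG"))
```

Because every central-bank name here also contains a generic bank keyword, reversing the order of `SECTOR_KEYWORDS` would classify them all as BANK, which is the failure the decision guards against. Plain substring matching is deliberately crude and would need word-boundary handling in practice.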
Out of scope
- Real-time GLEIF sync — daily snapshot is sufficient.
See also
- Party schemes and FPML reference data (sprint 10) — predecessor (download script).