ORE Studio 0.0.4
src.lei_generate_metadata_sql Namespace Reference

Functions

Path get_repo_root ()
 
str get_header ()
 
str escape_sql (str s)
 
str parse_as_of_date (str filename)
 
dict find_csv_files (Path lei_dir)
 
Path|None find_lei_bic_csv (Path lei_dir)
 
str sql_value (str val, bool is_timestamp)
 
list read_csv_rows (Path csv_file, dict column_map, set timestamp_cols)
 
 generate_catalog_sql (dict manifest, Path output_file)
 
 generate_methodology_sql (dict manifest, Path output_file)
 
 generate_dataset_sql (dict manifest, str as_of_date, Path output_file)
 
 generate_dataset_dependency_sql (dict manifest, Path output_file)
 
 generate_artefact_populate_sql (Path csv_file, str dataset_name, str dataset_code, str table_name, dict column_map, set timestamp_cols, Path output_file)
 
 generate_master_sql (Path output_file)
 
 main ()
 

Variables

dict LEI2_COLUMN_MAP
 
dict RR_COLUMN_MAP
 
set LEI2_TIMESTAMP_COLS
 
set RR_TIMESTAMP_COLS
 
dict LEI_BIC_COLUMN_MAP
 

Detailed Description

Generates SQL populate scripts for GLEIF LEI metadata and artefact data.

Reads the manifest.json from external/lei/ and CSV subset files to generate:
  - lei_catalog_populate.sql
  - lei_methodology_populate.sql
  - lei_dataset_populate.sql
  - lei_dataset_dependency_populate.sql
  - lei_entities_small_artefact_populate.sql
  - lei_entities_large_artefact_populate.sql
  - lei_relationships_small_artefact_populate.sql
  - lei_relationships_large_artefact_populate.sql
  - lei_populate.sql (master include file)

Usage:
    python3 lei_generate_metadata_sql.py
    python3 lei_generate_metadata_sql.py --manifest-dir /path/to/external/lei
    python3 lei_generate_metadata_sql.py --output-dir /path/to/output
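The two options shown above could be parsed with `argparse` roughly as follows. This is a minimal sketch, not the script's actual code: the helper name `build_parser`, the help strings, and the `None` defaults are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Generate SQL populate scripts for GLEIF LEI data."
    )
    # Assumed default: None, so the caller can fall back to external/lei/.
    parser.add_argument("--manifest-dir", type=Path, default=None,
                        help="directory containing manifest.json and the CSV subsets")
    # Assumed default: None, so the caller can pick its own output location.
    parser.add_argument("--output-dir", type=Path, default=None,
                        help="directory the generated .sql files are written to")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.manifest_dir, args.output_dir)
```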

Function Documentation

◆ get_repo_root()

Path get_repo_root ()
Get the repository root directory.

◆ get_header()

str get_header ()
Generate SQL file header.

◆ escape_sql()

str escape_sql (str s)
Escape single quotes for SQL strings.
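A minimal sketch of this kind of escaping, assuming the standard SQL convention of doubling each single quote. The body is illustrative and may differ from the actual implementation.

```python
def escape_sql(s: str) -> str:
    """Double every single quote so the text is safe inside a '...' literal."""
    # Assumption: only single quotes need handling; backslashes are left as-is.
    return s.replace("'", "''")


if __name__ == "__main__":
    print(escape_sql("O'Brien & Sons"))
```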

◆ parse_as_of_date()

str parse_as_of_date (str filename)
Extract YYYYMMDD date from CSV filename and return as YYYY-MM-DD.
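The conversion described here can be sketched with a regular expression. The filename used in the example is hypothetical, and raising `ValueError` when no date is found is an assumption; the real function may handle that case differently.

```python
import re


def parse_as_of_date(filename: str) -> str:
    """Find the first run of eight digits and format it as YYYY-MM-DD."""
    match = re.search(r"(\d{4})(\d{2})(\d{2})", filename)
    if match is None:
        # Assumed behaviour: fail loudly rather than guess a date.
        raise ValueError(f"no YYYYMMDD date found in {filename!r}")
    year, month, day = match.groups()
    return f"{year}-{month}-{day}"


if __name__ == "__main__":
    # Hypothetical filename, for illustration only.
    print(parse_as_of_date("lei2_20240115_subset.csv"))
```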

◆ find_csv_files()

dict find_csv_files (Path lei_dir)
Find LEI CSV subset files and return them grouped by type and size.

◆ find_lei_bic_csv()

Path | None find_lei_bic_csv (Path lei_dir)
Find the latest LEI-BIC mapping CSV file.
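One way to pick the "latest" file is to glob for candidates and take the lexicographically greatest name, which works when filenames embed a YYYYMMDD date. The sketch below assumes exactly that. The extra `pattern` parameter and its default value are hypothetical and are not part of the documented signature.

```python
from pathlib import Path
from typing import Optional


def find_lei_bic_csv(lei_dir: Path, pattern: str = "lei-bic-*.csv") -> Optional[Path]:
    """Return the candidate whose name sorts last, or None if there is none."""
    # Assumption: date-stamped names sort chronologically when sorted as text.
    candidates = sorted(lei_dir.glob(pattern), key=lambda p: p.name)
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    print(find_lei_bic_csv(Path(".")))
```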

◆ sql_value()

str sql_value (str val, bool is_timestamp)
Format a CSV value as a SQL literal.
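A sketch of how such formatting might look. Mapping empty values to `NULL` and appending a PostgreSQL-style `::timestamptz` cast to timestamps are both assumptions; the page does not state the target SQL dialect or how missing values are treated.

```python
def sql_value(val: str, is_timestamp: bool) -> str:
    """Turn one CSV cell into a SQL literal (illustrative rules only)."""
    # Assumption: an empty or whitespace-only cell means "no value".
    if val is None or val.strip() == "":
        return "NULL"
    escaped = val.replace("'", "''")
    if is_timestamp:
        # Assumption: PostgreSQL-style cast; other dialects need another form.
        return f"'{escaped}'::timestamptz"
    return f"'{escaped}'"


if __name__ == "__main__":
    print(sql_value("O'Neil Ltd", False))
    print(sql_value("2020-01-01T00:00:00Z", True))
    print(sql_value("", False))
```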

◆ read_csv_rows()

list read_csv_rows (Path csv_file, dict column_map, set timestamp_cols)
Read CSV file and extract mapped columns as SQL value tuples.
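The mapping step can be sketched with `csv.DictReader`: for each row, only the CSV headers named in `column_map` are kept, in map order, and each cell is turned into a SQL literal. The helper `_to_literal`, the UTF-8 encoding, and the NULL and cast rules are assumptions made for illustration. Following the `*_TIMESTAMP_COLS` variables on this page, `timestamp_cols` is taken to hold target column names rather than CSV headers.

```python
import csv
from pathlib import Path


def _to_literal(val: str, is_timestamp: bool) -> str:
    """Hypothetical helper: empty -> NULL, otherwise a quoted literal."""
    if val.strip() == "":
        return "NULL"
    quoted = "'" + val.replace("'", "''") + "'"
    return quoted + "::timestamptz" if is_timestamp else quoted


def read_csv_rows(csv_file: Path, column_map: dict, timestamp_cols: set) -> list:
    """Return one tuple of SQL literals per CSV row, in column_map order."""
    rows = []
    with csv_file.open(newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle):
            values = []
            for csv_col, db_col in column_map.items():
                raw = record.get(csv_col) or ""  # missing header -> NULL
                values.append(_to_literal(raw, db_col in timestamp_cols))
            rows.append(tuple(values))
    return rows
```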

◆ generate_catalog_sql()

generate_catalog_sql (dict manifest, Path output_file)
Generate the catalog populate SQL file.

◆ generate_methodology_sql()

generate_methodology_sql (dict manifest, Path output_file)
Generate the methodology populate SQL file.

◆ generate_dataset_sql()

generate_dataset_sql (dict manifest, str as_of_date, Path output_file)
Generate the dataset populate SQL file.

◆ generate_dataset_dependency_sql()

generate_dataset_dependency_sql (dict manifest, Path output_file)
Generate the dataset dependency populate SQL file.

◆ generate_artefact_populate_sql()

generate_artefact_populate_sql (Path csv_file, str dataset_name, str dataset_code, str table_name, dict column_map, set timestamp_cols, Path output_file)
Generate an artefact populate SQL file from a CSV subset.
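The core of such a generator is turning each row of SQL literals into an `INSERT` statement whose column list comes from the values of `column_map`. The sketch below shows only that step. The helper name `build_insert` and the table name in the example are hypothetical, and the real script may emit a different statement shape, for instance multi-row inserts or extra dataset columns.

```python
def build_insert(table_name: str, column_map: dict, row: tuple) -> str:
    """Render one INSERT from pre-formatted SQL literals (hypothetical helper)."""
    columns = ", ".join(column_map.values())  # target column names, in map order
    values = ", ".join(row)                   # literals are already quoted
    return f"INSERT INTO {table_name} ({columns}) VALUES ({values});"


if __name__ == "__main__":
    # Hypothetical table name, for illustration only.
    print(build_insert("example_lei_bic", {"LEI": "lei", "BIC": "bic"},
                       ("'ABC'", "NULL")))
```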

◆ generate_master_sql()

generate_master_sql (Path output_file)
Generate the lei_populate.sql master include file.
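A master include file of this kind typically just pulls in the other generated scripts in order. The sketch below uses the eight filenames listed in the detailed description and assumes they are included with psql's `\ir` (include relative) meta-command. The include mechanism, the comment header, and the ordering are assumptions.

```python
from pathlib import Path

# Filenames as listed in the detailed description of this module.
POPULATE_FILES = [
    "lei_catalog_populate.sql",
    "lei_methodology_populate.sql",
    "lei_dataset_populate.sql",
    "lei_dataset_dependency_populate.sql",
    "lei_entities_small_artefact_populate.sql",
    "lei_entities_large_artefact_populate.sql",
    "lei_relationships_small_artefact_populate.sql",
    "lei_relationships_large_artefact_populate.sql",
]


def generate_master_sql(output_file: Path) -> None:
    """Write one include line per populate script (assumes psql's \\ir)."""
    lines = ["-- Master include file (illustrative header)"]
    lines += [f"\\ir {name}" for name in POPULATE_FILES]
    output_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
```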

Variable Documentation

◆ LEI2_COLUMN_MAP

dict LEI2_COLUMN_MAP
Initial value:
= {
    'LEI': 'lei',
    'Entity.LegalName': 'entity_legal_name',
    'Entity.EntityCategory': 'entity_entity_category',
    'Entity.EntitySubCategory': 'entity_entity_sub_category',
    'Entity.EntityStatus': 'entity_entity_status',
    'Entity.LegalForm.EntityLegalFormCode': 'entity_legal_form_entity_legal_form_code',
    'Entity.LegalForm.OtherLegalForm': 'entity_legal_form_other_legal_form',
    'Entity.LegalJurisdiction': 'entity_legal_jurisdiction',
    'Entity.LegalAddress.FirstAddressLine': 'entity_legal_address_first_address_line',
    'Entity.LegalAddress.City': 'entity_legal_address_city',
    'Entity.LegalAddress.Region': 'entity_legal_address_region',
    'Entity.LegalAddress.Country': 'entity_legal_address_country',
    'Entity.LegalAddress.PostalCode': 'entity_legal_address_postal_code',
    'Entity.HeadquartersAddress.FirstAddressLine': 'entity_headquarters_address_first_address_line',
    'Entity.HeadquartersAddress.City': 'entity_headquarters_address_city',
    'Entity.HeadquartersAddress.Region': 'entity_headquarters_address_region',
    'Entity.HeadquartersAddress.Country': 'entity_headquarters_address_country',
    'Entity.HeadquartersAddress.PostalCode': 'entity_headquarters_address_postal_code',
    'Entity.EntityCreationDate': 'entity_entity_creation_date',
    'Registration.InitialRegistrationDate': 'registration_initial_registration_date',
    'Registration.LastUpdateDate': 'registration_last_update_date',
    'Registration.NextRenewalDate': 'registration_next_renewal_date',
    'Registration.RegistrationStatus': 'registration_registration_status',
    'Entity.TransliteratedOtherEntityNames.TransliteratedOtherEntityName.1': 'entity_transliterated_name_1',
    'Entity.TransliteratedOtherEntityNames.TransliteratedOtherEntityName.1.type': 'entity_transliterated_name_1_type',
}

◆ RR_COLUMN_MAP

dict RR_COLUMN_MAP
Initial value:
= {
    'Relationship.StartNode.NodeID': 'relationship_start_node_node_id',
    'Relationship.StartNode.NodeIDType': 'relationship_start_node_node_id_type',
    'Relationship.EndNode.NodeID': 'relationship_end_node_node_id',
    'Relationship.EndNode.NodeIDType': 'relationship_end_node_node_id_type',
    'Relationship.RelationshipType': 'relationship_relationship_type',
    'Relationship.RelationshipStatus': 'relationship_relationship_status',
    'Relationship.Period.1.startDate': 'relationship_period_1_start_date',
    'Relationship.Period.1.endDate': 'relationship_period_1_end_date',
    'Registration.InitialRegistrationDate': 'registration_initial_registration_date',
    'Registration.LastUpdateDate': 'registration_last_update_date',
    'Registration.RegistrationStatus': 'registration_registration_status',
    'Registration.ValidationSources': 'registration_validation_sources',
}

◆ LEI2_TIMESTAMP_COLS

set LEI2_TIMESTAMP_COLS
Initial value:
= {
    'entity_entity_creation_date',
    'registration_initial_registration_date',
    'registration_last_update_date',
    'registration_next_renewal_date',
}

◆ RR_TIMESTAMP_COLS

set RR_TIMESTAMP_COLS
Initial value:
= {
    'relationship_period_1_start_date',
    'relationship_period_1_end_date',
    'registration_initial_registration_date',
    'registration_last_update_date',
}

◆ LEI_BIC_COLUMN_MAP

dict LEI_BIC_COLUMN_MAP
Initial value:
= {
    'LEI': 'lei',
    'BIC': 'bic',
}