ORE Studio 0.0.4
dataset Struct Reference (final)

Represents a data quality dataset with lineage tracking.

#include <dataset.hpp>

Public Attributes

int version = 0
 Version number for optimistic locking and change tracking.
 
boost::uuids::uuid id
 UUID uniquely identifying this dataset.
 
std::string code
 Unique code for stable referencing.
 
std::optional< std::string > catalog_name
 Optional catalog this dataset belongs to.
 
std::string subject_area_name
 Subject area this dataset belongs to.
 
std::string domain_name
 Data domain this dataset applies to.
 
std::optional< std::string > coding_scheme_code
 Optional coding scheme used for identifiers in this dataset.
 
std::string origin_code
 Code indicating the origin of the data.
 
std::string nature_code
 Code indicating the nature of the data.
 
std::string treatment_code
 Code indicating how the data was treated or processed.
 
std::optional< boost::uuids::uuid > methodology_id
 Optional methodology used to produce this dataset.
 
std::string name
 Human-readable name for the dataset.
 
std::string description
 Detailed description of the dataset's contents and purpose.
 
std::string source_system_id
 Identifier of the source system where data originated.
 
std::string business_context
 Business context describing the dataset's role and usage.
 
std::optional< boost::uuids::uuid > upstream_derivation_id
 Optional reference to an upstream dataset this was derived from.
 
int lineage_depth = 0
 Depth in the derivation chain from the original source.
 
std::chrono::system_clock::time_point as_of_date
 Business date the data represents.
 
std::chrono::system_clock::time_point ingestion_timestamp
 Timestamp when the data was ingested into the system.
 
std::optional< std::string > license_info
 Optional license information for the data.
 
std::optional< std::string > artefact_type
 Type of artefact this dataset populates.
 
std::string recorded_by
 Username of the person who last modified this dataset.
 
std::string change_commentary
 Free-text commentary explaining the change.
 
std::chrono::system_clock::time_point recorded_at
 Timestamp when this version of the record was recorded.
 

Detailed Description

Represents a data quality dataset with lineage tracking.

A dataset captures metadata about a collection of data including its origin, nature, treatment methodology, and lineage information.

Member Data Documentation

◆ id

boost::uuids::uuid id

UUID uniquely identifying this dataset.

This is the surrogate key for the dataset.

◆ code

std::string code

Unique code for stable referencing.

Uses dot notation for namespacing (e.g., "iso.currencies", "fpml.currencies", "crypto.large").
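
The dot-notation convention can be illustrated with a small sketch. The helper below is hypothetical and not part of ORE Studio. It assumes the text before the first dot is the namespace and everything after it is the name within that namespace; the documentation does not say how codes with several dots are interpreted.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not an ORE Studio API): splits a dataset code at its
// first dot into (namespace, remainder). Returns std::nullopt when the code
// contains no dot or when either side of the dot would be empty.
std::optional<std::pair<std::string, std::string>>
split_dataset_code(std::string_view code) {
    const auto dot = code.find('.');
    if (dot == std::string_view::npos || dot == 0 || dot + 1 == code.size())
        return std::nullopt;
    return std::make_pair(std::string(code.substr(0, dot)),
                          std::string(code.substr(dot + 1)));
}
```

With this sketch, "iso.currencies" splits into the namespace "iso" and the name "currencies", while a code without a dot is rejected.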

◆ catalog_name

std::optional<std::string> catalog_name

Optional catalog this dataset belongs to.

Links to catalog for organizational grouping.

◆ subject_area_name

std::string subject_area_name

Subject area this dataset belongs to.

Links to subject_area for organizational structure.

◆ domain_name

std::string domain_name

Data domain this dataset applies to.

Links to data_domain for domain categorization.

◆ coding_scheme_code

std::optional<std::string> coding_scheme_code

Optional coding scheme used for identifiers in this dataset.

Links to coding_scheme.

◆ origin_code

std::string origin_code

Code indicating the origin of the data.

Links to origin_dimension.

◆ nature_code

std::string nature_code

Code indicating the nature of the data.

Links to nature_dimension.

◆ treatment_code

std::string treatment_code

Code indicating how the data was treated or processed.

Links to treatment_dimension.

◆ methodology_id

std::optional<boost::uuids::uuid> methodology_id

Optional methodology used to produce this dataset.

Links to methodology by UUID.

◆ upstream_derivation_id

std::optional<boost::uuids::uuid> upstream_derivation_id

Optional reference to an upstream dataset this was derived from.

Links to another dataset by UUID for lineage tracking.

◆ lineage_depth

int lineage_depth = 0

Depth in the derivation chain from the original source.

0 indicates an original source dataset.
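
How upstream_derivation_id and lineage_depth might fit together can be sketched as follows. This is a minimal illustration under stated assumptions: dataset_sketch and derive_from are hypothetical names, std::string stands in for boost::uuids::uuid so the example compiles without Boost, and the depth is assumed to grow by one per derivation step, which the documentation implies but does not state.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Simplified stand-in for the documented struct: only the lineage-related
// members are kept, and std::string replaces boost::uuids::uuid.
struct dataset_sketch {
    std::string id;
    std::optional<std::string> upstream_derivation_id;
    int lineage_depth = 0;  // 0 marks an original source dataset
};

// Hypothetical helper: builds a dataset derived from `parent`, pointing back
// at the parent's id and sitting one level deeper in the derivation chain
// (assumed rule: child depth = parent depth + 1).
dataset_sketch derive_from(const dataset_sketch& parent, std::string new_id) {
    assert(parent.lineage_depth >= 0);
    dataset_sketch child;
    child.id = std::move(new_id);
    child.upstream_derivation_id = parent.id;
    child.lineage_depth = parent.lineage_depth + 1;
    return child;
}
```

An original source dataset leaves upstream_derivation_id empty and keeps lineage_depth at 0; each call to derive_from then yields a dataset one step further from that source.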

◆ as_of_date

std::chrono::system_clock::time_point as_of_date

Business date the data represents.

Stored as time_point, typically truncated to day precision.
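
One way to truncate a time_point to day precision is sketched below. truncate_to_day is a hypothetical helper, not an ORE Studio function, and the documentation does not say how or where the truncation is actually done. The sketch floors relative to the clock's epoch, which corresponds to midnight UTC when system_clock measures Unix time (guaranteed from C++20, common in practice before that).

```cpp
#include <chrono>
#include <ratio>

// Hypothetical helper: floors a time_point to a whole number of days since
// the clock's epoch. A local `days` alias is used so the code builds as
// C++17; in C++20 std::chrono::days could be used instead.
std::chrono::system_clock::time_point
truncate_to_day(std::chrono::system_clock::time_point tp) {
    using days = std::chrono::duration<int, std::ratio<86400>>;
    // floor rounds toward negative infinity, so pre-epoch times also land
    // on the start of their day rather than the start of the next one.
    return std::chrono::floor<days>(tp);
}
```

Applying the helper twice gives the same result as applying it once, so it is safe to call on values that may already be truncated.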

◆ artefact_type

std::optional<std::string> artefact_type

Type of artefact this dataset populates.

Used for categorization. Links to artefact_type, which provides the target_table and populate_function used for publication.