ORE Studio 0.0.4
generation_options Struct Reference (final)

Options for controlling synthetic dataset generation.

#include <generation_options.hpp>


Public Attributes

std::optional< std::uint64_t > seed
 Optional seed for reproducible generation.
 
std::size_t account_count = 5
 Number of IAM accounts to generate.
 
std::size_t catalog_count = 3
 Number of DQ catalogs to generate.
 
std::size_t data_domain_count = 4
 Number of data domains to generate.
 
std::size_t subject_areas_per_domain = 3
 Number of subject areas per domain to generate.
 
std::size_t origin_dimension_count = 5
 Number of origin dimensions to generate.
 
std::size_t nature_dimension_count = 4
 Number of nature dimensions to generate.
 
std::size_t treatment_dimension_count = 4
 Number of treatment dimensions to generate.
 
std::optional< boost::uuids::uuid > methodology_id
 Optional methodology ID to link to generated datasets.
 
std::size_t dataset_count = 20
 Number of DQ datasets to generate.
 
std::vector< std::string > dependencies
 Catalog dependencies to declare for the generated catalog.
 

Detailed Description

Options for controlling synthetic dataset generation.

These options allow fine-grained control over the size and composition of the generated synthetic dataset.
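
As an illustration of how the defaults can be overridden, here is a minimal sketch. It uses a local stand-in struct, generation_options_sketch, that mirrors the documented members and defaults; the real type comes from <generation_options.hpp>. The helpers make_small_options and expected_subject_areas are hypothetical, and the product used for the subject-area total is an assumption drawn from the member description, not confirmed generator behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Local stand-in mirroring the documented members and their defaults.
// methodology_id (std::optional<boost::uuids::uuid>) is left out here
// only to keep the sketch free of a Boost dependency.
struct generation_options_sketch final {
    std::optional<std::uint64_t> seed;
    std::size_t account_count = 5;
    std::size_t catalog_count = 3;
    std::size_t data_domain_count = 4;
    std::size_t subject_areas_per_domain = 3;
    std::size_t origin_dimension_count = 5;
    std::size_t nature_dimension_count = 4;
    std::size_t treatment_dimension_count = 4;
    std::size_t dataset_count = 20;
    std::vector<std::string> dependencies;
};

// Hypothetical helper: a smaller configuration for a quick test run.
inline generation_options_sketch make_small_options() {
    generation_options_sketch opts;      // starts from the documented defaults
    assert(opts.dataset_count == 20);    // documented default
    opts.seed = 42;                      // fixed seed for reproducibility
    opts.dataset_count = 5;
    opts.data_domain_count = 2;
    return opts;
}

// Assumption: subject areas are generated for every domain, so the
// total is the product of the two counts.
inline std::size_t expected_subject_areas(const generation_options_sketch& o) {
    return o.data_domain_count * o.subject_areas_per_domain;
}
```

Members that are not assigned keep their documented defaults, so only the counts that matter for a given run need to be set.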

Member Data Documentation

◆ seed

std::optional<std::uint64_t> seed

Optional seed for reproducible generation.

If not set, a random seed will be used.
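
One way a generator might resolve the optional seed is sketched below. The function names resolve_seed, first_draw and check_reproducible are hypothetical, and the use of std::random_device and std::mt19937_64 is an assumption; the documentation does not say which engine or entropy source the generator actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <random>

// Returns the caller's seed when set; otherwise draws one from
// std::random_device (assumed fallback, not confirmed by the docs).
inline std::uint64_t resolve_seed(const std::optional<std::uint64_t>& seed) {
    if (seed) {
        return *seed;
    }
    std::random_device rd;
    const std::uint64_t high = rd();
    const std::uint64_t low = rd();
    return (high << 32) | low;
}

// First value produced by an engine constructed from the given seed.
inline std::uint64_t first_draw(std::uint64_t seed) {
    std::mt19937_64 engine(seed);
    return engine();
}

// Two engines built from the same seed produce the same first value.
inline void check_reproducible(std::uint64_t seed) {
    assert(first_draw(seed) == first_draw(seed));
}
```

Logging the resolved seed when none was supplied makes it possible to replay a run later by passing that value back in.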

◆ methodology_id

std::optional<boost::uuids::uuid> methodology_id

Optional methodology ID to link to generated datasets.

If provided, all generated datasets will reference this methodology. Typically this would be the ID of the "Synthetic Data Generation" methodology, looked up by name before calling the generator.

If not provided, datasets will not have a methodology linked.
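
The linking rule described above can be sketched as follows. Everything here is hypothetical: dataset_sketch and make_datasets are illustrative names, and the template parameter Id stands in for boost::uuids::uuid so the sketch compiles without Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical dataset record; Id stands in for boost::uuids::uuid.
template <typename Id>
struct dataset_sketch {
    std::string name;
    std::optional<Id> methodology_id;  // empty when no methodology is linked
};

// Copies the same optional methodology ID onto every generated dataset.
template <typename Id>
std::vector<dataset_sketch<Id>> make_datasets(
    std::size_t count, const std::optional<Id>& methodology_id) {
    std::vector<dataset_sketch<Id>> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back({"dataset_" + std::to_string(i), methodology_id});
    }
    assert(out.size() == count);
    return out;
}
```

With an ID supplied, every record carries it; with an empty optional, every record's methodology_id stays empty.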

◆ dependencies

std::vector<std::string> dependencies

Catalog dependencies to declare for the generated catalog.

Each string is the name of a catalog that the generated catalog depends on. These will be stamped on the generated synthetic_catalog for use during injection.

Example: {"ISO Reference Data", "Core DQ Dimensions"}
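
A minimal sketch of how the dependency names might be carried onto the generated catalog. The synthetic_catalog_sketch type and stamp_dependencies function are hypothetical stand-ins; the actual synthetic_catalog type and the injection step are not described on this page.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated catalog.
struct synthetic_catalog_sketch {
    std::string name;
    std::vector<std::string> dependencies;
};

// Copies the declared dependency names onto the catalog unchanged.
inline synthetic_catalog_sketch stamp_dependencies(
    std::string name, const std::vector<std::string>& dependencies) {
    synthetic_catalog_sketch catalog{std::move(name), dependencies};
    assert(catalog.dependencies.size() == dependencies.size());
    return catalog;
}

// The example value from the documentation, as a vector initializer.
inline std::vector<std::string> example_dependencies() {
    return {"ISO Reference Data", "Core DQ Dimensions"};
}
```

Because the entries are plain catalog names, any check that the named catalogs exist would have to happen later, presumably during injection.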