Config API

AnnealConfig

Bases: BaseModel

Configuration for curriculum annealing schedules.

Config

Bases: BaseModel

Main configuration for NAICS training.

ConfigDict

Pydantic v2 configuration.

from_yaml(yaml_path) classmethod

Load configuration from YAML file.

override(overrides)

Apply overrides using dot notation.
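
As an illustration of dot-notation overrides, the sketch below applies them to a plain nested dictionary. It is not the Config.override implementation: the helper name apply_overrides and the keys training.learning_rate and seed are hypothetical, chosen only to show how a dotted key can be walked one segment at a time.

```python
from copy import deepcopy
from typing import Any, Dict


def apply_overrides(config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with dot-notation overrides applied."""
    result = deepcopy(config)  # leave the caller's dictionary untouched
    for dotted_key, value in overrides.items():
        # "training.learning_rate" -> parents ["training"], leaf "learning_rate"
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Hypothetical keys, for illustration only.
base = {"training": {"learning_rate": 1e-4}, "seed": 42}
updated = apply_overrides(base, {"training.learning_rate": 3e-4, "seed": 7})
```

Returning a copy rather than mutating in place is a design choice of this sketch; the real method may behave differently.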

to_dict()

Convert config to dictionary.

to_yaml(path)

Save config to YAML file.

CurriculumConfig

Bases: BaseModel

Structure-Aware Dynamic Curriculum (SADC) scheduler configuration.

validate_phase_boundaries()

Ensure curriculum phases progress monotonically.
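
A monotonic-progression check of this kind might look like the sketch below. The function name check_phase_boundaries and the list-of-floats input are assumptions, as is the choice to require strictly increasing values; the actual field names and the exact rule in CurriculumConfig are not shown in this reference.

```python
from typing import Sequence


def check_phase_boundaries(boundaries: Sequence[float]) -> None:
    """Raise ValueError unless each boundary is greater than the one before it."""
    for earlier, later in zip(boundaries, boundaries[1:]):
        if later <= earlier:
            raise ValueError(
                f"Phase boundaries must increase: {later} follows {earlier}"
            )


# Hypothetical boundaries expressed as fractions of total training.
check_phase_boundaries([0.2, 0.5, 0.8])  # passes silently
```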

DataConfig

Bases: BaseModel

Data configuration.

DataLoaderConfig

Bases: BaseModel

Data loading and preprocessing configuration.

warn_large_batch(v) classmethod

Warn about potentially problematic batch sizes.
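
A warning-only validator typically returns the value unchanged and emits a warning instead of raising. The sketch below shows that pattern with the standard warnings module; the threshold of 512 is a hypothetical value, not the one used by DataLoaderConfig.

```python
import warnings

# Hypothetical threshold; the real limit is not documented here.
LARGE_BATCH_THRESHOLD = 512


def warn_large_batch(batch_size: int) -> int:
    """Warn (but do not fail) when the batch size looks unusually large."""
    if batch_size > LARGE_BATCH_THRESHOLD:
        warnings.warn(
            f"batch_size={batch_size} exceeds {LARGE_BATCH_THRESHOLD}; "
            "this may exhaust memory",
            UserWarning,
            stacklevel=2,
        )
    # The value is passed through unchanged either way.
    return batch_size
```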

DirConfig

Bases: BaseModel

File system directory configuration.

DistancesConfig

Bases: BaseModel

Configuration for computing pairwise distances.

from_yaml(yaml_path) classmethod

Load configuration from YAML file.

DownloadConfig

Bases: BaseModel

Configuration for downloading and preprocessing NAICS data.

from_yaml(yaml_path) classmethod

Load configuration from YAML file.

FalseNegativeConfig

Bases: BaseModel

Configuration for handling false negatives during training.

GraphConfig

Bases: BaseModel

Base configuration for HGCN training.

from_yaml(yaml_path) classmethod

Load GraphConfig from YAML file.

LoRAConfig

Bases: BaseModel

LoRA (Low-Rank Adaptation) configuration.

LossConfig

Bases: BaseModel

Loss function configuration.

MoEConfig

Bases: BaseModel

Mixture of Experts configuration.

validate_top_k_vs_experts()

Ensure top_k doesn't exceed num_experts.
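
The constraint can be stated as a small standalone check. This is a sketch only: check_top_k is a hypothetical helper written without Pydantic, and the error message is illustrative.

```python
def check_top_k(top_k: int, num_experts: int) -> None:
    """Raise ValueError if more experts are requested per token than exist."""
    if top_k > num_experts:
        raise ValueError(
            f"top_k ({top_k}) cannot exceed num_experts ({num_experts})"
        )


check_top_k(top_k=2, num_experts=4)  # valid: routes to 2 of 4 experts
```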

ModelConfig

Bases: BaseModel

Model architecture configuration.

RelationsConfig

Bases: BaseModel

Configuration for computing pairwise relations.

from_yaml(yaml_path) classmethod

Load configuration from YAML file.

SamplingConfig

Bases: BaseModel

Top-level sampling configuration (data layer strategies).

SansStaticConfig

Bases: BaseModel

Configuration for static SANS-style sampling buckets.

StreamingConfig

Bases: BaseModel

Configuration for streaming dataset.

validate_difficulty_ratios()

Ensure easy + semi <= 1.0 at both start and end (hard is derived).
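
Because the hard ratio is derived, it can be computed as whatever remains after easy and semi. The sketch below illustrates that arithmetic and the bound it implies; the function names and the (easy, semi) tuple layout are assumptions, not the StreamingConfig field names.

```python
from typing import Tuple


def derive_hard_ratio(easy: float, semi: float) -> float:
    """Return the hard ratio implied by easy and semi, or raise if they overflow."""
    if easy < 0 or semi < 0:
        raise ValueError("ratios must be non-negative")
    if easy + semi > 1.0:
        raise ValueError(f"easy + semi = {easy + semi} exceeds 1.0")
    return 1.0 - easy - semi


def check_difficulty_ratios(start: Tuple[float, float], end: Tuple[float, float]) -> None:
    """Apply the same bound to the (easy, semi) pair at the start and at the end."""
    derive_hard_ratio(*start)
    derive_hard_ratio(*end)


# Hypothetical schedule: mostly easy negatives early, mostly hard ones late.
check_difficulty_ratios(start=(0.6, 0.3), end=(0.1, 0.3))
```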

TokenizationConfig

Bases: BaseModel

Configuration for tokenization caching.

from_yaml(yaml_path) classmethod

Load configuration from YAML file.

TrainerConfig

Bases: BaseModel

PyTorch Lightning Trainer configuration.

validate_accelerator(v) classmethod

Validate accelerator choice.

validate_precision(v) classmethod

Validate precision choice.
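
Both validators above are membership checks against a fixed set of choices. A minimal sketch of that pattern follows; the allowed sets are assumptions based on common PyTorch Lightning options and may not match what TrainerConfig actually accepts.

```python
from typing import FrozenSet

# Assumed choices, for illustration only.
ALLOWED_ACCELERATORS: FrozenSet[str] = frozenset({"auto", "cpu", "gpu", "tpu", "mps"})
ALLOWED_PRECISIONS: FrozenSet[str] = frozenset({"32-true", "16-mixed", "bf16-mixed"})


def check_choice(name: str, value: str, allowed: FrozenSet[str]) -> str:
    """Return `value` if it is an allowed choice, otherwise raise ValueError."""
    if value not in allowed:
        raise ValueError(
            f"Invalid {name} {value!r}; expected one of {sorted(allowed)}"
        )
    return value


accelerator = check_choice("accelerator", "gpu", ALLOWED_ACCELERATORS)
precision = check_choice("precision", "bf16-mixed", ALLOWED_PRECISIONS)
```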

TrainingConfig

Bases: BaseModel

Optimizer and training configuration.

TripletsConfig

Bases: BaseModel

Configuration for generating training triplets.

from_yaml(yaml_path) classmethod

Load configuration from YAML file.

load_config(config_class, yaml_path)

Generic configuration loader for any Pydantic model.

Parameters:

Name          Type              Description                                                    Default
config_class  Type[T]           The Pydantic model class to instantiate                        required
yaml_path     Union[str, Path]  Path to YAML config file (absolute, relative, or under conf/)  required

Returns:

Type  Description
T     Instance of config_class with values from YAML

Example

cfg = load_config(DownloadConfig, 'data/download.yaml')
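
The yaml_path description says the path may be absolute, relative, or under conf/. One plausible resolution order is sketched below; the helper resolve_config_path, its conf_dir parameter, and the order of the checks are assumptions, not the documented behavior of load_config.

```python
from pathlib import Path
from typing import Union


def resolve_config_path(
    yaml_path: Union[str, Path],
    conf_dir: Union[str, Path] = "conf",
) -> Path:
    """Resolve a config path, falling back to the conf/ directory."""
    path = Path(yaml_path)
    # Absolute paths, and relative paths that already exist, are used as given.
    if path.is_absolute() or path.exists():
        return path
    # Otherwise assume the path is relative to the conf/ directory.
    return Path(conf_dir) / path
```

Under these assumptions, 'data/download.yaml' in the example above would resolve to conf/data/download.yaml when no such file exists relative to the working directory.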

parse_override_value(value)

Parse override value from string to appropriate type.
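
Command-line overrides arrive as strings, so a parser of this kind has to guess the intended type. The sketch below shows one way to do that with the standard library; it is not the parse_override_value implementation, and the name parse_value and the specific rules (booleans, null, Python literals, then plain string) are assumptions.

```python
import ast
from typing import Any


def parse_value(value: str) -> Any:
    """Convert a string to bool, None, number, list, or dict where possible."""
    text = value.strip()
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("none", "null"):
        return None
    try:
        # Handles ints, floats, quoted strings, lists, tuples, and dicts.
        return ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        # Anything that is not a Python literal stays a plain string.
        return text
```

For example, parse_value("32") yields the integer 32, while parse_value("adamw") is returned unchanged as a string.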