QCEW Metrics API

QCEW (Quarterly Census of Employment and Wages) downstream regression benchmark.

Contains:

- QCEWBenchmarkConfig: Configuration dataclass for the benchmark
- run_qcew_employment_benchmark: Compare embedding, one-hot, and hybrid regressors
- run_qcew_multilevel_benchmark: Compare across all NAICS levels (2-6 digits)

QCEWBenchmarkConfig dataclass

Configuration for the QCEW downstream regression benchmark.

QCEWMultilevelConfig dataclass

Configuration for multi-level QCEW benchmark.

print_multilevel_comparison(results)

Print a formatted comparison table of multi-level benchmark results.
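As a rough illustration of what such a table could look like, the sketch below renders a results dictionary shaped like the return value of run_qcew_multilevel_benchmark. The helper name format_multilevel_table and the column layout are hypothetical and are not taken from the library; the numbers are the illustrative ones from the example structure on this page.

```python
from typing import Any, Dict, List


def format_multilevel_table(results: Dict[str, Any]) -> str:
    """Render per-level metrics as fixed-width text (hypothetical helper)."""
    header = f"{'level':<20} {'model':<10} {'r2':>6} {'rmse':>6}"
    lines: List[str] = [header, "-" * len(header)]
    for level, models in results.items():
        if level == "summary":  # aggregate block, not a NAICS level
            continue
        for model, metrics in models.items():
            if model == "metadata":  # run metadata, not a model
                continue
            lines.append(
                f"{level:<20} {model:<10} "
                f"{metrics['r2']:>6.2f} {metrics['rmse']:>6.2f}"
            )
    return "\n".join(lines)


# Illustrative values copied from the example structure in this page.
sample = {
    "level_2_sector": {
        "embedding": {"r2": 0.85, "rmse": 0.12},
        "one_hot": {"r2": 0.72, "rmse": 0.18},
        "hybrid": {"r2": 0.87, "rmse": 0.11},
    },
    "summary": {"embedding_avg_r2": 0.82},
}

table = format_multilevel_table(sample)
print(table)
```

The real print_multilevel_comparison may choose different columns or include the summary block; this sketch only shows how the nested layout maps onto rows.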

run_qcew_employment_benchmark(config)

Compare embedding, one-hot, and hybrid regressors on QCEW employment prediction.
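The page does not say how the three feature sets are built. The sketch below shows one plausible reading, assuming that "hybrid" means concatenating each code's embedding vector with its one-hot vector. All function names and the toy embedding table are hypothetical and only serve to make the comparison concrete.

```python
from typing import Dict, List


def one_hot_features(codes: List[str]) -> List[List[float]]:
    """One column per distinct code, columns in sorted code order."""
    vocab = {code: i for i, code in enumerate(sorted(set(codes)))}
    rows = []
    for code in codes:
        row = [0.0] * len(vocab)
        row[vocab[code]] = 1.0
        rows.append(row)
    return rows


def embedding_features(
    codes: List[str], table: Dict[str, List[float]]
) -> List[List[float]]:
    """Look up a dense vector for each code."""
    return [list(table[code]) for code in codes]


def hybrid_features(
    codes: List[str], table: Dict[str, List[float]]
) -> List[List[float]]:
    """Assumption: hybrid = embedding vector followed by one-hot vector."""
    dense = embedding_features(codes, table)
    sparse = one_hot_features(codes)
    return [d + s for d, s in zip(dense, sparse)]


codes = ["11", "23", "11"]
toy_embeddings = {"11": [0.1, 0.2], "23": [0.3, 0.4]}  # made-up vectors

features = hybrid_features(codes, toy_embeddings)
print(features)
```

Each feature matrix would then be passed to the same regressor so that differences in the reported metrics reflect the encoding rather than the model.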

run_qcew_multilevel_benchmark(config)

Compare embedding vs one-hot encoding across all NAICS levels.

This benchmark evaluates how well learned embeddings generalize across different levels of the NAICS hierarchy compared to simple one-hot encoding.
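NAICS codes are hierarchical: the first 2 digits identify the sector, the first 3 the subsector, and so on down to the full 6-digit code. A minimal sketch of deriving every level from a 6-digit code follows. The helper codes_by_level is hypothetical, and only the level_2_sector and level_3_subsector labels come from this page's example output; the labels for 4 to 6 digits are assumed.

```python
from typing import Dict

# Labels for 2 and 3 digits match the example output on this page.
# Labels for 4, 5, and 6 digits are assumptions, not library names.
LEVEL_NAMES = {
    2: "level_2_sector",
    3: "level_3_subsector",
    4: "level_4_industry_group",
    5: "level_5_naics_industry",
    6: "level_6_national_industry",
}


def codes_by_level(code: str) -> Dict[str, str]:
    """Map a 6-digit NAICS code to its ancestor code at each level."""
    if not (code.isdigit() and len(code) == 6):
        raise ValueError(f"expected a 6-digit NAICS code, got {code!r}")
    return {name: code[:digits] for digits, name in LEVEL_NAMES.items()}


levels = codes_by_level("541511")
print(levels)
```

Whether the benchmark itself aggregates by prefix truncation is not stated here; the sketch only illustrates the hierarchy the levels refer to.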

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config | QCEWMultilevelConfig | Multi-level benchmark configuration. | required |

Returns:

| Type | Description |
| --- | --- |
| Dict[str, Any] | Nested dictionary: {level_name: {model_type: {metric: value}}} |

Example structure:

    {
        'level_2_sector': {
            'embedding': {'r2': 0.85, 'rmse': 0.12},
            'one_hot': {'r2': 0.72, 'rmse': 0.18},
            'hybrid': {'r2': 0.87, 'rmse': 0.11},
            'metadata': {...}
        },
        'level_3_subsector': {...},
        ...
        'summary': {
            'embedding_avg_r2': 0.82,
            'one_hot_avg_r2': 0.65,
            'embedding_advantage': 0.17,
            ...
        }
    }
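The summary keys can be read as averages over the per-level entries. The sketch below recomputes them from a results dictionary, assuming that embedding_advantage is the embedding average R² minus the one-hot average R². That reading matches the example numbers (0.82 - 0.65 = 0.17) but is not confirmed by this page. The helper summarize_r2 is hypothetical, and the level_3_subsector values are made up for the demonstration.

```python
from statistics import mean
from typing import Any, Dict


def summarize_r2(results: Dict[str, Any]) -> Dict[str, float]:
    """Average R² per encoding across levels (hypothetical helper)."""
    levels = {k: v for k, v in results.items() if k != "summary"}
    emb = mean(level["embedding"]["r2"] for level in levels.values())
    one_hot = mean(level["one_hot"]["r2"] for level in levels.values())
    return {
        "embedding_avg_r2": emb,
        "one_hot_avg_r2": one_hot,
        # Assumption: advantage is the difference of the two averages.
        "embedding_advantage": emb - one_hot,
    }


demo_results = {
    # Values from the example structure above.
    "level_2_sector": {
        "embedding": {"r2": 0.85, "rmse": 0.12},
        "one_hot": {"r2": 0.72, "rmse": 0.18},
    },
    # Made-up values, for illustration only.
    "level_3_subsector": {
        "embedding": {"r2": 0.75, "rmse": 0.15},
        "one_hot": {"r2": 0.60, "rmse": 0.20},
    },
}

summary = summarize_r2(demo_results)
print(summary)
```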