QCEW Metrics API
QCEW (Quarterly Census of Employment and Wages) downstream regression benchmark.
Contains:
- QCEWBenchmarkConfig: Configuration dataclass for the benchmark
- run_qcew_employment_benchmark: Compare embedding, one-hot, and hybrid regressors
- run_qcew_multilevel_benchmark: Compare across all NAICS levels (2-6 digits)
QCEWBenchmarkConfig (dataclass)
Configuration for the QCEW downstream regression benchmark.
QCEWMultilevelConfig (dataclass)
Configuration for multi-level QCEW benchmark.
print_multilevel_comparison(results)
Print a formatted comparison table of multi-level benchmark results.
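As a rough illustration of what such a table might look like, the sketch below renders one row per NAICS level with the R² of each model. It assumes the nested result structure documented for run_qcew_multilevel_benchmark; the helper name `format_multilevel_table` and the column layout are hypothetical and are not taken from the library's implementation.

```python
from typing import Any, Dict, List


def format_multilevel_table(results: Dict[str, Any]) -> str:
    """Hypothetical helper: one row per NAICS level, one R^2 column per model."""
    models = ("embedding", "one_hot", "hybrid")
    header = f"{'level':<20}" + "".join(f"{m:>12}" for m in models)
    lines: List[str] = [header, "-" * len(header)]
    for level, per_model in results.items():
        if level == "summary":  # aggregate entry, not a NAICS level
            continue
        cells = []
        for m in models:
            r2 = per_model.get(m, {}).get("r2")
            # Show "n/a" when a model or metric is missing for this level.
            cells.append(f"{r2:>12.2f}" if r2 is not None else f"{'n/a':>12}")
        lines.append(f"{level:<20}" + "".join(cells))
    return "\n".join(lines)


# Illustrative values copied from the documented example structure.
example = {
    "level_2_sector": {
        "embedding": {"r2": 0.85, "rmse": 0.12},
        "one_hot": {"r2": 0.72, "rmse": 0.18},
        "hybrid": {"r2": 0.87, "rmse": 0.11},
    },
    "summary": {"embedding_avg_r2": 0.82},
}
table = format_multilevel_table(example)
print(table)
```

Returning a string rather than printing directly keeps the helper easy to test; the actual function prints its table and may use a different layout.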
run_qcew_employment_benchmark(config)
Compare embedding, one-hot, and hybrid regressors on QCEW employment prediction.
run_qcew_multilevel_benchmark(config)
Compare embedding vs one-hot encoding across all NAICS levels.
This benchmark evaluates how well learned embeddings generalize across different levels of the NAICS hierarchy compared to simple one-hot encoding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | QCEWMultilevelConfig | Multi-level benchmark configuration. | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | Nested dictionary: {level_name: {model_type: {metric: value}}} |

Example structure:

    {
        'level_2_sector': {
            'embedding': {'r2': 0.85, 'rmse': 0.12},
            'one_hot': {'r2': 0.72, 'rmse': 0.18},
            'hybrid': {'r2': 0.87, 'rmse': 0.11},
            'metadata': {...}
        },
        'level_3_subsector': {...},
        ...
        'summary': {
            'embedding_avg_r2': 0.82,
            'one_hot_avg_r2': 0.65,
            'embedding_advantage': 0.17,
            ...
        }
    }
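The returned dictionary mixes per-level entries with a single 'summary' entry, so code that walks it needs to skip that key. A minimal sketch, assuming the structure documented above: the helper `r2_gap_by_level` is hypothetical, and the numbers are the illustrative values from the example structure, not real benchmark results.

```python
from typing import Any, Dict


def r2_gap_by_level(results: Dict[str, Any]) -> Dict[str, float]:
    """Hypothetical helper: embedding R^2 minus one-hot R^2 for each NAICS level."""
    gaps: Dict[str, float] = {}
    for level, per_model in results.items():
        if level == "summary":  # aggregate entry, not a NAICS level
            continue
        gaps[level] = per_model["embedding"]["r2"] - per_model["one_hot"]["r2"]
    return gaps


# Illustrative values copied from the documented example structure.
results = {
    "level_2_sector": {
        "embedding": {"r2": 0.85, "rmse": 0.12},
        "one_hot": {"r2": 0.72, "rmse": 0.18},
        "hybrid": {"r2": 0.87, "rmse": 0.11},
    },
    "summary": {"embedding_avg_r2": 0.82, "one_hot_avg_r2": 0.65},
}
gaps = r2_gap_by_level(results)
for level, gap in gaps.items():
    print(f"{level}: {gap:+.2f}")
```

This sketch does not claim to reproduce how 'embedding_advantage' in the summary is computed; it only shows one way to read the per-level metrics.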