Rank Order Preservation Loss

Bases: Module

Loss component that explicitly optimizes for rank order preservation (Spearman correlation).

This loss penalizes violations of rank order: if code A is closer to code B than to code C in the tree (ground truth), then the embedding distance A-B should be smaller than A-C.

This directly optimizes for Spearman correlation by ensuring relative distance ordering matches ground truth ordering.

forward(embeddings, codes, lorentz_distance_fn)

Compute rank order preservation loss.

For each anchor code, we check if the relative ordering of distances to other codes matches the ground truth ordering. We penalize violations using a margin-based ranking loss.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `embeddings` | `Tensor` | Hyperbolic embeddings `(N, D+1)` | required |
| `codes` | `List[str]` | List of NAICS codes corresponding to the embeddings | required |
| `lorentz_distance_fn` | `Callable[[Tensor, Tensor], Tensor]` | Function to compute Lorentz distances | required |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Loss scalar |
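The hinge-style penalty described above can be sketched as follows. This is an illustrative implementation, not the library's actual code: the triplet enumeration, the `margin` value, and the `tree_dist` ground-truth matrix are all assumptions, and a plain Euclidean distance stands in for the Lorentz distance.

```python
import torch

def rank_order_loss(embeddings, tree_dist, dist_fn, margin=0.1):
    """Penalize triplets (a, b, c) where the tree says a is closer to b
    than to c but the embedding distances disagree (illustrative sketch)."""
    n = embeddings.shape[0]
    losses = []
    for a in range(n):
        for b in range(n):
            for c in range(n):
                if len({a, b, c}) < 3:
                    continue
                # Ground truth: a should be closer to b than to c.
                if tree_dist[a][b] < tree_dist[a][c]:
                    d_ab = dist_fn(embeddings[a], embeddings[b])
                    d_ac = dist_fn(embeddings[a], embeddings[c])
                    # Hinge: zero when the ordering holds by at least `margin`.
                    losses.append(torch.clamp(d_ab - d_ac + margin, min=0.0))
    return torch.stack(losses).mean() if losses else embeddings.new_zeros(())

# Toy data: codes 0 and 1 are siblings in the tree; code 2 is far away.
emb = torch.tensor([[0.0, 0.0], [0.1, 0.0], [2.0, 0.0]])
tree = [[0, 1, 4], [1, 0, 4], [4, 4, 0]]
loss = rank_order_loss(emb, tree, lambda x, y: torch.norm(x - y))
```

Because every satisfied ordering contributes zero, the loss vanishes exactly when the embedding's distance ranking agrees with the tree's, which is what ties it to Spearman correlation.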

LambdaRank Loss

Bases: Module

LambdaRank loss for global ranking optimization.

Unlike pairwise ranking loss, LambdaRank considers the full ranking list (1 positive + k negatives) for each anchor and directly optimizes for NDCG.

Key advantages:

1. Position-aware: top positions matter more (via NDCG)
2. Global optimization: considers the entire ranking list, not just pairs
3. Direct NDCG optimization: gradients are scaled by the NDCG change from swapping

This is particularly effective for contrastive learning where we have 1 positive and k negatives (e.g., 24-32 negatives) per anchor.
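A short worked example of why NDCG makes the loss position-aware. With one relevant item (the positive, relevance 1) and zero-relevance negatives, DCG reduces to the discount at the positive's position, so swaps near the top of the list change the metric far more than swaps near the bottom. The `dcg` helper below is standard textbook DCG, not a function from this library.

```python
import math

def dcg(relevances):
    # Standard DCG: gain at 0-based position i is discounted by log2(i + 2).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

# One positive among three negatives.
best = dcg([1, 0, 0, 0])   # positive ranked first
worst = dcg([0, 0, 0, 1])  # positive ranked last

# LambdaRank scales each pairwise gradient by |delta-NDCG| from swapping
# that pair; swapping ranks 1 and 2 moves the metric much more than
# swapping ranks 3 and 4.
delta_top = abs(dcg([1, 0, 0, 0]) - dcg([0, 1, 0, 0]))
delta_bottom = abs(dcg([0, 0, 1, 0]) - dcg([0, 0, 0, 1]))
```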

forward(anchor_emb, positive_emb, negative_embs, anchor_codes, positive_codes, negative_codes, lorentz_distance_fn, batch_size, k_negatives)

Compute LambdaRank loss.

For each anchor, creates a ranking list: [positive, negative_1, ..., negative_k] and optimizes for NDCG using LambdaRank gradients.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `anchor_emb` | `Tensor` | Anchor embeddings `(batch_size, embedding_dim+1)` | required |
| `positive_emb` | `Tensor` | Positive embeddings `(batch_size, embedding_dim+1)` | required |
| `negative_embs` | `Tensor` | Negative embeddings `(batch_size * k_negatives, embedding_dim+1)` | required |
| `anchor_codes` | `List[str]` | List of anchor NAICS codes `(batch_size,)` | required |
| `positive_codes` | `List[str]` | List of positive NAICS codes `(batch_size,)` | required |
| `negative_codes` | `List[List[str]]` | List of lists of negative codes `(batch_size, k_negatives)` | required |
| `lorentz_distance_fn` | `Callable[[Tensor, Tensor], Tensor]` | Function to compute Lorentz distances | required |
| `batch_size` | `int` | Batch size | required |
| `k_negatives` | `int` | Number of negatives per anchor | required |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Loss scalar |
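A condensed sketch of the LambdaRank idea for the 1-positive / k-negatives case. Everything here is illustrative and simplified relative to the real method: distances are assumed precomputed, scores are just negated distances, and each anchor's pairwise losses are weighted by a single |ΔNDCG| term for moving the positive to the top (with one relevant item, IDCG = 1).

```python
import torch

def lambdarank_loss(pos_dist, neg_dists):
    """Pairwise logistic loss (positive vs. each negative), weighted by the
    NDCG change from lifting the positive to rank 0 (illustrative sketch).

    pos_dist:  (batch,)    anchor-to-positive distances
    neg_dists: (batch, k)  anchor-to-negative distances
    """
    # Scores: smaller distance means higher relevance score.
    s_pos = -pos_dist.unsqueeze(1)        # (batch, 1)
    s_neg = -neg_dists                    # (batch, k)
    # Current rank of the positive among all k+1 items (0 = top).
    pos_rank = (s_neg > s_pos).sum(dim=1) # (batch,)
    # |delta-NDCG| of moving the only relevant item from pos_rank to rank 0:
    # DCG gain goes from 1/log2(rank+2) to 1/log2(2) = 1, and IDCG = 1.
    delta_ndcg = (1.0 - 1.0 / torch.log2(pos_rank.float() + 2.0)).abs()
    # RankNet-style pairwise logistic loss for positive vs. each negative.
    pairwise = torch.nn.functional.softplus(-(s_pos - s_neg))  # (batch, k)
    return (delta_ndcg.unsqueeze(1) * pairwise).mean()

# Positive closer than every negative: already at rank 0, zero lambda weight.
good = lambdarank_loss(torch.tensor([0.5]), torch.tensor([[1.0, 2.0]]))
# Positive farther than both negatives: ranked last, positive loss.
bad = lambdarank_loss(torch.tensor([2.0]), torch.tensor([[0.5, 1.0]]))
```

Note how the |ΔNDCG| weight implements the "position-aware" property: anchors whose positive is already ranked first contribute nothing, while anchors whose positive sits deep in the list are pushed hardest.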