Core Metrics API

Core metrics classes for embedding evaluation.

Contains:
- EmbeddingEvaluator: Pairwise distance and similarity computation
- RetrievalMetrics: Precision@k, Recall@k, MAP, NDCG
- HierarchyMetrics: Cophenetic/Spearman correlation, NDCG ranking, distortion
- EmbeddingStatistics: Embedding space analysis and collapse detection

EmbeddingEvaluator

__init__()

Evaluator for embedding quality metrics.

Device is automatically detected via get_device().

compute_pairwise_distances(embeddings, metric='euclidean', curvature=1.0)

Compute pairwise distances between embeddings.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | Tensor | Tensor of shape (N, D) for Euclidean or (N, D+1) for Lorentz | required |
| metric | str | Distance metric ('euclidean', 'cosine', or 'lorentz') | 'euclidean' |
| curvature | float | Curvature parameter for Lorentz metric (default: 1.0) | 1.0 |

Returns:

| Type | Description |
|---|---|
| Tensor | Distance matrix of shape (N, N) |
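
To make the three metric options concrete, here is a minimal plain-Python sketch of what each one typically computes. It is not the library's implementation: the function name `pairwise_distances_sketch` is hypothetical, cosine distance is assumed to be 1 minus cosine similarity, and the Lorentz case assumes the hyperboloid model with the time-like coordinate stored first, which would explain the extra column in the (N, D+1) shape.

```python
import math

def pairwise_distances_sketch(points, metric="euclidean", curvature=1.0):
    """Return an N x N distance matrix (list of lists) for N points."""

    def euclidean(x, y):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

    def cosine(x, y):
        # Assumed convention: cosine distance = 1 - cosine similarity.
        dot = sum(a * b for a, b in zip(x, y))
        norm_x = math.sqrt(sum(a * a for a in x))
        norm_y = math.sqrt(sum(b * b for b in y))
        return 1.0 - dot / (norm_x * norm_y)

    def lorentz(x, y):
        # Minkowski inner product: the first (time-like) coordinate gets a minus sign.
        inner = -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))
        # Clamp at 1.0 so rounding error cannot leave the domain of acosh.
        return math.acosh(max(1.0, -curvature * inner)) / math.sqrt(curvature)

    dist = {"euclidean": euclidean, "cosine": cosine, "lorentz": lorentz}[metric]
    return [[dist(x, y) for y in points] for x in points]


# Two points on the unit hyperboloid (curvature 1.0), one unit of geodesic distance apart.
origin = [1.0, 0.0]
moved = [math.cosh(1.0), math.sinh(1.0)]
lorentz_matrix = pairwise_distances_sketch([origin, moved], metric="lorentz")
```

The clamp inside `lorentz` matters in practice: for a point compared with itself, the scaled inner product can land slightly below 1 in floating point, where `acosh` is undefined.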

compute_similarity_matrix(embeddings, metric='cosine')

Compute pairwise similarities between embeddings.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | Tensor | Tensor of shape (N, D) | required |
| metric | str | Similarity metric ('cosine' or 'dot') | 'cosine' |

Returns:

| Type | Description |
|---|---|
| Tensor | Similarity matrix of shape (N, N) |
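
The difference between the two similarity options can be illustrated with a small plain-Python sketch. The name `similarity_matrix_sketch` is hypothetical and the code only follows the textbook definitions: 'dot' is the raw inner product, and 'cosine' is the inner product of unit-normalised rows.

```python
import math

def similarity_matrix_sketch(vectors, metric="cosine"):
    """Return an N x N similarity matrix (list of lists)."""

    def dot(x, y):
        return sum(a * b for a, b in zip(x, y))

    if metric == "cosine":
        # Normalise every row to unit length so the dot product equals the cosine.
        vectors = [[a / math.sqrt(dot(v, v)) for a in v] for v in vectors]
    elif metric != "dot":
        raise ValueError(f"unknown metric: {metric!r}")
    return [[dot(x, y) for y in vectors] for x in vectors]


rows = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
dot_sim = similarity_matrix_sketch(rows, metric="dot")
cos_sim = similarity_matrix_sketch(rows, metric="cosine")
```

Dot-product similarity grows with vector length (the diagonal of `dot_sim` holds squared norms), while the cosine variant always has ones on the diagonal.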

EmbeddingStatistics

__init__()

Statistics for analyzing embedding space.

Device is automatically detected via get_device().

check_collapse(embeddings, variance_threshold=0.001, norm_cv_threshold=0.05, distance_cv_threshold=0.05)

Check whether embeddings have collapsed (become too similar to one another). The result includes the measured values and ratios alongside the collapse indicators, so each indicator can be compared against its threshold.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | Tensor | Tensor of shape (N, D) | required |
| variance_threshold | float | Threshold for mean variance per dimension | 0.001 |
| norm_cv_threshold | float | Threshold for coefficient of variation in norms | 0.05 |
| distance_cv_threshold | float | Threshold for coefficient of variation in distances | 0.05 |

Returns:

| Type | Description |
|---|---|
| Dict[str, Any] | Dictionary with collapse indicators and actual values |
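
The three thresholds suggest three separate checks. The sketch below shows one way such checks could be computed in plain Python. Everything beyond the parameter names is an assumption: the function name, the dictionary keys, the use of population (not sample) variance, Euclidean pairwise distances, and the "below threshold means suspicious" comparison are illustrative and may differ from the library.

```python
import math
from statistics import mean, pstdev, pvariance

def collapse_indicators_sketch(embeddings, variance_threshold=0.001,
                               norm_cv_threshold=0.05, distance_cv_threshold=0.05):
    def coefficient_of_variation(values):
        avg = mean(values)
        # Standard deviation relative to the mean; 0.0 when the mean is zero.
        return pstdev(values) / avg if avg > 0 else 0.0

    # Mean of the per-dimension variances (columns of the N x D matrix).
    mean_variance = mean(pvariance(column) for column in zip(*embeddings))
    norms = [math.sqrt(sum(a * a for a in row)) for row in embeddings]
    # Euclidean distance for every unordered pair of rows.
    distances = [math.dist(x, y)
                 for i, x in enumerate(embeddings) for y in embeddings[i + 1:]]
    norm_cv = coefficient_of_variation(norms)
    distance_cv = coefficient_of_variation(distances)
    return {  # hypothetical keys
        "mean_variance": mean_variance,
        "low_variance": mean_variance < variance_threshold,
        "norm_cv": norm_cv,
        "low_norm_cv": norm_cv < norm_cv_threshold,
        "distance_cv": distance_cv,
        "low_distance_cv": distance_cv < distance_cv_threshold,
    }


tight = collapse_indicators_sketch([[1.0, 1.0], [1.0, 1.001], [1.001, 1.0]])
spread = collapse_indicators_sketch([[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]])
```

Returning the raw values next to each flag makes it possible to see how far a measurement is from its threshold, which a bare boolean would hide.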

compute_statistics(embeddings)

Compute comprehensive statistics about embeddings.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | Tensor | Tensor of shape (N, D) | required |

Returns:

| Type | Description |
|---|---|
| Dict[str, Tensor] | Dictionary of statistics |

HierarchyMetrics

__init__()

Metrics for evaluating hierarchy preservation.

Device is automatically detected via get_device().

cophenetic_correlation(embedding_distances, tree_distances, min_distance=0.1)

Compute the cophenetic correlation coefficient.

Measures how well embedding distances preserve hierarchical tree distances. Pairs whose tree distance is below min_distance are filtered out to reduce noise.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embedding_distances | Tensor | Distance matrix from embeddings (N, N) | required |
| tree_distances | Tensor | Ground truth tree distances (N, N) | required |
| min_distance | float | Minimum tree distance to include (filters out same-level codes) | 0.1 |

Returns:

| Type | Description |
|---|---|
| Dict[str, Union[Tensor, int]] | Dictionary with correlation and metadata |
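
A cophenetic correlation is usually the Pearson correlation between two sets of pairwise distances. The sketch below applies that definition with the min_distance filter in plain Python. The function name and the keys `correlation` and `n_pairs` are hypothetical, and treating the filter as "greater than or equal to" is an assumption.

```python
import math

def filtered_pearson_sketch(embedding_distances, tree_distances, min_distance=0.1):
    n = len(tree_distances)
    # Each unordered pair once (upper triangle), kept only if the tree distance passes.
    pairs = [(embedding_distances[i][j], tree_distances[i][j])
             for i in range(n) for j in range(i + 1, n)
             if tree_distances[i][j] >= min_distance]
    xs, ys = zip(*pairs)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return {"correlation": covariance / (spread_x * spread_y), "n_pairs": len(pairs)}


# Codes 0 and 1 have tree distance 0, so their (badly embedded) pair is filtered out.
tree = [[0, 0, 2, 3], [0, 0, 2, 3], [2, 2, 0, 1], [3, 3, 1, 0]]
emb = [[0, 9, 4, 6], [9, 0, 4, 6], [4, 4, 0, 2], [6, 6, 2, 0]]
result = filtered_pearson_sketch(emb, tree)
```

Reporting the number of surviving pairs next to the coefficient is useful because a correlation over very few pairs is not trustworthy. This sketch does not guard against constant inputs, which would cause a division by zero.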

distortion(embedding_distances, tree_distances)

Compute distortion metrics (how much distances are stretched/compressed).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embedding_distances | Tensor | Distance matrix from embeddings (N, N) | required |
| tree_distances | Tensor | Ground truth tree distances (N, N) | required |

Returns:

| Type | Description |
|---|---|
| Dict[str, Tensor] | Dictionary with distortion metrics |
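
Stretching and compression are commonly measured through the ratio of embedding distance to tree distance for each pair. The following plain-Python sketch summarises those ratios. The function name and all dictionary keys are hypothetical, and the exact set of metrics the library reports is not specified here.

```python
def distortion_summary_sketch(embedding_distances, tree_distances):
    n = len(tree_distances)
    # Ratio > 1: the pair is stretched; ratio < 1: the pair is compressed.
    ratios = [embedding_distances[i][j] / tree_distances[i][j]
              for i in range(n) for j in range(i + 1, n)
              if tree_distances[i][j] > 0]
    return {  # hypothetical keys
        "mean_ratio": sum(ratios) / len(ratios),
        "max_expansion": max(ratios),
        "max_contraction": min(ratios),
        # Worst-case distortion: how unevenly the embedding scales distances.
        "worst_case": max(ratios) / min(ratios),
    }


tree = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
emb = [[0, 2, 2], [2, 0, 1], [2, 1, 0]]
summary = distortion_summary_sketch(emb, tree)
```

In this example one pair is doubled and another is halved, so the worst-case distortion is 4 even though the mean ratio is close to 1.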

ndcg_ranking(embedding_distances, tree_distances, k_values=[5, 10, 20], min_distance=0.1)

Compute NDCG@k for ranking evaluation.

For each anchor code, ranks all other codes by embedding distance and evaluates using NDCG based on tree distance relevance.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embedding_distances | Tensor | Distance matrix from embeddings (N, N) | required |
| tree_distances | Tensor | Ground truth tree distances (N, N) | required |
| k_values | List[int] | List of k values for NDCG@k | [5, 10, 20] |
| min_distance | float | Minimum tree distance to consider | 0.1 |

Returns:

| Type | Description |
|---|---|
| Dict[str, Tensor] | Dictionary with NDCG@k for each k value |
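
The per-anchor procedure described above can be sketched in plain Python as follows. Several details are assumptions made for illustration: relevance is taken as 1 / (1 + tree distance), codes below min_distance are dropped from the candidate list, the gain is linear, the result is averaged over anchors, and the function name and `ndcg@k` keys are hypothetical.

```python
import math

def ndcg_from_tree_sketch(embedding_distances, tree_distances,
                          k_values=(5, 10, 20), min_distance=0.1):
    n = len(tree_distances)
    totals = {k: 0.0 for k in k_values}
    for anchor in range(n):
        # Candidates: every other code whose tree distance passes the filter.
        candidates = [j for j in range(n)
                      if j != anchor and tree_distances[anchor][j] >= min_distance]
        # Assumed relevance: closer in the tree means more relevant.
        relevance = {j: 1.0 / (1.0 + tree_distances[anchor][j]) for j in candidates}
        ranked = sorted(candidates, key=lambda j: embedding_distances[anchor][j])
        ideal = sorted(candidates, key=lambda j: -relevance[j])
        for k in k_values:
            dcg = sum(relevance[j] / math.log2(pos + 2)
                      for pos, j in enumerate(ranked[:k]))
            idcg = sum(relevance[j] / math.log2(pos + 2)
                       for pos, j in enumerate(ideal[:k]))
            totals[k] += dcg / idcg if idcg > 0 else 0.0
    return {f"ndcg@{k}": totals[k] / n for k in k_values}


tree = [[0, 1, 2, 2], [1, 0, 2, 2], [2, 2, 0, 1], [2, 2, 1, 0]]
reversed_emb = [[0, 2, 1, 1], [2, 0, 1, 1], [1, 1, 0, 2], [1, 1, 2, 0]]
perfect = ndcg_from_tree_sketch(tree, tree, k_values=(1, 2))
poor = ndcg_from_tree_sketch(reversed_emb, tree, k_values=(1, 2))
```

When the embedding distances equal the tree distances, every anchor's ranking matches the ideal ordering and each NDCG@k is 1. Reversing the order of neighbours lowers the score.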

spearman_correlation(embedding_distances, tree_distances, min_distance=0.1)

Compute the Spearman rank correlation coefficient over pairs whose tree distance is at least min_distance.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embedding_distances | Tensor | Distance matrix from embeddings (N, N) | required |
| tree_distances | Tensor | Ground truth tree distances (N, N) | required |
| min_distance | float | Minimum tree distance to include | 0.1 |

Returns:

| Type | Description |
|---|---|
| Dict[str, Union[Tensor, int]] | Dictionary with correlation and metadata |
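
Spearman correlation is the Pearson correlation of ranks, so it rewards any monotone relationship, not only a linear one. The plain-Python sketch below shows the idea with tie-aware ranking. The helper names and the keys `correlation` and `n_pairs` are hypothetical, and how the library handles ties is not stated in this reference.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for idx in order[start:end + 1]:
            ranks[idx] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def pearson(xs, ys):
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

def filtered_spearman_sketch(embedding_distances, tree_distances, min_distance=0.1):
    n = len(tree_distances)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if tree_distances[i][j] >= min_distance]
    emb_ranks = average_ranks([embedding_distances[i][j] for i, j in pairs])
    tree_ranks = average_ranks([tree_distances[i][j] for i, j in pairs])
    return {"correlation": pearson(emb_ranks, tree_ranks), "n_pairs": len(pairs)}


tree = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 2, 0, 1], [3, 3, 1, 0]]
squared = [[d * d for d in row] for row in tree]  # monotone but non-linear in tree distance
result = filtered_spearman_sketch(squared, tree)
```

Squaring the distances keeps their order intact, so the rank correlation stays at 1 even though the relationship is no longer linear.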

RetrievalMetrics

__init__()

Metrics for evaluating retrieval quality.

Device is automatically detected via get_device().

mean_average_precision(distances, ground_truth, k=None)

Compute Mean Average Precision (MAP).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| distances | Tensor | Distance matrix of shape (N, N) | required |
| ground_truth | Tensor | Binary relevance matrix of shape (N, N) | required |
| k | Optional[int] | Maximum rank to consider (None = all) | None |

Returns:

| Type | Description |
|---|---|
| Tensor | MAP score (scalar) |
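
As an illustration of how MAP is usually derived from a distance matrix, the sketch below ranks candidates by ascending distance and averages the precision at each relevant hit. It is plain Python, not the library code. The name `map_sketch` is hypothetical, and three choices are assumptions: the query itself is excluded from its own ranking, average precision is normalised by the total number of relevant items, and queries with no relevant items are skipped.

```python
def map_sketch(distances, ground_truth, k=None):
    n = len(distances)
    ap_scores = []
    for q in range(n):
        # Nearest first; the query is not a candidate for itself (assumption).
        ranked = sorted((j for j in range(n) if j != q), key=lambda j: distances[q][j])
        if k is not None:
            ranked = ranked[:k]
        hits, precision_sum = 0, 0.0
        for rank, j in enumerate(ranked, start=1):
            if ground_truth[q][j]:
                hits += 1
                precision_sum += hits / rank  # precision at this relevant hit
        total_relevant = sum(1 for j in range(n) if j != q and ground_truth[q][j])
        if total_relevant:
            ap_scores.append(precision_sum / total_relevant)
    return sum(ap_scores) / len(ap_scores) if ap_scores else 0.0


distances = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
ground_truth = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
map_all = map_sketch(distances, ground_truth)
map_top1 = map_sketch(distances, ground_truth, k=1)
```

Cutting the ranking at k=1 lowers the score here because queries 0 and 1 only find their relevant item at rank 2.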

ndcg_at_k(distances, relevance_scores, k=10)

Compute Normalized Discounted Cumulative Gain (NDCG@k).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| distances | Tensor | Distance matrix of shape (N, N) | required |
| relevance_scores | Tensor | Relevance scores (higher = more relevant), shape (N, N) | required |
| k | int | Number of top results to consider | 10 |

Returns:

| Type | Description |
|---|---|
| Tensor | NDCG@k for each query (shape: N) |
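
For reference, one common definition of NDCG@k is written out below with a small worked example. Here π_q(i) denotes the item ranked i-th by ascending distance from query q, and IDCG@k is the DCG@k of the ideal ordering (relevance sorted in descending order). The linear gain is an assumption: some implementations use the exponential gain 2^rel - 1 instead, and this reference does not say which one the library applies.

```latex
\mathrm{DCG@}k(q) = \sum_{i=1}^{k} \frac{\mathrm{rel}_{q,\pi_q(i)}}{\log_2(i+1)},
\qquad
\mathrm{NDCG@}k(q) = \frac{\mathrm{DCG@}k(q)}{\mathrm{IDCG@}k(q)}

% Worked example, k = 3, relevance in ranked order (3, 1, 2):
\mathrm{DCG@}3 = \frac{3}{\log_2 2} + \frac{1}{\log_2 3} + \frac{2}{\log_2 4}
               \approx 3 + 0.631 + 1 = 4.631

% Ideal order (3, 2, 1):
\mathrm{IDCG@}3 = \frac{3}{\log_2 2} + \frac{2}{\log_2 3} + \frac{1}{\log_2 4}
                \approx 3 + 1.262 + 0.5 = 4.762

\mathrm{NDCG@}3 \approx \frac{4.631}{4.762} \approx 0.973
```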

precision_at_k(distances, ground_truth, k=10)

Compute precision@k for retrieval.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| distances | Tensor | Distance matrix of shape (N, N) | required |
| ground_truth | Tensor | Binary relevance matrix of shape (N, N) | required |
| k | int | Number of top results to consider | 10 |

Returns:

| Type | Description |
|---|---|
| Tensor | Precision@k for each query (shape: N) |
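
Precision@k is the fraction of the k nearest results that are relevant. A minimal plain-Python sketch, assuming the query is excluded from its own result list and that the denominator is always k (both are assumptions, and `precision_at_k_sketch` is a hypothetical name):

```python
def precision_at_k_sketch(distances, ground_truth, k=10):
    n = len(distances)
    scores = []
    for q in range(n):
        # The k nearest candidates, excluding the query itself.
        top_k = sorted((j for j in range(n) if j != q),
                       key=lambda j: distances[q][j])[:k]
        scores.append(sum(1 for j in top_k if ground_truth[q][j]) / k)
    return scores


distances = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
ground_truth = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
p_at_1 = precision_at_k_sketch(distances, ground_truth, k=1)
p_at_2 = precision_at_k_sketch(distances, ground_truth, k=2)
```

With a fixed denominator of k, a query that has fewer than k candidates can never reach a precision of 1, which is worth keeping in mind for small N.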

recall_at_k(distances, ground_truth, k=10)

Compute recall@k for retrieval.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| distances | Tensor | Distance matrix of shape (N, N) | required |
| ground_truth | Tensor | Binary relevance matrix of shape (N, N) | required |
| k | int | Number of top results to consider | 10 |

Returns:

| Type | Description |
|---|---|
| Tensor | Recall@k for each query (shape: N) |
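
Recall@k is the fraction of a query's relevant items that appear among its k nearest results. The plain-Python sketch below follows that definition. The name `recall_at_k_sketch` is hypothetical, and excluding the query from its own results and scoring queries without relevant items as 0.0 are assumptions.

```python
def recall_at_k_sketch(distances, ground_truth, k=10):
    n = len(distances)
    scores = []
    for q in range(n):
        relevant = {j for j in range(n) if j != q and ground_truth[q][j]}
        # The k nearest candidates, excluding the query itself.
        top_k = sorted((j for j in range(n) if j != q),
                       key=lambda j: distances[q][j])[:k]
        found = sum(1 for j in top_k if j in relevant)
        scores.append(found / len(relevant) if relevant else 0.0)
    return scores


distances = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
ground_truth = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
r_at_1 = recall_at_k_sketch(distances, ground_truth, k=1)
r_at_2 = recall_at_k_sketch(distances, ground_truth, k=2)
```

Unlike precision, the denominator here depends on the query: query 2 has two relevant items, so finding one of them at k=1 gives a recall of 0.5.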