Definition

The recall test measures a model's ability to find all positive instances. It is calculated as TP / (TP + FN), where TP is the number of true positives and FN the number of false negatives. For binary classification, it treats class 1 as “positive.” For multiclass classification, it uses the macro-average of the per-class recall scores, treating all classes equally.
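
As a concrete illustration, here is a short Python sketch that mirrors both conventions, using scikit-learn's recall_score (the choice of scikit-learn is ours for illustration; any equivalent implementation gives the same numbers):

from sklearn.metrics import recall_score

# Binary classification: class 1 is treated as "positive".
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(recall_score(y_true, y_pred, pos_label=1))  # TP=3, FN=1 -> 3 / 4 = 0.75

# Multiclass classification: macro-average of per-class recall,
# so every class counts equally regardless of its size.
y_true_mc = [0, 1, 2, 2, 1, 0]
y_pred_mc = [0, 2, 2, 2, 1, 0]
print(recall_score(y_true_mc, y_pred_mc, average="macro"))  # (1.0 + 0.5 + 1.0) / 3 ≈ 0.83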

Taxonomy

  • Task types: Tabular classification, text classification.
  • Availability: development and monitoring.

Why it matters

  • Recall measures how many of the actual positive cases the model correctly identifies, making it crucial when missing positive cases is costly.
  • It’s particularly important in applications like medical diagnosis, fraud detection, or safety systems where failing to detect positive cases can have serious consequences.
  • Higher recall values indicate better model performance, with 1.0 representing no false negatives.
  • Recall complements precision to provide a complete picture of model performance, especially in the context of the precision-recall trade-off.
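
The trade-off is easy to see with a small, hypothetical example: tightening the decision threshold of a probabilistic classifier typically raises precision while lowering recall. The scores below are made up purely to illustrate this.

from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.55, 0.5, 0.45, 0.4, 0.35, 0.3]

for threshold in (0.3, 0.5):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")

# threshold=0.3: precision=0.62, recall=1.00
# threshold=0.5: precision=0.75, recall=0.60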

Required columns

To compute this metric, your dataset must contain the following columns:
  • Predictions: The predicted class labels from your classification model
  • Ground truths: The actual/true class labels
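
For example, a minimal tabular dataset with these two columns might look like the following (the column names here are illustrative; use whatever names your dataset configuration maps to predictions and ground truths):

import pandas as pd

df = pd.DataFrame(
    {
        "predictions": [1, 0, 1, 1, 0],    # predicted class labels
        "ground_truths": [1, 0, 0, 1, 0],  # actual/true class labels
    }
)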

Test configuration examples

If you are writing a tests.json, here is an example of a valid configuration for the recall test:
[
  {
    "name": "Recall above 0.8",
    "description": "Ensure that the recall is above 0.8",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "recall",
        "operator": ">",
        "value": 0.8
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true,
    "usesTrainingDataset": false,
    "usesMlModel": true,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
  }
]
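
To make the semantics of this configuration concrete, the sketch below computes recall on a validation set and applies the configured operator and threshold. This is a simplified illustration of what the test encodes, not the platform's actual evaluation logic, and the labels are made up.

import json
import operator

from sklearn.metrics import recall_score

OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

# Hypothetical validation labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

with open("tests.json") as f:
    test = json.load(f)[0]

threshold = test["thresholds"][0]
measured = recall_score(y_true, y_pred)  # TP=4, FN=1 -> 0.80 with the labels above
passed = OPERATORS[threshold["operator"]](measured, threshold["value"])
print(f"recall={measured:.2f} -> {'PASS' if passed else 'FAIL'}")  # 0.80 > 0.8 is false -> FAIL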