Definition

The recall test measures a model's ability to find all positive instances. It is calculated as TP / (TP + FN), where TP is the number of true positives and FN the number of false negatives. For binary classification, it treats class 1 as “positive.” For multiclass classification, it uses the macro-average of the per-class recall scores, treating all classes equally.
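
As a concrete illustration, here is a short Python sketch that mirrors both conventions, using scikit-learn's recall_score (the choice of scikit-learn is ours for illustration; any equivalent implementation gives the same numbers):

from sklearn.metrics import recall_score

# Binary classification: class 1 is treated as "positive".
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(recall_score(y_true, y_pred, pos_label=1))  # TP=3, FN=1 -> 3 / 4 = 0.75

# Multiclass classification: macro-average of per-class recall,
# so every class counts equally regardless of its size.
y_true_mc = [0, 1, 2, 2, 1, 0]
y_pred_mc = [0, 2, 2, 2, 1, 0]
print(recall_score(y_true_mc, y_pred_mc, average="macro"))  # (1.0 + 0.5 + 1.0) / 3 ≈ 0.83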

Taxonomy

  • Task types: Tabular classification, text classification.
  • Availability: development and monitoring.

Why it matters

  • Recall measures how many of the actual positive cases the model correctly identifies, making it crucial when missing positive cases is costly.
  • It’s particularly important in applications like medical diagnosis, fraud detection, or safety systems where failing to detect positive cases can have serious consequences.
  • Higher recall values indicate better model performance, with 1.0 representing no false negatives.
  • Recall complements precision to provide a complete picture of model performance, especially in the context of the precision-recall trade-off.
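
The trade-off is easy to see with a small, hypothetical example: tightening the decision threshold of a probabilistic classifier typically raises precision while lowering recall. The scores below are made up purely to illustrate this.

from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.55, 0.5, 0.45, 0.4, 0.35, 0.3]

for threshold in (0.3, 0.5):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")

# threshold=0.3: precision=0.62, recall=1.00
# threshold=0.5: precision=0.75, recall=0.60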

Required columns

To compute this metric, your dataset must contain the following columns:
  • Predictions: The predicted class labels from your classification model
  • Ground truths: The actual/true class labels
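
For example, a minimal tabular dataset with these two columns might look like the following (the column names here are illustrative; use whatever names your dataset configuration maps to predictions and ground truths):

import pandas as pd

df = pd.DataFrame(
    {
        "predictions": [1, 0, 1, 1, 0],    # predicted class labels
        "ground_truths": [1, 0, 0, 1, 0],  # actual/true class labels
    }
)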

Test configuration examples

If you are writing a tests.json, here is an example of a valid configuration for the recall test:
[
  {
    "name": "Recall above 0.8",
    "description": "Ensure that the recall is above 0.8",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "recall",
        "operator": ">",
        "value": 0.8
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true,
    "usesTrainingDataset": false,
    "usesMlModel": true,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
  }
]
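
To make the semantics of this configuration concrete, the sketch below computes recall on a validation set and applies the configured operator and threshold. This is a simplified illustration of what the test encodes, not the platform's actual evaluation logic, and the labels are made up.

import json
import operator

from sklearn.metrics import recall_score

OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

# Hypothetical validation labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

with open("tests.json") as f:
    test = json.load(f)[0]

threshold = test["thresholds"][0]
measured = recall_score(y_true, y_pred)  # TP=4, FN=1 -> 0.80 with the labels above
passed = OPERATORS[threshold["operator"]](measured, threshold["value"])
print(f"recall={measured:.2f} -> {'PASS' if passed else 'FAIL'}")  # 0.80 > 0.8 is false -> FAIL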