Definition

The log loss test measures the dissimilarity between a model's predicted probabilities and the true class labels. Also known as cross-entropy loss (or binary cross-entropy in the binary classification case), it evaluates how well the model's predicted probabilities match the actual outcomes.
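For a single binary example with true label y ∈ {0, 1} and predicted probability p, the loss is −[y·log(p) + (1 − y)·log(1 − p)], and the metric is the mean over all examples. As a minimal sketch (the helper name `binary_log_loss` is illustrative, not part of any library):

```python
import math

def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to (0, 1) for numerical stability
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

print(binary_log_loss([1, 0], [0.9, 0.1]))  # ≈ 0.105
```

Note the clipping step: without it, a predicted probability of exactly 0 or 1 on a misclassified example would yield an infinite loss.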

Taxonomy

  • Task types: Tabular classification, text classification.

Why it matters

  • Log loss provides a probabilistic measure of classification performance, considering not just correctness but also confidence in predictions.
  • It heavily penalizes confident wrong predictions, making it sensitive to model calibration and overconfidence.
  • Lower log loss values indicate better model performance, with 0 representing perfect probability predictions.
  • This metric is particularly valuable when you need well-calibrated probability estimates, not just class predictions.
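The penalty for confident wrong predictions grows sharply as the misplaced confidence increases. A quick illustration, computing the per-example loss when the true label is 0 but the model assigns high probability to class 1:

```python
import math

# Per-example loss when the true label is 0 but the model predicts class 1
for p in (0.6, 0.9, 0.99):
    loss = -math.log(1 - p)  # binary cross-entropy term for y = 0
    print(f"predicted P(class=1)={p:.2f} -> loss={loss:.3f}")
```

A mildly wrong prediction (0.6) costs about 0.92, while a confidently wrong one (0.99) costs about 4.61, which is why overconfident, poorly calibrated models score badly on this metric.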

Required columns

To compute this metric, your dataset must contain the following columns:
  • Prediction probabilities: The predicted class probabilities from your classification model
  • Ground truths: The actual/true class labels
Log loss requires predicted probabilities, not just class labels. Ensure your model outputs probability estimates for each class.
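In the multiclass case, the prediction column holds one probability per class for each row, and the loss for a row is the negative log of the probability assigned to the true class. A minimal sketch of how the two required columns combine (the `log_loss` helper and the sample data below are illustrative):

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Multiclass log loss: y_prob[i] holds one probability per class."""
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        p = min(max(probs[label], eps), 1 - eps)  # probability of the true class
        total += -math.log(p)
    return total / len(y_true)

# Ground truths as integer class indices, plus a full probability row per example
y_true = [0, 2, 1]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.5, 0.2],
]
print(round(log_loss(y_true, y_prob), 3))  # → 0.469
```

If your model only outputs hard class labels, this metric cannot be computed; configure the model to return probability estimates instead.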

Test configuration examples

If you are writing a tests.json, here is a valid configuration for the log loss test:
[
  {
    "name": "Log loss below 0.3",
    "description": "Ensure that the log loss is below 0.3",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "logLoss",
        "operator": "<",
        "value": 0.3
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true,
    "usesTrainingDataset": false,
    "usesMlModel": true,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
  }
]