Definition

The JSON score test measures how close the output is to a valid JSON format, evaluating the structural correctness of generated JSON data.

Taxonomy

  • Task types: LLM.
  • Availability: and .

Why it matters

  • JSON score is crucial for applications that require structured data output, such as API responses, configuration files, or data extraction tasks.
  • This metric helps ensure that your LLM generates properly formatted JSON that can be parsed and used by downstream systems.
  • It’s particularly important for applications where malformed JSON could cause system failures or data processing errors.

Required columns

To compute this metric, your dataset must contain the following columns:
  • Outputs: The generated text from your LLM (expected to be JSON format)
This metric evaluates the structural validity of JSON output and doesn’t require ground truth data for comparison.

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the JSON score test:
[
  {
    "name": "Mean JSON score above 0.95",
    "description": "Ensure that the mean JSON score is above 0.95 for valid JSON structure",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "meanJsonScore",
        "operator": ">",
        "value": 0.95
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true,
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689"
  }
]