Median latency

Definition
Taxonomy
Why it matters
Test configuration examples
Related

Definition

The median latency test ensures that the median (50th percentile) latency for the data is within a given range.

Taxonomy

Task types: LLM, tabular classification, tabular regression, text classification.
Availability: .

Why it matters

The median latency represents the typical user experience, as it’s the latency value that half of all requests fall below.
This metric is less sensitive to outliers than the mean latency, providing a more stable measure of central tendency.
Monitoring median latency helps ensure that the typical user receives acceptable performance, making it a key indicator for overall system health.

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the median latency test:

[
  {
    "name": "Median latency below 3000 msec",
    "description": "Make sure that the median latency is below 3000 msec",
    "type": "performance",
    "subtype": "metricThreshold",
    "thresholds": [
      {
        "insightName": "metrics",
        "insightParameters": null,
        "measurement": "medianLatency",
        "operator": "<",
        "value": 3000
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true, // Apply test to the validation set
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
  }
]

Min latency

BLEU score

⌘I

Get started

Workspace setup

Governance

Observability

Data quality monitoring

Offline testing

Tests

Administration

Other resources

Definition

Taxonomy

Why it matters

Test configuration examples

Get started

Workspace setup

Governance

Observability

Data quality monitoring

Offline testing

Tests

Administration

Other resources

​Definition

​Taxonomy

​Why it matters

​Test configuration examples

​Related

Definition

Taxonomy

Why it matters

Test configuration examples

Related