# Ill-formed rows

> Learn how to use the ill-formed rows test

## Definition

A row of text is considered ill-formed if it contains more non-alphabetical characters than alphabetical ones. The ill-formed rows
test lets you set a threshold on the number, or the percentage, of rows that are ill-formed.
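
As a rough illustration of the definition, the check can be sketched in Python as below. This is a minimal sketch, not Openlayer's implementation: the helper names `is_ill_formed` and `ill_formed_stats` are hypothetical, and it assumes that whitespace, digits, and punctuation all count as non-alphabetical and that "alphabetical" means whatever `str.isalpha` accepts.

```python
def is_ill_formed(text: str) -> bool:
    """Return True when non-alphabetical characters outnumber alphabetical ones.

    Assumption: every character that is not a letter (spaces, digits,
    punctuation) counts as non-alphabetical.
    """
    alpha = sum(ch.isalpha() for ch in text)
    non_alpha = len(text) - alpha
    return non_alpha > alpha


def ill_formed_stats(rows: list[str]) -> tuple[int, float]:
    """Return the ill-formed row count and the fraction of ill-formed rows."""
    count = sum(is_ill_formed(row) for row in rows)
    fraction = count / len(rows) if rows else 0.0
    return count, fraction


rows = ["Hello world", "1234 !!", "abc123", "???"]
count, fraction = ill_formed_stats(rows)
print(count, fraction)
```

Under these assumptions, a tie such as `"abc123"` (three letters, three digits) is not ill-formed, because the definition requires strictly more non-alphabetical characters.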

## Taxonomy

* **Task types**: LLM, text classification.
* **Availability**: <Tooltip tip="Continuously evaluate your models and datasets as you iterate on their versions.">development</Tooltip>
  and <Tooltip tip="Monitor a model in production, measure its health, check for drifts and set up alerts.">monitoring</Tooltip>.

## Why it matters

* Ill-formed rows can be a sign of data quality issues.
* Understanding the extent of ill-formed data helps in designing models that are robust to such anomalies. If your model is expected to encounter similar data in production, you might want to train it with some level of noise tolerance.

## Test configuration examples

If you are writing a `tests.json`, here are a few valid configurations for the ill-formed rows test:

<CodeGroup>
  ```json Development theme={null}
  [
    {
      "name": "No rows with ill-formed text",
      "description": "Asserts that there are no rows with more non-alpha characters than alpha characters",
      "type": "integrity",
      "subtype": "illFormedRowCount",
      "thresholds": [
        {
          "insightName": "illFormedRowCount",
          "insightParameters": [{ "name": "column_name", "value": "output" }], // Check column `output`
          "measurement": "illFormedRowCount", // Using the absolute row count
          "operator": "<=",
          "value": 0
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true, // Apply test to the validation set
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
    },
    {
      "name": "Less than 20% of rows with ill-formed text",
      "description": "Asserts that less than 20% of the rows have more non-alpha characters than alpha characters",
      "type": "integrity",
      "subtype": "illFormedRowCount",
      "thresholds": [
        {
          "insightName": "illFormedRowCount",
          "insightParameters": [{ "name": "column_name", "value": "output" }], // Check column `output`
          "measurement": "illFormedRowPercentage", // Using the fraction of rows (0.2 = 20%)
          "operator": "<",
          "value": 0.2
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true, // Apply test to the validation set
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "96622fba-ea00-4e42-8f42-5e8f5f60805f" // Some unique id
    }
  ]
  ```

  ```json Monitoring theme={null}
  [
    {
      "name": "No rows with ill-formed text",
      "description": "Asserts that there are no rows with more non-alpha characters than alpha characters",
      "type": "integrity",
      "subtype": "illFormedRowCount",
      "thresholds": [
        {
          "insightName": "illFormedRowCount",
          "insightParameters": [{ "name": "column_name", "value": "output" }], // Check column `output`
          "measurement": "illFormedRowCount", // Using the absolute row count
          "operator": "<=",
          "value": 0
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600, // 1 hour
      "delayWindow": 0,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
    },
    {
      "name": "Less than 20% of rows with ill-formed text",
      "description": "Asserts that less than 20% of the rows have more non-alpha characters than alpha characters",
      "type": "integrity",
      "subtype": "illFormedRowCount",
      "thresholds": [
        {
          "insightName": "illFormedRowCount",
          "insightParameters": [{ "name": "column_name", "value": "output" }], // Check column `output`
          "measurement": "illFormedRowPercentage", // Using the fraction of rows (0.2 = 20%)
          "operator": "<",
          "value": 0.2
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600, // 1 hour
      "delayWindow": 0,
      "syncId": "96622fba-ea00-4e42-8f42-5e8f5f60805f" // Some unique id
    }
  ]
  ```
</CodeGroup>
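
To see how the two thresholds above differ, the sketch below evaluates them against the same hypothetical data: 3 ill-formed rows out of 20. It is only an illustration of the comparison logic. The `threshold_passes` helper and the `OPERATORS` table are assumptions made for this example and are not part of the Openlayer API.

```python
import operator

# Only the two operators that appear in the configurations above.
OPERATORS = {"<": operator.lt, "<=": operator.le}


def threshold_passes(measured: float, op: str, value: float) -> bool:
    """Compare a measured value against a threshold using the given operator."""
    return OPERATORS[op](measured, value)


total_rows = 20   # hypothetical dataset size
ill_formed = 3    # hypothetical number of ill-formed rows

# "illFormedRowCount" with operator "<=" and value 0
count_ok = threshold_passes(ill_formed, "<=", 0)

# "illFormedRowPercentage" with operator "<" and value 0.2 (a fraction, i.e. 20%)
fraction_ok = threshold_passes(ill_formed / total_rows, "<", 0.2)

print(count_ok, fraction_ok)
```

With these numbers the strict count test fails, while the percentage test passes because 3 / 20 = 0.15 is below 0.2.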

## Related

* [Special characters ratio test](/tests/integrity/special-characters-ratio).
