> ## Documentation Index
> Fetch the complete documentation index at: https://docs.openlayer.com/llms.txt
> Use this file to discover all available pages before exploring further.

# SQL query

> Learn how to use the SQL query test to create custom data validation tests

## Definition

The SQL query test allows you to write custom SQL queries to validate your data and set
thresholds on the query results. This test executes a user-defined SQL query that
returns a numerical result, which can then be compared against specified thresholds.

The SQL query must reference your dataset as `df` (the table name) and should return a single numerical value.

## Taxonomy

* **Task types**: LLM, tabular classification, tabular regression, text classification.
* **Availability**: <Tooltip tip="Continuously evaluate your models and datasets as you iterate on their versions.">development</Tooltip>
  and <Tooltip tip="Monitor a model in production, measure its health, check for drifts and set up alerts.">monitoring</Tooltip>.

## Why it matters

* The SQL query test provides flexibility for creating custom data validation
  rules tailored to your specific use case.
* You can implement complex business logic and data quality checks that
  aren't covered by standard tests.
* This enables domain-specific validation rules, such as checking data consistency
  across multiple columns, validating ranges, or ensuring specific business constraints are met.
* Custom SQL queries allow you to leverage the full power of SQL for data analysis
  and validation within your testing pipeline.

## Test configuration examples

If you are writing a `tests.json`, here are a few valid configurations for the SQL query test:

<CodeGroup>
  ```json Development theme={null}
  [
    {
      "name": "Minimum record count check",
      "description": "Ensures the dataset has at least 1000 records",
      "type": "integrity",
      "subtype": "sqlQuery",
      "thresholds": [
        {
          "insightName": "sqlQuery",
          "insightParameters": [
            { "name": "query", "value": "SELECT COUNT(*) FROM df" } // Custom SQL query
          ],
          "measurement": "result",
          "operator": ">=",
          "value": 1000
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true, // Apply test to the validation set
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
    },
    {
      "name": "Age range validation",
      "description": "Checks that no records have age values outside the expected range (18-100)",
      "type": "integrity",
      "subtype": "sqlQuery",
      "thresholds": [
        {
          "insightName": "sqlQuery",
          "insightParameters": [
            { "name": "query", "value": "SELECT COUNT(*) FROM df WHERE age < 18 OR age > 100" }
          ],
          "measurement": "result",
          "operator": "<=",
          "value": 0
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true,
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "96622fba-ea00-4e42-8f42-5e8f5f60805f" // Some unique id
    }
  ]
  ```

  ```json Monitoring theme={null}
  [
    {
      "name": "Revenue consistency check",
      "description": "Validates that average transaction amount is within expected range",
      "type": "integrity",
      "subtype": "sqlQuery",
      "thresholds": [
        {
          "insightName": "sqlQuery",
          "insightParameters": [
            { "name": "query", "value": "SELECT AVG(transaction_amount) FROM df" }
          ],
          "measurement": "result",
          "operator": ">=",
          "value": 50.0
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600, // 1 hour
      "delayWindow": 0,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
    },
    {
      "name": "Data completeness check",
      "description": "Ensures that critical fields are not null for more than 5% of records",
      "type": "integrity",
      "subtype": "sqlQuery",
      "thresholds": [
        {
          "insightName": "sqlQuery",
          "insightParameters": [
            {
              "name": "query",
              "value": "SELECT (COUNT(*) - COUNT(customer_id)) * 100.0 / COUNT(*) FROM df"
            }
          ],
          "measurement": "result",
          "operator": "<=",
          "value": 5.0
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600, // 1 hour
      "delayWindow": 0,
      "syncId": "96622fba-ea00-4e42-8f42-5e8f5f60805f" // Some unique id
    }
  ]
  ```
</CodeGroup>
