# Contains PII

> Learn how to use the personally identifiable information (PII) test to detect sensitive data

## Definition

The PII test detects personally identifiable information (PII) in your data. It supports a broad range of PII types, including financial information, government identifiers, contact details, and location data, across multiple countries and regions.

You can specify one or more PII types to check for, and set a threshold on either the absolute number or the percentage of rows that contain PII.
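
To make the two threshold styles concrete, here is a minimal Python sketch of how a row count and a row percentage relate. It is not Openlayer's detector: the regular expression is a deliberately simple stand-in for `EMAIL_ADDRESS` detection, and `pii_row_stats` is a hypothetical helper name. The percentage uses a 0 to 100 scale, matching the configuration examples on this page.

```python
import re

# Toy stand-in for an EMAIL_ADDRESS detector (assumption: the real test uses
# a far more robust detector than this regular expression).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def pii_row_stats(rows):
    """Return (count, percentage) of rows containing at least one match."""
    flagged = sum(1 for text in rows if EMAIL_RE.search(text))
    percentage = 100.0 * flagged / len(rows) if rows else 0.0
    return flagged, percentage


rows = [
    "Please contact jane.doe@example.com for details.",
    "The order shipped on Tuesday.",
    "No personal data here.",
    "Thanks for your help!",
]
count, pct = pii_row_stats(rows)

# With these rows, a count threshold of `<= 0` would fail,
# while a percentage threshold of `<= 25.0` would pass.
print(count, pct)  # prints: 1 25.0
```

A count threshold suits strict policies where a single occurrence is a failure. A percentage threshold suits larger datasets where a small, known rate of PII is tolerated.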

## Taxonomy

* **Task types**: LLM, tabular classification, tabular regression, text classification.
* **Availability**: <Tooltip tip="Continuously evaluate your models and datasets as you iterate on their versions.">development</Tooltip>
  and <Tooltip tip="Monitor a model in production, measure its health, check for drifts and set up alerts.">monitoring</Tooltip>.

## Why it matters

* **Data privacy compliance**: Helps you check that your data complies with privacy regulations such as GDPR and CCPA.
* **Security**: Helps prevent accidental exposure of sensitive personal information.
* **Model safety**: LLMs can memorize PII that appears in their training data and later leak it in their outputs.
* **Audit trail**: Provides a record of PII checks that you can use for compliance reporting.

## Supported PII types

### General PII types

| Type              | Description                           |
| ----------------- | ------------------------------------- |
| `CREDIT_CARD`     | Credit card numbers (various formats) |
| `EMAIL_ADDRESS`   | Email addresses                       |
| `PHONE_NUMBER`    | Phone numbers (various formats)       |
| `IP_ADDRESS`      | IP addresses                          |
| `URL`             | Web URLs                              |
| `DATE_TIME`       | Date and time information             |
| `LOCATION`        | Geographic locations                  |
| `PERSON`          | Person names                          |
| `CRYPTO`          | Cryptocurrency addresses              |
| `MEDICAL_LICENSE` | Medical license numbers               |
| `NRP`             | National registry of persons          |
| `IBAN_CODE`       | International Bank Account Numbers    |

### United States

| Type                | Description                                |
| ------------------- | ------------------------------------------ |
| `US_SSN`            | Social Security Numbers                    |
| `US_BANK_NUMBER`    | US bank account numbers                    |
| `US_DRIVER_LICENSE` | US driver's license numbers                |
| `US_ITIN`           | Individual Taxpayer Identification Numbers |
| `US_PASSPORT`       | US passport numbers                        |

### United Kingdom

| Type      | Description                     |
| --------- | ------------------------------- |
| `UK_NHS`  | National Health Service numbers |
| `UK_NINO` | National Insurance numbers      |

### European Union

| Type                        | Description                              |
| --------------------------- | ---------------------------------------- |
| `ES_NIF`                    | Spanish tax identification numbers       |
| `ES_NIE`                    | Spanish foreigner identification numbers |
| `IT_FISCAL_CODE`            | Italian tax codes                        |
| `IT_DRIVER_LICENSE`         | Italian driver's licenses                |
| `IT_VAT_CODE`               | Italian VAT codes                        |
| `IT_PASSPORT`               | Italian passport numbers                 |
| `IT_IDENTITY_CARD`          | Italian identity cards                   |
| `FI_PERSONAL_IDENTITY_CODE` | Finnish personal identity codes          |
| `PL_PESEL`                  | Polish personal identification numbers   |

### Asia-Pacific

| Type                      | Description                      |
| ------------------------- | -------------------------------- |
| `SG_NRIC_FIN`             | Singapore NRIC/FIN numbers       |
| `SG_UEN`                  | Singapore Unique Entity Numbers  |
| `AU_ABN`                  | Australian Business Numbers      |
| `AU_ACN`                  | Australian Company Numbers       |
| `AU_TFN`                  | Australian Tax File Numbers      |
| `AU_MEDICARE`             | Australian Medicare numbers      |
| `IN_PAN`                  | Indian Permanent Account Numbers |
| `IN_AADHAAR`              | Indian Aadhaar numbers           |
| `IN_VEHICLE_REGISTRATION` | Indian vehicle registration      |
| `IN_VOTER`                | Indian voter ID numbers          |
| `IN_PASSPORT`             | Indian passport numbers          |

### South America

| Type      | Description                                   |
| --------- | --------------------------------------------- |
| `BR_CPF`  | Brazilian individual taxpayer registry        |
| `BR_CNPJ` | Brazilian national registry of legal entities |
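
Because `pii_type` values are plain strings, a typo is easy to make. The following sketch checks a list of names against the types documented in the tables above before you put them in a test configuration. `SUPPORTED_PII_TYPES` is transcribed from this page, and `unknown_pii_types` is a hypothetical helper, not part of any Openlayer SDK.

```python
# Transcribed from the tables on this page; update it if the supported list changes.
SUPPORTED_PII_TYPES = frozenset("""
CREDIT_CARD EMAIL_ADDRESS PHONE_NUMBER IP_ADDRESS URL DATE_TIME LOCATION PERSON
CRYPTO MEDICAL_LICENSE NRP IBAN_CODE
US_SSN US_BANK_NUMBER US_DRIVER_LICENSE US_ITIN US_PASSPORT
UK_NHS UK_NINO
ES_NIF ES_NIE IT_FISCAL_CODE IT_DRIVER_LICENSE IT_VAT_CODE IT_PASSPORT
IT_IDENTITY_CARD FI_PERSONAL_IDENTITY_CODE PL_PESEL
SG_NRIC_FIN SG_UEN AU_ABN AU_ACN AU_TFN AU_MEDICARE IN_PAN IN_AADHAAR
IN_VEHICLE_REGISTRATION IN_VOTER IN_PASSPORT
BR_CPF BR_CNPJ
""".split())


def unknown_pii_types(requested):
    """Return the requested names that are not in the documented list, in order."""
    return [name for name in requested if name not in SUPPORTED_PII_TYPES]


# "EMAIL" is not a documented type; the documented name is "EMAIL_ADDRESS".
print(unknown_pii_types(["US_SSN", "UK_NINO", "EMAIL"]))  # prints: ['EMAIL']
```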

## Test configuration examples

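If you are writing a `tests.json`, here are a few valid configurations for the PII test. The `//` comments are annotations for this page; standard JSON does not allow comments, so remove them if your tooling requires strict JSON.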

<CodeGroup>
  ```json Development theme={null}
  [
    {
      "name": "No financial PII in model outputs",
      "description": "Ensures no credit cards or bank numbers appear in model outputs",
      "type": "integrity",
      "subtype": "containsPii",
      "thresholds": [
        {
          "insightName": "containsPii",
          "insightParameters": [
            {
              "name": "pii_type",
              "value": ["CREDIT_CARD", "US_BANK_NUMBER", "IBAN_CODE"] // Check multiple PII types
            },
            {
              "name": "column_name",
              "value": "model_output"
            }
          ],
          "measurement": "containsPIIRowCount",
          "operator": "<=",
          "value": 0
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true,
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
    },
    {
      "name": "Limited contact information leakage",
      "description": "Allows up to 5% of rows to contain contact information",
      "type": "integrity",
      "subtype": "containsPii",
      "thresholds": [
        {
          "insightName": "containsPii",
          "insightParameters": [
            {
              "name": "pii_type",
              "value": ["EMAIL_ADDRESS", "PHONE_NUMBER"] // Multiple types in array
            },
            {
              "name": "column_name",
              "value": "generated_text"
            }
          ],
          "measurement": "containsPIIRowPercentage", // Use percentage measurement
          "operator": "<=",
          "value": 5.0
        }
      ],
      "subpopulationFilters": null,
      "mode": "development",
      "usesValidationDataset": true,
      "usesTrainingDataset": false,
      "usesMlModel": false,
      "syncId": "96622fba-ea00-4e42-8f42-5e8f5f60805f" // Some unique id
    }
  ]
  ```

  ```json Monitoring theme={null}
  [
    {
      "name": "No government IDs in production data",
      "description": "Monitors for government identification numbers in production",
      "type": "integrity",
      "subtype": "containsPii",
      "thresholds": [
        {
          "insightName": "containsPii",
          "insightParameters": [
            {
              "name": "pii_type",
              "value": [
                "US_SSN",
                "US_DRIVER_LICENSE",
                "US_PASSPORT",
                "UK_NINO",
                "UK_NHS"
              ]
            },
            {
              "name": "column_name",
              "value": "user_input"
            }
          ],
          "measurement": "containsPIIRowCount",
          "operator": "<=",
          "value": 0
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600, // 1 hour
      "delayWindow": 0,
      "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
    },
    {
      "name": "International PII monitoring",
      "description": "Monitors for various international PII types with tolerance",
      "type": "integrity",
      "subtype": "containsPii",
      "thresholds": [
        {
          "insightName": "containsPii",
          "insightParameters": [
            {
              "name": "pii_type",
              "value": [
                "ES_NIF",
                "IT_FISCAL_CODE",
                "AU_TFN",
                "IN_AADHAAR",
                "BR_CPF"
              ]
            },
            {
              "name": "column_name",
              "value": "chat_message"
            }
          ],
          "measurement": "containsPIIRowPercentage",
          "operator": "<=",
          "value": 2.0 // Allow up to 2% of messages to contain international PII
        }
      ],
      "subpopulationFilters": null,
      "mode": "monitoring",
      "usesProductionData": true,
      "evaluationWindow": 3600, // 1 hour
      "delayWindow": 0,
      "syncId": "96622fba-ea00-4e42-8f42-5e8f5f60805f" // Some unique id
    }
  ]
  ```
</CodeGroup>
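
If you would rather generate `tests.json` than write it by hand, a short script can produce a fresh `syncId` for each test and emit strict JSON. The sketch below mirrors the field names of the development example above. `build_pii_test` is a hypothetical helper, and using a random UUID for `syncId` is an assumption based on the format of the example ids.

```python
import json
import uuid


def build_pii_test(name, description, pii_types, column_name,
                   measurement="containsPIIRowCount", value=0):
    """Build one development-mode PII test shaped like the examples above."""
    return {
        "name": name,
        "description": description,
        "type": "integrity",
        "subtype": "containsPii",
        "thresholds": [
            {
                "insightName": "containsPii",
                "insightParameters": [
                    {"name": "pii_type", "value": list(pii_types)},
                    {"name": "column_name", "value": column_name},
                ],
                "measurement": measurement,
                "operator": "<=",
                "value": value,
            }
        ],
        "subpopulationFilters": None,
        "mode": "development",
        "usesValidationDataset": True,
        "usesTrainingDataset": False,
        "usesMlModel": False,
        # Assumption: any unique string works; the examples use UUID-style ids.
        "syncId": str(uuid.uuid4()),
    }


tests = [
    build_pii_test(
        "No financial PII in model outputs",
        "Ensures no credit cards or bank numbers appear in model outputs",
        ["CREDIT_CARD", "US_BANK_NUMBER", "IBAN_CODE"],
        "model_output",
    )
]

# json.dumps emits strict JSON (no comments), ready to save as tests.json.
tests_json = json.dumps(tests, indent=2)
print(tests_json)
```

For a percentage-based test, pass `measurement="containsPIIRowPercentage"` and a `value` such as `5.0`, as in the second development example.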
