The Databricks IPython kernel is an environment used to interact with a Spark cluster. Therefore, the only assumption made by this integration guide is that your datasets can be read as Spark dataframes.

Openlayer currently accepts datasets in two formats: pandas dataframes and CSV files. Consequently, the first step is to ensure that the data you wish to use is in one of these formats.

Databricks stores tables in Delta Lake format by default. To read a table and convert it to a pandas dataframe, you can use the code below:
```python
import pandas as pd

# Read as a Spark df
spark_df = spark.read.table("<catalog_name>.<schema_name>.<table_name>")

# Convert to a pandas df
pandas_df = spark_df.toPandas()
```
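Since Openlayer also accepts CSV files, you can write the pandas dataframe to a CSV file instead. Below is a minimal sketch using pandas' `to_csv` method; the output path is a placeholder you should adapt to your workspace:

```python
# Write the pandas df to a CSV file
# (the path below is a placeholder -- adjust it to your environment)
pandas_df.to_csv("/tmp/my_dataset.csv", index=False)
```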
Alternatively, if your dataset is saved in your Databricks environment as a file, you can read it and convert it to a pandas dataframe with the following code:
```python
import pandas as pd

# Read as a Spark df
spark_df = (
    spark.read
    .format("parquet")  # Change according to your file format
    .option("header", "true")
    .option("inferSchema", "true")
    .load("/databricks-datasets/path/to/your/dataset/file")
)

# Convert to a pandas df
pandas_df = spark_df.toPandas()
```
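Keep in mind that `toPandas()` collects the entire dataset into the driver's memory. If your dataset is large, consider downsampling it in Spark before converting. Below is a minimal sketch; the 10% fraction and the seed are arbitrary example values:

```python
# Downsample in Spark before converting, to avoid exhausting driver memory
# (the fraction and seed below are arbitrary examples -- tune them to your needs)
sampled_df = spark_df.sample(fraction=0.1, seed=42)
pandas_df = sampled_df.toPandas()
```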