Correlated features
Definition
The correlated features test checks whether any features in the dataset are strongly correlated with one another.
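To make the definition concrete, here is a minimal sketch of how such a check could work. It is an illustration only, not the test's actual implementation: the Pearson coefficient, the `0.9` threshold, the helper names `pearson` and `find_correlated_pairs`, and the sample columns are all assumptions made for this example.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def find_correlated_pairs(columns, threshold=0.9):
    """Return (feature_a, feature_b, r) for every pair with |r| >= threshold.

    `columns` maps a feature name to its list of numeric values.
    The threshold is a hypothetical default chosen for illustration.
    """
    pairs = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Made-up sample data: height_in is height_cm converted to inches,
# so the two columns carry nearly the same information.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "shoe_size": [40.0, 38.0, 44.0, 39.0, 42.0],
}
flagged = find_correlated_pairs(columns, threshold=0.9)
```

On this sample, only the `height_cm` / `height_in` pair is flagged. With a real tabular dataset, the same pairwise matrix can be computed with pandas' `DataFrame.corr()`.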
Taxonomy
- Category: Integrity.
- Task types: Tabular classification, tabular regression.
- Availability: and .
Why it matters
- Removing highly correlated features improves model interpretability and can improve generalization performance.
- For some models, such as linear and logistic regression, multicollinearity makes the learned coefficients unstable and therefore unreliable to interpret.
- Sometimes, correlated features can indicate data quality issues, such as duplicate or near-duplicate columns.
Test configuration examples
If you are writing a tests.json, here are a few valid configurations for the correlated features test: