AI Data Governance: Building a Practical Framework
EU AI Act Article 10 imposes specific data governance requirements on high-risk AI systems. GDPR data quality, minimisation, and erasure obligations also apply. Here is how to build a data governance framework that satisfies both.
Why AI Needs Dedicated Data Governance
Standard data governance frameworks were not designed for AI. Three characteristics of AI systems create new governance challenges:
Scale and complexity
AI training datasets may contain billions of records, drawn from hundreds of sources, with complex lineage. Standard ROPA (records of processing activities) entries do not capture this level of detail.
Data as code
In AI, the training data is inseparable from the model output. Biases in data become biases in decisions. Data governance is therefore AI safety governance.
Erasure is not deletion
Deleting data from a training set does not remove it from a trained model's weights. The right to erasure takes on a new dimension that standard governance frameworks have not addressed.
The Five Pillars of AI Data Governance
Data inventory and classification
Know what data you have, where it comes from, and what it is used for in AI systems.
- AI system data register: for each AI system, document all data inputs and their sources
- Data classification: personal data / sensitive personal data / anonymised / synthetic / public
- Data lineage: where did training data come from? Is it appropriately licensed?
- Third-party data: what data from vendors, data brokers, or partners enters AI systems?
EU AI Act: Article 10(3): Training data must be documented, including data sources, characteristics, and limitations.
GDPR: Article 30 (Records of Processing): AI data processing should be reflected in your ROPA.
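The register above can be sketched as a small data structure. The class and field names here are illustrative assumptions, not anything prescribed by the Act; the `unlicensed_sources` check shows how a register can flag lineage gaps automatically.

```python
from dataclasses import dataclass, field

# Illustrative register entry; field names are assumptions, not mandated by the Act.
@dataclass
class DataSource:
    name: str
    licence: str          # e.g. "CC-BY-4.0", "vendor contract", or "unknown"
    classification: str   # personal / sensitive / anonymised / synthetic / public
    lineage: str          # where the data originated

@dataclass
class AISystemRecord:
    system_name: str
    intended_purpose: str
    sources: list = field(default_factory=list)

    def add_source(self, source: DataSource) -> None:
        self.sources.append(source)

    def unlicensed_sources(self) -> list:
        # Flag sources whose licensing has not been documented.
        return [s.name for s in self.sources if s.licence == "unknown"]

record = AISystemRecord("cv-screening", "rank job applications")
record.add_source(DataSource("hr-archive", "internal", "personal", "company HRIS export"))
record.add_source(DataSource("web-scrape", "unknown", "personal", "public web crawl"))
print(record.unlicensed_sources())  # ['web-scrape']
```

A register like this is easy to export into the ROPA and to diff between model versions.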
Data quality requirements
EU AI Act Article 10 requires specific data quality practices for high-risk AI systems.
- Relevance: training data must be relevant and representative for the intended purpose
- Sufficiency: training data must be sufficient in volume for the intended use
- Freedom from errors: appropriate data cleaning and validation before use
- Completeness: known data gaps must be identified and documented
- Bias examination: examination for possible biases that could lead to discrimination
EU AI Act: Article 10(2)–10(4): Mandatory for high-risk AI — data quality, bias examination, and documentation.
GDPR: Article 5(1)(d) accuracy principle — data must be accurate and kept up to date.
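Some of these quality requirements can be enforced as automated gates before training. A minimal sketch, with illustrative thresholds and field names (real pipelines would add error and bias checks per Article 10):

```python
# Minimal pre-training quality gate; thresholds (min_rows, 0.95) are illustrative.
def quality_report(rows, required_fields, min_rows=1000):
    missing = sum(
        1 for r in rows
        if any(r.get(f) in (None, "") for f in required_fields)
    )
    report = {
        "rows": len(rows),
        "sufficient_volume": len(rows) >= min_rows,   # sufficiency check
        "completeness": 1 - missing / len(rows) if rows else 0.0,
    }
    report["passes"] = report["sufficient_volume"] and report["completeness"] >= 0.95
    return report

rows = [{"age": 30, "outcome": "hire"}, {"age": None, "outcome": "reject"}]
print(quality_report(rows, ["age", "outcome"], min_rows=2))
```

Failing the gate should block the training run and create a documented exception, which doubles as the Article 10 evidence trail.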
Data minimisation for AI
AI systems tend to consume more data than they need. Governance requires active minimisation.
- Feature selection policy: document why each data feature is necessary for the AI model's purpose
- Training data pruning: remove data that is not required once the model is trained
- Synthetic data evaluation: where synthetic data can replace real personal data, prefer it
- Aggregation vs individual-level data: use aggregated data where individual-level is not necessary
EU AI Act: Article 10(6): Only personal data that is strictly necessary may be used for high-risk AI training (narrow exception).
GDPR: Article 5(1)(c) data minimisation principle.
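A feature selection policy can be enforced mechanically: block any training run that includes a column with no documented justification. A minimal sketch, where the feature names and justifications are hypothetical:

```python
# Hypothetical feature-justification register for one model.
feature_justifications = {
    "years_experience": "directly relevant to assessing seniority",
    "skills": "required for role matching",
}

def unjustified_features(training_columns):
    # Any column without a documented justification should block the run.
    return [c for c in training_columns if c not in feature_justifications]

columns = ["years_experience", "skills", "postcode"]
print(unjustified_features(columns))  # ['postcode']
```

The flagged column then either gets a documented justification or is dropped, which is data minimisation made auditable.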
Retention and deletion
AI creates new retention challenges: model weights encode patterns from training data, and deletion from training data does not remove it from model weights.
- Training data retention schedule: define how long training data is kept
- Model retraining schedule: when do you retrain to incorporate data deletions?
- Right to erasure policy for AI: what is your response when a data subject requests deletion of data used in AI training?
- Model versioning: keep records of which model version used which data vintage
EU AI Act: Article 10(5): providers of high-risk AI may process special categories of personal data for bias detection and correction only where strictly necessary and subject to appropriate safeguards (GDPR Article 9 data).
GDPR: Article 5(1)(e) storage limitation. Article 17 right to erasure creates challenges for AI training data.
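Model versioning and retrain scheduling can be sketched as a small registry linking each model version to its data vintage. The 180-day retrain window below is an illustrative assumption, not a regulatory requirement; it also bounds how long deleted data can persist in model weights.

```python
from datetime import date, timedelta

# Illustrative registry mapping model versions to data vintages.
model_registry = [
    {"version": "1.0", "data_vintage": date(2024, 1, 15), "trained": date(2024, 2, 1)},
    {"version": "1.1", "data_vintage": date(2024, 6, 1), "trained": date(2024, 6, 20)},
]

def retrain_due(registry, max_age_days=180, today=None):
    # A retrain is due when the newest model is older than the agreed window.
    today = today or date.today()
    latest = max(registry, key=lambda m: m["trained"])
    return today - latest["trained"] > timedelta(days=max_age_days)

print(retrain_due(model_registry, today=date(2025, 1, 10)))  # True
```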
Access controls and governance
Govern who can access AI training data, who can modify it, and who can approve its use for new purposes.
- Least-privilege access to AI training datasets
- Audit logs for data access to AI training environments
- Separation of duties: model training team vs data governance team
- Approval process for using new data sources in AI systems
EU AI Act: Article 10 and Article 9 quality management system.
GDPR: Articles 25, 32: privacy by design and security of processing.
The Right to Erasure Challenge in AI
The interaction between GDPR Article 17 (right to erasure) and AI model training is one of the most difficult unsolved compliance problems. Here are the two main issues and how organisations are addressing them:
Training data deletion
Problem: Individual requests erasure of their data. You can delete it from your dataset — but the model has already been trained on it.
Current approaches:
- Model retraining without the data (expensive, not always feasible)
- Machine unlearning techniques (emerging, not yet reliable at scale)
- Documenting that deletion from the training set occurred, and scheduling the next model retrain
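The delete-now, retrain-later approach can be sketched as an erasure log that records the gap between dataset deletion and model retraining. Field names are illustrative:

```python
from datetime import date

# Illustrative erasure log: the dataset copy is deleted immediately, but the
# model weights still reflect the data until the next scheduled retrain.
erasure_log = []

def handle_erasure(subject_id, dataset):
    remaining = [r for r in dataset if r["subject_id"] != subject_id]
    erasure_log.append({
        "subject_id": subject_id,
        "deleted_from_dataset": True,
        "pending_model_retrain": True,  # weights were trained before the deletion
        "requested": date.today().isoformat(),
    })
    return remaining

data = [{"subject_id": "u1"}, {"subject_id": "u2"}]
data = handle_erasure("u1", data)
print(len(data), erasure_log[0]["pending_model_retrain"])  # 1 True
```

Entries with `pending_model_retrain` set can then feed the retrain schedule, so each erasure request has a documented closure date.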
Inference memorisation
Problem: Large language models can memorise and reproduce training data verbatim. If a model has memorised someone's personal data, deleting it from the training set does not remove it from the model weights.
Current approaches:
- Differential privacy techniques during training to reduce memorisation
- Post-training testing for memorisation of sensitive data
- Privacy-preserving fine-tuning methods
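Post-training memorisation testing can be sketched as a prefix probe: feed the model the start of a sensitive record and check whether it completes the rest verbatim. The `generate` callable stands in for any text-generation API; the stub below, which simulates a model that memorised one record, is for illustration only.

```python
# Prefix-probe sketch for memorisation testing.
def memorisation_hits(generate, sensitive_records, prefix_len=20):
    hits = []
    for record in sensitive_records:
        prefix, rest = record[:prefix_len], record[prefix_len:]
        # A verbatim continuation of a sensitive record signals memorisation.
        if rest and generate(prefix).startswith(rest):
            hits.append(record)
    return hits

memorised = "Jane Doe, DOB 1990-01-01, NI QQ123456C"

def stub_generate(prefix):
    # Stand-in for a real model API that has memorised one record.
    return memorised[len(prefix):] if memorised.startswith(prefix) else ""

probe = [memorised, "John Smith, DOB 1985-05-05, NI AB654321D"]
print(memorisation_hits(stub_generate, probe))  # flags only the memorised record
```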
EU AI Act Article 10: Bias Examination Requirement
For high-risk AI systems, Article 10(2)(f) requires that training data be "examined in view of possible biases, in particular as regards persons or groups of persons on which the high-risk AI system is to be used." In practice this means identifying the groups the system will affect, measuring how those groups are represented and treated in the training, validation, and testing data, and documenting any mitigation taken.
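One such examination can be sketched as a representation check comparing group shares in the training data against the population the system will serve. Group labels, expected shares, and the 10% tolerance below are all illustrative assumptions.

```python
from collections import Counter

# Illustrative representation check; tolerance and group labels are assumptions.
def representation_gaps(training_labels, expected_shares, tolerance=0.10):
    counts = Counter(training_labels)
    total = len(training_labels)
    gaps = {}
    for group, expected in expected_shares.items():
        actual = counts.get(group, 0) / total
        if abs(actual - expected) > tolerance:
            gaps[group] = round(actual - expected, 3)
    return gaps

labels = ["A"] * 80 + ["B"] * 20
print(representation_gaps(labels, {"A": 0.5, "B": 0.5}))  # {'A': 0.3, 'B': -0.3}
```

Representation is only one axis; a full Article 10 examination would also compare outcomes and error rates across groups.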
Governance Structure: Who Owns What
- Data Governance Lead / DPO: overall framework, GDPR intersection, ROPA updates, erasure response process
- ML Engineering / Data Science: training data pipelines, quality checks, bias examination, feature selection documentation
- Legal / Compliance: data licensing, copyright compliance for training data, regulatory change monitoring
- IT / Security: access controls, audit logs, data encryption, infrastructure security
- Product Management: use case documentation, change requests for new data sources, feature retirement
Track Article 10 data governance compliance
ComplianceIQ tracks EU AI Act Article 10 requirements for each AI system in your inventory, including bias examination status, data source documentation, and quality check records.