Pseudonymization
Processing personal data so it cannot be attributed to an individual without additional, separately held information
Pseudonymization is the processing of personal data so that it can no longer be attributed to a specific individual without the use of additional information, provided that additional information is kept separately and protected by technical and organizational measures. Unlike anonymization, pseudonymized data remains personal data under GDPR and other privacy frameworks—full regulatory obligations continue to apply.
The critical distinction is reversibility: pseudonymization is designed to be reversible (the data can be re-linked to individuals when needed), while anonymization aims to be permanent. This makes pseudonymization a risk-reduction tool rather than a scope-exclusion mechanism. GDPR Recital 26 explicitly states that pseudonymized data "should be considered to be information on an identifiable natural person."
GDPR incentivizes pseudonymization through multiple provisions. Article 6(4)(e) recognizes it as a safeguard supporting compatible further processing. Article 32(1)(a) lists pseudonymization and encryption as examples of appropriate security measures, the only two named techniques in that article. Article 25 identifies it as a data protection by design technique. Article 89(1) names pseudonymization as a safeguard for archiving, research, and statistical processing, to be used where those purposes can still be fulfilled that way.
Four primary techniques achieve pseudonymization. Tokenization replaces identifiers with random tokens using either a lookup table or encryption. Encryption transforms data using cryptographic algorithms; deterministic encryption enables matching while probabilistic encryption provides stronger security. Hashing applies one-way mathematical functions producing fixed-length outputs. Data masking partially obscures data while retaining some original information.
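As a rough illustration of three of these techniques, the following Python sketch uses only the standard library. The names TokenVault, keyed_hash, and mask_email are hypothetical and chosen for this example. The hashing variant is keyed (HMAC-SHA256) rather than a bare hash, on the assumption that identifiers such as email addresses are guessable and an unkeyed hash could be reversed by hashing candidate values. Encryption is left out because it would need a third-party library.

```python
import hashlib
import hmac
import secrets


class TokenVault:
    """Lookup-table tokenization: random tokens, reversible only via the table."""

    def __init__(self) -> None:
        self._token_by_value: dict[str, str] = {}
        self._value_by_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse an existing token so the same input always maps to one token.
        if value not in self._token_by_value:
            token = secrets.token_hex(16)  # 32 hex characters, unrelated to the input
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        # Re-linking requires access to the vault, i.e. the additional information.
        return self._value_by_token[token]


def keyed_hash(value: str, key: bytes) -> str:
    """Deterministic pseudonym via HMAC-SHA256; the secret key must be held separately."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * max(len(local) - 1, 0) + "@" + domain
```

In this sketch the vault contents and the HMAC key play the role of the additional information: whoever holds them can re-link or regenerate pseudonyms, so they would need to be stored apart from the pseudonymized dataset. Masking on its own leaves part of the original value visible and is the weakest of the three.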
The January 2025 EDPB Guidelines 01/2025 provide the first dedicated EU guidance on implementation. They introduce the "pseudonymisation domain" concept: the defined set of people, units, or systems for which attribution to individuals must be prevented, kept apart from those with access to the linking information. Three elements must be present: data processed so it cannot be attributed without additional information, that additional information kept separately, and technical/organizational measures ensuring non-attribution.
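The separation between pseudonymized data and linking information can be sketched in Python as two objects that would live in different units or systems. This is a minimal illustration under stated assumptions, not an implementation taken from the Guidelines: LinkingStore, export_for_analysis, and the subject_id field name are all hypothetical.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field


@dataclass
class LinkingStore:
    """Held only by the unit permitted to re-identify; never shared with analysts."""

    key: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    subject_by_pseudonym: dict[str, str] = field(default_factory=dict)

    def pseudonym_for(self, subject_id: str) -> str:
        # Keyed hash: stable per subject, not reproducible without the key.
        pseudonym = hmac.new(
            self.key, subject_id.encode("utf-8"), hashlib.sha256
        ).hexdigest()
        self.subject_by_pseudonym[pseudonym] = subject_id
        return pseudonym

    def reidentify(self, pseudonym: str) -> str:
        return self.subject_by_pseudonym[pseudonym]


def export_for_analysis(
    records: list[dict], store: LinkingStore, id_field: str = "subject_id"
) -> list[dict]:
    """Return copies of the records with the direct identifier replaced by a pseudonym."""
    exported = []
    for record in records:
        row = {k: v for k, v in record.items() if k != id_field}
        row["pseudonym"] = store.pseudonym_for(record[id_field])
        exported.append(row)
    return exported
```

Only the output of export_for_analysis would be handed to the analysing unit, while the LinkingStore stays with the unit that holds re-identification rights. The sketch replaces a single direct identifier. Remaining fields can still act as quasi-identifiers, so this step alone does not guarantee non-attribution.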
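Following Schrems II, the EDPB's Recommendations 01/2020 identified effective pseudonymization as one of a small number of technical measures that can qualify as supplementary safeguards for transfers to countries without adequate protection. Four conditions apply: pre-transfer pseudonymization by the exporter, retention of the linking information exclusively by the exporter within the EEA or another jurisdiction offering adequate protection, the exporter's sole control of re-identification capabilities, and a thorough analysis showing that authorities in the importing country could not re-identify individuals by combining the data with other information available to them.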
For liability quantification, pseudonymized datasets should be scored as personal data with a modest risk reduction factor (0.85-0.95 multiplier) based on implementation quality. Claims that pseudonymization eliminates compliance obligations are red flags. The key assessment question is: who controls the linking information?
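The scoring rule above reduces to a single multiplication, sketched here in Python. The function name adjusted_exposure and the idea of a numeric base exposure figure are assumptions made for illustration. Only the 0.85 to 0.95 range comes from the text.

```python
def adjusted_exposure(base_exposure: float, multiplier: float) -> float:
    """Scale a personal-data exposure estimate by a pseudonymization quality factor.

    base_exposure: exposure estimated as if the data were fully identified
                   (hypothetical input; how it is derived is out of scope here).
    multiplier:    0.85 (strong implementation) to 0.95 (weak implementation).
    """
    if base_exposure < 0:
        raise ValueError("base_exposure must be non-negative")
    if not 0.85 <= multiplier <= 0.95:
        # Values outside the range would imply more relief than pseudonymization earns
        # (below 0.85) or are not covered by the stated range (above 0.95).
        raise ValueError("multiplier must be between 0.85 and 0.95")
    return base_exposure * multiplier
```

For example, a hypothetical base exposure of 1,000,000 with a mid-range multiplier of 0.90 yields roughly 900,000: the dataset is still scored as personal data, with only a modest reduction.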