Model Disgorgement
Regulatory remedy forcing deletion of AI models trained on non-compliant data
Model disgorgement is an emerging regulatory remedy that requires organizations to delete AI/ML models that were trained on data collected or used in violation of privacy laws. Unlike simple data deletion, disgorgement addresses the fact that personal information can become "baked into" model weights, making traditional erasure insufficient. As FTC Commissioner Rebecca Kelly Slaughter stated: "When companies collect data illegally, they should not be able to profit from either the data or any algorithm developed using it."
The FTC established the modern framework for algorithmic disgorgement through a series of enforcement actions beginning with Cambridge Analytica (2019). The settlement required deletion of not just the improperly obtained Facebook user data, but any work product derived from that data—including the psychographic models that were the company's core product. Subsequent cases expanded and refined the remedy: Everalbum (2021), Weight Watchers/Kurbo (2022), Amazon Ring (2023), and Rite Aid (2023) each required destruction of algorithms and models built on non-compliant data.
The Everalbum case established the legal template now applied across sectors. The company operated a free consumer photo storage app ("Ever"), added facial recognition features enabled by default, extracted face embeddings from millions of user photos, then pivoted to selling enterprise facial recognition technology under the "Paravision" brand—including to law enforcement and military contractors—without user consent. The settlement defined "Affected Work Product" comprehensively: face embeddings, models, and algorithms derived from photos of users who did not give affirmative consent, or whose photos were retained after account deactivation despite promises of deletion. The company was required to delete all training datasets, face embedding databases, and production models.
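The scope of "Affected Work Product" can be pictured as a walk over a data lineage graph: starting from the non-compliant source data, every artifact derived from it, directly or indirectly, falls within the deletion requirement. The sketch below is illustrative only. The artifact names and the graph are hypothetical and are not taken from the consent order.

```python
from collections import deque

# Hypothetical lineage graph: each artifact maps to the artifacts derived from it.
LINEAGE = {
    "photos_no_consent": ["embeddings_v1"],
    "photos_consented": ["embeddings_v2"],
    "embeddings_v1": ["face_model_a"],
    "embeddings_v2": ["face_model_b"],
    "face_model_a": ["enterprise_api_model"],
    "face_model_b": [],
    "enterprise_api_model": [],
}


def affected_work_product(lineage, tainted_sources):
    """Return every artifact reachable from a tainted source (breadth-first)."""
    affected = set()
    queue = deque(tainted_sources)
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Derived artifacts that would have to be deleted along with the source photos.
to_delete = affected_work_product(LINEAGE, ["photos_no_consent"])
print(sorted(to_delete))
```

In this toy graph the model built only on consented photos is untouched, which is why keeping compliant and non-compliant data lineages separate matters in practice.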
Face embeddings illustrate why disgorgement extends beyond source data. A face embedding is a numerical vector (typically 128 or 512 dimensions) representing the unique geometric features of a face. These vectors are distinct from the source photos, yet they are themselves biometric data under state laws (Illinois BIPA, Texas CUBI) and personal data under GDPR. The FTC's requirement to delete face embeddings established that derived biometric data, not just source images, must be destroyed.
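A minimal sketch of why a retained embedding remains identifying after the source photo is gone: a stored vector can still be compared against a vector computed from any new photo. The 4-dimensional vectors, the 0.9 threshold, and the function names are assumptions chosen for illustration, not the output or settings of any real face recognition system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for 128- or 512-dimensional embeddings.
enrolled = [0.12, -0.48, 0.33, 0.80]       # stored embedding; source photo deleted
same_person = [0.10, -0.50, 0.30, 0.82]    # embedding from a new photo, same face
other_person = [-0.70, 0.20, 0.65, -0.10]  # embedding from a different face

THRESHOLD = 0.9  # illustrative match threshold (assumption)


def is_match(a, b, threshold=THRESHOLD):
    """Treat two embeddings as the same person if they are similar enough."""
    return cosine_similarity(a, b) >= threshold


print(is_match(enrolled, same_person))
print(is_match(enrolled, other_person))
```

The comparison never touches the original image, so deleting photos while keeping the embedding database would leave the identifying capability in place.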
The Weight Watchers/Kurbo case extended disgorgement to children's data. Kurbo, a weight management app for children, collected data from users as young as eight without the parental consent required by COPPA. Beyond monetary penalties, the FTC required deletion of all algorithms and models trained on the illegally collected children's data. This established that COPPA violations, historically treated as disclosure and consent failures, could trigger algorithmic remedies.
The financial implications of disgorgement are severe. AI/ML models represent years of development, millions of dollars in compute costs, and potentially the core intellectual property of a company. The Everalbum/Paravision facial recognition models that were destroyed represented the company's primary enterprise product; the business model pivot from consumer storage to enterprise AI was entirely unwound. For acquirers in M&A transactions, model provenance and training data consent chains become critical due diligence factors—models built on tainted data carry contingent destruction liability.
The remedy creates a distinctive form of deterrence because it targets the value extracted from the data, not just the collection itself. Traditional penalties (fines, injunctions) can leave a company ahead on balance: it keeps the products built on the violation and absorbs the penalty as a cost of doing business. Disgorgement removes the business benefit entirely, making violations unprofitable regardless of scale. This shifts the incentive structure: investing in consent infrastructure and data governance becomes economically rational compared with risking the loss of the resulting models.
Disgorgement also addresses the technical limitation that deleting training data does not remove its influence from trained models. Research shows that large language models and other neural networks can memorize substantial portions of their training data. Simply deleting the source records while retaining the model leaves patterns, and potentially recoverable information, embedded in the model weights. Complete erasure would require either retraining from scratch, which is prohibitively expensive for large models, or emerging machine unlearning techniques. Until unlearning matures, disgorgement remains the FTC's primary remedy for ensuring that illegally collected data provides no ongoing benefit.
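The point can be illustrated with a deliberately tiny model. The sketch below fits a one-variable least-squares line, deletes the training records, and shows that the fitted parameters are unchanged; only refitting without a record removes that record's influence. This is a toy stand-in under stated assumptions (made-up data, a two-parameter model) and does not reproduce memorization in large neural networks.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up training records; suppose (3.0, 6.2) was collected without consent.
records = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
tainted = (3.0, 6.2)

model = fit_line(records)                                   # trained on all records
retrained = fit_line([r for r in records if r != tainted])  # trained without it

snapshot = model
records.clear()  # delete the source data only

print(model == snapshot)   # parameters are unaffected by deleting the data
print(model != retrained)  # only retraining removes the record's influence
```

For a model this small, retraining is trivial. The cost of doing the same for a large production model is what makes deletion of the model itself the practical remedy.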
For AI companies and their investors, the remedy creates a clear compliance imperative: training data provenance must be documented, consent must be obtained before collection, and consent scope must match intended use. Consumer data collected for one purpose (photo storage) cannot be repurposed for another (enterprise facial recognition) without additional consent. Business model pivots that monetize consumer data create regulatory exposure that can result in loss of core technology assets.
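One way to make "consent scope must match intended use" operational is to record the purposes each user agreed to and filter training data against the intended purpose before any model is built. The sketch below assumes a hypothetical ConsentRecord structure and purpose labels; real consent management systems will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsentRecord:
    """Hypothetical record of the purposes a user affirmatively agreed to."""
    user_id: str
    purposes: frozenset


def eligible_for_training(records, intended_purpose):
    """Return IDs of users whose consent covers the intended purpose."""
    return [r.user_id for r in records if intended_purpose in r.purposes]


records = [
    ConsentRecord("u1", frozenset({"photo_storage"})),
    ConsentRecord("u2", frozenset({"photo_storage", "face_recognition_training"})),
    ConsentRecord("u3", frozenset()),
]

# Only users who consented to this specific purpose may contribute data.
print(eligible_for_training(records, "face_recognition_training"))
```

Here consent to photo storage alone does not make a user's photos eligible for facial recognition training, mirroring the repurposing problem described above.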
Sources
- FTC v. Everalbum/Paravision. (2021). Consent Order Requiring Algorithmic Disgorgement.
- FTC v. Cambridge Analytica. (2019). Settlement and Work Product Deletion Requirements.
- FTC Commissioner Slaughter. Statement on Algorithmic Accountability.
- FTC v. Rite Aid. (2023). AI Facial Recognition Disgorgement Order.