
Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics

arXiv CS.CL · June 5, 2026

The paper proposes a bilayer SIR/SIRS framework to model synthetic data contamination and model collapse within the AI ecosystem. This phenomenological mean-field model treats data corpora and AI models as interacting populations, deriving a basic reproduction number to analyze cross-contamination.
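To make the idea of a bilayer model and its basic reproduction number concrete, here is a minimal sketch of a generic two-layer SIR system. It is not the paper's formulation: the layer ordering (0 = data corpora, 1 = AI models), the rate matrices `beta` and `gamma`, the function names, and all numeric values are illustrative assumptions, and the SIRS variant (loss of immunity) is left out. The reproduction number is computed with the standard next-generation-matrix approach, as the spectral radius of `K[i][j] = beta[i][j] / gamma[j]`.

```python
import math

# Assumed layer ordering (hypothetical, not from the paper):
#   layer 0 = data corpora, layer 1 = AI models.
# beta[k][j]: rate at which contaminated layer j contaminates susceptible layer k.
# gamma[j]:   rate at which contaminated members of layer j are removed or cleaned.


def r0_bilayer(beta, gamma):
    """Spectral radius of the 2x2 next-generation matrix K[i][j] = beta[i][j] / gamma[j]."""
    a = beta[0][0] / gamma[0]
    b = beta[0][1] / gamma[1]
    c = beta[1][0] / gamma[0]
    d = beta[1][1] / gamma[1]
    # For non-negative entries the discriminant (a - d)^2 + 4bc is non-negative,
    # so both eigenvalues are real and the larger one is the spectral radius.
    disc = (a - d) ** 2 + 4.0 * b * c
    return 0.5 * (a + d + math.sqrt(disc))


def simulate(beta, gamma, i0, steps=2000, dt=0.01):
    """Forward-Euler integration of a two-layer mean-field SIR model.

    Each layer is tracked as fractions (s, i, r) that sum to 1.
    """
    s = [1.0 - i0[0], 1.0 - i0[1]]
    i = [i0[0], i0[1]]
    r = [0.0, 0.0]
    for _ in range(steps):
        next_s, next_i, next_r = [], [], []
        for k in range(2):
            # Force of contamination on layer k from both layers.
            force = beta[k][0] * i[0] + beta[k][1] * i[1]
            infected = force * s[k] * dt
            recovered = gamma[k] * i[k] * dt
            next_s.append(s[k] - infected)
            next_i.append(i[k] + infected - recovered)
            next_r.append(r[k] + recovered)
        s, i, r = next_s, next_i, next_r
    return s, i, r


if __name__ == "__main__":
    # Purely cross-layer contamination: corpora are contaminated only by
    # models, and models only by corpora (illustrative values).
    beta = [[0.0, 0.5], [0.8, 0.0]]
    gamma = [0.2, 0.5]
    print("R0 =", r0_bilayer(beta, gamma))
    s, i, r = simulate(beta, gamma, i0=[0.01, 0.0])
    print("final removed fractions:", r)
```

In the purely cross-layer case used above, the within-layer terms vanish and the expression reduces to the geometric mean of the two cross terms, `sqrt(beta[0][1] * beta[1][0] / (gamma[0] * gamma[1]))`. In this sketch, contamination spreads from a small seed when that value exceeds 1 and dies out when it is below 1.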
