RESEARCH · arXiv CS.CL · 4d ago
Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics
The paper proposes a bilayer SIR/SIRS (susceptible-infected-recovered) framework for modeling synthetic data contamination and model collapse in the AI ecosystem. This phenomenological mean-field model treats data corpora and AI models as two interacting populations and derives a basic reproduction number to analyze cross-contamination between them.
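To make the idea of a two-population reproduction number concrete, here is a minimal sketch of a generic two-group mean-field SIR model. It is not the paper's system, since the summary gives no equations. The layer ordering (index 0 for data corpora, index 1 for AI models), the rate matrix `beta`, the recovery rates `gamma`, and both function names are assumptions made for illustration. The reproduction number is computed with the standard next-generation-matrix construction, and the SIRS variant (loss of immunity) is left out.

```python
import math

# Hypothetical layer ordering: index 0 = data corpora, index 1 = AI models.
# beta[i][j]: assumed rate at which contaminated members of layer j
#             contaminate susceptible members of layer i.
# gamma[i]:   assumed recovery (decontamination) rate of layer i.


def basic_reproduction_number(beta, gamma):
    """Spectral radius of the 2x2 next-generation matrix K[i][j] = beta[i][j] / gamma[j]."""
    k00 = beta[0][0] / gamma[0]
    k01 = beta[0][1] / gamma[1]
    k10 = beta[1][0] / gamma[0]
    k11 = beta[1][1] / gamma[1]
    trace = k00 + k11
    # Discriminant written as (k00 - k11)^2 + 4*k01*k10, which is non-negative
    # whenever all rates are non-negative.
    disc = (k00 - k11) ** 2 + 4.0 * k01 * k10
    return 0.5 * (trace + math.sqrt(disc))


def simulate(beta, gamma, i0=(1e-3, 0.0), dt=0.01, steps=20000):
    """Forward-Euler integration of the two-layer SIR fractions (s, i, r per layer)."""
    s = [1.0 - i0[0], 1.0 - i0[1]]
    i = [i0[0], i0[1]]
    r = [0.0, 0.0]
    for _ in range(steps):
        # Force of contamination acting on each layer from both layers.
        force = [beta[k][0] * i[0] + beta[k][1] * i[1] for k in range(2)]
        new_inf = [s[k] * force[k] * dt for k in range(2)]
        new_rec = [gamma[k] * i[k] * dt for k in range(2)]
        for k in range(2):
            s[k] -= new_inf[k]
            i[k] += new_inf[k] - new_rec[k]
            r[k] += new_rec[k]
    return s, i, r


if __name__ == "__main__":
    # Purely cross-layer coupling: corpora contaminate models and vice versa.
    beta = [[0.0, 0.6], [0.6, 0.0]]
    gamma = [0.2, 0.2]
    print("R0 =", basic_reproduction_number(beta, gamma))
    s, i, r = simulate(beta, gamma)
    print("final recovered fractions:", r)
```

With purely cross-layer coupling, the expression reduces to the geometric mean of the two cross terms, `sqrt(beta[0][1] * beta[1][0] / (gamma[0] * gamma[1]))`. In this toy model, contamination seeded in one layer spreads widely when that value exceeds 1 and dies out when it is below 1.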