RESEARCH · arXiv CS.CL · 8d ago
Exploring Autonomous Agentic Data Engineering for Model Specialization
This paper introduces 'Autonomous Agentic Data Engineering,' a new task that evaluates LLMs as autonomous data engineers performing end-to-end data curation for model specialization. In the experiments, autonomous LLM data engineers achieve substantial gains, with GPT-5.2 improving a student model by 57.29%.