Part 2: The Data — Building the First Public Coffee Roasting Audio Dataset with Warp/Oz
This article describes how the first public audio dataset for detecting first crack in coffee roasting was built from scratch, filling a gap in available resources. The dataset comprises 973 annotated 10-second segments. A model trained on it reached 100% precision, a result the article attributes to careful data splitting and loss weighting.
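The summary does not spell out how the splitting or the loss weighting was done, so the following is only a minimal sketch under stated assumptions. It assumes that "careful splitting" means keeping all segments from one roast recording in the same split, so near-identical audio does not leak into the test set, and that "loss weighting" means inverse-frequency class weights (the same heuristic scikit-learn calls `class_weight="balanced"`). The recording IDs, label names, and helper functions are hypothetical and are not taken from the dataset or its tooling.

```python
import random
from collections import Counter, defaultdict


def split_by_recording(segments, test_fraction=0.2, seed=0):
    """Split (recording_id, label) pairs so no recording appears in both sets."""
    by_recording = defaultdict(list)
    for recording_id, label in segments:
        by_recording[recording_id].append((recording_id, label))

    # Shuffle whole recordings, not individual segments (assumed strategy).
    recording_ids = sorted(by_recording)
    random.Random(seed).shuffle(recording_ids)

    n_test = max(1, round(len(recording_ids) * test_fraction))
    test_ids = set(recording_ids[:n_test])

    train = [s for rid in recording_ids if rid not in test_ids
             for s in by_recording[rid]]
    test = [s for rid in recording_ids if rid in test_ids
            for s in by_recording[rid]]
    return train, test


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy data: 5 recordings of 10 segments each, where the
# "first_crack" class is the minority (2 of every 10 segments).
segments = [
    (f"roast_{r}", "first_crack" if i >= 8 else "no_crack")
    for r in range(5)
    for i in range(10)
]

train, test = split_by_recording(segments)
weights = inverse_frequency_weights([label for _, label in train])
```

The resulting `weights` dictionary could then be handed to a weighted loss in whichever training framework is used, so that errors on the rarer class cost more than errors on the common one.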