Can Large Language Models Detect Methodological Flaws? Evidence from Gesture Recognition for UAV-Based Rescue Operation Based on Deep Learning
This research investigates whether Large Language Models (LLMs) can identify methodological flaws, such as data leakage, in published machine learning studies. A case study showed six state-of-the-art LLMs consistently detected evaluation flaws in a gesture recognition paper due to non-independent data partitioning.