Kaggle

4 items

ARTICLE · ↑ trending · Reddit r/MachineLearning · 4/23/2026

2b or not 2b ? Custom LLM Scheduling Competition [P]

A newly launched Kaggle competition asks participants to reduce the token cost of LLM answers by deciding, for each question, whether to run a small model or skip the question. The objective is to minimize a weighted cost that combines compute, failed answers, and a penalty for skipping a question that would have been answered correctly.

Kaggle · Benchmarking · model optimization · resource management
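The run-or-skip trade-off described in this item can be sketched as an expected-cost comparison. This is only an illustrative reading of the summary, not the competition's actual scoring rule: the function names, the cost weights, and the assumption that a per-question probability of a correct answer (`p_correct`) is available are all hypothetical.

```python
def expected_costs(p_correct, compute_cost, failure_cost, skip_penalty):
    """Return (expected cost of running, expected cost of skipping).

    Assumed cost model (hypothetical, not from the competition rules):
    - running always pays compute_cost, plus failure_cost if the answer is wrong
    - skipping pays skip_penalty only if the model would have been correct
    """
    run = compute_cost + (1.0 - p_correct) * failure_cost
    skip = p_correct * skip_penalty
    return run, skip


def should_run(p_correct, compute_cost=1.0, failure_cost=5.0, skip_penalty=3.0):
    """Run the small model only when running is cheaper in expectation."""
    run, skip = expected_costs(p_correct, compute_cost, failure_cost, skip_penalty)
    return run < skip


if __name__ == "__main__":
    # Illustrative probabilities only; real values would come from some estimator.
    for p in (0.2, 0.5, 0.9):
        print(p, should_run(p))
```

Under these made-up weights, a question is worth running only when the estimated chance of a correct answer is high enough that the skip penalty outweighs compute plus expected failure cost.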

RESEARCH · arXiv CS.LG · 8d ago

LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis

This paper introduces LongDS-Bench, a benchmark for evaluating AI agents on long-horizon, multi-turn data analysis, built from 68 tasks drawn from real-world Kaggle notebooks. State-of-the-art models reach only 48.45% accuracy, and performance drops significantly in later turns, which points to a critical failure in tracking evolving analytical context.

Long-horizon tasks · Kaggle · AI Benchmarks · data analysis

NEWS · Google DeepMind Blog · 3/17/2026

Measuring progress toward AGI: A cognitive framework

A new cognitive framework is being introduced to measure progress toward AGI. To support the development of the relevant evaluations, a hackathon will be launched on Kaggle.

framework · Kaggle · AI evaluation · AI progress

NEWS · Google AI Blog · 4/27/2026

Join the new AI Agents Vibe Coding Course from Google and Kaggle

Google and Kaggle are relaunching their 5-Day AI Agents Intensive Course, and registration is now open. The course focuses on training in AI agents.

education · Kaggle · Google · AI agents
