Open-Weight Models

2 items

RESEARCH · arXiv CS.AI · 5/4/2026

AgentFloor: How Far Up the Tool-Use Ladder Can Small Open-Weight Models Go?

This work introduces AgentFloor, a deterministic 30-task benchmark organized as a six-tier capability ladder, to evaluate the tool-use abilities of AI models. Results indicate that small and mid-sized open-weight models are sufficient for much of the short-horizon, structured tool-use work common in real agent pipelines.

Open-Weight Models · LLMs · benchmarking · tool use

ARTICLE · KDNuggets · 26d ago

5 Small Language Models for Agentic Tool Calling

This article highlights five small open-weight language models notable for their support of structured tool calling, which makes them suitable for agentic applications.

Open-Weight Models · AI models · LLMs · tool-calling
