RESEARCH · Hugging Face Blog · 5d ago
EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios
EVA-Bench Data 2.0 is an updated benchmark dataset for evaluating AI systems and tools, covering 3 domains, 121 tools, and 213 scenarios.