LLM

609 items

ARTICLE·DEV.to AI·4/19/2026

Learn to evaluate the quality of your AI agent, RAG, and LLM

The author argues that evaluation (evals) of AI systems such as agents, RAG pipelines, and LLMs is important yet poorly understood, and presents key metrics and frameworks. The article combines theory and practice to help improve the quality of AI project delivery, and includes a study repository that uses OpenRouter.

33
NEWS·↑ trending·Reddit r/LocalLLaMA·4/24/2026

r/LocalLLaMa Rule Updates

The r/LocalLLaMA subreddit announced rule updates, including minimum karma requirements, to combat a rise in spam and low-quality content generated by bots and AI tools. The changes aim to improve quality in the community, which sees over 1 million weekly visitors.

32
RESEARCH·arXiv CS.LG·4/17/2026

MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining

MixAtlas introduces an uncertainty-aware method for optimizing data mixtures in multimodal LLM midtraining by decomposing corpora along image concepts and task supervision. Using proxy models and a Gaussian-process surrogate, it finds better-performing data recipes for improved sample efficiency and generalization.

32
RESEARCH·arXiv CS.CL·15d ago

Multi-Persona Debate System for Automated Scientific Hypothesis Generation

The Multi-Persona Debate System (MPDS) is a literature-grounded framework designed to automate scientific hypothesis generation, specifically addressing the challenge of synthesizing fragmented knowledge in areas like battery materials research. It combines literature retrieval, large language model reasoning, and multi-agent debate to enable negotiation between personas while preserving evidence traceability.

32
DOC·DEV.to AI·4/16/2026

LLM vs RAG

This document compares LLMs (Large Language Models) and RAG (Retrieval-Augmented Generation), outlining their core differences in type, knowledge source, accuracy, and use cases. It explains that RAG improves an LLM's factual grounding by integrating external, real-time data, which mitigates hallucinations.

31
RESEARCH·arXiv CS.LG·4/21/2026

Beyond Verifiable Rewards: Rubric-Based GRM for Reinforced Fine-Tuning SWE Agents

This research introduces a rubric-based Generative Reward Model (GRM) to enhance Reinforced Fine-Tuning (RFT) of LLM agents on Software Engineering (SWE) tasks. By providing richer learning signals than binary terminal rewards, the approach shapes intermediate behaviors and significantly improves the quality of the resolution process.

31