model performance

22 items

RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/17/2026

Low accuracy (~50%) with SSL (BYOL/MAE/VICReg) on hyperspectral crop stress data — what am I missing? [R]

The post describes persistently low accuracy (~50%) when using self-supervised learning methods such as BYOL, MAE, and VICReg for hyperspectral crop stress detection. Despite trying various techniques, the author reports performance that remains barely better than random for three classes and suspects that either the data is not separable or these SSL methods are a poor fit.

42
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/21/2026

Did Google hide the best version of Gemma 4 e4b in Android? The extracted model beats Unsloth and everything else I've tried.

The user reports that a Gemma 4 e4b model extracted from Google AI Edge Gallery on Android performs noticeably better than the versions from Unsloth or litertlm, despite being slightly smaller. They ask whether Google is keeping a superior, optimized version of the model on Android only.

38
RESEARCH · arXiv CS.CL · 19d ago

Improving Quantized Model Performance in Qualitative Analysis with Multi-Pass Prompt Verification

This research examines how various lower-bit quantization levels impact LLaMA-3.1's performance in qualitative analysis, noting that low-bit models often produce hallucinations. It proposes a quantization-aware multi-pass prompt verification method to enhance accuracy by systematically reducing hallucinations and filtering unreliable content.

28
RESEARCH · DEV.to AI · 20d ago

How Far Can a Small Coding Model Go With a Better Harness?

The article examines how a small coding model (GPT-5.1-Codex-Mini) reached a 61.6% score on Terminal-Bench 2.0 by optimizing its "harness" rather than swapping in a larger model. It argues that the wrapper around the model plays a crucial role in performance, and that this is most visible with smaller models, where harness mistakes have a greater impact.

27
ARTICLE · DEV.to AI · 15d ago

Most people starting with local LLMs jump straight to 4-bit quantization because it's fast and uses

This article compares 16-bit, 8-bit, and 4-bit LLM quantization and finds that 4-bit, while faster, significantly compromises quality on reasoning and math tasks. The real trade-off depends on how much precision the task requires: 8-bit suits precision-demanding tasks, with minimal quality loss and only a slight speed reduction. The choice of quantization level should account for the task as well as the hardware, not the hardware alone.

27
ARTICLE · DEV.to AI · 22d ago

Saturday Night Fights

This article describes a significant gap between AI models' benchmark scores and their practical performance in agent-readiness tests, where many high-scoring models fail real-world challenges. The author proposes a "fight card" that evaluates models on their operational capabilities rather than on superficial metrics.

27