model comparison

20 items

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/16/2026

Gemma 4 31b 3D geometry

The author expresses strong satisfaction with Gemma 4's quality, highlighting its coding ability and adaptability in conversations and reasoning. A test involving 3D model generation from an F1 car image demonstrated that Gemma significantly outperformed models like Claude Sonnet, Gemini Pro, and ChatGPT, which exhibited notable flaws.

41
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 5/4/2026

The more I use it, the more I'm impressed

A user found that Qwen 3.6 27b caught a critical bug that both GPT 5.5 and Claude Opus 4.7 initially missed and then denied. The observation suggests that slower, more thorough processing by models like Qwen can sometimes outperform faster frontier models on critical problem-solving.

39
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/19/2026

Switching from Opus 4.7 to Qwen-35B-A3B

A user running an M5 Max with 128GB is considering switching from Opus 4.7 to Qwen-35B-A3B as their daily coding agent and is seeking community experiences. They ask whether Qwen-35B-A3B will suffice for most tasks, acknowledging that Opus might keep an edge in complex reasoning.

39
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/15/2026

Gemma4 26b & E4B are crazy good, and replaced Qwen for me!

The user describes their previous AI setup before switching to Gemma4, detailing the hardware configuration (GPUs and RAM) and the specific Qwen models used for various tasks. They explain the roles of different Qwen versions (3.5 4B, 30b, 27b, 80B, 122b) for semantic routing, general chat, reasoning, code generation, and knowledge retrieval, based on their quantization and context needs.

36
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/21/2026

An actual example of "If you dont run it, you dont own it" and Gemma 4 beats both Chat GPT and Gemini Chat

The author shares their experience using various AI models (GPT-OSS 120B, Qwen 3 Max, Chat GPT 4o) to translate a Chinese novel, highlighting challenges with name consistency and unexpected censorship. Chat GPT 4o was initially the best for accuracy and translation quality, but some models showed degradation or filtering over time.

35
RESEARCH · arXiv CS.CL · 4/16/2026

A Multi-Model Approach to English-Bangla Sentiment Classification of Government Mobile Banking App Reviews

This study classifies sentiment in English and Bangla reviews of Bangladeshi government mobile banking apps, using a hybrid labeling approach for 5,652 reviews. It found that traditional machine learning models like Random Forest and Linear SVM significantly outperformed fine-tuned XLM-RoBERTa for this specific task.

31
ARTICLE · DEV.to AI · 4/25/2026

DeepSeek V4 Pro Just Dropped — Here's What Changed for AI Agents

DeepSeek V4 Pro, launched on April 24, 2026, introduces a 1.6T parameter MoE model with a 1M token context, dual Think/Non-Think modes, and an MIT license. Positioned as a cost-effective solution for AI agent workloads, it boasts improved multi-step planning and reliable function calling, with pricing significantly lower than competitors like Claude Sonnet 4.6 and GPT-4o.

27
ARTICLE · DEV.to AI · 4/9/2026

Choosing Between GPT-5.4 and Claude Sonnet 4.6 in Real Workflows

The article compares the performance of GPT-5.4 and Claude Sonnet 4.6 in real workflows, noting that although the two models perform similarly on 80% of tasks, GPT-5.4 stands out in the 20% of cases that require multi-step reasoning, tool use, and structured outputs. The analysis emphasizes that in production environments, criteria such as consistency, speed, cost, and fit with the workflow matter more than correctness alone.

27