LLMs

714 items

RESEARCH · ↑ trending · Reddit r/LocalLLaMA · 4/17/2026

Qwen3.6 GGUF Benchmarks

The post presents KLD benchmarks for Unsloth's Qwen3.6-35B-A3B GGUF quants, showing how efficient they are in terms of KLD versus disk space. It also clarifies that frequent GGUF updates typically result from external bug fixes or official improvements rather than from errors on Unsloth's side.

41
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/9/2026

16 GB VRAM users, what model do we like best now?

A user with 16 GB of VRAM shares a positive experience with the Qwen 3.5 27b model in IQ3 quants on an RTX 4080, achieving good speed and context. They discuss the challenges of optimizing local AI models with this amount of VRAM, weighing quality against speed across different quantization levels.

41
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/27/2026

Guys this is so fun!

A user expresses excitement about running various AI models like Qwen and Llama locally on their MacBook Air and an AI Workstation with an RTX Pro 6000 Blackwell, utilizing tools such as LM Studio and LM Link.

41
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/21/2026

2x 512gb ram M3 Ultra mac studios

A user with two high-end M3 Ultra Mac Studios (512GB RAM each, $25k in hardware) is testing LLMs such as DeepSeek and GLM and asks the community what else to load. They are troubleshooting backend issues and awaiting optimizations for Kimi 2.6.

41
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/19/2026

Is anyone getting real coding work done with Qwen3.6-35B-A3B-UD-Q4_K_M on a 32GB Mac in opencode, claude code or similar?

A user is attempting real coding tasks with Qwen3.6-35B on a 32GB M2 MacBook Pro and is running into memory exhaustion and context window management issues. Although the model identifies the essence of a bug, it struggles to implement a fix because critical information is lost during context compaction.

39
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/19/2026

Switching from Opus 4.7 to Qwen-35B-A3B

A user is considering switching from Opus 4.7 to Qwen-35B-A3B as their daily coding agent and is seeking community experiences. They ask whether Qwen-35B-A3B will suffice for most tasks, while acknowledging that Opus might have an edge in complex reasoning. Their machine is an M5 Max with 128GB.

39
ARTICLE · DEV.to AI · 4/23/2026

I Built a Local AI VRAM Calculator & GPU Planner (Beta)

The author has launched a new beta tool called "Local AI VRAM Calculator & GPU Planner" to help determine GPU and VRAM requirements for running local LLMs. This tool aims to make hardware tradeoffs visible for different workloads and quantization levels before committing to components.

39
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/10/2026

gemma-4-26B-A4B with my coding agent Kon

The author shares Kon, their AI coding agent, which works well with local models for simple tasks. It is notable for its small system prompt, lack of telemetry, compatibility with the best local models and popular providers, a simple codebase, and advanced features.

38
ARTICLE · ↑ trending · Reddit r/MachineLearning · 4/9/2026

Studying Sutton and Barto's RL book and its connections to RL for LLMs (e.g., tool use, math reasoning, agents, and so on)? [D]

A mathematics graduate seeks guidance on studying reinforcement learning (RL) and its connections to LLMs, especially for applications in mathematics. They question how relevant the Sutton and Barto book is in a modern LLM context and ask for help focusing on more recent topics and algorithms such as PPO and GRPO.

38