hardware

55 items

DOC↑ trendingReddit r/LocalLLaMA·4/27/2026

To 16GB VRAM users, plug in your old GPU

The post suggests that users with a 16GB VRAM card add an old GPU (6GB+ VRAM) to increase total VRAM, which makes larger LLMs (~30b) runnable even when the secondary card is weaker. It includes a practical configuration example for `llama-server`.

deep learning GPU optimization LLM inference VRAM management
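
As a rough illustration of why the extra card in the item above helps, here is a minimal sketch of the VRAM arithmetic. The 4.5 bits per weight and the 2 GiB allowance for KV cache and buffers are assumptions chosen for illustration, not figures from the post, and the helper names `weights_gib` and `fits` are hypothetical. The sketch also ignores how layers are actually split across cards.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: list[float], overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit in the pooled VRAM.

    overhead_gib is an assumed allowance for KV cache and buffers;
    real usage grows with context length.
    """
    needed = weights_gib(params_billion, bits_per_weight) + overhead_gib
    return needed <= sum(vram_gib)


if __name__ == "__main__":
    # Assumed: a ~30b model at roughly 4.5 bits per weight.
    print(f"weights: {weights_gib(30, 4.5):.1f} GiB")
    print("16 GB card alone:", fits(30, 4.5, [16]))
    print("16 GB + old 6 GB card:", fits(30, 4.5, [16, 6]))
```

Under these assumptions the weights alone nearly fill a 16GB card, so the few extra gigabytes from an old GPU are what leave room for context.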

ARTICLE↑ trendingReddit r/LocalLLaMA·25d ago

I have (even faster) DeepSeek V4 Pro at home

The author got DeepSeek V4 Pro running even faster on their home hardware using ktransformers. They detail the hardware tweaks and present benchmark results at increasing context depth.

DeepSeek Benchmarking hardware performance

NEWS↑ trendingHacker News (AI)·8d ago

Nvidia and Microsoft Reinvent Windows PCs for the Age of Personal AI

Nvidia and Microsoft are collaborating to power a new generation of Windows AI PCs with GeForce RTX GPUs, bringing advanced personal AI capabilities to users. This initiative, featuring Project G-Assist and Nvidia ACE, aims to integrate generative AI agents and accelerate applications like Microsoft Copilot directly on the device.

Microsoft Copilot Windows AI hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·4/30/2026

Follow-up: Qwen3.6-27B on 1× RTX 3090 — pushing to ~218K context + ~50–66 TPS, tool calls now stable (PN12 fix)

This update details running Qwen3.6-27B on a single RTX 3090, achieving ~218K context and stable tool calls at 50-66 TPS. A critical memory issue with long tool outputs was resolved by fixing an anchor drift in a Genesis patch (PN12) for vLLM.

Optimization hardware performance vLLM

ARTICLE↑ trendingReddit r/LocalLLaMA·4/22/2026

Is a high-end private local LLM setup worth it?

The user questions the worth of a high-end local LLM setup, citing high costs, setup difficulties, and perceived performance gaps compared to cloud services like Claude and GPT. They are willing to invest in powerful hardware but want to know if it can truly match the speed and intelligence of top commercial models.

local LLM private-ai cost hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·5/7/2026

Need advice on hardware purchasing decision: RTX 5090 vs. M5 Max 128GB for agentic software development

The user is seeking advice on choosing between an RTX 5090 and an M5 Max 128GB for agentic software development using Qwen3.6 27B locally. The RTX 5090 offers 3x speed, while the M5 Max provides 4x memory, presenting a trade-off between rapid code generation and larger context capacity.

LLMs GPU hardware performance

ARTICLE↑ trendingReddit r/LocalLLaMA·4/9/2026

16 GB VRAM users, what model do we like best now?

A user with 16 GB of VRAM shares a positive experience with the Qwen 3.5 27b model in IQ3 quants on an RTX 4080, reaching good speed and context. They discuss the challenges of optimizing local AI models with this amount of VRAM, weighing quality against speed across different quantization levels.

LLMs VRAM language models hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·4/27/2026

Guys this is so fun!

A user expresses excitement about running various AI models like Qwen and Llama locally on their MacBook Air and an AI Workstation with an RTX Pro 6000 Blackwell, utilizing tools such as LM Studio and LM Link.

open source models LLMs Local AI hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·4/21/2026

2x 512gb ram M3 Ultra mac studios

A user with two high-end M3 Ultra Mac Studios (512GB RAM each, $25k in hardware) is testing LLMs such as DeepSeek and GLM and asks the community what else to load. They are troubleshooting backend issues and awaiting optimizations for Kimi 2.6.

Apple AI models LLMs Mac Studio

NEWS↑ trendingReddit r/LocalLLaMA·4/12/2026

Weekend project with Intel B70s

A user is building a high-end system with Intel Arc B70 GPUs and a Gigabyte B850 AI Top motherboard. The goal is to test the Gemma 4 model in legal RAG applications, utilizing a Hermes agent.

Legal AI GPU RAG AI Model

NEWS↑ trendingReddit r/LocalLLaMA·5/6/2026

ZAYA1-8B: Frontier intelligence density, trained on AMD

ZAYA1-8B, a new AI model showcasing frontier intelligence density, has been announced. It was notably trained using AMD hardware.

AI training AMD AI Model hardware

RESEARCH↑ trendingReddit r/LocalLLaMA·4/19/2026

QWEN3.6 + ik_llama is fast af

A user reported running Qwen3.6 with ik_llama at over 50 tokens/second with a 200k context window on 16GB VRAM and 32GB RAM, a notable result for hardware of this size.

Benchmarking hardware performance LLM

NEWS↑ trendingReddit r/LocalLLaMA·5/4/2026

Ryzen AI Max+ 495 (Gorgon Halo) with 192GB VRAM!

Leaks indicate that the AMD Ryzen AI Max+ PRO 495 (Gorgon Halo) might feature an APU with 192GB of VRAM, signaling a promising future for Local AI. Despite potential high costs due to the storage crisis, future versions like the Medusa Halo in 2027 are speculated to reach 256GB.

Ryzen AI VRAM AMD Local AI

ARTICLEDEV.to AI·4/14/2026

OpenClaw on Raspberry Pi 5: Full Setup Guide

The article describes how the Raspberry Pi 5 is now powerful enough to comfortably run OpenClaw AI agent workloads, presenting a cost-effective and private alternative to cloud hosting. It details the Pi 5's specifications that make it practical for this purpose.

OpenClaw Raspberry Pi 5 AI hardware

ARTICLE↑ trendingHacker News (AI)·6d ago

32GB of DDR5 now costs $375 – AI shortage continues to squeeze PC building

The price of 32GB DDR5 memory has risen to $375, driven by the ongoing AI shortage. This trend continues to impact the PC building market, making components more expensive for consumers.

PC building AI shortage DDR5 hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·26d ago

The RTX 5000 PRO (48GB) arrived and it is better than I expected.

The author, a novice PC builder, bought an RTX 5000 Pro GPU for local LLM processing, spending $5600 in total. Despite initial struggles with assembly and software setup (Linux, vLLM), they found the GPU's performance better than expected.

local LLM PC Build GPU AI

ARTICLE↑ trendingReddit r/MachineLearning·4/17/2026

Which computer should I buy: Mac or custom-built 5090? [D]

The user seeks advice on choosing between a Mac M5 MAX with MLX and a custom-built PC with an RTX 5090 for their machine learning projects. Their work primarily involves fine-tuning large pre-trained models and training from scratch, often with image/video data and sometimes LLMs, making VRAM a critical factor.

deep learning GPU machine learning hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·5/6/2026

Bad news: Apple drops high-memory Mac Studio configs

Apple has quietly discontinued high-memory configurations for the Mac Studio, leaving the M3 Ultra version with a maximum of 96GB RAM and the Mac mini at 48GB. This change is a significant setback for users wanting to run large AI models locally, as high-memory options were crucial for such tasks.

Apple Mac Studio Local AI hardware

NEWS↑ trendingReddit r/LocalLLaMA·4/26/2026

Comparison of upcoming x86 unified memory systems

The post compares upcoming x86 unified memory systems from AMD and Intel, including Gorgon Halo, Strix Halo, Medusa Halo, and Nova Lake AX. It details release timelines and bandwidth improvements, with AMD Medusa Halo promising a significant performance jump by 2027.

AI accelerators processors memory hardware

ARTICLE↑ trendingReddit r/LocalLLaMA·19d ago

In theory, if I have $20k-ish to spend on hardware what would actually get me closest to local coding agent that would allow me to go totally off the social grid?

The user asks what hardware (around $20k, e.g., RTX 6000 GPUs) would be needed to set up a local coding agent and go completely off the social grid. The question also touches upon the role of the AI model in this setup.

Coding Agent privacy Local AI hardware