hardware

55 items

NEWS·↑ trending·Reddit r/LocalLLaMA·26d ago

NVIDIA Reportedly Prepares RTX 5090 Price Hike Amid Rising GDDR7 Costs (maybe RTX 50 and PRO series as well)

NVIDIA is reportedly preparing a price hike for the RTX 5090, and potentially for other RTX 50 and PRO series cards, due to rising GDDR7 memory costs.

RTX 5090 GPUs hardware NVIDIA

ARTICLE·DEV.to AI·19d ago

Designing with Nvidia's Ising Quantum AI: A Calibration Playbook for ML Engineers

Nvidia's Ising quantum AI models are combinatorial optimizers used to map high-dimensional hardware states into low-energy configurations for optimal operation. Productizing this technology as a service requires careful calibration to ensure reliable convergence and avoid being bypassed by operators.

Optimization ML Engineering hardware NVIDIA

ARTICLE·DEV.to AI·22d ago

i ran frontier ai entirely on my own hardware for months, and i can't go back

The author successfully ran frontier AI entirely on personal hardware for months, driven by frustrations with centralized cloud infrastructure dependency, latency, costs, and privacy concerns. They believe local AI represents the true future of the technology.

privacy Gemma 4 security Local AI

ARTICLE·DEV.to AI·4/23/2026

Agentic AI Needs Different Silicon

Google's new TPU 8T and 8I chips are designed specifically for agentic AI, which operates in stateful, multi-step loops rather than the stateless request pattern of traditional LLM inference. The article frames this as a fundamental shift in hardware architecture, in which the KV cache acts as persistent memory for agents that reason and act over time.

AI compute Google Agentic AI hardware

ARTICLE·DEV.to AI·4/17/2026

I Run 14 AI Agents 24/7 on a 16GB MacBook — Here's What Broke First

The author runs 14 AI agents 24/7 on a 16GB MacBook, challenging the consensus that powerful hardware is essential for serious AI workloads. These agents, which orchestrate a real business, are managed in waves with only 1-3 executing simultaneously to maintain persistent state.

AI orchestration LLMs Local AI hardware

ARTICLE·DEV.to AI·5/3/2026

I wrote a custom CUDA inference engine to run Qwen3.5-27B on $130 mining cards

A developer created a custom CUDA inference engine to successfully run the Qwen3.5-27B large language model on low-cost, repurposed mining graphics cards. This innovative approach demonstrates significant hardware optimization, making powerful AI models more accessible on affordable consumer-grade hardware.

CUDA Optimization inference hardware

ARTICLE·DEV.to AI·4/16/2026

Inside NVIDIA’s $2B Marvell Deal: What NVLink Fusion Means for AI Ethernet Fabrics

NVIDIA's $2B deal with Marvell, centered on NVLink Fusion, is a fabric-control move for AI Ethernet fabrics, not just a chip deal. It signifies that optical interconnects and rack-scale integration are the new battleground for AI infrastructure, shifting how network teams approach design.

Networking AI infrastructure hardware

ARTICLE·DEV.to AI·4/12/2026

How I Run an AI Agent 24/7 on a Mac Mini — The Full Setup

This article details the setup for running an AI agent 24/7 on a Mac Mini, named Joey. It covers hardware, software, and costs, highlighting the Mac Mini's energy efficiency and cost-effectiveness compared to cloud solutions.

Cost-effectiveness AI agent Automation Mac Mini

ARTICLE·DEV.to AI·15d ago

Most people starting with local LLMs jump straight to 4-bit quantization because it's fast and uses

This article compares 16-bit, 8-bit, and 4-bit LLM quantization and finds that 4-bit, while faster, significantly compromises quality on reasoning and math tasks. The real trade-off is between the task and the precision it requires: 8-bit suits precision-demanding tasks, with minimal quality loss and only a slight speed reduction. Quantization choice should be based on the task as well as the hardware, not on hardware alone.

inference speed model performance quantization hardware

ARTICLE·DEV.to AI·24d ago

Built an open-source picker that recommends the right self-hosted LLM for your hardware

An open-source picker has been developed to recommend self-hosted Large Language Models (LLMs) based on a user's specific hardware, including platform and available VRAM. The project also provides a curated model directory, installation guides for Ollama, llama.cpp, and LM Studio, and a glossary for newcomers.

Open Source self-hosting hardware guides

ARTICLE·DEV.to AI·4/25/2026

The Rise of Local AI: Running LLMs on Your Own Hardware in 2026

The article argues that in 2026, running powerful AI models locally on personal hardware is a mainstream capability, offering significant privacy benefits and zero marginal cost compared to cloud services. This shift addresses concerns about sending sensitive data to third-party servers and eliminates subscription fees.

privacy security Local AI hardware

ARTICLE·DEV.to AI·19d ago

The Pillars of Progress: Navigating AI Infrastructure and GPU Scaling

Artificial Intelligence is a transformative force, with GPUs being crucial for its computational power. Understanding AI infrastructure and GPU scaling is paramount for organizations aiming to harness this technology's full potential.

GPU scaling AI infrastructure hardware Computational power

NEWS·DEV.to AI·5/7/2026

Nvidia Ships AI Factory Blueprints: 4-Node to 128-Cluster Specs

Nvidia released three validated blueprints for AI data centers, from 4-node RTX PRO to 128-node NVL72 clusters, designed for agentic AI and trillion-parameter models. These Enterprise Reference Architectures offer repeatable infrastructure designs for deploying AI factories.

AI models data centers AI infrastructure hardware

ARTICLE·DEV.to AI·9d ago

Best Local AI Models for Apple Silicon in 2026

The article describes how running local AI models, a task that previously required dedicated NVIDIA GPUs, has become practical on Apple Silicon Macs. The change is driven by Apple Silicon's unified memory architecture, in which the CPU and GPU share a single pool of RAM.

mac apple-silicon Local AI hardware

DOC·DEV.to AI·16d ago

Local LLM Setup Guide (v12)

This is a practical guide for deploying local LLMs, detailing hardware, operating system, and prerequisite installation requirements. It compares frameworks like llama.cpp, Ollama, and vLLM for different development and performance needs.

learning guide hardware local deployment

DOC·DEV.to AI·18d ago

Running Flux Schnell (12B) + LLM via Native Vulkan on an Aging AMD RX 580 (8GB): Complete Architecture Guide [2026]

This technical guide demonstrates running LLMs and Stable Diffusion models on an old AMD RX 580 GPU in 2026, bypassing AI software limitations. It details the use of native Vulkan with the ggml engine for efficient inference, proving the viability of older hardware.

Vulkan hardware ggml AI inference

ARTICLE·DEV.to AI·29d ago

When I started running models locally, I thought quantization meant squeezing more into RAM. Turns o

The article advises against defaulting to Q4_K_M for local LLM inference, emphasizing that optimal performance comes from testing quantization levels tailored to specific workflows. It suggests that aggressive quantization like Q3_K_S can significantly cut latency with imperceptible quality loss for many tasks, though context length presents a trade-off.

Optimization LLMs quantization hardware

NEWS·The Verge AI·7d ago

Microsoft Build 2026: All the news about Windows, AI, RTX Spark, and more

Microsoft Build 2026, the annual developer conference, is set to kick off with anticipated announcements on new AI models, a Copilot "super app," and major Windows 11 changes. The event is also expected to feature new hardware like the Surface Laptop Ultra with Nvidia RTX Spark, and Project Solara, Microsoft's AI agent OS.

Windows Developer Conference Microsoft AI

NEWS·The Verge AI·5/5/2026

OpenAI is reportedly launching a phone for ChatGPT

OpenAI is reportedly fast-tracking the development of its first hardware product, a ChatGPT-powered phone, aiming for mass production in early 2027. The device is expected to feature a customized MediaTek Dimensity 9600 chip with an enhanced image signal processor.

smartphone tech news OpenAI ChatGPT

NEWS·MIT Tech Review AI·4/21/2026

Analog computing from waste heat

A team at MIT led by Giuseppe Romano has developed an analog computing method that uses waste heat from electronic devices for data processing, eliminating the need for electricity. This novel approach encodes input data without relying on binary 1s and 0s.

analog computing sustainable computing Energy Efficiency hardware