model quantization

2 items

NEWS · ↑ trending · Reddit r/LocalLLaMA · 4/20/2026

ubergarm/Kimi-K2.6-GGUF Q4_X now available

User ubergarm/VoidAlchemy announced the "Q4_X" quantization of Kimi-K2.6-GGUF, thanking jukofyork and AesSedai for their tips on patching and quantization. The quant requires about 584 GB of combined RAM and VRAM and runs on both ik_llama.cpp and mainline llama.cpp. Smaller quants and imatrix info are planned to follow.

LLMs · model quantization · open-source AI

DOC · DEV.to AI · 15d ago

Local LLM Setup Guide (v27)

This guide walks through setting up and running local LLMs on Linux systems, covering hardware requirements, a comparison of popular frameworks such as llama.cpp and Ollama, and recommendations for various models and quantization formats. It aims to help users deploy LLMs locally for privacy, low latency, and cost savings.

LLM setup · model quantization · local LLM · AI frameworks