LLM

609 items

RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/14/2026

We benchmarked TranslateGemma against 5 other LLMs on subtitle translation across 6 languages. At first glance the numbers told a clean story, but then human QA added a chapter. [D]

A benchmark of six large language models (LLMs), including TranslateGemma-12b, on translating English subtitles into six languages. The models were ranked with reference-free quality estimation (QE) metrics and a custom combined metric called TQI; TranslateGemma-12b ranked first overall.

70
ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/14/2026

Gemma 4 31B — 4bit is all you need

A comparison of the 4-bit and 8-bit quantized versions of Gemma 4 31B on an M5 Max MacBook Pro. Surprisingly, the 4-bit version scored higher (91.3% vs 88.4%). The post also notes that Gemma 4 26B-A4B entered a regression loop, with responses truncated once they hit the max token limit of 16,384.

67
CASE · ↑ trending · Reddit r/LocalLLaMA · 4/23/2026

Qwen 3.6 27B is a BEAST

A user reports that Qwen 3.6 27B, run locally on a laptop, excels at data science tasks such as tool calls and debugging data transformations. They were impressed enough to consider canceling their cloud subscriptions, and found it a perfect fit for PySpark/Python work.

56
NEWS · ↑ trending · Reddit r/LocalLLaMA · 4/22/2026

Qwen3.6-27B released!

Qwen3.6-27B, a new dense, open-source model, has been released. It is billed as delivering flagship-level agentic coding that surpasses its predecessor, Qwen3.5-397B-A17B, along with strong reasoning across text and multimodal tasks. It supports thinking and non-thinking modes and is available under the Apache 2.0 license.

54
DOC · Google for Developers (YouTube) · 19h ago

Gemma Playground: Robot Duck

A video exploring the Gemma Playground through a 'Robot Duck' example application, demonstrating the Gemma model's capabilities in a practical scenario.

54
RESEARCH · arXiv CS.LG · 19h ago

Enabling KV Caching of Shared Prefix for Diffusion Language Models

The paper introduces "bicache", which it presents as the first KV caching technique for shared prefixes in diffusion language models (DLMs). Existing LLM caching methods fail for DLMs because of their bidirectional attention. The new approach aims to unlock high-throughput DLM serving by exploiting the observation that shared-prefix KVs remain stable in shallow layers.

54