large language models

262 items

RESEARCH · ↑ trending · Reddit r/MachineLearning · 5/7/2026

META Superintelligence Lab Presents: ProgramBench: Can SOTA AI Recreate Real Executable Programs (ffmpeg, SQLite, ripgrep) From Scratch Without The Internet?

Meta Superintelligence Lab introduces ProgramBench, an initiative testing the ability of advanced AIs to recreate executable programs like ffmpeg and SQLite from scratch, without internet access. This study aims to explore the limits of AI code generation. The research focuses on evaluating the autonomy and completeness of AI models in complex software synthesis.

program synthesis code generation Benchmarks AI programming

RESEARCH · ↑ trending · Hacker News (AI) · 11d ago

AI Propaganda factories with language models

The article discusses the potential for AI, particularly large language models, to be exploited in creating 'propaganda factories'. It explores how these technologies could automate and scale the generation of misleading content, posing significant challenges to information integrity and public discourse.

Societal impact propaganda AI ethics large language models

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 26d ago

I Let a Small Model Train on Its Own Mistakes. It Reached 80% on HumanEval and Beat GPT-3.5 on Math

An experiment showed that a small AI model can train itself to code by inventing problems, solving them, and fine-tuning on its own corrections. The model achieved 80% on HumanEval and outperformed GPT-3.5 on math, using only a Python interpreter as the judge.

self-correction AI training Benchmarking code generation
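
The "Python interpreter as the judge" step in the item above can be illustrated with a minimal sketch. This is not the author's code: the function names `passes_tests` and `keep_verified`, the timeout value, and the toy `add` problem are all assumptions made for illustration. A candidate solution is run together with its tests in a fresh interpreter, and only candidates that exit cleanly are kept as fine-tuning data.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(solution_src: str, test_src: str, timeout_s: float = 5.0) -> bool:
    """Run solution + tests in a fresh interpreter; exit code 0 counts as a pass."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(solution_src + "\n\n" + test_src + "\n")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # A hanging candidate is treated as a failure.
            return False
        return result.returncode == 0


def keep_verified(candidates, test_src):
    """Filter candidate solutions down to those the interpreter accepts."""
    return [src for src in candidates if passes_tests(src, test_src)]


# Hypothetical toy problem: one correct and one buggy candidate.
GOOD = "def add(a, b):\n    return a + b"
BAD = "def add(a, b):\n    return a - b"
TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"

verified = keep_verified([GOOD, BAD], TESTS)
```

Here `verified` contains only the correct candidate. Running each candidate in a subprocess, rather than calling `exec` in the same process, keeps a crashing or looping candidate from taking down the filtering loop; it is not a security sandbox.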

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/14/2026

How to Distill from 100B+ to <4B Models

The post discusses model distillation, focusing on how to compress models with over 100 billion parameters into versions with fewer than 4 billion. The aim is to make large models cheaper to run and more accessible.

Model Compression LLMs Model Distillation AI Efficiency
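
The summary above does not say which distillation objective the post uses. As a point of reference, the classic soft-label formulation trains the small model to match the large model's temperature-softened output distribution. The sketch below assumes that formulation; the function names, the temperature of 2.0, and the example logits are illustrative choices, not details from the post.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher (large model)
    q = softmax(student_logits, temperature)  # student (small model)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# Hypothetical logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student's logits match the teacher's and grows as the two distributions diverge. In practice this term is usually combined with the ordinary next-token cross-entropy loss.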

ARTICLE · ↑ trending · Reddit r/MachineLearning · 4/26/2026

Why do only big ML labs dominate widely-used models despite many open-source pretrained models smaller labs could do RL on? [D]

The post asks why large AI labs dominate widely used models such as GPT and Claude despite the many open-source pretrained models of similar scale. The author suggests that Reinforcement Learning from Human Feedback (RLHF) is key to these models' advantage and wonders why it is not more accessible to smaller labs.

open-source AI RLHF AI industry large language models

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/23/2026

An Overnight Stack for Qwen3.6–27B: 85 TPS, 125K Context, Vision — on One RTX 3090 | by Wasif Basharat | Apr, 2026

The post describes an optimized stack for the Qwen3.6–27B model that reaches 85 tokens per second (TPS) and a 125K context window, with vision support, on a single RTX 3090.

Optimization multimodal AI GPU large language models

RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/13/2026

I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found [R]

An 18-year-old indie developer scaled a pure Spiking Neural Network (SNN) to 1.088 billion parameters from scratch for language modeling, achieving loss convergence despite common beliefs about vanishing gradients. Key findings include maintaining 93% sparsity and the unexpected emergence of structurally correct Russian text, though the experiment was halted due to budget constraints.

Spiking Neural Networks AI scaling large language models Language modeling

RESEARCH · arXiv CS.LG · 1d ago

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

Diffusion Large Language Models (dLLMs) face a "stability lag" due to irreversible token commitment, a problem exacerbated by Post-Training Quantization (PTQ) errors. FAIR-Calib proposes a two-stage PTQ framework that uses a position prior and layer-wise calibration to protect fragile frontier states, enhancing quantization for dLLMs.

Diffusion Models post-training quantization quantization AI calibration

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/24/2026

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

A user sought advice on purchasing high-end AI hardware to run large models like Gemma4s and Qwen3.6s, weighing options between a Blackwell/RTX Pro 6000 96G GPU and a Mac Studio M3 Ultra 256G. They ultimately decided on the Blackwell option, citing its superior token handling capabilities and a favorable deal.

AI applications GPU AI hardware large language models

NEWS · ↑ trending · Reddit r/LocalLLaMA · 5/6/2026

ZAYA1-8B: Frontier intelligence density, trained on AMD

ZAYA1-8B, a new AI model showcasing frontier intelligence density, has been announced. It was notably trained using AMD hardware.

AI training AMD AI model hardware

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/18/2026

Qwen3.6-35B-A3B solved coding problems Qwen3.5-27B couldn’t

The author, initially skeptical, tested Qwen3.6-35B-A3B and found it could solve coding problems that Qwen3.5-27B could not. This came up while developing a custom budgeting app, where the earlier model had been introducing technical debt.

model performance App Development large language models coding assistance

RESEARCH · DEV.to AI · 4/22/2026

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

This survey delves into large reasoning models, specifically examining the application of reinforced reasoning techniques to large language models. It offers a comprehensive overview of current methods and progress in enhancing LLM reasoning capabilities.

Survey reinforced learning AI Reasoning large language models

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/24/2026

DeepSeek-v4 has a comical 384K max output capability

A user expresses shock at DeepSeek-v4's 384K max output capability after the model generated a complete web OS as a single 100KB HTML file. The result shows the model's potential for producing long, complex outputs in one pass.

DeepSeek AI models code generation large language models

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 5/6/2026

Bad news: Apple drops high-memory Mac Studio configs

Apple has quietly discontinued high-memory configurations for the Mac Studio, leaving the M3 Ultra version with a maximum of 96GB RAM and the Mac mini at 48GB. This change is a significant setback for users wanting to run large AI models locally, as high-memory options were crucial for such tasks.

Apple Mac Studio Local AI hardware

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/27/2026

Anthropic's Claude remote uses GLM-4.7

A user observed that Anthropic's Claude Code remote environment defaults to the GLM-4.7 model rather than a proprietary Anthropic model. The finding raises the question of whether AI companies known for proprietary models are also serving open-weight models.

AI models Anthropic large language models

RESEARCH · arXiv CS.LG · 4/14/2026

Human-like Working Memory Interference in Large Language Models

This study investigates working memory limitations in Large Language Models (LLMs), finding that they exhibit human-like interference signatures. Pretrained LLMs show performance degradation under increased memory load and a recency bias, even though transformers can be trained to solve such tasks perfectly.

LLMs AI limitations Working Memory human cognition

RESEARCH · arXiv CS.CL · 18d ago

Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries

This paper introduces a schema-grounded natural language interface using Generative AI to make transportation safety data more accessible. It aims to bridge the gap for practitioners by translating user queries into structured semantic frames for reliable analysis.

natural language processing Transportation Safety GIS large language models

RESEARCH · arXiv CS.LG · 4/20/2026

Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures

Aletheia introduces a gradient-guided layer selection method for LoRA fine-tuning that identifies the most task-relevant layers and applies adapters to them selectively, with asymmetric rank. The approach achieves a 15-28% training speedup across diverse large language models and architectures while broadly matching downstream behavior.

Parameter-efficient fine-tuning efficiency large language models Fine-tuning

DOC · OpenAI Blog · 4/23/2026

GPT-5.5 System Card

This document, titled "GPT-5.5 System Card", likely details the technical specifications, capabilities, and limitations of the GPT-5.5 language model. It serves as a comprehensive reference for understanding the operation and usage guidelines of this advanced AI system.

Model Evaluation large language models AI safety Generative AI

ARTICLE · DEV.to AI · 4/22/2026

AI is no longer just a tool; it is gradually moving toward intelligence

Recent whispers in Silicon Valley point to Anthropic's Mythos, an AI model rumored to be transcending the definition of a mere tool towards intelligence. Insiders suggest Mythos can deeply analyze complex systems, understand software structures, and detect hidden vulnerabilities, capabilities far beyond standard language models.

AI capabilities Mythos Anthropic AI