
LLM

612 items

ARTICLE · DEV.to AI · 10d ago

ai, deepseek, machinelearning

This article details the complete history of LLM development in China from 2017 to 2026, illustrating how Chinese AI labs evolved into genuine competitors. It highlights milestones such as Baidu's ERNIE 1.0 and the impact of OpenAI's GPT-2, alongside challenges like GPU export restrictions.

ARTICLE · DEV.to AI · 4/18/2026

Kiwi-chan Progress Report: Steady Mining!

This devlog reports on Kiwi-chan, a local-LLM Minecraft bot, and its progress gathering resources such as oak logs. It describes the difficult debugging process and the AI's loop of generating, executing, and rewriting its own code to recover from failures in the game world.

ARTICLE · DEV.to AI · 4/8/2026

AIMock: One Mock Server For Your Entire AI Stack

AIMock is a mock server designed for agentic AI stacks, aimed at the unreliable, expensive, and slow tests that come from depending on real APIs. It extends LLMock to cover multiple services (LLM, vector database, reranker, etc.), providing fast, free, and reliable tests for complex AI applications.

ARTICLE · DEV.to AI · 4/15/2026

I Ran 163 Benchmarks Across 10 LLMs So You Don't Have To. Here's What I Found

This article argues that teams commonly overpay for LLM inference because they skip proper benchmarking and pick models by popularity rather than cost-efficiency. Using a tool called CostGuard, the author ran 163 benchmarks across 10 models and found price differences of up to 200x between models such as Gemini 2.5 Flash and GPT-5.

CASE · DEV.to AI · 4/10/2026

My AI pipeline had a 1M token context window. The output still got worse.

An AIOps investigation pipeline that used a 1M-token context window with Gemini saw its output quality worsen because of poor context selection. A fixed loading ratio pulled in irrelevant code, especially from a legacy repository, and degraded the model's performance, showing that context quality matters more than quantity.
