LLM

DOC · DEV.to AI · 16d ago

Building a Terminal AI Agent (v2)

This practical guide teaches developers how to build and optimize terminal-based AI agents, leveraging local LLMs for real-time code support. It details the setup of platforms like Aider and Ollama, and includes an example CLI agent with function calling capabilities.

28
ARTICLE · DEV.to AI · 5/7/2026

Firecrawl vs Crawl4AI: Web Scraping for RAG

Building reliable Retrieval-Augmented Generation (RAG) pipelines necessitates a shift in web scraping from traditional selectors to converting DOM into semantic Markdown. Firecrawl and Crawl4AI are key tools for this translation layer, and this post evaluates them based on architectural fit, extraction quality, performance, and AI workflow integration.

28
RESEARCH · arXiv CS.AI · 5d ago

Synthetic Contrastive Reasoning for Multi-Table Q&A

This paper introduces a synthetic contrastive reasoning-trace dataset for multi-table question answering (MMQA), addressing the lack of reasoning supervision in existing resources. Open-weight LLMs fine-tuned with Contrastive Preference Optimization (CPO) using this dataset achieved significant performance improvements, highlighting the benefits of heterogeneous trace generators.

28
RESEARCH · arXiv CS.CL · 5d ago

LANTERN: Layered Archival and Temporal Episodic Retrieval Network for Long-Context LLM Conversations

LANTERN is a lightweight memory layer for LLMs that archives conversation turns and restores relevant details after context compaction via hybrid retrieval. It recovers 78.3% of verifiable facts lost to compaction, outperforming LLM-driven approaches with significantly lower inference cost and zero LLM calls.

28
DOC · DEV.to AI · 11d ago

How to Deploy Qwen2.5 72B with vLLM + AWQ Quantization on a $24/Month DigitalOcean GPU Droplet: Multilingual Reasoning at 1/110th Claude Opus Cost

This guide details how to deploy Qwen2.5 72B with vLLM and AWQ quantization on a DigitalOcean GPU Droplet at $24/month. It compares the cost with commercial AI APIs such as Claude Opus, claiming multilingual reasoning at roughly 1/110th of the price.

28
ARTICLE · DEV.to AI · 4/9/2026

Karpathy called it context engineering > prompt engineering. I built a tool that does it automatically for codebases.

The article discusses Karpathy's emphasis on "context engineering" over "prompt engineering" for LLMs, stressing that AI performance depends critically on the context provided. It points to the problem of LLMs repeatedly consuming many tokens to understand a codebase's context, which led the author to build a tool that automates the process.

28
RESEARCH · arXiv CS.CL · 4/9/2026

SensorPersona: An LLM-Empowered System for Continual Persona Extraction from Longitudinal Mobile Sensor Streams

SensorPersona is an LLM-based system that continually infers user personas from multimodal data collected unobtrusively from mobile sensors. It deepens personalization by extracting physical patterns, psychosocial traits, and life experiences, overcoming the limitations of inference based on chat history alone.

28