large language models

262 items

DOC Andrej Karpathy (YouTube)·11/23/2023

[1hr Talk] Intro to Large Language Models

A one-hour talk introducing Large Language Models (LLMs), covering their fundamental concepts and how they work.

learning large language models

DOC Andrej Karpathy (YouTube)·1/17/2023

Let's build GPT: from scratch, in code, spelled out.

This content provides a detailed tutorial on building a GPT model from scratch, explaining each implementation step in code. It serves as a practical guide to understanding the architecture and functionality of Large Language Models.

GPT learning large language models AI development

DOC Andrej Karpathy (YouTube)·6/9/2024

Let's reproduce GPT-2 (124M)

This content provides a guide to reproducing the GPT-2 (124M) model, detailing the steps required to recreate this language model. It serves as a practical tutorial for AI enthusiasts and developers.

learning GPT-2 machine learning large language models

ARTICLE The AI Epiphany (YouTube)·7/3/2024

Best LLM? Qwen 2 LLM w/ author Junyang Lin

This content discusses Qwen 2, a large language model, potentially reviewing its capabilities or comparing it with other LLMs, featuring insights from its author, Junyang Lin.

AI models Qwen 2 large language models LLM

ARTICLE The AI Epiphany (YouTube)·9/16/2024

Imbue - training a 70B model from scratch! (w/ Bowei - head of infra)

This content discusses Imbue's ambitious project of training a 70B AI model entirely from scratch. It features Bowei, head of infrastructure, providing insights into the challenges and processes involved in such a large-scale undertaking.

model training Imbue infrastructure large language models

ARTICLE DEV.to AI·4/11/2026

Best GirlfriendGPT Alternative in 2026: Why AI Angels Wins

The article discusses the evolution of AI companion applications and positions AI Angels as the superior alternative to GirlfriendGPT in 2026. It highlights users' search for more meaningful, personalized, and private experiences, identifying AI Angels as the definitive choice for the best AI girlfriend experience.

AI Angels AI girlfriends large language models AI companions

ARTICLE DEV.to AI·4/27/2026

The next phase of the Microsoft OpenAI partnership

The next phase of the Microsoft OpenAI partnership centers on integrating OpenAI's advanced models, including GPT-4, into Microsoft products like Azure, Dynamics, and Office. This integration aims to empower developers to build and deploy AI-powered applications on Microsoft's cloud platform.

GPT-4 AI integration cloud computing AI partnership

ARTICLE DEV.to AI·4/15/2026

Why Does AI Just... Make Stuff Up?

This article explores the fundamental reasons behind artificial intelligence's tendency to generate incorrect or fabricated information, often referred to as "hallucinations". It delves into the mechanisms that cause AI models to "make stuff up" and discusses implications for their reliability and trustworthiness.

AI hallucinations AI limitations AI reliability large language models

DOC fast.ai Blog·11/6/2025

A Guide to Solveit Features

Large language models make code generation remarkably easy, but this often leads to code that developers don't understand. This lack of comprehension makes modifying, debugging, or adding features to AI-generated code challenging.

code maintainability code generation large language models Software Engineering

ARTICLE DEV.to AI·4/13/2026

AI Agents vs RPA: Which Automation Technology Is Better?

This article compares AI agents and RPA, highlighting that RPA automates repetitive tasks on user interfaces, while AI agents use LLMs for reasoning and adaptation. The choice depends on the need for deterministic repetition or intelligent decision-making, with many organizations adopting a hybrid approach.

workflow automation large language models automation RPA

NEWS DEV.to AI·4/26/2026

DeepSeek-V4 Ported to MLX for Apple Silicon Inference

DeepSeek-V4 has been ported to Apple's MLX framework, enabling the large language model to run on Apple Silicon Macs. The functional port, a community effort by @Prince_Canuma, still requires optimization for improved performance.

apple-silicon local inference MLX large language models

ARTICLE DEV.to AI·4/24/2026

Qwen3.6-Plus for Coding: When It Beats Qwen3.5-Plus

Qwen3.6-Plus outperforms Qwen3.5-Plus in complex, multi-stage coding tasks that require codebase inspection, planning, and integrated tool use. While 3.5-Plus handles short snippets well, 3.6-Plus excels at maintaining context across workflows involving terminal commands, search, and browsing.

AI models software development tool use large language models

ARTICLE DEV.to AI·4/8/2026

Understanding Tokens and Context Windows

Tokens are the fundamental building blocks of Large Language Models (LLMs), which predict the next sequence of text based on smaller units. Breaking text into tokens is essential to how chat completion systems work.

LLMs Artificial Intelligence context windows large language models

ARTICLE DEV.to AI·4/11/2026

You Don’t Need “Prompt Engineering” to Talk to AI

This article argues that "Prompt Engineering" is overrated for everyday users, asserting that interacting with Large Language Models is as simple as having a conversation. The author, an AI student, proposes a straightforward approach to getting good results without advanced technical knowledge.

AI interaction User Guide prompt-engineering AI

ARTICLE DEV.to AI·4/13/2026

The Expensive Anxiety of AI

The article analyzes the significant resources and computational costs involved in training and deploying AI models, particularly large language models. It discusses the need for massive data, complex matrix operations, and specialized hardware like GPUs and TPUs, as well as techniques such as distributed and parallel processing.

GPU computational costs large language models TPU

ARTICLE DEV.to AI·4/6/2026

I built a GitHub App that auto-writes PR descriptions — here's what happened in 8 days

The author recounts building an AI-based GitHub App that automatically generates pull request descriptions, and reports the results after 8 days of use.

GitHub productivity AI large language models

ARTICLE OpenAI Blog·4/10/2026

AI fundamentals

This beginner-friendly guide explains the fundamentals of artificial intelligence, detailing what AI is and how it works. It also covers the application of large language models in tools like ChatGPT.

ai-fundamentals beginner guide ChatGPT large language models

ARTICLE ↑ trending Reddit r/LocalLLaMA·4/28/2026

Duality of r/LocalLLaMA

This content is a Reddit post title suggesting a discussion about the dual or contrasting aspects of the r/LocalLLaMA community, focused on local language models.

AI Community Reddit large language models

ARTICLE ↑ trending Reddit r/LocalLLaMA·4/10/2026

the state of LocalLLama

This content presents an analysis of the current state of the LocalLLama project. It explores the advances and challenges surrounding local Large Language Models.

open-source AI Local LLMs AI large language models

NEWS DEV.to AI·4/24/2026

DeepSeek V4 Revolutionizes AI with a 1 Million Token Context and World-Class Reasoning

DeepSeek V4 is revolutionizing AI by introducing a 1 million token context window and world-class reasoning capabilities. The announcement details the key points, with a more in-depth analysis available in the full article.

DeepSeek AI models Context window Reasoning