model training

16 items

RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/24/2026

New project about llm hallucination [P]

A side project, with an accompanying GitHub repository, that aims to mitigate LLM hallucination through contrastive sampling and selective training. It treats hallucination as a preference problem: self-generated negative samples and divergence-based, gated learning are used to promote correct answers and suppress wrong ones.

hallucination model training Natural Language Processing AI safety

RESEARCH · ↑ trending · Reddit r/MachineLearning · 27d ago

Trained transformer-based chess models to play like humans (including thinking time) [P]

A developer trained transformer-based deep learning models to play chess like humans across various rating buckets, including prediction of human thinking time. The models were trained on Lichess data and achieve accuracy comparable to MAIA-3 despite their small size.

AI models deep learning chess AI model training

RESEARCH · ↑ trending · Reddit r/LocalLLaMA · 25d ago

internlm/Intern-S2-Preview · Hugging Face

Intern-S2-Preview is an efficient 35B scientific multimodal foundation model that achieves performance comparable to trillion-scale models by exploring task scaling and full-chain training. It excels in hundreds of professional scientific tasks while maintaining strong general reasoning, multimodal understanding, and agent capabilities.

AI models multimodal AI model training Foundation Models

ARTICLE · ↑ trending · Reddit r/MachineLearning · 5/7/2026

Dataset of 150k+ stool images and not sure how to fully use it [D]

A user with a dataset of 150k+ stool images asks for best practices for training a computer vision model on it. They question their current manual verification workflow and are looking for smarter, more scalable ways to ensure dataset and annotation quality.

dataset-quality model training machine learning computer vision

RESEARCH · arXiv CS.LG · 20d ago

Simply Stabilizing the Loop via Fully Looped Transformer

Looped Transformers improve model performance by iteratively reusing blocks without increasing parameter count, but training becomes unstable at higher loop counts. The paper attributes this instability to gradient oscillation and residual explosion, and proposes the Fully Looped Transformer, which introduces a Fully Looped Architecture and Attention Injection.

neural networks AI architecture deep learning model training
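
To make the "reusing blocks without increasing parameter count" idea in the Looped Transformer item above concrete, here is a minimal sketch of plain weight-tied looping in pure Python. It is not the Fully Looped Transformer from the paper: the toy residual block, the `tanh` nonlinearity, and all function names are assumptions chosen for illustration only.

```python
import math

def make_block(dim, scale=0.1):
    """Build one shared weight matrix (hypothetical toy block, deterministic values)."""
    return [[scale * math.sin(i * dim + j + 1) for j in range(dim)]
            for i in range(dim)]

def apply_block(weights, x):
    """One residual update: x + tanh(W x)."""
    return [xi + math.tanh(sum(w * xj for w, xj in zip(row, x)))
            for xi, row in zip(x, weights)]

def looped_forward(weights, x, n_loops):
    """Apply the same block n_loops times; no new parameters are created."""
    for _ in range(n_loops):
        x = apply_block(weights, x)
    return x

def param_count(weights):
    """Number of scalar parameters in the shared block."""
    return sum(len(row) for row in weights)

if __name__ == "__main__":
    w = make_block(3)
    x0 = [1.0, -0.5, 0.25]
    for loops in (1, 4, 16):
        out = looped_forward(w, x0, loops)
        norm = math.sqrt(sum(v * v for v in out))
        # Parameter count stays fixed while effective depth grows with loops.
        print(loops, param_count(w), round(norm, 4))
```

Printing the output norm per loop count is one simple way to watch how the residual stream behaves as depth increases, which is the quantity the summary's "residual explosion" refers to.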

RESEARCH · arXiv CS.LG · 5/1/2026

Monitoring Neural Training with Topology: A Footprint-Predictable Collapse Index

A new topology-aware monitor, the Collapse Index (CI), is proposed to detect representational collapse early in neural training. It uses fast, incremental updates to provide a low-latency early-warning signal for interventions in LLM fine-tuning and KGE training.

neural networks monitoring topology model training

RESEARCH · DEV.to AI · 5/6/2026

Micro-Batch Training with Batch-Channel Normalization and Weight Standardization

An overview of techniques for training neural networks with micro-batches. It details how batch-channel normalization and weight standardization improve model performance and stability when batch sizes are small.

neural networks batch-normalization Optimization deep learning
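
As a rough illustration of the weight standardization mentioned in the micro-batch item above, the sketch below rescales each output channel's weights to zero mean and roughly unit variance. It assumes the commonly described per-output-channel formulation; the flat list-of-lists layout, the `eps` value, and the function name are assumptions, and the linked post may implement it differently.

```python
import math

def standardize_weights(weights, eps=1e-5):
    """Standardize each output channel's weights independently.

    weights: list of output channels, each a flat list of floats
             (in_channels * kernel_size values per channel).
    """
    standardized = []
    for channel in weights:
        n = len(channel)
        mean = sum(channel) / n
        var = sum((w - mean) ** 2 for w in channel) / n
        std = math.sqrt(var + eps)  # eps guards against division by zero
        standardized.append([(w - mean) / std for w in channel])
    return standardized

if __name__ == "__main__":
    raw = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 10.0]]
    for channel in standardize_weights(raw):
        print([round(w, 4) for w in channel])
```

Because the statistics come from the weights rather than the activations, the result does not depend on batch size, which is why the technique is paired with micro-batch training.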

RESEARCH · arXiv CS.CL · 5/4/2026

RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

RSAT is a new method that trains small language models (SLMs) to produce faithful, step-by-step reasoning for table questions, grounded with cell-level citations. It significantly improves faithfulness (3.7x) and achieves near-perfect citation validity by integrating attribution into the reasoning process.

language models attribution Table Reasoning model training

RESEARCH · arXiv CS.LG · 4/21/2026

Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning

The paper finds that LoRA fine-tuning "un-learns" contested examples: high annotator disagreement correlates with loss that increases during training. The pattern is distinct from full fine-tuning and is observed consistently across multiple encoder and decoder-only models and datasets.

model training machine learning NLP Fine-tuning
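
For the annotation-entropy item above, one plausible reading of "annotator disagreement" is the Shannon entropy of the labels that annotators assigned to a single example. The sketch below assumes that definition; the paper's exact measure is not given in the summary, and the function name and label strings are illustrative.

```python
import math
from collections import Counter

def annotation_entropy(labels):
    """Shannon entropy (in bits) of the annotator labels for one example."""
    if not labels:
        raise ValueError("at least one annotator label is required")
    total = len(labels)
    counts = Counter(labels)
    # Sum p * log2(1/p) over the observed label distribution.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

if __name__ == "__main__":
    unanimous = ["pos", "pos", "pos", "pos"]
    contested = ["pos", "neg", "pos", "neg"]
    print(annotation_entropy(unanimous))  # no disagreement
    print(annotation_entropy(contested))  # maximal disagreement for two labels
```

Under this definition, examples with higher entropy are the "contested" ones whose per-example loss the paper tracks during training.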

DOC · AWS Machine Learning Blog · 7d ago

The art and science of hyperparameter optimization on Amazon Nova Forge

This post explains how to tune hyperparameters on Amazon Nova Forge so that domain-specific performance improves without degrading the model's general capabilities. It covers customization strategies, configuring training parameters such as learning rate and batch size, and avoiding common mistakes that lead to wasted training runs.

Amazon Nova Forge hyperparameter optimization learning model training

RESEARCH · DEV.to AI · 5/10/2026

Distillation that keeps confidence honest

Traditional on-policy distillation (OPD) causes smaller student models to exhibit overconfidence due to the larger teacher model's access to privileged context. New research formalizes this mismatch and proposes CaOPD to rectify this certainty illusion without sacrificing accuracy gains.

Confidence Calibration distillation model training machine learning

ARTICLE · DEV.to AI · 7d ago

What Makes a Good SFT Sample (And Why Most Synthetic Datasets Get It Wrong)

Many fine-tuned language models end up performing worse because of poor-quality synthetic data. The problem lies not in the training setup but in the lack of mechanisms for filtering out errors during synthetic data generation.

synthetic data LLMs model training Fine-tuning

RESEARCH · arXiv CS.AI · 5/6/2026

Terminus-4B: Can a Smaller Model Replace Frontier LLMs at Agentic Execution Tasks?

This research introduces Terminus-4B, a finetuned small language model, and examines whether it can replace frontier LLMs on agentic terminal execution tasks. The model is post-trained with Supervised Finetuning and Reinforcement Learning using rubric-based LLM-as-judge rewards.

LLMs model training performance evaluation Small Language Models

ARTICLE · The AI Epiphany (YouTube) · 9/16/2024

Imbue - training a 70B model from scratch! (w/ Bowei - head of infra)

A discussion of Imbue's project to train a 70B model entirely from scratch. Bowei, head of infrastructure, describes the challenges and processes involved in an undertaking of this scale.

model training Imbue infrastructure large language models

ARTICLE · Hugging Face Blog · 3/3/2026

PRX Part 3 — Training a Text-to-Image Model in 24h!

This is the third episode of the PRX series, focused on the challenge of training an AI model that generates images from text descriptions. The article sets out to explore how this complex task can be carried out within an optimized 24-hour window.

Text-to-image deep learning model training machine learning

DOC · Hugging Face Blog · 4/16/2026

Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers

A guide to training and finetuning multimodal embedding and reranker models with the Sentence Transformers library.

Finetuning embedding models multimodal AI model training