Foundation Models

24 items

RESEARCH · ↑ trending · Reddit r/LocalLLaMA · 25d ago

internlm/Intern-S2-Preview · Hugging Face

Intern-S2-Preview is an efficient 35B scientific multimodal foundation model that achieves performance comparable to trillion-scale models by exploring task scaling and full-chain training. It excels in hundreds of professional scientific tasks while maintaining strong general reasoning, multimodal understanding, and agent capabilities.

AI models multimodal AI model training Foundation Models

RESEARCH · ↑ trending · Reddit r/MachineLearning · 26d ago

Continual Harness: Online Adaptation for Self-Improving Foundation Agents [R]

The paper introduces "Continual Harness," a new approach for online adaptation in self-improving foundation agents, formalizing the iterative refinement loop. This methodology enables model-harness co-learning, building upon the success of systems like Gemini Plays Pokémon.

Online Adaptation self-improvement machine learning Foundation Models

RESEARCH · arXiv CS.LG · 13d ago

TSFMAudit: Data Contamination Auditing in Forecasting Time Series Foundation Models

This work introduces TSFMAudit, a method for auditing pretraining data contamination in Time Series Foundation Models (TSFMs). It detects evaluation datasets that were exposed during pretraining, which leads to overly optimistic performance estimates, by looking for unusually efficient adaptation during fine-tuning. The study evaluates TSFMAudit on 6 TSFMs and 187 datasets, addressing the previously unstudied problem of pretraining contamination auditing for TSFMs.

time-series-models data-auditing security machine learning

RESEARCH · arXiv CS.LG · 8d ago

NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models

This paper introduces NumLeak, a framework designed to measure memorized recall in foundation models using public numeric benchmarks. It reveals that top-tier LLMs recall financial and economic data with high fidelity, suggesting that evaluations may be measuring memorization rather than genuine out-of-sample skill.

LLM memorization Foundation Models data leakage Benchmarking

RESEARCH · arXiv CS.CL · 4/6/2026

SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models

SocioEval is a template-based framework for systematically evaluating socioeconomic status bias in foundation models, including LLMs, an underexplored area. The study evaluated 13 LLMs and found substantial variation in bias rates (0.42% to 33.75%), with the bias manifesting differently across themes.

LLMs evaluation Foundation Models SocioEval

RESEARCH · arXiv CS.AI · 4d ago

GITCO: Gated Inference-Time Context Optimization in TSFMs

This paper introduces GITCO, a lightweight framework for Gated Inference-Time Context Optimization that improves the accuracy of patch-based Time Series Foundation Models (TSFMs). It selectively identifies and suppresses harmful patches without updating model weights, achieving a 1.95% reduction in MASE on TimesFM 2.5 across 53 datasets.

forecasting Optimization machine learning Foundation Models
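
For readers unfamiliar with the metric in the summary above, here is a minimal sketch of MASE (mean absolute scaled error) as it is commonly defined: the forecast's mean absolute error divided by the in-sample error of a seasonal-naive forecast. The function names, the `season` parameter, and the toy numbers are illustrative assumptions, not taken from the GITCO paper, and the paper's exact evaluation protocol may differ.

```python
def mase(actual, forecast, train, season=1):
    """Forecast MAE scaled by the in-sample MAE of a seasonal-naive forecast."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(train) <= season:
        raise ValueError("training series must be longer than the season length")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    # Seasonal-naive baseline: predict each training point by the value one season earlier.
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    if naive_mae == 0:
        raise ValueError("scaling term is zero (constant training series)")
    return mae / naive_mae


def relative_reduction_pct(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Toy, made-up series purely for illustration.
train = [1, 2, 3, 4, 5]
score = mase(actual=[6, 7], forecast=[5, 9], train=train)
```

A MASE below 1 means the forecast beats the naive baseline on average; a reported percentage reduction compares two such scores, as in `relative_reduction_pct`.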

RESEARCH · arXiv CS.LG · 25d ago

Mechanistic Interpretability of EEG Foundation Models via Sparse Autoencoders

This paper explores the mechanistic interpretability of EEG foundation models by applying TopK Sparse Autoencoders (SAEs) to extract sparse feature dictionaries from their embeddings. It benchmarks monosemanticity and entanglement across different EEG transformer architectures, grounds these features in a clinical taxonomy, and introduces concept steering to quantify selectivity and expose representational failures.

Clinical AI AI interpretability Foundation Models Sparse autoencoders
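
As a rough illustration of the TopK sparsification mentioned in the summary above, the sketch below keeps only the k largest encoder pre-activations and zeroes the rest. It is a dependency-free toy under stated assumptions: `topk_activation`, `encode`, and the tiny weight matrix are hypothetical names and values, and the paper's actual architecture, training loss, and decoder are not shown here.

```python
def topk_activation(pre_acts, k):
    """Keep the k largest pre-activations and set all others to zero."""
    if not 0 < k <= len(pre_acts):
        raise ValueError("k must be between 1 and the number of features")
    ranked = sorted(range(len(pre_acts)), key=lambda i: pre_acts[i], reverse=True)
    keep = set(ranked[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(pre_acts)]


def encode(x, w_enc, b_enc, k):
    """Linear encoder followed by TopK: one sparse feature vector per embedding."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(w_enc, b_enc)]
    return topk_activation(pre, k)


# Toy embedding and encoder weights, made up for illustration only.
embedding = [1.0, 2.0]
w_enc = [[1, 0], [0, 1], [1, 1]]
b_enc = [0, 0, 0]
features = encode(embedding, w_enc, b_enc, k=1)
```

Fixing k directly controls how many dictionary features fire per input, which is what makes the resulting features convenient to inspect one at a time.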

RESEARCH · arXiv CS.LG · 11d ago

TaxDistill: Improving Metagenomic Taxonomic Annotation via Distilled Genomic Foundation Models

TaxDistill introduces a knowledge distillation framework to improve metagenomic taxonomic annotation by addressing limitations of traditional methods. It utilizes GenomeOcean, a 500M-parameter genomic foundation model, as a teacher network to generate clean soft labels and enhance classification performance.

Genomics machine learning Foundation Models metagenomics
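
The summary above mentions a teacher network producing soft labels. As a hedged sketch of what that usually means in knowledge distillation, the code below softens teacher and student logits with a temperature and measures their KL divergence. This is the generic textbook formulation, not TaxDistill's confirmed loss; `softmax`, `soft_label_loss`, the temperature value, and the logits are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((z - peak) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened class distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up logits over three hypothetical taxonomic classes.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 0.1]
loss = soft_label_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so the student is trained on the teacher's full class distribution rather than a single hard label.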

ARTICLE · DEV.to AI · 29d ago

White Paper FM v Public API

This article compares Apple's Foundation Models white paper with its actual API surface, highlighting a significant discrepancy between advertised capabilities and exposed functionalities. The author notes that the white paper describes an ambitious multimodal system, whereas the API exposes only a fraction of that functionality.

Apple AI models Foundation Models API

RESEARCH · arXiv CS.CL · 4/13/2026

A Representation-Level Assessment of Bias Mitigation in Foundation Models

This research investigates how bias mitigation reshapes the embedding space of encoder-only and decoder-only foundation models like BERT and Llama2. Findings show that bias mitigation reduces gender-occupation disparities in the embedding space, leading to more neutral internal representations, confirming embedding analysis as a valuable debiasing validation tool.

BERT Bias Mitigation Foundation Models representational analysis

RESEARCH · arXiv CS.LG · 28d ago

Do Foundation Model Embeddings Improve Cross-Country Crop Yield Generalisation? A Leave-One-Country-Out Evaluation in Sub-Saharan Africa

This paper evaluates whether geospatial foundation model embeddings improve cross-country maize yield predictions in Sub-Saharan Africa. It finds that while within-country predictions are moderate, all feature sets, including foundation model embeddings, perform poorly under cross-country testing, indicating a significant generalisability gap.

Geospatial AI Sub-Saharan Africa machine learning Foundation Models
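
The leave-one-country-out protocol named in the title above can be sketched as a simple split generator: each country is held out in turn as the test set while the model trains on all the others. The record layout (`country`, `yield_t_ha`) and the placeholder rows are assumptions for illustration; they are not the paper's data or code.

```python
from collections import defaultdict


def leave_one_country_out(records):
    """Yield (held_out_country, train_rows, test_rows), one fold per country."""
    by_country = defaultdict(list)
    for row in records:
        by_country[row["country"]].append(row)
    for held_out in sorted(by_country):
        test_rows = by_country[held_out]
        train_rows = [
            row
            for country, rows in by_country.items()
            if country != held_out
            for row in rows
        ]
        yield held_out, train_rows, test_rows


# Placeholder records with made-up values; country codes are hypothetical.
records = [
    {"country": "A", "yield_t_ha": 1.2},
    {"country": "A", "yield_t_ha": 1.5},
    {"country": "B", "yield_t_ha": 2.1},
    {"country": "C", "yield_t_ha": 0.9},
]
folds = list(leave_one_country_out(records))
```

Because no row from the held-out country ever appears in training, the test score reflects transfer to an unseen country rather than interpolation within a known one.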

RESEARCH · DEV.to AI · 13d ago

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

This research introduces Chain-of-Agents, an end-to-end framework for developing agent foundation models. It leverages multi-agent distillation and agentic reinforcement learning to enhance AI agent capabilities.

AI models reinforcement learning machine learning Foundation Models

DOC · Hugging Face Blog · 29d ago

Building Blocks for Foundation Model Training and Inference on AWS

This post covers the essential building blocks for training and serving foundation models on AWS, including the components needed to implement and operate them.

AI training machine learning Foundation Models AWS

NEWS · MIT Tech Review AI · 22d ago

What to expect from Google this week

This story from The Algorithm discusses what to expect from Google's annual I/O developer conference. The event is set to open with Google in a clear third place in the foundation model race.

Developer Conference Foundation Models Google I/O AI

RESEARCH · arXiv CS.LG · 4/13/2026

Distilling Genomic Models for Efficient mRNA Representation Learning via Embedding Matching

This paper introduces a distillation framework to make large genomic foundation models for mRNA representation learning more efficient, reducing model size by 200-fold. By using embedding-level distillation, the smaller model achieves state-of-the-art performance on mRNA-related tasks, demonstrating an effective strategy for scalable biological AI.

mRNA Foundation Models Model Distillation representation learning

RESEARCH · arXiv CS.LG · 4/27/2026

Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

This research presents a multi-layered methodology to accelerate multimodal foundation models (MFMs) through hardware and software co-design. It employs optimization techniques like hierarchy-aware mixed-precision quantization, structural pruning, speculative decoding, and model cascading to reduce computational and memory requirements.

Optimization multimodal AI AI acceleration Foundation Models

RESEARCH · arXiv CS.LG · 4/27/2026

Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning

Mochi is a Graph Foundation Model that improves efficiency and task unification through a meta-learning-based training framework. It pre-trains on few-shot episodes that directly mirror downstream evaluation, addressing the limitations of traditional reconstruction-based pre-training while achieving competitive performance.

Meta-Learning Model Alignment Graph Neural Networks Foundation Models

RESEARCH · arXiv CS.LG · 5/4/2026

AirFM-DDA: Air-Interface Foundation Model in the Delay-Doppler-Angle Domain for AI-Native 6G

AirFM-DDA introduces an Air-interface Foundation Model operating in the Delay-Doppler-Angle (DDA) domain for AI-native 6G physical layer tasks. This model reparameterizes channel state information from the space-time-frequency domain to explicitly resolve multipath components, overcoming the computational overhead of existing global attention mechanisms.

AI-native networks Foundation Models Wireless Communication physical layer

RESEARCH · arXiv CS.LG · 7d ago

Foundation-Preserving Adaptation via Generalized Rayleigh-Quotient Optimization

This paper introduces Foundation Preserving LoRA (FoLoRA), an optimization framework that addresses the degradation of nontarget capabilities during finetuning of foundation models. It uses a generalized Rayleigh quotient to balance task utility and forgetting penalty, guiding updates to preserve pretraining knowledge.

Finetuning neural networks Optimization machine learning

RESEARCH · arXiv CS.LG · 13d ago

AirCast-SR: A Foundation Model for Kilometer-Scale Atmospheric Super-Resolution via Latent Consistency Diffusion

AirCast-SR introduces a foundation model for atmospheric super-resolution, downscaling global AI weather forecasts from 28 km to 1 km resolution for 67-hour predictions of eight surface variables. Utilizing a 3D U-Net within a Latent Consistency Model diffusion framework, it addresses computational limits of traditional NWP models.

Foundation Models AI super-resolution weather prediction