Foundation Models

24 items

RESEARCH · ↑ trending · Reddit r/LocalLLaMA · 25d ago

internlm/Intern-S2-Preview · Hugging Face

Intern-S2-Preview is an efficient 35B scientific multimodal foundation model that achieves performance comparable to trillion-scale models by exploring task scaling and full-chain training. It excels in hundreds of professional scientific tasks while maintaining strong general reasoning, multimodal understanding, and agent capabilities.

42
RESEARCH · arXiv CS.LG · 13d ago

TSFMAudit: Data Contamination Auditing in Forecasting Time Series Foundation Models

This work introduces TSFMAudit, a method for auditing pretraining data contamination in Time Series Foundation Models (TSFMs). It flags evaluation datasets that were exposed during pretraining, which leads to overly optimistic performance estimates, by detecting unusually efficient adaptation during fine-tuning. The study evaluates TSFMAudit on 6 TSFMs and 187 datasets, addressing a previously unstudied challenge in pretraining contamination auditing for TSFMs.

29
RESEARCH · arXiv CS.CL · 4/6/2026

SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models

SocioEval is a template-based framework for systematically evaluating socioeconomic status bias in foundation models, including LLMs, an underexplored area. The study evaluated 13 LLMs and found substantial variation in bias rates (0.42% to 33.75%), with the bias manifesting differently across topics.

29
RESEARCH · arXiv CS.LG · 25d ago

Mechanistic Interpretability of EEG Foundation Models via Sparse Autoencoders

This paper explores the mechanistic interpretability of EEG foundation models by applying TopK Sparse Autoencoders (SAEs) to extract sparse feature dictionaries from their embeddings. It benchmarks monosemanticity and entanglement across different EEG transformer architectures, grounds these features in a clinical taxonomy, and introduces concept steering to quantify selectivity and expose representational failures.

28
ARTICLE · DEV.to AI · 29d ago

White Paper FM v Public API

This article compares Apple's Foundation Models white paper with its actual API surface, highlighting a significant discrepancy between advertised capabilities and exposed functionalities. The author notes that the white paper describes an ambitious multimodal system, whereas the API exposes only a fraction of that functionality.

27
RESEARCH · arXiv CS.CL · 4/13/2026

A Representation-Level Assessment of Bias Mitigation in Foundation Models

This research investigates how bias mitigation reshapes the embedding space of encoder-only and decoder-only foundation models like BERT and Llama2. Findings show that bias mitigation reduces gender-occupation disparities in the embedding space, leading to more neutral internal representations, confirming embedding analysis as a valuable debiasing validation tool.

27
RESEARCH · arXiv CS.LG · 28d ago

Do Foundation Model Embeddings Improve Cross-Country Crop Yield Generalisation? A Leave-One-Country-Out Evaluation in Sub-Saharan Africa

This paper evaluates whether geospatial foundation model embeddings improve cross-country maize yield predictions in Sub-Saharan Africa. It finds that within-country prediction accuracy is moderate, but all feature sets, including foundation model embeddings, perform poorly under leave-one-country-out testing, indicating a significant generalisability gap.

27
RESEARCH · arXiv CS.LG · 4/13/2026

Distilling Genomic Models for Efficient mRNA Representation Learning via Embedding Matching

This paper introduces a distillation framework that makes large genomic foundation models for mRNA representation learning more efficient, reducing model size 200-fold. Using embedding-level distillation, the smaller model achieves state-of-the-art performance on mRNA-related tasks, demonstrating an effective strategy for scalable biological AI.

27
RESEARCH · arXiv CS.LG · 4/27/2026

Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

This research presents a multi-layered methodology to accelerate multimodal foundation models (MFMs) through hardware and software co-design. It employs optimization techniques like hierarchy-aware mixed-precision quantization, structural pruning, speculative decoding, and model cascading to reduce computational and memory requirements.

27
RESEARCH · arXiv CS.LG · 4/27/2026

Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning

Mochi is a Graph Foundation Model that improves efficiency and task unification by employing a meta-learning based training framework. It pre-trains on few-shot episodes directly mirroring downstream evaluation, addressing limitations of traditional reconstruction-based pre-training and achieving competitive performance.

27
RESEARCH · arXiv CS.LG · 5/4/2026

AirFM-DDA: Air-Interface Foundation Model in the Delay-Doppler-Angle Domain for AI-Native 6G

AirFM-DDA introduces an Air-interface Foundation Model operating in the Delay-Doppler-Angle (DDA) domain for AI-native 6G physical layer tasks. This model reparameterizes channel state information from the space-time-frequency domain to explicitly resolve multipath components, overcoming the computational overhead of existing global attention mechanisms.

27
RESEARCH · arXiv CS.LG · 13d ago

AirCast-SR: A Foundation Model for Kilometer-Scale Atmospheric Super-Resolution via Latent Consistency Diffusion

AirCast-SR introduces a foundation model for atmospheric super-resolution, downscaling global AI weather forecasts from 28 km to 1 km resolution for 67-hour predictions of eight surface variables. Utilizing a 3D U-Net within a Latent Consistency Model diffusion framework, it addresses computational limits of traditional NWP models.

27