
Autoregressive Models

5 items

RESEARCH · arXiv CS.CL · 12d ago

From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons

FLUID is a framework for efficiently adapting autoregressive (AR) backbones to the diffusion paradigm for parallel text generation. It can be initialized from GPT-style models and introduces Elastic Horizons, an entropy-driven mechanism; the authors report state-of-the-art performance at significantly reduced training cost.

28
RESEARCH · arXiv CS.CL · 26d ago

Differences in Text Generated by Diffusion and Autoregressive Language Models

This study examines the intrinsic differences between text generated by Diffusion Language Models (DLMs) and Autoregressive Language Models (ARMs), finding that DLM text shows lower n-gram entropy but higher semantic coherence and diversity. Controlled experiments attribute the gains in coherence and diversity to DLM training objectives, which use bidirectional context, and the entropy reduction to the decoding algorithms.
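
For readers unfamiliar with the n-gram entropy measure mentioned in this summary, here is a minimal sketch of one common way to compute it: the Shannon entropy (in bits) of the empirical n-gram distribution of a token sequence. The function name `ngram_entropy`, the whitespace tokenization, and the sample sentences are illustrative assumptions; the paper may define or normalize the metric differently.

```python
import math
from collections import Counter


def ngram_entropy(tokens, n=2):
    """Shannon entropy (bits) of the empirical n-gram distribution.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Slide a window of width n over the token list.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n: no n-grams to measure
    counts = Counter(ngrams)
    total = len(ngrams)
    # H = -sum p * log2(p) over the observed n-grams.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy inputs (assumed, whitespace-tokenized): repeated phrasing vs. varied phrasing.
repetitive = "the cat sat the cat sat the cat sat".split()
varied = "the cat sat on a warm mat by the door".split()

print(ngram_entropy(repetitive, n=2))
print(ngram_entropy(varied, n=2))
```

Under this definition, text that reuses the same word sequences scores lower than text whose n-grams are mostly distinct, which is the sense in which lower n-gram entropy indicates more repetitive surface patterns.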

27
RESEARCH · arXiv CS.AI · 24d ago

Conditional Attribute Estimation with Autoregressive Sequence Models

This research introduces Conditional Attribute Transformers, a method for jointly estimating the next-token probability and the value of an attribute conditional on each candidate next token. The framework enables capabilities such as per-token credit assignment and counterfactual analysis within a single forward pass, overcoming limitations of traditional generative models.

27