jina-embeddings-v5-omni: Geometry-preserving Embeddings via Locked Aligned Towers
arXiv cs.CL · May 12, 2026
This work introduces GELATO, an approach to multimodal embedding models that extends vision-language-model (VLM) style architectures. The resulting jina-embeddings-v5-omni suite efficiently encodes text, image, audio, and video into a single semantic embedding space by freezing the backbone text models and training only the connecting components.
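The idea of freezing a text backbone and training only a connector can be illustrated with a toy sketch. This is not the GELATO implementation: the linear "text tower", the linear connector, the squared-error alignment loss, and every name below (`TEXT_TOWER`, `train_connector`, `make_toy_pairs`) are assumptions made for illustration, since the summary does not state the training objective or the architecture of the connecting components.

```python
import copy
import random


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def train_connector(frozen_text_weights, pairs, lr=0.1, epochs=300, seed=0):
    """Fit a linear connector that maps non-text features onto frozen text embeddings.

    pairs: list of (text_features, other_features) describing the same item.
    Only the connector is updated; frozen_text_weights is read, never written.
    Returns (connector, mean_loss_per_epoch).
    """
    rng = random.Random(seed)
    dim_out = len(frozen_text_weights)
    dim_in = len(pairs[0][1])
    connector = [[rng.uniform(-0.1, 0.1) for _ in range(dim_in)]
                 for _ in range(dim_out)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for text_feat, other_feat in pairs:
            target = matvec(frozen_text_weights, text_feat)  # frozen tower output
            pred = matvec(connector, other_feat)             # trainable path
            err = [p - t for p, t in zip(pred, target)]
            total += sum(e * e for e in err)
            # Gradient step on the squared error, applied to the connector only.
            for i in range(dim_out):
                for j in range(dim_in):
                    connector[i][j] -= lr * 2.0 * err[i] * other_feat[j]
        losses.append(total / len(pairs))
    return connector, losses


def make_toy_pairs(n=8, dim=3, seed=1):
    """Hypothetical paired data: the 'other modality' is the text features reversed."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        text_feat = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        pairs.append((text_feat, list(reversed(text_feat))))
    return pairs


# Hypothetical frozen text tower: a fixed 2x3 projection.
TEXT_TOWER = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
snapshot = copy.deepcopy(TEXT_TOWER)

pairs = make_toy_pairs()
connector, losses = train_connector(TEXT_TOWER, pairs)

print(f"alignment loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
print("text tower unchanged:", TEXT_TOWER == snapshot)
```

Because the text tower is never updated, the geometry of the text embedding space stays fixed, and the connector alone has to move the other modality into it. That is the property the sketch is meant to show, under the assumptions stated above.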