RESEARCH · arXiv CS.CL · 25d ago
VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model with Curriculum Learning and Native Tool Use
VectraYX-Nano is a 42M-parameter Spanish language model built for cybersecurity, with a Latin-American focus and native tool invocation. The paper describes how the model was trained from scratch, covering a custom 170M-token Spanish corpus, the Transformer architecture, and a curriculum learning schedule with replay.
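To make "curriculum learning with replay" concrete, here is a minimal sketch of one common way to schedule it. It is not taken from the paper: the stage names, the 20% replay fraction, the step counts, and the function `curriculum_with_replay` are all assumptions chosen for illustration. The idea is that training moves through data stages in order, and during later stages a fraction of steps revisit earlier stages so that previously learned material is not forgotten.

```python
import random


def curriculum_with_replay(stages, steps_per_stage, replay_fraction=0.2, seed=0):
    """Return a list of stage names, one per training step.

    Stages are visited in order. Within any stage after the first, each step
    draws from an earlier stage with probability `replay_fraction`; otherwise
    it draws from the current stage.
    """
    rng = random.Random(seed)
    schedule = []
    for i, stage in enumerate(stages):
        for _ in range(steps_per_stage):
            if i > 0 and rng.random() < replay_fraction:
                # Replay: revisit data from a previously completed stage.
                schedule.append(rng.choice(stages[:i]))
            else:
                # Normal step: sample from the current curriculum stage.
                schedule.append(stage)
    return schedule


# Hypothetical stage names, for illustration only (not from the paper).
stages = ["general_spanish", "security_text", "tool_use"]
schedule = curriculum_with_replay(stages, steps_per_stage=1000)
```

In a real training loop, each entry in `schedule` would select which dataset the next batch is drawn from. The actual stage definitions and replay ratio used for VectraYX-Nano would have to come from the paper itself.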