ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts

arXiv cs.CL · May 4, 2026

This article introduces ViLegalNLI, the first large-scale Vietnamese Natural Language Inference (NLI) dataset built specifically for the legal domain. The dataset contains 42,012 premise-hypothesis pairs derived from official statutory documents. It was built with a semi-automatic framework that uses large language models both to generate hypotheses and to validate their quality.
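
To make the premise-hypothesis structure and the generate-then-validate idea concrete, here is a minimal sketch rather than the authors' actual pipeline. The three-way label set, the `NLIPair` record, `build_pairs`, and the stand-in `fake_generate` and `fake_validate` callables are all assumptions introduced for illustration; the summary does not specify the label scheme, prompts, or models used.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Assumption: the standard three-way NLI label set; the summary does not
# state which labels ViLegalNLI actually uses.
LABELS = ("entailment", "contradiction", "neutral")


@dataclass(frozen=True)
class NLIPair:
    premise: str     # text taken from a statutory document
    hypothesis: str  # generated statement to judge against the premise
    label: str       # one of LABELS


def build_pairs(
    premises: Iterable[str],
    generate: Callable[[str, str], str],  # hypothetical LLM-backed generator
    validate: Callable[[NLIPair], bool],  # hypothetical LLM-backed quality check
) -> List[NLIPair]:
    """Generate one hypothesis per (premise, label) and keep validated pairs."""
    kept: List[NLIPair] = []
    for premise in premises:
        for label in LABELS:
            pair = NLIPair(premise, generate(premise, label), label)
            if validate(pair):
                kept.append(pair)
    return kept


# Stand-in callables so the sketch runs without any model access.
def fake_generate(premise: str, label: str) -> str:
    return f"[{label}] {premise}"


def fake_validate(pair: NLIPair) -> bool:
    # Arbitrary filter, for illustration only.
    return pair.label != "neutral"


pairs = build_pairs(["Placeholder statutory clause."], fake_generate, fake_validate)
print(len(pairs))
```

Passing the generator and validator in as callables keeps the loop independent of any particular model, so a real LLM call or a human review step could be swapped in at either stage.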

Dataset · Legal AI · Natural Language Inference · Vietnamese NLI · large language models
