RESEARCH · arXiv CS.CL · 20d ago
Improving Quantized Model Performance in Qualitative Analysis with Multi-Pass Prompt Verification
This research examines how different low-bit quantization levels affect LLaMA-3.1's performance in qualitative analysis, noting that low-bit models often hallucinate. It proposes a quantization-aware multi-pass prompt verification method to improve accuracy by systematically reducing hallucinations and filtering out unreliable content.
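The summary does not spell out the verification procedure, so the following is only a minimal sketch of what a multi-pass verification loop could look like, not the paper's method. It assumes a first pass that drafts candidate themes and later passes that re-ask the model whether each theme is supported by the source text, keeping only themes confirmed in every pass. The names `multi_pass_verify`, `generate`, and `fake_model`, the prompt wording, and the unanimous-vote rule are all hypothetical. How the quantization level would enter the procedure is not stated, so it is not modeled here.

```python
from typing import Callable, List


def multi_pass_verify(
    source_text: str,
    generate: Callable[[str], str],  # hypothetical: any prompt -> completion function
    verify_passes: int = 2,
) -> List[str]:
    """Draft candidate themes, then keep only those confirmed in every verification pass."""
    # Pass 0: ask the model for candidate themes, one per line.
    draft = generate(
        f"List themes found in the text, one per line.\n\nTEXT:\n{source_text}"
    )
    candidates = [line.strip() for line in draft.splitlines() if line.strip()]

    kept: List[str] = []
    for theme in candidates:
        confirmations = 0
        for i in range(verify_passes):
            # Each verification pass re-checks the theme against the source text.
            answer = generate(
                f"Pass {i + 1}. Answer YES or NO: is the theme '{theme}' "
                f"directly supported by the text?\n\nTEXT:\n{source_text}"
            )
            if answer.strip().upper().startswith("YES"):
                confirmations += 1
        # Assumed filtering rule: drop anything not confirmed unanimously.
        if confirmations == verify_passes:
            kept.append(theme)
    return kept


def fake_model(prompt: str) -> str:
    """Deterministic stand-in for a quantized LLM, used only to exercise the loop."""
    if prompt.startswith("List themes"):
        # The second theme plays the role of a hallucination.
        return "workload\nalien contact\n"
    theme = prompt.split("'")[1]
    text = prompt.split("TEXT:\n", 1)[1]
    return "YES" if theme in text else "NO"


if __name__ == "__main__":
    excerpt = "Nurses described heavy workload and short staffing."
    print(multi_pass_verify(excerpt, fake_model))
```

In practice `generate` would wrap a call to the quantized model. The stub only illustrates how a theme with no support in the source text gets filtered out.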