Text-as-Signal: Quantitative Semantic Scoring with Embeddings, Logprobs, and Noise Reduction
This paper introduces a practical pipeline that converts text corpora into quantitative semantic signals using embeddings, logprob-based evaluation, and noise reduction. A case study scores Portuguese news articles about AI along six semantic dimensions, supporting AI engineering tasks such as corpus inspection and monitoring.
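
To make the three ingredients concrete, the following is a minimal sketch rather than the pipeline itself. It assumes that embeddings are compared by cosine similarity, that a logprob-based score is the expected value of a numeric rating under the model's label probabilities, and that noise reduction can be approximated by a trailing moving average. All function names (`cosine`, `logprob_score`, `moving_average`) and the toy inputs are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def logprob_score(label_logprobs):
    """Expected numeric rating from a {rating: log-probability} mapping.

    Assumption: the probabilities are renormalized over the labels provided,
    so the input does not need to sum to one.
    """
    peak = max(label_logprobs.values())  # subtract the max for numerical stability
    weights = {k: math.exp(lp - peak) for k, lp in label_logprobs.items()}
    total = sum(weights.values())
    return sum(k * w for k, w in weights.items()) / total


def moving_average(values, window=3):
    """Trailing moving average; early points average over the values seen so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    # Toy vectors standing in for an article embedding and a dimension anchor.
    print(cosine([0.2, 0.8, 0.1], [0.1, 0.9, 0.0]))
    # Toy log-probabilities for ratings 1 to 3 on one semantic dimension.
    print(logprob_score({1: math.log(0.2), 2: math.log(0.5), 3: math.log(0.3)}))
    # Toy per-article scores, smoothed to damp article-level noise.
    print(moving_average([0.4, 0.9, 0.3, 0.7, 0.6], window=3))
```

In this sketch each article would yield one score per semantic dimension, and smoothing a chronologically ordered series of such scores gives a signal suited to monitoring. The choice of a trailing window is only one option; a centered window or a different filter would serve the same role.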