Adaptive Context-Aware Embedding Filtering: Balancing Factual Accuracy and Creative Freedom in Language Model Decoding

Tanvi

doi:10.5281/zenodo.20573421

Independent Research · Preprint

A-CAEF: Adaptive Context-Aware Embedding Filtering for Factuality-Aware Language Model Decoding

A lightweight decoding-time method using context-aware semantic drift signals and entropy-aware adaptation to guide token selection in large language models.

Published preprint title: Adaptive Context-Aware Embedding Filtering: Balancing Factual Accuracy and Creative Freedom in Language Model Decoding

Tanvi · Published 6 June 2026 · Independent Research

Open research

Read the preprint and code

DOI: https://doi.org/10.5281/zenodo.20573421

Abstract

A decoding-time approach to factual reliability

A-CAEF, or Adaptive Context-Aware Embedding Filtering, is a lightweight decoding-time method for improving factual reliability in large language models. It uses context-aware semantic drift signals and entropy-aware adaptation to guide token selection without retraining the model, modifying model parameters, or relying on external retrieval.
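To make the idea of entropy-scaled drift filtering concrete, here is a minimal sketch of one way such a re-scoring step could look. It is not the preprint's algorithm: the drift definition (one minus cosine similarity to a context vector), the rule that the penalty grows with normalized entropy, and every name here (`drift_filtered_logits`, `base_strength`, the toy embeddings) are assumptions chosen purely for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(vocab size), so the result lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def drift_filtered_logits(logits, token_embeddings, context_embedding,
                          base_strength=2.0):
    """Hypothetical drift-aware re-scoring of next-token logits.

    Assumptions (NOT taken from the preprint):
      * drift of a token = 1 - cosine(token embedding, context embedding)
      * penalty strength = base_strength * normalized entropy, so a
        confident (low-entropy) step is left almost untouched
    """
    strength = base_strength * normalized_entropy(softmax(logits))
    return [
        logit - strength * (1.0 - cosine(emb, context_embedding))
        for logit, emb in zip(logits, token_embeddings)
    ]

# Toy example: four candidate tokens with 2-d embeddings (made-up values).
toy_logits = [0.0, 0.0, 0.0, 0.0]
toy_embeddings = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
toy_context = [1.0, 0.0]
print(drift_filtered_logits(toy_logits, toy_embeddings, toy_context))
```

In this sketch the model's weights are never touched: only the logits for a single decoding step are adjusted, which is what makes an approach of this kind usable without retraining or retrieval. The direction of the entropy adaptation is a design choice and could reasonably be reversed.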

This preprint introduces A-CAEF as an inference-time strategy for reducing factual drift while preserving creative flexibility in language model generation. The paper reports preliminary evaluation on TruthfulQA using a Qwen-based decoder model and compares A-CAEF against nucleus sampling.
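For readers unfamiliar with the baseline, nucleus (top-p) sampling can be sketched as follows. This is the standard technique in outline, not the evaluation code from the preprint; the threshold `top_p=0.9` and the function names are illustrative assumptions, since the page does not state the settings used.

```python
import random

def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of highest-probability tokens whose cumulative
    mass reaches top_p, then renormalize over that set.

    Returns a dict mapping token index -> renormalized probability.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

def nucleus_sample(probs, top_p=0.9, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, top_p)
    ids = list(nucleus)
    weights = [nucleus[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]

# Toy distribution over four tokens (made-up values).
toy_probs = [0.5, 0.3, 0.15, 0.05]
print(sorted(nucleus_filter(toy_probs, top_p=0.9)))  # indices kept in the nucleus
print(nucleus_sample(toy_probs, top_p=0.9, rng=random.Random(0)))
```

Nucleus sampling truncates only by probability mass, so it has no notion of whether a candidate token is semantically consistent with the context. That gap is what a drift-aware method is meant to address.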

In this setting, A-CAEF improves the joint TruthInfo metric by 6.67 absolute points, corresponding to a 28.6% relative gain over the baseline, while maintaining similar ROUGE-L and BLEU scores.
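The two reported figures can be cross-checked against each other with a little arithmetic. The sketch below back-computes the scores they imply. These values are derived, not quoted from the preprint, and are approximate because the reported numbers are rounded; the function name is an illustrative assumption.

```python
def implied_scores(absolute_gain, relative_gain):
    """Back out the baseline and improved scores from the two reported gains.

    relative_gain = absolute_gain / baseline  =>  baseline = absolute_gain / relative_gain
    """
    baseline = absolute_gain / relative_gain
    return baseline, baseline + absolute_gain

# 6.67 absolute points and a 28.6% relative gain, as reported above.
baseline, improved = implied_scores(6.67, 0.286)
print(f"implied baseline: about {baseline:.1f}, implied A-CAEF: about {improved:.1f}")
# prints "implied baseline: about 23.3, implied A-CAEF: about 30.0"
```

Read this way, the reported gain corresponds to a move from roughly 23 to roughly 30 on the joint TruthInfo metric.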

Scope

What this early result does and does not claim

The evaluation is preliminary and limited to the reported TruthfulQA setting and Qwen-based decoder model. Broader validation across models, datasets and decoding baselines remains future work.

A-CAEF is not presented as a complete solution to hallucination. The narrower question is whether context-aware, entropy-adaptive constraints can improve joint truthfulness and informativeness at decoding time without retraining or external retrieval.

Citation

Cite this preprint

Tanvi. (2026). Adaptive Context-Aware Embedding Filtering: Balancing Factual Accuracy and Creative Freedom in Language Model Decoding (Version 1). Zenodo. https://doi.org/10.5281/zenodo.20573421

BibTeX

@misc{tanvi2026acaef,
  author = {Tanvi},
  title = {Adaptive Context-Aware Embedding Filtering: Balancing Factual Accuracy and Creative Freedom in Language Model Decoding},
  year = {2026},
  publisher = {Zenodo},
  version = {1},
  doi = {10.5281/zenodo.20573421},
  url = {https://doi.org/10.5281/zenodo.20573421}
}
