Semantically Enriched Text Generation for QA through Dense Paraphrasing

Published in ICNLSP 2024

Large Language Models (LLMs) are highly effective at extractive language tasks such as Question Answering (QA). While LLMs can improve their performance on these tasks by increasing model size (via massive pretraining) and/or through inference-time techniques such as one-shot, few-shot, and chain-of-thought prompting, we explore less resource-intensive and more efficient forms of data augmentation that yield similar performance gains. We define multiple forms of Dense Paraphrasing (DP) and obtain DP-enriched versions of different contexts. We demonstrate that performing QA over these semantically enriched contexts improves performance for models of various sizes and across task domains, without increasing model size.
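For a concrete sense of where DP enrichment sits in a QA pipeline, below is a minimal Python sketch. The `dense_paraphrase` and `answer_question` helpers are hypothetical placeholders used only to illustrate the interface; they are not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): compare QA over a raw context
# with QA over a Dense-Paraphrased (DP) version of the same context.

def dense_paraphrase(context: str) -> str:
    """Hypothetical stand-in for a DP enrichment step that makes implicit
    semantic information in the text explicit. Here it simply returns the
    context unchanged to demonstrate the interface."""
    # A real DP system would rewrite the text, e.g. spelling out elided
    # arguments or event participants.
    return context

def answer_question(context: str, question: str) -> str:
    """Hypothetical stand-in for a QA model or an LLM prompted with the context."""
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    return prompt  # a real system would pass this prompt to a model

if __name__ == "__main__":
    raw_context = "The chef sliced the onion and added it to the pan."
    question = "What was added to the pan?"

    # Baseline: QA over the original context.
    baseline = answer_question(raw_context, question)

    # DP-enriched: QA over the semantically enriched context.
    enriched = answer_question(dense_paraphrase(raw_context), question)

    print(baseline)
    print(enriched)
```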

Timothy Obiso, Bingyang Ye, Kyeongmin Rim, and James Pustejovsky. 2024. Semantically Enriched Text Generation for QA through Dense Paraphrasing. In Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024), pages 279–286, Trento. Association for Computational Linguistics.

https://aclanthology.org/2024.icnlsp-1.30/