SharP: Soft and hard prompt-guided augmentation with LLM for low resource fake news detection

Text - Accepted Version
· Restricted to Repository staff only until 19 January 2027.
· Available under License Creative Commons Attribution Non-commercial No Derivatives.

It is advisable to refer to the publisher's version if you intend to cite from this work.

Luo, Z., Li, W., Huang, H., Liu, K. and Gao, M. (2026) SharP: Soft and hard prompt-guided augmentation with LLM for low resource fake news detection. Expert Systems with Applications, 309. ISSN 0957-4174 doi: 10.1016/j.eswa.2026.131178

Abstract/Summary

Fake news detection under low resource conditions is challenged by the scarcity of labeled data and the difficulty of capturing subtle deceptive patterns. Existing data augmentation methods, such as synonym substitution, generative adversarial networks, and large language model (LLM)-based generation, often produce fluent but overly generic content, lacking the task-specific relevance needed for accurate detection. To address this, we propose SharP, a boundary-aware text generation framework guided by soft and hard prompts, which leverages LLMs as tools adapted to produce task-specific synthetic samples. SharP employs a dual prompt mechanism: hard prompts provide structured guidance and factual constraints, while soft prompts introduce learned semantic patterns from the data. These two types of prompts are combined to guide the generation process, enabling the model to produce samples that closely align with the characteristics of news. To further enhance detection performance, we incorporate a boundary-aware strategy that steers generation toward areas where the classifier is less confident, helping to clarify subtle distinctions between real and fake content. In addition, we adopt a dual objective optimization that balances semantic alignment with the source data and classifier feedback. This encourages the generation of samples that are both domain-consistent and helpful for refining decision boundaries, ultimately improving model robustness and generalization in low resource fake news detection.
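As a rough illustration of the dual objective described in the abstract, the sketch below scores already-generated candidate samples by a weighted sum of semantic alignment with a source item and classifier uncertainty. It is not the authors' method: the abstract describes optimization during generation, whereas this is a simple post-hoc ranking. The function names, the cosine and entropy measures, the weight alpha, and the toy inputs are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def binary_entropy(p_fake):
    """Entropy in bits of a binary prediction; peaks at 1.0 when p_fake is 0.5."""
    if p_fake <= 0.0 or p_fake >= 1.0:
        return 0.0
    return -(p_fake * math.log2(p_fake) + (1.0 - p_fake) * math.log2(1.0 - p_fake))


def dual_objective_score(source_vec, candidate_vec, p_fake, alpha=0.5):
    """Hypothetical score: alpha weights alignment, (1 - alpha) weights uncertainty."""
    alignment = cosine_similarity(source_vec, candidate_vec)
    uncertainty = binary_entropy(p_fake)
    return alpha * alignment + (1.0 - alpha) * uncertainty


def rank_candidates(source_vec, candidates, alpha=0.5):
    """Rank (text, embedding, p_fake) tuples from highest to lowest score."""
    scored = [
        (dual_objective_score(source_vec, emb, p_fake, alpha), text)
        for text, emb, p_fake in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


# Toy inputs (assumed): both candidates align equally with the source,
# but the classifier is far less certain about the second one.
source = [1.0, 0.0]
candidates = [
    ("confident sample", [1.0, 0.0], 0.99),
    ("borderline sample", [1.0, 0.0], 0.50),
]
ranking = rank_candidates(source, candidates)
```

With equal alignment, the candidate nearer the decision boundary ranks first, which mirrors the stated intent of steering generation toward regions where the classifier is less confident.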

Item Type: Article
URI: https://centaur.reading.ac.uk/id/eprint/128036
Identification Number/DOI: 10.1016/j.eswa.2026.131178
Refereed: Yes
Divisions: Henley Business School > Digitalisation, Marketing and Entrepreneurship
Publisher: Elsevier