Learning a strategy for preference elicitation in conversational recommender systems

Makarova, A., Shahzad, M. (ORCID: https://orcid.org/0009-0002-9394-343X), Hong, X. (ORCID: https://orcid.org/0000-0002-6832-2298) and Lester, M. (ORCID: https://orcid.org/0000-0002-2323-1771) (2024) Learning a strategy for preference elicitation in conversational recommender systems. In: International Joint Conference on Neural Networks (IJCNN), 30 Jun - 5 Jul 2024, Yokohama, Japan. DOI: https://doi.org/10.1109/ijcnn60899.2024.10650365
Abstract

This paper addresses the preference elicitation aspect of Conversational Recommender Systems (CRS), presenting a method for selecting the chatbot questions that yield the highest information gain when reconstructing a user's preference profile, enabling high-quality recommendations after only a small number of conversational interactions. The proposed system comprises a Recommendation Module and a Preference Elicitation Module. The Recommendation Module leverages a Long Short-Term Memory (LSTM) network with an attention mechanism and is optimised to reconstruct the preference profiles of new users from the limited information gathered through dialogue. The Preference Elicitation Module is trained with a reinforcement learning technique known as bot-play: the Questioner Bot proactively prompts the Answerer Bot for item and attribute ratings, using the reduction in the Recommendation model's loss as the reward signal. This enables the model to learn an optimal questioning strategy, maximising both the accuracy of the user-profile representation and the relevance of the recommendations. The experimental results demonstrate that the Recommendation component learns item-attribute mappings, enabling the Questioner Bot to make accurate rating predictions from only a limited number of answered questions. Moreover, the trained Preference Elicitation policy consistently outperforms the baseline model on both synthetic and real-world datasets, minimising the number of conversational turns required to achieve accurate recommendations.
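The bot-play training signal described in the abstract — rewarding the Questioner Bot by the drop in the Recommendation model's reconstruction loss after each answered question — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the `RecommendationModel` stand-in, the toy loss (fraction of the profile still unknown), and the greedy `policy` are all hypothetical assumptions made here for clarity.

```python
class RecommendationModel:
    """Toy stand-in for the paper's LSTM+Attention recommender (hypothetical).
    Loss here is simply the fraction of the user's profile still unobserved."""

    def __init__(self, true_profile):
        self.true_profile = true_profile  # Answerer Bot's hidden ratings
        self.known = {}                   # ratings revealed so far in dialogue

    def loss(self):
        unknown = [k for k in self.true_profile if k not in self.known]
        return len(unknown) / len(self.true_profile)

    def observe(self, item, rating):
        self.known[item] = rating


def bot_play_episode(items, true_profile, policy, n_turns=3):
    """One bot-play episode: the Questioner asks, the Answerer replies,
    and the per-turn reward is the reduction in the recommender's loss."""
    model = RecommendationModel(true_profile)
    rewards, asked = [], set()
    for _ in range(n_turns):
        loss_before = model.loss()
        item = policy(items, asked)                # Questioner picks a question
        asked.add(item)
        model.observe(item, true_profile[item])    # Answerer Bot provides the rating
        rewards.append(loss_before - model.loss()) # reward = loss reduction
    return rewards


# Usage: a simple ask-anything-unasked policy over a 4-item profile.
profile = {"a": 5, "b": 3, "c": 1, "d": 4}
policy = lambda items, asked: next(i for i in items if i not in asked)
rewards = bot_play_episode(list(profile), profile, policy, n_turns=2)
```

In the paper's actual setup these per-turn rewards would drive a reinforcement-learning update of the questioning policy; here each answered question reveals one of four profile entries, so each turn reduces the toy loss by 0.25.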