Learning a strategy for preference elicitation in conversational recommender systems

Makarova, A., Shahzad, M., Hong, X. and Lester, M. ORCID: https://orcid.org/0000-0002-2323-1771 (2024) Learning a strategy for preference elicitation in conversational recommender systems. In: IEEE World Congress on Computational Intelligence (IEEE WCCI 2024), 30 Jun - 5 Jul 2024, Yokohama, Japan. (In Press)


Abstract/Summary

This paper addresses the information elicitation aspect of Conversational Recommender Systems (CRS), presenting a method for selecting the chatbot questions that yield the highest information gain when reconstructing a user's preference profile, so that high-quality recommendations can be reached after a small number of conversational interactions. The proposed system comprises a Recommendation Module and a Preference Elicitation Module. The Recommendation Module leverages a Long Short-Term Memory (LSTM) network with an attention mechanism and is optimised to reconstruct the preference profiles of new users from the limited information gathered through dialogue. The Preference Elicitation Module is trained with a reinforcement learning technique known as bot-play, in which the Questioner Bot proactively prompts the Answerer Bot for item and attribute ratings, using the reduction in the Recommendation model's loss as the reward signal. This enables the model to learn an optimal questioning strategy, maximising both the accuracy of the user-profile representation and the relevance of the recommendations. Experimental results demonstrate that the Recommendation component learns item-attribute mappings, enabling the Questioner Bot to make accurate rating predictions from only a limited number of answered questions. Moreover, the trained Preference Elicitation policy model consistently outperforms the baseline model on both synthetic and real-world datasets, showing that it minimises the number of conversational turns required to achieve accurate recommendations.
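
The abstract describes the core training signal: each question is rewarded by how much the Answerer Bot's reply reduces the Recommendation model's reconstruction loss. The sketch below is only an illustration of that idea, not the authors' implementation; it assumes a PyTorch setting, and all names (RecommendationModule, reconstruction_loss, bot_play_reward) are hypothetical.

```python
# Minimal sketch of the bot-play reward described in the abstract.
# Assumption: ratings elicited so far are encoded as (item_id, rating) turns,
# and the reward for a question is the drop in reconstruction loss after the
# Answerer Bot's reply is appended to the dialogue history.
import torch
import torch.nn as nn


class RecommendationModule(nn.Module):
    """Hypothetical LSTM-with-attention model that reconstructs a user's
    full preference profile from the ratings elicited so far."""

    def __init__(self, num_items: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_items, embed_dim)
        self.lstm = nn.LSTM(embed_dim + 1, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)
        self.out = nn.Linear(hidden_dim, num_items)

    def forward(self, item_ids: torch.Tensor, ratings: torch.Tensor) -> torch.Tensor:
        # item_ids, ratings: (batch, turns)
        x = torch.cat([self.embed(item_ids), ratings.unsqueeze(-1)], dim=-1)
        h, _ = self.lstm(x)                      # (batch, turns, hidden)
        w = torch.softmax(self.attn(h), dim=1)   # attention over dialogue turns
        ctx = (w * h).sum(dim=1)                 # (batch, hidden)
        return self.out(ctx)                     # predicted ratings for all items


def reconstruction_loss(model, item_ids, ratings, true_profile):
    """MSE between the reconstructed profile and the held-out full profile."""
    return nn.functional.mse_loss(model(item_ids, ratings), true_profile)


def bot_play_reward(model, history, answer, true_profile):
    """Reward for one question: reduction in reconstruction loss once the new
    (item, rating) answer from the Answerer Bot is added to the history."""
    items, ratings = history                     # each of shape (1, turns)
    loss_before = reconstruction_loss(model, items, ratings, true_profile)
    new_items = torch.cat([items, answer[0].view(1, 1)], dim=1)
    new_ratings = torch.cat([ratings, answer[1].view(1, 1)], dim=1)
    loss_after = reconstruction_loss(model, new_items, new_ratings, true_profile)
    return (loss_before - loss_after).item()
```

In this reading, the Questioner Bot's policy would be trained (e.g. with a policy-gradient method) to choose questions that maximise this per-turn reward, so that informative questions are asked first and fewer turns are needed before the reconstructed profile supports accurate recommendations.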

Item Type: Conference or Workshop Item (Paper)
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 116044
