Learning a strategy for preference elicitation in conversational recommender systems

Makarova, A., Shahzad, M. (ORCID: https://orcid.org/0009-0002-9394-343X), Hong, X. (ORCID: https://orcid.org/0000-0002-6832-2298) and Lester, M. (ORCID: https://orcid.org/0000-0002-2323-1771) (2024) Learning a strategy for preference elicitation in conversational recommender systems. In: International Joint Conference on Neural Networks (IJCNN), 30 Jun - 5 Jul 2024, Yokohama, Japan. DOI: https://doi.org/10.1109/ijcnn60899.2024.10650365
Abstract

This paper addresses the preference elicitation aspect of Conversational Recommender Systems (CRS), presenting a method for selecting the chatbot questions that yield the highest information gain when reconstructing a user's preference profile, enabling high-quality recommendations after only a small number of conversational interactions. The proposed system comprises a Recommendation Module and a Preference Elicitation Module. The Recommendation Module leverages a Long Short-Term Memory (LSTM) network with an attention mechanism and is optimised to reconstruct the preference profiles of new users from the limited information gathered through dialogue. The Preference Elicitation Module is trained with a reinforcement learning technique known as bot-play: the Questioner Bot proactively prompts the Answerer Bot for item and attribute ratings, using the reduction in the Recommendation model's loss as the reward signal. This enables the model to learn an optimal questioning strategy, maximising both the accuracy of the user-profile representation and the relevance of the recommendations. The experimental results demonstrate that the Recommendation component learns item-attribute mappings, enabling the Questioner Bot to make accurate rating predictions from only a limited number of answered questions. Moreover, the trained Preference Elicitation policy consistently outperforms the baseline model on both synthetic and real-world datasets, minimising the number of conversational turns required to achieve accurate recommendations.
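The bot-play training signal described in the abstract — rewarding the Questioner Bot by the drop in the Recommendation model's reconstruction loss after each answered question — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the `RecommendationModel` stand-in, the toy loss (fraction of the profile still unknown), and the greedy `policy` are all hypothetical assumptions made here for clarity.

```python
class RecommendationModel:
    """Toy stand-in for the paper's LSTM+Attention recommender (hypothetical).
    Loss here is simply the fraction of the user's profile still unobserved."""

    def __init__(self, true_profile):
        self.true_profile = true_profile  # Answerer Bot's hidden ratings
        self.known = {}                   # ratings revealed so far in dialogue

    def loss(self):
        unknown = [k for k in self.true_profile if k not in self.known]
        return len(unknown) / len(self.true_profile)

    def observe(self, item, rating):
        self.known[item] = rating


def bot_play_episode(items, true_profile, policy, n_turns=3):
    """One bot-play episode: the Questioner asks, the Answerer replies,
    and the per-turn reward is the reduction in the recommender's loss."""
    model = RecommendationModel(true_profile)
    rewards, asked = [], set()
    for _ in range(n_turns):
        loss_before = model.loss()
        item = policy(items, asked)                # Questioner picks a question
        asked.add(item)
        model.observe(item, true_profile[item])    # Answerer Bot provides the rating
        rewards.append(loss_before - model.loss()) # reward = loss reduction
    return rewards


# Usage: a simple ask-anything-unasked policy over a 4-item profile.
profile = {"a": 5, "b": 3, "c": 1, "d": 4}
policy = lambda items, asked: next(i for i in items if i not in asked)
rewards = bot_play_episode(list(profile), profile, policy, n_turns=2)
```

In the paper's actual setup these per-turn rewards would drive a reinforcement-learning update of the questioning policy; here each answered question reveals one of four profile entries, so each turn reduces the toy loss by 0.25.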