A novel supervised t-SNE based approach of viseme classification for automated lip reading

Full text not archived in this repository.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

Fenghour, S., Chen, D., Hajderanj, L. ORCID: https://orcid.org/0009-0007-0445-3049, Weheliye, I. and Xiao, P. (2022) A novel supervised t-SNE based approach of viseme classification for automated lip reading. In: 2021 International Conference on Electrical, Computer and Energy Technologies (ICECET), 9-10 December 2021, Cape Town, South Africa. doi: 10.1109/ICECET52533.2021.9698534

Abstract/Summary

Convolutional Neural Networks (CNNs) are the most commonly used models for classifying speech segments represented in images. However, training a CNN-based classifier is usually time-consuming, and the appropriate network topology is uncertain. In this paper, a novel approach to viseme classification for automatic lip-reading is proposed. The main idea is to first map the original high-dimensional image data into a two-dimensional space using Supervised t-Distributed Stochastic Neighbour Embedding, and then to conduct classification in the low-dimensional space. The effectiveness of the proposed approach has been demonstrated by classifying visemes of three different frame widths, with average accuracies of 98.5%, 94.0%, and 82.1%, respectively. In comparison, CNN-based classifiers achieved average accuracies of 66.2%, 75.2%, and 84.4%, respectively. In addition, the new approach required much less CPU time for training. The main contribution of this paper is the application of Supervised t-Distributed Stochastic Neighbour Embedding to feature extraction for viseme classification in lip-reading, together with a comparison against a spatio-temporal CNN and an analysis of how both approaches perform for visemes of varying durations.
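
The reported accuracies can be compared per frame width with a few lines of arithmetic. The sketch below uses only the six figures quoted in the abstract. The abstract does not name the frame widths, so they are indexed here by the order in which they are listed, and the names S_TSNE_ACC, CNN_ACC, and accuracy_gaps are illustrative, not taken from the paper.

```python
# Average accuracies (%) as quoted in the abstract, in the order listed.
# The actual frame widths are not stated there, so list position stands in.
S_TSNE_ACC = [98.5, 94.0, 82.1]  # Supervised t-SNE based approach
CNN_ACC = [66.2, 75.2, 84.4]     # CNN-based classifiers


def accuracy_gaps(proposed, baseline):
    """Return proposed-minus-baseline accuracy, in percentage points."""
    return [round(p - b, 1) for p, b in zip(proposed, baseline)]


if __name__ == "__main__":
    gaps = accuracy_gaps(S_TSNE_ACC, CNN_ACC)
    for position, gap in enumerate(gaps, start=1):
        leader = "Supervised t-SNE" if gap > 0 else "CNN"
        print(f"frame width #{position}: gap {gap:+.1f} points, {leader} ahead")
```

On these figures the gap in favour of the proposed approach is 32.3 points for the first frame width and 18.8 for the second, and it reverses to -2.3 for the third, where the CNN-based classifier is slightly ahead.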

Item Type: Conference or Workshop Item (Paper)
URI: https://centaur.reading.ac.uk/id/eprint/122814
Identification Number/DOI: 10.1109/ICECET52533.2021.9698534
Refereed: Yes
Divisions: Henley Business School > Digitalisation, Marketing and Entrepreneurship