Accent and gender recognition from English language speech and audio using signal processing and deep learning

Shergill, J. S., Pravin, C. and Ojha, V. ORCID: https://orcid.org/0000-0002-9256-1192 (2021) Accent and gender recognition from English language speech and audio using signal processing and deep learning. In: International Conference on Hybrid Intelligent Systems, 14-16 Dec 2020, pp. 62-72, https://doi.org/10.1007/978-3-030-73050-5_7.
It is advisable to refer to the publisher's version if you intend to cite from this work. To link to this item DOI: 10.1007/978-3-030-73050-5_7

Abstract/Summary

This research takes user input in the form of speech data and predicts which region of the United Kingdom the speaker is from, as well as the speaker's gender. The work covers regional accents, data preprocessing, Fourier transforms, and deep learning modeling. Because no publicly available dataset existed for this type of research, a dataset was created from scratch (12 regions with a 1:1 gender ratio). In this paper, we propose modeling the recognition of a speaker's accent and gender from voice as a classification task. We used a deep convolutional neural network and experimentally developed an architecture that maximized the classification accuracy of both tasks simultaneously. We also tested the model on publicly available spoken digit datasets. We find that, in our proposed multi-class classification model, gender is easier to predict with high accuracy than accent. Accent classification proved difficult because regional accents overlap, which prevents them from being classified with high accuracy.
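The abstract mentions Fourier transforms as part of the preprocessing but does not describe the feature extraction itself. As a rough illustration only, the sketch below shows one common way to turn a waveform into a time-frequency representation (a short-time Fourier magnitude spectrogram) of the kind a convolutional network can take as input. The frame length, hop size, Hann window, sample rate, and all function names are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (tapers frame edges to reduce spectral leakage)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame (naive O(n^2))."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len=64, hop=32):
    """Split the signal into overlapping windowed frames; one magnitude spectrum per frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(signal[start:start + frame_len], window)]
        frames.append(dft_magnitudes(frame))
    return frames


# Hypothetical input: a pure 1 kHz tone sampled at 8 kHz (stands in for real speech).
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(signal)  # list of frames, each a list of bin magnitudes
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64  # bin index -> frequency for frame_len=64
```

In practice a library FFT would replace the naive DFT shown here, and real speech would produce energy spread across many bins rather than a single peak.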