Speech Emotion Recognition Using Audio Matching

Chaturvedi, Iti, Noel, Tim, and Satapathy, Ranjan (2022) Speech Emotion Recognition Using Audio Matching. Electronics, 11 (23). 3943.

DOI: 10.3390/electronics11233943

View at Publisher Website: https://doi.org/10.3390/electronics11233...

Abstract

People increasingly share their opinions about products on TikTok and YouTube. Automatically extracting the sentiment expressed about a particular product can help users make buying decisions. For videos in languages such as Spanish, where a translation is often unavailable, the tone of voice can be used to determine sentiment. In this paper, we propose a novel algorithm to classify sentiments in speech in the presence of environmental noise. Traditional models rely on pretrained audio feature extractors for human speech that do not generalize well across different accents. Instead, we leverage the vector space of emotional concepts, in which words with similar meanings often share a prefix. For example, words starting with ‘con’ or ‘ab’ signify absence and hence negative sentiment. Augmentation is a popular way to enlarge the training data for audio classification; however, some augmentations may reduce accuracy. We therefore propose a new metric based on eigenvalues to select the best augmentations. We evaluate the proposed approach on emotions in YouTube videos, where it outperforms baselines by 10–20%. Each neuron learns words with similar pronunciations and emotions. We also use the model to detect the presence of birds in audio recordings from the city.


Item ID:	76882
Item Type:	Article (Research - C1)
ISSN:	2079-9292
Copyright Information:	© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Date Deposited:	29 Nov 2022 23:52
FoR Codes:	46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460299 Artificial intelligence not elsewhere classified @ 50%
	41 ENVIRONMENTAL SCIENCES > 4104 Environmental management > 410407 Wildlife and habitat management @ 25%
	46 INFORMATION AND COMPUTING SCIENCES > 4609 Information systems > 460999 Information systems not elsewhere classified @ 25%
SEO Codes:	28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280115 Expanding knowledge in the information and computing sciences @ 20%
	22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220403 Artificial intelligence @ 50%
	22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220408 Information systems @ 30%
Downloads:	Total: 626 Last 12 Months: 8
