LAPSE:2023.4341
Published Article

Semi-Natural and Spontaneous Speech Recognition Using Deep Neural Networks with Hybrid Features Unification
February 22, 2023
Abstract
Identifying speech emotions in spontaneous databases remains a complex and demanding research area. This research presents a new approach for recognizing semi-natural and spontaneous speech emotions using multiple feature fusion and deep neural networks (DNN). The proposed framework extracts the most discriminative features from hybrid acoustic feature sets. However, these feature sets may contain redundant and irrelevant information, which degrades emotion identification. Therefore, a support vector machine (SVM) algorithm is used to identify the most discriminative audio feature map after obtaining the relevant features learned by the fusion approach. We evaluated our approach on the eNTERFACE05 and BAUM-1s benchmark databases and observed identification accuracies of 76% (for a speaker-independent experiment with SVM) and 59%, respectively. Furthermore, experiments on the eNTERFACE05 and BAUM-1s datasets indicate that the proposed framework outperformed current state-of-the-art techniques on semi-natural and spontaneous datasets.
Record ID
Keywords
multiple feature fusion, semi-natural database, speech emotion recognition, spontaneous database, support vector machine
Suggested Citation
Amjad A, Khan L, Chang HT. Semi-Natural and Spontaneous Speech Recognition Using Deep Neural Networks with Hybrid Features Unification. (2023). LAPSE:2023.4341
Author Affiliations
Amjad A: Department of Computer Science and Information Engineering, Chang Gung University, Guishan District, Taoyuan City 33302, Taiwan [ORCID]
Khan L: Department of Computer Science and Information Engineering, Chang Gung University, Guishan District, Taoyuan City 33302, Taiwan [ORCID]
Chang HT: Department of Computer Science and Information Engineering, Chang Gung University, Guishan District, Taoyuan City 33302, Taiwan; Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Guishan District, Taoyuan City 33302, Taiwan [ORCID]
Journal Name
Processes
Volume
9
Issue
12
First Page
2286
Year
2021
Publication Date
2021-12-20
ISSN
2227-9717
Version Comments
Original Submission
Other Meta
PII: pr9122286, Publication Type: Journal Article
External Link

https://doi.org/10.3390/pr9122286
Publisher Version
Meta
Record Statistics
Record Views
182
Version History
[v1] (Original Submission)
Feb 22, 2023
Verified by curator on
Feb 22, 2023
This Version Number
v1
Record URL
https://psecommunity.org/LAPSE:2023.4341
Record Owner
Auto Uploader for LAPSE
