Emotion Recognition in Arabic Speech from Saudi Dialect Corpus Using Machine Learning and Deep Learning Algorithms

  • Hanaa Alamri (Department of Computer Science, College of Computer Science and Information System, Umm Al-Qura University)
  • Hanan S. Alshanbari (Department of Computer Science, College of Computer Science and Information System, Umm Al-Qura University)
  • Received: 2023.08.05
  • Published: 2023.08.30

Abstract

Speech conveys feelings and attitudes through words. Identifying the emotional content of a speech signal, and the type of emotion the speech expresses, is therefore an important problem for researchers. In this study, we investigated an emotion recognition system using an Arabic database in the Saudi dialect, collected from a YouTube channel called Telfaz11. Four emotions were examined: anger, happiness, sadness, and neutral. In our experiments, we extracted features from the audio signals, such as Mel Frequency Cepstral Coefficients (MFCC) and Zero-Crossing Rate (ZCR), and then classified emotions using machine learning algorithms (Support Vector Machine (SVM) and K-Nearest Neighbor (KNN)) and deep learning algorithms (Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM)). Our experiments showed that MFCC feature extraction combined with the CNN model achieved the best accuracy, 95%, demonstrating the effectiveness of this classification system in recognizing emotions in spoken Arabic.
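The feature-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, computes ZCR alone (MFCC is omitted for brevity), replaces the Telfaz11 recordings with synthetic sine tones, and uses the hypothetical labels "low_pitch" and "high_pitch" in place of emotion classes. All function names, frame sizes, and the sample rate are assumptions chosen for illustration.

```python
import math
from collections import Counter


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


def frame_zcr(signal, frame_length=256, hop_length=128):
    """ZCR of each full-length frame (frame sizes are illustrative)."""
    last_start = len(signal) - frame_length
    return [
        zero_crossing_rate(signal[start:start + frame_length])
        for start in range(0, last_start + 1, hop_length)
    ]


def features(signal):
    """One-dimensional feature vector: mean ZCR over all frames."""
    rates = frame_zcr(signal)
    return (sum(rates) / len(rates),)


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(
        (math.dist(feat, query), label)
        for feat, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def tone(freq_hz, sample_rate=8000, n_samples=1024):
    """Synthetic sine tone standing in for a real speech recording."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


# Toy training set: hypothetical classes, not the paper's emotion labels.
train_x = [features(tone(f)) for f in (100, 150, 200, 1000, 1200, 1500)]
train_y = ["low_pitch"] * 3 + ["high_pitch"] * 3

print(knn_predict(train_x, train_y, features(tone(1100))))
print(knn_predict(train_x, train_y, features(tone(120))))
```

In practice, a library such as librosa would typically supply the MFCC and ZCR features and scikit-learn or Keras the classifiers; the sketch only shows how a frame-level feature is pooled into a vector and passed to a distance-based classifier.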
