http://dx.doi.org/10.33851/JMIS.2020.7.1.33

Machine Learning Techniques for Speech Recognition using the Magnitude  

Krishnan, C. Gopala (Department of Computer Science and Engineering, Francis Xavier Engineering College)
Robinson, Y. Harold (School of Information Technology and Engineering, Vellore Institute of Technology)
Chilamkurti, Naveen (Department of Computer Science and IT, La Trobe University)
Publication Information
Journal of Multimedia Information System / v.7, no.1, 2020, pp. 33-40
Abstract
Machine learning comprises supervised and unsupervised learning; of the two, supervised learning is the one used for speech recognition. Supervised learning is the data mining task of inferring a function from labeled training data. Speech recognition has attracted sustained attention over the past decades, and most automation technologies now use speech and speech recognition for a variety of purposes. This paper gives an overview of the major technological perspectives on speech recognition and an appreciation of its fundamental development, and it surveys the methods developed at each stage of speech recognition using supervised learning. The project uses a deep neural network (DNN) to recognize speech from magnitudes on large datasets.
Keywords
Deep neural network; Magnitude; Speech recognition; Supervised learning
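To make the abstract's two technical ideas concrete (inferring a function from labeled training data, and feeding magnitudes to a neural network), here is a minimal sketch in Python with NumPy. It is not the authors' method. Everything in it is an assumption made for illustration: synthetic noisy tones stand in for labeled speech, the features are per-frame magnitude spectra, a network with a single hidden layer stands in for the DNN, and the function names, sample rate, frame length, and hyperparameters are all hypothetical.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz; assumed value for this toy example
FRAME_LEN = 256     # samples per frame; assumed value


def make_toy_frames(n, rng):
    """Synthetic stand-in for labeled speech frames (an assumption):
    label 0 is a noisy tone near 300 Hz, label 1 a noisy tone near 1200 Hz."""
    labels = rng.integers(0, 2, size=n)
    freqs = np.where(labels == 0, 300.0, 1200.0) + rng.uniform(-50, 50, size=n)
    t = np.arange(FRAME_LEN) / SAMPLE_RATE
    frames = np.sin(2 * np.pi * freqs[:, None] * t[None, :])
    frames += 0.1 * rng.standard_normal(frames.shape)
    return frames, labels


def magnitude_features(frames):
    """Magnitude spectrum of each Hann-windowed frame, scaled to [0, 1] per frame."""
    spectrum = np.fft.rfft(frames * np.hanning(FRAME_LEN), axis=1)
    mag = np.abs(spectrum)
    return mag / mag.max(axis=1, keepdims=True)


def train_network(x, y, hidden=16, lr=0.2, epochs=1000, seed=0):
    """Fit a one-hidden-layer network (tanh hidden units, sigmoid output)
    with full-batch gradient descent on the cross-entropy loss."""
    rng = np.random.default_rng(seed)
    w1 = 0.1 * rng.standard_normal((x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = 0.1 * rng.standard_normal((hidden, 1))
    b2 = np.zeros(1)
    target = y.reshape(-1, 1).astype(float)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                  # hidden activations
        p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))  # predicted P(label = 1)
        dz2 = (p - target) / len(x)               # gradient at the output pre-activation
        dz1 = (dz2 @ w2.T) * (1.0 - h ** 2)       # backpropagate through tanh
        w2 -= lr * (h.T @ dz2)
        b2 -= lr * dz2.sum(axis=0)
        w1 -= lr * (x.T @ dz1)
        b1 -= lr * dz1.sum(axis=0)
    return w1, b1, w2, b2


def predict(params, x):
    """Return the predicted label (0 or 1) for each feature row."""
    w1, b1, w2, b2 = params
    h = np.tanh(x @ w1 + b1)
    return ((h @ w2 + b2) > 0).astype(int).ravel()


rng = np.random.default_rng(42)
train_frames, train_labels = make_toy_frames(200, rng)
test_frames, test_labels = make_toy_frames(100, rng)

params = train_network(magnitude_features(train_frames), train_labels)
predictions = predict(params, magnitude_features(test_frames))
accuracy = float(np.mean(predictions == test_labels))
print(f"held-out accuracy: {accuracy:.2f}")
```

The labeled pairs (frame, label) are the training data, and the fitted network is the inferred function. A real recognizer would replace the synthetic tones with recorded speech, stack magnitude frames over time, and use a much deeper model.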