Search | Korea Science

이순요;권규식;김홍태
- Journal of the Ergonomics Society of Korea
- /
- v.10 no.1
- /
- pp.3-10
- /
- 1991
The teach pendant and keyboard have been used as an input device of control command in human-robot sustem. But, many problems occur in case that the usef is a novice. So, speech recognition system is required to communicate between a human and the robot. In this study, Korean voice commands, eitht robot commands, and ten digits based on the broad phonetic analysis are described. Applying broad phonetic analysis, phonemes of voice commands are divided into phoneme groups, such as plosive, fricative, affricative, nasal, and glide sound, having similar features. And then, the feature parameters and their ranges to detect phoneme groups are found by minimax method. Classification rules are consisted of combination of the feature parameters, such as zero corssing rate(ZCR), log engery(LE), up and down(UD), formant frequency, and their ranges. Voice commands were recognized by the classification rules. The recognition rate was over 90 percent in this experiment. Also, this experiment showed that the recognition rate about digits was better than that about robot commands.
PDF

Kim, Seong-Rock;Park, Jae-Suk;Park, Ju-Hyun;Lee, Suk-Gyu
- 제어로봇시스템학회:학술대회논문집
- /
- 2004.08a
- /
- pp.1413-1415
- /
- 2004
This paper proposes a SVM(Support Vector Machine) training algorithm to control a service robot with voice command. The service robot with a stereo vision system and dual manipulators of four degrees of freedom implements a User-Dependent Voice Control System. The training of SVM algorithm that is one of the statistical learning theories leads to a QP(quadratic programming) problem. In this paper, we present an efficient SVM speech recognition scheme especially based on less learning data comparing with conventional approaches. SVM discriminator decides rejection or acceptance of user's extracted voice features by the MFCC(Mel Frequency Cepstrum Coefficient). Among several SVM kernels, the exponential RBF function gives the best classification and the accurate user recognition. The numerical simulation and the experiment verified the usefulness of the proposed algorithm.
PDF

Hur, Hwa-La;Kim, Tae-Sun;Park, Myeong-Chul
- Journal of the Korea Society of Computer and Information
- /
- v.26 no.9
- /
- pp.57-64
- /
- 2021
In this paper, we propose a device that can control the primary control surface of an aircraft by recognizing speech commands. The speech command consists of 19 commands, and a learning model is constructed based on a total of 2,500 datasets. The training model is composed of a CNN model using the Sequential library of the TensorFlow-based Keras model, and the speech file used for training uses the MFCC algorithm to extract features. The learning model consists of two convolution layers for feature recognition and Fully Connected Layer for classification consists of two dense layers. The accuracy of the validation dataset was 98.4%, and the performance evaluation of the test dataset showed an accuracy of 97.6%. In addition, it was confirmed that the operation was performed normally by designing and implementing a Raspberry Pi-based control device. In the future, it can be used as a virtual training environment in the field of voice recognition automatic flight and aviation maintenance.
https://doi.org/10.9708/jksci.2021.26.09.057 인용 PDF KSCI HTML

Seo, Seong-gwan;Mun, Hyunjun;Son, Baehoon;Yun, Joobeom
- Journal of the Korea Institute of Information Security & Cryptology
- /
- v.32 no.1
- /
- pp.89-98
- /
- 2022
As deep learning technology was applied to various fields, research on adversarial attack techniques, a security problem of deep learning models, was actively studied. adversarial attacks have been mainly studied in the field of images. Recently, they have even developed a complete decision-based attack technique that can attack with just the classification results of the model. However, in the case of the audio field, research is relatively slow. In this paper, we applied several decision-based attack techniques to the audio field and improved state-of-the-art attack techniques. State-of-the-art decision-attack techniques have the disadvantage of requiring many queries for gradient approximation. In this paper, we improve query efficiency by proposing a method of reducing the vector search space required for gradient approximation. Experimental results showed that the attack success rate was increased by 50%, and the difference between original audio and adversarial examples was reduced by 75%, proving that our method could generate adversarial examples with smaller noise.
https://doi.org/10.13089/JKIISC.2022.32.1.89 인용 PDF KSCI HTML