Search | Korea Science

A Study on Vocabulary-Independent Continuous Speech Recognition System for Intelligent Home Network System (지능형 홈네트워크 시스템을 위한 가변어휘 연속음성인식시스템에 관한 연구)

Lee, Ho-Woong;Jeong, Hee-Suk
- The Journal of The Korea Institute of Intelligent Transport Systems / v.7 no.2 / pp.37-42 / 2008
This paper presents a vocabulary-independent continuous speech recognition system for voice control of an intelligent home network. The study proposes a keyword-based conversational scenario of continuous natural vocabulary for recognizing natural speech commands, and a way of optimizing the recognition system by building the recognizer and its database around keywords.

Automatic Recognition of Pitch Accents Using Time-Delay Recurrent Neural Network (시간지연 회귀 신경회로망을 이용한 피치 악센트 인식)

Kim, Sung-Suk;Kim, Chul;Lee, Wan-Joo
- The Journal of the Acoustical Society of Korea / v.23 no.4E / pp.112-119 / 2004
This paper presents a method for the automatic recognition of pitch accents with no prior knowledge about the phonetic content of the signal (no knowledge of word or phoneme boundaries or of phoneme labels). The recognition algorithm used in this paper is a time-delay recurrent neural network (TDRNN). A TDRNN is a neural network classifier with two different representations of dynamic context: delayed input nodes allow the representation of an explicit trajectory F0(t), while recurrent nodes provide long-term context information that can be used to normalize the input F0 trajectory. Performance of the TDRNN is compared to the performance of an MLP (multi-layer perceptron) and an HMM (Hidden Markov Model) on the same task. The TDRNN correctly recognizes 91.9% of pitch events and 91.0% of pitch non-events, for an average accuracy of 91.5% over both pitch events and non-events. The MLP with contextual input exhibits 85.8%, 85.5%, and 85.6% recognition accuracy, respectively, while the HMM correctly recognizes 36.8% of pitch events and 87.3% of pitch non-events, for an average accuracy of 62.2% over both pitch events and non-events. These results suggest that the TDRNN architecture is useful for the automatic recognition of pitch accents.

Recognition of Passport MRZ Information Using Combined Neural Networks (결합 신경망을 이용한 여권 MRZ 정보 인식)

Kim, Jinho
- Journal of Korea Society of Digital Industry and Information Management / v.15 no.4 / pp.149-157 / 2019
When a passport is read with a smartphone rather than a dedicated passport reading system, MRZ (Machine Readable Zone) character recognition can be difficult: character strokes may be broken, touching, or blurred depending on the lighting conditions, and the position and size of the MRZ character lines vary with camera distance and angle. This paper proposes an effective recognition algorithm for passport MRZ information that uses a combined neural network recognizer of a CNN (Convolutional Neural Network) and an ANN (Artificial Neural Network) and handles passport images of various sizes and skews. MRZ line detection using connected component analysis and skew correction using a perspective transform are also designed to achieve effective character segmentation. Each MRZ field recognition result is verified with the five check digits to decide whether to retry the recognition process. In experiments with the implemented algorithm, excellent recognition performance was obtained in both PC off-line mode and smartphone on-line mode.
https://doi.org/10.17662/ksdim.2019.15.4.149
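The abstract above mentions verifying each MRZ field with check digits but does not spell out the computation. The following is a minimal sketch, assuming the standard ICAO Doc 9303 scheme (repeating weights 7, 3, 1; digits count as themselves, letters A-Z as 10-35, the filler `<` as 0; the sum is taken modulo 10). The function names are hypothetical and are not taken from the paper.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO 9303 convention)."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":  # filler character
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Compute the check digit of an MRZ field with weights 7, 3, 1."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def field_is_valid(field: str, check_char: str) -> bool:
    """Return True if the recognized check character matches the field."""
    return check_char.isdigit() and mrz_check_digit(field) == int(check_char)


# A recognizer could retry when any field fails verification (assumed policy).
print(field_is_valid("740812", "2"))
```

A mismatch on any field would then be the signal to rerun recognition, which is how the abstract describes the check digits being used.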

The Development of IDMLP Neural Network for the Chip Implementation and it's Application to Speech Recognition (Chip 구현을 위한 IDMLP 신경 회로망의 개발과 음성인식에 대한 응용)

김신진;박정운;정호선
- Journal of the Korean Institute of Telematics and Electronics B / v.28B no.5 / pp.394-403 / 1991
This paper describes the development of an input driven multilayer perceptron (IDMLP) neural network and its application to Korean spoken digit recognition. The IDMLP neural network used here and its learning algorithm are newly proposed. In this model, the weights are integers and the neuron transfer function is a hard-limit function. Depending on how difficult the inputs are to classify, learning yields a network with one or more layers. We tested recognition of binarized data for the spoken digits 0 to 9 with the proposed network. The recognition rates are 100% and 96% for the learning data and test data, respectively.
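To illustrate what integer weights and a hard-limit transfer function mean in practice, here is a minimal sketch of a single such neuron. It is not the IDMLP learning algorithm, which the abstract does not describe; the threshold convention (output 1 when the weighted sum is non-negative) and the example weights are assumptions chosen for illustration.

```python
from typing import Sequence


def hard_limit(x: int) -> int:
    """Hard-limit (step) transfer function: 1 if x >= 0, else 0 (assumed convention)."""
    return 1 if x >= 0 else 0


def neuron(inputs: Sequence[int], weights: Sequence[int], bias: int) -> int:
    """Binary neuron using only integer arithmetic, which suits chip implementation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return hard_limit(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Hypothetical example: integer weights realizing a logical AND of two binary inputs.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, neuron([a, b], weights=[1, 1], bias=-2))
```

Because every operation is an integer add, multiply, or comparison, a neuron of this kind needs no floating-point hardware.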

Recognition of Music using Backpropagation Network (Backpropagation을 이용한 악보인식)

Park, Hyun-Jun;Cha, Eui-Young
- Journal of the Korea Institute of Information and Communication Engineering / v.11 no.6 / pp.1170-1175 / 2007
This paper presents techniques for recognizing music using a backpropagation network, one of the neural network algorithms, and for preprocessing music images. Music symbols and music notes are segmented by preprocessing steps such as binarization, slope correction, and staff line removal. The segmented symbols and notes are then recognized by a music note recognition network and a non-music-note recognition network. We verified the correctness of the proposed music recognition algorithm through experiments and analysis with various kinds of music.
https://doi.org/10.6109/jkiice.2007.11.6.1170

Bi-LSTM model with time distribution for bandwidth prediction in mobile networks

Hyeonji Lee;Yoohwa Kang;Minju Gwak;Donghyeok An
- ETRI Journal / v.46 no.2 / pp.205-217 / 2024
We propose a bandwidth prediction approach based on deep learning. The approach is intended to accurately predict the bandwidth of various types of mobile networks. We first use a machine learning technique, namely, the gradient boosting algorithm, to recognize the connected mobile network. Second, we apply a handover detection algorithm based on network recognition to account for vertical handover that causes the bandwidth variance. Third, as the communication performance offered by 3G, 4G, and 5G networks varies, we suggest a bidirectional long short-term memory model with time distribution for bandwidth prediction per network. To increase the prediction accuracy, pretraining and fine-tuning are applied for each type of network. We use a dataset collected at University College Cork for network recognition, handover detection, and bandwidth prediction. The performance evaluation indicates that the handover detection algorithm achieves 88.5% accuracy, and the bandwidth prediction model achieves a high accuracy, with a root-mean-square error of only 2.12%.
https://doi.org/10.4218/etrij.2022-0459
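The abstract above describes handover detection "based on network recognition" without giving the rule. One plausible reading, sketched below purely as an assumption, is to flag a vertical handover wherever the recognized network type changes between consecutive samples. The function name and the label strings are hypothetical.

```python
from typing import List, Sequence


def detect_handovers(network_labels: Sequence[str]) -> List[int]:
    """Return the sample indices at which the recognized network type changes.

    Assumption: a vertical handover is signaled by a change in the per-sample
    network label (e.g. "4G" -> "5G") produced by an upstream classifier.
    """
    return [
        i
        for i in range(1, len(network_labels))
        if network_labels[i] != network_labels[i - 1]
    ]


# Hypothetical per-sample output of a network recognizer.
labels = ["4G", "4G", "5G", "5G", "3G"]
print(detect_handovers(labels))
```

The detected indices could then be used to switch between the per-network prediction models that the abstract describes.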

A License Plate Recognition Algorithm using Multi-Stage Neural Network for Automobile Black-Box Image (다단계 신경 회로망을 이용한 블랙박스 영상용 차량 번호판 인식 알고리즘)

Kim, Jin-young;Heo, Seo-weon;Lim, Jong-tae
- Journal of the Korea Institute of Information and Communication Engineering / v.22 no.1 / pp.40-48 / 2018
This paper proposes a license-plate recognition algorithm for automobile black-box images captured by a camera moving with the vehicle. The algorithm aims to raise the overall license-plate recognition rate by improving the Korean character recognition rate with a multi-stage neural network, for black-box images that contain substantial camera movement and variation in light intensity. The proposed algorithm recognizes the vowel and the consonant of each Korean license-plate character separately. First, the first-stage neural network recognizes the vowel, and the recognized vowel is classified as a vertical vowel ('ㅏ', 'ㅓ') or a horizontal vowel ('ㅗ', 'ㅜ'). The consonant is then classified by the second-stage neural network for that vowel group. Simulations on images obtained from a real black-box system show that the proposed algorithm provides a higher recognition rate than existing neural-network-based algorithms.
https://doi.org/10.6109/jkiice.2018.22.1.40
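The two-stage routing described in the abstract above can be sketched as follows. This is only an illustration of the control flow: the classifiers are passed in as plain callables, and every name here (`recognize_character`, `vowel_net`, `consonant_nets`) is hypothetical rather than taken from the paper.

```python
from typing import Any, Callable, Dict, Tuple

# Vowel groups as listed in the abstract.
VERTICAL_VOWELS = {"ㅏ", "ㅓ"}
HORIZONTAL_VOWELS = {"ㅗ", "ㅜ"}


def vowel_group(vowel: str) -> str:
    """Assign a recognized vowel to its group."""
    if vowel in VERTICAL_VOWELS:
        return "vertical"
    if vowel in HORIZONTAL_VOWELS:
        return "horizontal"
    raise ValueError(f"unexpected vowel: {vowel!r}")


def recognize_character(
    image: Any,
    vowel_net: Callable[[Any], str],
    consonant_nets: Dict[str, Callable[[Any], str]],
) -> Tuple[str, str]:
    """Stage 1 recognizes the vowel; stage 2 uses the network for that vowel group."""
    vowel = vowel_net(image)
    consonant = consonant_nets[vowel_group(vowel)](image)
    return consonant, vowel


# Stub classifiers standing in for trained networks (illustration only).
result = recognize_character(
    image=None,
    vowel_net=lambda img: "ㅗ",
    consonant_nets={"vertical": lambda img: "ㄱ", "horizontal": lambda img: "ㄴ"},
)
print(result)
```

Splitting the second stage by vowel group lets each consonant classifier specialize on one character layout, which appears to be the motivation the abstract gives for the multi-stage design.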

Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

Zhou, Xuan
- Journal of Information Processing Systems / v.17 no.2 / pp.337-351 / 2021
Automatically recognizing facial expressions in video sequences is a challenging task because there is little direct correlation between facial features and subjective emotions in video. To overcome this problem, a video facial expression recognition method using a spatiotemporal recurrent neural network and feature fusion is proposed. First, the video is preprocessed. Then, a double-layer cascade structure is used to detect the face in each video image. In addition, two deep convolutional neural networks are used to extract the temporal and spatial facial features in the video. The spatial convolutional neural network extracts spatial information features from each frame of the static expression images in the video. The temporal convolutional neural network extracts dynamic information features from the optical flow across multiple frames of expression images. A multiplication fusion is performed on the spatiotemporal features learned by the two deep convolutional neural networks. Finally, the fused features are input to a support vector machine to perform the facial expression classification task. The experimental results on the eNTERFACE, RML, and AFEW6.0 datasets show that the recognition rates obtained by the proposed method are as high as 88.67%, 70.32%, and 63.84%, respectively. Comparative experiments show that the proposed method obtains higher recognition accuracy than other recently reported methods.
https://doi.org/10.3745/JIPS.01.0067
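The "multiplication fusion" step in the abstract above is not defined further. A common interpretation, assumed here, is an element-wise product of the spatial and temporal feature vectors of equal length. The sketch below uses plain Python lists and a hypothetical function name, and it omits the downstream support vector machine.

```python
from typing import List, Sequence


def multiplicative_fusion(
    spatial: Sequence[float], temporal: Sequence[float]
) -> List[float]:
    """Fuse two feature vectors by element-wise multiplication (assumed form)."""
    if len(spatial) != len(temporal):
        raise ValueError("feature vectors must have the same length")
    return [s * t for s, t in zip(spatial, temporal)]


# Hypothetical feature vectors from the spatial and temporal networks.
fused = multiplicative_fusion([1.0, 2.0, 3.0], [0.5, 0.0, 2.0])
print(fused)
```

With this form of fusion, a feature contributes strongly only when both streams respond to it, in contrast to concatenation, which keeps the two streams separate.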

An Enhanced Neural Network Approach for Numeral Recognition

Venugopal, Anita;Ali, Ashraf
- International Journal of Computer Science & Network Security / v.22 no.3 / pp.61-66 / 2022
Object classification is one of the main fields in neural networks and has attracted the interest of many researchers. Although there have been vast advancements in this area, many challenges remain because of inefficiency in handling large data and linguistic and dimensional complexities. Powerful hardware and software approaches in neural networks, such as deep neural networks, provide efficient mechanisms and contribute substantially to object recognition as well as to time series classification. Because of their high prediction accuracy, neural networks are often preferred in applications that require feature-based identification, segmentation, and detection. The self-learning ability of neural networks has revolutionized computing power and has applications in numerous fields, such as unmanned self-driving vehicles and speech recognition. In this paper, an experiment is conducted to implement a neural approach that identifies numbers in different formats without human intervention. Measures are taken to improve the efficiency with which machines classify and identify numbers. Experimental results show the importance of training sets for achieving better recognition accuracy.
https://doi.org/10.22937/IJCSNS.2022.22.3.9

A Study on the Syllable Recognition Using Neural Network Predictive HMM

Kim, Soo-Hoon;Kim, Sang-Berm;Koh, Si-Young;Hur, Kang-In
- The Journal of the Acoustical Society of Korea / v.17 no.2E / pp.26-30 / 1998
In this paper, we construct a neural network predictive HMM (NNPHMM) to provide the dynamic features of the speech pattern to the HMM. The NNPHMM is a hybrid of a neural network and an HMM. The NNPHMM is trained to predict the future vector, and its prediction varies at each time step; the prediction is used in place of the mean vector in the HMM. In the experiment, we compared recognition performance on one hundred Korean syllables as the hidden layer size, number of states, and prediction order of the NNPHMM were varied. The hidden layer was increased from 10 to 30 dimensions, the number of states from 4 to 6, and the prediction order from the second to the fourth order. The NNPHMM in the experiment is composed of a multi-layer perceptron with one hidden layer and a CMHMM. With second-order prediction, the average recognition rate increased by 3.5% when the number of states was changed from 4 to 5. With third-order prediction the recognition rate increased by 4.0%, and with fourth-order prediction it increased by 3.2%. However, the recognition rate decreased when the number of states was changed from 5 to 6.
