Integrated Search | Korea Science

Improved Bimodal Speech Recognition Study Based on Product Hidden Markov Model

Xi, Su Mei;Cho, Young Im
- International Journal of Fuzzy Logic and Intelligent Systems / Vol. 13, No. 3 / pp. 164-170 / 2013
In recent years there has been growing demand for automatic speech recognition (ASR) systems that operate robustly in acoustically noisy environments. This paper proposes an improved product hidden Markov model (HMM) for bimodal speech recognition. A two-dimensional training model is built from dependently trained audio and visual HMMs, reflecting the asynchronous characteristics of the audio and video streams. A weight coefficient is introduced to adjust the weights of the video and audio streams automatically according to differences in the noise environment. Experimental results show that this approach achieves better speech recognition performance than other bimodal speech recognition approaches.
https://doi.org/10.5391/IJFIS.2013.13.3.164
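
The weight coefficient described in the abstract above can be illustrated with a minimal sketch. It assumes a log-linear combination of per-stream log-likelihoods and a linear mapping from signal-to-noise ratio (SNR) to the audio weight. Both are common choices in audio-visual recognition but are assumptions here, not the paper's confirmed formulation; the function names and the 0 to 30 dB range are hypothetical.

```python
def audio_weight_from_snr(snr_db, low_db=0.0, high_db=30.0):
    """Hypothetical mapping: scale the audio weight linearly with SNR, clipped to [0, 1]."""
    if high_db <= low_db:
        raise ValueError("high_db must exceed low_db")
    w = (snr_db - low_db) / (high_db - low_db)
    return min(1.0, max(0.0, w))


def fused_log_likelihood(log_p_audio, log_p_video, audio_weight):
    """Log-linear fusion: w * log P_audio + (1 - w) * log P_video."""
    return audio_weight * log_p_audio + (1.0 - audio_weight) * log_p_video


def recognize(scores, snr_db):
    """scores maps word -> (audio log-likelihood, video log-likelihood)."""
    w = audio_weight_from_snr(snr_db)
    return max(scores, key=lambda word: fused_log_likelihood(*scores[word], w))
```

With clean audio the decision follows the audio scores; as the SNR drops, the video scores dominate.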

Korean Vowel Recognition using Peripheral Auditory Model

윤태성;백승화;박상희
- 대한의용생체공학회:의공학회지 / Vol. 9, No. 1 / pp. 1-10 / 1988
In this study, recognition experiments on Korean vowels are performed using a peripheral auditory model. For objective comparison, recognition experiments are also performed using LPC cepstrum coefficients extracted from the same speech data. The results are as follows. 1) The time and frequency responses of the auditory model show that important features of the input signal are contained in the responses of the inner ear and auditory nerve. 2) The recognition rate for Korean vowels is higher with the auditory model output than with LPC cepstrum coefficients. 3) The adaptation phenomenon of the auditory nerve provides useful characteristics for discriminating vowel signals.

A Study on EMG Signals Recognition using Time Delayed Counterpropagation Neural Network

권장우;정인길;홍승홍
- 대한의용생체공학회:의공학회지 / Vol. 17, No. 3 / pp. 395-401 / 1996
In this paper, a new neural network model, the time delayed counterpropagation neural network (TDCPN), which has a high recognition rate and a short total learning time, is proposed for electromyogram (EMG) signal recognition. The proposed model increases the recognition rate by learning the regional temporal correlation of patterns through time-delay properties in the input layer, and decreases the learning time by using the winner-takes-all learning rule. The outstar learning rule is applied at the output layer so that the input pattern can be mapped to a desired output. We test the performance of this model with EMG signals collected from a normal subject. Experimental results show that the recognition rate of the proposed model is higher and its learning time shorter than those of TDNN and CPN.
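
The three ingredients named in the abstract above (time-delay input, winner-takes-all learning, and outstar learning) can be sketched as follows. This is a generic counterpropagation update written under stated assumptions, not the authors' TDCPN implementation: the function names, the squared-distance winner criterion, and the learning rates `alpha` and `beta` are all illustrative.

```python
def delayed_windows(signal, delays):
    """Stack each sample with its `delays` predecessors (time-delay input)."""
    return [signal[t - delays:t + 1] for t in range(delays, len(signal))]


def winner(hidden_weights, x):
    """Index of the hidden unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(hidden_weights)), key=lambda j: sq_dist(hidden_weights[j]))


def train_step(hidden_weights, output_weights, x, target, alpha=0.5, beta=0.5):
    """One update: only the winning unit and its outgoing weights change."""
    j = winner(hidden_weights, x)
    # Winner-takes-all rule: move the winner's weights toward the input.
    hidden_weights[j] = [w + alpha * (xi - w) for w, xi in zip(hidden_weights[j], x)]
    # Outstar rule: move the winner's outgoing weights toward the desired output.
    output_weights[j] = [v + beta * (ti - v) for v, ti in zip(output_weights[j], target)]
    return j
```

Because each step updates a single unit, one pass over the training windows is cheap, which is consistent with the short learning time the abstract reports.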

Neural Network for Speech Recognition Using Signal Analysis Characteristics by ${\nabla}^2G$ Operator

이종혁;정용근;남기곤;윤태훈;김재창;박의열;이양성
- 전자공학회논문지B / Vol. 29B, No. 10 / pp. 90-99 / 1992
In this paper, we propose a neural network model for speech recognition. The model consists of a feature extraction part and a recognition part. An interconnection model based on the ${\nabla}^2G$ operator is used for frequency analysis. Two features, a global feature and a local feature, are extracted from this model. The recognition part consists of a global grouping stage and a local grouping stage. When the input pattern was coded by the slope method, the recognition rate for speakers A and B was 100%. When the test was performed with data from 9 speakers, a recognition rate of 91.4% was obtained.
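
The ${\nabla}^2G$ operator is the Laplacian of a Gaussian. As a rough illustration of how it could act on a one-dimensional frequency profile, the sketch below samples the second derivative of an unnormalized Gaussian and convolves it with a signal. The one-dimensional form, the zero padding, and the function names are assumptions made for illustration; the abstract does not specify how the paper's interconnection model is wired.

```python
import math


def log_kernel(sigma, radius):
    """Sample the second derivative of a unit-height Gaussian at integer offsets."""
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-(x * x) / (2.0 * sigma * sigma))
        kernel.append((x * x / sigma ** 4 - 1.0 / sigma ** 2) * g)
    return kernel


def convolve_same(signal, kernel):
    """Convolve with zero padding; the kernel is symmetric, so no flip is needed."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out
```

The kernel is negative at its center and positive in the surround, so the output changes sign around spectral peaks and edges.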

Object Recognition Algorithm with Partial Information

Yoo, Suk Won
- International Journal of Advanced Culture Technology / Vol. 7, No. 4 / pp. 229-235 / 2019
Owing to advances in video and optical technology, video equipment is now used in a variety of fields such as identification, security, and factory automation systems for manufacturing. In this paper, we investigate an algorithm that effectively recognizes an experimental object in an input image that is partially defective because of a mechanical problem in the imaging device. The proposed object recognition algorithm moves and rotates the vertices that make up the outline of the experimental object onto the positions of the respective vertices that make up the outline of the DB model. Then, the discordance values between the moved and rotated experimental object and the corresponding DB model are calculated, and the minimum discordance value is selected. This minimum is the final discordance value between the experimental object and that DB model, and the DB model with the smallest final discordance value is selected as the recognition result for the experimental object. The proposed method obtains satisfactory recognition results using only partial information about the experimental object.
https://doi.org/10.17703/IJACT.2019.7.4.229

ADD-Net: Attention Based 3D Dense Network for Action Recognition

Man, Qiaoyue;Cho, Young Im
- 한국컴퓨터정보학회논문지 / Vol. 24, No. 6 / pp. 21-28 / 2019
In recent years, with the development of artificial intelligence and the success of deep models, deep learning has been deployed across all fields of computer vision. Action recognition, an important branch of human perception and computer vision research, has attracted increasing attention. It is a challenging task because of the particular complexity of human movement: the same movement may be performed by multiple individuals. A human action appears in video as a sequence of consecutive image frames, so action recognition requires more computational power than processing static images, and simply applying a CNN cannot achieve the desired results. Recently, attention models have achieved good results in computer vision and natural language processing. In particular, for video action classification, adding an attention model makes it more effective to focus on motion features and improves performance. It also shows intuitively which part the model attends to when making a particular decision, which is very helpful in real applications. In this paper, we propose a 3D dense convolutional network based on an attention mechanism (ADD-Net) for recognizing human motion behavior in video.
https://doi.org/10.9708/jksci.2019.24.06.021

Vocabulary Recognition Model using a convergence of Likelihood Principle Bayesian method and Bhattacharyya Distance Measurement based on Vector Model

오상엽
- 디지털융복합연구 / Vol. 13, No. 11 / pp. 165-170 / 2015
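Vocabulary recognition systems fail to recognize input words that fall outside the constructed models, or misrecognize input words that resemble other words as those similar words, which lowers the recognition rate. Existing systems build models from vector values and store them in a database for use in vocabulary recognition. Models formed during the search for vocabulary recognition are not stored in the database and therefore cannot be recognized, which is a drawback. This paper therefore constructs a Bayesian recognition model that applies the Bhattacharyya distance measure to feature vector models so that vector models formed during the search can be recognized, and applies a Wiener filter to improve the recognition rate. Experiments combining the two methods confirmed high recognition performance owing to improved reliability, and the proposed measure achieved an average performance of 98.2% in comparison with existing methods.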
https://doi.org/10.14400/JDC.2015.13.11.165
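
For orientation, the Bhattacharyya distance named in the abstract above can be sketched for the simple case of two discrete distributions (for example, normalized feature histograms). Treating the feature vector models as discrete distributions and picking the nearest stored model are assumptions made for illustration; the paper's Bayesian model and Wiener filtering step are not reproduced here, and the function names are hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """D_B = -ln(sum_i sqrt(p_i * q_i)) for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient <= 0.0:
        return math.inf  # the distributions do not overlap at all
    return -math.log(coefficient)


def closest_model(models, observed):
    """Return the name of the stored model with the smallest distance to `observed`."""
    return min(models, key=lambda name: bhattacharyya_distance(models[name], observed))
```

Identical distributions give a distance of zero, and the distance grows as the overlap between the two distributions shrinks.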

Decision Tree Learning Algorithms for Learning Model Classification in the Vocabulary Recognition System

오상엽
- 디지털융복합연구 / Vol. 11, No. 9 / pp. 153-158 / 2013
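When the learning models to be recognized are unclassified or not clearly classified, the system cannot decide on a vocabulary recognition result and the recognition rate drops. In addition, when the classification of learning models changes or a new learning model is added, the decision tree structure of the recognition model must be changed, which is a structural problem. To solve these problems, we propose a decision tree learning algorithm for learning model classification. We built a speech database that sufficiently reflects phonological phenomena and used a decision tree method for learning model classification to secure the learning effect. In vocabulary-dependent and vocabulary-independent recognition experiments in an indoor environment, the system achieved a recognition performance of 98.3% in the vocabulary-dependent experiment and 98.4% in the vocabulary-independent experiment.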
https://doi.org/10.14400/JDPM.2013.11.9.153

A Study On Three-dimensional Optimized Face Recognition Model: Comparative Studies and Analysis of Model Architectures

박찬준;오성권;김진율
- 전기학회논문지 / Vol. 64, No. 6 / pp. 900-911 / 2015
In this paper, a 3D face recognition model is designed using a polynomial-based RBFNN (Radial Basis Function Neural Network) and a PNN (Polynomial Neural Network), and its recognition rate is evaluated. In existing 2D face recognition models, the recognition rate may degrade under external conditions such as the brightness of the video, which affects facial features. 3D face recognition is therefore performed with a 3D scanner to overcome this disadvantage of 2D face recognition. In the preprocessing part, the 3D face images obtained for each pose variation are converted to frontal images using pose compensation. The depth data of the face shape is extracted using multiple point signature, and depth information for the whole face area is obtained using the tip of the nose as a reference point. Parameter optimization is carried out with both ABC (Artificial Bee Colony) and PSO (Particle Swarm Optimization) for effective training and recognition. Experimental data for face recognition are built from face images of students and researchers in the IC&CI Lab of Suwon University. Using the 3D face images acquired in the IC&CI Lab, the performance of 3D face recognition is evaluated and compared across the two types of models as well as the point signature method based on two kinds of depth data.
https://doi.org/10.5370/KIEE.2015.64.6.900

Implementation of Connected-Digit Recognition System Using Tree Structured Lexicon Model

윤영선;채의근
- 대한음성학회지:말소리 / No. 50 / pp. 123-137 / 2004
In this paper, we consider the implementation of a connected-digit recognition system using a tree-structured lexicon model. Implementing a fixed- or variable-length digit recognition system efficiently requires a finite state network (FSN). We merge the word network algorithm that implements the FSN with the lexical tree search algorithm used in general speech recognition systems for fast search and large vocabularies. To find an efficient model for the digit recognition system, we investigate how performance changes when the lexical tree search is applied.
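
A tree-structured lexicon of the kind the abstract above refers to is essentially a prefix tree over pronunciation units, so that words sharing initial phones share search effort. The sketch below is a generic illustration, not the paper's system: the function names and the `'#'` end-of-word marker are hypothetical, and it assumes no phone symbol is itself named `'#'`.

```python
def build_lexical_tree(lexicon):
    """Build a prefix tree; each node is a dict, and '#' marks a complete word."""
    root = {}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.setdefault(phone, {})
        node["#"] = word
    return root


def count_arcs(node):
    """Number of phone arcs in the tree (word markers are not counted)."""
    return sum(1 + count_arcs(child) for key, child in node.items() if key != "#")


def lookup(root, phones):
    """Follow the phone sequence and return the word it spells, or None."""
    node = root
    for phone in phones:
        if phone not in node:
            return None
        node = node[phone]
    return node.get("#")
```

Comparing `count_arcs` on the tree with the total number of phones in a flat lexicon shows how much of the search space the shared prefixes save.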

Search results: 3,389 items; processing time: 0.03 s
