Search | Korea Science

The design of application program in Multi-modal system (멀티모달을 이용한 응용프로그램 제어에 관한 연구)

Choi Kwang-Kook;Kwak Sang-Hun;Ha Yan-Dol-I;Kim Yu-Jin;Kim Cheol;Choi Seung-Ho
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.205-208 / 2000
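This paper studies the control of application software in a multimodal system. A speech recognizer and a lip recognizer are combined, and the system is designed to control the commands of Comdio, which receives text data. Both recognizers are implemented with HMMs, and when they are combined, weights of 8:2 are assigned to the speech and lip recognizers, respectively.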

A Study on the ID Visual System (개인식별을 위한 영상시스템 연구)

심정범;이진행;송현교;강민구
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 1998.11a / pp.208-213 / 1998
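As social structures grow more complex, ensuring security is becoming an increasingly important social issue. The most important question in security is whether automated authentication technology can be developed that determines, accurately and quickly, whether each individual is who they claim to be. Research on personal identification for this purpose has been dominated by methods that use invariant characteristics of parts of the body, such as fingerprint recognition, skull-based identification, palm print recognition, footprint recognition, lip recognition, iris recognition, and skeletal recognition. This study compiles image processing material for a comprehensive imaging system for personal identification.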

Facial Features Detection Using Heuristic Cost Function (얼굴의 특성을 반영하는 휴리스틱 평가함수를 이용한 얼굴 특징 검출)

Jang, Gyeong-Sik
- The KIPS Transactions:PartB / v.8B no.2 / pp.183-188 / 2001
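This paper proposes a method for effectively locating the pupils using information about the shape of the eye, and a method for recognizing facial features such as the positions of the pupils and the mouth using an evaluation function that reflects facial characteristics. The lip and face regions are extracted using color information, and the pupils are recognized using a function based on the brightness difference between the pupil and the white of the eye. Finally, an evaluation function reflecting facial characteristics is defined and used to recognize the final face, eyes, and mouth. Experiments with the proposed method on a variety of images produced good results.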

Estimation of speech feature vectors and enhancement of speech recognition performance using lip information (입술정보를 이용한 음성 특징 파라미터 추정 및 음성인식 성능향상)

Min So-Hee;Kim Jin-Young;Choi Seung-Ho
- MALSORI / no.44 / pp.83-92 / 2002
Speech recognition performance is severely degraded in noisy environments. One approach to this problem is audio-visual speech recognition. In this paper, we discuss experimental results for bimodal speech recognition based on speech feature vectors enhanced with lip information. To transform lip information into speech parameters, we try several kinds of speech features, including linear prediction coefficients, cepstrum, and log area ratio. The experimental results show that the cepstrum is the best feature in terms of recognition rate. We also present desirable weighting values for the audio and visual information as a function of the signal-to-noise ratio.

Implementation of a Multimodal Controller Combining Speech and Lip Information (음성과 영상정보를 결합한 멀티모달 제어기의 구현)

Kim, Cheol;Choi, Seung-Ho
- The Journal of the Acoustical Society of Korea / v.20 no.6 / pp.40-45 / 2001
In this paper, we implemented a multimodal system combining speech and lip information and evaluated its performance. We designed a speech recognizer that uses speech information and a lip recognizer that uses image information. Both recognizers were based on an HMM recognition engine. To combine them, we adopted the late integration method with a speech-to-lip weighting ratio of 8:2. The resulting multimodal recognition system was ported to the DARC system, where it was used to control Comdio of DARC. The interface between DARC and our system was implemented with a TCP/IP socket. The experimental results of controlling Comdio showed that lip recognition can serve as an auxiliary means for the speech recognizer by improving the recognition rate. We also expect that the multimodal system can be successfully applied to traffic information systems and CNS (Car Navigation System).
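The 8:2 late integration described in this abstract can be illustrated with a minimal sketch. It assumes each recognizer returns one log-likelihood score per candidate command and that the combined score is a weighted sum; the function name `late_fusion`, the command words, and the score values are hypothetical and do not come from the paper.

```python
def late_fusion(speech_scores, lip_scores, w_speech=0.8, w_lip=0.2):
    """Combine per-command scores from two recognizers by a weighted sum.

    Both arguments map a command word to a log-likelihood (higher is better).
    The 0.8/0.2 defaults mirror the 8:2 ratio stated in the abstract; the
    weighted-sum rule itself is an assumption made for illustration.
    """
    combined = {
        word: w_speech * speech_scores[word] + w_lip * lip_scores[word]
        for word in speech_scores
    }
    best = max(combined, key=combined.get)
    return best, combined


# Hypothetical scores: speech alone slightly prefers "stop",
# while the lip recognizer strongly prefers "play".
speech = {"play": -10.0, "stop": -9.0}
lip = {"play": -5.0, "stop": -20.0}
best_word, scores = late_fusion(speech, lip)
print(best_word, scores)
```

In this made-up case the lip scores overturn the speech-only decision, which is the sense in which lip recognition acts as an auxiliary to the speech recognizer.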

Robust Endpoint Detection for Bimodal System in Noisy Environments (잡음환경에서의 바이모달 시스템을 위한 견실한 끝점검출)

오현화;권홍석;손종목;진성일;배건성
- Journal of the Institute of Electronics Engineers of Korea CI / v.40 no.5 / pp.289-297 / 2003
The performance of a bimodal system is affected by the accuracy of endpoint detection from the input signal as well as by the performance of the speech recognition or lipreading system. In this paper, we propose an endpoint detection method that detects endpoints from the audio and video signals separately and uses the signal-to-noise ratio (SNR) estimated from the input audio signal to select the endpoints that are more reliable under acoustic noise. In other words, the endpoints are taken from the audio signal when the SNR is high and from the video signal when the SNR is low. Experimental results show that the bimodal system using the proposed endpoint detector achieves satisfactory recognition rates, especially when the acoustic environment is quite noisy.
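The SNR-based selection rule in this abstract can be sketched as follows. The abstract does not say how the SNR is estimated or where the switching point lies, so the leading-noise estimate, the 10 dB threshold, and the names `estimate_snr_db` and `select_endpoints` are all assumptions made for illustration.

```python
import math


def estimate_snr_db(samples, noise_len):
    """Rough SNR estimate in dB.

    Assumes (hypothetically) that the first `noise_len` samples contain
    only background noise and the remainder contains the utterance.
    """
    noise = samples[:noise_len]
    rest = samples[noise_len:]
    noise_power = max(sum(s * s for s in noise) / len(noise), 1e-12)
    signal_power = sum(s * s for s in rest) / len(rest)
    return 10.0 * math.log10(signal_power / noise_power)


def select_endpoints(audio_endpoints, video_endpoints, snr_db, threshold_db=10.0):
    """Use audio endpoints at high SNR and video endpoints at low SNR.

    Endpoints are (start, end) pairs. The 10 dB threshold is a placeholder,
    not a value taken from the paper.
    """
    return audio_endpoints if snr_db >= threshold_db else video_endpoints


# Toy signal: 100 quiet samples followed by 100 loud samples.
signal = [0.01] * 100 + [1.0] * 100
snr = estimate_snr_db(signal, noise_len=100)
print(select_endpoints((100, 200), (95, 205), snr))
```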
KSCI

Definition of Optimal Face Region for Face Recognition with Phase-Only Correlation (위상 한정 상관법으로 얼굴을 인식하기 위한 최적 얼굴 영역의 정의)

Lee, Choong-Ho
- Journal of the Institute of Convergence Signal Processing / v.13 no.3 / pp.150-155 / 2012
POC (Phase-Only Correlation) is a useful method that can perform face recognition without feature extraction or eigenfaces, but it applies the Fourier transform to square areas. In this paper, we propose an effective face area to improve the performance of face recognition using POC. Specifically, we experiment with three areas. The first area is the square area that includes the head and surrounding space. The second area is the square area from ear to ear horizontally and from the end of the chin to the forehead vertically. The third area is the square area from the line under the lips to the forehead vertically and from cheek to cheek horizontally. Experimental results show that, of the three, the second face area is the most advantageous for defining the POC threshold.
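For readers unfamiliar with POC, the following is a minimal sketch of the standard phase-only correlation computation on a small square array. It is not the paper's implementation: the naive pure-Python DFT, the function names, and the random 8x8 test image are assumptions chosen to keep the example self-contained. The height of the correlation peak is the kind of score a recognition threshold would be applied to.

```python
import cmath
import random


def dft1d(seq, inverse=False):
    """Naive 1-D discrete Fourier transform (adequate for tiny inputs)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def dft2d(img, inverse=False):
    """Separable 2-D DFT: transform rows first, then columns."""
    rows = [dft1d(row, inverse) for row in img]
    cols = [dft1d(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def poc_surface(f, g, eps=1e-12):
    """Phase-only correlation surface of two equally sized square images."""
    F, G = dft2d(f), dft2d(g)
    R = []
    for f_row, g_row in zip(F, G):
        row = []
        for a, b in zip(f_row, g_row):
            cross = b * a.conjugate()
            mag = abs(cross)
            # Keep only the phase; drop bins whose magnitude is ~0.
            row.append(cross / mag if mag > eps else 0j)
        R.append(row)
    return [[v.real for v in row] for row in dft2d(R, inverse=True)]


def peak(surface):
    """Return (value, row, col) of the highest point on the surface."""
    return max((v, y, x) for y, row in enumerate(surface) for x, v in enumerate(row))


# Hypothetical test: g is f circularly shifted by (2, 3).
rng = random.Random(0)
N = 8
f = [[rng.random() for _ in range(N)] for _ in range(N)]
g = [[f[(y - 2) % N][(x - 3) % N] for x in range(N)] for y in range(N)]
print(peak(poc_surface(f, g)))
```

For two copies of the same image the surface has a single sharp peak at the displacement between them; for unrelated images the peak is much lower, which is why a threshold on the peak height can separate matches from non-matches.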
KSCI

A study on the implementation of identification system using facial multi-modal (얼굴의 다중특징을 이용한 인증 시스템 구현)

정택준;문용선
- Journal of the Korea Institute of Information and Communication Engineering / v.6 no.5 / pp.777-782 / 2002
This study proposes multimodal recognition based on multiple facial features, in place of existing monomodal biometrics, to improve recognition accuracy and user convenience. Each biometric feature vector is obtained as follows. For the face, the feature is computed by principal component analysis with wavelet multiresolution. For the lips, a filter is first used to detect the lip edges; then, using the thinned image and the least squares method, the coefficients of an equation describing the lips are derived. A further feature is the distance ratio of facial parameters. We used a backpropagation neural network and ran experiments with the features above as inputs. Based on the experimental results, we discuss the advantages and efficiency of the approach.
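The least squares step for the lips can be sketched as below. The abstract does not state the form of the equation, so a parabola y = a·x² + b·x + c fitted to edge points is assumed here purely for illustration; the function name `fit_parabola` and the sample points are hypothetical.

```python
def fit_parabola(points):
    """Least-squares fit of y = a*x^2 + b*x + c to (x, y) points.

    Solves the 3x3 normal equations by Gaussian elimination with
    partial pivoting. The parabola model is an assumption; the paper
    does not specify which equation is fitted to the lip contour.
    """
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # sums of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # sums of y*x^k
    A = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    n = 3
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(i + 1, n):
            m = A[r][i] / A[i][i]
            for c in range(i, n + 1):
                A[r][c] -= m * A[i][c]
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][c] * coef[c] for c in range(i + 1, n))
        coef[i] = (A[i][n] - tail) / A[i][i]
    return tuple(coef)  # (a, b, c)


# Hypothetical thinned edge points lying on y = 0.5*x^2 - 2*x + 3.
edge_points = [(x, 0.5 * x * x - 2 * x + 3) for x in range(-3, 4)]
print(fit_parabola(edge_points))
```

The fitted coefficients would then serve as the lip feature values passed to the classifier alongside the other facial features.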
KSCI

A Study on Analysis of Variant Factors of Recognition Performance for Lip-reading at Dynamic Environment (동적 환경에서의 립리딩 인식성능저하 요인분석에 대한 연구)

신도성;김진영;이주헌
- The Journal of the Acoustical Society of Korea / v.21 no.5 / pp.471-477 / 2002
Recently, lip-reading has been studied actively as an auxiliary method for automatic speech recognition (ASR) in noisy environments. However, almost all research results have been obtained on databases constructed under indoor conditions, so we do not know how robust the developed lip-reading algorithms are to dynamic variation of the image. We have developed a lip-reading system based on an image-transform-based algorithm. This system recognizes 22 words and achieves a word recognition rate of up to 53.54%. In this paper, we present how stable the lip-reading system is under environmental variance and what the main factors are that degrade word-recognition performance. To study lip-reading robustness, we consider spatial variance (translation, rotation, scaling) and illumination variance. Two kinds of test data are used: a simulated lip image database and a real dynamic database captured in a car environment. Our experiments show that spatial variance is one of the factors that degrade lip-reading performance, but it is not the most important one. Illumination variance reduces recognition rates severely, by as much as 70%. In conclusion, lip-reading algorithms that are robust to illumination variance should be developed before lip reading can be used as a complementary method for ASR.
KSCI

A study on lip-motion recognition algorithms (입 모양 인식 기술이 비교 연구)

Park, Han-Mu;Jung, Jin-Woo
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2008.04a / pp.268-270 / 2008
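Face recognition is one of the representative areas of image processing, and a wide range of application systems have been developed to date. Face recognition works by converting facial features such as the eyes, nose, and mouth into values and analyzing the correlations among those feature values; because the shape of the mouth varies greatly, it is rarely used as a feature in face recognition. In contrast, it is used as an important feature in specific application systems such as facial expression recognition and speaker recognition. Recognizing the mouth shape means recognizing the shape of the lips and how it changes; although much research has been done on this, most of it has used the mouth shape as an auxiliary means for speech recognition. This paper surveys the lip-motion recognition techniques proposed to date and considers application systems to which they could newly be applied.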

Search Result 93, Processing Time 0.026 seconds
