Search | Korea Science

Efficient Speaker Identification based on Robust VQ-PCA (강인한 VQ-PCA에 기반한 효율적인 화자 식별)

Lee Ki-Yong
- Journal of Internet Computing and Services / v.5 no.3 / pp.57-62 / 2004
In this paper, an efficient speaker identification method based on robust vector quantization-principal component analysis (VQ-PCA) is proposed to address the problems caused by outliers and the high dimensionality of training feature vectors in speaker identification. First, the proposed method partitions the data space into several disjoint regions using robust VQ based on M-estimation. Second, a robust PCA is obtained from the covariance matrix in each region. Finally, the method obtains a Gaussian mixture model (GMM) for each speaker from the feature vectors transformed to a reduced dimension by the robust PCA in each region. Compared to the conventional GMM with diagonal covariance matrices, the proposed method achieves the same performance faster and with less storage, and it is also robust to outliers.
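The abstract does not give the details of its robust VQ, so the following is only a minimal sketch of one common M-estimation idea: a cluster centroid computed as an iteratively reweighted mean with Huber weights, which limits the pull of outliers. The function names, the `delta` parameter, and the sample data are illustrative assumptions, not the paper's method.

```python
import math

def huber_weight(residual, delta):
    """Huber M-estimator weight: 1 inside delta, delta / residual outside."""
    return 1.0 if residual <= delta else delta / residual

def robust_centroid(vectors, delta=1.0, iterations=20):
    """Iteratively reweighted mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    # Start from the ordinary (non-robust) mean.
    center = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    for _ in range(iterations):
        # Points far from the current center receive smaller weights.
        weights = [huber_weight(math.dist(v, center), delta) for v in vectors]
        total = sum(weights)
        center = [sum(w * v[d] for w, v in zip(weights, vectors)) / total
                  for d in range(dim)]
    return center

# Four inliers near the origin and one gross outlier (made-up data).
points = [[0.0, 0.0], [0.2, -0.1], [-0.1, 0.1], [0.1, 0.2], [50.0, 50.0]]
plain_mean = [sum(p[d] for p in points) / len(points) for d in range(2)]
robust = robust_centroid(points)
```

In this toy data the plain mean is dragged toward the outlier, while the reweighted centroid stays close to the inliers. A full VQ would repeat this centroid update for every codeword after each assignment step.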

Model-based Curved Lane Detection using Geometric Relation between Camera and Road Plane (카메라와 도로평면의 기하관계를 이용한 모델 기반 곡선 차선 검출)

Jang, Ho-Jin;Baek, Seung-Hae;Park, Soon-Yong
- Journal of Institute of Control, Robotics and Systems / v.21 no.2 / pp.130-136 / 2015
In this paper, we propose a robust curved lane marking detection method. Several lane detection methods have been proposed; however, most of them consider only straight lanes, and curved lane detection has been studied far less than straight lane detection. This paper proposes a new curved lane detection and tracking method that is robust to various illumination conditions. First, the proposed method detects straight lanes using a robust road feature image. Using the geometric relation between the vehicle camera and the road plane, several circle models are generated, which are then projected as curved lane models onto the camera images. The curved lane models are superimposed on the detected straight lanes and matched against the road feature image. Each curve model then receives votes based on the distribution of road features. Finally, the curve model with the most votes is selected as the true curve model. Experimental results demonstrate the performance and efficiency of the proposed algorithm.
https://doi.org/10.5302/J.ICROS.2015.14.9008
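The voting step can be illustrated with a simplified sketch. It assumes the feature points are already expressed in road-plane coordinates, so the camera projection described in the abstract is skipped, and each candidate circle is tangent to the straight lane at the origin. The function names, the tolerance, and the candidate radii are hypothetical.

```python
import math

def vote_for_radius(points, radius, tolerance=0.5):
    """Count road-plane feature points lying near a circle of signed radius.

    The circle is tangent to the straight lane at the origin (heading +y);
    a positive radius curves right, a negative radius curves left.
    """
    center_x = radius  # the circle center sits on the x-axis
    votes = 0
    for x, y in points:
        distance = math.hypot(x - center_x, y)
        if abs(distance - abs(radius)) < tolerance:
            votes += 1
    return votes

def select_curve(points, candidate_radii, tolerance=0.5):
    """Return the candidate radius that collects the most votes."""
    return max(candidate_radii,
               key=lambda r: vote_for_radius(points, r, tolerance))

# Synthetic feature points sampled from a right curve of radius 50.
true_radius = 50.0
features = [(true_radius - true_radius * math.cos(t / 10),
             true_radius * math.sin(t / 10)) for t in range(1, 11)]
best = select_curve(features, [-100.0, -50.0, 50.0, 100.0])
```

Points close to the origin are consistent with several candidates, which is why the decision rests on the vote count over the whole arc and not on any single point.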

Improving the Processing Speed and Robustness of Face Detection for a Psychological Robot Application (심리로봇적용을 위한 얼굴 영역 처리 속도 향상 및 강인한 얼굴 검출 방법)

Ryu, Jeong Tak;Yang, Jeen Mo;Choi, Young Sook;Park, Se Hyun
- Journal of Korea Society of Industrial Information Systems / v.20 no.2 / pp.57-63 / 2015
Compared to other emotion recognition technologies, facial expression recognition has the advantages of being non-contact, non-coercive, and convenient. To be applied to a psychological robot, the vision system must be able to extract the face region quickly and accurately in the step that precedes facial expression recognition. In this paper, we remove the background from the input image using a YCbCr skin color model and then use Haar-like features for robust face detection. Removing the background from the input image improved processing speed and made face detection more robust.
https://doi.org/10.9723/jksiis.2015.20.2.057
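A minimal sketch of YCbCr skin-color background removal follows. The conversion uses the full-range BT.601 (JPEG-style) formulas; the Cb and Cr thresholds are commonly quoted values and are an assumption here, since the abstract does not state which ranges the authors used.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG-style) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Assumed skin-tone ranges in the chroma plane (not taken from the paper).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)

def is_skin(r, g, b):
    """True when the pixel's chroma falls inside the assumed skin ranges."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]

def skin_mask(image):
    """Boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]

# 2x2 test image: a skin-like tone, blue, green, and white.
image = [[(224, 172, 105), (0, 0, 255)],
         [(0, 255, 0), (255, 255, 255)]]
mask = skin_mask(image)
```

Pixels outside the mask would be discarded as background, so a later face detector only has to scan the remaining regions.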

A Weighted Feature Voting Approach for Robust and Real-Time Voice Activity Detection

Moattar, Mohammad Hossein;Homayounpour, Mohammad Mehdi
- ETRI Journal / v.33 no.1 / pp.99-109 / 2011
This paper concerns a robust real-time voice activity detection (VAD) approach which is easy to understand and implement. The proposed approach employs several short-term speech/nonspeech discriminating features in a voting paradigm to achieve reliable performance in different environments. This paper mainly focuses on improving the performance of a recently proposed approach which uses the spectral peak valley difference (SPVD) as a feature for silence detection. The main idea is to apply a set of features together with SPVD to improve VAD robustness. The proposed approach uses a weighted voting scheme in order to take the discriminative power of each employed feature into account. The experiments show that the proposed approach is more robust than the baseline approach in several respects, including channel distortion and threshold selection. The proposed approach is also compared with some other VAD techniques to further confirm its gains. Using the proposed weighted voting approach, the average VAD performance is increased to 89.29% for 5 different noise types and 8 SNR levels. The resulting performance is 13.79% higher than the approach based only on SPVD and 2.25% higher than the unweighted voting scheme.
https://doi.org/10.4218/etrij.11.1510.0158
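A weighted voting decision of the kind described above can be sketched as follows. Only SPVD is named in the abstract; the other feature names, all thresholds and weights, and the rule that a larger feature value means speech are assumptions made for illustration.

```python
def weighted_vote(features, thresholds, weights, decision_ratio=0.5):
    """Return True (speech) when the weighted share of speech votes
    exceeds decision_ratio.

    A feature votes 'speech' when its value is above its threshold.
    """
    total = sum(weights[name] for name in features)
    score = sum(weights[name] for name, value in features.items()
                if value > thresholds[name])
    return score / total > decision_ratio

# Hypothetical per-feature thresholds and weights.
thresholds = {"spvd": 10.0, "energy": 40.0, "pitch_strength": 0.3}
weights = {"spvd": 0.5, "energy": 0.3, "pitch_strength": 0.2}

# SPVD and pitch strength vote for speech; energy does not.
frame_speech = {"spvd": 14.2, "energy": 35.0, "pitch_strength": 0.6}
# Only energy votes for speech.
frame_noise = {"spvd": 3.1, "energy": 55.0, "pitch_strength": 0.1}

is_speech = weighted_vote(frame_speech, thresholds, weights)
is_noise_speech = weighted_vote(frame_noise, thresholds, weights)
```

With equal weights this reduces to plain majority voting; unequal weights let a more discriminative feature outvote weaker ones.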

Preprocessing and Facial Feature Robust to Illumination Variations (조명변화에 강인한 전처리 및 얼굴특징)

Kim, Dong-Ju;Lee, Sang-Heon;Kim, Hyun-Duk
- KIPS Transactions on Software and Data Engineering / v.2 no.7 / pp.503-506 / 2013
In this paper, we propose a face recognition method that combines the ECSP preprocessing technique, a modified version of the earlier CS-LBP, with the illumination-robust D2D-PCA feature. The proposed method was evaluated on the Yale B database using various binary pattern operators and feature extraction algorithms such as the well-known PCA and 2D-PCA. As a result, the proposed method showed the best recognition accuracy among the compared approaches, and we confirmed that it is robust to illumination variation.
https://doi.org/10.3745/KTSDE.2013.2.7.503

A Study on Robust Pattern Classification of Lung Sounds for Diagnosis of Pulmonary Dysfunction in Noise Environment (폐질환 진단을 위한 잡음환경에 강건한 폐음 패턴 분류법에 관한 연구)

Yeo, Song-Phil;Jeon, Chang-Ik;Yoo, Se-Keun;Kim, Duk-Young;Kim, Sung-Hwan
- The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.3 / pp.122-128 / 2002
In this paper, a robust pattern classification method for breath sounds is proposed for the diagnosis of pulmonary dysfunction in noisy environments. A feature parameter extraction method based on the highpass lifter algorithm and the projection measure (PM) algorithm are used. Seventeen different groups of breath sounds are experimentally classified and investigated. To evaluate performance, classification was performed with six combinations of the proposed methods: ARC with EDM, LCC with EDM, WLCC with EDM, ARC with PM, LCC with PM, and WLCC with PM. Furthermore, all feature parameters are extracted up to the 80th order in steps of 5, and all experiments are evaluated under increasing noise, from an SNR of 24 dB down to 0 dB. As a result, WLCC, which is derived from the highpass lifter algorithm, is selected as the feature parameter extraction method. The experimental comparison shows that PM is more robust than EDM in noisy environments. The WLCC with PM method (WLCC/PM) performs better under increasing noise for the diagnosis of pulmonary dysfunction.

Robust Music Identification Using Long-Term Dynamic Modulation Spectrum

Kim, Hyoung-Gook;Eom, Ki-Wan
- The Journal of the Acoustical Society of Korea / v.25 no.2E / pp.69-73 / 2006
In this paper, we propose a robust music audio fingerprinting system for automatic music retrieval. The fingerprint feature is extracted from the long-term dynamic modulation spectrum (LDMS) estimated in the perceptually compressed domain. The major advantage of this feature is its significant robustness against severe background noise from streets and cars. Furthermore, fast searching is performed by looking up a hash table with 32-bit hash values. The hash value bits are quantized from the logarithmic-scale modulation frequency coefficients. Experiments show that the LDMS fingerprint offers high scalability, robustness, and a small fingerprint size. Moreover, performance under severe recording-noise conditions improves remarkably compared with other robust fingerprints based on the power spectrum.
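The hash lookup can be sketched as below. The abstract says only that the 32 hash bits are quantized from the modulation frequency coefficients; the sign-of-difference rule between adjacent coefficients used here is an assumed quantizer, and the function names and track identifiers are hypothetical.

```python
def hash_from_coefficients(coefficients):
    """Pack 32 bits into an integer: bit i is 1 when coefficient i+1
    exceeds coefficient i (assumed quantization rule)."""
    if len(coefficients) != 33:
        raise ValueError("expected 33 coefficients to produce 32 bits")
    value = 0
    for i in range(32):
        bit = 1 if coefficients[i + 1] > coefficients[i] else 0
        value = (value << 1) | bit
    return value

def build_index(fingerprints):
    """Map each hash value to the (track_id, frame_index) pairs holding it."""
    index = {}
    for track_id, hashes in fingerprints.items():
        for frame, hash_value in enumerate(hashes):
            index.setdefault(hash_value, []).append((track_id, frame))
    return index

rising = [float(i) for i in range(33)]        # every difference positive
falling = [float(-i) for i in range(33)]      # every difference negative
database = {
    "track_a": [hash_from_coefficients(rising)],
    "track_b": [hash_from_coefficients(falling)],
}
index = build_index(database)
matches = index.get(hash_from_coefficients(rising), [])
```

A query then costs one dictionary lookup per frame hash, and candidate tracks can be ranked by how many of their frames match.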

HMM-based missing feature reconstruction for robust speech recognition in additive noise environments (가산잡음환경에서 강인음성인식을 위한 은닉 마르코프 모델 기반 손실 특징 복원)

Cho, Ji-Won;Park, Hyung-Min
- Phonetics and Speech Sciences / v.6 no.4 / pp.127-132 / 2014
This paper describes a robust speech recognition technique that reconstructs spectral components mismatched with the training environment. The cluster-based reconstruction method can compensate for unreliable components using the reliable components in the same spectral vector by assuming that the training spectral vectors follow an independent, identically distributed Gaussian-mixture process. In contrast, the presented method exploits the temporal dependency of speech to reconstruct the components by introducing a hidden-Markov-model prior, which incorporates internal state transitions plausible for the observed spectral vector sequence. The experimental results indicate that the described method provides temporally consistent reconstruction and, on average, further improves recognition performance compared to the conventional method.
https://doi.org/10.13064/KSSS.2014.6.4.127

Robust Histogram Equalization Using Compensated Probability Distribution

Kim, Sung-Tak;Kim, Hoi-Rin
- MALSORI / v.55 / pp.131-142 / 2005
A mismatch between training and test conditions often causes a drastic decrease in the performance of speech recognition systems. In this paper, non-linear transformation techniques based on histogram equalization in the acoustic feature space are studied to reduce this mismatch. The purpose of histogram equalization (HEQ) is to convert the probability distribution of test speech into that of training speech. Conventional histogram equalization methods consider only the probability distribution of the test speech, but for noise-corrupted test speech this distribution is itself distorted. A transformation function obtained from the distorted distribution may mistransform the feature vectors, which degrades the performance of histogram equalization. Therefore, this paper proposes a new method that calculates a noise-removed probability distribution by assuming that the CDF of noisy speech feature vectors consists of a speech component and a noise component, and this compensated probability distribution is used in the HEQ process. In the AURORA-2 framework, the proposed method reduced the error rate by over 44% in the clean training condition compared to the baseline system. In the multi-condition training setting, the proposed method also outperforms the baseline system.
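For reference, conventional histogram equalization (the baseline idea, not the compensated method proposed in the paper) maps a test value x to F_train^{-1}(F_test(x)). The sketch below implements that mapping for one feature dimension with empirical CDFs; the function name and the toy data are assumptions.

```python
from bisect import bisect_right

def equalize(value, test_sorted, train_sorted):
    """Map a test feature value onto the training distribution.

    Uses empirical CDFs: the rank of the value among the sorted test
    samples is converted to the training sample at the same quantile.
    """
    n, m = len(test_sorted), len(train_sorted)
    # Number of test samples <= value, clamped so the quantile is > 0.
    rank = max(bisect_right(test_sorted, value), 1)
    # Integer form of ceil(rank * m / n) - 1, avoiding float rounding.
    index = (rank * m + n - 1) // n - 1
    return train_sorted[index]

train = sorted([1.0, 2.0, 3.0, 4.0, 5.0])
test = sorted([11.0, 12.0, 13.0, 14.0, 15.0])   # same shape, shifted by +10
equalized = [equalize(x, test, train) for x in test]
```

In this toy case the constant offset between test and training data is removed exactly. With noisy test speech the empirical test CDF is itself distorted, which is the weakness the abstract sets out to compensate for.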

Efficient Image Search using Advanced SURF and DCD on Mobile Platform (모바일 플랫폼에서 개선된 SURF와 DCD를 이용한 효율적인 영상 검색)

Lee, Yong-Hwan
- Journal of the Semiconductor & Display Technology / v.14 no.2 / pp.53-59 / 2015
As the number of digital images in use continues to grow, users find it increasingly difficult to locate specific images in a collection. This paper proposes a novel image search scheme that extracts image features using a combination of Advanced SURF (Speeded-Up Robust Features) and DCD (Dominant Color Descriptor). The key point of this research is a new feature extraction algorithm that improves the existing SURF method by removing features that are unnecessary for image retrieval, so that it can be adapted to mobile systems and run efficiently in mobile environments. To evaluate the proposed scheme, we assessed its performance in simulation in terms of average precision and F-score on two databases commonly used in the field of image retrieval. The experimental results show that the proposed algorithm improves retrieval effectiveness by over 14.4% compared to OpenSURF. The main contribution of this paper is that the proposed approach achieves high accuracy and stability by using ASURF and DCD when searching for natural images on a mobile platform.

