• Title/Summary/Keyword: Feature-based Warping

Multi-view Image Generation from Stereoscopic Image Features and the Occlusion Region Extraction (가려짐 영역 검출 및 스테레오 영상 내의 특징들을 이용한 다시점 영상 생성)

  • Lee, Wang-Ro;Ko, Min-Soo;Um, Gi-Mun;Cheong, Won-Sik;Hur, Nam-Ho;Yoo, Ji-Sang
    • Journal of Broadcast Engineering / v.17 no.5 / pp.838-850 / 2012
  • In this paper, we propose a novel algorithm that generates multi-view images using various image features obtained from the given stereoscopic images. We first create an intensity gradient saliency map from the stereo images. We then calculate a block-based optical flow that represents the relative movement (disparity) of each block of a certain size between the left and right images, and we also obtain the disparities of feature points extracted by SIFT (scale-invariant feature transform). We then create a disparity saliency map by combining these extracted disparity features. The disparity saliency map is refined through occlusion detection and the removal of false disparities. Next, we extract straight line segments in order to minimize the distortion of straight lines during image warping. Finally, we generate multi-view images with a grid mesh-based image warping algorithm, in which the extracted image features are used as constraints. The experimental results show that the proposed algorithm performs better than the conventional DIBR algorithm in terms of visual quality.

Similarity-Based Subsequence Search in Image Sequence Databases (이미지 시퀀스 데이터베이스에서의 유사성 기반 서브시퀀스 검색)

  • Kim, In-Bum;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.10D no.3 / pp.501-512 / 2003
  • This paper proposes an indexing technique for fast retrieval of similar image subsequences using the multi-dimensional time warping distance. The time warping distance is a more suitable similarity measure than the Lp distance in many applications where sequences may have different lengths and/or different sampling rates. Our indexing scheme employs a disk-based suffix tree as the index structure and uses a lower-bound distance function to filter out dissimilar subsequences without false dismissals. It applies normalization for easier control of the relative weighting of feature dimensions, and discretization to compress the index tree. Experiments on medical and synthetic image sequences verify that the proposed method significantly outperforms the naive method and scales well to large image sequence databases.
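
The time warping distance mentioned in this abstract can be illustrated with a minimal sketch. It uses the textbook dynamic-programming recurrence with a Euclidean per-element distance; the function name `dtw_distance`, the base distance, and the sample sequences are assumptions for illustration, not the paper's exact definition, and the suffix-tree index and lower-bound filter are not shown.

```python
import math

def dtw_distance(a, b):
    """Time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between elements (an assumed base distance)
            d = math.dist(a[i - 1], b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Same trajectory sampled at two different rates (hypothetical data)
slow = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
fast = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
print(dtw_distance(slow, fast))  # 0.0
```

The two sequences differ in length, so an element-wise Lp distance is not directly defined for them, while the warping distance aligns repeated samples at no cost.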

Feature-Strengthened Gesture Recognition Model Based on Dynamic Time Warping for Multi-Users (다중 사용자를 위한 Dynamic Time Warping 기반의 특징 강조형 제스처 인식 모델)

  • Lee, Suk Kyoon;Um, Hyun Min;Kwon, Hyuck Tae
    • KIPS Transactions on Software and Data Engineering / v.5 no.10 / pp.503-510 / 2016
  • The recently proposed FsGr model is an accelerometer-based gesture recognition approach that applies the DTW algorithm in two steps, which improves the recognition success rate. In the FsGr model, sets of similar gestures are produced during the training phase, which defines the notion of a set of similar gestures. At the first recognition attempt, if the result belongs to a set of similar gestures, a second recognition attempt is made on the feature-strengthened parts extracted from that set. However, since the same gesture shows drastically different characteristics depending on physical traits such as body size, age, and sex, the FsGr model may not be adequate for multi-user environments. In this paper, we propose the FsGrM model, which extends the FsGr model to multi-user environments, and present a program that controls the channel and volume of a smart TV using the FsGrM model.

Speaker Identification Using Dynamic Time Warping Algorithm (동적 시간 신축 알고리즘을 이용한 화자 식별)

  • Jeong, Seung-Do
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.5 / pp.2402-2409 / 2011
  • The voice carries distinguishable acoustic properties of the speaker as well as transmitting information. Speaker recognition is the task of determining who is speaking from the acoustic differences between speakers. It is roughly divided into two categories: speaker verification and speaker identification. Speaker verification confirms a speaker's identity based only on that speaker's voice. Speaker identification, in contrast, finds the speaker by searching for the most similar model in a database previously built from multiple subordinate sentences. This paper composes feature vectors from extracted MFCCs and uses the dynamic time warping algorithm to compare the similarity between features. In order to describe common characteristics based on the phonological features of spoken words, two subordinate sentences for each speaker are used as the training data. Thus, it is possible to identify a speaker even when the spoken words differ from those previously stored in the database.

Digital Isolated Word Recognition System based on MFCC and DTW Algorithm (MFCC와 DTW에 알고리즘을 기반으로 한 디지털 고립단어 인식 시스템)

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the KIEE Conference / 2008.10b / pp.290-291 / 2008
  • The most popular speech feature used in speech recognition today is the Mel-Frequency Cepstral Coefficients (MFCC), which reflect the perceptual characteristics of the human ear more accurately than other parameters. This paper adopts MFCC and its first-order difference, which reflects the dynamic character of the speech signal, as a combined parametric representation. Furthermore, we use the Dynamic Time Warping (DTW) algorithm to search for matching paths in the pattern recognition process. We use the software "GoldWave" to record English digits in a lab environment, and the simulation results indicate that the algorithm has higher recognition accuracy than others using LPCC, etc. as feature parameters in the experiment for the Digital Isolated Word Recognition (DIWR) system.
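
As a rough illustration of the "MFCC plus first-order difference" representation described here, the sketch below appends a simple backward difference to each frame of coefficients. The function name `add_delta`, the zero delta for the first frame, and the toy frames are assumptions; the abstract does not say how the difference was computed, and many systems use a regression over several frames instead.

```python
def add_delta(frames):
    """Append first-order differences to each frame of coefficients."""
    out = []
    for t, frame in enumerate(frames):
        # The first frame has no predecessor, so its delta is set to zero (assumption)
        prev = frames[t - 1] if t > 0 else frame
        delta = [c - p for c, p in zip(frame, prev)]
        out.append(list(frame) + delta)
    return out

# Three hypothetical frames with two coefficients each
frames = [[1.0, 2.0], [1.5, 1.0], [3.0, 1.0]]
augmented = add_delta(frames)
print(augmented[1])  # [1.5, 1.0, 0.5, -1.0]
```

Each output frame has twice the original dimension: the static coefficients followed by their frame-to-frame change.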

Implementation of the Auditory Sense for the Smart Robot: Speaker/Speech Recognition (로봇 시스템에의 적용을 위한 음성 및 화자인식 알고리즘)

  • Jo, Hyun;Kim, Gyeong-Ho;Park, Young-Jin
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.05a / pp.1074-1079 / 2007
  • We introduce a speech/speaker recognition algorithm for isolated words. In speaker verification, a Gaussian Mixture Model (GMM) is generally used to model the feature vectors of reference speech signals. On the other hand, a Dynamic Time Warping (DTW) based template matching technique was proposed for isolated word recognition several years ago. We combine these two different concepts in a single method and implement it in a real-time speaker/speech recognition system. With the proposed method, a small number of reference utterances (5 or 6 training repetitions) is enough to build a reference model that achieves 90% recognition performance.

Design and Implementation of Matching Engine for QbSH System Based on Polyphonic Music (다성음원 기반 QbSH 시스템을 위한 매칭엔진의 설계 및 구현)

  • Park, Sung-Joo;Chung, Kwang-Sue
    • Journal of Korea Multimedia Society / v.15 no.1 / pp.18-31 / 2012
  • This paper proposes a matching engine for a query-by-singing/humming (QbSH) system, which retrieves the most similar music information by comparing the input data with feature information extracted from polyphonic music such as MP3 files. The feature sequences transcribed from polyphonic music may contain many errors. To reduce the influence of these errors and improve performance, the matching engine adopts a chroma-scale representation, compensation, and asymmetric DTW (Dynamic Time Warping). The performance of various distance metrics is also investigated. In our experiment, the proposed QbSH system achieves an MRR (Mean Reciprocal Rank) of 0.718 for 1000 singing/humming queries when searching a database of 450 polyphonic music pieces.
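
For readers unfamiliar with the MRR figure quoted in this abstract, the following minimal sketch shows how Mean Reciprocal Rank is commonly computed. The function name and the sample ranks are hypothetical and are not data from the paper; counting a query whose target is not retrieved as 0 is a common convention assumed here.

```python
def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the correct item per query, or None if not retrieved."""
    # A query whose correct item is never retrieved contributes 0 (assumed convention)
    total = sum(1.0 / r for r in ranks if r is not None)
    return total / len(ranks)

# Four hypothetical queries: correct song ranked 1st, 2nd, not found, 4th
mrr = mean_reciprocal_rank([1, 2, None, 4])
print(mrr)  # 0.4375
```

An MRR of 1.0 means the correct item is always ranked first, so a higher value indicates that correct matches appear nearer the top of the result list.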

Vector Quantizer Based Speaker Normalization for Continuos Speech Recognition (연속음성 인식기를 위한 벡터양자화기 기반의 화자정규화)

  • Shin Ok-keun
    • The Journal of the Acoustical Society of Korea / v.23 no.8 / pp.583-589 / 2004
  • We propose a speaker normalization method based on a vector quantizer for continuous speech recognition (CSR) systems, in which no acoustic information is made use of. The proposed method, which improves on a previously reported speaker normalization scheme for a simple digit recognizer, builds a canonical codebook by iteratively training the codebook while its size is increased after each iteration from a relatively small initial size. Once the codebook is established, the warp factors of speakers are estimated by exhaustively comparing the warped versions of each speaker's utterance with the codebook. Two sets of phones are used to estimate the warp factors: one consisting of vowels only, and the other composed of all the phonemes. A piecewise linear warping function corresponding to the estimated warp factor is adopted to warp the power spectrum of the utterance. The warped feature vectors are then extracted and used to train and test the speech recognizer. The effectiveness of the proposed method is investigated through a set of recognition experiments using the TIMIT corpus and the HTK speech recognition toolkit. The experimental results show a recognition rate improvement comparable to that of the formant-based warping method.
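
A piecewise linear warping function of the kind mentioned in this abstract might look like the sketch below. It scales frequencies by a warp factor `alpha` up to a breakpoint and then uses a second linear segment so that the maximum frequency maps to itself. The breakpoint rule, the `break_ratio` parameter, and the 8000 Hz default are assumptions for illustration; the abstract does not give the paper's exact function.

```python
def piecewise_linear_warp(freq, alpha, f_max=8000.0, break_ratio=0.85):
    """Warp a frequency in Hz by factor alpha while keeping f_max fixed."""
    # Breakpoint chosen so that alpha * f0 stays below f_max (assumed rule)
    f0 = break_ratio * f_max / max(alpha, 1.0)
    if freq <= f0:
        return alpha * freq
    # Second segment joins (f0, alpha * f0) to (f_max, f_max)
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (freq - f0)

print(piecewise_linear_warp(1000.0, 1.1))  # low frequencies are scaled by alpha
print(piecewise_linear_warp(8000.0, 1.1))  # the band edge stays (approximately) fixed
```

Estimating the warp factor then amounts to trying a grid of `alpha` values and keeping the one whose warped features best match the reference, which in the abstract is the canonical codebook.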

A Study on Intelligent Control Algorithm Development for Cooperation Working of Human and Robot (인간과 로봇 협력작업을 위한 로봇 지능제어알고리즘 개발에 관한 연구)

  • Lee, Woo-Song;Jung, Yang-Guen;Park, In-Man;Jung, Jong-Gyu;Kim, Hui-Jin;Kim, Min-Seong;Han, Sung-Hyun
    • Journal of the Korean Society of Industry Convergence / v.20 no.4 / pp.285-297 / 2017
  • This study proposes a new approach to developing an intelligent control algorithm for cooperative work between humans and robots based on voice recognition. In speaker verification, a Gaussian Mixture Model is generally used to model the feature vectors of reference speech signals. On the other hand, Dynamic Time Warping based template matching techniques were presented for voice recognition several years ago. We combine these two different concepts in a single method and implement it in a real-time voice recognition system whose reference model achieves 95% recognition performance. The reliability of the voice recognition is illustrated by simulation and by experiments on a humanoid robot with 18 joints.

Same music file recognition method by using similarity measurement among music feature data (음악 특징점간의 유사도 측정을 이용한 동일음원 인식 방법)

  • Sung, Bo-Kyung;Chung, Myoung-Beom;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.13 no.3 / pp.99-106 / 2008
  • Recently, digital music retrieval has been used in many fields (Web portals, audio service sites, etc.). In these fields, music metadata is used for retrieval. If the metadata is incorrect or missing, it is hard to obtain highly accurate retrieval results. Content-based information retrieval, which uses the music itself, has been studied to solve this problem. In this paper, we propose a same-music recognition method using similarity measurement. Feature data of digital music are extracted from the music waveform using Simplified MFCC (Mel Frequency Cepstral Coefficient). Similarity between digital music files is measured using DTW (Dynamic Time Warping), which is used in the vision and speech recognition fields. To verify the proposed same-music recognition method, we ran 500 trials on 1000 randomly collected songs of the same genre, and all 500 succeeded. The 500 digital music files were made from 60 digital audio sources by mixing different compression codecs and bit rates. We showed that similarity measurement using DTW can recognize the same music.
