• Title/Summary/Keyword: robust extraction

Error-Tolerant Music Information Retrieval Method Using Query-by-Humming (허밍 질의를 이용한 오류에 강한 악곡 정보 검색 기법)

  • 정현열;허성필
    • The Journal of the Acoustical Society of Korea / v.23 no.6 / pp.488-496 / 2004
  • This paper describes a music information retrieval system that uses humming as the retrieval key. Humming is an easy way for the user to input a melody. However, several problems with humming degrade retrieval. One is a human factor: people sometimes do not sing accurately, especially if they are inexperienced or unaccompanied. Another arises from signal processing. A music information retrieval method should therefore be robust enough to surmount various humming errors and signal processing problems. A retrieval system has to extract pitch from the user's humming, but pitch extraction is not perfect: it often captures half or double pitches, even if the extraction algorithm takes the continuity of the pitch into account. Considering these problems, we propose a system that takes multiple pitch candidates into account. In addition to the frequencies of the pitch candidates, the confidence measures obtained from their powers are taken into consideration. We also propose a three-dimensional algorithm, an extension of the conventional DP algorithm, so that multiple pitch candidates can be treated. Moreover, in the proposed algorithm, DP paths are changed dynamically to take the deltaPitches and IOIratios of input and reference notes into account, in order to handle notes being split or unified. We carried out an evaluation experiment to compare the proposed system with a conventional system. In the experiment, the proposed method gave better retrieval performance than the conventional system.
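
The idea of matching against several pitch candidates per note can be sketched as follows. This is a simplified illustration, not the paper's three-dimensional DP: the function name `align_cost`, the gap penalty, and the local cost (semitone distance plus one minus confidence) are all assumptions made for the example, and the deltaPitch and IOIratio path changes are not modeled.

```python
def align_cost(query, reference, gap=2.0):
    """Align hummed notes to reference notes with a plain DP.

    query     : list of notes, each a list of (pitch, confidence) candidates
    reference : list of reference pitches (same unit as the candidates)
    gap       : hypothetical penalty for skipping a query or reference note
    """
    n, m = len(query), len(reference)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Pick the candidate that best explains this reference note;
            # low-confidence candidates pay a small extra cost.
            local = min(abs(pitch - reference[j - 1]) + (1.0 - conf)
                        for pitch, conf in query[i - 1])
            dp[i][j] = min(dp[i - 1][j - 1] + local,  # match the two notes
                           dp[i - 1][j] + gap,        # skip a query note
                           dp[i][j - 1] + gap)        # skip a reference note
    return dp[n][m]


# The second hummed note has an octave-high top candidate (74) and the
# correct pitch (62) only as its second candidate.
reference = [60, 62, 64]
query = [[(60, 0.9)], [(74, 0.6), (62, 0.4)], [(64, 0.9)]]
multi_cost = align_cost(query, reference)
top_only_cost = align_cost([[cands[0]] for cands in query], reference)
```

In this toy case, keeping the second candidate lets the alignment recover from the double-pitch error, so `multi_cost` is lower than `top_only_cost`.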

A Fast Algorithm of the Belief Propagation Stereo Method (신뢰전파 스테레오 기법의 고속 알고리즘)

  • Choi, Young-Seok;Kang, Hyun-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.5 / pp.1-8 / 2008
  • The belief propagation method, which has been studied recently, yields good performance in disparity extraction. The method, in which a target function is modeled as an energy function based on a Markov random field (MRF), solves the stereo matching problem by finding the disparity that minimizes the energy function. MRF models provide a robust and unified framework for vision problems such as stereo and image restoration. The belief propagation method produces quite accurate results, but it is difficult to implement in real time because its computational complexity is higher than that of other stereo methods. To relieve this problem, we propose a fast algorithm for the belief propagation method. The energy function consists of a data term and a smoothness term. The data term usually corresponds to the difference in brightness between correspondences, and the smoothness term indicates the continuity of adjacent pixels. Smoothness information is created from messages, which are assigned using four different message arrays for the pixel positions adjacent in four directions. Processing the four message arrays accounts for 80 percent of the whole program execution time. The proposed algorithm dramatically reduces the processing time required for message calculation, since the messages are produced not in four arrays but in a single array. In the last step of the disparity extraction process, the messages are read from the single integrated array, and the algorithm requires 1/4 of the computational complexity of the conventional method. Our method is evaluated by comparing its disparity error rates with those of the conventional method. Experimental results show that the proposed method remarkably reduces the execution time while rarely increasing the disparity error.
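
For orientation, the energy function and four-direction messages described above are commonly written as below. This is the standard min-sum formulation of belief propagation for stereo, given as an assumption about the general setting; the symbols ($D_p$, $V$, $m_{p \to q}$, $\mathcal{N}$) are not taken from the paper, and the single-array message scheme it proposes is not shown.

```latex
% Energy over a disparity labeling d: data term plus smoothness term
E(d) = \sum_{p} D_p(d_p) + \sum_{(p,q) \in \mathcal{N}} V(d_p, d_q)

% Message from pixel p to its neighbor q at iteration t
m^{t}_{p \to q}(d_q) = \min_{d_p} \Big[ D_p(d_p) + V(d_p, d_q)
    + \sum_{s \in \mathcal{N}(p) \setminus \{q\}} m^{t-1}_{s \to p}(d_p) \Big]

% Final disparity at q: minimize the belief over the four incoming messages
d_q^{*} = \arg\min_{d_q} \Big[ D_q(d_q) + \sum_{p \in \mathcal{N}(q)} m_{p \to q}(d_q) \Big]
```

Here $\mathcal{N}(q)$ is the set of four neighbors of pixel $q$ on the image grid, which is why a conventional implementation keeps one message array per direction.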

Robust Feature Extraction Based on Image-based Approach for Visual Speech Recognition (시각 음성인식을 위한 영상 기반 접근방법에 기반한 강인한 시각 특징 파라미터의 추출 방법)

  • Gyu, Song-Min;Pham, Thanh Trung;Min, So-Hee;Kim, Jing-Young;Na, Seung-You;Hwang, Sung-Taek
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.3 / pp.348-355 / 2010
  • In spite of developments in speech recognition technology, speech recognition in noisy environments is still a difficult task. To solve this problem, researchers have proposed various methods that use visual information in addition to audio information. However, visual information contains visual noise, just as audio information contains acoustic noise, and this visual noise degrades visual speech recognition. Therefore, how to extract visual feature parameters that enhance visual speech recognition performance is a topic of interest. In this paper, we propose a method for visual feature parameter extraction based on an image-based approach to enhance the recognition performance of an HMM-based visual speech recognizer. For the experiments, we constructed an audio-visual database consisting of 105 speakers, each of whom uttered 62 words. We applied histogram matching, lip folding, RASTA filtering, a linear mask, DCT, and PCA. The experimental results show that the recognition performance of the proposed method improved by about 21% over the baseline method.

A Thoracic Spine Segmentation Technique for Automatic Extraction of VHS and Cobb Angle from X-ray Images (X-ray 영상에서 VHS와 콥 각도 자동 추출을 위한 흉추 분할 기법)

  • Ye-Eun, Lee;Seung-Hwa, Han;Dong-Gyu, Lee;Ho-Joon, Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.1 / pp.51-58 / 2023
  • In this paper, we propose an organ segmentation technique for the automatic extraction of medical diagnostic indicators from X-ray images. To calculate diagnostic indicators of heart disease and spinal disease such as VHS (vertebral heart scale) and Cobb angle, it is necessary to accurately segment the thoracic spine, carina, and heart in a chest X-ray image. We adopted a deep neural network model in which the high-resolution representation of the image for each layer and the structure converted into a low-resolution feature map are connected in parallel. This structure enables the relative position information in the image to be effectively reflected in the segmentation process. It is shown that learning performance can be improved by combining the OCR module, in which pixel information and object information interact in a multi-step process, and the channel attention module, which allows each channel of the network to be reflected with a different weight value. In addition, a method of augmenting the learning data is presented to provide robust performance against changes in the position, shape, and size of the subject in the X-ray image. The effectiveness of the proposed technique was evaluated through an experiment using 145 human chest X-ray images and 118 animal X-ray images.

Comparative Study on Feature Extraction Schemes for Feature-based Structural Displacement Measurement (특징점 추출 기법에 따른 구조물 동적 변위 측정 성능에 관한 연구)

  • Junho Gong
    • Journal of the Korea institute for structural maintenance and inspection / v.28 no.3 / pp.74-82 / 2024
  • In this study, the feature point detection and displacement measurement performance of several feature extraction algorithms was compared and analyzed under varying environmental conditions and target types in a feature point-based displacement measurement algorithm. A three-story frame structure was designed for performance evaluation, and the displacement response of the structure was digitized at FHD (1920×1080) resolution. For performance analysis, the initial measurement distance was set to 10 m and increased up to 40 m in increments of 10 m. During the experiments, illuminance was fixed at 450 lux or 120 lux. The artificial and natural targets mounted on the structure were set as regions of interest and used for feature point detection. Various feature detection algorithms were implemented for performance comparison. The feature point detection performance analysis found that the Shi-Tomasi corner and KAZE algorithms were robust to the target type, illuminance change, and increase in measurement distance. The displacement measurement accuracy using those two algorithms was also the highest. However, the displacement measurement accuracy with natural targets was lower than with artificial targets. This indicates a limitation in extracting feature points, because the resolution of the natural target decreases as the measurement distance increases.

Robust Speech Recognition Algorithm of Voice Activated Powered Wheelchair for Severely Disabled Person (중증 장애우용 음성구동 휠체어를 위한 강인한 음성인식 알고리즘)

  • Suk, Soo-Young;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea / v.26 no.6 / pp.250-258 / 2007
  • Current speech recognition technology has achieved high performance with the development of hardware devices; however, it is insufficient for some applications where high reliability is required, such as voice control of powered wheelchairs for disabled persons. For a system that aims to operate powered wheelchairs safely by voice in a real environment, non-voice sounds such as the user's coughing, breathing, and spark-like mechanical noise should be rejected, and the wheelchair system needs to recognize speech commands affected by disability, which have specific pronunciation speed and frequency. In this paper, we propose a non-voice rejection method that performs voice/non-voice classification using both YIN-based fundamental frequency (F0) extraction and reliability in preprocessing. We adopted a multi-template dictionary and acoustic-modeling-based speaker adaptation to cope with the pronunciation variation of inarticulately uttered speech. In recognition tests conducted with data collected in a real environment, the proposed YIN-based fundamental frequency extraction showed a recall-precision rate of 95.1%, better than the 62% of the cepstrum-based method. A recognition test of the new system with the multi-template dictionary and MAP adaptation also showed a much higher accuracy of 99.5%, compared with 78.6% for the baseline system.
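
A minimal sketch of YIN-style F0 extraction is given below to illustrate how a frame can be labeled as voiced (an F0 is found) or rejected (no clear periodicity). It implements only the difference function, the cumulative mean normalized difference, and absolute thresholding from the published YIN algorithm; the function name `yin_f0`, the search range, and the threshold are assumptions for the example, and the paper's reliability measure is not reproduced.

```python
import math


def yin_f0(frame, sample_rate, f_min=60.0, f_max=500.0, threshold=0.1):
    """Return an F0 estimate in Hz for one frame, or None if no dip is found."""
    tau_min = int(sample_rate / f_max)
    tau_max = int(sample_rate / f_min)
    window = len(frame) - tau_max
    if window <= 0:
        raise ValueError("frame too short for the requested f_min")

    # Difference function d(tau): energy of the frame minus its shifted copy.
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((frame[i] - frame[i + tau]) ** 2 for i in range(window))

    # Cumulative mean normalized difference d'(tau).
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Absolute threshold: take the first dip below the threshold and
    # follow it down to its local minimum.
    tau = tau_min
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau
        tau += 1
    return None  # treated here as non-voice


# Synthetic check: a 200 Hz tone versus a silent frame.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
silence = [0.0] * 400
tone_f0 = yin_f0(tone, sample_rate)
silence_f0 = yin_f0(silence, sample_rate)
```

A real rejection stage would presumably combine such an estimate with further evidence across frames; this sketch makes a per-frame decision only.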

A Vibration-based Fault Diagnostics Technique for the Planetary Gearbox of Wind Turbines Considering Characteristics of Vibration Modulation (풍력발전기 유성기어박스의 진동 변조 특성을 고려한 진동기반 고장 진단 기법 고찰)

  • Ha, Jong M.;Park, Jungho;Oh, Hyunsoek;Youn, Byeng D.
    • Transactions of the Korean Society of Mechanical Engineers A / v.39 no.7 / pp.665-671 / 2015
  • The performance of fault diagnostics for a planetary gearbox depends on vibration modulation characteristics, which can vary with manufacturing and assembly tolerances and with load conditions. In this paper, a fault diagnostics technique that considers vibration modulation characteristics is proposed for the effective fault detection of planetary gearboxes in wind turbines. To identify the vibration modulation characteristics in practice, re-sampled vibration signals are processed with narrow band-pass filters. Thereafter, the optimal position of the vibration extraction window is identified for effective detection of faulty signals under the varying vibration modulation characteristics. The proposed diagnostics technique makes it possible to perform robust diagnostics of the planetary gearbox with regard to the changeable vibration modulation effect. To demonstrate the proposed fault diagnostics technique, a 2-kW wind turbine (WT) testbed was designed with two DC motors and gearboxes. A faulty gear with partial tooth breakage was machined and assembled into the gearbox.

Comparison of recognition rate with distance on stereo face images base PCA (PCA기반의 스테레오 얼굴영상에서 거리에 따른 인식률 비교)

  • Park Chang-Han;Namkung Jae-Chan
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.1 / pp.9-16 / 2005
  • In this paper, we compare face recognition rates as the distance changes, using the Principal Component Analysis (PCA) algorithm with the left and right images of a stereo pair as input. The proposed method converts the image from RGB color space to YCbCr color space and detects the face region. Then, after the distance is acquired from the stereo image, the extracted face image is enlarged or reduced to obtain a robust face region, and the recognition rate is measured using the PCA algorithm. The average recognition results on the acquired face images gave face recognition rates of 98.61% (30cm), 98.91% (50cm), 99.05% (100cm), 99.90% (120cm), 97.31% (150cm), and 96.71% (200cm). The experiment therefore showed that the proposed method can obtain a high recognition rate if the image is scaled up or down according to distance.
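
The RGB to YCbCr step can be illustrated with the sketch below. The abstract does not say which YCbCr variant is used, so the full-range BT.601 coefficients (as used in JFIF/JPEG) are an assumption here, as is the function name `rgb_to_ycbcr`.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Gray pixels carry no chroma, so Cb and Cr sit at the midpoint 128.
white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
red = rgb_to_ycbcr(255, 0, 0)
```

Separating luma (Y) from chroma (Cb, Cr) is what makes this color space convenient for face region detection, since thresholds on the chroma channels are less sensitive to brightness than thresholds on RGB values.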

A Centroid-based Image Retrieval Scheme Using Centroid Situation Vector (Centroid 위치벡터를 이용한 영상 검색 기법)

  • 방상배;남재열;최재각
    • Journal of Broadcast Engineering / v.7 no.2 / pp.126-135 / 2002
  • An image contains various features such as color, shape, texture, and location information. When only one of those features is used to retrieve an image, it is difficult to achieve satisfactory retrieval efficiency, and this happens especially often in very large databases. Therefore, the efficiency of a content-based image retrieval (CBIR) system can be improved by using more features. This paper proposes a technique that considers location information about a specific color, as well as color information in the image, using the centroid situation vector. Centroid situation vectors are calculated for a specific color of the query image. Location similarity is then determined by comparing the distances between the extracted centroid situation vectors of the query image and those of the target image in the database. Simulation results show that the proposed method is robust for zoomed-in or zoomed-out images and improves discrimination ability for flipped or rotated images. In addition, the suggested method reduces computational complexity by overlapping information extraction and improves retrieval speed by using an efficient index file.
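
The notion of locating a specific color by its centroid can be sketched as follows. The abstract does not define the centroid situation vector, so this is only a rough illustration under stated assumptions: `color_centroid`, `location_distance`, and the color predicate are hypothetical names, and the centroid is expressed in coordinates normalized by image size so that a uniformly resized image yields the same value.

```python
import math


def color_centroid(image, is_target_color):
    """Return the normalized (x, y) centroid of matching pixels, or None.

    image           : list of rows, each a list of (r, g, b) tuples
    is_target_color : predicate selecting the color of interest (hypothetical)
    """
    height = len(image)
    width = len(image[0])
    sum_x = sum_y = 0.0
    count = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if is_target_color(pixel):
                sum_x += x + 0.5  # use pixel centers
                sum_y += y + 0.5
                count += 1
    if count == 0:
        return None
    return (sum_x / count / width, sum_y / count / height)


def location_distance(c1, c2):
    """Euclidean distance between two normalized centroids."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1])


RED, BLUE = (255, 0, 0), (0, 0, 255)


def is_red(pixel):
    return pixel == RED


# A 2x2 image and the same scene enlarged to 4x4.
small = [[RED, BLUE],
         [BLUE, BLUE]]
large = [[RED, RED, BLUE, BLUE],
         [RED, RED, BLUE, BLUE],
         [BLUE, BLUE, BLUE, BLUE],
         [BLUE, BLUE, BLUE, BLUE]]
small_centroid = color_centroid(small, is_red)
large_centroid = color_centroid(large, is_red)
```

In this toy case both images give the same normalized centroid, so `location_distance(small_centroid, large_centroid)` is zero even though the pixel dimensions differ.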

Motion Recognitions Based on Local Basis Images Using Independent Component Analysis (독립성분분석을 이용한 국부기저영상 기반 동작인식)

  • Cho, Yong-Hyun
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.5 / pp.617-623 / 2008
  • This paper presents a human motion recognition method using both centroid shift and local basis images. The centroid shift, based on the first-moment balance technique, is applied to obtain motion images that are robust against position or size changes. Extraction of local basis images based on independent component analysis (ICA) is also applied to find the set of statistically independent motion features contained in each motion. In particular, the fixed-point (FP) ICA algorithm based on Newton's method is used to extract the local basis images of motions quickly. The proposed method has been applied to the problem of recognizing 160 (1 person * 10 animals * 16 motions) sign language motion images of 240*215 pixels. Three distances (city-block, Euclidean, and negative angle) are used as measures when matching the probe images to the nearest gallery images. The experimental results show that the proposed method has superior recognition performance (speed, rate) compared with both the method using local eigen images and the method using local basis images without centroid shift.
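
The three matching measures named above can be sketched as below. City-block and Euclidean distance are standard; "negative angle" is interpreted here as the negated cosine of the angle between the two feature vectors, which is an assumption about the paper's definition. The helper `nearest_gallery_index` and the sample vectors are hypothetical.

```python
import math


def city_block(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_angle(a, b):
    """Negated cosine similarity; smaller means more similar (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return -dot / norm


def nearest_gallery_index(probe, gallery, distance):
    """Index of the gallery feature vector closest to the probe."""
    return min(range(len(gallery)), key=lambda i: distance(probe, gallery[i]))


# Hypothetical feature vectors, e.g. projections onto local basis images.
gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
probe = [0.1, 0.9, 0.2]
matches = [nearest_gallery_index(probe, gallery, d)
           for d in (city_block, euclidean, negative_angle)]
```

Because all three functions return smaller values for closer vectors, the same nearest-neighbor routine serves every measure.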