• Title/Summary/Keyword: Sound Recognition

Improvement of Environment Recognition using Multimodal Signal (멀티 신호를 이용한 환경 인식 성능 개선)

  • Park, Jun-Qyu;Baek, Seong-Joon
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.27-33 / 2010
  • In this study, we conducted classification experiments with a GMM (Gaussian Mixture Model), combining features extracted from a microphone, a gyro sensor, and an acceleration sensor in 9 different environment types. Existing context-awareness studies tried to recognize the environmental situation mainly from environmental sound captured with a microphone, but recognition was limited by the structural characteristics of environmental sound, which is composed of a combination of various noises. Hence we proposed additionally applying gyro sensor and acceleration sensor data in order to reflect the movement features of the recognition agent. According to the experimental results, combining acceleration sensor data with the existing environmental sound features improves recognition performance by more than 5% compared with existing methods that use only environmental sound features from the microphone.
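
The classification step described in this abstract can be illustrated with a minimal sketch. It assumes that "combining" means concatenating sound and acceleration features into one vector, and that each environment class is scored by a diagonal-covariance GMM. The abstract confirms neither detail, and the function names, class names, and all model parameters below are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, components):
    """Log likelihood of x under a GMM given as (weight, mean, var) tuples."""
    logs = [math.log(w) + log_gaussian_diag(x, mu, var) for w, mu, var in components]
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(value - peak) for value in logs))

def classify(sound_features, accel_features, class_models):
    """Concatenate both feature sets (assumed fusion) and pick the best-scoring class."""
    x = list(sound_features) + list(accel_features)
    return max(class_models, key=lambda name: gmm_log_likelihood(x, class_models[name]))

# Hypothetical, hand-picked models: two sound features plus one acceleration feature.
MODELS = {
    "street": [
        (0.5, [0.8, 0.6, 1.5], [0.1, 0.1, 0.2]),
        (0.5, [1.0, 0.4, 2.0], [0.1, 0.1, 0.2]),
    ],
    "library": [
        (1.0, [0.1, 0.1, 0.0], [0.05, 0.05, 0.05]),
    ],
}

label = classify([0.9, 0.5], [1.8], MODELS)
```

In practice the mixture parameters would be trained per environment (for example with EM) rather than written by hand.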

Improved Melody Recognition Performance of a Cochlear Implant Speech Processing Strategy Using Instantaneous Frequency Encoding Based on Teager Energy Operator

  • Choi, Sung-Jin;Ryu, Sang-Baek;Kim, Kyung-Hwan
    • Journal of Biomedical Engineering Research / v.31 no.6 / pp.417-426 / 2010
  • We present a speech processing strategy incorporating instantaneous frequency (IF) encoding to enhance the melody recognition performance of cochlear implants. To extract the IF from incoming sound, we propose the use of a Teager energy operator (TEO), which is advantageous for its lower computational load. From time-frequency analysis, we verified that the TEO-based method provides proper IF encoding of input sound, which is crucial for melody recognition. A similar benefit could also be obtained from a Hilbert transform (HT), but at a much higher computational cost. The melody recognition performance of the proposed strategy was compared with that of a conventional strategy using envelope extraction and that of HT-based IF encoding. Hearing tests on normal subjects were performed using acoustic simulation and a musical contour identification task. No significant difference in melody recognition performance was observed between the TEO-based and HT-based IF encodings, and both were superior to the conventional strategy. However, the TEO-based strategy had the advantage of being approximately 35% faster than the HT-based strategy.
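
As a rough illustration of why a TEO-based estimate is cheap, the sketch below computes the discrete Teager energy, psi[n] = x[n]^2 - x[n-1]*x[n+1], and derives a frequency estimate from a ratio of two such energies. The ratio step follows the well-known DESA-1a energy separation idea and is an assumption here: the abstract does not state how the authors obtain the IF from the TEO, and the function names are hypothetical.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def instantaneous_frequency(x):
    """Estimate frequency in rad/sample from Teager energies (DESA-1a style, assumed)."""
    y = [x[n] - x[n - 1] for n in range(1, len(x))]  # backward difference of x
    psi_x = teager_energy(x)
    psi_y = teager_energy(y)
    freqs = []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # psi_y[k] and psi_x[k + 1] both sit at sample k + 2 of x
        if px <= 0.0:
            freqs.append(float("nan"))  # energy ratio undefined at this sample
            continue
        # For a sinusoid, psi_y / psi_x = 2 * (1 - cos(omega)).
        cos_omega = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freqs.append(math.acos(cos_omega))
    return freqs

# A pure tone at 0.3 rad/sample with amplitude 2.
tone = [2.0 * math.cos(0.3 * n + 0.5) for n in range(50)]
estimates = instantaneous_frequency(tone)
```

Each output sample needs only a few multiplications and one arccosine, with no FFT-based Hilbert transform, which is consistent with the lower computational load the abstract reports.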

Implementation of Speech Recognition Filtering at Emergency (응급상황에서의 음성인식을 위한 필터기 구현)

  • Cho, Young-Im;Jang, Sung-Soon
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.208-213 / 2010
  • Generally, the main adverse factor for speech recognition is background noise, which reduces recognition performance. For this reason, the place where recognition is performed is very important. To improve recognition performance on noisy sound, we implemented a noise-removing Wiener filter at the signal processing step, which adopts an FIR filter. The FIR filter passes the portion of the speech signal that lies within the frequency range of human speech. As a result, the recognition system can distinguish between noise and speech in the incoming signal.

A Study on a Method of U/V Decision by Using The LSP Parameter in The Speech Signal (LSP 파라미터를 이용한 음성신호의 성분분리에 관한 연구)

  • 이희원;나덕수;정찬중;배명진
    • Proceedings of the IEEK Conference / 1999.06a / pp.1107-1110 / 1999
  • In speech signal processing, an accurate voiced/unvoiced decision is important for robust word recognition and analysis and for high coding efficiency. In this paper, we propose a method for the voiced/unvoiced decision using the LSP parameters, which represent the spectral characteristics of the speech signal. Voiced sound has more LSP parameters in the low-frequency region, whereas unvoiced sound has more in the high-frequency region. That is, the LSP parameter distribution of voiced sound differs from that of unvoiced sound. In addition, for voiced sound the minimum interval between sequential LSP parameters falls in the low-frequency region, while for unvoiced sound it falls in the high-frequency region. We decide between voiced and unvoiced sound using these characteristics. We applied the proposed method to continuous speech and achieved good performance.

A Recognition Algorithm of Hangeul Alphabet Using 2-D Digital filtering (2차원 디지털 필터링에 의한 한글 자모의 인식 알고리즘)

  • O, Gil-Nam;Sin, Seong-Ho;Jin, Yong-Ok
    • Journal of the Korean Institute of Telematics and Electronics / v.21 no.3 / pp.55-59 / 1984
  • This paper describes a method of Hangeul recognition using 2-D digital filtering. The 1,659 characters were classified into 170 patterns by the positions of the initial sound (consonant), middle sound (vowel), and terminal sound (consonant), and a model was obtained for each pattern using 2-D digital filtering. Based on these models, we propose an algorithm that recognizes Korean combinational characters by separating the patterns from them using the superposition principle. In simulation, a recognition rate of 100% was obtained for printed letters.

Development of a Multiple Monitoring System for Intelligence of a Machine Tool -Application to Drilling Process- (공작기계 지능화를 위한 다중 감시 시스템의 개발-드릴가공에의 적용-)

  • Kim, H.Y.;Ahn, J.H.
    • Journal of the Korean Society for Precision Engineering / v.10 no.4 / pp.142-151 / 1993
  • An intelligent multiple monitoring system that monitors tool/machining states comprehensively was proposed and developed. It consists of two fundamental subsystems: the multiple sensor detection unit and the intelligent integrated diagnosis unit. Three signals, namely spindle motor current, Z-axis motor current, and machining sound, were adopted to detect tool/machining states more reliably. Based on the multiple sensor information, the diagnosis unit judges either tool breakage or the degree of tool wear using fuzzy reasoning. Tool breakage is diagnosed from the level of the spindle and Z-axis motor currents. Tool wear is diagnosed from both the result of fuzzy pattern recognition on the motor currents and the result of pattern matching on the machining sound. The fuzzy c-means algorithm was used for fuzzy pattern recognition. Experiments on drilling operations in a machining center showed that the developed system monitors abnormal drill/drilling states very reliably.
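
For readers unfamiliar with the fuzzy c-means algorithm named in this abstract, here is a minimal one-dimensional sketch of the standard update rules. It is not the authors' implementation: the evenly spaced initial centers, the fuzzifier m = 2, the fixed iteration count, and the sample "motor current" values are all assumptions chosen for illustration, and it expects at least two clusters.

```python
def memberships_for(points, centers, m):
    """Membership of each point in each cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # Point coincides with one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((di / dj) ** power for dj in dists) for di in dists])
    return rows

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=50):
    """Standard fuzzy c-means on 1-D data with deterministic initial centers."""
    lo, hi = min(points), max(points)
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    for _ in range(n_iter):
        u = memberships_for(points, centers, m)
        # Each center is the mean of the points weighted by membership^m.
        centers = [
            sum(u[k][i] ** m * x for k, x in enumerate(points))
            / sum(u[k][i] ** m for k in range(len(points)))
            for i in range(n_clusters)
        ]
    return centers, memberships_for(points, centers, m)

# Hypothetical motor current levels (arbitrary units): normal vs. worn tool.
currents = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
centers, memberships = fuzzy_c_means(currents)
```

Unlike hard clustering, every sample keeps a graded membership in each cluster, which is the kind of output a fuzzy reasoning stage can consume.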

A Study on Internet Advertising Methods for Increased Brand Recognition -Focused on the relationships between design factors of rich media- (브랜드 인지도 향상을 위한 인터넷 광고디자인 기법 연구 - 리치미디어 광고 요소간의 상관성을 중심으로 -)

  • 양필은;이혜선
    • Archives of design research / v.16 no.4 / pp.47-58 / 2003
  • The introduction of rich media is influencing the strategy, planning, and design of internet advertising. Although past studies have suggested that using rich media in internet advertising is effective in building higher brand recognition, there has been little research on the specific design factors of rich media advertising. Therefore, the purpose of this study is to analyze the effectiveness of design factors for brand recognition. The results suggest that illustrated images were more effective than photo images for higher brand recognition. Also, sound with narration resulted in higher brand recognition than sound alone.

An embodiment of a mouse pointing system using a 3-axis accelerometer and sound-recognition module (3축 가속도센서 및 음성인식 모듈을 이용한 마우스 포인팅 시스템의 구현)

  • Lee, Seung-Joon;Shin, Dong-Hwan;Kasno, Mohamad Afif B.;Kim, Joo-Woong;Park, Jin-Woo;Eom, Ki-Hwan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.934-937 / 2010
  • In this paper, we implemented a mouse pointing system that helps people with disabilities and people unfamiliar with electronics use electronic devices easily. The new mouse pointing system combines speech recognition and a 3-axis acceleration sensor with a headset. We used a speaker-dependent module, which generates a BCD code on recognizing a human voice, because it has a higher recognition rate than a speaker-independent system. The headset mouse system consists of a 3-axis accelerometer, a sound recognition module, and a TMS320F2812 processor. The main controller, the TMS320F2812 DSP, communicates with the main computer over SCI. On the PC, the system is operated through Visual Basic.

Adaptive Post Processing of Nonlinear Amplified Sound Signal

  • Lee, Jae-Kyu;Choi, Jong-Suk;Seok, Cheong-Gyu;Kim, Mun-Sang
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2005.06a / pp.872-876 / 2005
  • We propose real-time post-processing of a nonlinearly amplified signal to improve voice recognition in remote talk. In previous research, we found that nonlinear amplification has a unique advantage for both voice activity detection and sound localization in remote talk. However, the original signal becomes distorted by the nonlinear amplification and, as a result, subsequent stages such as speech recognition show less satisfactory results. To remedy this problem, we implement a linearization algorithm that recovers the voice signal's linear characteristics after localization has been done.

Modification of Pitch Algorithm and Its Application to Noise (피치 알고리즘 수정 및 소음에의 적용)

  • Shin, Sung-Hwan;Ih, Jeong-Guon
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2002.11a / pp.354.1-354 / 2002
  • Pitch is a perception related to frequency, one of the psychological attributes of tones, and, together with loudness and timbre, an important factor in determining sound quality. While pitch has been actively studied in speech recognition and speech separation, it has not yet been sufficiently studied for the analysis and improvement of product sound quality. (omitted)
