• Title/Summary/Keyword: Sound recognition


Active Audition System based on 2-Dimensional Microphone Array (2차원 마이크로폰 배열에 의한 능동 청각 시스템)

  • Lee, Chang-Hun;Kim, Yong-Ho
    • Proceedings of the KIEE Conference / 2003.11b / pp.175-178 / 2003
  • This paper describes an active audition system for a robot-human interface in real environments. We propose a strategy for robust sound localization and distant-talking speech recognition (60-300 cm) based on a 2-dimensional microphone array. We consider spatial features, namely the relation between position and interaural time differences, and realize a speaker tracking system using a fuzzy inference process whose rules are generated from these spatial features.

  • PDF
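
The interaural time difference mentioned in the abstract above can be illustrated with a minimal sketch for a single microphone pair. This is not the paper's fuzzy inference method. It only shows the generic step of estimating a delay by cross-correlation and converting it to a far-field arrival angle. The function names, the 343 m/s speed of sound, and the example values are assumptions made for illustration.

```python
import math

def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive lag means `right` is a delayed copy of `left`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += x * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(lag, sample_rate, mic_distance_m, speed_of_sound=343.0):
    """Convert a delay in samples to a far-field arrival angle in degrees."""
    tau = lag / sample_rate
    # Clamp to [-1, 1] so noise cannot push asin out of its domain.
    s = max(-1.0, min(1.0, speed_of_sound * tau / mic_distance_m))
    return math.degrees(math.asin(s))

# Hypothetical example: the right channel lags the left by two samples.
left = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0]
lag = estimate_delay(left, right, max_lag=4)
angle = arrival_angle_deg(lag, sample_rate=16000, mic_distance_m=0.2)
```

A 2-dimensional array would combine such estimates from several microphone pairs. How the paper does that is not stated in the abstract.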

A Study on Consonant/Vowel/Unvoiced Consonant Phonetic Value Segmentation and Recognition of Korean Isolated Word Speech (한국어 고립 단어 음성의 자음/모음/유성자음 음가 분할 및 인식에 관한 연구)

  • Lee, Jun-Hwan;Lee, Sang-Beom
    • The Transactions of the Korea Information Processing Society / v.7 no.6 / pp.1964-1972 / 2000
  • Because of its own acoustic peculiarities, spoken Korean produces phonetic values that differ in form from phonemes. An extended recognition system for understanding Korean should therefore be built on a study of a Korean rule-based system before that system can be used as post-processing for Korean recognition. In this paper, a text-based Korean rule-based system featuring the vocal sound-change rules peculiar to Korean is constructed. Based on the text-based phonetic value results of this system, preliminary phonetic value segmentation border points with non-uniform blocks are extracted from Korean isolated word speech. By merging and recognizing the non-uniform blocks between the extracted border points, the possibility of recognizing Korean speech in the form of phonetic values is investigated.

  • PDF

Speaker Adapted Real-time Dialogue Speech Recognition Considering Korean Vocal Sound System (한국어 음운체계를 고려한 화자적응 실시간 단모음인식에 관한 연구)

  • Hwang, Seon-Min;Yun, Han-Kyung;Song, Bok-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.6 no.4 / pp.201-207 / 2013
  • Voice recognition techniques have been developed and actively applied to various information devices such as smartphones and car navigation systems. However, the basic research related to speech recognition is based on research results in English. Lip sync production generally requires tedious manual work by animators, which seriously affects the production cost and development period needed to obtain high-quality lip animation. In this research, a real-time automatic lip sync algorithm for virtual characters in digital content is studied by considering the Korean vocal sound system. The suggested algorithm contributes to producing natural lip animation at a lower production cost and with a shorter development period.

A Study on the Spoken Korean-Digit Recognition Using the Neural Network (神經網을 利用한 韓國語 數字音 認識에 관한 硏究)

  • Park, Hyun-Hwa;Gahang, Hae Dong;Bae, Keun Sung
    • The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.5-13 / 1992
  • Taking advantage of the property that each Korean digit is a monosyllabic word, we propose a spoken Korean-digit recognition scheme using a multi-layer perceptron. Each spoken Korean digit is divided into three segments (initial sound, medial vowel, and final consonant) based on the voice starting and ending points and a peak point in the middle of the vowel sound. Feature vectors such as cepstrum, reflection coefficients, $\Delta$cepstrum, and $\Delta$energy are extracted from each segment. Cepstrum, as an input vector to the neural network, is shown to give a higher recognition rate than reflection coefficients. Regression coefficients of the cepstrum did not affect the recognition rate as much as we expected; we believe this is because the features were extracted from selected stationary segments of the input speech signal. With 150 cepstral coefficients obtained from each spoken digit, we achieved a correct recognition rate of 97.8%.

  • PDF
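
The regression coefficients ($\Delta$cepstrum) mentioned in the abstract above are often computed with a short linear-regression window over neighboring frames. The sketch below shows one common formulation. It is not necessarily the one used in the paper. The window size and the edge-padding rule are assumptions.

```python
def delta_features(frames, window=2):
    """Regression-based delta coefficients for a list of feature frames.

    Each frame is a list of coefficients; the result has the same shape.
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                # Repeat the edge frames when the window runs past either end.
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas

# Hypothetical one-coefficient "cepstrum" that rises by 1 per frame.
ramp = [[float(t)] for t in range(6)]
ramp_deltas = delta_features(ramp)
```

For a stationary segment the frames barely change, so these deltas stay close to zero, which is consistent with the abstract's explanation of why they added little.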

Implementation of a DI Multi-Touch Display Using an Improved Touch-Points Detection and Gesture Recognition (개선된 터치점 검출과 제스쳐 인식에 의한 DI 멀티터치 디스플레이 구현)

  • Lee, Woo-Beom
    • Journal of the Institute of Convergence Signal Processing / v.11 no.1 / pp.13-18 / 2010
  • Most research in the multi-touch area is based on FTIR (Frustrated Total Internal Reflection), which is simply implemented using the previous approach. Moreover, there are no software solutions for improving performance in multi-touch blob detection or user gesture recognition. We therefore implement a multi-touch table-top display based on DI (Diffused Illumination), with improved touch-point detection and user gesture recognition. The proposed method supports simultaneous multi-touch transformation commands for objects in the running application. In addition, system latency is reduced by the proposed pre-testing method in multi-touch blob detection. The implemented device is simulated by programming a Flash AS3 application in the TUIO (Tangible User Interface Object) environment, which is based on the OSC (Open Sound Control) protocol. As a result, our system shows a 37% reduction in system latency and successfully recognizes multi-touch gestures.

A Study on Interfering Sounds for an Intersection Accident Sound Detection System (교차로 사고음 검지시스템의 방해음향 조사연구)

  • Kang, Hee-Koo;Go, Young-Gwon;Kim, Jae-Yee
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.805-808 / 2008
  • In this paper, various intersection acoustic patterns are analyzed to improve the detection rate of an accident sound detection system: the acoustic pattern of general traffic noise, the acoustic pattern of engine noise, and the acoustic patterns of factors that obstruct accident sound detection. There are remarkable differences between the acoustic patterns of traffic noise and accident sound, and these patterns must be considered when building an acoustic traffic accident detection system, because there is an error range of 20 dB depending on the traffic volume at the intersection.

  • PDF

Classification of General Sound with Non-negativity Constraints (비음수 제약을 통한 일반 소리 분류)

  • 조용춘;최승진;방승양
    • Journal of KIISE:Software and Applications / v.31 no.10 / pp.1412-1417 / 2004
  • Sparse coding or independent component analysis (ICA), which is a holistic representation, has been successfully applied to elucidate early auditory processing and to the task of sound classification. In contrast, parts-based representation is an alternative way of understanding object recognition in the brain. In this thesis we employ non-negative matrix factorization (NMF), which learns a parts-based representation, for the task of sound classification. Methods of feature extraction from spectro-temporal sounds using NMF, in the absence or presence of noise, are explained. Experimental results show that NMF-based features improve sound classification performance over ICA-based features.
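
As a rough illustration of the NMF step named in the abstract above, the sketch below factorizes a small non-negative matrix V into W and H with the widely used multiplicative update rules for the squared-error objective. Whether the paper uses these exact updates is not stated in the abstract. The matrix, rank, iteration count, and helper names are assumptions chosen for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative V (rows x cols) into W (rows x rank), H (rank x cols)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + eps for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(rows)]
    return w, h

# Hypothetical tiny "spectrogram": rows are frequency bins, columns are frames.
v = [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0], [2.0, 0.0, 4.0]]
w, h = nmf(v, rank=2)
```

In a sound classification setting, the columns of W would play the role of learned spectral parts and the columns of H the per-frame features passed to a classifier.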

Convolutional Neural Network Based Image Processing System

  • Kim, Hankil;Kim, Jinyoung;Jung, Hoekyung
    • Journal of information and communication convergence engineering / v.16 no.3 / pp.160-165 / 2018
  • This paper designs and develops an image processing system that integrates feature extraction and matching using a convolutional neural network (CNN), rather than processing feature extraction and matching separately as conventional image recognition systems do. To implement it, the proposed system runs a CNN and analyzes the performance of a conventional image processing system. The system extracts the features of an image using the CNN and then learns them with the neural network. The proposed system showed 84% recognition accuracy. Because it is a model that recognizes learned images by deep learning, it can run in batch and work easily on any platform (including embedded platforms) that can read all kinds of files at any time. It also does not require implementing separate feature extraction and matching algorithms, which saves time and makes it efficient. As a result, it can be widely used as an image recognition program.

A study on the voice command recognition at the motion control in the industrial robot (산업용 로보트의 동작제어 명령어의 인식에 관한 연구)

  • 이순요;권규식;김홍태
    • Journal of the Ergonomics Society of Korea / v.10 no.1 / pp.3-10 / 1991
  • The teach pendant and keyboard have been used as input devices for control commands in human-robot systems, but many problems occur when the user is a novice. A speech recognition system is therefore required for communication between a human and the robot. In this study, Korean voice commands, eight robot commands, and ten digits are described based on broad phonetic analysis. Applying broad phonetic analysis, the phonemes of the voice commands are divided into phoneme groups with similar features, such as plosive, fricative, affricate, nasal, and glide sounds. The feature parameters and their ranges for detecting the phoneme groups are then found by the minimax method. The classification rules consist of combinations of feature parameters, such as zero crossing rate (ZCR), log energy (LE), up and down (UD), and formant frequency, together with their ranges. Voice commands were recognized by the classification rules. The recognition rate was over 90 percent in this experiment, and the recognition rate for digits was better than that for robot commands.

  • PDF
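
Two of the feature parameters named in the abstract above, zero crossing rate and log energy, can be sketched per frame as follows. The thresholds and the three coarse labels are purely hypothetical placeholders. The paper derives its actual parameter ranges with the minimax method, and they are not given in the abstract.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def log_energy(frame, floor=1e-10):
    """Log10 of the mean squared amplitude, floored to avoid log(0)."""
    energy = sum(x * x for x in frame) / len(frame)
    return math.log10(max(energy, floor))

def classify_frame(frame, zcr_threshold=0.3, energy_threshold=-3.0):
    """Toy range-based rule; both thresholds are illustrative assumptions."""
    if log_energy(frame) < energy_threshold:
        return "silence"
    if zero_crossing_rate(frame) > zcr_threshold:
        return "fricative-like"
    return "voiced-like"

# Hypothetical frames: rapid sign changes, a steady positive level, and silence.
noisy = [0.5, -0.5] * 4
steady = [0.5] * 8
quiet = [0.0] * 8
labels = [classify_frame(f) for f in (noisy, steady, quiet)]
```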

The Study for Advancing the Performance of Speaker Verification Algorithm Using Individual Voice Information (개별 음향 정보를 이용한 화자 확인 알고리즘 성능향상 연구)

  • Lee, Je-Young;Kang, Sun-Mee
    • Speech Sciences / v.9 no.4 / pp.253-263 / 2002
  • In this paper, we propose a new speaker recognition algorithm that identifies the speaker using information obtained by intensive analysis of speech features such as pitch, intensity, duration, and formant, which are crucial parameters of an individual voice, for candidates with a high rate of wrong recognition in the existing speaker recognition algorithm. DTW (Dynamic Time Warping) is used to test the discriminative power of each individual parameter. We newly set the threshold range that affects discriminative power in verification so that candidates falling in the new range are finally discriminated in the next stage of sound parameter analysis. In a speaker verification test using a voice DB consisting of secret words spoken by 25 males and 25 females, recorded at 8 kHz and 16 bit, the proposed algorithm shows about a 1% performance improvement over the existing algorithm.

  • PDF
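
The DTW comparison mentioned in the abstract above can be sketched with the textbook dynamic-programming recurrence on one-dimensional sequences, for example a pitch contour. The absolute-difference local cost, the `accept` helper, and the threshold value are assumptions for illustration. The paper's actual features, cost function, and threshold ranges are not given in the abstract.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats against b
                                 cost[i][j - 1],      # b[j-1] repeats against a
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

def accept(template, utterance, threshold):
    """Hypothetical verification rule: accept when the warped distance is small."""
    return dtw_distance(template, utterance) <= threshold

# Hypothetical pitch contours: the second is a time-stretched copy of the first.
template = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]
```

Because DTW absorbs differences in speaking rate, the stretched contour matches the template exactly, while the shifted contour keeps a nonzero distance.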