• Title/Summary/Keyword: Sound recognition

Search Results: 311

Computerization and Application of Hangeul Standard Pronunciation Rule (음성처리를 위한 표준 발음법의 전산화)

  • 이계영
    • Proceedings of the IEEK Conference
    • /
    • 2003.07d
    • /
    • pp.1363-1366
    • /
    • 2003
  • This paper introduces a computerized version of the Hangeul (Korean) Standard Pronunciation Rule that can be used in Korean-processing systems such as Korean speech synthesis and speech recognition systems. For this purpose, we build Petri net models for each item of the Standard Pronunciation Rule and then integrate them into a vocal sound conversion table. The reverse of the Hangeul Standard Pronunciation Rule regulates the way vocal sounds are matched to grammatically correct written characters. This paper presents not only the vocal sound conversion table but also a character conversion table obtained by reversing the vocal sound conversion table. Using these tables, we implemented a Hangeul character-to-vocal-sound conversion system and a Korean vocal-sound-to-character conversion system, and tested them with data sets covering all the items of the Standard Pronunciation Rule to verify the soundness and completeness of our tables. The test results show that the tables improve processing speed in addition to being sound and complete.

  • PDF
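The vocal sound conversion table above maps written letter sequences to their pronounced forms. As a toy illustration only (not the paper's Petri-net-derived table), two real Standard Pronunciation phenomena, nasalization and palatalization, can be written as pattern-replacement rules over romanized jamo strings; the romanized encoding and rule format here are hypothetical:

```python
# Toy sketch of a vocal sound conversion table: pronunciation rules as
# pattern -> replacement pairs over romanized jamo, with "." marking a
# syllable boundary. The two phenomena are real Standard Pronunciation
# rules; the romanized encoding itself is an illustrative assumption.
RULES = [
    ("k.m", "ng.m"),   # nasalization: final k before m -> ng (국물 -> [궁물])
    ("t.i", "ch.i"),   # palatalization: final t before i -> ch (같이 -> [가치])
]

def to_pronunciation(jamo: str) -> str:
    # Apply each rule everywhere it matches, in order.
    for pattern, replacement in RULES:
        jamo = jamo.replace(pattern, replacement)
    return jamo

print(to_pronunciation("kuk.mul"))  # kung.mul  (국물 -> [궁물])
print(to_pronunciation("kat.i"))    # kach.i    (같이 -> [가치])
```

Reversing such a table (pronunciation back to spelling) is ambiguous in general, which is why the paper builds and verifies a separate character conversion table.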

Stress Detection and Classification of Laying Hens by Sound Analysis

  • Lee, Jonguk;Noh, Byeongjoon;Jang, Suin;Park, Daihee;Chung, Yongwha;Chang, Hong-Hee
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.28 no.4
    • /
    • pp.592-598
    • /
    • 2015
  • Stress adversely affects the wellbeing of commercial chickens, and comes with an economic cost to the industry that cannot be ignored. In this paper, we first develop an inexpensive and non-invasive, automatic online-monitoring prototype that uses sound data to notify producers of a stressful situation in a commercial poultry facility. The proposed system is structured hierarchically with three binary-classifier support vector machines. First, it selects an optimal acoustic feature subset from the sound emitted by the laying hens. The detection and classification module detects the stress from changes in the sound and classifies it into subsidiary sound types, such as physical stress from changes in temperature, and mental stress from fear. Finally, an experimental evaluation was performed using real sound data from an audio-surveillance system. The accuracy in detecting stress approached 96.2%, and the classification model was validated, confirming that the average classification accuracy was 96.7%, and that its recall and precision measures were satisfactory.
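The hierarchical structure described above can be sketched as a staged router over binary classifiers. The stand-in threshold classifiers, feature layout, and class names below are hypothetical; in the paper each stage is an SVM trained on a selected acoustic feature subset:

```python
class HierarchicalStressClassifier:
    """Two-stage hierarchy of binary classifiers, following the paper's design:
    stage 1 detects stress vs. normal, stage 2 splits physical vs. mental."""

    def __init__(self, detect_stress, classify_type):
        # Each argument is any callable: feature vector -> bool.
        self.detect_stress = detect_stress
        self.classify_type = classify_type

    def predict(self, features):
        if not self.detect_stress(features):
            return "normal"
        return "physical" if self.classify_type(features) else "mental"

# Hypothetical stand-ins for trained SVMs (feature = [energy, pitch_var]):
model = HierarchicalStressClassifier(
    detect_stress=lambda f: f[0] > 0.5,   # loud calls indicate stress
    classify_type=lambda f: f[1] < 0.2,   # low pitch variance -> physical
)
print(model.predict([0.9, 0.1]))  # physical
print(model.predict([0.2, 0.4]))  # normal
```

The hierarchy lets each binary stage use its own feature subset, which is how the paper reaches separate detection (96.2 %) and classification (96.7 %) accuracies.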

Design of Sound Source Localization Sensor Based on the Hearing Structure in the Parasitoid Fly, Ormia Ochracea (파리의 청각 구조를 이용한 음원 방향 검지용 센서 설계)

  • Lee, Sang-Moon;Park, Young-Jin
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.18 no.2
    • /
    • pp.126-132
    • /
    • 2012
  • The estimation of sound source direction is an important technique in various engineering fields such as monitoring systems and military applications. As a new approach to sound source direction estimation, this paper proposes a biomimetic localization sensor based on a mechanically coupled structure inspired by the hearing organ of the parasitoid fly Ormia ochracea. This creature is known for its outstanding ability to localize sounds whose wavelength is large compared to its own body size. The ITTF (Inter-Tympanal Transfer Function), the transfer function between the displacements of the tympanal membranes on each side, contains all the inter-tympanal information that depends on sound direction. The peak and notch features of a desired ITTF can be generated by choosing appropriate mechanical properties. An example of sound source direction estimation using a generated ITTF with monotonically changing notch and peak patterns is presented.
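For scale, the conventional two-microphone baseline that the biomimetic design aims to outperform at small sensor sizes estimates direction from the time difference of arrival (TDOA) found by cross-correlation; this is not the paper's ITTF method, and the geometry and parameter values below are illustrative:

```python
import numpy as np

def tdoa_direction_deg(left, right, sr, mic_distance, c=343.0):
    """Direction of arrival from the inter-channel delay.
    Positive angle: source on the left-mic side (right channel lags)."""
    corr = np.correlate(left, right, mode="full")
    delay = (len(right) - 1) - np.argmax(corr)        # right's lag in samples
    s = np.clip(delay / sr * c / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(s))

# Broadband noise arriving 14 samples earlier at the left microphone:
rng = np.random.default_rng(0)
s = rng.standard_normal(4800)
left, right = s[14:], s[:-14]
print(round(tdoa_direction_deg(left, right, 48000, 0.2), 1))  # ~30.0
```

When the baseline shrinks toward the fly's sub-millimeter scale, the inter-channel delay falls below one sample, which is exactly the regime where the mechanically coupled Ormia-style sensor retains usable directional cues.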

Comparison of environmental sound classification performance of convolutional neural networks according to audio preprocessing methods (오디오 전처리 방법에 따른 콘벌루션 신경망의 환경음 분류 성능 비교)

  • Oh, Wongeun
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.3
    • /
    • pp.143-149
    • /
    • 2020
  • This paper presents the effect of the feature extraction methods used in audio preprocessing on the classification performance of Convolutional Neural Networks (CNNs). We extract the mel spectrogram, log mel spectrogram, Mel Frequency Cepstral Coefficients (MFCC), and delta MFCC from the UrbanSound8K dataset, which is widely used in environmental sound classification studies, and scale the data to three distributions. Using these data, we evaluate four CNN structures, VGG16, and MobileNetV2 according to the audio features and scaling. The highest recognition rate is achieved when the unscaled log mel spectrum is used as the audio feature. Although this result may not hold for all audio recognition problems, it is useful for classifying the environmental sounds in UrbanSound8K.
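As a sketch of the best-performing preprocessing step in this comparison, a log mel spectrogram can be computed in plain NumPy as below. The frame, hop, and mel-band sizes are illustrative, not the paper's settings, and production code would typically use a library such as librosa:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)
    return fb

def log_mel_spectrogram(y, sr=22050, n_fft=1024, hop=512, n_mels=40):
    # Power spectrogram via short-time FFT with a Hann window,
    # projected onto the mel filterbank, then log-compressed.
    window = np.hanning(n_fft)
    frames = [y[s:s + n_fft] * window
              for s in range(0, len(y) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)          # small epsilon avoids log(0)

# Example: one second of a 440 Hz tone
sr = 22050
t = np.arange(sr) / sr
S = log_mel_spectrogram(np.sin(2 * np.pi * 440 * t), sr=sr)
print(S.shape)   # (frames, n_mels)
```

The "unscaled" variant in the paper means feeding this log mel matrix to the CNN directly, without standardizing it to zero mean and unit variance.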

Speaker Separation Based on Directional Filter and Harmonic Filter (Directional Filter와 Harmonic Filter 기반 화자 분리)

  • Baek, Seung-Eun;Kim, Jin-Young;Na, Seung-You;Choi, Seung-Ho
    • Speech Sciences
    • /
    • v.12 no.3
    • /
    • pp.125-136
    • /
    • 2005
  • Automatic speech recognition is much more difficult in the real world. Recognition performance degrades with the SIR (Signal to Interference Ratio) in situations where environmental noise and multiple speakers are present. The extraction of the main speaker's voice from binaural sound is therefore a very important field in speech signal processing. In this paper, we use a directional filter and a harmonic filter, among existing methods, to extract the main speaker's information from binaural sound. The main speaker's voice is extracted using the directional filter, and the remaining speakers' information is removed by the harmonic filter after detecting the main speaker's pitch. As a result, the voice of the main speaker is enhanced.

  • PDF
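The harmonic-filter step can be sketched in the frequency domain: once an interfering speaker's pitch is known, narrow bands around each of its harmonics are attenuated. Here the interfering pitch is given directly (the paper derives it via pitch detection), and the bandwidth, sample rate, and tone frequencies are illustrative:

```python
import numpy as np

def suppress_harmonics(frame, sr, f0, width_hz=10.0):
    # Zero narrow bands around each harmonic of the interfering pitch f0.
    spec = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    for k in range(1, int(sr / 2 // f0) + 1):
        spec[np.abs(freqs - k * f0) < width_hz] = 0.0
    return np.fft.irfft(spec, n=len(frame))

sr = 16000
t = np.arange(sr) / sr                        # 1 s of audio
main = np.sin(2 * np.pi * 150 * t)            # main speaker: 150 Hz tone
interferer = sum(np.sin(2 * np.pi * 330 * k * t) for k in range(1, 4))
cleaned = suppress_harmonics(main + interferer, sr, f0=330.0)
print(np.mean((cleaned - main) ** 2) < 1e-3)  # True: interferer removed
```

Real voiced speech is not a fixed-pitch tone, so the paper applies this frame by frame, tracking the pitch over time.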

Recognition of Individual Cattle by His and/or Her Voice

  • Yoshio, Ikeda;Yohei, Ishii
    • Proceedings of the Korean Society for Agricultural Machinery Conference
    • /
    • 1998.06b
    • /
    • pp.270-275
    • /
    • 1998
  • It was assumed that the voice of cattle is generated by passing virtual white noise through a digital filter, the linear prediction filter, and the filter parameters (prediction coefficients) were estimated by the maximum entropy method (MEM) using the animal's sound signal. Feature planes were defined by pairs of parameters selected appropriately from these coefficients. The cattle voices were divided into three levels (high, medium, and low) according to their total power, which is equivalent to the variance of the sound signal. It was found that straight lines could be used to recognize two cows and one calf for high-level voices. For medium- and low-level voices, however, it was difficult or impossible to recognize individual cattle on the parameter planes.

  • PDF
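The linear-prediction model in this entry can be illustrated with the autocorrelation (Yule-Walker) method solved by the Levinson-Durbin recursion; the paper itself uses a maximum entropy (Burg-type) estimate, and the AR(2) "voice" below is a synthetic stand-in:

```python
import numpy as np

def lpc_coefficients(x, order):
    """Prediction-error filter A(z) = 1 + a1*z^-1 + ... via Levinson-Durbin."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                    # reflection coefficient
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                # residual prediction error power
    return a, err

# Synthetic AR(2) signal: x[n] = 0.5*x[n-1] - 0.3*x[n-2] + white noise
rng = np.random.default_rng(1)
e = rng.standard_normal(5000)
x = np.zeros_like(e)
for n in range(2, len(x)):
    x[n] = 0.5 * x[n - 1] - 0.3 * x[n - 2] + e[n]
a, _ = lpc_coefficients(x, order=2)
print(np.round(a, 2))  # approximately [1. -0.5  0.3]
```

Pairs of such coefficients are exactly the kind of "feature planes" the paper uses to separate individual animals.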

Intelligent Speech Recognition System based on Situation Awareness for u-Green City (u-Green City 구현을 위한 상황인지기반 지능형 음성인식 시스템)

  • Cho, Young-Im;Jang, Sung-Soon
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.15 no.12
    • /
    • pp.1203-1208
    • /
    • 2009
  • A Green-IT-based u-City is a u-City that incorporates the Green IT concept. Adopting situation awareness can reduce the processing required for Green IT. For example, recognizing all speech sounds from CCTV in a u-City environment takes considerable processing time and cost, whereas recognizing only emergency sounds greatly reduces that cost. To detect emergency states dynamically through CCTV, we therefore propose an advanced speech recognition system. For this purpose, we adopt the HMM (Hidden Markov Model) for feature extraction, and a Wiener filter technique to eliminate noise from the audio collected by CCTV in the u-City environment.
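The Wiener-filter step can be sketched per frame in the frequency domain: with a noise power spectrum estimated from noise-only audio, each frequency bin is scaled by the gain SNR/(SNR+1). The frame size and signal parameters below are illustrative, not from the paper:

```python
import numpy as np

def wiener_denoise(frame, noise_psd):
    # Scale each frequency bin by the Wiener gain SNR / (SNR + 1),
    # where SNR is estimated from the known noise power spectrum.
    spec = np.fft.rfft(frame)
    snr = np.maximum(np.abs(spec) ** 2 / noise_psd - 1.0, 0.0)
    gain = snr / (snr + 1.0)
    return np.fft.irfft(gain * spec, n=len(frame))

sr, n = 8000, 1024
rng = np.random.default_rng(0)
t = np.arange(n) / sr
clean = np.sin(2 * np.pi * 500 * t)           # stand-in "emergency" tone
noisy = clean + 0.5 * rng.standard_normal(n)

# Noise PSD averaged over noise-only frames (e.g. CCTV audio with no speech).
noise_frames = 0.5 * rng.standard_normal((50, n))
noise_psd = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)

denoised = wiener_denoise(noisy, noise_psd)
print(np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2))  # True
```

Cleaning the audio before HMM-based recognition is what keeps the emergency-sound detector usable on noisy street-level CCTV feeds.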

Condition Monitoring of Tool wear using Sound Pressure and Fuzzy Pattern Recognition in Turning Processes (선삭공정에서 음압과 퍼지 패턴 인식을 이용한 공구 마멸 감시)

  • 김지훈
    • Proceedings of the Korean Society of Machine Tool Engineers Conference
    • /
    • 1998.10a
    • /
    • pp.164-169
    • /
    • 1998
  • This paper deals with condition monitoring of tool wear during turning operations. To develop an economical sensing and identification method for turning processes, sound pressure measurement and digital signal processing techniques are proposed. A noise rejection methodology is proposed to identify the noise sources of tool wear and reject background noise. Features representing the condition of tool wear are obtained through analysis with an adaptive filter and the FFT in the time and frequency domains. Using fuzzy pattern recognition, we select the features that are sensitive to tool wear and make a decision on the wear state. The validity of the proposed system is confirmed through a large number of cutting tests under two cutting conditions.

  • PDF
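A minimal sketch of the fuzzy pattern recognition step, assuming hypothetical triangular membership functions over two sound-pressure features; the actual features, membership shapes, and rule base in the paper differ:

```python
import numpy as np

def tri(x, a, b, c):
    # Triangular membership function: 0 at a and c, peak 1 at b.
    return float(np.maximum(np.minimum((x - a) / (b - a),
                                       (c - x) / (c - b)), 0.0))

def worn_degree(rms_pressure, high_freq_ratio):
    # Hypothetical rule: "tool is worn" if the RMS sound pressure AND the
    # high-frequency energy ratio are both high (min acts as fuzzy AND).
    return min(tri(rms_pressure, 0.4, 0.8, 1.2),
               tri(high_freq_ratio, 0.3, 0.7, 1.1))

print(worn_degree(0.8, 0.7))   # 1.0 -> classify as worn
print(worn_degree(0.1, 0.1))   # 0.0 -> fresh tool
```

The fuzzy formulation gives a graded wear degree rather than a hard threshold, which suits the gradual nature of flank wear.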

Audio-Based Human-Robot Interaction Technology (오디오 기반 인간로봇 상호작용 기술)

  • Kwak, K.C.;Kim, H.J.;Bae, K.S.;Yoon, H.S.
    • Electronics and Telecommunications Trends
    • /
    • v.22 no.2 s.104
    • /
    • pp.31-37
    • /
    • 2007
  • Human-robot interaction (HRI) is a core technology of intelligent service robots that designs, implements, and evaluates robot systems and interaction environments so that robots can interact cognitively and emotionally with people through various communication channels such as robot cameras, microphones, and other sensors. This article reviews domestic and international trends in two audio-based HRI technologies, sound localization and speaker recognition, and covers the audio-visual sound localization and text-independent speaker recognition technologies currently being commercialized by the ETRI Intelligent Robot Research Division. It also examines scenarios that combine these technologies with speech recognition, face detection, and face recognition for effective use in home environments.

A Study on the Distance and Object Recognition Applying the Airborne Ultrasonic Sensor (공중 초음파 센서를 응용한 거리 형상인식에 관한 연구)

  • Han, E.K.;Park, I.G.
    • Journal of the Korean Society for Nondestructive Testing
    • /
    • v.10 no.1
    • /
    • pp.10-17
    • /
    • 1990
  • Recently, ultrasonic sensors for object recognition have come into use with the automation of industrial machinery. Points that characterize an object can be detected by measuring the propagation time of an ultrasonic impulse and the azimuth that gives its maximum amplitude; from these points the shape, position, and orientation of the object are deduced. A new measuring method is adopted in which the distance to the object is calculated from the sound reflection time, measured from the zero-crossing point of the sound wave, and the azimuth is measured as the angle indicating maximum amplitude. Measuring accuracies of 1.0 mm for distance and $0.5-2^{\circ}$ for azimuth have been achieved. By rotational scanning of the sensor, the characteristic points of an object can be found, giving information on its shape, position, and orientation. Experimental results showed that objects of somewhat complicated shape can be recognized, which suggests applicability to robots.

  • PDF
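The distance measurement described here reduces to time-of-flight arithmetic: the echo delay (measured from the zero-crossing point) times the speed of sound, halved for the round trip. The temperature correction below is the standard first-order approximation for air, not necessarily the paper's calibration:

```python
def distance_from_echo(t_flight_s: float, temperature_c: float = 20.0) -> float:
    # Speed of sound in air: approx. 331.3 + 0.606 * T (m/s) at T deg C.
    # The pulse travels to the object and back, so halve the path length.
    c = 331.3 + 0.606 * temperature_c
    return c * t_flight_s / 2.0

# A 10 ms round-trip echo at 20 deg C corresponds to about 1.72 m.
print(round(distance_from_echo(0.010), 3))  # 1.717
```

At 343 m/s, the reported 1.0 mm distance accuracy implies resolving the echo arrival to within roughly 6 microseconds, which is why the zero-crossing point is used rather than the amplitude envelope.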