• Title/Summary/Keyword: Sound signal

A Blind Audio Watermarking using the Tonal Characteristic (토널 특성을 이용한 브라인드 오디오 워터마킹)

  • 이희숙;이우선
    • Journal of Korea Multimedia Society
    • /
    • v.6 no.5
    • /
    • pp.816-823
    • /
    • 2003
  • In this paper, we propose a blind audio watermarking method that uses the tonal characteristic. First, we explain the perceptual effect of tonal components as reported in existing research and show experimentally that the tonal characteristic is more stable under several signal processing operations than other characteristics used in previous watermarking studies. Based on this result, we propose a blind audio watermarking method that uses the relation among the frequency-domain signals that compose a tonal masker. To evaluate the sound quality of the watermarked audio, we used the SDG (Subjective Diff-Grade) and obtained an average SDG of 0.27. This result indicates that watermarking based on the perceptual effect of tonal components is viable from the viewpoint of imperceptibility. We also detected the watermark bits from watermarked audio that had been altered by several signal processing operations; the detection ratios, with the exception of time-shift processing, were over 98%. For time-shift processing, we applied a new method that searches for the most suitable position in the time domain and then detected the watermark bits at a ratio of 90%.

Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

  • Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.9 no.2
    • /
    • pp.121-128
    • /
    • 2008
  • Accurate voice activity detection (VAD) has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose VAD methods that can adapt to various car noise situations during driving. Existing VAD methods use features such as time-domain energy, frequency-domain energy, zero-crossing rate, and spectral entropy, whose performance declines rapidly in noisy environments. Building on the existing spectral entropy approach, we propose VAD methods using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is the FFT weighted by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. In experiments, the proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy. Compared to the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.
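
The plain spectral entropy feature that this abstract builds on can be sketched as follows. This is a minimal illustration and not the authors' MFB or gradient variants: the function names `power_spectrum` and `spectral_entropy`, the naive DFT, and the two test frames are assumptions made for the example. A frame whose energy is concentrated in a few bins gives low entropy, while a flat spectrum gives high entropy.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(power):
    """Shannon entropy (in bits) of the power spectrum normalised to sum to 1."""
    total = sum(power)
    if total == 0:
        return 0.0  # silent frame: define the entropy as zero
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical frames: a pure tone (peaked spectrum) and a unit impulse
# (perfectly flat spectrum, used here as a stand-in for broadband noise).
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63

tone_entropy = spectral_entropy(power_spectrum(tone))
flat_entropy = spectral_entropy(power_spectrum(impulse))
```

A detector built on this feature would compare the per-frame entropy against a threshold; how that threshold is chosen or adapted is not stated in the abstract.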

Feature Vector Decision Method of Various Fault Signals for Neural-network-based Fault Diagnosis System (신경회로망 기반 고장 진단 시스템을 위한 고장 신호별 특징 벡터 결정 방법)

  • Han, Hyung-Seob;Cho, Sang-Jin;Chong, Ui-Pil
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.20 no.11
    • /
    • pp.1009-1017
    • /
    • 2010
  • Because rotating machines play an important role in the aeronautical, naval, and automotive industries, many researchers have developed condition monitoring and fault diagnosis systems by applying techniques such as signal processing and pattern recognition. Recently, fault diagnosis systems using artificial neural networks have been proposed. For effective fault diagnosis, this paper uses an MLP (multi-layer perceptron) network, which is widely used in pattern classification. Since feeding the acquired signals into the neural network without preprocessing can degrade fault classification performance, it is very important to extract significant features from the captured signals and to apply features suited to each kind of signal in the diagnosis system. Therefore, this paper proposes a method for deciding the proper feature vector for each fault signal in a neural-network-based fault diagnosis system. We applied LPC coefficients, the maximum magnitude of each spectral section of the FFT, and the RMS (root mean square) and variance of wavelet coefficients as feature vectors, and selected the appropriate feature vectors by comparing the fault diagnosis error ratios for sound, vibration, and current fault signals. In the experiments, the LPC coefficients and the maximum magnitudes of each spectral section showed 100% diagnosis ratios for each fault, and the method using wavelet coefficients was robust to noise.
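
Two of the feature types named in this abstract can be sketched in a few lines. This is only an illustration under assumptions: the helper names are hypothetical, the sections are taken as equal-width slices of a magnitude spectrum, the variance is the population variance, and the abstract does not say how many sections or which wavelet the authors used.

```python
import math


def section_maxima(magnitudes, n_sections):
    """Maximum magnitude in each of n_sections contiguous, near-equal slices.

    Assumes len(magnitudes) >= n_sections so that no slice is empty.
    """
    n = len(magnitudes)
    return [max(magnitudes[i * n // n_sections:(i + 1) * n // n_sections])
            for i in range(n_sections)]


def rms(values):
    """Root mean square of a sequence of coefficients."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def variance(values):
    """Population variance of a sequence of coefficients."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Toy inputs standing in for an FFT magnitude spectrum and wavelet coefficients.
features = section_maxima([1, 5, 2, 7, 3, 4], 3)   # [5, 7, 4]
coeff_rms = rms([3.0, 4.0])
coeff_var = variance([1.0, 2.0, 3.0, 4.0])          # 1.25
```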

Feasibility Study on Audio-Tactile Display via Spectral Modulation (스펙트럼 변조를 이용한 청각정보의 촉감재현 가능성 연구)

  • Kwak, Hyun-Koo;Kim, Whee-Kuk;Chung, Ju-No;Kang, Dae-Im;Park, Yon-Kyu;Koo, Min-Mo
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.28 no.5
    • /
    • pp.638-647
    • /
    • 2011
  • Various approaches that directly use the vibrations of speakers have been suggested to convey aural information such as music to the hearing-impaired or the deaf. However, with these approaches a person cannot sense frequency information above the maximum perceivable vibro-tactile frequency (around 1 kHz). Therefore, this study proposes an approach based on spectral modulation that compresses the high-frequency audio information into the perceivable vibro-tactile frequency range and outputs the modulated signals through designated speakers. Simulations using the Short-Time Fourier Transform (STFT) with Hanning windows, and preliminary experiments on a vibro-tactile display testbed built and interfaced with a notebook PC, show that the modulated signal of a natural sound composed of the sounds of a frog, a bird, and a water stream yields a noise-free signal suitable for vibro-tactile speakers without causing significant interfering disturbances. Lastly, for three different combinations of information provided to the subject, that is, i) video image only, ii) video image along with the modulated vibro-tactile stimuli proposed in this study applied to the subject's forearm, and iii) video image along with full audio information, the effects on the subject's sense of reality and emotion in response to audio-video clips containing various sounds and images are investigated and compared. The results of these experiments show that the proposed method of providing modulated vibro-tactile stimuli along with video images is highly feasible as a way to transmit a pseudo-aural sense to a person.

Development of Context Awareness and Service Reasoning Technique for Handicapped People (멀티 모달 감정인식 시스템 기반 상황인식 서비스 추론 기술 개발)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.1
    • /
    • pp.34-39
    • /
    • 2009
  • As a subjective cognitive effect, human emotion is impulsive in character and unconsciously expresses intentions and needs. Emotions therefore carry context information about the users of ubiquitous computing environments or intelligent robot systems. Indicators from which a user's emotion can be recognized include facial images, voice signals, and biological signal spectra. In this paper, we produce separate facial and voice emotion recognition results from facial images and voice in order to increase the convenience and efficiency of emotion recognition. We also extract the best-fitting image-based and sound-based features to raise the emotion recognition rate, and implement a multi-modal emotion recognition system based on feature fusion. Finally, using the emotion recognition results, we propose a possible service reasoning method for ubiquitous computing environments based on a Bayesian network and a ubiquitous context scenario.

Classification of Whale Sounds using LPC and Neural Networks (신경망과 LPC 계수를 이용한 고래 소리의 분류)

  • An, Woo-Jin;Lee, Eung-Jae;Kim, Nam-Gyu;Chong, Ui-Pil
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.18 no.2
    • /
    • pp.43-48
    • /
    • 2017
  • Underwater transient signals are complex, time-varying, nonlinear, and short in duration, so it is very hard to model them with reference patterns. In this paper, we divide each signal into short, constant-length, overlapping frames. The 20th-order LPC (Linear Predictive Coding) coefficients are extracted from the original signals using the Durbin algorithm and applied to a neural network. The neural network, which has two hidden layers, was trained on 65% of the signals and tested on the remaining 35%. The whale types used for sound classification are the Blue whale, Dulsae whale, Gray whale, Humpback whale, Minke whale, and Northern Right whale. Finally, we obtained a classification rate of more than 83% on the test signals.
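
The framing and LPC extraction steps described in this abstract can be sketched as below. This is a minimal illustration using the standard autocorrelation method with the Levinson-Durbin recursion; the function names, the frame length and hop, and the sign convention (x[t] ≈ Σ a[k]·x[t−k]) are assumptions, since the abstract gives only the LPC order and the name of the algorithm.

```python
def split_frames(signal, frame_len, hop):
    """Split a signal into constant-length frames that overlap when hop < frame_len."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a[1..order], prediction_error) for x[t] ~ sum_k a[k] * x[t-k]."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error == 0:
            break  # degenerate frame, nothing left to predict
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1 - k * k
    return a[1:], error


# Toy check: the autocorrelation of a first-order process with coefficient 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)
frames = split_frames(list(range(10)), 4, 2)
```

For the setting in the abstract, one would call `levinson_durbin(autocorrelation(frame, 20), 20)` on each frame and pass the 20 coefficients to the classifier.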

Design and Fabrication of an Implantable Microphone for Reduction of Skin Damping Effect through FEA Simulation (피부에 의한 이득 감쇠를 줄이기 위한 FEA 시뮬레이션 기반의 이식형 마이크로폰 설계 및 구현)

  • Han, Ji-Hun;Kim, Min-Woo;Kim, Dong-Wook;Seong, Ki-Woong;Cho, Sung-Mok;Park, Il-Yong;Cho, Jin-Ho
    • Journal of Biomedical Engineering Research
    • /
    • v.29 no.1
    • /
    • pp.59-65
    • /
    • 2008
  • Implantable hearing aids have been developed to solve the problems of conventional hearing aids. A fully implantable hearing aid requires an implantable microphone to receive the sound signal beneath the skin. Normally, an implantable microphone has a poor frequency response in the high-frequency bands of the acoustic signal because the skin attenuates high frequencies after implantation in the human body. In this paper, an implantable microphone is designed through finite element analysis (FEA) simulation to reduce the high-frequency attenuation effect of the skin by placing its resonance frequency in the attenuated range. The microphone designed from the simulation results was fabricated using bio-compatible materials. Several in-vitro experiments with pig skin verified that the designed implantable microphone has a resonance frequency near the start of the attenuated range and reduces the attenuation effect.

Implementation of Virtual Violin with a Kinect (키넥트를 이용한 가상 바이올린 구현)

  • Shin, Young-Kyu;Kang, Dong-Gil;Lee, Jung-Chul
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.15 no.3
    • /
    • pp.85-90
    • /
    • 2014
  • In this paper, we propose a virtual violin implementation that detects bowing and the finger drop position from fingertip and fingerboard information estimated from the 3D image data of a Kinect. The violin fingerboard pattern and depth information are extracted from the color image and the depth image to detect a touch event on the fingerboard and to identify the touched position. The activated musical note is then decided from the finger drop position and the bowing information. Our virtual violin uses PC MIDI to output synthesized violin sound. The experimental results show that the proposed method can detect the finger drop position and bowing with high accuracy. The virtual violin can serve as an easy and convenient interface for beginners learning to play the violin with PC-based learning software.

Bird sounds classification by combining PNCC and robust Mel-log filter bank features (PNCC와 robust Mel-log filter bank 특징을 결합한 조류 울음소리 분류)

  • Badi, Alzahra;Ko, Kyungdeuk;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.38 no.1
    • /
    • pp.39-46
    • /
    • 2019
  • In this paper, combining features is proposed as a way to enhance the classification accuracy of sounds under noisy environments using the CNN (Convolutional Neural Network) structure. A robust log Mel-filter bank using Wiener filter and PNCCs (Power Normalized Cepstral Coefficients) are extracted to form a 2-dimensional feature that is used as input to the CNN structure. An ebird database is used to classify 43 types of bird species in their natural environment. To evaluate the performance of the combined features under noisy environments, the database is augmented with 3 types of noise under 4 different SNRs (Signal to Noise Ratios) (20 dB, 10 dB, 5 dB, 0 dB). The combined feature is compared to the log Mel-filter bank with and without incorporating the Wiener filter and the PNCCs. The combined feature is shown to outperform the other mentioned features under clean environments with a 1.34 % increase in overall average accuracy. Additionally, the accuracy under noisy environments at the 4 SNR levels is increased by 1.06 % and 0.65 % for shop and schoolyard noise backgrounds, respectively.
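
The noise augmentation at fixed SNRs mentioned in this abstract can be sketched as follows. This is a generic mixing recipe, not the authors' pipeline: the function names and toy samples are hypothetical, and the signal and noise are assumed to be equal-length lists with non-zero power.

```python
import math


def mean_power(samples):
    """Average power of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so the mixture has the requested SNR; return (mixture, gain)."""
    gain = math.sqrt(mean_power(signal) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)], gain


def measured_snr_db(signal, scaled_noise):
    """SNR in dB between a signal and the (already scaled) noise added to it."""
    return 10 * math.log10(mean_power(signal) / mean_power(scaled_noise))


# Toy clip and noise, mixed at the four SNR levels listed in the abstract.
clip = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
gains = {snr: mix_at_snr(clip, noise, snr)[1] for snr in (20, 10, 5, 0)}
```

Lower target SNRs give larger noise gains, so the 0 dB mixture carries noise with the same average power as the clip.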

Investigation of the Acoustic Performance of Lower Grade Elementary School Classrooms (초등학교 저학년 교실의 실내음향성능 실태조사)

  • Jo, A-Hyeon;Park, Chan-Jae;Haan, Chan-Hoon
    • Journal of the Korean Institute of Educational Facilities
    • /
    • v.28 no.3
    • /
    • pp.3-14
    • /
    • 2021
  • Because teachers' speech is transmitted to students in classrooms, an appropriate aural environment should be provided for academic purposes. Much research has been undertaken on classroom acoustics, and acoustic standards for domestic classrooms have been suggested based on the reverberation time and background noise level. However, these standards are suited to middle and high schools and do not consider auditory ability by age. As preliminary research, the present study was begun in order to suggest an acoustic standard for lower grade elementary school classrooms, whose pupils are children under age 9 with auditory ability that is not yet fully developed. To this end, the acoustic performance of lower grade classrooms was measured and compared with that of general classrooms. The change in acoustic parameters depending on the desk layout was also measured and analyzed. The measured acoustic parameters were background noise, signal-to-noise ratio, RT, STI, D50, and IACC. As a result, it was found that background noise exceeds the standard of 35 dB(A) at schools located along roadsides. It was also shown that most acoustic parameters are higher in recently built classrooms than in older classrooms. Generally, there is not much difference in acoustic parameters among the various desk layouts, but better acoustic performance is obtained on the center line and at the seats near the sound source. Higher IACC was also measured at the seats on the center line squarely facing the source.
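
Of the parameters listed in this abstract, D50 has a compact definition: the ratio of the energy arriving in the first 50 ms of a room impulse response to the total energy. The sketch below computes it under stated assumptions: the function name is hypothetical, the impulse response is a list of samples that starts at the arrival of the direct sound, and the abstract does not describe how the authors measured or processed their responses.

```python
def d50(impulse_response, sample_rate):
    """Early-to-total energy ratio with a 50 ms boundary (value between 0 and 1)."""
    split = int(round(0.050 * sample_rate))  # number of samples in the first 50 ms
    early = sum(h * h for h in impulse_response[:split])
    total = sum(h * h for h in impulse_response)
    return early / total


# Toy responses at a 1 kHz sample rate: constant energy over 100 ms,
# and a single direct-sound spike with no later reflections.
even_decay = d50([1.0] * 100, 1000)              # half the energy is early
direct_only = d50([1.0] + [0.0] * 99, 1000)      # all the energy is early
```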