• Title/Summary/Keyword: Sound Signal

A Comparison Study on the Speech Signal Parameters for Chinese Leaners' Korean Pronunciation Errors - Focused on Korean /ㄹ/ Sound (중국인 학습자의 한국어 발음 오류에 대한 음성 신호 파라미터들의 비교 연구 - 한국어의 /ㄹ/ 발음을 중심으로)

  • Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.6 / pp.239-246 / 2017
  • This paper compares speech signal parameters of Korean and Chinese speakers for the Korean consonant /ㄹ/, which Chinese learners frequently mispronounce. The allophones of /ㄹ/ in Korean are divided into a lateral group and a tap group. The reasons for these errors are investigated by studying the similarities and differences between the Korean /ㄹ/ pronunciation and its corresponding Chinese pronunciation. From a phonological perspective, speech signal parameters such as signal energy, the waveform in the time domain, the spectrogram in the frequency domain, the pitch (F0) based on the autocorrelation function (ACF), and the formant frequencies (F1, F2, F3, and F4) are measured and compared. The data, consisting of groups of Korean words selected through a philological investigation, are used in the simulations. According to the simulation results for energy and the spectrogram, there are meaningful differences between native Korean speakers and Chinese learners in the pronunciation of Korean /ㄹ/. The simulation results also show some differences in the other parameters. Chinese learners may be able to reduce these errors considerably by exploiting the parameters used in this paper.
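As a rough illustration of the ACF-based pitch parameter mentioned in this abstract, the sketch below estimates F0 for a single voiced frame by picking the autocorrelation peak within a plausible lag range. The function name `estimate_f0_acf`, the 75 to 400 Hz search range, and the synthetic test frame are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def estimate_f0_acf(frame, fs, fmin=75.0, fmax=400.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    fmin/fmax are assumed search bounds, not values from the paper.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()                                   # remove DC offset
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]  # keep lags 0..N-1
    lag_min = int(fs / fmax)                           # shortest period searched
    lag_max = min(int(fs / fmin), len(acf) - 1)        # longest period searched
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag

# Synthetic "voiced" frame: 200 Hz fundamental plus one harmonic (assumed data).
fs = 16000
t = np.arange(int(0.04 * fs)) / fs
frame = np.sin(2 * np.pi * 200.0 * t) + 0.5 * np.sin(2 * np.pi * 400.0 * t)
f0 = estimate_f0_acf(frame, fs)
print(f"estimated F0: {f0:.1f} Hz")
```

Restricting the lag search to a range of plausible pitch periods is one simple way to avoid picking the zero-lag peak or a harmonic; real speech would also need a voicing decision, which is omitted here.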

DIGITAL IMAGE PROCESSING AND CLINICAL APPLICATION OF VIDEODENSITOMETER (실험적으로 제작한 Videodensitometer의 디지털 영상처리와 임상적 적용에 관한 연구)

  • Park Kwan-Soo;Lee Sang-Rae
    • Journal of Korean Academy of Oral and Maxillofacial Radiology / v.22 no.2 / pp.273-282 / 1992
  • The purpose of this study was to evaluate the utility of digital image processing with, and the clinical application of, videodensitometry. The experiments were performed with an IBM-PC/16bit-AT compatible computer, a video camera (CCD-TR55, Sony Co., Japan), and a color monitor (MultiSync 3D, NEC, Japan) providing a resolution of 512×480 and 64 levels of gray. A Sylvia Image Capture Board was used as the ADC (analog-to-digital converter) to produce the digitized image, and the radiographic density was measured in 256 levels of gray. Periapical film (Ektaspeed EP-21, Kodak Co., U.S.A.), exposed with a dried human mandible at 70 kVp and 48 impulses, was used as the primary X-ray detector. The digitized images were then evaluated for the effects of low and high pass filtering, the correlation between aluminum equivalent values and the thickness of an aluminum step wedge, the aluminum equivalent values of sound enamel, dentin, and alveolar bone, and the range of diffuse density for gray levels ranging from 0 to 255. The obtained results were as follows: 1. The edges between aluminum steps in the digitized image were somewhat blurred by low pass filtering, but edge enhancement was achieved by high pass filtering. In particular, edge enhancement between the distal root of the lower left 2nd molar and the alveolar lamina dura was observed. 2. The correlation between aluminum equivalent values and the thickness of the aluminum step wedge was close, with a correlation coefficient of r=0.9997 (p<0.001), a regression line of Y=0.9699X+0.456, and a coefficient of variation of 1.5%. 3. The aluminum equivalent values of sound enamel, dentin, and alveolar bone were 15.41㎜, 12.48㎜, and 10.35㎜, respectively. 4. The range of diffuse density for gray levels ranging from 0 to 255 was within 1-4.9, sufficiently wider than that of a photodensitometer.

Performance improvement of a quiet zone using multichannel real-time active noise control system (다채널 실시간 능동 소음제어 시스템을 이용한 정숙공간 성능개선)

  • Mu, Xiangbin;Ko, JinSeok;Rheem, JaeYeol
    • The Journal of the Acoustical Society of Korea / v.35 no.3 / pp.216-222 / 2016
  • Generating a quiet zone in a noisy environment is of considerable practical significance. This paper describes the development and implementation of a multichannel real-time active noise control (ANC) system for a three-dimensional noisy environment that enhances quiet-zone performance in terms of size and noise cancellation gain. The proposed ANC system employs a multichannel delay-compensated filtered-X least mean square (FXLMS) algorithm, implemented in real time on a TMS320C6713 digital signal processor (DSP) board. The system is evaluated on cancelling various tonal noises in the range from 100 to 500 Hz, and its performance is illustrated by measuring the quiet zone in terms of sound pressure level (SPL) attenuation. Experimental results show that a quiet zone of satisfactory size with a maximum noise attenuation of 24 dB is successfully generated.
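To make the FXLMS idea concrete, here is a minimal single-channel simulation for one tonal noise. It is only a sketch: the paper's system is multichannel and delay-compensated, whereas this example uses the plain FXLMS update. The primary path `p`, the secondary path `s`, the perfect secondary-path estimate, the 200 Hz tone, and the step size are all hypothetical values chosen for illustration.

```python
import numpy as np

def fxlms_tone_demo(f_tone=200.0, fs=2000, n_samples=8000, n_taps=16, mu=0.01):
    """Single-channel FXLMS on a pure tone; all paths are hypothetical."""
    n = np.arange(n_samples)
    x = np.sin(2 * np.pi * f_tone * n / fs)        # reference signal
    p = np.array([0.0, 0.0, 0.0, 0.8, 0.3])        # assumed primary path (noise -> error mic)
    s = np.array([0.0, 0.6, 0.2])                  # assumed secondary path (speaker -> error mic)
    s_hat = s.copy()                               # assume a perfect secondary-path estimate
    d = np.convolve(x, p)[:n_samples]              # noise observed at the error microphone
    xf = np.convolve(x, s_hat)[:n_samples]         # reference filtered by s_hat ("filtered-X")

    w = np.zeros(n_taps)                           # adaptive controller weights
    x_buf = np.zeros(n_taps)                       # recent reference samples
    xf_buf = np.zeros(n_taps)                      # recent filtered-reference samples
    y_buf = np.zeros(len(s))                       # recent controller outputs
    e = np.zeros(n_samples)
    for k in range(n_samples):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[k]
        xf_buf = np.roll(xf_buf, 1)
        xf_buf[0] = xf[k]
        y_buf = np.roll(y_buf, 1)
        y_buf[0] = w @ x_buf                       # anti-noise sent to the loudspeaker
        e[k] = d[k] - s @ y_buf                    # residual at the error microphone
        w += mu * e[k] * xf_buf                    # FXLMS weight update
    return d, e

d, e = fxlms_tone_demo()
tail = slice(-1000, None)                          # compare powers after convergence
attenuation_db = 10 * np.log10(np.mean(d[tail] ** 2) / (np.mean(e[tail] ** 2) + 1e-30))
print(f"attenuation over the last 1000 samples: {attenuation_db:.1f} dB")
```

Because this simulation is noise-free and uses an exact secondary-path model, the attenuation it reports is far more optimistic than anything measured on real hardware.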

Music Genre Classification using Spikegram and Deep Neural Network (스파이크그램과 심층 신경망을 이용한 음악 장르 분류)

  • Jang, Woo-Jin;Yun, Ho-Won;Shin, Seong-Hyeon;Cho, Hyo-Jin;Jang, Won;Park, Hochong
    • Journal of Broadcast Engineering / v.22 no.6 / pp.693-701 / 2017
  • In this paper, we propose a new method for music genre classification using a spikegram and a deep neural network. The human auditory system encodes the input sound in the time and frequency domains in order to maximize the amount of sound information delivered to the brain using minimal energy and resources. The spikegram is a method of analyzing a waveform based on the encoding function of the auditory system. In the proposed method, we analyze the signal using the spikegram and extract a feature vector composed of key information for genre classification, which is used as the input to the neural network. We measure the performance of music genre classification using the GTZAN dataset, which consists of 10 music genres, and confirm that the proposed method provides good performance with a low-dimensional feature vector compared to current state-of-the-art methods.

DIRECTIVE HARMONIC WAVE DETECTING SYSTEM USING LINEAR MICROPHONE ARRAY (직선배열 Microphone에 의한 음원의 방향과 주파수의 분석 System)

  • CHANG J.;ABE M.;KIM C.;KIDO K.
    • Korean Journal of Fisheries and Aquatic Sciences / v.13 no.4 / pp.145-149 / 1980
  • Various methods have been proposed so far to find the directions and spectra of sound waves from sources for the purpose of noise control. The conventional methods are generally classified into three systems: the single-microphone system, the moving-microphone system, and the multi-microphone system, which composes a resultant super-directivity by applying an appropriate delay and weighting coefficient to the output of each microphone. With a single microphone, it is difficult to obtain the desired super-directivity in the low frequency range, while the multi-microphone system has the disadvantage that the directivity measurement cannot be carried out separately from the spectrum analysis. The moving-microphone system requires that the sound source to be detected be stationary and at rest. Here we introduce a method in which spectral analysis and directivity synthesis can be carried out separately by using a linear array of many microphones: the output of each microphone is multiplied by an appropriate weighting coefficient, and all of those products are summed after passing through adequate filters. The resultant signal is then sampled at an adequate sampling frequency and averaged for processing.
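The weighted-sum directivity described in this abstract can be sketched for the narrowband case as a steered linear-array response. This is an illustrative model only: the function `array_response`, the Hamming weights, the 8-microphone array with 0.1 m spacing, the 1 kHz analysis frequency, and the 20° steering angle are assumptions, and the paper's filters are reduced here to per-microphone phase shifts.

```python
import numpy as np

def array_response(weights, spacing_m, freq_hz, steer_deg, angles_deg, c=343.0):
    """Normalized magnitude response of a weighted, steered uniform linear array.

    Narrowband plane-wave model; the steering "filter" is a pure phase shift.
    """
    weights = np.asarray(weights, dtype=float)
    m = np.arange(len(weights))                       # microphone indices
    k = 2 * np.pi * freq_hz / c                       # wavenumber
    steer = np.sin(np.deg2rad(steer_deg))
    arrive = np.sin(np.deg2rad(np.asarray(angles_deg, dtype=float)))
    # Arrival phase at each microphone minus the compensating steering phase.
    phase = k * spacing_m * np.outer(arrive - steer, m)
    return np.abs(np.exp(1j * phase) @ weights) / np.sum(np.abs(weights))

angles = np.arange(-90, 91)                           # candidate arrival angles (degrees)
weights = np.hamming(8)                               # assumed weighting coefficients
resp = array_response(weights, spacing_m=0.1, freq_hz=1000.0,
                      steer_deg=20, angles_deg=angles)
peak_angle = int(angles[np.argmax(resp)])
print(f"response peaks at {peak_angle} degrees")
```

Tapered weights such as the Hamming window trade a wider main lobe for lower side lobes, which is one common reason to weight the microphone outputs before summing.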

Vocal Separation Using Selective Frequency Subtraction Considering with Energies and Phases (에너지와 위상을 고려한 선택적 주파수 차감법을 이용한 보컬 분리)

  • Kim, Hyuntae;Park, Jangsik
    • Journal of Broadcast Engineering / v.20 no.3 / pp.408-413 / 2015
  • Recently, with increasing interest in original-sound karaoke machines, manufacturers of MIDI-type karaoke machines have been looking for a cheaper alternative to original recording. Specifically, the method produces an original-sound accompaniment by removing only the singer's voice from the singer's music album. In this paper, a system that separates vocal components from the music accompaniment in stereo recordings is proposed. The proposed system consists of two stages. The first stage is vocal detection, which classifies the input into vocal and non-vocal portions using an SVM with MFCCs. In the second stage, selective frequency subtraction is performed at each frequency bin in the vocal portions. The subtraction is decided by considering not only the energy of each frequency bin but also the phase of each frequency bin in each channel signal. A listening test on music with the vocals removed by the proposed system shows a relatively high level of satisfaction.

Electronic Stethoscope using PVDF Sensor for Wireless Transmission of Heart and Lung Sounds (PVDF를 이용한 청진 센서 및 심폐음 무선 전송이 가능한 전자 청진기)

  • Im, Jae Joong;Lim, Young Chul
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.6 / pp.57-63 / 2012
  • Effective use of the stethoscope is very important for the primary clinical diagnosis of increasingly prevalent cardiovascular and respiratory diseases. This study developed a contact vibration sensor using piezopolymer film that minimizes ambient noise, and applied a signal processing algorithm to provide better auscultation sounds than existing electronic stethoscopes. In particular, low-frequency heart sounds were acquired without distortion, and the quality of lung sounds was improved. In addition, auscultation sounds could be transmitted via Bluetooth, which makes the device usable in a u-healthcare environment. The results of this study, auscultation of heart and lung sounds, could be applied to the convergence industry of medical and information communication technology through remote diagnosis.

An Accuracy Improvement Method on Acoustic Source Localization Using Ground Reflection Effect (지면반사효과를 이용한 폭발 소음원의 위치 추정 정밀도 향상법)

  • Go, Yeong-Ju;Choi, Donghun;Lee, Jaehyung;Choi, Jong-Soo;Ha, Jae-Hyoun;Na, Taeheum
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.26 no.1 / pp.69-74 / 2016
  • A technique for improving estimation accuracy is introduced in order to locate the impact position of an artillery shell during weapon scoring tests. Localization of impacts using acoustic measurement has been studied, and the usability of a sensor array has been verified experimentally. When a blast occurs above the ground in the firing range, an acoustic sensor above the ground measures the directly propagated sound together with the ground-reflected one. This study proposes a method for reducing the estimation error by using the reflected-signal measurements in the time difference of arrival method. Taking the reflected sound into account is equivalent to placing a virtual sensor at the mirror position below the ground. This idea enables a virtual three-dimensional array configuration from a two-dimensional planar array above the ground. The time difference between the direct and the reflected propagation can be estimated using cepstrum analysis. A performance test was carried out by simulation over an area the size of a football field.
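The cepstrum step mentioned in this abstract can be sketched as follows: when a signal contains a direct arrival plus an attenuated, delayed copy, its power cepstrum shows a peak at the quefrency equal to that delay. The function `echo_delay_cepstrum`, the white-noise burst standing in for the blast, the 0.6 reflection gain, and the 15 ms delay are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def echo_delay_cepstrum(signal, fs, min_delay_s=0.001):
    """Estimate the direct-to-reflected delay (s) from the power cepstrum peak."""
    x = np.asarray(signal, dtype=float)
    n_fft = 2 * len(x)                                 # zero-pad to reduce cepstral aliasing
    spectrum = np.fft.rfft(x, n_fft)
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)  # small floor avoids log(0)
    cepstrum = np.fft.irfft(log_power, n_fft)
    start = int(min_delay_s * fs)                      # skip the lowest quefrencies
    stop = n_fft // 2
    peak = start + int(np.argmax(cepstrum[start:stop]))
    return peak / fs

# Hypothetical measurement: broadband burst plus one ground reflection.
fs = 8000
rng = np.random.default_rng(0)
burst = rng.standard_normal(1000)                      # stand-in for the blast signature
delay_samples = 120                                    # assumed 15 ms path difference
received = np.zeros(len(burst) + delay_samples)
received[:len(burst)] += burst                         # direct path
received[delay_samples:] += 0.6 * burst                # reflected path, assumed gain 0.6
estimated_delay = echo_delay_cepstrum(received, fs)
print(f"estimated delay: {estimated_delay * 1e3:.2f} ms")
```

In a time-difference-of-arrival setting, a delay estimated this way could serve as the extra measurement associated with the mirrored virtual sensor.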

A Study on the Spoken Korean-Digit Recognition Using the Neural Network (神經網을 利用한 韓國語 數字音 認識에 관한 硏究)

  • Park, Hyun-Hwa;Gahang, Hae Dong;Bae, Keun Sung
    • The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.5-13 / 1992
  • Taking advantage of the property that each Korean digit is a monosyllabic word, we propose a spoken Korean-digit recognition scheme using a multi-layer perceptron. The spoken Korean digit is divided into three segments (initial sound, medial vowel, and final consonant) based on the voice starting and ending points and a peak point in the middle of the vowel sound. Feature vectors such as the cepstrum, reflection coefficients, $\Delta$cepstrum, and $\Delta$energy are extracted from each segment. It is shown that the cepstrum, as an input vector to the neural network, gives a higher recognition rate than reflection coefficients. The regression coefficients of the cepstrum did not affect the recognition rate as much as expected, which we believe is because the features were extracted from selected stationary segments of the input speech signal. With 150 cepstral coefficients obtained from each spoken digit, we achieved a correct recognition rate of 97.8%.
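For readers unfamiliar with the $\Delta$cepstrum (regression coefficient) features named in this abstract, the sketch below computes them with the widely used linear-regression delta formula over a window of neighbouring frames. The half-window of 2 frames, the edge padding, and the toy input are assumptions; the paper does not state which window it used.

```python
import numpy as np

def delta_features(frames, half_window=2):
    """Regression (delta) coefficients along time for a (n_frames, n_coeffs) array.

    delta[t] = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2), for k = 1..half_window.
    Edge frames are handled by repeating the first/last frame (an assumption).
    """
    c = np.asarray(frames, dtype=float)
    n = len(c)
    padded = np.pad(c, ((half_window, half_window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, half_window + 1))
    delta = np.zeros_like(c)
    for k in range(1, half_window + 1):
        ahead = padded[half_window + k: half_window + k + n]    # c[t + k]
        behind = padded[half_window - k: half_window - k + n]   # c[t - k]
        delta += k * (ahead - behind)
    return delta / denom

# Toy cepstral track: two coefficients changing linearly over 10 frames.
cepstra = np.outer(np.arange(10), [1.0, -2.0])
deltas = delta_features(cepstra)
print(deltas[2:8])   # interior frames recover the slope of each coefficient
```

On a stationary segment the cepstral track is nearly constant, so its delta features are close to zero, which is consistent with the abstract's explanation of why they contributed little.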

Quality Improvement of Low Bitrate HE-AAC using Linear Prediction Pre-processor (저 전송률 환경에서 선형예측 전처리기를 사용한 HE-AAC의 성능 향상)

  • Lee, Jae-Seong;Lee, Gun-Woo;Park, Young-Chul;Youn, Dae-Hee
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.8C / pp.822-829 / 2009
  • This paper proposes a new method of improving the quality of High Efficiency Advanced Audio Coding (HE-AAC). HE-AAC encodes the input source by allocating bits to each scalefactor band according to the psychoacoustic properties of the human ear. As a result, insufficient bits are assigned to bands that have relatively low energy. This imbalance between bands of different energy can degrade sound quality, for example by producing musical noise. In the proposed system, a Linear Prediction (LP) module is combined with HE-AAC as a pre-processor to improve sound quality by distributing bits more evenly. To reflect human psychoacoustic properties accurately, the psychoacoustic model uses the Fast Fourier Transform (FFT) spectrum of the original input signal to derive the masking threshold. In the implementation, the masking threshold of the psychoacoustic model is normalized using the LP spectral envelope prior to quantization of the LP residual. Experimental results show that the proposed algorithm allocates bits appropriately when bits are insufficient and improves the performance of HE-AAC.