• Title/Summary/Keyword: Acoustic Performance (음향성능)


Deviation of Heavy-Weight Floor Impact Sound Levels According to Measurement Positions (마이크로폰의 위치에 따른 중량 바닥충격음레벨의 편차)

  • Oh Yang-Ki;Joo Moon-Ki;Park Jong-Young;Kim Ha-Geun;Yang Kwan-Seop
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.2
    • /
    • pp.49-55
    • /
    • 2006
  • Measurement of floor impact sound insulation according to the current Korean Standard KS F 2810-2 is made with peak levels at four points in a receiving room. But it is often the case that the results are inconsistent across the receiving points in the receiving room. Such variations obviously affect the repeatability and reproducibility of measured data. The results show deviations of up to 10 dB in the 63 Hz octave band, with relatively smaller variations in the other low-frequency bands. Such variations seem to come from the modal overlap of the receiving room. Under the current rating method for floor impact sound, KS F 2863-2, this may affect the single-number rating scheme. From the tests in this study, there are 2 dB to 6 dB differences in the single number depending on the combination of measurement points. This means that reducing the measurement variations arising from the microphone positions is needed for better credibility of measurement results.
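
The deviation analysis above can be illustrated numerically. The following sketch is not from the paper; the level values and band selection are hypothetical. It computes the position-to-position spread and an energy-averaged level per octave band from peak levels measured at four microphone positions, the kind of point combination behind the 2 dB to 6 dB single-number differences reported.

```python
import numpy as np

# Hypothetical peak impact sound levels (dB) measured at 4 microphone
# positions (rows) in the 63/125/250/500 Hz octave bands (columns).
levels = np.array([
    [78.0, 69.5, 62.0, 55.0],
    [71.5, 68.0, 61.0, 54.5],
    [75.0, 70.5, 63.5, 56.0],
    [69.0, 67.0, 60.5, 53.5],
])

# Spread between measurement positions per octave band.
spread = levels.max(axis=0) - levels.min(axis=0)

# Energy (power) average over positions, as used when combining points.
energy_avg = 10.0 * np.log10(np.mean(10.0 ** (levels / 10.0), axis=0))

for band, s, avg in zip([63, 125, 250, 500], spread, energy_avg):
    print(f"{band:>4} Hz: spread {s:4.1f} dB, energy average {avg:5.1f} dB")
```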

Geoacoustic Inversion and Source Localization with an L-Shaped Receiver Array (L-자형 선배열을 이용한 지음향학적 인자 역산 및 음원 위치 추정)

  • Kim, Kyung-Seop;Lee, Keun-Hwa;Kim, Seong-Il;Kim, Young-Gyu;Seong, Woo-Jae
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.7
    • /
    • pp.346-355
    • /
    • 2006
  • Acoustic data from a shallow-water experiment in the East Sea of Korea (MAPLE IV) are processed to investigate the performance of matched-field geoacoustic inversion and source localization. The receiver array consists of two legs in an L-shape, one vertical and the other horizontal, lying on the seabed. A narrowband multi-tone CW source was towed along a slightly inclined bathymetry track. The matched-field geoacoustic inversion compares three processing techniques, all based on the Bartlett processor: (1) coherent processing of the data from the full array, (2) the incoherent product of the outputs from the horizontal and vertical arrays, and (3) the cross-correlation between the horizontal and vertical arrays, as well as processing each array leg separately. To verify the inversion results, matched-field source localization for low-level source signal components was performed using the same processors used at the inversion stage.
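
All three techniques are built on the Bartlett processor, which correlates measured array data with modeled replica fields. A minimal sketch of a coherent Bartlett ambiguity surface follows; it uses simple free-field replicas and a hypothetical array geometry purely for illustration, whereas the actual inversion would use replicas from an ocean-acoustic propagation model.

```python
import numpy as np

def bartlett(replica, data_csm):
    """Normalized Bartlett power for one replica vector and a data
    cross-spectral matrix (CSM)."""
    w = replica / np.linalg.norm(replica)
    return np.real(w.conj() @ data_csm @ w) / np.real(np.trace(data_csm))

# Hypothetical 8-element array; free-field replicas stand in for the
# normal-mode or PE replicas an actual MFP inversion would use.
f, c = 300.0, 1500.0            # Hz, m/s
x = np.arange(8) * 2.0          # receiver positions along the array (m)
k = 2.0 * np.pi * f / c

def replica(r_src):
    r = np.abs(x - r_src) + 50.0          # toy source-receiver ranges
    return np.exp(-1j * k * r) / r

true_pos = 7.0
d = replica(true_pos) + 0.05 * (np.random.randn(8) + 1j * np.random.randn(8))
csm = np.outer(d, d.conj())

candidates = np.linspace(0.0, 14.0, 141)
surface = [bartlett(replica(r), csm) for r in candidates]
print("estimated position:", candidates[int(np.argmax(surface))])
```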

Development of a Listener Position Adaptive Real-Time Sound Reproduction System (청취자 위치 적응 실시간 사운드 재생 시스템의 개발)

  • Lee, Ki-Seung;Lee, Seok-Pil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.7
    • /
    • pp.458-467
    • /
    • 2010
  • In this paper, a new audio reproduction system was developed in which the cross-talk signals are reasonably cancelled at an arbitrary listener position. To adaptively remove the cross-talk signals according to the listener's position, a method of tracking the listener position was employed. This was achieved using two microphones, where the listener direction was estimated from the time delay between the signals of the two microphones. Moreover, room reverberation effects were taken into consideration using linear prediction analysis. To remove the cross-talk signals at the left and right ears, the paths between the sources and the ears were represented using KEMAR head-related transfer functions (HRTFs) measured from an artificial dummy head. To evaluate the usefulness of the proposed listener tracking system, the performance of cross-talk cancellation was evaluated at the estimated listener positions. The performance was evaluated in terms of the channel separation ratio (CSR); a CSR of -10 dB was experimentally achieved even when the listener position deviated somewhat. A real-time system was implemented using a floating-point digital signal processor (DSP). It was confirmed that the average error of the listener direction was 5 degrees, and the subjects indicated that 80 % of the stimuli were perceived in the correct directions.
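
The listener-tracking step relies on the time delay between the two microphone signals. A minimal sketch of this delay-to-direction idea, using plain cross-correlation and a far-field geometry assumption (the paper's exact estimator, microphone spacing and sampling rate are not reproduced here), is shown below.

```python
import numpy as np

def estimate_direction(sig_left, sig_right, fs, mic_spacing, c=343.0):
    """Estimate the listener direction from the time delay between two
    microphone signals via cross-correlation (far-field assumption)."""
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag = np.argmax(corr) - (len(sig_right) - 1)     # delay in samples
    tau = lag / fs                                   # delay in seconds
    # Delay-to-angle conversion: tau = mic_spacing * sin(theta) / c
    sin_theta = np.clip(tau * c / mic_spacing, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Toy example: a burst arriving 3 samples later at the left microphone
# than at the right one (hypothetical spacing and sampling rate).
fs, spacing = 16000, 0.2
sig = np.random.randn(1024)
left = np.concatenate([np.zeros(3), sig])
right = np.concatenate([sig, np.zeros(3)])
print(f"estimated direction: {estimate_direction(left, right, fs, spacing):.1f} deg")
```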

Implementation of Parallel Processor for Sound Synthesis of Guitar (기타의 음 합성을 위한 병렬 프로세서 구현)

  • Choi, Ji-Won;Kim, Yong-Min;Cho, Sang-Jin;Kim, Jong-Myon;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.3
    • /
    • pp.191-199
    • /
    • 2010
  • Physical modeling is a synthesis method that produces high-quality sound close to that of real musical instruments. However, since physical modeling requires many parameters to synthesize the sound of a musical instrument, it prevents real-time processing for an instrument that produces a large number of sounds simultaneously. To solve this problem, this paper proposes a single instruction multiple data (SIMD) parallel processor that supports real-time sound synthesis of the guitar, a representative plucked-string instrument. To control the six strings of the guitar, we used a SIMD parallel processor consisting of six processing elements (PEs), each of which models one string. The proposed SIMD processor can generate the synthesized sounds of the six strings simultaneously when the parallel synthesis algorithm receives the excitation signals and parameters of each string as input. Experimental results using a 44.1 kHz sampling rate and 16-bit quantization indicate that the sounds synthesized by the proposed parallel processor are very similar to the original sounds. In addition, the proposed parallel processor outperforms the commercial TI TMS320C6416 in terms of execution time (8.9x better) and energy efficiency (39.8x better).
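
As an illustration of plucked-string synthesis, the sketch below uses the classic Karplus-Strong algorithm, a simple relative of full physical modeling, for six guitar strings. It is a sequential stand-in for what the paper maps onto six SIMD processing elements; the tuning frequencies are ordinary standard-tuning values, not parameters from the paper.

```python
import numpy as np

def karplus_strong(freq, fs=44100, duration=1.0, decay=0.996):
    """Simple Karplus-Strong plucked-string synthesis: a noise burst fed
    through a delay line with an averaging (low-pass) loop filter."""
    n_samples = int(fs * duration)
    delay = int(round(fs / freq))
    buf = np.random.uniform(-1.0, 1.0, delay)    # excitation (pluck)
    out = np.empty(n_samples)
    for i in range(n_samples):
        out[i] = buf[i % delay]
        nxt = decay * 0.5 * (buf[i % delay] + buf[(i + 1) % delay])
        buf[i % delay] = nxt
    return out

# Standard-tuning open-string frequencies (Hz); in the paper each of the
# six strings would be handled by its own processing element in parallel.
string_freqs = [82.41, 110.00, 146.83, 196.00, 246.94, 329.63]
strings = [karplus_strong(f, duration=0.5) for f in string_freqs]
mix = np.sum(strings, axis=0) / len(strings)     # simple mixdown
print(mix.shape, mix.dtype)
```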

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.8
    • /
    • pp.697-705
    • /
    • 2009
  • An unsupervised (blind) method is proposed for extracting rhythmic sources from commercial polyphonic music limited to a single channel. Commercial music signals are usually provided with at most two channels, while they often contain multiple instruments including singing voice. Therefore, instead of conventional modeling of the mixing environment or statistical characteristics, other source-specific characteristics must be introduced for separating or extracting sources in such underdetermined conditions. In this paper, we concentrate on extracting rhythmic sources from a mixture with other harmonic sources. An extension of nonnegative matrix factorization (NMF), called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties of the given input matrices. Moreover, the temporal repeatability of the rhythmic sources is imposed as a common rhythmic property shared among segments of the input mixture signal. The proposed method shows separation quality that is acceptable, though not superior, compared with prior-knowledge-based drum source separation systems, but it is more broadly applicable because of its blind manner of separation, for example when no prior information is available or the target rhythmic source is irregular.
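
NMPCF extends NMF by co-factorizing several matrices with shared factors. The sketch below shows only the plain NMF building block, with multiplicative updates on a toy nonnegative matrix; the partial co-factorization and rhythmic-repeatability constraints of the paper are not reproduced.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Basic NMF with multiplicative updates (Euclidean cost); NMPCF in the
    paper co-factorizes several such matrices with shared factors."""
    n_freq, n_time = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonnegative matrix standing in for a magnitude spectrogram.
V = np.abs(np.random.randn(64, 100))
W, H = nmf(V, rank=2)
print("reconstruction error:", np.linalg.norm(V - W @ H))
```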

Salient Region Detection Algorithm for Music Video Browsing (뮤직비디오 브라우징을 위한 중요 구간 검출 알고리즘)

  • Kim, Hyoung-Gook;Shin, Dong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.2
    • /
    • pp.112-118
    • /
    • 2009
  • This paper proposes a rapid salient-region detection algorithm for a music video browsing system, which can be applied to mobile devices and digital video recorders (DVRs). The input music video is decomposed into music and video tracks. For the music track, the music highlight including the musical chorus is detected by structure analysis using energy-based peak position detection. Using emotional models generated by an SVM-AdaBoost learning algorithm, the music signal of the music video is automatically classified into one of the predefined emotional classes. For the video track, face scenes including the singer or actor/actress are detected based on a boosted cascade of simple features. Finally, the salient region is generated by aligning the boundaries of the music highlight and the visual face scenes. Users first select their favorite music videos on the mobile device or DVR using the emotion information of each music video, and can then quickly browse the 30-second salient region produced by the proposed algorithm. A mean opinion score (MOS) test with a database of 200 music videos was conducted to compare the detected salient region with a predefined manual part. The MOS test results show that the salient region detected by the proposed method performs much better than the predefined manual part without audiovisual processing.
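
The music-highlight step rests on energy-based peak position detection. The following sketch picks the maximum-energy frame of a signal and returns a fixed-length region around it; the frame length and hop size are illustrative choices, and only the 30-second region length comes from the abstract.

```python
import numpy as np

def detect_highlight(audio, fs, frame_len=2048, hop=1024, region_sec=30.0):
    """Energy-envelope peak detection: pick the frame with maximum energy
    and return a fixed-length salient region centered on it (in seconds)."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    energy = np.array([
        np.sum(audio[i * hop:i * hop + frame_len] ** 2) for i in range(n_frames)
    ])
    peak_frame = int(np.argmax(energy))
    peak_time = (peak_frame * hop + frame_len / 2) / fs
    start = max(0.0, peak_time - region_sec / 2)
    return start, start + region_sec

# Toy signal whose loudest part is placed around 60 s.
fs = 22050
t = np.arange(0, 120 * fs) / fs
audio = 0.1 * np.sin(2 * np.pi * 440 * t)
audio[60 * fs:65 * fs] *= 5.0
print(detect_highlight(audio, fs))
```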

The Error Pattern Analysis of the HMM-Based Automatic Phoneme Segmentation (HMM기반 자동음소분할기의 음소분할 오류 유형 분석)

  • Kim Min-Je;Lee Jung-Chul;Kim Jong-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.5
    • /
    • pp.213-221
    • /
    • 2006
  • Phone segmentation of the speech waveform is especially important for concatenative text-to-speech synthesis, which uses segmented corpora for the construction of synthesis units, because the quality of the synthesized speech depends critically on the accuracy of the segmentation. Initially, phone segmentation was performed manually, but this requires huge effort and long turnaround time. HMM-based approaches adopted from automatic speech recognition are the most widely used for automatic segmentation in speech synthesis, providing a consistent and accurate phone labeling scheme. Even though the HMM-based approach has been successful, it may locate a phone boundary at a position different from the expected one. In this paper, we categorized adjacent phoneme pairs, analyzed the mismatches between hand-labeled transcriptions and HMM-based labels, and described the dominant error patterns that must be improved for speech synthesis. For the experiment, the hand-labeled standard Korean speech DB from ETRI was used as the reference DB. A time difference larger than 20 ms between a hand-labeled phoneme boundary and the auto-aligned boundary is treated as an automatic segmentation error. Our experimental results for a female speaker revealed that plosive-vowel, affricate-vowel and vowel-liquid pairs showed high accuracies of 99 %, 99.5 % and 99 %, respectively, while stop-nasal, stop-liquid and nasal-liquid pairs showed very low accuracies of 45 %, 50 % and 55 %. The results for a male speaker revealed a similar tendency.
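
The evaluation criterion is a 20 ms tolerance between hand-labeled and auto-aligned boundaries, accumulated per adjacent phoneme-pair class. A small sketch of that bookkeeping, with hypothetical boundary times, is shown below.

```python
from collections import defaultdict

def boundary_accuracy(pairs, tolerance=0.020):
    """Per phone-pair segmentation accuracy: a boundary is counted correct
    when the automatic boundary is within `tolerance` seconds of the hand
    label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for pair_type, hand_t, auto_t in pairs:
        totals[pair_type] += 1
        if abs(hand_t - auto_t) <= tolerance:
            hits[pair_type] += 1
    return {p: hits[p] / totals[p] for p in totals}

# Hypothetical (pair type, hand-labeled time, auto-aligned time) tuples.
examples = [
    ("plosive-vowel", 1.210, 1.216),
    ("plosive-vowel", 2.430, 2.445),
    ("stop-nasal",    0.850, 0.910),
    ("stop-nasal",    1.990, 2.004),
]
print(boundary_accuracy(examples))
```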

A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System (가변 Break를 이용한 코퍼스 기반 일본어 음성 합성기의 성능 향상 방법)

  • Na, Deok-Su;Min, So-Yeon;Lee, Jong-Seok;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.2
    • /
    • pp.155-163
    • /
    • 2009
  • In text-to-speech systems, the conversion of text into prosodic parameters necessarily consists of three steps: the placement of prosodic boundaries, the determination of segmental durations, and the specification of fundamental frequency contours. Prosodic boundaries, as the most important and basic parameter, affect the estimation of durations and fundamental frequency. Break prediction is thus an important step in text-to-speech systems, as break indices (BIs) greatly influence how correctly prosodic phrase boundaries are represented. However, accurate prediction is difficult since BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, the prediction of an accentual phrase boundary (APB) and a major phrase boundary (MPB) is particularly difficult. Thus, this paper presents a method to compensate for the prediction errors of APBs and MPBs. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). The VB is chosen using a classification and regression tree, and multiple prosodic targets for pitch and duration are then generated. Finally, unit selection is conducted using the multiple prosodic targets. In the MOS test, the original speech scored 4.99, while the proposed method scored 4.25 and the conventional method scored 4.01. The experimental results show that the proposed method improves the naturalness of the synthesized speech.
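
The variable breaks are selected with a classification and regression tree (CART). The sketch below uses scikit-learn's DecisionTreeClassifier as a generic CART stand-in on hypothetical boundary features and labels; the paper's actual feature set, training data, and VB selection rule are not reproduced.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical numeric features per phrase boundary (e.g. phrase length,
# a part-of-speech code, distance from sentence end) and break labels:
# 0 = accentual phrase boundary (APB), 1 = major phrase boundary (MPB).
X = np.array([
    [3, 1, 8], [5, 2, 6], [2, 1, 9], [7, 3, 2],
    [6, 3, 3], [4, 2, 5], [8, 3, 1], [3, 1, 7],
])
y = np.array([0, 0, 0, 1, 1, 0, 1, 0])

# A CART-style tree; ambiguous boundaries (candidate variable breaks)
# could be flagged via the predicted class probabilities.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
proba = tree.predict_proba([[5, 2, 4]])[0]
print("P(APB), P(MPB) =", proba)
```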

Functional beamforming for high-resolution ultrasound imaging in the air with random sparse array transducer (고해상도 공기중 초음파 영상을 위한 기능성 빔형성법 적용)

  • Choon-Su Park
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.3
    • /
    • pp.361-367
    • /
    • 2024
  • Ultrasound in air is widely used in industry as a measurement technique for preventing abnormalities in machinery. Recently, the use of airborne ultrasound imaging techniques that can locate abnormalities using an array transducer is increasing. A beamforming method that uses the phase differences between sensors is used to visualize the location of an ultrasonic source. We exploit a random sparse ultrasonic array and obtain the beamforming power distribution over a source at a certain distance from the array. Conventional beamforming methods inevitably have limited spatial resolution depending on the number of sensors used and the aperture size. A high-resolution ultrasound imaging technique was implemented by applying functional beamforming as a method to overcome the geometric constraints of the array. The functional beamforming method can be expressed mathematically as a generalized beamforming method, and it has the advantage of producing high-resolution images by reducing the main-lobe width and side-lobe levels. Computer simulations verified that functional beamforming with the ultrasonic sparse array successfully increases the resolution of airborne ultrasonic source imaging.
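
Functional beamforming raises the cross-spectral matrix (CSM) to the power 1/ν via its eigendecomposition, forms the conventional (Bartlett) map, and raises the result to the power ν, which narrows the main lobe and suppresses side lobes. A minimal far-field sketch with a hypothetical 16-element random sparse array follows; the paper's near-field imaging geometry and exponent choice are not reproduced.

```python
import numpy as np

def functional_beamforming(csm, steering, nu=20):
    """Functional beamforming: take the 1/nu power of the CSM via its
    eigendecomposition, form the Bartlett map, then raise it to the nu power."""
    vals, vecs = np.linalg.eigh(csm)
    vals = np.clip(vals, 0.0, None)                  # guard tiny negatives
    csm_root = (vecs * vals ** (1.0 / nu)) @ vecs.conj().T
    power = np.einsum("im,ij,jm->m", steering.conj(), csm_root, steering).real
    return power ** nu

# Hypothetical 16-element random sparse array and a single far-field source.
rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 0.3, 16)                      # element positions (m)
f, c = 40000.0, 343.0                                # 40 kHz ultrasound in air
k = 2.0 * np.pi * f / c

angles = np.linspace(-60, 60, 241)                   # candidate directions (deg)
steering = np.exp(1j * k * np.outer(pos, np.sin(np.radians(angles))))
steering /= np.linalg.norm(steering, axis=0)         # unit-norm steering vectors

src = np.exp(1j * k * pos * np.sin(np.radians(10.0)))    # source at +10 deg
csm = np.outer(src, src.conj())

power = functional_beamforming(csm, steering, nu=20)
print("estimated direction:", angles[int(np.argmax(power))], "deg")
```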

Characteristics of source localization with horizontal line array using frequency-difference autoproduct in the East Sea environment (동해 환경에서 차주파수 곱 및 수평선배열을 이용한 음원 위치추정 특성)

  • Joung-Soo Park;Jungyong Park;Su-Uk Son;Ho Seuk Bae;Keun-Wha Lee
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.1
    • /
    • pp.29-38
    • /
    • 2024
  • Matched Field Processing (MFP) estimates the source range and depth based on predictions of sound propagation. However, as the frequency increases, the inaccuracy of the sound-propagation prediction grows, making it difficult to estimate the source position. The recently proposed Frequency-Difference Matched Field Processing (FD-MFP) is known to remain robust under mismatch by applying the frequency-difference autoproduct extracted from the auto-correlation of a high-frequency signal. In this paper, to evaluate the performance of FD-MFP with a horizontal line array, simulations were conducted in the environment of the East Sea of Korea, in the Bottom Bounce (BB) and Convergence Zone (CZ) areas where a sound source can be detected at long range, and the localization results were analyzed. According to the FD-MFP simulations with the horizontal line array, the localization accuracy is similar to or degraded from that of the conventional MFP due to the diffracted field and the mismatch of sound speed. The simulations gave no clear evidence that the FD-MFP is more robust to mismatch than the conventional MFP.
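
The frequency-difference autoproduct is formed from products of the measured spectrum at pairs of frequencies separated by a fixed difference frequency, P(f2)·P*(f1), and the result is then processed like a low-frequency field with a conventional Bartlett matched-field processor. A minimal sketch of the autoproduct construction on hypothetical broadband array spectra is given below; the averaging over frequency pairs and the array geometry are illustrative assumptions, not the paper's processing chain.

```python
import numpy as np

def frequency_difference_autoproduct(spectra, freqs, delta_f):
    """Form the frequency-difference autoproduct AP = P(f2) * conj(P(f1))
    for all in-band pairs with f2 - f1 = delta_f, averaged over the pairs.
    `spectra` has shape (n_receivers, n_freqs)."""
    df = freqs[1] - freqs[0]
    shift = int(round(delta_f / df))
    pairs = spectra[:, shift:] * np.conj(spectra[:, :-shift])
    return pairs.mean(axis=1)          # one autoproduct value per receiver

# Toy broadband spectra on a hypothetical 8-element horizontal line array.
rng = np.random.default_rng(0)
freqs = np.arange(2000.0, 3000.0, 10.0)              # Hz
spectra = rng.standard_normal((8, freqs.size)) + 1j * rng.standard_normal((8, freqs.size))

ap = frequency_difference_autoproduct(spectra, freqs, delta_f=100.0)
# `ap` behaves like a field at the 100 Hz difference frequency and could be
# fed to a Bartlett matched-field processor with low-frequency replicas,
# which is the essence of the FD-MFP described above.
print(ap.shape)
```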