Search | Korea Science

Wavelet-based Pitch Detector for 2.4 kbps Harmonic-CELP Coder (2.4 kbps 하모닉-CELP 코더를 위한 웨이블렛 피치 검출기)

방상운;이인성;권오주
The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.717-726 / 2003
This paper presents a method for designing a wavelet-based pitch detector for a 2.4 kbps Harmonic-CELP coder and for achieving effective waveform interpolation by selecting the window shape in transition regions. A waveform interpolation coder encodes one pitch-period-sized segment of speech, a prototype segment, for each frame and generates a smooth waveform by interpolating between the prototype segments of voiced frames. However, when pitch halving or doubling occurs, harmonic synthesis of the prototype waveforms between the previous and current frames produces not only waveform errors but also discontinuities at the frame boundary. In addition, because the waveform interpolation coder synthesizes the excitation waveform in transition regions by overlap-add with a triangular window, the Harmonic-CELP coder fails to model abruptly rising speech, and the synthesized waveform increases only linearly. First, to detect the pitch period precisely, we use a hybrid first-stage pitch detector and increase the precision with a second-stage ACF pitch detector. Next, to modify the excitation window, we detect the onset and offset of each frame using GCI. As a result, pitch doubling is removed, the pitch error rate is reduced by 5.4% compared with ACF and by 2.66% compared with the wavelet detector, and the MOS improves by 0.13 in transition regions.

Adaptive Enhancement Algorithm of Perceptual Filter Using Variable Threshold (가변 임계값을 이용한 지각 필터의 적응적인 음질 개선 알고리즘)

차형태
The Journal of the Acoustical Society of Korea / v.23 no.6 / pp.446-453 / 2004
In this paper, we propose a new adaptive perceptual filter that uses a variable threshold to enhance audio signals degraded by additive nonstationary noise. The adaptive perceptual filter updates the variable threshold each time according to the signal power and the effect of noise variation, so the noisy audio signal is enhanced by a method that controls residual noise effectively. The proposed algorithm uses a perceptual filter that transforms a time-domain signal into the frequency domain and calculates an intensity energy and an excitation energy in the Bark domain. In this method, the threshold determines the stage at which the filter response is updated. Using the variable threshold, the proposed algorithm effectively controls residual noise based on the energy difference of audio signals degraded by additive nonstationary noise. The proposed method is tested with noisy audio signals degraded by nonstationary noise at various signal-to-noise ratios (SNR). We carry out NMR and MOS tests at input SNRs of 15 dB, 20 dB, 25 dB, and 30 dB. Approximate improvements of 17.4 dB, 15.3 dB, 12.8 dB, and 9.8 dB in NMR and of 2.9, 2.5, 2.3, and 1.7 in MOS are achieved for these input signals, respectively.

The Performance Improvement of PLC by Using RTP Extension Header Data for Consecutive Frame Loss Condition in CELP Type Vocoder (CELP Type Vocoder에서 RTP 확장 헤더 데이터를 이용한 연속적인 프레임 손실에 대한 PLC 성능개선)

Hong, Seong-Hoon;Bae, Myung-Jin
The Journal of the Acoustical Society of Korea / v.29 no.1 / pp.48-55 / 2010
Speech quality degrades, especially when consecutive packet losses occur, even if a vocoder deployed in a packet network has its own packet loss concealment (PLC) algorithm. PLC algorithms are divided into transmitter-based and receiver-based algorithms. Transmitter-based algorithms give superior quality by sending additional information; however, they cannot provide mutual compatibility, and they incur extra delay and transmission rate. Receiver-based methods require no additional delay, but the improvement in speech quality they can achieve is limited. In this paper, we propose a new method that places extra information for PLC in the otherwise unused Extension Header Data of the RTP header. This solves these problems and yields enhanced speech quality. The proposed algorithm introduces no extra delay because the receiver already has a jitter buffer to adjust for network delay. The extra information, 16 bits per frame for G.729 PLC, is allocated to the MA filter index in LP synthesis, the excitation signal, the excitation signal gain, and residual gain reconstruction. This allocation follows from the transmitter sending speech data every 20 ms when it transfers the RTP payload. As a result, the proposed method improves performance by about 13.5%.
https://doi.org/10.7776/ASK.2010.29.1.048

On a Pitch Alteration Method by Time-axis Scaling Compensated with the Spectrum for High Quality Speech Synthesis (고음질 합성용 스펙트럼 보상된 시간축조절 피치 변경법)

Bae, Myung-Jin;Lee, Won-Cheol;Im, Sung-Bin
The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.89-95 / 1995
Waveform coding is concerned simply with preserving the waveform shape of the speech signal through a redundancy reduction process. In speech synthesis, waveform coding, with its high sound quality, is mainly used for synthesis by analysis. However, because the parameters of this coding are not separated into excitation and vocal tract parameters, it is difficult to apply waveform coding to synthesis by rule. To do so, a pitch alteration technique is required for prosody control. In this paper, we propose a new pitch alteration method that changes the pitch period in waveform coding by scaling the time axis and compensating the spectrum. It is a time-frequency domain method in which the phase components of the waveform are preserved, with a small spectral distortion of 2.5% or less for a 50% pitch change.

Fast Acoustic Radiation Force Impulse Imaging Using Non-focused Transmission in Medical Ultrasound Imaging (초음파 의료 영상에서 비집속 송신을 이용한 고속 음향 복사력 임펄스 영상법)

Choi, Seung-Min;Park, Jeong-Man;Kwon, Sung-Jae;Jeong, Mok-Kun
The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.151-160 / 2012
In medical ultrasound imaging, elasticity imaging helps to diagnose tumors such as cancer. This paper is concerned with applying acoustic radiation force to soft tissue of interest in order to implement elasticity imaging. To reduce the data acquisition time, instead of relying on transmit focusing, a burst-type plane wave is transmitted so that the acoustic radiation force is applied simultaneously to the entire imaging region under observation. A homogeneous phantom experiment confirms that increasing the transmit excitation duration, rather than employing transmit focusing, generates an acoustic radiation force high enough to obtain elasticity images. It is found, however, that the displacement-versus-time characteristic differs from that observed with a conventional focused acoustic radiation force. Experimental results obtained with an ultrasound phantom and a bovine liver show that lesions can be correctly differentiated.
https://doi.org/10.7776/ASK.2012.31.3.151

Design and Implementation of Simple Text-to-Speech System using Phoneme Units (음소단위를 이용한 소규모 문자-음성 변환 시스템의 설계 및 구현)

Park, Ae-Hee;Yang, Jin-Woo;Kim, Soon-Hyob
The Journal of the Acoustical Society of Korea / v.14 no.3 / pp.49-60 / 1995
This paper studies the design and implementation of a Korean text-to-speech system intended for small, simple systems. A parametric synthesis method is chosen, using PARCOR (PARtial autoCORrelation) coefficients, one of the parameter sets obtained by LPC analysis. The phoneme, the basic unit of speech synthesis, is used as the synthesis unit. PARCOR coefficients, pitch, and amplitude serve as the synthesis parameters for voiced sounds, and the residual signal and PARCOR coefficients serve as the synthesis parameters for unvoiced sounds. We obtained 60% intelligibility by using the residual signal as the excitation signal for unvoiced sounds. The synthesis experiments show that word-level synthesis is feasible, while sentence-level synthesis requires control of phoneme duration. The synthesis system was built from a 486 PC, a 70 Hz to 4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.

Optimal Design of Underwater SAW Devices (수중 SAW Device의 최적 설계법)

Roh, Yong-Rae
The Journal of the Acoustical Society of Korea / v.9 no.4 / pp.18-32 / 1990
Depending on the purpose, a SAW device may have to function while immersed in a liquid. Those familiar with SAW devices would anticipate difficulty, since the propagating surface waves tend to radiate energy into the liquid and hence suffer attenuation. Thus, designing an immersible SAW device requires closer attention to, and full information about, the wave properties in order to overcome the attenuation and achieve the highest SAW generation efficiency. Through numerical simulation, the optimal geometry of underwater SAW devices, including the optimal piezoelectric crystal cut, SAW propagation direction, and nondimensional wave number (ka), is determined so as to obtain the maximum SAW excitation efficiency, the minimum attenuation in propagation, and pure-mode propagation for all modes of surface wave propagation. The design technique can be applied to an arbitrary combination of a piezoelectric layer, a substrate, and a liquid medium. In this paper, PZT and PVDF layers and a steel substrate are used for the solid medium. The technique can be readily employed to design underwater sensors and actuators for applications such as sonar, marine antifouling, and industrial and medical uses.

A Proposal of Output Method of Round Window Stimulation Type Middle Ear Implants using Acoustic Transmission (공기 전도형 출력을 갖는 정원창 자극형 인공중이의 출력방식 제안)

Seong, Kiwoong;Lee, KyuYup;Kim, Myoung Nam;Cho, Jin-Ho
Journal of Korea Multimedia Society / v.21 no.6 / pp.678-684 / 2018
To broaden the indications for middle ear implants, research has been actively conducted on the reverse output method, which stimulates the round window. However, it is very difficult to transmit the vibration output effectively, because individual anatomical differences in the round window niche are very large and the visual field cannot be secured even by a skilled otolaryngologic surgeon. In this paper, we propose a new reverse stimulation method for middle ear implants that transmits energy to the inner ear using air as the medium. This compensates for the disadvantages of the conventional method of transmitting vibration energy and minimizes the interference with energy transfer efficiency caused by the coupling between the excitation point and the output device. Cadaveric experiments yielded forward and backward transfer characteristics and showed that the method can overcome the high acoustical impedance of the round window and transmit energy to the inner ear. The receiver, the output device of conventional hearing aids, can generate a constant volume velocity, so it can deliver a high output in a confined volume such as the round window niche. The suggested method can therefore overcome the high acoustical impedance of the round window and deliver acoustic energy to the inner ear.
https://doi.org/10.9717/kmms.2018.21.6.678

An Efficient Algebraic Codebook Search Method for AMR Speech Coder (적응형 다중 비트율 음성 부호화기를 위한 효율적인 대수코드북 검색법)

변경진;정희범;한민수
The Journal of the Acoustical Society of Korea / v.22 no.2 / pp.129-134 / 2003
In this paper, we implement the AMR speech coder efficiently by reducing the complexity of the algebraic codebook search. To reduce this computational complexity, we propose a fast algebraic codebook search method that improves on the conventional depth-first tree search used in the AMR speech coder algorithm. The proposed method reduces the search complexity by pruning the trees that are less likely to be selected as the optimum excitation. It needs no additional computation to select the trees to be pruned, and it reduces the computational complexity considerably compared with the original depth-first tree search, with only slight degradation of speech quality. Applying our method to an implementation of the AMR speech coder in 12.2 kbps mode on the TeakLite DSP, we reduce the search complexity by about 40% compared with the conventional method.

Performance Improvement of Perceptual Filter Using Noise Energy Control (잡음 에너지 제어를 통한 지각 필터 성능 개선)

Seo Joung-Kook;Cha Hyung-Tai
The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.43-51 / 2005
In this paper, we propose an algorithm that improves the tone quality of a noisy audio signal by enhancing the performance of a perceptual filter through noise energy control. Most algorithms proposed by other researchers apply a filter using the noise energy acquired from a silent interval. In this case, the improvement in tone quality decreases if the noise energy changes with the magnitude or with environmental variation within a signal frame. The proposed method, by contrast, provides a means of finding a good noise estimate by controlling the energy of the estimated noise obtained from a silent interval. Unlike other methods, it also enhances tone quality in the low-frequency band. To evaluate the proposed algorithm, we tested it on various input signals with different signal-to-noise ratios (SNR): 5 dB, 10 dB, 15 dB, and 20 dB. With the proposed algorithm, we confirmed the enhancement of tone quality in terms of segmental SNR (SSNR), noise-to-mask ratio (NMR), and mean opinion score (MOS) tests.
