• Title/Summary/Keyword: Voice quality estimate

Complex nested U-Net-based speech enhancement model using a dual-branch decoder (이중 분기 디코더를 사용하는 복소 중첩 U-Net 기반 음성 향상 모델)

  • Seorim Hwang;Sung Wook Park;Youngcheol Park
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.253-259 / 2024
  • This paper proposes a new speech enhancement model based on a complex nested U-Net with a dual-branch decoder. The proposed model uses a complex nested U-Net to estimate the magnitude and phase components of the speech signal simultaneously, and its decoder has a dual-branch structure that performs spectral mapping and time-frequency masking in separate branches. Compared to a single-branch decoder, the dual-branch structure removes noise effectively while minimizing the loss of speech information. Experiments were conducted on the VoiceBank + DEMAND database, which is commonly used for training speech enhancement models, and the results were evaluated with various objective metrics. The proposed model improved the Perceptual Evaluation of Speech Quality (PESQ) score by about 0.13 over the baseline and achieved higher objective scores than recently proposed speech enhancement models.

A Study on SNR Estimation of Continuous Speech Signal (연속음성신호의 SNR 추정기법에 관한 연구)

  • Song, Young-Hwan;Park, Hyung-Woo;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.383-391 / 2009
  • In speech signal processing, a speech signal corrupted by noise must be enhanced to improve its quality. Noise estimation methods usually need to adapt to changing environments, so the noise profile is updated during silent regions to avoid the influence of speech. This requires detecting voice regions as a preprocessing step before noise estimation. However, if the received signal contains no silent region, that method cannot be applied. In this paper, we propose an SNR estimation method for continuous speech signals. In stationary regions of voiced speech, the waveform is highly correlated from one pitch period to the next, so the SNR can be estimated from the correlation of adjacent waveforms after dividing a frame into pitch periods. For unvoiced speech, the vocal tract characteristic is reflected in the noise, so the SNR can be estimated from the spectral distance between the spectrum of the received signal and the estimated vocal tract spectrum. Finally, because the energy of a speech signal is concentrated in voiced regions, the SNR can also be estimated from the ratio of voiced-region energy to unvoiced-region energy.
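
The pitch-period correlation idea for voiced speech can be illustrated with a minimal sketch. It assumes that two adjacent pitch periods carry the same speech waveform plus independent noise, in which case their normalized correlation rho approximates S/(S+N) and the SNR is roughly rho/(1 - rho). The function name, the clipping constant, and this specific formula are assumptions made for illustration; they are not the procedure published in the paper.

```python
import math
import random


def snr_from_adjacent_periods(frame, pitch_period, eps=1e-6):
    """Estimate SNR in dB from the correlation of two adjacent pitch periods.

    Assumption (illustrative only): both periods contain the same speech
    waveform plus independent noise, so rho ~= S / (S + N).
    """
    a = frame[:pitch_period]
    b = frame[pitch_period:2 * pitch_period]
    if len(b) < pitch_period:
        raise ValueError("frame must contain at least two pitch periods")

    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if den == 0.0:
        raise ValueError("frame has zero energy")

    # Clip rho into (0, 1) so the logarithm below stays finite.
    rho = min(max(num / den, eps), 1.0 - eps)
    return 10.0 * math.log10(rho / (1.0 - rho))


if __name__ == "__main__":
    period = 80  # hypothetical pitch period in samples
    clean = [math.sin(2 * math.pi * i / period) for i in range(2 * period)]
    random.seed(0)
    noisy = [c + random.gauss(0.0, 0.7) for c in clean]
    print("clean:", snr_from_adjacent_periods(clean, period))
    print("noisy:", snr_from_adjacent_periods(noisy, period))
```

With only two periods the estimate is noisy; averaging rho over several adjacent period pairs would be a natural refinement.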

Low-Delay LSF FEC Technique Robust in Lossy VoIP Environment (VoIP 손실 환경에 강인한 저지연 LSF FEC 기법)

  • Yang, Hae-Yong;Lee, Kyung-Hoon;Hwang, In-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.6 / pp.687-695 / 2002
  • Media-specific FEC techniques, proposed to cope with speech packet loss in VoIP, improve speech quality at the cost of an additional one-frame delay. In this paper, we propose a new media-specific FEC technique, LSF FEC, which improves speech quality with a much shorter additional delay. In the proposed technique, the LSF parameters of the future frame are used to recover a lost packet. To evaluate its performance, we use the ITU-T G.723.1 and G.729 codecs, apply the Gilbert packet loss model, and estimate the MOS at each packet loss rate using the PESQ speech quality estimation algorithm. The proposed technique shortens the delay by 6.5 ms to 27 ms compared with existing media-specific FEC techniques. Simulation results comparing reconstructed speech quality show that the technique improves the MOS by more than 0.1 in a practical lossy environment with a 5% packet loss rate.
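
The Gilbert packet loss model mentioned in this abstract is a two-state Markov chain, and a short sketch shows how a loss pattern with a target average rate can be generated. The transition probabilities `p` (good to bad) and `q` (bad to good) below are illustrative values chosen to give a 5% long-run loss rate; the abstract does not state the parameters the authors used.

```python
import random


def gilbert_loss_pattern(n_packets, p, q, seed=None):
    """Simulate packet loss with a two-state Gilbert model.

    p: probability of moving from the good (received) state to the bad (lost) state.
    q: probability of moving from the bad state back to the good state.
    Returns a list of booleans, True meaning the packet is lost.
    """
    rng = random.Random(seed)
    lost = []
    bad = False  # start in the good state
    for _ in range(n_packets):
        if bad:
            bad = rng.random() >= q  # stay in the bad state with probability 1 - q
        else:
            bad = rng.random() < p   # enter the bad state with probability p
        lost.append(bad)
    return lost


def stationary_loss_rate(p, q):
    """Long-run fraction of lost packets for the chain above."""
    return p / (p + q)


if __name__ == "__main__":
    p, q = 0.02, 0.38  # hypothetical parameters: 5% loss, mean burst length 1/q
    pattern = gilbert_loss_pattern(200_000, p, q, seed=1)
    print("target rate:  ", stationary_loss_rate(p, q))
    print("observed rate:", sum(pattern) / len(pattern))
```

Unlike independent random loss, this model produces bursts of consecutive losses with mean length 1/q, which matters when judging how well an FEC scheme recovers lost frames.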

Derivation of Weights for Customer Requirements Attribute in Kano-QFD Integration Model (Kano-QFD 통합모형에서의 고객 요구속성 중요도 산정)

  • Moon, Kyung-Won;Kim, Nak-Hoon;Jeong, Byung-Ho
    • Journal of Korean Society of Industrial and Systems Engineering / v.37 no.1 / pp.68-78 / 2014
  • Recently, companies have been trying to gain a competitive advantage in the market by responding to the voice of the customer. For this purpose, QFD has been used in many areas as a product development technique that incorporates customer requirements, and the Kano model has been used to understand those requirements effectively. Integrating the Kano model with QFD can therefore reflect customer requirements more efficiently when designing a new service. This paper proposes a PI index that takes into account the current satisfaction positions of the company and its competitors, whereas the IR (Improvement Ratio) value was set uniformly. The study suggests a more accurate index for predicting potential improvements and calculates the final importance or priority. A case study of elevator maintenance companies gives a general idea of how much improvement is needed in the near future and yields an estimate of the final importance of customer requirements.

Consumer Public Complaint Behaviors and Satisfaction of Complaint Handling By Credit Card Services (신용카드서비스에 대한 공적불평행동과 불평처리 만족에 관한 연구)

  • Lee, Youngae
    • Korean Journal of Human Ecology / v.21 no.5 / pp.957-973 / 2012
  • This study analyzed consumer public complaint behaviors and satisfaction with complaint handling among credit card users. Relatively little research has been done in this area, despite the obvious importance of understanding and improving credit card market conditions. The purpose of this study was to examine consumer complaint behaviors with a focus on public actions, such as voice responses and third-party actions, among credit card users. With the goal of giving consumers more positive expectations of credit card companies' complaint handling processes, this study investigated the status of public actions and the negative effect of complaints on overall satisfaction with post-complaint behavior toward credit card services. Responses from 1,000 credit card users were analyzed using descriptive analysis, factor analysis, multi-logit analysis, and Heckman selection estimation. The analysis produced three major results: (1) perceived service quality among credit card users was conceptualized into groups such as responsiveness, innovation, company, additional service, and fee; (2) perceived service quality, age, residential area, employment status, and subjective economic status had significant effects on public complaint behaviors; and (3) unidimensional factors resulting from post-complaint behaviors were analyzed, and several variables, such as period of credit card use, average amount used, and perceived service quality, had significant effects on the degree of satisfaction with complaint handling for credit card services. Several implications and directions for further research are discussed.

An AP Selection Scheme for Enhancement of Multimedia Streaming in Wireless Network Environments (무선 네트워크 환경에서 멀티미디어 서비스를 위한 AP 선정 기법)

  • Ryu, Dong-Woo;Wang, Wei-Bin;Kang, Kyung-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.3 / pp.997-1005 / 2010
  • Recently, there has been growing interest in WLAN technology because of its easy deployment and flexibility. WLAN applications range from standard internet services such as Web access to real-time services with strict latency and throughput requirements, such as multimedia video and voice over IP in wireless network environments. Fair and efficient distribution of traffic loads among APs (Access Points) has become an important issue for improving WLAN utilization. This paper focuses on an AP selection scheme that achieves better load balance and hence increases network resource utilization for each user in wireless network environments. The scheme uses active scan patterns and network delay as the main parameters for load measurement and AP selection. It estimates AP traffic loads by observing the up/down delay and uses the results to maximize link resource efficiency through load balancing. We compared the proposed scheme with the original SNR (Signal to Noise Ratio)-based scheme using NS-2 (Network Simulator 2). We found that the proposed scheme improves throughput by 12.5% and lowers the network uplink and downlink delay by 36.84% and 60.42%, respectively. Overall, the new scheme can significantly increase network throughput and reduce up/down delay while providing excellent quality for voice and video services.

A Study of Enemy Aptitude of Pistol Sound Source for Space Estimation (공간평가를 위한 피스톨음원의 적정성에 관한 연구)

  • Shon, Jang-Ryul;Kim, Jung-Joong
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.15 no.3 s.96 / pp.320-328 / 2005
  • The ultimate goal of architectural acoustics is to convey voice effectively in a space according to its intended use within a building. Methods for indoor architectural acoustic evaluation, such as measuring the reverberation time and estimating definition, have been proposed to quantify how accurately the sound that a source is meant to deliver is transmitted through a space. Among indoor acoustic evaluation methods, this study examines the characteristics of an impulsive sound source (pistol sound source) and of an MLS signal emitted from a nondirectional speaker, and measures and analyzes the reverberation time (RT60) and definition (C80, D50) according to ISO 3382 in each type of space (classroom, hall, gymnasium). The analysis showed that, for the reverberation time (RT60), the difference between the two sound sources stays within the range of measurement error, whereas the values of D50 and C80 vary irregularly, so the pistol sound source was judged to be problematic for these parameters. A deviation analysis of waveforms measured at fixed distances at each point also showed that deviation problems occur for parameters other than the reverberation time. Finally, when the level of the sound source was varied so that the sound pressure level differed by about 10 dB, the acoustic parameters showed no appreciable difference.
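
The D50 and C80 parameters named in this abstract are energy ratios computed from a room impulse response: D50 is the fraction of total energy arriving in the first 50 ms, and C80 is the ratio in dB of the energy in the first 80 ms to the energy after it. The sketch below assumes the impulse response is already trimmed to start at the direct sound; the function name and the synthetic exponential decay in the demo are illustrative and are not data from the paper.

```python
import math


def definition_and_clarity(impulse_response, sample_rate):
    """Return (D50, C80) for an impulse response that starts at the direct sound.

    D50 is a ratio between 0 and 1; C80 is in dB.
    """
    energy = [x * x for x in impulse_response]
    n50 = int(round(0.050 * sample_rate))
    n80 = int(round(0.080 * sample_rate))

    total = sum(energy)
    early50 = sum(energy[:n50])
    early80 = sum(energy[:n80])
    late80 = total - early80
    if total <= 0.0 or late80 <= 0.0:
        raise ValueError("impulse response is too short or has no late energy")

    d50 = early50 / total
    c80 = 10.0 * math.log10(early80 / late80)
    return d50, c80


if __name__ == "__main__":
    fs = 1000    # hypothetical sample rate in Hz
    rt60 = 1.0   # hypothetical reverberation time in seconds
    # Amplitude envelope that falls by 60 dB after rt60 seconds.
    rate = 3.0 * math.log(10.0) / rt60
    h = [math.exp(-rate * n / fs) for n in range(3 * fs)]
    print(definition_and_clarity(h, fs))
```

Because both parameters depend on where the 50 ms and 80 ms boundaries fall relative to the direct sound, they are more sensitive to source repeatability than the decay slope used for RT60.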

Speech Enhancement Based on Modified IMCRA Using Spectral Minima Tracking with Weighted Subband Selection (서브밴드 가중치를 적용한 스펙트럼 최소값 추적을 이용하는 수정된 IMCRA 기반의 음성 향상 기법)

  • Park, Yun-Sik;Park, Gyu-Seok;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.3 / pp.89-97 / 2012
  • In this paper, we propose a novel approach to noise power estimation for speech enhancement in noisy environments. The IMCRA (improved minima controlled recursive averaging) method, which is widely used in speech enhancement, employs a rough VAD (voice activity detection) algorithm that excludes speech components during speech periods in order to improve noise power estimation by reducing the speech distortion caused by conventional algorithms based on the minimum power spectrum of the noisy speech. However, because the VAD algorithm cannot adequately distinguish speech from noise under non-stationary noise and at low SNRs (signal-to-noise ratios), the speech distortion resulting from minimum tracking during speech periods remains. In the proposed method, the minimum power estimate obtained by IMCRA is modified by SMT (spectral minima tracking) to reduce the speech distortion caused by the bias of the estimated minimum power. In addition, to estimate the minimum power effectively by considering the distribution characteristics of the speech and noise spectra, the method combines the minimum estimates provided by IMCRA and SMT according to a subband-based weighting factor. The proposed algorithm is evaluated with subjective and objective quality tests under various environments and yields better results than the conventional method.
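
The minimum-tracking idea that underlies this family of noise estimators can be sketched for a single frequency bin: smooth the noisy power recursively, then take the minimum of the smoothed values over a sliding window so that short speech bursts do not raise the noise estimate. This is a deliberately simplified illustration; it omits the speech presence probability, bias compensation, and subband weighting described in the abstract, and the function name, smoothing factor, and window length are assumed values.

```python
from collections import deque


def track_noise_floor(power, alpha=0.85, window=100):
    """Track the noise floor of one frequency bin by windowed minimum tracking.

    power:  sequence of noisy power values, one per frame.
    alpha:  recursive smoothing factor (assumed value).
    window: number of frames in the minimum search window (assumed value).
    """
    smoothed = None
    recent = deque(maxlen=window)  # oldest smoothed values drop out automatically
    estimates = []
    for p in power:
        # First-order recursive averaging of the noisy power.
        smoothed = p if smoothed is None else alpha * smoothed + (1.0 - alpha) * p
        recent.append(smoothed)
        # The minimum over the window approximates the noise power.
        estimates.append(min(recent))
    return estimates


if __name__ == "__main__":
    # Constant noise power of 1.0 with a 20-frame burst of speech-like energy.
    power = [1.0] * 300
    for i in range(100, 120):
        power[i] = 100.0
    noise = track_noise_floor(power)
    print("largest noise estimate:", max(noise))
```

The window must be longer than the longest expected speech burst; otherwise speech energy leaks into the minimum, which is the kind of bias the abstract sets out to reduce.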