Search | Korea Science

Performance comparison evaluation of speech enhancement using various loss functions (다양한 손실 함수를 이용한 음성 향상 성능 비교 평가)

Hwang, Seo-Rim;Byun, Joon;Park, Young-Cheol
- The Journal of the Acoustical Society of Korea / v.40 no.2 / pp.176-182 / 2021
This paper evaluates and compares the performance of Deep Neural Network (DNN)-based speech enhancement models trained with various loss functions. We used a complex network that can consider the phase information of speech as the baseline model. We consider two types of basic loss functions, the Mean Squared Error (MSE) and the Scale-Invariant Source-to-Noise Ratio (SI-SNR), and two types of perceptual-based loss functions, the Perceptual Metric for Speech Quality Evaluation (PMSQE) and the Log Mel Spectra (LMS). The comparison was performed through objective evaluation and listening tests on outputs obtained using various combinations of the loss functions. Test results show that combining a perceptual-based loss function with MSE or SI-SNR improved overall performance, and that the perceptual-based loss functions, even when exhibiting lower objective scores, performed better in the listening test.
https://doi.org/10.7776/ASK.2021.40.2.176
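The two basic losses named in this abstract can be sketched in a few lines. This is a minimal illustration using the commonly cited SI-SNR definition (project the estimate onto the target, then compare energies); it is not the authors' implementation, and the function names and the `eps` stabilizer are assumptions made for the example.

```python
import math


def mse(estimate, target):
    """Mean squared error between two equal-length waveforms."""
    return sum((a - b) ** 2 for a, b in zip(estimate, target)) / len(target)


def si_snr_db(estimate, target, eps=1e-8):
    """Scale-invariant source-to-noise ratio in dB (higher is better).

    eps is an assumed small constant that guards against division by zero.
    """
    n = len(target)
    # Remove the mean so a DC offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to get the "target" component.
    scale = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    s_target = [scale * b for b in t]
    # Whatever is left after the projection is treated as noise.
    e_noise = [a - s for a, s in zip(e, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(r * r for r in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)


# Toy signals (assumed for illustration): a sine "speech" plus a small disturbance.
target = [math.sin(0.1 * i) for i in range(200)]
estimate = [s + 0.1 * math.cos(0.37 * i) for i, s in enumerate(target)]
louder = [3.0 * x for x in estimate]

print(si_snr_db(estimate, target), si_snr_db(louder, target))
print(mse(estimate, target), mse(louder, target))
```

Rescaling the estimate leaves SI-SNR essentially unchanged while MSE changes, which is one reason the two losses behave differently in training. When used as a training loss, SI-SNR is typically negated so that minimizing the loss maximizes the ratio.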

Effect Analysis of Timing Offsets for Asynchronous MC-CDMA Uplink Systems (비동기 MC-CDMA 상향 링크 시스템에서의 시간 옵셋 영향 분석)

Ko, Kyun-Byoung;Woo, Choong-Chae
- Journal of the Institute of Electronics Engineers of Korea TC / v.47 no.8 / pp.1-8 / 2010
This paper models a symbol timing offset (STO) with respect to the guard period and the maximum access delay time for asynchronous multicarrier code division multiple access (MC-CDMA) uplink systems over frequency-selective multipath fading channels. Analytical derivation shows that the STO degrades the desired signal power and generates self-interference. The effect of the STO on the average bit error rate (BER) and the effective signal-to-noise ratio (SNR) is evaluated. The approximated BER and the SNR loss caused by the STO are then obtained as closed-form expressions. The tightness between the analytical and simulated results is verified for different STOs and SNRs. Furthermore, the derived analytical results are verified via Monte Carlo simulations.

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

Jung, Jaehee;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.38-44 / 2022
Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared using different loss functions in the time and frequency domains. This study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. The Scale-Invariant Source-to-Noise Ratio (SI-SNR) is used as the time-domain loss function, and the Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. The speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). To confirm the results, the resulting spectrograms are also compared. Experiments on the TIMIT database show the highest performance when using a combination of the SI-SNR and magnitude loss functions.
https://doi.org/10.7776/ASK.2022.41.1.038

Effect of Channel Estimation Error on Capacity of MIMO Systems (MIMO 시스템의 채널 용량에 대한 채널 추정 오차의 영향 분석)

함재상;심세준;이충용;박현철;홍대식
- Journal of the Institute of Electronics Engineers of Korea TC / v.41 no.8 / pp.63-68 / 2004
The capacity of MIMO systems is numerically analyzed in the presence of channel estimation error. The analysis shows that the capacity is influenced by the Mean Square Error (MSE) as well as the average Signal to Noise Ratio (SNR). Furthermore, we present a criterion for selecting a channel estimator suited to a system, so that the channel estimation error stays tolerable for a given average SNR and channel capacity loss. The simulation results show that the tolerable MSEs for a 1 bps/Hz capacity loss are about $10^{-2}$ and $10^{-4}$ at n dB and 40 dB average SNR, respectively.

Auditory Recognition of Digit-in-Noise under Unaided and Aided Conditions in Moderate and Severe Sensorineural Hearing Loss

Aghasoleimani, Mina;Jalilvand, Hamid;Mahdavi, Mohammad Ebrahim;Ahmadi, Roghayeh
- Journal of Audiology & Otology / v.25 no.2 / pp.72-79 / 2021
Background and Objectives: The speech-in-noise test is typically performed using an audiometer. The results of the digit-in-noise recognition (DIN) test may be influenced by the flat frequency response of free-field audiometry and frequency of the hearing aid fit based on fitting rationale. This study aims to investigate the DIN test in unaided and aided conditions. Subjects and Methods: Thirty-four adults with moderate and severe sensorineural hearing loss (SNHL) participated in the study. The signal-to-noise ratio (SNR) for 50% of the DIN test was obtained in the following two conditions: 1) the unaided condition, performed using an audiometer in a free field; and 2) the aided condition, performed using a hearing aid with an unvented individual earmold that was fitted based on NAL-NL2. Results: There was a statistically significant elevation in the mean SNR for the severe SNHL group in both test conditions when compared with that of the moderate SNHL group. In both groups, the SNR for the aided condition was significantly lower than that of the unaided condition. Conclusions: Speech recognition in hearing-impaired patients can be realized by fitting hearing aids based on evidence-based fitting rationale rather than by measuring it using free-field audiometry measurement that is utilized in a routine clinic setup.
https://doi.org/10.7874/jao.2020.00094

A Study on Variation and Determination of Gaussian function Using SNR Criteria Function for Robust Speech Recognition (잡음에 강한 음성 인식에서 SNR 기준 함수를 사용한 가우시안 함수 변형 및 결정에 관한 연구)

전선도;강철호
- The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.112-117 / 1999
When spectral subtraction is used in a noise-robust speech recognition system, it often causes loss of the speech signal. In this study, we propose a method in which the Gaussian functions of a semi-continuous HMM (Hidden Markov Model) are varied and determined on the basis of an SNR criteria function, where SNR denotes the signal-to-noise ratio between the estimated noise and the subtracted signal per frame. To demonstrate the effectiveness of this method, we show through signal waveforms that the estimation error is related to the magnitude of the estimated noise. For this reason, the Gaussian function is varied and determined by SNR. In computer simulations of the recognition rate under the noise environment of a car driving faster than 80 km/h, the proposed SNR-based Gaussian decision method achieves a higher recognition rate than both the frequency-subtracted and non-subtracted cases.

On Design and Performance Analysis of Asymmetric 2PAM: 5G Network NOMA Perspective (비대칭 2PAM의 설계와 성능 분석: 5G 네트워크의 비직교 다중 접속 관점에서)

Chung, Kyuhyuk
- Journal of Convergence for Information Technology / v.10 no.10 / pp.24-31 / 2020
In non-orthogonal multiple access (NOMA), the degraded performance of the user with the weaker channel gain is a problem. In this paper, we propose asymmetric binary pulse amplitude modulation (2PAM) to improve the bit-error rate (BER) performance of the weaker channel user in NOMA at a tolerable BER loss for the stronger channel user. First, we design the asymmetric 2PAM, calculate the total allocated power, and derive the closed-form expression for the BER of the proposed scheme. Then it is shown that the BER of the weaker channel user improves, with a small BER loss for the stronger channel user. The superiority of the proposed scheme is also validated by demonstrating that the signal-to-noise ratio (SNR) gain of the weaker channel user is about 10 dB, with an SNR loss of 3 dB for the stronger channel user. As a result, the asymmetric 2PAM could be considered for NOMA in 5G systems. As a direction for future research, it would be meaningful to analyze the achievable data rate of the proposed scheme.
https://doi.org/10.22156/CS4SMB.2020.10.10.024

Depending on PACS Operating System Differences Analysis of Usefulness of Lossless Compression Method in Medical Image Upload: SNR, CNR, Histogram Comparative Analysis (PACS운영 시스템 차이에 따른 의료 영상 업로드 시 무손실 압축 방식의 유용성 분석: SNR, CNR, Histogram 비교 분석을 중심으로)

Choi, Ji-An;Hwang, Jun-Ho;Lee, Kyung-Bae
- The Journal of the Korea Contents Association / v.18 no.3 / pp.299-308 / 2018
This study focused on the fact that medical images issued at different hospitals may differ in image quality on PACS when different software is used. An image from university hospital A was copied to a DICOM file and registered on the PACS of university hospital B. The capacity and image quality of the software used in the university hospital were evaluated by SNR, CNR and histogram. As the compression ratio increased, SNR and CNR tended to decrease. Lossless compression halved the data size compared to no compression, but SNR and CNR did not change. In the histogram analysis, the information loss due to the underflow phenomenon was conspicuous. When moving images to another hospital, either no compression or a lossless compression method should be used. In conclusion, the lossless compression method is useful when waiting time and economic efficiency in uploading are considered.
https://doi.org/10.5392/JKCA.2018.18.03.299
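The observation that lossless compression shrinks the data without changing SNR or CNR follows from the pixel values being restored bit for bit. The sketch below illustrates this under stated assumptions: it uses one common region-of-interest definition of SNR and CNR (mean signal over background standard deviation), which may differ from the definition used in the paper, and it uses `zlib` from the Python standard library as a stand-in for the lossless codec a PACS would apply. The function names and pixel values are hypothetical.

```python
import statistics
import zlib


def snr(signal_roi, background_roi):
    """Assumed definition: mean of the signal ROI over background std. dev."""
    return statistics.fmean(signal_roi) / statistics.pstdev(background_roi)


def cnr(roi_a, roi_b, background_roi):
    """Assumed definition: absolute mean difference over background std. dev."""
    diff = abs(statistics.fmean(roi_a) - statistics.fmean(roi_b))
    return diff / statistics.pstdev(background_roi)


def lossless_roundtrip(pixels):
    """Compress and decompress 8-bit pixel values with zlib (a stand-in codec)."""
    raw = bytes(pixels)
    return list(zlib.decompress(zlib.compress(raw)))


# Hypothetical 8-bit ROI samples, chosen only for illustration.
tissue = [100, 102, 98, 101, 99]
lesion = [150, 152, 148, 151, 149]
background = [10, 12, 8, 11, 9]

before = (snr(tissue, background), cnr(tissue, lesion, background))
after = (
    snr(lossless_roundtrip(tissue), lossless_roundtrip(background)),
    cnr(lossless_roundtrip(tissue), lossless_roundtrip(lesion),
        lossless_roundtrip(background)),
)
print(before == after)
```

A lossy codec would return altered pixel values from the round trip, so the same comparison would generally show a change in both metrics.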

BS-PLC(Both Side-Packet Loss Concealment) for CELP Coder (CELP 부호화기를 위한 양방향 패킷 손실 은닉 알고리즘)

Lee In-Sung;Hwang Jeong-Joon;Jeong Gyu-Hyeok
- Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.12 / pp.127-134 / 2005
Robustness to lost packets is one of the most important quality measures for voice over IP (VoIP) networks. Recovering a lost packet from the received information is crucial to achieving this robustness. This paper therefore proposes a method for recovering lost packets from the received information in real-time communication with a CELP coder. The proposed BS-PLC (Both Side Packet Loss Concealment), based on WSOLA (Waveform Similarity Overlap-Add), allows the lost packet to be recovered from both the 'previous' and 'next' good packets, with the LP parameters and the excitation signal recovered separately. Bursts of packet loss are modeled by the Gilbert model. The proposed scheme is applied to G.729, the coder most widely used in VoIP, and is evaluated through SNR (signal-to-noise ratio) and MOS (Mean Opinion Score) tests. In simulation, the proposed scheme scores 0.3 higher in MOS and 2 dB higher in SNR than the error concealment procedure in the G.729 decoder at a 20% average packet loss rate.
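The Gilbert model mentioned here is a two-state Markov chain that produces bursty rather than independent losses. The sketch below is a minimal simulation of the simple form of that model (every packet sent in the "bad" state is lost); the transition probabilities are hypothetical values picked so that the long-run loss rate is 20%, matching the rate quoted in the abstract, and are not taken from the paper.

```python
import random


def gilbert_loss_pattern(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where a packet is lost.

    Simple Gilbert model: two states (good, bad); packets sent while the
    channel is in the bad state are lost.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    bad = False
    lost = []
    for _ in range(n_packets):
        # Advance the Markov chain by one step, then emit a packet.
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        elif rng.random() < p_good_to_bad:
            bad = True
        lost.append(bad)
    return lost


def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of packets lost: p / (p + q)."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)


def mean_burst_length(p_bad_to_good):
    """Expected number of consecutive losses: 1 / q."""
    return 1.0 / p_bad_to_good


# Hypothetical parameters: p = 0.05, q = 0.2 gives a 20% average loss rate.
pattern = gilbert_loss_pattern(200_000, 0.05, 0.2, seed=1)
print(sum(pattern) / len(pattern), stationary_loss_rate(0.05, 0.2))
print(mean_burst_length(0.2))
```

Lowering `p_bad_to_good` while scaling `p_good_to_bad` proportionally keeps the average loss rate fixed but lengthens the bursts, which is the case where concealment from both neighboring packets is most stressed.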

Search Result 123, Processing Time 0.026 seconds
