• Title/Summary/Keyword: 음질평가 (sound quality evaluation)

Laryngeal height and voice characteristics in children with autism spectrum disorders (자폐스펙트럼장애 아동의 후두 높이 및 음성 특성)

  • Lee, Jung-Hun;Kim, Go-Woon;Kim, Seong-Tae
    • Phonetics and Speech Sciences / v.13 no.2 / pp.91-101 / 2021
  • The purpose of this study was to investigate laryngeal characteristics in children with autism spectrum disorders (ASD). A total of 50 children participated: eight children aged 2 to 4 years diagnosed with ASD and 42 normal controls of the same age. X-ray images of the midsagittal plane of the cervical spine and larynx were taken for all children, and laryngeal positions were compared between the ASD and control groups. In addition, sustained vowel samples were collected from the children and analyzed for acoustic parameters. X-rays showed that the height of the hyoid bone in the normal group was lowest at 3 years of age and ascended at 4 years of age. Nevertheless, the distance from the external acoustic meatus to the hyoid bone was longest at age 4. Four-year-olds with explosive language development showed laryngeal height elevation and anteriorization. In contrast, the hyoid height in the ASD group was lower than in the control group at all ages, and hyoid position did not differ between ages. In the acoustic evaluation, PFR, vFo, and vAm were significantly higher in the ASD group than in the control group. The low laryngeal height of children with ASD may be associated with delayed language development. PFR, vFo, and vAm appear to be voice markers that distinguish normal children from children with ASD.

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.38-44 / 2022
  • Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared across different loss functions in the time and frequency domains. The study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. Scale-Invariant Source-to-Noise Ratio (SI-SNR) is used as the time-domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. Speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). To confirm the enhancement results, the resulting spectrograms are also compared. Experiments on the TIMIT database show the highest performance when a combination of the SI-SNR and magnitude loss functions is used.
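
The loss combination described in this abstract can be illustrated with a minimal NumPy sketch that adds a negative SI-SNR term in the time domain to an MSE term over the magnitude spectrum. This is not the authors' implementation: the function names, the frame length, the hop size, the Hann window, and the weight `alpha` are all assumptions made for illustration, and the complex-spectrum and phase terms mentioned in the abstract are omitted because their exact form is not given here.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals of equal length."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to isolate the target component.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)

def stft_magnitude(x, frame_len=512, hop=128):
    """Magnitude spectrogram from Hann-windowed frames (assumed settings)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def combined_loss(estimate, target, alpha=0.5):
    """Negative SI-SNR plus alpha times the magnitude-spectrum MSE.

    alpha is a hypothetical weighting, not a value taken from the paper.
    """
    time_loss = -si_snr(estimate, target)
    mag_loss = np.mean((stft_magnitude(estimate) - stft_magnitude(target)) ** 2)
    return time_loss + alpha * mag_loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2 * np.pi * 440.0 * t)
    noisy = clean + 0.5 * rng.standard_normal(clean.shape)
    print("SI-SNR (dB):", si_snr(noisy, clean))
    print("combined loss:", combined_loss(noisy, clean))
```

In a training setup the same two terms would be written with a framework that supports automatic differentiation; the NumPy version only shows how the terms are computed and summed.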

Speech Reinforcement Based on G.729A Speech Codec Parameter Under Near-End Background Noise Environments (근단 배경 잡음 환경에서 G.729A 음성부호화기 파라미터에 기반한 새로운 음성 강화 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.392-400 / 2009
  • In this paper, we propose an effective speech reinforcement technique based on the ITU-T G.729A CS-ACELP codec for near-end background noise environments. Because the intelligibility of far-end speech for the near-end listener is significantly reduced under near-end noise, a far-end speech reinforcement approach is required to avoid this phenomenon. In contrast to conventional speech reinforcement algorithms, we reinforce the excitation signal among the codec parameters received from the far-end speech signal, based on the G.729A speech codec, under various background noise environments. Specifically, we first estimate the excitation signal of the ambient noise at the near end through the encoder of the G.729A speech codec, and then reinforce the excitation signal of the speech transmitted from the far end. We specifically propose a novel approach that directly reinforces the excitation signal of the far-end speech within the G.729A decoder. The performance of the proposed algorithm is evaluated under various noise environments with the Comparison Category Rating (CCR) test defined in ITU-T P.800, the recommendation on methods for subjective determination of transmission quality, and it outperforms conventional SNR recovery methods.
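
As a rough illustration of how CCR results are usually summarized, the sketch below averages listener ratings on the seven-point CCR scale (from -3, "Much Worse", to +3, "Much Better") into a comparison mean opinion score (CMOS). The function name and the ratings in the usage example are hypothetical and are not data from the paper. The sign flip assumes the common setup in which presentation order is randomized and each rating describes the second sample relative to the first.

```python
from statistics import mean

# Seven-point CCR scale: the second sample is rated relative to the first.
CCR_SCALE = {
    3: "Much Better",
    2: "Better",
    1: "Slightly Better",
    0: "About the Same",
    -1: "Slightly Worse",
    -2: "Worse",
    -3: "Much Worse",
}

def comparison_mos(ratings, processed_second):
    """Average CCR ratings so that positive values favor the processed sample.

    ratings: listener scores in -3..3 for "second sample versus first".
    processed_second: True where the processed sample was played second.
    """
    adjusted = []
    for rating, second in zip(ratings, processed_second):
        if rating not in CCR_SCALE:
            raise ValueError(f"rating outside CCR scale: {rating}")
        # Flip the sign when the processed sample was played first.
        adjusted.append(rating if second else -rating)
    return mean(adjusted)

if __name__ == "__main__":
    # Hypothetical ratings, for illustration only.
    print(comparison_mos([2, -1, 0, -3], [True, False, True, False]))
```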