• Title/Summary/Keyword: 스펙트럼 음향 매개변수 (spectrum, acoustic, parameter)

Search results: 9

Acoustic Analysis of King Songdok Bell Using Parameter Estimation of Transient Signals (과도기형태 신호의 매개변수 추정기법을 이용한 성덕대왕 신종의 음향분석)

  • 김영수;진용옥
    • The Journal of the Acoustical Society of Korea / v.17 no.7 / pp.91-100 / 1998
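  • This paper proposes a signal modeling technique for efficiently estimating the parameters of exponentially damped signals. The method was developed to analyze the sound waves (airborne and ground-borne) of the King Songdok Bell, which are transient signals containing beat-frequency components, and it is based on a linear prediction model. Instead of the usual data matrix, the proposed method uses an autocorrelation-like matrix and estimates the parameters with the SVD. To analyze the spectrum and damping-coefficient characteristics of the bell, the proposed method was applied to the collected data, and the analysis results were used to identify the damping coefficients of the natural-frequency signals and the role of the umtong (움통, the pit beneath the bell).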

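The abstract above combines linear prediction with an SVD to recover the frequencies and damping coefficients of decaying, beating partials. The paper's autocorrelation-like matrix is not specified here, so the following is only a minimal sketch that swaps in the closely related matrix pencil method on a plain Hankel data matrix. The function name `estimate_damped_modes`, the default pencil length, and the synthetic test signal (60 Hz and 70 Hz partials) are assumptions made for illustration, not values from the paper.

```python
import numpy as np

def estimate_damped_modes(x, n_poles, fs, pencil=None):
    """Estimate frequencies (Hz) and damping factors (1/s) of damped sinusoids.

    Matrix pencil sketch (NOT the paper's autocorrelation-like matrix method).
    n_poles counts complex poles, so a real signal with K partials needs 2*K.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    L = pencil if pencil is not None else n // 3  # assumed default pencil length
    # Hankel data matrix: row i holds x[i], ..., x[i + L].
    Y = np.array([x[i:i + L + 1] for i in range(n - L)])
    # The dominant right singular vectors span the signal subspace.
    _, _, vh = np.linalg.svd(Y, full_matrices=False)
    W = vh[:n_poles].T
    # Shift invariance of the subspace yields the signal poles as eigenvalues.
    z = np.linalg.eigvals(np.linalg.pinv(W[:-1]) @ W[1:])
    freq = np.angle(z) * fs / (2.0 * np.pi)
    damping = -np.log(np.abs(z)) * fs
    keep = freq > 0  # keep one pole of each conjugate pair
    order = np.argsort(freq[keep])
    return freq[keep][order], damping[keep][order]

# Hypothetical noiseless test signal: two decaying partials that beat.
fs = 1000.0
t = np.arange(400) / fs
x = (np.exp(-2.0 * t) * np.cos(2 * np.pi * 60.0 * t)
     + 0.5 * np.exp(-5.0 * t) * np.cos(2 * np.pi * 70.0 * t + 0.3))
freqs, dampings = estimate_damped_modes(x, n_poles=4, fs=fs)
print(freqs, dampings)
```

With noisy recordings, the number of retained singular vectors acts as the model order, and choosing it usually requires inspecting the singular value spectrum.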

Comparison of Speaker's Source Characteristics in Different Vowel Characteristics (모음에 따른 화자의 음원특성 비교)

  • 이후동;강선미;장문수;박한상
    • Proceedings of the KSLP Conference / 2003.11a / pp.240-240 / 2003
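  • Unlike conventional parameters, this paper looks to phonation type for speaker-recognition parameters that reveal a speaker's inherent characteristics. In general, a speaker's voice-source characteristics determine the phonation type. Parameters that characterize phonation type include the open quotient and the spectral tilt, and the spectral tilt can be measured acoustically. However, existing measurement methods fail to exclude, wholly or in part, the effects of the fundamental frequency, which differs from person to person, and of the vowel. (abridged)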


An Improved Parametric Estimation Method of High-Resolution Bispectrum (고해상도의 바이스펙트럼을 추정하기 위한 개선된 매개변수 방법)

  • Park, So-Hyeon;An, Chong-Koo
    • The Journal of the Acoustical Society of Korea / v.14 no.2E / pp.19-24 / 1995
  • The maximum entropy method is a well-known high-resolution parametric method for estimating the power spectrum of short-time signals. Although a parametric estimation method for the bispectrum was proposed in recent years, it is not easy to estimate the bispectrum with high resolution for relatively short-time signals whose total length is about 1000 data points. In this paper, a bispectrum estimation method is proposed that achieves high resolution even for such relatively short-time signals.


Classification of nasal places of articulation based on the spectra of adjacent vowels (모음 스펙트럼에 기반한 전후 비자음 조음위치 판별)

  • Jihyeon Yun;Cheoljae Seong
    • Phonetics and Speech Sciences / v.15 no.1 / pp.25-34 / 2023
  • This study examined the utility of the acoustic features of vowels as cues for the place of articulation of Korean nasal consonants. In the acoustic analysis, spectral and temporal parameters were measured at the 25%, 50%, and 75% time points in the vowels neighboring nasal consonants in samples extracted from a spontaneous Korean speech corpus. Using these measurements, linear discriminant analyses were performed and classification accuracies for the nasal place of articulation were estimated. The analyses were applied separately for vowels following and preceding a nasal consonant to compare the effects of progressive and regressive coarticulation in terms of place of articulation. The classification accuracies ranged between approximately 50% and 60%, implying that acoustic measurements of vowel intervals alone are not sufficient to predict or classify the place of articulation of adjacent nasal consonants. However, given that these results were obtained for measurements at the temporal midpoint of vowels, where they are expected to be the least influenced by coarticulation, the present results also suggest the potential of utilizing acoustic measurements of vowels to improve the recognition accuracy of nasal place. Moreover, the classification accuracy for nasal place was higher for vowels preceding the nasal sounds, suggesting the possibility of higher anticipatory coarticulation reflecting the nasal place.

Impact of face masks on spectral and cepstral measures of speech: A case study of two Korean voice actors (한국어 스펙트럼과 캡스트럼 측정시 안면마스크의 영향: 남녀 성우 2인 사례 연구)

  • Wonyoung Yang;Miji Kwon
    • The Journal of the Acoustical Society of Korea / v.43 no.4 / pp.422-435 / 2024
  • This study aimed to verify the effects of face masks on the Korean language in terms of acoustic, aerodynamic, and formant parameters. We chose all types of face masks available in Korea based on filter performance and folding type. Two professional voice actors (one male and one female), native Koreans who speak standard Korean and have more than 20 years of experience, served as speakers for the voice data. Face masks attenuated the high-frequency range, resulting in decreased Vowel Space Area (VSA) and Vowel Articulation Index (VAI) scores and an increased Low-to-High spectral ratio (L/H ratio) in all voice samples. This can result in lower speech intelligibility. However, the degree of increase or decrease depended on the voice characteristics. For the female speaker, the Speech Level (SL) and Cepstral Peak Prominence (CPP) increased with increasing face mask thickness. In this study, the presence or filter performance of a face mask was found to affect speech acoustic parameters according to the speech characteristics. Face masks provoked vocal effort when the vocal intensity was not sufficiently strong or the environment was less reverberant. Further research is needed on the vocal effort induced by face masks to overcome acoustic modifications when wearing masks.

On the Classification of Voice Sound and the Recognition of Vowels for Korean Continuous Speech (한국어 연속음인식에 관한 연구(유성음 분류 및 단모음 인식 ))

  • 하판봉;이철희;방승찬;안수길
    • The Journal of the Acoustical Society of Korea / v.5 no.3 / pp.28-35 / 1986
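  • This paper describes an algorithm that classifies the voiced sounds of Korean speech into vowels, nasals, and voiced consonants. Speech is first divided into voiced and unvoiced sounds by an existing PITCH detection algorithm. Then, using only the normalized first-order correlation coefficient, the zero-crossing rate, and valley detection in the LOG energy and LPG energy, voiced sounds are classified into vowels, nasals, and voiced consonants, and unvoiced sounds into true unvoiced sounds and silence. Monophthong recognition was then performed on the vowels classified in this way. Because each vowel is represented by a single FRAME, memory size and recognition time were reduced. Newly defined UP & DOWN and modified zero-crossing rate measures were applied with satisfactory results. LPC parameters and the power spectrum were also used as FEATUREs for monophthong recognition, and the performance of each FEATURE was compared. Combining these FEATUREs appropriately in a two-stage recognition scheme yielded a high recognition rate of 92%.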

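The frame-level measures named in the abstract above (log energy, zero-crossing rate, normalized first-order correlation coefficient) are simple to compute. The sketch below shows one common way to define them per frame; the function name `frame_features`, the frame length, the hop size, and the test signals are assumptions for illustration. The paper's own definitions, including its modified zero-crossing rate and UP & DOWN measure, are not given here and are not reproduced.

```python
import numpy as np

def frame_features(signal, frame_len, hop):
    """Return one row per frame: (log energy in dB, zero-crossing rate, r1)."""
    x = np.asarray(signal, dtype=float)
    eps = 1e-12  # avoids log(0) and division by zero on silent frames
    rows = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        energy = np.sum(frame ** 2)
        log_energy = 10.0 * np.log10(energy + eps)
        # Fraction of adjacent sample pairs whose signs differ.
        signs = np.where(frame >= 0, 1, -1)
        zcr = np.mean(signs[1:] != signs[:-1])
        # Lag-1 autocorrelation normalized by the frame energy.
        r1 = np.sum(frame[:-1] * frame[1:]) / (energy + eps)
        rows.append((log_energy, zcr, r1))
    return np.array(rows)

# Hypothetical inputs: a 100 Hz tone (voiced-like) and a sign-alternating
# sequence (an extreme, noise-like case) sampled at 8 kHz.
fs = 8000
tone = np.sin(2 * np.pi * 100 * np.arange(800) / fs)
alternating = np.tile([1.0, -1.0], 200)
print(frame_features(tone, 400, 400))
print(frame_features(alternating, 400, 400))
```

Voiced frames tend to show a low zero-crossing rate and a correlation coefficient near 1, while fricative-like or noisy frames show the opposite, which is why such cheap measures are useful for a first-pass classification.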

Laryngeal height and voice characteristics in children with autism spectrum disorders (자폐스펙트럼장애 아동의 후두 높이 및 음성 특성)

  • Lee, Jung-Hun;Kim, Go-Woon;Kim, Seong-Tae
    • Phonetics and Speech Sciences / v.13 no.2 / pp.91-101 / 2021
  • The purpose of this study was to investigate laryngeal characteristics in children with autism spectrum disorders (ASD). A total of 50 children participated, including eight children aged 2 to 4 years diagnosed with ASD and 42 normal controls of the same age. X-ray images of the midsagittal plane of the cervical spine and larynx were recorded for all children, and the laryngeal positions of the ASD and control groups were compared. In addition, vowel prolongation samples were collected from the children and analyzed for acoustic parameters. X-rays showed that the height of the hyoid bone in the normal group was lowest at 3 years of age and ascended at 4 years of age. Nevertheless, the distance from the external acoustic meatus to the hyoid bone was longest at age 4. Four-year-olds with explosive language development showed laryngeal height elevation and anteriorization. In contrast, the hyoid height of the ASD group was lower than that of the control group at all ages, and the hyoid position did not differ between ages. In the acoustic evaluation, PFR, vFo, and vAm were significantly higher in the ASD group than in the control group. The low laryngeal height of children with ASD may be associated with delayed language development. PFR, vFo, and vAm seem to be voice markers that distinguish normal children from children with ASD.

Speech Enhancement Based on Minima Controlled Recursive Averaging Technique Incorporating Conditional MAP (조건 사후 최대 확률 기반 최소값 제어 재귀평균기법을 이용한 음성향상)

  • Kum, Jong-Mo;Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.5 / pp.256-261 / 2008
  • In this paper, we propose a novel approach, based on the conditional maximum a posteriori criterion, to improve the performance of minima controlled recursive averaging (MCRA). A crucial component of a practical speech enhancement system is the estimation of the noise power spectrum. One state-of-the-art approach is the MCRA technique. The noise estimate in the MCRA technique is obtained by averaging past spectral power values based on a smoothing parameter that is adjusted by the signal presence probability in frequency subbands. We improve the MCRA using the speech presence probability, which is the a posteriori probability conditioned on both the current observation and the speech presence or absence of the previous frame. With the performance criteria of the ITU-T P.862 perceptual evaluation of speech quality (PESQ) and subjective evaluation of speech quality, we show that the proposed algorithm yields better results compared to the conventional MCRA-based scheme.
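The recursive averaging step described in the abstract above can be written in a few lines. The sketch below shows only the conventional presence-controlled update, in which the smoothing factor moves toward 1 as the speech presence probability rises so that speech frames do not leak into the noise estimate. The function name `update_noise_psd` and the base smoothing constant `alpha=0.95` are assumptions; the minima tracking and the conditional MAP probability proposed in the paper are not shown, and the probability is simply passed in.

```python
import numpy as np

def update_noise_psd(noise_psd, noisy_power, presence_prob, alpha=0.95):
    """One frame of presence-controlled recursive averaging, per frequency bin.

    noise_psd:     previous noise power estimate
    noisy_power:   |Y|^2 of the current noisy frame
    presence_prob: speech presence probability in [0, 1] (computed elsewhere)
    alpha:         base smoothing constant (assumed value)
    """
    p = np.clip(presence_prob, 0.0, 1.0)
    # Effective smoothing factor: alpha when speech is absent, 1 when present.
    alpha_eff = alpha + (1.0 - alpha) * p
    return alpha_eff * noise_psd + (1.0 - alpha_eff) * noisy_power

# Three hypothetical bins: speech absent, uncertain, and present.
previous = np.array([1.0, 1.0, 1.0])
current = np.array([2.0, 2.0, 2.0])
prob = np.array([0.0, 0.5, 1.0])
print(update_noise_psd(previous, current, prob))
```

The estimate tracks the noisy power fastest in the bin where speech is judged absent and is frozen in the bin where speech is judged present.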

Modified AWSSDR method for frequency-dependent reverberation time estimation (주파수 대역별 잔향시간 추정을 위한 변형된 AWSSDR 방식)

  • Min Sik Kim;Hyung Soon Kim
    • Phonetics and Speech Sciences / v.15 no.4 / pp.91-100 / 2023
  • Reverberation time (T60) is a typical acoustic parameter that provides information about reverberation. Since the impacts of reverberation vary depending on the frequency bands even in the same space, frequency-dependent (FD) T60, which offers detailed insights into the acoustic environments, can be useful. However, most conventional blind T60 estimation methods, which estimate the T60 from speech signals, focus on fullband T60 estimation, and a few blind FDT60 estimation methods commonly show poor performance in the low-frequency bands. This paper introduces a modified approach based on Attentive pooling based Weighted Sum of Spectral Decay Rates (AWSSDR), previously proposed for blind T60 estimation, by extending its target from fullband T60 to FDT60. The experimental results show that the proposed method outperforms conventional blind FDT60 estimation methods on the acoustic characterization of environments (ACE) challenge evaluation dataset. Notably, it consistently exhibits excellent estimation performance in all frequency bands. This demonstrates that the mechanism of the AWSSDR method is valuable for blind FDT60 estimation because it reflects the FD variations in the impact of reverberation, aggregating information about FDT60 from the speech signal by processing the spectral decay rates associated with the physical properties of reverberation in each frequency band.
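For readers unfamiliar with T60, it is the time the sound energy takes to decay by 60 dB, so a decay rate in dB per second converts directly into a reverberation time. The sketch below is not the blind AWSSDR method from the abstract, which works from speech alone. It is a non-blind illustration that fits the decay slope of a known impulse response using Schroeder backward integration. The function name `t60_from_impulse_response`, the -5 dB to -25 dB fit range, and the idealized exponential impulse response are assumptions.

```python
import numpy as np

def t60_from_impulse_response(h, fs, fit_db=(-5.0, -25.0)):
    """Estimate T60 (s) from an impulse response via the energy decay curve."""
    h = np.asarray(h, dtype=float)
    # Schroeder backward integration: energy remaining after each sample.
    edc = np.cumsum(h[::-1] ** 2)[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(len(h)) / fs
    # Fit a line to the decay between the two levels (assumed fit range).
    mask = (edc_db <= fit_db[0]) & (edc_db >= fit_db[1])
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # slope in dB/s
    return -60.0 / slope

# Idealized impulse response whose energy falls 60 dB in 0.5 s (assumed).
fs = 8000
t = np.arange(fs) / fs
h = 10.0 ** (-3.0 * t / 0.5)
print(t60_from_impulse_response(h, fs))
```

A frequency-dependent T60 would apply the same fit after band-pass filtering the response into the bands of interest, which this sketch leaves out.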