• Title/Abstract/Keywords: Speech rate


A study on the change of prosodic units by speech rate and frequency of turn-taking

  • 원유권
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 14, No. 2 / pp. 29-38 / 2022
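  • This study analyzed utterances in the National Institute of Korean Language's Everyday Conversation Speech Corpus (2020) to determine how speech rate and turn-taking frequency affect changes in prosodic units. As speech rate increased, the frequency of intonational phrases, the frequency of eojeols (word phrases), and utterance length all increased, but these positive correlations were weak, and the regression models had little explanatory power, with a goodness of fit of 3%-11%. Mean speech rate differed significantly by turn-taking frequency, and speech rate decreased as turn-taking frequency increased. Intonational phrase frequency, eojeol frequency, and utterance length also decreased as turn-taking frequency increased, showing a strong negative correlation; the goodness of fit of these regression models was 27%-32%. Turn-taking frequency may therefore have acted as a factor that changes speech rate and prosodic units. This is presumed to reflect the disfluency, turn-taking characteristics, and active interaction between speakers that are typical of conversational speech.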
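The goodness-of-fit percentages in the abstract above are most naturally read as coefficients of determination, although the abstract does not say so explicitly. As a minimal sketch under that assumption, the following Python code computes Pearson's r and, for a regression with a single predictor, R² = r². The data and all names are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared(xs, ys):
    """Goodness of fit of a one-predictor least-squares line (equals r squared)."""
    return pearson_r(xs, ys) ** 2


# Hypothetical values: turn exchanges per minute vs. utterance length in syllables.
turns_per_minute = [2, 4, 6, 8, 10]
utterance_length = [30, 24, 20, 12, 9]

r = pearson_r(turns_per_minute, utterance_length)
print(f"r = {r:.3f}, R^2 = {r_squared(turns_per_minute, utterance_length):.3f}")
```

A negative r with a sizeable R² corresponds to the pattern reported for turn-taking frequency; a positive r with a small R² corresponds to the pattern reported for speech rate.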

A Low Bit Rate Speech Coder Based on the Inflection Point Detection

  • Iem, Byeong-Gwan
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 15, No. 4 / pp. 300-304 / 2015
  • A low bit rate speech coder based on a non-uniform sampling technique is proposed. The technique relies on the detection of inflection points (IPs). A speech block is processed by the IP detector, and the detected IP pattern is compared with the entries of an IP database. The address of the closest database entry is transmitted together with the energy of the speech block. At the receiver, the decoder reconstructs the speech block from the received address and the block's energy information. As a result, the coder has a fixed data rate, unlike existing speech coders based on non-uniform sampling. Computer simulation demonstrates the usefulness of the proposed technique: its SNR is approximately 5.27 dB at a data rate of 1.5 kbps.
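The encoder steps described in the abstract above (detect inflection points, match the pattern against a database, send the address plus the block energy) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the detector, the distance measure, or the database, so the sign-change test on the second difference, the Hamming distance, and every name below are hypothetical.

```python
def inflection_points(block):
    """Indices where the discrete second difference changes sign.

    Zero curvature is skipped, so the reported index is the first sample
    whose curvature has the new sign.
    """
    points = []
    prev_sign = 0
    for i in range(1, len(block) - 1):
        d2 = block[i - 1] - 2 * block[i] + block[i + 1]
        sign = (d2 > 0) - (d2 < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                points.append(i)
            prev_sign = sign
    return points


def ip_pattern(block):
    """Binary mask over the block: 1 at inflection points, 0 elsewhere."""
    marked = set(inflection_points(block))
    return [1 if i in marked else 0 for i in range(len(block))]


def closest_entry(pattern, database):
    """Address of the database pattern with the smallest Hamming distance (assumed metric)."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(range(len(database)), key=lambda k: hamming(pattern, database[k]))


def encode_block(block, database):
    """Return what the coder would transmit: (database address, block energy)."""
    energy = sum(x * x for x in block)
    return closest_entry(ip_pattern(block), database), energy


# Hypothetical block (samples of t**3) and a toy two-entry database.
block = [-27, -8, -1, 0, 1, 8, 27]
database = [[0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0]]
address, energy = encode_block(block, database)
```

Because every block is reduced to one address and one energy value, the output size per block is constant, which is the fixed-data-rate property the abstract emphasizes.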

A Study on the Relation among English Speech Rate, Pitch and Stress by Korean Speakers

  • 김지은
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 6, No. 3 / pp. 101-108 / 2014
  • This study investigates the relation among pitch range differences, speech rate, and the realization of stress. To identify how stress is realized, vowel formants and the durational differences between stressed and unstressed vowels were measured. The Korean learners were asked to read a textbook passage of nine sentences. The major results indicate that: (1) the Korean speakers' pitch range is less than 50% of the native speakers'; (2) there is a significant negative relation between high-low pitch range and speech rate; (3) the vowel qualities and durations of the stressed and unstressed vowels are related to speech rate but not to the high-low pitch range.

Effects of gender, age, and individual speakers on articulation rate in Seoul Korean spontaneous speech

  • Kim, Jungsun
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 10, No. 4 / pp. 19-29 / 2018
  • The present study investigated whether there are differences in articulation rate by gender, age, and individual speakers in a spontaneous speech corpus produced by 40 Seoul Korean speakers. This study measured their articulation rates using a second-per-syllable metric and a syllable-per-second metric. The findings are as follows. First, in spontaneous Seoul Korean speech, there was a gender difference in articulation rates only in age group 10-19, among whom men tended to speak faster than women. Second, individual speakers showed variability in their rates of articulation. The tendency for some speakers to speak faster than others was variable. Finally, there were metric differences in articulation rate. That is, regarding the coefficients of variation, the values of the second-per-syllable metric were much higher than those for the syllable-per-second metric. The articulation rate for the syllable-per-second metric tended to be more distinct among individual speakers. The present results imply that data gathered in a corpus of Seoul Korean spontaneous speech may reflect speaker-specific differences in articulatory movements.
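The two metrics named in the abstract above are reciprocals of each other, yet their coefficients of variation need not agree. A minimal Python sketch of both metrics and of the coefficient of variation follows; the utterance data are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say how variation was computed.

```python
from statistics import mean, stdev


def syllables_per_second(n_syllables, duration_s):
    """Articulation rate as syllables produced per second of speech."""
    return n_syllables / duration_s


def seconds_per_syllable(n_syllables, duration_s):
    """Articulation rate as average time spent on each syllable."""
    return duration_s / n_syllables


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


# Hypothetical utterances: (syllable count, articulation time in seconds, pauses excluded).
utterances = [(12, 2.0), (20, 4.0), (9, 1.5), (30, 7.5)]

syl_per_s = [syllables_per_second(n, d) for n, d in utterances]
s_per_syl = [seconds_per_syllable(n, d) for n, d in utterances]

cv_syl_per_s = coefficient_of_variation(syl_per_s)
cv_s_per_syl = coefficient_of_variation(s_per_syl)
print(f"CV (syllable/s) = {cv_syl_per_s:.3f}, CV (s/syllable) = {cv_s_per_syl:.3f}")
```

Taking the reciprocal changes the shape of the distribution, so the same utterances can yield different relative variability under the two metrics.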

Effects of Concurrent Linguistic or Cognitive Tasks on Speech Rate

  • 한지연;김효정;김문정
    • 대한음성학회: Conference Proceedings / Proceedings of the 2007 Joint Conference of 대한음성학회 and 한국음성과학회 / pp. 102-105 / 2007
  • This study was designed to examine the effects of concurrent linguistic or cognitive tasks on speech rate. Eight normal speakers repeated sentences either with or without a simultaneous linguistic task or cognitive task. The linguistic task consisted of generating verbs from nouns, and the cognitive task consisted of performing mental arithmetic. Speech rate was measured from acoustic data. A one-way ANOVA was conducted to test for differences in speech rate among the three task types. The results showed no significant difference between the sentence repetition and linguistic tasks. However, significant differences were reported for the following pairs: sentence repetition and linguistic task, and linguistic and cognitive task.


Comparison of Speech Rate and Long-Term Average Speech Spectrum between Korean Clear Speech and Conversational Speech

  • Yoo, Jeeun;Oh, Hongyeop;Jeong, Seungyeop;Jin, In-Ki
    • 대한청각학회지 (Journal of Audiology & Otology) / Vol. 23, No. 4 / pp. 187-192 / 2019
  • Background and Objectives: Clear speech is an effective communication strategy used in difficult listening situations that draws on techniques such as accurate articulation, a slow speech rate, and the inclusion of pauses. Although overly slow speech and improperly amplified spectral information can reduce overall speech intelligibility, certain amplitude increments in the mid-frequency bands (1 to 3 dB) and speech rates around 50% slower than those of conversational speech have been reported as factors that improve speech intelligibility. The purpose of this study was to identify whether amplitude increments in the mid-frequency bands and slower speech rates are evident in Korean clear speech as they are in English clear speech. Subjects and Methods: To compare the acoustic characteristics of the two methods of speech production, the voices of 60 participants were recorded during conversational speech and then again during clear speech using standardized sentence material. Results: The speech rate and long-term average speech spectrum (LTASS) were analyzed and compared. Speech rates for clear speech were slower than those for conversational speech. Increased amplitudes in the mid-frequency bands were evident in the LTASS of clear speech. Conclusions: The observed differences in acoustic characteristics between the two types of speech production suggest that Korean clear speech can be an effective communication strategy for improving speech intelligibility.

Adaptive Multi-Rate (AMR) Speech Coding Algorithm

  • 서정욱;배건성
    • 대한전자공학회: Conference Proceedings / Proceedings of the 대한전자공학회 2000 Summer Conference (4) / pp. 92-97 / 2000
  • The AMR (Adaptive Multi-Rate) speech coding algorithm has been adopted as a standard speech codec for IMT-2000. It is based on algebraic CELP and consists of eight speech coding modes with bit rates from 4.75 kbit/s to 12.2 kbit/s. It also includes a VAD (Voice Activity Detector), SCR (Source Controlled Rate) operation, and an error concealment scheme for robustness over a radio channel. The bit rate of AMR changes on a frame basis depending on the channel condition. In this paper, we introduce the AMR speech coding algorithm and present a real-time implementation on the TMS320C6201, a Texas Instruments fixed-point DSP. Starting from the ANSI C source code released by ETSI and 3GPP, we converted and optimized the program to run in real time using the C compiler and assembly language. For the test sequences, the decoded output of the speech codec implemented on the DSP was verified to be identical to the PC simulation result obtained with the ANSI C code. An actual sound input/output test using a microphone and speaker also demonstrated proper real-time operation without distortion or delay.
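To make the frame-based rate switching concrete, the following Python sketch converts each AMR mode's bit rate into the number of speech bits carried by one frame. The abstract above gives only the range (4.75 to 12.2 kbit/s) and the mode count; the individual mode rates and the 20 ms frame length used here come from the AMR narrowband standard rather than from the abstract, and the function name is hypothetical.

```python
FRAME_MS = 20  # AMR narrowband frame length in milliseconds (from the standard, not the abstract)

# The eight AMR narrowband mode bit rates in bit/s.
AMR_MODES_BPS = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]


def bits_per_frame(bit_rate_bps, frame_ms=FRAME_MS):
    """Number of coded speech bits in one frame at the given bit rate."""
    return bit_rate_bps * frame_ms // 1000


for rate in AMR_MODES_BPS:
    print(f"{rate / 1000:5.2f} kbit/s -> {bits_per_frame(rate)} bits per frame")
```

Since the mode can change from one frame to the next, the encoder's output varies between the smallest and largest of these frame sizes as channel conditions change.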


Digit Recognition using Speech and Image Information

  • 이종혁;최재원
    • 한국정보통신학회논문지 (Journal of the Korea Institute of Information and Communication Engineering) / Vol. 6, No. 1 / pp. 83-88 / 2002
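  • Most speech recognition systems take feature parameters extracted from the speech signal as their input. This study proposes a method that lets a speech recognition system use speech and image information simultaneously in order to raise the recognition rate for spoken digits. Experiments comparing recognition from speech information alone with recognition from speech and image information together showed that supplying both increased the recognition rate by about 6%. This indicates that, for spoken digit recognition, using image information together with speech information is more effective than using speech information alone.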

Comparison of HMM models and various cepstral coefficients for Korean whispered speech recognition

  • 박찬응
    • 전자공학회논문지 IE (Journal of the Institute of Electronics Engineers of Korea IE) / Vol. 43, No. 2 / pp. 22-29 / 2006
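  • Because whispering is increasingly used in mobile environments, this paper addresses whispered speech recognition. Feature vectors widely used in speech recognition were tested with hidden Markov models on a normal speech model, a whispered speech model, and a combined normal and whispered speech model, and the results were analyzed to find the most suitable recognition system. The tests showed that recognizing whispered speech with the normal speech model yields a recognition rate too low to be practical, whereas using a separate whispered speech model gave the highest recognition rate, above 85%. The 'normal + whispered' model had a somewhat lower recognition rate, but its feasibility was confirmed. As for feature vectors, MFCC and PLCC gave almost equally high recognition rates with the whispered speech model, whereas with the 'normal + whispered' model, PLCC gave the best results for whispered speech recognition.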

A Half Rate Speech Coder using Trellis Excitation

  • 강상원;이형수;김영수;정진욱
    • 전자공학회논문지B (Journal of the Korean Institute of Telematics and Electronics B) / Vol. 33B, No. 2 / pp. 88-94 / 1996
  • In this paper, we present a half rate speech coder using trellis excitation. The coder combines a code-excited linear prediction (CELP) system with a trellis quantization method that uses codebook expansion, and it produces higher speech quality than a typical CELP coder at the same transmission rate. A subjective comparison with 3- to 8-bit $\mu$-law PCM indicates that the half rate coder provides speech quality between that of 5-bit and 6-bit $\mu$-law PCM.
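For readers unfamiliar with the reference condition in the abstract above, an n-bit $\mu$-law PCM signal can be approximated by compressing each sample with the $\mu$-law curve, quantizing uniformly to 2^n levels, and expanding again. The Python sketch below uses the continuous companding formula with $\mu$ = 255 and a mid-rise quantizer. These are assumptions made for illustration: the abstract does not describe how the reference signals were generated, and this is not the segmented encoder of ITU-T G.711.

```python
import math

MU = 255.0  # assumed companding constant


def compress(x):
    """Mu-law compression of a sample in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def mulaw_pcm(x, bits):
    """Pass one sample through an n-bit mu-law quantizer and return the decoded value."""
    levels = 2 ** bits
    step = 2.0 / levels
    y = compress(x)
    index = min(levels - 1, int((y + 1.0) / step))  # uniform cell index in the compressed domain
    y_hat = -1.0 + (index + 0.5) * step             # reconstruct at the cell midpoint
    return expand(y_hat)


def max_error(bits, grid_size=100):
    """Largest absolute reconstruction error over an evenly spaced grid in [-1, 1]."""
    grid = [i / grid_size for i in range(-grid_size, grid_size + 1)]
    return max(abs(mulaw_pcm(x, bits) - x) for x in grid)


for bits in range(3, 9):
    print(f"{bits}-bit mu-law PCM: max error = {max_error(bits):.4f}")
```

Fewer bits give coarser cells and larger reconstruction error, which is why a ladder of 3- to 8-bit versions can serve as a graded quality scale in a listening comparison.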
