• Title/Abstract/Keywords: Speech acoustics

62 search results (processing time: 0.022 s)

Harmonic Structure Features for Robust Speaker Diarization

  • Zhou, Yu;Suo, Hongbin;Li, Junfeng;Yan, Yonghong
    • ETRI Journal
    • /
    • Vol. 34, No. 4
    • /
    • pp.583-590
    • /
    • 2012
  • In this paper, we present a new approach to speaker diarization. First, we use prosodic information calculated from the original speech to resynthesize new speech data using a spectrum modeling technique. The resynthesized data is modeled with sinusoids based on pitch, vibration amplitude, and phase bias. Then, we extract cepstral features from the resynthesized speech and integrate them with the cepstral features from the original speech for speaker diarization. Finally, we show how the two streams of cepstral features can be combined to improve the robustness of speaker diarization. Experiments carried out on standardized datasets (the US National Institute of Standards and Technology Rich Transcription 04-S multiple distant microphone conditions) show a significant improvement in diarization error rate compared to a system based only on the feature stream from the original speech.
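
The abstract above describes resynthesized speech modeled as sinusoids driven by pitch, amplitude, and phase. As a rough illustration only, the sketch below sums harmonics of a fundamental frequency for one short frame. The function name `synthesize_harmonics`, the constant per-frame parameters, and the example values are assumptions made for this sketch, not the authors' actual model.

```python
import math


def synthesize_harmonics(f0_hz, amplitudes, phases, sample_rate=16000, duration_s=0.02):
    """Generic harmonic model (an assumption, not the paper's exact formulation):
    s[n] = sum_k A_k * cos(2*pi*k*f0*n/fs + phi_k), with k = 1, 2, ...
    """
    n_samples = int(round(sample_rate * duration_s))
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        sample = 0.0
        # k-th harmonic sits at k times the pitch (fundamental) frequency
        for k, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(2.0 * math.pi * k * f0_hz * t + phi)
        signal.append(sample)
    return signal


# Hypothetical frame: 200 Hz pitch, three harmonics, zero phase offsets
frame = synthesize_harmonics(200.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0])
```

In a full system the pitch, amplitudes, and phases would be estimated frame by frame from the original speech, and cepstral features would then be extracted from the synthesized signal.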

교실 음향에 대한 에어컨 소음의 영향 (The Effects of Air Conditioner Noise on Classroom Acoustics)

  • 김수연;전진용
    • 한국소음진동공학회:학술대회논문집
    • /
    • Proceedings of the Korean Society for Noise and Vibration Engineering (한국소음진동공학회) 2005 Spring Conference
    • /
    • pp.176-179
    • /
    • 2005
  • A case study in classroom acoustics was conducted, and the effects of two types of air conditioner (system air conditioner and packaged air conditioner) were investigated. Acoustical measurements were made in two different classrooms. The two classrooms have different acoustics, reflected in the sound quality of the air conditioner noise. A mental concentration test was conducted to evaluate the effects of air conditioner noise at different sound pressure levels (dBA). A speech intelligibility test using Korean phonetically balanced words was also planned.

A Study on the Performance of TDNN-Based Speech Recognizer with Network Parameters

  • Nam, Hojung;Kwon, Y.;Paek, Inchan;Lee, K.S.;Yang, Sung-Il
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 16, No. 2E
    • /
    • pp.32-37
    • /
    • 1997
  • This paper proposes an isolated speech recognition method for Korean digits using a TDNN (Time Delay Neural Network), which is able to recognize time-varying speech properties. We also investigate the effect of two TDNN network parameters: the number of hidden layers and the time delays. The TDNNs in our experiments consist of 2 or 3 hidden layers and have several time delays. The experimental results show that a TDNN structure with 2 hidden layers gives good results for recognizing Korean digits. Misrecognition caused by time delays can be reduced by changing the TDNN structure, and misrecognition unrelated to time delays can be reduced by changing the input patterns.
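
To make the role of the time delays concrete, the sketch below implements a single time-delay layer: each output frame is computed from a sliding window of consecutive input frames, with one weight matrix per delay tap. The function name `tdnn_layer`, the `tanh` activation, and the toy weights are assumptions made for illustration; the paper's network sizes and activation are not given in the abstract.

```python
import math


def tdnn_layer(frames, weights, bias):
    """One time-delay layer.

    frames:  list of T feature vectors, each of length F
    weights: list of D delay taps; each tap is H rows of length F
    bias:    list of H bias values
    Returns T - D + 1 output vectors, each of length H.
    """
    n_delays = len(weights)
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        out = []
        for h in range(len(bias)):
            total = bias[h]
            # Each delay tap d sees the input frame at time t + d
            for d in range(n_delays):
                total += sum(w * x for w, x in zip(weights[d][h], frames[t + d]))
            out.append(math.tanh(total))  # activation choice is an assumption
        outputs.append(out)
    return outputs


# Toy input: 4 frames of 1 feature, 2 delay taps, 1 hidden unit
frames = [[1.0], [2.0], [3.0], [4.0]]
weights = [[[1.0]], [[-1.0]]]  # the unit responds to x[t] - x[t+1]
hidden = tdnn_layer(frames, weights, [0.0])
```

Adding delay taps widens the temporal context each unit sees, and stacking layers widens it further, which is why the number of hidden layers and the delays interact as network parameters.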

교회 건축물의 실내음향 특성에 관한 연구 (A Study on the Room Acoustics in Churches)

  • 주진수
    • 소음진동
    • /
    • Vol. 9, No. 4
    • /
    • pp.681-686
    • /
    • 1999
  • In a church, speech intelligibility is very important, together with reverberance for musical activities. To obtain primary data for the acoustical design of churches, existing records were consulted and churches in Europe and Japan were measured. Based on the measurements, the churches were evaluated in subjective listening tests. The results show that the room acoustics of churches differed by country and that a reverberation time of two seconds was preferred for speech intelligibility. However, allowing for individual differences, longer echoes were preferred for music.

라플라시안 피라미드 프로세싱과 백터 양자화 방법을 이용한 영상 데이타 압축 (Image Data Compression Using Laplacian Pyramid Processing and Vector Quantization)

  • 박광훈;차일환;윤대희
    • 대한전기학회:학술대회논문집
    • /
    • Proceedings of the Korean Institute of Electrical Engineers (대한전기학회) 1987 Electrical and Electronics Engineering Conference (II)
    • /
    • pp.1347-1351
    • /
    • 1987
  • This thesis studies Laplacian pyramid vector quantization, which keeps the compression algorithm simple and remains stable across various kinds of image data. To this end, images are divided into two groups according to their statistical characteristics. At 0.860 bits/pixel and 0.360 bits/pixel respectively, Laplacian pyramid vector quantization is compared with existing spatial-domain vector quantization and transform coding under the same conditions, in both objective and subjective terms. Laplacian pyramid vector quantization is much more stable against the statistical characteristics of images than the existing vector quantization and transform coding.
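
The pyramid decomposition mentioned above can be sketched in one dimension: each level stores the difference between the signal and a prediction expanded from a coarser copy, and the signal is recovered exactly by adding the levels back. This is a simplified illustration under stated assumptions: pair averaging and sample repetition stand in for the Gaussian-like filtering normally used, the function names are hypothetical, and the vector quantization step applied to the detail levels is omitted.

```python
def reduce_level(signal):
    """Halve resolution by averaging adjacent pairs (stand-in for low-pass filtering)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def expand_level(coarse):
    """Double resolution by repeating each sample."""
    return [v for v in coarse for _ in range(2)]


def build_pyramid(signal, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarsest]."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    pyramid = []
    current = list(signal)
    for _ in range(levels):
        coarse = reduce_level(current)
        predicted = expand_level(coarse)
        # Detail level = what the coarse prediction fails to capture
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert build_pyramid by expanding and adding the details back."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = [p + d for p, d in zip(expand_level(current), detail)]
    return current


# Hypothetical image row with one sharp edge
row = [10.0, 12.0, 11.0, 13.0, 40.0, 42.0, 41.0, 39.0]
pyr = build_pyramid(row, 2)
restored = reconstruct(pyr)
```

The detail levels have small values concentrated around zero, which is what makes them attractive targets for quantization at low bit rates.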

TTS를 이용한 매장음원방송에서 고객의 인지도 향상을 위한 음향효과 연구 (A Study on the Sound Effect for Improving Customer's Speech Recognition in the TTS-based Shop Music Broadcasting Service)

  • 강선미;김현득;장문수
    • 말소리와 음성과학
    • /
    • Vol. 1, No. 4
    • /
    • pp.105-109
    • /
    • 2009
  • This thesis describes a method for delivering clear voice announcements using TTS (Text-To-Speech) technology in a shop music broadcasting service. Offering a high-quality TTS sound service for each shop is very expensive. According to a report on architectural acoustics, room acoustic indexes such as reverberation time and early decay time are closely connected with the subjective perception of acoustics. Building on this finding, applying sound effects to TTS-generated speech files should help customers recognize voice announcements better. A listening comprehension test showed improvement in almost all parameters when a reverb effect was applied to the TTS sound.

영어파열음 시구간신호의 음향과 지각 비대칭성 연구 (The Study on Asymmetry between Acoustics and Perception of the Temporal Cues of English Plosives)

  • 강석한
    • 대한음성학회지:말소리
    • /
    • Vol. 55
    • /
    • pp.15-31
    • /
    • 2005
  • This study tests the hypothesis that the voiced-voiceless distinction is influenced by the relationship between acoustics and perception. Production and perception tests are conducted with temporal cues in different environments (CV, VCV, VC). The results showed that acoustic cues indicating a significant difference between voiceless and voiced plosives do not behave the same way in perception. The results also showed an asymmetry between acoustics and perception.

1차 단순 확산체를 적용한 교실음향설계 (The Application of 1-Dimensional Diffusers in Classroom Acoustics)

  • 최영지
    • 교육시설 논문지
    • /
    • Vol. 18, No. 5
    • /
    • pp.3-11
    • /
    • 2011
  • In this study, the effect of applying 1-dimensional diffusers on classroom acoustics was investigated to determine whether the diffusers help achieve the preferred acoustical conditions for speech. A 1/10 scale model of a classroom was used to measure the acoustical parameters T30, $C_{50}$, STI, and SNR in the room. The room acoustical conditions were varied by placing diffusers on either the front or the side walls of the classroom. When the diffusers were placed on the side walls around the student area, a shorter reverberation time at low frequencies was obtained, resulting in more uniform reverberation times across the frequency bands. The $C_{50}$ values at mid and high frequencies increased when the diffusers were placed on either the front or the side wall surfaces. The highest STI and SNR values were obtained when the diffusers were placed on the front wall around the teacher's area. The diffusers were found to be beneficial for increasing speech intelligibility at the rear seats of the room.
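
For readers unfamiliar with $C_{50}$, it is the ratio, in decibels, of the energy arriving in the first 50 ms of a room impulse response to the energy arriving afterward. The sketch below computes it from a sampled impulse response. The function name `clarity_c50` and the toy data are hypothetical, and the code assumes the response starts at the arrival of the direct sound; it is not the measurement procedure used in the study.

```python
import math


def clarity_c50(impulse_response, sample_rate):
    """C50 in dB: 10*log10(early energy / late energy), split at 50 ms.

    Assumes impulse_response[0] coincides with the direct sound arrival.
    """
    split = int(round(0.050 * sample_rate))
    early = sum(h * h for h in impulse_response[:split])
    late = sum(h * h for h in impulse_response[split:])
    if late == 0.0:
        raise ValueError("no energy after 50 ms; C50 is undefined")
    return 10.0 * math.log10(early / late)


# Toy response at 1 kHz sampling: late samples carry one tenth of the early energy
toy_response = [1.0] * 50 + [math.sqrt(0.1)] * 50
c50_db = clarity_c50(toy_response, 1000)
```

Higher values mean more of the energy arrives early, which generally supports speech clarity; in practice the calculation is done per octave band after band-pass filtering the response.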

라이스의 전화기 발명과 통화 음질의 문제 (The Invention of Reis Telephone and Its Problem of Speech Quality)

  • 구자현
    • 한국음향학회지
    • /
    • Vol. 29, No. 6
    • /
    • pp.395-401
    • /
    • 2010
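  • Reis deserves to be called the inventor of the telephone because he succeeded in transmitting the voice over a wire well before Gray, Bell, and others, yet he received no honor for his great invention during his lifetime. Because he belonged to the scientific community, he presented his work as a scientific discovery rather than as a patentable invention. Moreover, following the experimental tradition of European acoustics, he used intermittent current, which left his telephone with a fatal flaw in speech quality. In contrast, Bell, a novice in electricity and acoustics, used variable current to transmit sound signals, which ensured better speech quality than the Reis telephone.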

국내 교육시설의 음향기준 제정의 필요성 제고 (Towards better acoustic conditions in school buildings in Korea-a need for Korean standard for classroom acoustics)

  • 최영지
    • 한국음향학회지
    • /
    • Vol. 42, No. 2
    • /
    • pp.113-123
    • /
    • 2023
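  • This paper describes the acoustic conditions of learning spaces in Korean elementary, middle, and high schools and universities, and argues for establishing acoustic standards for educational facilities in Korea. Acoustic design standards and guidelines for school facilities are introduced to explain the criteria for background noise, reverberation time, and sound insulation required in various learning spaces in schools in several countries. The acoustic conditions of Korean learning spaces are assessed from field measurements, and the results compare acoustic characteristics in unoccupied and fully occupied conditions, background noise levels, and sound insulation performance. Acoustic parameter values for achieving good speech intelligibility in actual university lectures are also presented.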