• Title/Summary/Keyword: Speaker characteristics


Voice transformation for HTS using correlation between fundamental frequency and vocal tract length (기본주파수와 성도길이의 상관관계를 이용한 HTS 음성합성기에서의 목소리 변환)

  • Yoo, Hyogeun;Kim, Younggwan;Suh, Youngjoo;Kim, Hoirin
    • Phonetics and Speech Sciences
    • /
    • v.9 no.1
    • /
    • pp.41-47
    • /
    • 2017
  • The main advantage of statistical parametric speech synthesis is its flexibility in changing voice characteristics. A personalized text-to-speech (TTS) system can be implemented by combining a speech synthesis system with a voice transformation system, and such systems are widely used in many application areas. It is known that the fundamental frequency and the spectral envelope of a speech signal can be modified independently to convert voice characteristics. It is also important to maintain the naturalness of the transformed speech. In this paper, a hidden Markov model (HMM)-based speech synthesis system (HTS) using the STRAIGHT vocoder is constructed, and voice transformation is conducted by modifying the fundamental frequency and the spectral envelope. The fundamental frequency is transformed by scaling, and the spectral envelope is transformed by frequency warping to control the speaker's vocal tract length. In particular, this study proposes a voice transformation method that uses the correlation between fundamental frequency and vocal tract length. Subjective evaluations were conducted to assess preference and mean opinion scores (MOS) for the naturalness of the synthetic speech. Experimental results showed that the proposed voice transformation method achieved higher preference than the baseline systems while maintaining the naturalness of the speech.
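The two operations named in this abstract, F0 scaling and frequency warping of the spectral envelope, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the linear warping, and especially the exponent `gamma` that ties the warping factor to the F0 ratio are hypothetical assumptions.

```python
def scale_f0(f0_contour, ratio):
    """Scale voiced F0 values (Hz) by a constant ratio; 0.0 marks unvoiced frames."""
    return [f * ratio if f > 0 else 0.0 for f in f0_contour]


def coupled_warp_factor(f0_ratio, gamma=0.3):
    """Hypothetical coupling between F0 change and vocal tract length change.

    gamma is an illustrative exponent, not a value taken from the paper.
    """
    return f0_ratio ** gamma


def warp_envelope(envelope, alpha):
    """Linearly warp the frequency axis of one spectral-envelope frame.

    Output bin k takes the (interpolated) input value at position k / alpha,
    so alpha > 1 shifts spectral peaks upward (a shorter vocal tract).
    """
    n = len(envelope)
    warped = []
    for k in range(n):
        src = k / alpha
        lo = int(src)
        if lo >= n - 1:
            # Past the last bin: hold the final value.
            warped.append(envelope[-1])
        else:
            frac = src - lo
            warped.append((1 - frac) * envelope[lo] + frac * envelope[lo + 1])
    return warped


# Example: raise F0 by 50% and warp the envelope by the coupled factor.
f0_ratio = 1.5
new_f0 = scale_f0([120.0, 0.0, 130.0], f0_ratio)
new_env = warp_envelope([0.0, 1.0, 0.5, 0.2, 0.1], coupled_warp_factor(f0_ratio))
```

Deriving the warping factor from the F0 ratio keeps the two modifications consistent with each other, which is the general idea the abstract describes; the actual mapping used in the paper is not given here.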

An Experimental Study on the Prediction of Indoor Sound Level Distribution in Apartment for Exterior Noise (외부소음에 대한 공동주택 실내 소음레벨분포에 관한 실험적 연구)

  • Park, Hyeon-Ku;Kim, Jong-Bin;Kang, Dong-Yong;Jang, Hyun-Choong;Song, Hyuk;Kim, Sun-Woo
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2001.05a
    • /
    • pp.259-264
    • /
    • 2001
  • When exterior noise is present, the sound pressure level (SPL) in the rooms needs to be predicted before an apartment is designed. To predict the SPL for an apartment exposed to specific exterior noises, the following should be known: the characteristics of the outdoor noise, the sound insulation performance, and the sound level differences of each room. The purpose of this study is to examine whether the SPL of rooms in an apartment can be predicted by analysing the sound level differences among rooms. The sound sources used in this experiment are construction noise, aircraft noise, railroad noise, and road traffic noise, with white noise as a reference for comparison with the other four. These noises were recorded and reproduced through a loudspeaker. As a result, we found that, within the sound reduction pattern, the sound level difference appeared uniform, depending on the sound insulation characteristics of the windows installed facing the noise source. When windows with the same acoustic performance were installed, the SPL in each room was nearly the same.
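The prediction idea in this abstract, estimating an indoor level from the outdoor level and a measured level difference, can be illustrated with a short sketch. The band values below are hypothetical and the helper names are assumptions; only the decibel arithmetic is standard.

```python
import math


def predict_indoor_levels(outdoor_db, level_difference_db):
    """Subtract the per-band level difference from the per-band outdoor level."""
    return {band: outdoor_db[band] - level_difference_db[band] for band in outdoor_db}


def combine_levels(levels_db):
    """Energetic sum of band levels: 10 * log10(sum(10 ** (L / 10)))."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))


# Hypothetical octave-band data (Hz -> dB), for illustration only.
outdoor = {125: 70.0, 500: 72.0, 2000: 68.0}
difference = {125: 18.0, 500: 25.0, 2000: 30.0}

indoor = predict_indoor_levels(outdoor, difference)
indoor_total = combine_levels(indoor.values())
```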


A study on the noise reduction of practical duct system with the air cavity (공기층을 갖는 실제덕트 구조물에서의 소음저감에 관한 연구)

  • Kim, Chan-Mook;Lee, Doo-Ho;Bahng, Keuk-Ho
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2000.06a
    • /
    • pp.1687-1692
    • /
    • 2000
  • In this paper, experimental methods for determining the acoustic characteristics of an acoustically treated air-conditioning duct system are proposed. Existing methods for analyzing the acoustic properties of a duct lined with absorbent material are limited in that they must assume the wave in the duct to be a plane wave. Under this assumption, the limit on the applicable frequency range makes accurate analysis of a practical air-conditioning system impossible. To analyze the properties of the in-line absorbent treatment with a high degree of accuracy, the sound source in these experiments is excited over a broad frequency band, so that the source speaker excites higher-order modes of the in-duct sound field. In addition, to determine how the air cavity relates to the acoustic characteristics, acoustic experiments were carried out on ducts with air cavities of different depths. In conclusion, the air cavity improves the absorbing ability of the duct in the low-frequency range. Because of the interaction between the air cavity depth and the absorbent depth, the magnitude of the absorption coefficient over a specific frequency range changes with the cavity depth. In the lower frequency range, absorption of sound energy by the air cavity dominates over absorption by the absorbent itself; in the higher range, the reverse is true.
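The frequency limit imposed by the plane-wave assumption can be made concrete. For a hard-walled rectangular duct, the first higher-order mode cuts on at f = c / (2a), where a is the larger cross-section dimension; below that frequency only plane waves propagate. The sketch below uses hypothetical duct sizes and an assumed speed of sound of 343 m/s.

```python
def plane_wave_cutoff_hz(larger_side_m, speed_of_sound=343.0):
    """Cut-on frequency of the first higher-order mode in a hard-walled rectangular duct."""
    if larger_side_m <= 0:
        raise ValueError("duct dimension must be positive")
    return speed_of_sound / (2.0 * larger_side_m)


# Hypothetical duct widths in metres.
for width in (0.2, 0.5, 1.0):
    print(f"width {width} m: plane-wave analysis valid below {plane_wave_cutoff_hz(width):.0f} Hz")
```

Larger ducts therefore have a lower plane-wave limit, which is one reason broadband measurements on practical air-conditioning ducts involve higher-order modes.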


Effect of Acoustic Reflector's Surface Density on Sound Absorption Characteristics and Stage Acoustics (음향 반사판의 밀도별 흡음특성 및 무대음향에의 영향)

  • Kim, Young-Sun;Jeong, Jeong-Ho;Jeon, Jin-Yong;Kim, Myeong-Seok
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.22 no.5
    • /
    • pp.429-436
    • /
    • 2012
  • In concert halls and auditoriums, the acoustic reflector and stage enclosure are among the main factors that determine room and stage acoustic characteristics. Honeycomb-based lightweight reflectors are widely used as stage enclosures and acoustic reflectors because they are easy to install. However, there has not been enough research on the effect of surface density on room and stage acoustics. In this study, sound absorption coefficient tests were conducted on three kinds of wooden acoustic reflectors with different surface densities, ranging from 11 kg/m² to 41 kg/m². For low-frequency excitation, a subwoofer was used simultaneously with an omnidirectional loudspeaker. The experiments showed that the sound absorption coefficient below the 250 Hz band decreased as surface density increased. To check the influence of surface density on room and stage acoustic parameters, a room acoustic simulation was conducted using the sound absorption coefficients measured in the reverberation chamber. As the surface density of the acoustic reflector increased, RT (reverberation time) and EDT (early decay time) increased. ST (stage support) also improved in the low-frequency bands.
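Why lower absorption goes with a longer reverberation time can be seen from Sabine's formula, RT60 = 0.161 V / A, with A the sum of surface area times absorption coefficient. The study itself used a room acoustic simulation rather than this formula, so the sketch below is only an illustration, and the hall volume, surface areas, and coefficients are hypothetical.

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time in seconds.

    surfaces is a list of (area_m2, absorption_coefficient) pairs.
    """
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    if total_absorption <= 0:
        raise ValueError("total absorption must be positive")
    return 0.161 * volume_m3 / total_absorption


# Hypothetical hall: 200 m2 of reflector plus 1800 m2 of other surfaces.
volume = 10000.0
other_surfaces = (1800.0, 0.25)

light_reflector = sabine_rt60(volume, [(200.0, 0.30), other_surfaces])
heavy_reflector = sabine_rt60(volume, [(200.0, 0.10), other_surfaces])
```

With these illustrative numbers the heavier, less absorptive reflector gives the longer reverberation time, in line with the trend reported in the abstract.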

A study of reciting the formal poetries of Korea and French in digital era - Shijo(Korean verse) vs Sonnet (French) (콘텐츠를 위한 한ㆍ불 정형시가 낭송법의 비교 고찰)

  • 이산호
    • Sijohaknonchong
    • /
    • v.19 no.1
    • /
    • pp.85-106
    • /
    • 2003
  • Recently, the sonnet and the shijo, which represent French and Korean formal poetry respectively, tend to be read with the eyes only, as we have grown more accustomed to written literature. But even after almost three millennia of written literature and the increased use of digitalized poems, poetry retains its appeal to the ear as well as to the eye. Reading a poem only with the eyes may be a mistake, because a poem is designed to be read aloud and understood by ear, and its aesthetic effect is diminished otherwise. It is essential to find the right way to recite a poem in this dramatically changed society, especially now that many shijos are being converted into digital forms to adapt to the new wave of our society. The sonnet and the shijo emphasize the harmony of sounds and rhythms within a fixed structure, and each has its own prosody. The emotions of the speaker in a poem are expressed through words, and when the words are pronounced, each phoneme has its own phonemic characteristics. A comparison of The Broken Bell (Baudelaire) and Chopoong ga (Jong Seo Kim) in terms of prosody and phonetics shows that the speaker's emotions are closely related to the phonetic structure of each word. In The Broken Bell, the phonetic value of the rhymes, the repeated phonemes, the concentration of front and back vowels, and the rhythms of one-syllable words shape the overall image of the poem, which describes the productivity of bells as opposed to the sterility of the soul. Chopoong ga likewise shows the determined and strong will of the speaker through frequent glottalized sounds, the distribution and concentration of certain vowels, and the frequent use of plosives. As these examples show, phones, beats, and rhythms are not mere transmitters of meaning but possess expressive values of their own, and they should be the first things considered when reciting a poem.


Acoustic Characteristic of Emergency Broadcasting Speakers (비상방송용 스피커의 음향 특성 비교)

  • Jeong, Jeong-Ho;Seo, Bo-Youl;Park, Kye-Won;Shin, Yi-Chul;Hong, Won-Hwa
    • Fire Science and Engineering
    • /
    • v.33 no.1
    • /
    • pp.130-137
    • /
    • 2019
  • In this study, the acoustic characteristics of 13 types of emergency broadcasting speakers were tested under the UL 2043 test set-up and compared. When the sound pressure level (SPL) of 1 W speakers was compared with that of 15 W speakers, the SPL of the 15 W speakers was approximately 20 dB higher in some frequency bands. Loudness analysis showed that people perceive the emergency sound from a 15 W speaker as twice as loud as that from a 1 W speaker. The articulation index (room) results showed the opposite tendency to the loudness results, meaning that small speakers can generate clearer sound. Therefore, emergency broadcasting speakers need to be improved to generate louder and clearer sound. Moreover, a performance evaluation standard based on reasonable, quantitative measurement and evaluation of the acoustic characteristics of emergency broadcasting speakers is needed so that sufficiently loud and clear sound can be generated in various spaces. In addition, standards for the clarity of emergency broadcasting in various spaces need to be established.
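As a rough cross-check on the figures in this abstract, the level change expected from input power alone is 10 log10(P2 / P1), and a common rule of thumb is that perceived loudness doubles for every 10 dB increase. The sketch below assumes speakers of equal efficiency, which the measured data need not satisfy, and the function names are illustrative.

```python
import math


def power_ratio_to_db(power_w, reference_w):
    """Level difference in dB implied by an electrical power ratio (equal efficiency assumed)."""
    return 10 * math.log10(power_w / reference_w)


def loudness_ratio_from_db(delta_db):
    """Rule of thumb: loudness doubles per 10 dB (sone scale, moderate to high levels)."""
    return 2 ** (delta_db / 10)


gain_db = power_ratio_to_db(15.0, 1.0)        # about 11.8 dB from power alone
loudness_gain = loudness_ratio_from_db(gain_db)
```

Under these assumptions a 15 W input accounts for roughly 12 dB, so the approximately 20 dB differences reported in some bands would also reflect differences between the speaker units themselves.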

A Interdisciplinary Study about Voice Change of the Presidential Candidate and Cognition Change of the Voters (선거 연설에서 대통령 후보자의 목소리 변화에 따른 유권자의 인지 변화에 대한 융합 연구)

  • Hahm, Sang-Woo;Park, Hyungwoo
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.3
    • /
    • pp.193-200
    • /
    • 2018
  • In a formal speech, the speaker's voice has a variety of effects on the listener. Depending on the voice characteristics, the effectiveness and efficiency of the speech change as well. In a presidential election, the candidate's voice characteristics affect the cognition of voters. Thus, we need to understand what makes a candidate's voice more effective. This study examines whether voters change their cognition when a candidate changes their own voice. If voters' cognition changes with the candidate's voice characteristics, we will be able to explain what kind of voice a candidate needs. We will also be able to suggest the voice change strategies needed for effective speech. We describe the change in voters' cognition that accompanies the change in the presidential candidates' voices along both the sori-engineering dimension and the cognitive dimension. Hence, this study explains the voice characteristics and change strategies that candidates need for effective speech.

A Study on the Propagation Characteristics of Fire Alarm Sound in Buildings (화재비상경보음의 건물 내 전달특성에 관한 연구)

  • Baek, Eun-Sun
    • Fire Science and Engineering
    • /
    • v.23 no.5
    • /
    • pp.153-160
    • /
    • 2009
  • This study aims to review the propagation characteristics of fire alarm sound in buildings through computer simulation. To this end, the sound power levels of three existing emergency alarms were measured in an anechoic chamber. At DC 24 V in the anechoic chamber, the sound power level was 98.6 dB for the alarm bell, 95.7 dB for the electronic-siren speaker, and 101.8 dB for the electronic-siren phon. The acoustic simulation showed that sound levels in the corridor of the building were relatively high and even. However, there were large differences in sound level between the corridor and the lecture rooms in all frequency bands. This means that the alarm sound may sometimes not be recognized in lecture rooms. Through computer simulation, the propagation characteristics of fire alarm sound could be predicted and compared across building plans.
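To relate the measured sound power levels to a level heard at some distance, the free-field relation for a point source, Lp = Lw - 10 log10(4 π r²), gives a first estimate. This is far simpler than the building simulation used in the study: it ignores walls, doors, and reflections, and the 10 m distance is a hypothetical choice.

```python
import math


def spl_free_field(sound_power_level_db, distance_m):
    """Sound pressure level of a point source radiating spherically in a free field."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return sound_power_level_db - 10 * math.log10(4 * math.pi * distance_m ** 2)


# Sound power levels reported in the abstract (dB).
alarms = {
    "alarm bell": 98.6,
    "electronic-siren speaker": 95.7,
    "electronic-siren phon": 101.8,
}

# Hypothetical listener distance of 10 m.
levels_at_10m = {name: spl_free_field(lw, 10.0) for name, lw in alarms.items()}
```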

Speech Rate and Pause Characteristics in Speaker with Flaccid Dysarthria (이완형 마비말장애 화자의 말속도와 쉼 특성)

  • Hong, Saemi;Byeon, Haewon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.5
    • /
    • pp.2930-2936
    • /
    • 2014
  • The purpose of this study was to investigate the characteristics of speech rate and pauses in patients with flaccid dysarthria. To this end, 15 patients with flaccid dysarthria and 15 normal speakers matched for gender and age participated as subjects. The overall speech rate, the articulation rate, and the inter-sentence and intra-sentence pause duration and pause frequency were measured while the subjects read the standardized passage "Autumn" (Kim, 1996). As a result, the overall speech rate and articulation rate of patients with flaccid dysarthria were significantly slower than those of normal speakers, and their intra-sentence pause duration and frequency were significantly higher, whereas the inter-sentence measures did not differ significantly. The results of this study provide a speech rate index for flaccid dysarthria and indicate that controlling the speech rate in flaccid dysarthria requires attention not only to the overall speech rate and the articulation rate but also to intra-sentence pause duration and frequency.
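The difference between the two rate measures can be shown with a small calculation. It follows the usual definitions, in which overall speech rate counts pause time and articulation rate excludes it; the syllable count and durations below are hypothetical, and the paper's exact pause criterion is not stated in the abstract.

```python
def overall_speech_rate(syllables, total_duration_s):
    """Syllables per second over the whole reading, pauses included."""
    return syllables / total_duration_s


def articulation_rate(syllables, total_duration_s, pause_durations_s):
    """Syllables per second over speaking time only, pauses excluded."""
    speaking_time = total_duration_s - sum(pause_durations_s)
    if speaking_time <= 0:
        raise ValueError("pause time must be shorter than total duration")
    return syllables / speaking_time


# Hypothetical reading: 120 syllables in 40 s with three pauses.
pauses = [2.0, 3.0, 5.0]
rate_all = overall_speech_rate(120, 40.0)          # 3.0 syllables/s
rate_artic = articulation_rate(120, 40.0, pauses)  # 4.0 syllables/s
```

Longer or more frequent intra-sentence pauses lower the overall speech rate without changing the articulation rate, which is why the two measures are reported separately.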

Summarization of Korean Dialogues through Dialogue Restructuring (대화문 재구조화를 통한 한국어 대화문 요약)

  • Eun Hee Kim;Myung Jin Lim;Ju Hyun Shin
    • Smart Media Journal
    • /
    • v.12 no.11
    • /
    • pp.77-85
    • /
    • 2023
  • After COVID-19, communication through online platforms has increased, leading to the accumulation of massive amounts of conversational text data. With the growing importance of summarizing this text data to extract meaningful information, there has been active research on deep learning-based abstractive summarization. However, compared with structured texts such as news articles, conversational data often contains missing or transformed information, and its unique characteristics must be considered from multiple perspectives. In particular, vocabulary omissions and unrelated expressions in a conversation can hinder effective summarization. Therefore, in this study, we restructured dialogues in light of the characteristics of Korean conversational data, fine-tuned a pre-trained KoBART-based text summarization model, and improved summarization performance on conversation data through a refining operation that removes redundant elements from the summary. We combined two restructuring methods: reordering sentences according to the order of utterances, and extracting a central speaker and restructuring the conversation around that speaker. As a result, the ROUGE-1 score improved by about 4 points. This study demonstrates the value of our conversation restructuring approach, which considers the characteristics of dialogue, for enhancing Korean conversation summarization performance.
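For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a candidate summary and a reference summary. The sketch below is a minimal version that assumes whitespace tokenization; Korean text is usually tokenized with a morphological analyzer first, and the authors' exact evaluation setup is not given in the abstract.

```python
from collections import Counter


def rouge_1(reference, candidate):
    """Unigram-overlap precision, recall, and F1 using clipped token counts."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    ref_total = sum(ref_counts.values())
    cand_total = sum(cand_counts.values())
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1("the cat sat on the mat", "the cat on the mat")
```

Scores are often reported multiplied by 100, which is presumably the scale on which the abstract's 4-point improvement is expressed.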