• Title/Summary/Keyword: Voice evaluation

Performance Evaluation of Multiplexing Algorithms with Both Delay and Loss Priorities in ATM Networks (ATM 통신망에서의 지연 및 손실 우선순위를 갖는 다중화 알고리즘의 성능 평가)

  • 전용희
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.5 / pp.842-856 / 1994
  • The various services that a broadband integrated services digital network (B-ISDN) carries have a wide range of delay, delay jitter, and cell loss probability requirements. Designing appropriate control schemes for B-ISDN is an extremely important and challenging problem. In this paper, we propose a multiplexing algorithm with both delay and loss priorities to satisfy these diverse requirements. To implement cell loss priority, we assume that voice cells are generated as non-discardable (i.e., high-priority) and discardable (i.e., low-priority) cells. A low-priority voice cell may be discarded inside the network if congestion occurs. The cell dropping scheme is shown to reduce cell losses as well as delays for both voice and data. Such a load shedding scheme is expected to significantly improve the utilization of B-ISDN.
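
The abstract above does not give the algorithm itself, so the following is only a minimal sketch of how delay and loss priorities could be combined in one multiplexer. Everything in it is an assumption made for illustration: the class name `PriorityMultiplexer`, the shared buffer of fixed capacity, serving voice before data as the delay priority, and pushing out a queued discardable voice cell when the buffer is full as the loss priority.

```python
from collections import deque


class PriorityMultiplexer:
    """Toy multiplexer (hypothetical): voice is served first, discardable cells go first under congestion."""

    def __init__(self, capacity):
        self.capacity = capacity  # shared buffer size in cells (assumed)
        self.voice = deque()      # entries are (cell_id, discardable)
        self.data = deque()
        self.dropped = []         # ids of cells lost to congestion

    def _occupancy(self):
        return len(self.voice) + len(self.data)

    def arrive(self, cell_id, kind, discardable=False):
        """Accept a cell; return False if the arriving cell itself is lost."""
        if self._occupancy() >= self.capacity:
            if discardable:
                # Congestion: an arriving low-priority cell is simply dropped.
                self.dropped.append(cell_id)
                return False
            # Otherwise look for a queued discardable voice cell to push out.
            idx = next((i for i, (_, d) in enumerate(self.voice) if d), None)
            if idx is None:
                self.dropped.append(cell_id)
                return False
            victim_id, _ = self.voice[idx]
            del self.voice[idx]
            self.dropped.append(victim_id)
        queue = self.voice if kind == "voice" else self.data
        queue.append((cell_id, discardable))
        return True

    def serve(self):
        """Transmit one cell: voice has delay priority over data."""
        if self.voice:
            return self.voice.popleft()[0]
        if self.data:
            return self.data.popleft()[0]
        return None


mux = PriorityMultiplexer(capacity=3)
mux.arrive("v1", "voice")                    # high-priority voice
mux.arrive("v2", "voice", discardable=True)  # low-priority voice
mux.arrive("d1", "data")                     # buffer is now full
mux.arrive("v3", "voice", discardable=True)  # dropped on arrival
mux.arrive("d2", "data")                     # pushes out v2
served = [mux.serve() for _ in range(3)]
```

In this toy run the two discardable voice cells are lost while the non-discardable voice cell and both data cells get through, which is the kind of load shedding the abstract describes. The paper's actual queueing discipline and thresholds may differ.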

Evaluation of Mental Fatigue Using Vowel Formant Analysis (모음 포먼트 분석을 통한 정신적 피로 평가)

  • Ha, Wook Hyun;Park, Sung Ha
    • Journal of Korean Society of Industrial and Systems Engineering / v.37 no.1 / pp.26-32 / 2014
  • Mental fatigue is inevitable in the workplace. Since mental fatigue can lead to decreased efficiency and critical accidents, it is important to manage mental fatigue from the viewpoint of accident prevention. An experiment was performed to evaluate mental fatigue using formant frequency analysis of human voices. The experimental task was to mentally add or subtract two one-digit numbers. After completing the tasks at four different levels of mental fatigue, subjects were asked to read Korean vowels and their voices were recorded. Five vowel sounds, "아", "어", "오", "우", and "이", from the recorded voices were then used to extract the formant 1 frequency. Separate ANOVAs showed significant main effects of mental fatigue on the formant 1 frequencies of all five vowels. However, post-hoc comparisons revealed that the formant 1 frequencies of "아" and "어" were the most sensitive to the mental fatigue levels used in this experiment. The formant 1 frequencies of "아" and "어" decreased significantly as mental fatigue accumulated. The formant frequency extracted from the human voice is potentially applicable to detecting mental fatigue induced during industrial tasks.

A Study on the Evaluation of Equilibrium Price between PSTN and VoIP Service (PSTN과 VoIP 서비스 간의 균형가격 도출에 관한 연구)

  • Yoon, Sang-Hum;Jin, Xiang-Hua;Park, Jong-Heon;Park, Young-Jun;Juhn, Jae-Ho;Ha, Gui-Ryong
    • Journal of Korean Society of Industrial and Systems Engineering / v.33 no.3 / pp.137-145 / 2010
  • The objective of this paper is to evaluate the equilibrium price between PSTN and VoIP telephony services in the case of a non-linear utility function. Currently, two types of wired phone service are known: PSTN (Public Switched Telephone Network) and VoIP (Voice over Internet Protocol). PSTN telephony, which provides high-quality service, and VoIP, which provides relatively low-quality service, form a vertically differentiated oligopoly. Therefore, evaluating the equilibrium price between PSTN and VoIP services is very important to wired phone service providers. The equilibrium price is shown to take different values depending on the form of the service cost function. This paper evaluates the equilibrium price for both a linear and a non-linear cost function. Subsequently, it analyzes the demand for both services and the equilibrium profit that maximizes the profit of both service providers.

A Study on the DAISY Service Interface for the Print-Disabled (독서장애인을 위한 DAISY 서비스 인터페이스 구성에 관한 연구)

  • Bae, Kyung-Jae
    • Journal of the Korean BIBLIA Society for library and Information Science / v.22 no.3 / pp.173-188 / 2011
  • This research aimed to identify empirical recommendations for designing the DAISY service interface, using a case-based method. The case studied was the LG Digital Talking Book Library (http://voice.lg.or.kr) of LG Sangnam Library. A group of experts evaluated the current web-based DAISY interface. After the evaluation, major recommendations were made and used to develop the new DAISY interface. The major recommendations were: consideration of the reading flow of the screen-reader program, preventing time delays from being perceived as errors, development of web-based software, support for convenience functions, and prevention of shortcut key overlap.

VoiceXML Dialog System Based on RSS for Contents Syndication (콘텐츠 배급을 위한 RSS 기반의 VoiceXML 다이얼로그 시스템)

  • Kwon, Hyeong-Joon;Kim, Jung-Hyun;Lee, Hyon-Gu;Hong, Kwang-Seok
    • The KIPS Transactions:PartB / v.14B no.1 s.111 / pp.51-58 / 2007
  • This paper proposes a prototype dialog system that combines VXML (VoiceXML), the W3C standard XML format for specifying interactive voice dialogues between a human and a computer, with RSS (RDF Site Summary or Really Simple Syndication), a representative semantic web technology for syndication and subscription of updated web content. The merits of the proposed system are as follows: 1) It is a new method that recognizes spoken content over wired and wireless telephone networks and provides content to the user via STT (Speech-to-Text) and TTS (Text-to-Speech), instead of the traditional web-only method. 2) It retains the advantages of RSS, because subscribed, updated content is converted to VXML without modifying the traditional way of providing the RSS service. 3) For users, it reduces time and space restrictions on searching content provided by RSS, because it uses wired and wireless telephone networks rather than an Internet environment. 4) For information providers, it needs no special component for syndicating the newest content using speech recognition and synthesis technology. We implemented a news service system using VXML and RSS to evaluate the performance of the proposed system. In the experiments, we estimated the response time and the speech recognition rate when subscribing to and searching actual content, and confirmed that the proposed system can deliver content provided through an RSS feed.

Pediatric Voice Handicap Index-Korean(pVHI-K) : A Pilot Study for Standardization (한국어판 소아음성장애지수(pVHI-K : Pediatric Voice Handicap Index-Korean) : 표준화를 위한 예비연구)

  • Park, Sung-Shin;Choi, Seong-Hee;Hong, Young-Hye;Jeong, Nyun-Gi;Sung, Myung-Whun;Kim, Kwang-Hyun;Kwon, Tack-Kyun
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.22 no.2 / pp.137-142 / 2011
  • Background and Objectives : The aim of this preliminary study is to introduce the Korean version of the pediatric VHI and, before pVHI-K is fully developed, to compare pVHI-K scores between children with dysphonia and children without voice problems. Additionally, the relationship between pVHI and acoustic measures was investigated. Materials and Methods : pVHI-K scores in the normal group were obtained from 15 parents of children with no present or past history of a voice disorder, hearing loss, or related disability that can affect their voice or speech. The dysphonia group consisted of 15 parents of children with bilateral vocal fold nodules seen at the Department of Otolaryngology, Seoul National University Hospital (SNUH). pVHI-K and acoustic parameters were measured in the two groups. Results : The mean pVHI scores (total, functional, physical, emotional) in the normal group were 2.33 (T), 0.80 (F), 1.33 (P), and 0.27 (E), respectively, whereas those in the dysphonia group were 23.13 (T), 11.07 (F), 5.73 (P), and 6.13 (E), respectively; significant differences were found in the total pVHI score as well as in all of the pVHI subscores. Moreover, significant correlations between pVHI-K parameters (T, F, P) and an acoustic measure [Shimmer(%)] were found in the dysphonia group. Conclusion : The pVHI-K, as reported by parents, can be useful as a supplementary clinical tool for diagnosing and measuring treatment effectiveness in young children with dysphonia.

Comparison of Vowel and Text-Based Cepstral Analysis in Dysphonia Evaluation (발성장애 평가 시 /a/ 모음연장발성 및 문장검사의 켑스트럼 분석 비교)

  • Kim, Tae Hwan;Choi, Jeong Im;Lee, Sang Hyuk;Jin, Sung Min
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.26 no.2 / pp.117-121 / 2015
  • Background : Cepstral analysis, which is obtained from the Fourier transform of the spectrum, is known to be an effective indicator for analyzing voice disorders. To evaluate voice disorders, phonation of a sustained vowel /a/ or continuous speech has been used, but the former is limited in capturing hoarseness properly. This study aimed to compare the effectiveness of cepstral analysis between the sustained vowel /a/ and continuous speech. Methods : From March 2012 to December 2014, a total of 72 patients were enrolled in this study, comprising 24 patients each with unilateral vocal cord palsy, vocal nodules, and vocal polyps. All patients rated their voice quality with the VHI (Voice Handicap Index) before and after treatment. Samples of sustained vowel /a/ phonation and of continuous speech, using the first sentence of the autumn paragraph, were subjected to cepstral analysis, and the pre-treatment and post-treatment groups were compared. Results : The pre- and post-treatment values of CPP-a (cepstral peak prominence in the /a/ vowel sound) were 13.80 and 13.91 in vocal cord palsy, 16.62 and 17.99 in vocal cord nodule, and 14.19 and 18.50 in vocal cord polyp, respectively. The pre- and post-treatment values of CPP-s (cepstral peak prominence in text-based speech) were 11.11 and 12.09 in vocal cord palsy, 12.11 and 14.09 in vocal cord nodule, and 12.63 and 14.17 in vocal cord polyp. All 72 patients showed subjective improvement in VHI after treatment. CPP-a showed statistically significant improvement only in the vocal polyp group, whereas CPP-s showed statistically significant improvement in all three groups (p<0.05). Conclusion : In cepstral analysis, text-based analysis is more representative of voice disorders than sustained vowel phonation. Therefore, in acoustic analysis of the voice by cepstrum, both sustained vowel /a/ phonation and text-based speech should be evaluated to obtain more accurate results.
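
For readers unfamiliar with the cepstrum mentioned in this abstract, here is a minimal sketch of the commonly used real cepstrum, computed as the inverse Fourier transform of the log magnitude spectrum. It is not the authors' analysis pipeline and it does not compute CPP itself, which additionally measures the peak against a regression line. The function names, the naive DFT, the `eps` floor, and the synthetic impulse-train test signal are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short illustrative frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(signal, eps=1e-9):
    """Inverse DFT of the log magnitude spectrum; eps (assumed) avoids log(0)."""
    spectrum = dft(signal)
    log_mag = [math.log(abs(s) + eps) for s in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]


def peak_quefrency(cepstrum, lo, hi):
    """Index (in samples) of the largest cepstral value within [lo, hi)."""
    return max(range(lo, hi), key=lambda q: cepstrum[q])


# Synthetic stand-in for a voiced frame: one impulse every 32 samples.
period = 32
signal = [1.0 if n % period == 0 else 0.0 for n in range(256)]
cep = real_cepstrum(signal)
peak = peak_quefrency(cep, 20, 50)  # search window is an arbitrary choice
```

For this strictly periodic test signal the cepstral peak falls at the quefrency equal to the pitch period (32 samples). A hoarse or aperiodic voice would show a weaker peak, which is the intuition behind peak-based measures such as CPP.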

Development and validation of Speech Range Profile task (발화범위 프로파일 과제 개발 및 타당성 검증)

  • Kim, Jaeock;Lee, Seung Jin
    • Phonetics and Speech Sciences / v.11 no.3 / pp.77-87 / 2019
  • The study aimed to develop a Speech Range Profile (SRP) task and to examine and validate its clinical application. Forty-five participants without voice disorders aged 18-29 years were compared using SRP and Voice Range Profile (VRP). The authors developed the "Fire!" paragraph as an SRP task comprising 14 sentences that include all Korean spoken phonemes and sentence types. To compare SRP and VRP results, the participants read the paragraph (reading) and counted from 21 to 30 (counting) as part of the SRP tasks, and produced a vowel /a/ from low to high frequencies (gliding) and a shortened form of the VRP as part of the VRP tasks. $F0_{max}$, $F0_{min}$, $F0_{range}$, $I_{max}$, $I_{min}$, and $I_{range}$ for each task were measured and compared, showing that $F0_{max}$, $F0_{min}$, $F0_{range}$, $I_{max}$, and $I_{range}$ were not different between reading and gliding. $I_{min}$ had the lowest value in counting. It is concluded that the newly developed SRP task, reading the "Fire" paragraph, can yield a maximum phonation range similar to that found by VRP. Therefore, it is expected that voice evaluation can be effectively performed in a relatively short time by applying SRP with the "Fire" paragraph, a functional utterance task, in place of VRP, which may be difficult to measure long term or in cases of severe voice disorders.

Complex nested U-Net-based speech enhancement model using a dual-branch decoder (이중 분기 디코더를 사용하는 복소 중첩 U-Net 기반 음성 향상 모델)

  • Seorim Hwang;Sung Wook Park;Youngcheol Park
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.253-259 / 2024
  • This paper proposes a new speech enhancement model based on a complex nested U-Net with a dual-branch decoder. The proposed model uses a complex nested U-Net to simultaneously estimate the magnitude and phase components of the speech signal, and its decoder has a dual-branch structure in which one branch performs spectral mapping and the other performs time-frequency masking. Compared to a single-branch decoder structure, the dual-branch decoder structure allows noise to be removed effectively while minimizing the loss of speech information. The experiment was conducted on the VoiceBank + DEMAND database, commonly used for speech enhancement model training, and the model was evaluated with various objective evaluation metrics. In the experiment, the complex nested U-Net-based speech enhancement model using a dual-branch decoder increased the Perceptual Evaluation of Speech Quality (PESQ) score by about 0.13 compared to the baseline, and showed higher objective evaluation scores than recently proposed speech enhancement models.

A Study on the Perceptual Aspects of an Emotional Voice Using Prosody Transplantation (운율이식을 통해 나타난 감정인지 양상 연구)

  • Yi, So-Pae
    • MALSORI / no.62 / pp.19-32 / 2007
  • This study investigated the perception of emotional voices by transplanting some or all of the prosodic aspects, i.e. pitch, duration, and intensity, of utterances produced with emotional voices onto those produced with normal voices and vice versa. Listening evaluation by 24 raters revealed that the prosodic effect was greater than the segmental & vocal quality effect on the perception of emotion. The degree of influence of prosody and that of segments & vocal quality varied according to the type of emotion. For fear, prosodic elements had far greater influence than segmental & vocal quality elements, whereas segmental & vocal quality elements had as much effect as prosody on the perception of happy voices. The prosodic features contributed to the perception of emotion in different amounts, in descending order of pitch, duration, and intensity. As for utterance length, emotion was perceived more effectively with long utterances than with short utterances.
