• Title/Summary/Keyword: Voice evaluation


A Survey of Queueing Approaches in ATM (비동기식 전송방식 (ATM) 에서의 대기행렬이론 응용에 관한 조사연구)

  • Park, No-Ik;Lee, Ho-Woo
    • IE interfaces / v.9 no.3 / pp.120-142 / 1996
  • Asynchronous Transfer Mode (ATM) is considered the most promising transfer technique for BISDN because of its efficiency and flexibility. Queueing theory has played a very important role in the performance evaluation of ATM over the past few years. This paper is composed of two parts. The first part covers several basic concepts of ATM. The second part surveys queueing approaches to ATM performance evaluation. It deals with stochastic models that have been proposed for the three basic categories of traffic sources (voice, data, video), various queueing models for the statistical multiplexer and switch, and priority strategies for buffer control schemes.

Machine Learning-Based Programming Analysis Model Proposal : Based on User Behavioral Analysis

  • Jang, Seonghoon;Shin, Seung-Jung
    • International journal of advanced smart convergence / v.9 no.4 / pp.179-183 / 2020
  • The online education platform market has been developing rapidly since the coronavirus disease 2019 (COVID-19) pandemic. As school classes at various levels are converted to non-face-to-face classes, interest in non-face-to-face online education is greater than ever. However, most online platforms currently in use are limited to fragmentary functions such as simply delivering images, voice, and messages, and they are of limited use for online hands-on training. Indeed, digital transformation is a traditional business method for increasing coding education and a corporate approach to service operation innovation strategy computing thinking power and platform model. There are many ways to evaluate a computer programmer's ability. Generally, piecemeal evaluation methods are used, in which results are evaluated within a time limit through coding tests. The purpose of this study is to propose a comprehensive evaluation that covers not only the written results but also the process by which those results are executed, and that assesses the programmer's tendencies and habits based on coding experience in order to evaluate the programmer's ability and productivity.

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models that offer various voice characteristics and personalized speech are widely used in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voices by generating speech from the utterances of unseen target speakers. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach includes not only an English one-shot multi-speaker TTS but also a Korean one-shot multi-speaker TTS. We evaluate the naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the proposed model improves on the baseline models in terms of both naturalness and speaker similarity.

Development of Voice Information System for Safe Navigation in Marine Simulator (시뮬레이터 기반 음성을 이용한 항행정보 안내시스템의 개발)

  • Son N. S.;Kim S. Y.
    • Journal of the Korean Society for Marine Environment & Energy / v.5 no.3 / pp.28-34 / 2002
  • As Speech Recognition (SR) and Text-To-Speech (TTS) technology develops rapidly, a voice control and guidance system is thought to be very helpful for safe navigation. However, the Voice Control and Guidance System (VCGS) is not yet widely included in the Navigation Supporting System (NSS). The main reason is that VCGS is so complicated and user-unfriendly that navigation officers hesitate to use it. Frequent errors in operating VCGS due to a low SR rate are another reason. To make VCGS more practicable for safe navigation, we design a user-friendly VCGS. First, through interviews, we survey the functions and procedures that navigation officers want included in VCGS. Second, to raise the SR rate, we tune the environmental noise in the bridge, and to reduce the operating errors caused by a low SR rate, we design self-correction functions. We also apply a user-independent SR engine so that speaker training procedures are basically unnecessary. The functions and procedures of the user-friendly VCGS for safe navigation are evaluated in simulator experiments, and the evaluation results are fed back into the design. As a result, we can design a VCGS that is more helpful for safe navigation. In this paper, we describe the features of the user-friendly VCGS for safe navigation and discuss the results of the simulator experiments.

An ACLMS-MPC Coding Method Integrated with ACFBD-MPC and LMS-MPC at 8kbps bit rate. (8kbps 비트율을 갖는 ACFBD-MPC와 LMS-MPC를 통합한 ACLMS-MPC 부호화 방식)

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.19 no.6 / pp.1-7 / 2018
  • This paper presents an 8 kbps ACLMS-MPC (Amplitude Compensation and Least Mean Square - Multi Pulse Coding) coding method that integrates ACFBD-MPC (Amplitude Compensation Frequency Band Division - Multi Pulse Coding) and LMS-MPC (Least Mean Square - Multi Pulse Coding). It uses V/UV/S (Voiced / Unvoiced / Silence) switching, compensation of the multi-pulses in each pitch interval, and unvoiced approximate synthesis using specific frequencies in order to reduce distortion of the synthesized waveform. In integrating several methods, it is important to adjust the bit rate of the voiced and unvoiced sound sources to 8 kbps while reducing the distortion of the speech waveform. In adjusting the bit rate of the voiced and unvoiced sound sources to 8 kbps, the speech waveform can be synthesized efficiently by restoring the individual pitch intervals using the multi-pulses in the representative interval. I implemented the ACLMS-MPC method and evaluated the SNR of APC-LMS under an 8 kbps coding condition. As a result, the SNR of ACLMS-MPC was 15.0 dB for female voice and 14.3 dB for male voice. Therefore, I found that ACLMS-MPC improved by 0.3 dB to 1.8 dB for male voice and by 0.3 dB to 1.6 dB for female voice compared to the existing MPC, ACFBD-MPC, and LMS-MPC. These methods are expected to be applied to low-bit-rate speech coding using sound sources, such as in cellular phones or internet phones. In the future, I will study the sound quality evaluation of a 6.9 kbps speech coding method that simultaneously compensates the amplitude and position of the multi-pulse source.
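
The abstract above reports waveform SNR in dB but does not give the formula. The sketch below uses the conventional definition (signal power over error power), which is an assumption about how the paper measured it; the function name `snr_db` and the test signal are hypothetical.

```python
import numpy as np

def snr_db(reference, synthesized):
    """Waveform SNR in dB: 10*log10(sum(x^2) / sum((x - y)^2)).

    Assumes both signals are time-aligned and of equal length; the
    abstract does not state the exact measurement procedure.
    """
    reference = np.asarray(reference, dtype=float)
    synthesized = np.asarray(synthesized, dtype=float)
    if reference.shape != synthesized.shape:
        raise ValueError("signals must have the same shape")
    error = reference - synthesized
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

# Hypothetical check: a copy scaled by 0.9 leaves an error of 0.1 * x,
# so the power ratio is 1 / 0.01 and the SNR is 20 dB.
x = np.sin(2 * np.pi * 100 * np.arange(800) / 8000.0)
print(round(snr_db(x, 0.9 * x), 3))
```

In speech coding work, SNR is often also computed per frame and averaged (segmental SNR); the abstract does not say which variant was used.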

Analysis of the Effect of Intralesional Steroid Injection on the Voice During Laryngeal Microsurgery (후두 미세수술 중 병변 내 스테로이드 주입이 음성에 미치는 효과 분석)

  • Jae Seon, Park;Hyun Seok, Kang;In Buhm, Lee;Sung Min, Jin;Sang Hyuk, Lee
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.33 no.3 / pp.166-171 / 2022
  • Background and Objectives: Vocal fold (VF) scar is known to be the most common cause of dysphonia after laryngeal microsurgery (LMS). Steroids reduce postoperative scar formation by inhibiting inflammation and collagen deposition. However, clinical evidence on whether steroids help reduce VF scar formation after LMS is still lacking. The purpose of this study is to determine whether intralesional VF steroid injection after LMS helps to reduce postoperative scar formation and improve voice quality. Materials and Method: This study was conducted on 80 patients who underwent LMS for VF polyp, Reinke's edema, and leukoplakia. Among them, 40 patients who underwent VF steroid injection after LMS were set as the injection group, and patients of similar sex, age, and lesion size who underwent LMS alone were set as the control group. In each group, stroboscopy, the multi-dimensional voice program, Aerophone II, and the voice handicap index (VHI) were performed before surgery and 1 month after surgery, and the results were statistically analyzed. Results: There were no statistically significant differences in the distribution of sex, age, symptom duration, occupation, or smoking status between the groups. Both groups consisted of VF polyp (n=21), Reinke's edema (n=11), and leukoplakia (n=9). On stroboscopy, the lesion disappeared after surgery, and the amplitude and mucosal wave were symmetrical on both sides of the VFs in all patients. Acoustic parameters and VHI improved significantly after surgery in all patients. However, there was no significant difference between the injection and control groups in most of the results. Conclusion: There was no significant difference between the injection group and the control group in the results of stroboscopy, acoustic, aerodynamic, and subjective evaluation before and after surgery.

A study on combination of loss functions for effective mask-based speech enhancement in noisy environments (잡음 환경에 효과적인 마스크 기반 음성 향상을 위한 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.234-240 / 2021
  • In this paper, mask-based speech enhancement is improved for effective speech recognition in noisy environments. In mask-based speech enhancement, the enhanced spectrum is obtained by multiplying the noisy speech spectrum by the mask. The VoiceFilter (VF) model is used for mask estimation, and the Spectrogram Inpainting (SI) technique is used to remove residual noise from the enhanced spectrum. We propose a combined loss to further improve speech enhancement. To remove the residual noise in the speech effectively, the positive part of the Triplet loss is used together with the component loss. For the experiments, the TIMIT database is reconstructed using NOISEX92 noise and background music samples under various Signal to Noise Ratio (SNR) conditions. Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) are used as the performance metrics. When the VF was trained with the mean squared error and the SI model was trained with the combined loss, SDR, PESQ, and STOI improved by 0.5, 0.06, and 0.002, respectively, compared to the system trained only with the mean squared error.
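
The masking step described in the abstract above (enhanced spectrum = noisy spectrum × mask) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the clipping of the mask to [0, 1], and the toy values are assumptions, and the mask here is given by hand rather than estimated by a VoiceFilter model.

```python
import numpy as np

def apply_mask(noisy_mag, mask):
    """Element-wise masking of a magnitude spectrogram (freq x time).

    Clipping the mask to [0, 1] is an assumption made for this sketch.
    """
    noisy_mag = np.asarray(noisy_mag, dtype=float)
    mask = np.clip(np.asarray(mask, dtype=float), 0.0, 1.0)
    if noisy_mag.shape != mask.shape:
        raise ValueError("spectrogram and mask must have the same shape")
    return noisy_mag * mask

def mse_loss(enhanced_mag, clean_mag):
    """Mean squared error between enhanced and clean magnitudes."""
    diff = np.asarray(enhanced_mag, dtype=float) - np.asarray(clean_mag, dtype=float)
    return float(np.mean(diff ** 2))

# Toy 2x2 example with hand-picked (hypothetical) values.
noisy = np.array([[2.0, 4.0], [1.0, 3.0]])
mask = np.array([[0.5, 1.0], [0.0, 0.5]])
clean = np.array([[1.0, 4.0], [0.0, 1.0]])
enhanced = apply_mask(noisy, mask)
print(enhanced)                   # [[1.  4. ] [0.  1.5]]
print(mse_loss(enhanced, clean))  # 0.0625
```

The combined loss in the abstract adds the positive part of the Triplet loss and a component loss on top of a term like `mse_loss`; their exact form is not given in the abstract, so they are not sketched here.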

Clinical Study on Laryngo - Microscopic Surgery For Vocal Nodules and Polyps (후두결절 및 폴립의 후두미세 수술에 관한 임상연구)

  • 문영일
    • Proceedings of the KOR-BRONCHOESO Conference / 1983.05a / pp.11.2-11 / 1983
  • Vocal nodules and polyps are much more frequent in singers, public speakers, teachers, and actors. Voice trauma and voice misuse, at times associated with a mild inflammatory reaction, appear to be important in their etiology. It is generally agreed that vocal cord nodules and polyps are inflammatory in nature and that they arise in the subepithelial layer of loose connective tissue of the vocal cord. Since the junction of the anterior and middle thirds of the membranous cord has the greatest amplitude of vibration, this is the site of predilection for vocal cord nodules. The author performed laryngomicrosurgery for 70 cases of vocal nodules and polyps at Ewha Womans University Hospital over a period of 5 years. The results obtained were as follows: 1) Surgical excision is not necessarily the best approach, because vocal nodules in the early stages will resolve with the simplest voice therapy. 2) In children, surgery is rarely indicated, because most nodules in children regress during adolescence. 3) For patients who use their voices professionally, voice therapy is indicated for three months. 4) If after three months of conservative treatment the cord lesion does not improve and the patient is still dissatisfied with his voice, laryngomicrosurgery can then be considered. 5) The small cuffed endotracheal tube in the interarytenoid space helps to keep the cords immobile and in an abducted position. 6) Removal of the nodule should be started by gentle retraction posteriorly and as soon as a tear appears anterior to the nodule. 7) On occasion it is preferable to start the dissection with a sickle knife while the nodule is held on the stretch. 8) Voice rest should be maintained for a week, after which the free edges of the cords are usually healed.

An Evaluation of Effectiveness of Scalp Acupuncture on Post-stroke Dysarthria Group through Graphed Vowel Space (두침치료를 시행한 중추성 마비 말장애(Dysarthria) 환자의 모음 공간 평가)

  • Shin, Yoo-Jeong;Park, Jong-Hyuck;Baek, Kyung-Min;Chang, Woo-Seok;Choi, Yang-Kyu
    • The Journal of Internal Korean Medicine / v.29 no.2 / pp.413-420 / 2008
  • Objective : To investigate the therapeutic effect of scalp acupuncture on dysarthria after stroke. Methods : An A-B model in single-subject research was used. The experimental group of 7 members received scalp acupuncture, and the control group of 5 received nothing for the dysarthria itself. Five single vowels were recorded with a voice recorder and measured for the 1st and 2nd formants. From the formants, a vowel space was constructed for each patient. The Praat program was used for evaluation. Results : 4 cases in the experimental group showed significant change, and just 1 case in the non-operated group got better. Conclusion : This study indicates that acupuncture treatment may have a positive effect on dysarthria. More clinical reports are needed at this point for objective oriental medical study.
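
The abstract above builds a vowel space from the first and second formants of five vowels but does not say how it was quantified. One common summary is the area of the polygon formed by the (F1, F2) points; the sketch below computes it with the shoelace formula. The function name and the formant values are hypothetical, not taken from the study.

```python
def vowel_space_area(formants):
    """Polygon area (Hz^2) of a vowel space via the shoelace formula.

    `formants` is a list of (F1, F2) pairs in Hz, ordered around the
    perimeter of the polygon. Using area as the summary measure is an
    assumption; the abstract only says a vowel space was plotted.
    """
    n = len(formants)
    if n < 3:
        raise ValueError("at least three vowels are needed to form a polygon")
    twice_area = 0.0
    for i in range(n):
        f1_a, f2_a = formants[i]
        f1_b, f2_b = formants[(i + 1) % n]
        twice_area += f2_a * f1_b - f2_b * f1_a
    return abs(twice_area) / 2.0

# Hypothetical (F1, F2) values in Hz for five vowels, in perimeter order.
example = [(300, 2300), (500, 1900), (800, 1300), (500, 900), (320, 800)]
print(vowel_space_area(example))
```

Comparing the area before and after treatment for each patient would be one way to express the change that the study reads off the graphed vowel space.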

The Relationship Between Assertiveness and Clinical Stress in Nursing Students (간호학생의 자기표현 수준과 임상실습 스트레스와의 관계 연구)

  • Cho Nam-Ok
    • The Journal of Korean Academic Society of Nursing Education / v.4 no.2 / pp.317-330 / 1998
  • A descriptive-explanatory design was employed in this study. The purpose of this study was to identify the levels of assertiveness and clinical stress, and the relationship between assertiveness and clinical stress, in nursing students. A convenience sample of 143 nursing students was used for the study. The results of this study are as follows: 1) The level of assertiveness of nursing students was 3.65 in the domain of contents, 3.68 in the domain of voice, and 4.13 in the domain of body language. Thus the level of assertiveness of nursing students was highest in body language. 2) The level of clinical stress of nursing students was 3.73 in the domain of clinical education and evaluation by professors, 3.59 in the domain of nurses, 3.32 in the domain of human relationships, and 3.30 in the domain of environment. Thus the level of clinical stress of nursing students was highest in clinical education and evaluation by professors. 3) The assertiveness of nursing students was found to be significantly related to human relationships. 4) The clinical stress of nursing students was found to be significantly related to satisfaction with nursing, satisfaction with clinical practice, and priority of candidate for nursing. 5) The assertiveness of nursing students was not found to be significantly related to clinical stress.
