• Title/Summary/Keyword: Speech quality


Virtual displays and virtual environments

  • Gilkey, R.H.;Isabelle, S.K.;Simpson, B.B.
    • Journal of the Ergonomics Society of Korea / v.16 no.2 / pp.101-122 / 1997
  • Our recent work on virtual environments and virtual displays is reviewed, including our efforts to establish the Virtual Environment Research, Interactive Technology, And Simulation (VERITAS) facility and our research on spatial hearing. VERITAS is a state-of-the-art multisensory facility built around CAVE™ technology. High-quality 3D audio is included and haptic interfaces are planned. The facility will support technical and non-technical users working in a wide variety of application areas. Our own research emphasizes the importance of auditory stimulation in virtual environments and complex display systems. Experiments on auditory-aided visual target acquisition, sensory conflict, sound localization in noise, and localization of speech stimuli are discussed.


Performance Assessment of Several Established Pitch Detection Algorithms in Voices of Benign Vocal Fold Lesions (양성후두 질환 음성에 대한 여러 기존 피치검출 알고리즘의 성능 평가)

  • Jang, Seung-Jin;Choi, Seong-Hee;Kim, Hyo-Min;Choi, Hong-Shik;Yoon, Young-Ro
    • Proceedings of the IEEK Conference / 2007.07a / pp.407-408 / 2007
  • Robust pitch estimation is an important topic in many areas of speech processing. In voice pathology, diverse statistics extracted from pitch are commonly used to assess voice quality. In this study, we compared several established pitch detection algorithms (PDAs) to verify their adequacy. Using a database of 99 pathological voices and 30 normal voices, pitch detection errors were analyzed between pathological and normal voices, and among types of pathological voices with benign vocal fold lesions: polyps, nodules, and cysts. The results indicate that the severity of the tested voice must be assessed in order to obtain accurate pitch estimates.


Vector Quantization of Image Signal using Learning Count Control Neural Networks (학습 횟수 조절 신경 회로망을 이용한 영상 신호의 벡터 양자화)

  • 유대현;남기곤;윤태훈;김재창
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.1 / pp.42-50 / 1997
  • Vector quantization has been shown to be useful for compressing data in a wide range of applications such as image processing, speech processing, and weather satellite data. Neural networks of images [...] this paper proposes an efficient neural network learning algorithm, called the learning count control algorithm, based on the frequency-sensitive learning algorithm. This algorithm can train a [...]; as a result, more codewords can be assigned to the sensitive region of the human visual system and the quality of the reconstructed image can be improved. We use a human visual system model that is a cascade of a nonlinear intensity mapping function and a modulation transfer function with a bandpass characteristic.


The Criterion of Speech Quality Measurement for VoIP (VoIP를 위한 음질 평가 기준 연구)

  • Cho A Seo;Park Sang Wook;Park Young Chul;Youn Dae Hee
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.221-224 / 2002
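  • In VoIP voice communication systems, call quality degrades as network conditions worsen, because of the effects of QoS parameters such as delay, packet loss, and jitter. To improve call quality, the relationship between call quality and the QoS parameters must be clearly understood and improvement methods studied accordingly. In this paper, we therefore derive the correlation between call quality and the QoS parameters through regression analysis. Because the proposed speech quality criterion predicts quality from the QoS parameters alone, it requires very little computation and has the advantage of having almost no effect on the voice communication system while the quality assessment is running.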


An Applicability of Teager Energy Operator and Energy Separation Algorithm for Waveform Distortion Analysis : Harmonics, Inter-harmonics and Frequency Variation

  • Cho, Soo-Hwan;Hur, Jin;Chung, Il-Yop
    • Journal of Electrical Engineering and Technology / v.9 no.4 / pp.1210-1216 / 2014
  • This paper deals with the application of the Teager Energy Operator (TEO) and the Energy Separation Algorithm (ESA) to detect and characterize various voltage waveform distortions such as harmonics, inter-harmonics, and frequency variation. Because the TEO and DESA algorithm was initially proposed for speech or communication analysis, its applications are limited to some types of waveform in the power quality analysis area. For example, an undistorted voltage signal is similar to a pure sinusoid. From the viewpoint of signal theory, a voltage fluctuation is very similar to an amplitude-modulated signal, and a continuous frequency variation is similar to a frequency-modulated signal, also known as a chirp signal. This paper shows that the TEO and DESA algorithm can be used to detect occurrences of the representative waveform distortions and to determine their instantaneous amplitude and frequency.
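
As a rough illustration of the operator named in this abstract, the sketch below applies the discrete Teager energy operator, psi[x](n) = x(n)^2 - x(n-1)x(n+1), and one commonly used energy separation variant (DESA-2) to a pure sinusoid. The abstract does not say which DESA variant or which signal parameters the authors used, so the function names, the 50 Hz tone, the 1 kHz sampling rate, and the amplitude of 2.0 are all assumptions made for the example.

```python
import math


def teager_energy(x, n):
    """Discrete Teager energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def desa2(x, n):
    """Estimate (frequency in rad/sample, amplitude) at sample n with DESA-2.

    Needs samples x[n-2] .. x[n+2]. Valid for frequencies below pi/2 rad/sample.
    Returns None where the operator output is not positive.
    """
    # Symmetric difference y(k) = x(k+1) - x(k-1), evaluated at k = n-1, n, n+1.
    y = [x[k + 1] - x[k - 1] for k in (n - 1, n, n + 1)]
    psi_x = teager_energy(x, n)
    psi_y = y[1] * y[1] - y[0] * y[2]
    if psi_x <= 0.0 or psi_y <= 0.0:
        return None
    # For x(n) = A cos(W n + phi): psi_x = A^2 sin^2(W), psi_y = 4 A^2 sin^4(W).
    cos_arg = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
    omega = 0.5 * math.acos(cos_arg)
    amplitude = 2.0 * psi_x / math.sqrt(psi_y)
    return omega, amplitude


# Illustrative values only (not taken from the paper).
fs, f0, a0 = 1000.0, 50.0, 2.0
x = [a0 * math.cos(2.0 * math.pi * f0 * k / fs + 0.3) for k in range(64)]

omega, amplitude = desa2(x, 32)
freq_hz = omega * fs / (2.0 * math.pi)
print(f"estimated frequency: {freq_hz:.3f} Hz, amplitude: {amplitude:.3f}")
```

For an undistorted sinusoid the estimates are constant from sample to sample. Amplitude or frequency modulation would make them vary over time, which is the property the abstract relies on for detecting distortions.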

Effects of Experience on the Production of English Unstressed Vowels

  • Lee, Bo-Rim;Guion Susan G.
    • MALSORI / no.60 / pp.47-66 / 2006
  • This study examined the effect of English-language experience on Korean- and Japanese-English late learners' production of English unstressed vowels in terms of four acoustic phonetic features: F0, duration, intensity and vowel reduction. The learners manifested some improvement with experience. The native-like attainment of a phonetic feature, however, was related to the phonological status of that feature in the speakers' native language. The results suggest that the extent to which the non-native speakers' production of English unstressed vowels improved with English-language experience varied as a function of their native language background.


Improvement of Prosody Transplantation Technology for English Prosody Education and Its Application (운율교육을 위한 운율이식기술 개선 방안 연구)

  • Yi, So-Pae
    • MALSORI / no.61 / pp.49-62 / 2007
  • This study focused on improving prosody transplantation technology for use in effective prosody education. Issues that make the technology a less acceptable tool for prosody education were addressed. Instead of merely copying the target pitch onto a learner's utterances, the target pitch was rescaled in semitones before the transplantation. In so doing, distortion of the signal was minimized and the transplanted utterance could retain a sound quality no different from that of the learner's own utterances. Instead of manual transplantation, an automatic procedure was proposed to increase the reliability and consistency of the outcome and to enable real-time processing. The perceptual performance of the automatic transplantation was evaluated in a perception experiment, which showed that the automatic transplantation was as good as the manual process.
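
The semitone rescaling mentioned in this abstract can be illustrated with the standard conversion st = 12 log2(f / f_ref). The sketch below is only one plausible reading: the abstract does not specify how the reference pitches are chosen or how the rescaled contour is applied, so the helper names, the reference values, and the sample contour are hypothetical.

```python
import math


def hz_to_semitones(f_hz, ref_hz):
    """Pitch in semitones relative to a reference frequency."""
    return 12.0 * math.log2(f_hz / ref_hz)


def semitones_to_hz(semitones, ref_hz):
    """Inverse of hz_to_semitones."""
    return ref_hz * 2.0 ** (semitones / 12.0)


def rescale_contour(target_f0, target_ref_hz, learner_ref_hz):
    """Re-express a target pitch contour around the learner's reference pitch.

    Each target value keeps its semitone offset from the target reference,
    but the offset is applied to the learner's reference instead.
    """
    return [
        semitones_to_hz(hz_to_semitones(f, target_ref_hz), learner_ref_hz)
        for f in target_f0
    ]


# Hypothetical values: a target speaker centred on 200 Hz, a learner on 100 Hz.
target_contour = [200.0, 220.0, 400.0]
rescaled = rescale_contour(target_contour, target_ref_hz=200.0, learner_ref_hz=100.0)
print([round(f, 2) for f in rescaled])
```

Under these assumptions the pitch movement, measured in semitones, is preserved while the contour stays within the learner's own range, which is consistent with the abstract's aim of keeping the transplanted utterance close to the learner's voice.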


On a Reduction of Pitch Searching Time by Preliminary Pitch in the CELP Vocoder

  • Bae, Seong-Gyun;Kim, Hyung-Rae;Kim, Dae-Sik;Bae, Myung-Jin
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1104-1111 / 1994
  • Code Excited Linear Prediction (CELP) as a speech coder exhibits good performance at data rates below 4.8 kbps. The major drawback of CELP-type coders is their large amount of computation. In this paper, we propose a new pitch search method that preserves the quality of the CELP vocoder with reduced complexity. The basic idea is to restrict the pitch search range by estimating preliminary pitches. Applying the proposed method to the CELP vocoder, we obtain approximately 87% complexity reduction in the pitch search.
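
The idea of restricting the pitch search range can be sketched as follows. This is a toy open-loop lag search, not the authors' method: the abstract does not describe how the preliminary pitch is estimated, so it is simply passed in here, and the lag range of 20 to 147 samples, the window half-width of 5, and the synthetic test signal are assumptions chosen for illustration.

```python
import math

MIN_LAG, MAX_LAG = 20, 147  # assumed full search range, in samples


def lag_score(signal, start, length, lag):
    """Correlation of the current frame with the signal delayed by `lag`,
    normalized by the energy of the delayed segment."""
    num = 0.0
    energy = 0.0
    for n in range(start, start + length):
        past = signal[n - lag]
        num += signal[n] * past
        energy += past * past
    return num / math.sqrt(energy) if energy > 0.0 else 0.0


def search_pitch(signal, start, length, lags):
    """Return (best lag, number of lags evaluated) over the candidate lags."""
    best_lag, best_score, evaluated = None, -math.inf, 0
    for lag in lags:
        score = lag_score(signal, start, length, lag)
        evaluated += 1
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, evaluated


def restricted_lags(preliminary_lag, half_width):
    """Candidate lags in a small window around a preliminary pitch estimate."""
    low = max(MIN_LAG, preliminary_lag - half_width)
    high = min(MAX_LAG, preliminary_lag + half_width)
    return range(low, high + 1)


# Synthetic periodic signal with a true pitch period of 80 samples.
period = 80
signal = [
    math.sin(2.0 * math.pi * n / period) + 0.5 * math.sin(4.0 * math.pi * n / period)
    for n in range(400)
]

full_lag, full_count = search_pitch(signal, 160, 160, range(MIN_LAG, MAX_LAG + 1))
fast_lag, fast_count = search_pitch(signal, 160, 160, restricted_lags(78, 5))
print(full_lag, full_count, fast_lag, fast_count)
```

In this toy setup both searches find the same lag, while the restricted search evaluates far fewer candidates. The 87% figure in the abstract refers to the authors' own method and CELP configuration and is not reproduced by this example.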


Evolution and current status of microsurgical tongue reconstruction, part II

  • Choi, Jong-Woo;Alshomer, Feras;Kim, Young-Chul
    • Archives of Craniofacial Surgery / v.23 no.5 / pp.193-204 / 2022
  • Tongue reconstruction remains a major aspect of head and neck reconstructive procedures. Surgeons planning tongue reconstruction should consider several factors to optimize the overall outcomes. Specifically, various technical aspects related to tongue reconstruction have been found to affect the outcomes. Multidisciplinary teams dedicated to oncologic, reconstructive, and rehabilitative approaches play an essential role in the reconstructive process. Moreover, operative planning addressing certain patient-related and defect-related factors is crucial for optimizing functional speech and swallowing, as well as quality of life outcomes. Furthermore, tongue reconstruction is a delicate process, in which overall functional outcomes result from proper flap selection and shaping, recipient vessel preparation and anastomosis, surgical approaches to flap insetting, and postoperative management. The second part of this review summarizes these factors in relation to tongue reconstruction.

Real-time Implementation of Variable Transmission Bit Rate Vocoder Integrating G.729A Vocoder and Reduction of the Computational Amount SOLA-B Algorithm Using the TMS320C5416 (TMS320C5416을 이용한 G.729A 보코더와 계산량 감소된 SOLA-B 알고리즘을 통합한 가변 전송율 보코더의 실시간 구현)

  • 함명규;배명진
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.84-89 / 2003
  • In this paper, we implement in real time on the TMS320C5416 a variable bit rate vocoder that applies the SOLA-B algorithm by Henja to the ITU-T G.729A vocoder, whose transmission rate is 8 kbps. With the SOLA-B algorithm, the duration of the speech is shortened at encoding, and the speech is played back at normal speed by extending its duration at decoding. To reduce the computational load of the SOLA-B algorithm, the cross-correlation function is evaluated at intervals that skip every 3 samples. The real-time implementation of the G.729A vocoder with the SOLA-B algorithm has a maximum complexity of 10.2 MIPS in the encoder and 2.8 MIPS in the decoder at the 8 kbps transmission rate. The maximum complexity is 18.5 MIPS in the encoder and 13.1 MIPS in the decoder at 6 kbps, and likewise 18.5 MIPS in the encoder and 13.1 MIPS in the decoder at 4 kbps. Memory usage is about 9.7 kwords of program ROM, 4.5 kwords of table ROM, and 5.1 kwords of RAM. The output waveform is shown to be bit-exact with the result of the C simulator. In the speech quality evaluation of the real-time variable bit rate vocoder, a MOS score of 3.69 was obtained at 4 kbps.