• Title/Summary/Keyword: speechTool

Search Results: 155

A Study on Intonation Patterns of Speech Produced by Cochlear Implanted Children

  • Park, Sang-Hee; Jang, Tae-Yeoub; Lee, Sang-Heun; Jeong, Ok-Ran; Seok, Dong-Il
    • Speech Sciences, v.9 no.1, pp.27-38, 2002
  • The purpose of the study is to examine the intonation patterns of cochlear-implanted children compared with those of normal-hearing children. Data tokens from three normal-hearing and five cochlear-implanted children were collected and investigated. Their intonation patterns were analyzed using the speech analysis tool Praat. The characteristics of two utterance types, interrogative and declarative, were investigated. No significant difference in intonation patterns between the two subject groups was found. However, the general pitch of the cochlear-implanted children was higher than that of the normal-hearing children. In addition, the cochlear-implanted children showed frequent pitch breaks.
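The pitch-contour extraction that a tool like Praat performs can be illustrated with a minimal autocorrelation sketch. This is a toy NumPy illustration of the general idea, not Praat's actual (more sophisticated) algorithm; the function name and parameters are our own:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=500.0):
    """Rough F0 estimate for one frame via autocorrelation: the lag of the
    strongest peak inside the plausible pitch range gives the period."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

sr = 16000
t = np.arange(int(0.04 * sr)) / sr
frame = np.sin(2 * np.pi * 220 * t)   # synthetic "voiced" frame at 220 Hz
f0 = estimate_f0(frame, sr)           # close to 220 Hz
```

Tracking such estimates frame by frame yields the intonation contours compared across the two subject groups.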

A Usability Evaluation Method for Speech Recognition Interfaces (음성인식용 인터페이스의 사용편의성 평가 방법론)

  • Han, Seong-Ho; Kim, Beom-Su
    • Journal of the Ergonomics Society of Korea, v.18 no.3, pp.105-125, 1999
  • As speech is the human being's most natural communication medium, using it offers many advantages. Currently, most computer user interfaces are of the mouse/keyboard type, but interfaces using speech recognition are expected to replace them, or at least to serve as a supporting tool. Despite the advantages, the speech recognition interface is not widely used because of technical difficulties such as limited recognition accuracy and slow response time, to name a few. Nevertheless, it is important to optimize human-computer system performance by improving usability. This paper presents a set of guidelines for designing speech recognition interfaces and provides a method for evaluating their usability. A total of 113 guidelines are suggested to improve the usability of speech-recognition interfaces. The evaluation method consists of four major procedures: user interface evaluation, function evaluation, vocabulary estimation, and recognition speed/accuracy evaluation. Each procedure is described along with proper techniques for efficient evaluation.
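The recognition-accuracy part of such an evaluation is conventionally quantified with word error rate (WER). The paper does not specify its exact metric, but a standard edit-distance formulation can be sketched as:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with dynamic-programming edit distance over words."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical command utterance: one substitution out of four words.
wer = word_error_rate("open the mail program", "open a mail program")
```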

Auditory-Perceptual Variables of Speech Evaluation in Dysarthria Literature (마비말장애 연구문헌에서 살펴본 말평가의 청지각적 요소)

  • Suh, Mee-Kyung; Kim, Hyang-Hee
    • Speech Sciences, v.13 no.3, pp.197-206, 2006
  • The perceptual judgment method is frequently used in evaluating dysarthric speech. Although most speech pathologists and researchers focus on the 38 perceptual features provided by Darley, Aronson, and Brown (1969) during evaluation, the literature offers additional characteristics that may be useful for describing dysarthria. We reviewed the previous dysarthria literature and selected 46 perceptual characteristics that can be examined across the various subsystems of speech production. We also provide explanations and a rationale for the rating method for each of the perceptual characteristics. This attempt may help provide a basis for developing a diagnostic tool for dysarthria.

Korean Prosody Generation Based on Stem-ML (Stem-ML에 기반한 한국어 억양 생성)

  • Han, Young-Ho; Kim, Hyung-Soon
    • MALSORI, no.54, pp.45-61, 2005
  • In this paper, we present a method of generating intonation contours for a Korean text-to-speech (TTS) system and a method of synthesizing emotional speech, both based on Soft Template Mark-up Language (Stem-ML), a novel prosody generation model that combines mark-up tags and pitch generation in one framework. The evaluation shows that the intonation contour generated by Stem-ML is better than that of our previous work. Stem-ML is also found to be a useful tool for generating emotional speech by controlling a limited number of tags. A large emotional speech database is crucial for more extensive evaluation.
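Soft-template models of this family generate a contour by layering smooth accent templates on a phrase curve. The toy sketch below only illustrates that superposition idea; it is not Stem-ML's actual model, and every parameter here is hypothetical:

```python
import numpy as np

def pitch_contour(n, phrase_start=220.0, phrase_end=180.0, accents=()):
    """Toy contour: a declining phrase curve plus Gaussian accent bumps
    (center, height in Hz, width), superposed like soft templates."""
    t = np.linspace(0.0, 1.0, n)
    f0 = phrase_start + (phrase_end - phrase_start) * t   # declination line
    for center, height, width in accents:
        f0 += height * np.exp(-0.5 * ((t - center) / width) ** 2)
    return f0

# Two accents on a declining phrase; raising accent heights is one crude
# analogue of the tag control used for emotional speech.
f0 = pitch_contour(100, accents=[(0.3, 30.0, 0.05), (0.7, 20.0, 0.05)])
```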

Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

  • Lee, Heung-Kyu; Kwon, Oh-Il; Ko, Han-Seok
    • Speech Sciences, v.11 no.2, pp.181-192, 2004
  • This paper presents an efficient speech interactive agent rendering smooth car navigation and Telematics services by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR), and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool providing command and control functions to drivers, such as enabling navigation tasks, audio/video manipulation, and e-commerce services through natural voice/response interactions between user and interface. While the benefits of automatic speech recognition and speech synthesis have become well known, the hardware resources involved are often limited, and internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in an embedded hardware system. To implement a speech interactive agent that accommodates user commands in real time, we propose optimizing the hardware-dependent architectural code for speed. In particular, we provide a composite solution through memory reconfiguration and efficient arithmetic-operation conversion, as well as an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.
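The paper does not detail its out-of-vocabulary rejection algorithm, but a common scheme compares the best hypothesis score against a filler ("garbage") model. A minimal sketch under that assumption, with hypothetical scores and margin:

```python
def accept_hypothesis(best_score, filler_score, margin=5.0):
    """Confidence-based OOV rejection: accept the recognizer's top
    hypothesis only if its log-likelihood beats a filler/garbage model
    by a margin; otherwise treat the input as out-of-vocabulary."""
    return (best_score - filler_score) > margin

in_vocab = accept_hypothesis(-120.0, -150.0)   # clear margin: accept
oov = accept_hypothesis(-148.0, -150.0)        # weak margin: reject as OOV
```

Rejection like this keeps the agent from acting on commands outside its vocabulary, at the cost of tuning the margin against false rejections.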

Wideband Speech Reconstruction Using Modular Neural Networks (모듈화한 신경 회로망을 이용한 광대역 음성 복원)

  • Woo Dong Hun; Ko Charm Han; Kang Hyun Min; Jeong Jin Hee; Kim Yoo Shin; Kim Hyung Soon
    • MALSORI, no.48, pp.93-105, 2003
  • Since the telephone channel has band-limited frequency characteristics, speech transmitted over it shows degraded quality. In this paper, we propose an algorithm using neural networks to reconstruct wideband speech from its narrowband version. Although a single neural network is a good tool for direct mapping, it has difficulty training on vast and complicated data. To alleviate this problem, we modularize the neural networks based on an appropriate clustering of the acoustic space. We also introduce fuzzy computing to compensate for probable misclassification at the cluster boundaries. According to our simulations, the proposed algorithm showed improved performance over both the single neural network and the conventional codebook mapping method in objective and subjective evaluations.
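The fuzzy handling of cluster boundaries can be sketched with fuzzy c-means-style memberships: instead of routing an input to one module, expert outputs are blended by soft weights. The paper does not give its exact scheme, so the formula, stand-in "experts", and data below are illustrative assumptions:

```python
import numpy as np

def fuzzy_memberships(x, centroids, m=2.0):
    """Soft cluster memberships: inputs near a boundary receive split
    weights rather than a hard 0/1 assignment (fuzzy c-means style)."""
    d = np.linalg.norm(centroids - x, axis=1) + 1e-12
    w = d ** (-2.0 / (m - 1.0))
    return w / w.sum()

centroids = np.array([[0.0, 0.0], [4.0, 0.0]])
experts = [lambda x: 1.0, lambda x: 3.0]   # stand-ins for trained modules

x = np.array([2.0, 0.0])                   # exactly on the cluster boundary
w = fuzzy_memberships(x, centroids)        # equal weights here
y = sum(wi * f(x) for wi, f in zip(w, experts))
```

Blending in this way avoids the discontinuities a hard classifier would introduce when a frame falls between acoustic clusters.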

Separation of Voiced Sounds and Unvoiced Sounds for Corpus-based Korean Text-To-Speech (한국어 음성합성기의 성능 향상을 위한 합성 단위의 유무성음 분리)

  • Hong, Mun-Ki; Shin, Ji-Young; Kang, Sun-Mee
    • Speech Sciences, v.10 no.2, pp.7-25, 2003
  • Predicting the right prosodic elements is a key factor in improving the quality of synthesized speech. Prosodic elements include break, pitch, duration, and loudness. Pitch, which is realized as fundamental frequency (F0), is the element most closely related to the quality of the synthesized speech. However, the previous method for predicting F0 reveals some problems. If voiced and unvoiced sounds are not correctly classified, the result is wrong pitch prediction, the wrong triphone unit when synthesizing voiced and unvoiced sounds, and audible clicks or vibration. This typically occurs at transitions from voiced to unvoiced sound or from unvoiced to voiced sound. The problem is not resolved by grammar-based methods, and it strongly influences the synthesized sound. Therefore, to reliably acquire correct pitch values, in this paper we propose a new model for predicting and classifying voiced and unvoiced sounds using the CART tool.
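A CART classifier for the voiced/unvoiced decision splits on acoustic features such as frame energy and zero-crossing rate. The sketch below computes those two features on synthetic frames and applies a single hand-set CART-style split; a real tree (as in the paper) learns its thresholds from labeled data, and all values here are illustrative:

```python
import numpy as np

def features(frame):
    """Two classic voiced/unvoiced features: log energy and
    zero-crossing rate (fraction of sample pairs that change sign)."""
    energy = np.log10(np.mean(frame ** 2) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    return energy, zcr

def is_voiced(frame, zcr_threshold=0.2):
    # One CART-style split: periodic (voiced) frames have low ZCR,
    # noise-like (unvoiced) frames have high ZCR.
    _, zcr = features(frame)
    return zcr < zcr_threshold

rng = np.random.default_rng(0)
sr = 16000
t = np.arange(400) / sr
voiced = np.sin(2 * np.pi * 150 * t)            # periodic, low ZCR
unvoiced = 0.1 * rng.standard_normal(len(t))    # noise-like, high ZCR
```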

Speech Rate Analysis of Dysarthric Patients with Parkinson's Disease and Multiple System Atrophy (파킨슨병과 다계통위축증 환자군 간의 말속도 비교평가)

  • Kim, Hyang-Hee; Lee, Mi-Sook; Kim, Sun-Woo; Lee, Won-Yong
    • Speech Sciences, v.10 no.4, pp.221-227, 2003
  • The diadochokinetic (DDK) speech task has been used as an evaluation tool for speakers with dysarthria for many years. This study attempted to differentially diagnose multiple system atrophy (MSA) from idiopathic Parkinson's disease (IPD) using patients' DDK performance (i.e., alternate motion rate (AMR)). The subjects included 11 cases of pathologically confirmed MSA and 16 IPD patients, all of whom presented with parkinsonian syndrome. The speech sample of each patient was analyzed acoustically using the MSP (Motor Speech Profile, a module of CSL). The results showed that the average DDK rate was significantly faster in the IPD group than in the MSA group for all three syllables (i.e., /puh/, /tuh/, and /kuh/). We propose the average DDK rate as a core clinical variable in differentiating the two pathological conditions.
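The AMR measure itself is simple: repeated syllables per second, computed from syllable onset times. MSP extracts this automatically; the sketch below shows the underlying arithmetic on hypothetical onset times:

```python
def ddk_rate(syllable_onsets):
    """Alternate motion rate (AMR): syllable repetitions per second,
    computed from onset times (in seconds) of a repeated syllable."""
    n_intervals = len(syllable_onsets) - 1
    duration = syllable_onsets[-1] - syllable_onsets[0]
    return n_intervals / duration

# Hypothetical onsets of ten repetitions of /puh/ over 1.5 s.
onsets = [0.00, 0.17, 0.33, 0.50, 0.66, 0.83, 1.00, 1.16, 1.33, 1.50]
rate = ddk_rate(onsets)   # syllables per second
```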

Speech Recognition in Noisy Environments using Wiener Filtering (Wiener Filtering을 이용한 잡음환경에서의 음성인식)

  • Kim, Jin-Young; Eom, Ki-Wan; Choi, Hong-Sub
    • Speech Sciences, v.1, pp.277-283, 1997
  • In this paper, we present a robust recognition algorithm based on the Wiener filtering method as a research tool for developing a Korean speech recognition system. We apply Wiener filtering in the cepstrum domain, because the frequency-domain method is computationally expensive and complex. The effectiveness of this method was evaluated on speaker-independent isolated Korean digit recognition tasks using discrete HMM speech recognition systems. In these tasks, we used 12th-order weighted cepstral coefficients as the feature vector and added computer-simulated white Gaussian noise at different levels to clean speech signals for recognition experiments under noisy conditions. Experimental results show that the presented algorithm provides an improvement in recognition of 5% to 20% over the spectral subtraction method.
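For reference, the classic frequency-domain Wiener gain that the paper's cepstrum-domain variant approximates more cheaply is H = SNR / (1 + SNR) per bin. A minimal NumPy sketch with a power-subtraction SNR estimate (the per-bin values below are illustrative, not from the paper):

```python
import numpy as np

def wiener_gain(noisy_power, noise_power):
    """Per-bin Wiener gain H = SNR / (1 + SNR), with clean-speech power
    estimated by subtracting the noise power (floored at a small value)."""
    speech_power = np.maximum(noisy_power - noise_power, 1e-12)
    snr = speech_power / noise_power
    return snr / (1.0 + snr)

noisy = np.array([4.0, 2.0, 1.0])    # per-bin power of noisy speech
noise = np.array([1.0, 1.0, 1.0])    # estimated noise power
gain = wiener_gain(noisy, noise)     # high-SNR bins pass, noise-only bins attenuate
enhanced = gain * noisy
```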

Literature Review of Listening Effort Using Subjective Scaling (주관적 측정을 이용한 청취 노력의 문헌 고찰)

  • Lee, Jihyeon; Lee, Seungwan; Han, Woojae; Kim, Jinsook
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery, v.60 no.3, pp.99-106, 2017
  • Listening effort is defined as a listener's mental exertion required to understand a speaker's auditory message, especially when distracting conditions are present. This review paper analyzed several subjective scaling tools used to measure listening effort in order to suggest the best tool for use with hearing-impaired listeners, who must expend much effort even in everyday life. We first explain the importance of measuring listening effort and discuss various kinds of measurements. We then analyze and categorize 15 recently published articles (i.e., from 2014 to 2016) into three topics: performance and listening effort, listening effort and fatigue, and clinical implications of listening effort. We compare the articles in terms of pros and cons and also identify 10 tools used in subjective scaling. Although none of these tools has been unified or standardized, we conclude that a 7-point scale would be the most reasonable, as a less time-consuming measurement for grading the degree of listening effort. If used together with objective measures of listening effort, subjective scaling could be a powerful tool for clinical use.