• Title/Summary/Keyword: speechTool

Search Result 155

Estimation of Articulatory Characteristics of Vowels Using 'ArtSim' (Artsim'을 이용한 모음의 조음점 추정에 관한 연구)

  • Kim Dae-Ryun;Cho Cheol-Woo
    • MALSORI / no.35_36 / pp.121-129 / 1998
  • In this paper, the articulatory simulator 'Artsim' is used as an experimental tool to examine the articulatory characteristics of six vowels. Each vowel is defined by a set of articulatory points derived from its vocal tract area function and tongue shape. Each point is varied systematically to synthesize vowels, and the synthesized sounds are evaluated by human listeners. Finally, the distribution of each vowel within the vowel space is obtained. The experimental results verify that the articulatory simulator can be used effectively to investigate the articulatory characteristics of speech.

The Effect of the Telephone Channel to the Performance of the Speaker Verification System (전화선 채널이 화자확인 시스템의 성능에 미치는 영향)

  • 조태현;김유진;이재영;정재호
    • The Journal of the Acoustical Society of Korea / v.18 no.5 / pp.12-20 / 1999
  • In this paper, we compare speaker verification performance on speech data collected in a clean environment and over a telephone channel. To improve verification performance on channel speech, we study which feature parameters and which preprocessing are effective in the channel environment. The experimental speech DB consists of pairs of Korean digits, chosen with a text-prompted system in mind. The speech features analyzed include LPCC (Linear Predictive Cepstral Coefficient), MFCC (Mel Frequency Cepstral Coefficient), PLP (Perceptual Linear Prediction), and LSP (Line Spectrum Pair). Filtering as a preprocessing step to remove channel noise is also studied. To remove or compensate for the channel effect in the extracted features, cepstral weighting, CMS (Cepstral Mean Subtraction), and RASTA (RelAtive SpecTrAl) processing are applied. We also report speech recognition performance for each feature and processing method, and compare it with speaker verification performance. HTK (HMM Tool Kit) 2.0 is used to evaluate the features and processing methods. Using separate thresholds for male and female speakers, we compare the EER (Equal Error Rate) on the clean speech data and the channel data. Our simulation results show that the best speaker verification performance in terms of EER is achieved by applying a band-pass filter (150~3800 Hz) in preprocessing to remove low-band and high-band channel noise and then extracting MFCC from the filtered speech.
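
Two of the steps named in this abstract, CMS and the EER measurement, can be illustrated with a minimal Python sketch. It is not the authors' HTK 2.0 setup: the function names, the list-of-lists feature layout, and the single-threshold sweep (the abstract uses separate thresholds for male and female speakers) are assumptions made only for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    frames: list of feature vectors (e.g. MFCC), one list of floats per frame.
    A stationary channel adds a constant offset in the cepstral domain,
    so removing the utterance-level mean compensates for it.
    """
    if not frames:
        return []
    n_coeffs = len(frames[0])
    means = [sum(f[i] for f in frames) / len(frames) for i in range(n_coeffs)]
    return [[f[i] - means[i] for i in range(n_coeffs)] for f in frames]


def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep candidate thresholds and return (eer, threshold).

    A trial is accepted when score >= threshold. The EER is read off at the
    threshold where false acceptance and false rejection rates are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
    print(f"EER = {eer:.2f} at threshold {threshold}")
```

To mimic per-gender thresholds, the same function could be called separately on the male and female trial scores.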

Development of the Korean version of ICF e-Learning tool

  • Lee, HaeJung;Song, JuMin
    • The Journal of Korean Physical Therapy / v.31 no.2 / pp.88-93 / 2019
  • Purpose: The aim of the study was to develop a Korean version of an ICF e-Learning tool (KICF e-Learning tool). Methods: The ICF e-Learning tool was translated and adapted as follows: two translators developed Korean versions independently, and a consensus version of the translation was then produced. An expert committee composed of five experts from physiotherapy, occupational therapy, speech pathology, and social welfare reviewed the consensus Korean version to make a beta version of the tool. A field test was conducted to determine whether the Korean version of the tool was easy to understand and suitable for use in ICF learning. Feedback from the field test was used for the final adaptation of the KICF e-Learning tool. Results: One-hundred and twenty-six volunteers (40 males and 76 females) were invited to examine the KICF e-Learning tool. The participants reported various levels of ICF knowledge, from none to very good. Forty-eight participants reported no knowledge of ICF. The majority of participants (n=84) reported that the Korean terms and expressions in the tool were easy to understand, and one-hundred fourteen participants would recommend the tool to another person. The Korean cases would help a Korean audience study the ICF using the tool. Conclusion: The KICF e-Learning tool was developed and is ready for public use to support consistent ICF education. However, an advanced module still needs to be developed.

The Influence of Feedback in the Simulated Patient Case-History Training among Audiology Students at the International Islamic University Malaysia

  • Dzulkarnain, Ahmad Aidil Arafat;Sani, Maryam Kamilah Ahmad;Rahmat, Sarah;Jusoh, Masnira
    • Journal of Audiology & Otology / v.23 no.3 / pp.121-128 / 2019
  • Background and Objectives: There is scant evidence on the use of simulations in audiology (especially in Malaysia) for case-history taking, although this technique is widely used for training medical and nursing students. Feedback is one of the important components of simulation training; however, it is unknown whether feedback by instructors influences the simulated patient (SP) training outcome for case-history taking among audiology students. The aim of the present study is to determine whether SP training with feedback, in addition to the standard role-play and seminar training, is an effective learning tool for audiology case-history taking. Subjects and Methods: Twenty-six second-year undergraduate audiology students participated. A cross-over study design was used. All students initially attended two hours of seminar and role-play sessions. They were then divided into three types of training: 1) SP training (Group A), 2) SP training with feedback (Group B), and 3) no additional training (Group C). After two training sessions, the students changed their type of training: 1) Groups A and C received SP training with feedback, and 2) Group B received no additional training. All groups were assessed at three points: 1) pre-test, 2) intermediate, and 3) post-test. The normalized median score differences between and within the respective groups were analysed using non-parametric tests at the 95% confidence level. Results: Groups with additional SP training (with and without feedback) showed a significantly higher normalized gain score than the no-training group (p<0.05). Conclusions: SP training (with or without feedback) is a beneficial learning tool for history taking for students majoring in audiology.

The Influence of Feedback in the Simulated Patient Case-History Training among Audiology Students at the International Islamic University Malaysia

  • Dzulkarnain, Ahmad Aidil Arafat;Sani, Maryam Kamilah Ahmad;Rahmat, Sarah;Jusoh, Masnira
    • Korean Journal of Audiology / v.23 no.3 / pp.121-128 / 2019
  • Background and Objectives: There is scant evidence on the use of simulations in audiology (especially in Malaysia) for case-history taking, although this technique is widely used for training medical and nursing students. Feedback is one of the important components of simulation training; however, it is unknown whether feedback by instructors influences the simulated patient (SP) training outcome for case-history taking among audiology students. The aim of the present study is to determine whether SP training with feedback, in addition to the standard role-play and seminar training, is an effective learning tool for audiology case-history taking. Subjects and Methods: Twenty-six second-year undergraduate audiology students participated. A cross-over study design was used. All students initially attended two hours of seminar and role-play sessions. They were then divided into three types of training: 1) SP training (Group A), 2) SP training with feedback (Group B), and 3) no additional training (Group C). After two training sessions, the students changed their type of training: 1) Groups A and C received SP training with feedback, and 2) Group B received no additional training. All groups were assessed at three points: 1) pre-test, 2) intermediate, and 3) post-test. The normalized median score differences between and within the respective groups were analysed using non-parametric tests at the 95% confidence level. Results: Groups with additional SP training (with and without feedback) showed a significantly higher normalized gain score than the no-training group (p<0.05). Conclusions: SP training (with or without feedback) is a beneficial learning tool for history taking for students majoring in audiology.

Learner-Generated Digital Listening Materials Using Text-to-Speech for Self-Directed Listening Practice

  • Moon, Dosik
    • International Journal of Internet, Broadcasting and Communication / v.12 no.4 / pp.148-155 / 2020
  • This study investigated learners' perceptions of using self-generated listening materials based on text-to-speech (TTS). After taking an online training session on how to make listening materials for extensive listening practice outside the classroom, the learners practiced with their self-generated listening materials for 10 weeks in a self-directed way. The results show that a majority of the learners found the TTS-based listening materials helpful for reducing listening anxiety and enhancing self-confidence and motivation, with a positive effect on their listening ability. The learners' general satisfaction can be attributed to some beneficial features of TTS-based listening material, including freedom to choose what they want to learn, convenient access to the material, availability of various native speakers' voices, and the novelty of digital tools. This suggests that TTS-based digital listening materials can be a useful educational tool to support learners' self-directed listening practice outside the classroom in EFL settings.

Speech Enhancement Using Blind Signal Separation Combined With Null Beamforming

  • Nam Seung-Hyon;Jr. Rodrigo C. Munoz
    • The Journal of the Acoustical Society of Korea / v.25 no.4E / pp.142-147 / 2006
  • Blind signal separation is known as a powerful tool for enhancing noisy speech in many real-world environments. In this paper, it is demonstrated that the performance of blind signal separation can be further improved by combining it with a null beamformer (NBF). Cascading blind signal separation with null beamforming is equivalent to decomposing the received signals into direct parts and reverberant parts. Investigation of the beam patterns of the null beamformer and blind signal separation reveals that the directional null of the NBF mainly reduces the direct parts of the unwanted signals, whereas blind signal separation reduces the reverberant parts. Further, it is shown that this decomposition of the received signals can be exploited to solve the local stability problem. Therefore, faster and improved separation can be obtained by first removing the direct parts with null beamforming. Simulation results using real office recordings confirm this expectation.

Characteristics of the Korean speakers' voice under easy Korean, difficult Korean and English reading situations (한국인의 쉬운 한국어, 어려운 한국어, 영어 읽기 상황에서의 음성 특성)

  • Kim, Ji-Eun
    • Phonetics and Speech Sciences / v.8 no.1 / pp.1-7 / 2016
  • The purpose of this study is to identify the acoustic characteristics of the voice under stressful and relaxed conditions. Ten male undergraduate students participated in this study and produced the vowels 아 /a/, 에 /e/, and 이 /i/ in English reading and difficult Korean reading (stressful conditions) and in easy Korean reading (relaxed condition). F0, jitter, shimmer, NHR, F1, F2, and F3 values were then measured and analyzed. The results demonstrate that the speech parameters related to stress are jitter, shimmer, and NHR, in that these values are lower in the relaxed situation (easy Korean reading) than in the stressful situations (English and difficult Korean reading). This study provides a foundation for verifying that the analysis of acoustic characteristics can serve as a quantitative tool for measuring stress levels.
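
The abstract does not say how jitter and shimmer were computed. As a rough illustration, a minimal Python sketch under the assumption of the common "local" definitions: the mean absolute difference between consecutive cycles divided by the mean value, applied to glottal periods for jitter and to peak amplitudes for shimmer. The function name and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values divided by their mean.

    Applied to glottal cycle periods this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (both as ratios).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


if __name__ == "__main__":
    # Hypothetical cycle measurements, for illustration only.
    periods_ms = [8.0, 8.1, 7.9, 8.0, 8.2]
    amplitudes = [0.50, 0.52, 0.49, 0.51, 0.50]
    print(f"jitter  = {100 * local_perturbation(periods_ms):.2f} %")
    print(f"shimmer = {100 * local_perturbation(amplitudes):.2f} %")
```

Under these definitions a perfectly regular voice gives 0, and larger values indicate more cycle-to-cycle irregularity.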

A Basic Performance Evaluation of the Speech Recognition APP of Standard Language and Dialect using Google, Naver, and Daum KAKAO APIs (구글, 네이버, 다음 카카오 API 활용앱의 표준어 및 방언 음성인식 기초 성능평가)

  • Roh, Hee-Kyung;Lee, Kang-Hee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.12 / pp.819-829 / 2017
  • In this paper, we first describe the current state of speech recognition technology and identify its basic techniques and algorithms, and then explain the code flow of the APIs needed for speech recognition. We use the application programming interfaces (APIs) of Google, Naver, and Daum KaKao, the speech recognition API providers with the best-known search engines, to create a voice recognition app in Android Studio. We then perform a speech recognition experiment on standard-language and dialect speech according to gender, age, and region, and organize the recognition rates into a table. Experiments were conducted on the dialects of the Gyeongsang-do, Chungcheong-do, and Jeolla-do provinces, where dialect features are strong. Comparative experiments were also conducted on standard-language speech. For the resulting sentences, accuracy is checked in terms of word spacing, final consonants, postpositions, and words, and the errors of each type are counted. As a result, we aim to present the advantages of each API according to its speech recognition rate and to establish a basic framework for using the APIs most efficiently.

Fluency Scoring of English Speaking Tests for Nonnative Speakers Using a Native English Phone Recognizer

  • Jang, Byeong-Yong;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.7 no.2 / pp.149-156 / 2015
  • We propose a new method for automatic fluency scoring of English speaking tests spoken by nonnative speakers in a free-talking style. The proposed method differs from previous methods in that it does not require transcribed texts for the spoken utterances. First, an input utterance is segmented into a phone sequence by a phone recognizer trained on native speech databases. For each utterance, a feature vector with 6 features is extracted by processing the segmentation results of the phone recognizer. Then, a fluency score is computed by applying support vector regression (SVR) to the feature vector. The parameters of the SVR are learned from the rater scores for the utterances. In computer experiments with 3 tests taken by 48 Korean adults, we show that speech rate, phonation time ratio, and smoothed unfilled pause rate are the best features for fluency scoring. The correlation between the rater score and the SVR score is shown to be 0.84, which is higher than the correlation of 0.78 among raters. Although this correlation is slightly lower than the correlation of 0.90 obtained when the transcribed texts are given, it implies that the proposed method can be used as a preprocessing tool for fluency evaluation of speaking tests.
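
The feature-extraction step described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the segment format `(label, start_sec, end_sec)`, the `"sil"` silence label, the 0.2-second pause threshold, and the feature definitions are hypothetical and not taken from the paper, and the pause rate below is a plain (unsmoothed) count. The SVR stage is omitted.

```python
def fluency_features(segments, min_pause=0.2):
    """Compute simple fluency features from a phone segmentation.

    segments: time-ordered list of (label, start_sec, end_sec) tuples, where
    the label "sil" marks silence (an assumed convention).
    min_pause: silences at least this long count as unfilled pauses (assumed).
    """
    total = segments[-1][2] - segments[0][1]  # utterance duration in seconds
    phones = [s for s in segments if s[0] != "sil"]
    speech_time = sum(end - start for _, start, end in phones)
    pauses = [s for s in segments if s[0] == "sil" and s[2] - s[1] >= min_pause]
    return {
        "speech_rate": len(phones) / total,           # phones per second
        "phonation_time_ratio": speech_time / total,  # fraction of time spent speaking
        "unfilled_pause_rate": len(pauses) / total,   # pauses per second
    }


if __name__ == "__main__":
    # Hypothetical recognizer output, for illustration only.
    example = [("h", 0.0, 0.5), ("sil", 0.5, 1.0), ("ay", 1.0, 2.0)]
    print(fluency_features(example))
```

A regressor such as SVR would then be trained on vectors of such features against the rater scores.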