• Title/Abstract/Keywords: Speech function

694 search results (processing time 0.028 s)

The perception of clear and casual English speech under different speed conditions

  • 이서배
    • Phonetics and Speech Sciences / Vol. 10, No. 2 / pp. 33-37 / 2018
  • Korean students with much exposure to the relatively slow and clear speech used in most English classes in Korea can be expected to have difficulty understanding the casual style that is common in the everyday speech of English speakers. This research attempted to investigate an effective way to utilize casual speech in English education, by exploring the way different speech styles (clear vs. casual) affect Korean learners' comprehension of spoken English. Twenty Korean university students and two native speakers of English participated in a listening session. The English utterances were produced in different speech styles (clear slow, casual slow, clear fast, and casual fast). The Korean students were divided into two groups by English proficiency level. The results showed that the Korean students achieved 69.4% comprehension accuracy, while the native speakers of English demonstrated almost perfect results. The Korean students (especially the low-proficiency group) had more problems perceiving function words than they did perceiving content words. Responding to the different speech styles, the high-proficiency group had more difficulty listening to utterances with phonological variation than they did listening to utterances produced at a faster speed. The low-proficiency group, however, struggled with utterances produced at a faster speed more than they did with utterances with phonological variation. The pedagogical implications of the results are discussed in the concluding section.

Critical Banded Wavelet Packet-Based Spectral Subtractions for Speech Enhancement

  • Chang, Sung-Wook;Yang, Sung-Il
    • The Journal of the Acoustical Society of Korea / Vol. 23, No. 4E / pp. 125-133 / 2004
  • In this paper, we propose a critical banded wavelet packet-based spectral subtraction for speech enhancement. Critical banded wavelet packet, which reflects the human auditory system, may lead to minimization of intelligibility loss and quality improvement of the enhanced speech in the spectral domain, when combined with an appropriate spectral subtraction gain function. The proposed method shows better performance than the conventional one in comparative assessments. We also show that, for effective evaluation of enhanced speech, it is essential to consider the characteristics of speech quality measures.
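As a rough illustration of what a spectral subtraction gain function does, here is a minimal sketch. It is not the authors' critical-band wavelet packet method: it assumes the signal has already been split into frequency bands by some analysis stage, that a noise power estimate per band is available, and it uses the generic power-subtraction rule. The names `spectral_subtraction_gain`, `over_subtraction`, and `gain_floor` are hypothetical.

```python
import math


def spectral_subtraction_gain(noisy_power, noise_power,
                              over_subtraction=1.0, gain_floor=0.1):
    """Per-band amplitude gain for generic power spectral subtraction.

    noisy_power and noise_power are per-band power estimates (assumed inputs).
    The gain is sqrt(1 - over_subtraction * N / Y), clamped below by gain_floor
    so that bands dominated by noise are attenuated rather than zeroed.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if y <= 0.0:
            gains.append(gain_floor)
            continue
        ratio = 1.0 - over_subtraction * n / y
        gains.append(max(math.sqrt(max(ratio, 0.0)), gain_floor))
    return gains


def enhance_band_amplitudes(noisy_amplitudes, noise_power, **kwargs):
    """Apply the gain to per-band amplitudes of one frame."""
    noisy_power = [a * a for a in noisy_amplitudes]
    gains = spectral_subtraction_gain(noisy_power, noise_power, **kwargs)
    return [g * a for g, a in zip(gains, noisy_amplitudes)]


if __name__ == "__main__":
    # Three bands: strong speech, speech at the noise level, mostly noise.
    print(spectral_subtraction_gain([4.0, 1.0, 0.5], [1.0, 1.0, 1.0]))
```

The floor is one common way to limit the "musical noise" that plain subtraction tends to produce; the choice of band decomposition and gain rule is exactly where the proposed method differs from this sketch.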

Complexity Reduction Algorithm of Speech Coder(EVRC) for CDMA Digital Cellular System

  • Min, So-Yeon
    • Journal of Korea Multimedia Society / Vol. 10, No. 12 / pp. 1551-1558 / 2007
  • The performance of a speech coder for mobile telecommunications is evaluated largely in terms of channel capacity, noise immunity, encryption, complexity, and encoding delay. This study proposes a complexity-reduction algorithm for CDMA (Code Division Multiple Access) mobile telecommunication systems that preserves the existing speech quality and low transmission rate. The aim is to reduce computational complexity by controlling the frequency band nonuniformly when converting LPC (Linear Predictive Coding) coefficients into LSP (Line Spectrum Pairs) parameters in the EVRC (Enhanced Variable-Rate Coder, IS-127) speech coder. Experimental results show that, compared with the existing EVRC speech coder, the coder using the proposed algorithm reduces complexity by 45% on average. The LSP parameter values, synthesized speech signal, and spectrogram test results were the same as those of the existing method.

A Usability Evaluation Method for Speech Recognition Interfaces

  • 한성호;김범수
    • Journal of the Ergonomics Society of Korea / Vol. 18, No. 3 / pp. 105-125 / 1999
  • Speech is the most natural human communication medium, so using it offers many advantages. Most computer user interfaces currently rely on a mouse and keyboard, but speech recognition interfaces are expected to replace them or at least to support them. Despite these advantages, speech recognition interfaces are not yet popular because of technical difficulties such as limited recognition accuracy and slow response time. Nevertheless, it is important to optimize human-computer system performance by improving usability. This paper presents a set of guidelines for designing speech recognition interfaces and provides a method for evaluating their usability. A total of 113 guidelines are suggested to improve the usability of speech recognition interfaces. The evaluation method consists of four major procedures: user interface evaluation, function evaluation, vocabulary estimation, and recognition speed/accuracy evaluation. Each procedure is described along with appropriate techniques for efficient evaluation.

Modality-Based Sentence-Final Intonation Prediction for Korean Conversational-Style Text-to-Speech Systems

  • Oh, Seung-Shin;Kim, Sang-Hun
    • ETRI Journal / Vol. 28, No. 6 / pp. 807-810 / 2006
  • This letter presents a prediction model for sentence-final intonations for Korean conversational-style text-to-speech systems in which we introduce the linguistic feature of 'modality' as a new parameter. Based on their function and meaning, we classify tonal forms in speech data into tone types meaningful for speech synthesis and use the result of this classification to build our prediction model using a tree structured classification algorithm. In order to show that modality is more effective for the prediction model than features such as sentence type or speech act, an experiment is performed on a test set of 970 utterances with a training set of 3,883 utterances. The results show that modality makes a higher contribution to the determination of sentence-final intonation than sentence type or speech act, and that prediction accuracy improves up to 25% when the feature of modality is introduced.

Implementation of Hidden Markov Model based Speech Recognition System for Teaching Autonomous Mobile Robot

  • 조현수;박민규;이민철
    • Institute of Control, Robotics and Systems: Conference Proceedings / Proceedings of the 15th Conference of the Institute of Control, Robotics and Systems, 2000 / p. 281 / 2000
  • This paper presents an implementation of a speech recognition system for teaching an autonomous mobile robot. Using human speech as the teaching method provides a more convenient user interface for the mobile robot. To make teaching the robot easier, this study investigates an autonomous mobile robot equipped with a speech recognition function. The speech recognition system uses an HMM (Hidden Markov Model) based algorithm to recognize Korean words. A filter-bank analysis model is used as the spectral analysis method to extract features. Each recognized word is converted into a command for controlling robot navigation.

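The pipeline this abstract describes (score the features of an utterance against one HMM per word, pick the best word, map it to a robot command) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' system: it assumes the filter-bank features have already been quantized into discrete symbols, and the word models, their probabilities, and the command names (`WORD_MODELS`, `COMMANDS`, `"go"`, `"stop"`) are all hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM via the scaled forward algorithm."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the (normalized) forward variables one step.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik


def recognize(obs, models):
    """Return the word whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))


# Hypothetical two-state left-to-right word models over two symbols (0 and 1).
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
WORD_MODELS = {
    "go": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    "stop": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
}
# Hypothetical mapping from recognized word to navigation command.
COMMANDS = {"go": "MOVE_FORWARD", "stop": "HALT"}

if __name__ == "__main__":
    word = recognize([0, 0, 1, 1], WORD_MODELS)
    print(word, COMMANDS[word])
```

A real isolated-word recognizer would typically use continuous-density emissions trained on recorded speech; the discrete models here only keep the example short.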

Metadiscourse in the Bank Negara Malaysia Governor's Speech Texts

  • Aziz, Roslina Abdul;Baharum, Norzie Diana
    • Asia Pacific Journal of Corpus Research / Vol. 2, No. 2 / pp. 1-15 / 2021
  • The study explores the use of metadiscourse in the Bank Negara Malaysia Governor's speeches based on Hyland's Interpersonal Model of Metadiscourse. The corpus data consist of 343 speech texts extracted from the Malaysian Corpus of Financial English (MacFE), amounting to 688,778 tokens. Adopting both quantitative and qualitative approaches to data analysis, the study investigates (1) the overall use of metadiscourse in the Bank Negara Governor's speech texts and (2) the most prominent metadiscourse resources used and their functions in the speech texts. The findings reveal the Governor's speech texts to be interactional rather than interactive, with a rich distribution of interactional metadiscourse resources, namely engagement markers, self-mention, hedges, boosters, and attitude markers, throughout the texts. The interactional metadiscourse resources function to establish speaker-audience engagement and alignment of views, as well as to express degrees of uncertainty and certainty and attitudes. The study concludes that the speech texts are not merely informational or propositional, but rather interpersonal.

Relationship between Maternal Conversational Function and Question Type and Early Language Development

  • 이귀옥
    • The Korean Journal of Community Living Science / Vol. 17, No. 3 / pp. 3-14 / 2006
  • The purpose of this study was to investigate the relationship between conversational function and question type in mothers' utterances and their infants' language development. The subjects were 20 infants from 1;07 to 1;11 years of age in Yanji, China. Each child's spontaneous natural speech during interaction with his/her mother was videotaped for about 30 minutes. The children's and their mothers' spontaneous utterances were transcribed and coded for the number of word types and tokens, grammatical morphemes, conversational functions, and question types in the mother's language input to her child. The results showed that questions were the conversational function mothers used most frequently with their infants. The number of questions among the conversational functions in mothers' utterances correlated positively with the number of word types, morpheme types, and grammatical morphemes in infants' utterances. However, there was no correlation between mothers' language input and infant early language development.

Consideration for therapy method and oral motor function character of children with cerebral palsy

  • 임형원
    • Journal of Korean Physical Therapy Science / Vol. 13, No. 2 / pp. 121-127 / 2006
  • Therapists who treat children with feeding disorders caused by regression of oral motor function need knowledge of the sensory, perceptual, and cognitive dysfunctions that interfere with eating, of the inhibition of primitive reflexes, of oral anatomy and function, and of motor control (trunk, head, positioning of the upper and lower limbs, and muscle tone). An oral motor function program is a comprehensive rehabilitation program that requires systematic implementation and collaboration among physiotherapists, occupational therapists, speech therapists, and psychotherapists. Physical therapists in particular are not accustomed to oral motor programs, and the hope is that this approach will be disseminated widely and applied as a new therapy method in many areas (Bell's palsy, respiratory failure, speech articulation). Further study based on a holistic clinical approach is called for.

Performance Improvement in the Multi-Model Based Speech Recognizer for Continuous Noisy Speech Recognition

  • 정용주
    • Speech Sciences / Vol. 15, No. 2 / pp. 55-65 / 2008
  • Recently, the multi-model based speech recognizer has been used quite successfully for noisy speech recognition. For the selection of the reference HMM (hidden Markov model) which best matches the noise type and SNR (signal to noise ratio) of the input testing speech, the estimation of the SNR value using the VAD (voice activity detection) algorithm and the classification of the noise type based on the GMM (Gaussian mixture model) have been done separately in the multi-model framework. As the SNR estimation process is vulnerable to errors, we propose an efficient method which can classify simultaneously the SNR values and noise types. The KL (Kullback-Leibler) distance between the single Gaussian distributions for the noise signal during the training and testing is utilized for the classification. The recognition experiments have been done on the Aurora 2 database showing the usefulness of the model compensation method in the multi-model based speech recognizer. We could also see that further performance improvement was achievable by combining the probability density function of the MCT (multi-condition training) with that of the reference HMM compensated by the D-JA (data-driven Jacobian adaptation) in the multi-model based speech recognizer.

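The classification step in this abstract, choosing the reference condition whose noise Gaussian is closest in KL distance to the test noise, can be sketched as below. This is a minimal illustration, not the paper's implementation: it assumes diagonal-covariance Gaussians, uses the one-directional KL divergence from the test distribution to each reference (the paper may use a different variant), and the labels and statistics in `REFERENCES` are made up.

```python
import math


def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians (closed form)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def classify_noise_condition(test_mean, test_var, references):
    """Pick the (noise type, SNR) label whose Gaussian is closest to the test noise."""
    return min(
        references,
        key=lambda label: kl_diag_gaussians(test_mean, test_var, *references[label]),
    )


# Hypothetical reference Gaussians, keyed by (noise type, SNR in dB).
REFERENCES = {
    ("babble", 10): ([0.0, 0.0], [1.0, 1.0]),
    ("car", 0): ([3.0, 3.0], [2.0, 2.0]),
}

if __name__ == "__main__":
    # Noise statistics estimated from the non-speech part of a test utterance.
    print(classify_noise_condition([0.2, -0.1], [1.1, 0.9], REFERENCES))
```

Because each reference label carries both a noise type and an SNR, a single nearest-Gaussian lookup selects both at once, which is the point of classifying them simultaneously rather than in two separate steps.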