• Title/Abstract/Keyword: speech situation

Search results: 122

텍스트의 의미 정보에 기반을 둔 음성컨트롤 태그에 관한 연구 (A Study of Speech Control Tags Based on Semantic Information of a Text)

  • 장문수;정경채;강선미
    • 음성과학 / Vol. 13, No. 4 / pp.187-200 / 2006
  • Speech synthesis technology is widely used, and its application area is broadening to automatic response services, learning systems for people with disabilities, and so on. However, the sound quality of speech synthesizers has not yet reached a level that satisfies users. Existing synthesizers generate rhythm only from interval information such as spaces and commas, or from a few punctuation marks such as question marks and exclamation marks, so it is difficult to produce natural, human-like rhythm even with a large speech database. One way to compensate for this is to select the rhythm after processing the language at a higher level of information. This paper proposes a method for generating tags that control rhythm by analyzing the meaning of a sentence together with speech situation information. We use Systemic Functional Grammar (SFG) [4], which analyzes the meaning of a sentence with speech situation information, taking into account the preceding sentence, the situation of the conversation, the relationships among the participants, and so on. In this study, we generate Semantic Speech Control Tags (SSCT) from the results of the SFG meaning analysis and speech waveform analysis.
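
The SSCT format itself is not given in this listing, so the following is only a rough sketch of the general idea: turning the output of a meaning/situation analysis into prosody-control markup. The tag names, situation fields, and prosody mappings are hypothetical, SSML-like placeholders rather than the authors' actual scheme.

```python
# Hypothetical illustration only: the SSCT tag set is not published here, so the
# tag names, situation features, and prosody mappings below are invented.

from dataclasses import dataclass

@dataclass
class SituationInfo:
    """Stand-in for the speech-situation features a meaning analysis might yield."""
    sentence_type: str    # e.g. "statement", "question", "command"
    formality: str        # e.g. "formal", "casual"
    emphasis_words: list  # words the analysis marks as prominent

def to_control_tags(text: str, info: SituationInfo) -> str:
    """Wrap a sentence in hypothetical prosody-control tags (SSML-like)."""
    rate = "slow" if info.formality == "formal" else "medium"
    pitch = "+10%" if info.sentence_type == "question" else "+0%"
    words = [
        f"<emph>{w}</emph>" if w in info.emphasis_words else w
        for w in text.split()
    ]
    return f'<prosody rate="{rate}" pitch="{pitch}">{" ".join(words)}</prosody>'

if __name__ == "__main__":
    info = SituationInfo("question", "casual", ["tomorrow"])
    print(to_control_tags("are you coming tomorrow", info))
```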


비대면 음성언어치료의 현황과 전망 (Current Status and Perspectives of Telepractice in Voice and Speech Therapy)

  • 이승진
    • 대한후두음성언어의학회지 / Vol. 33, No. 3 / pp.130-141 / 2022
  • Voice and speech therapy can be performed in various ways depending on the situation, although it is generally performed face to face. Telepractice refers to the provision of specialized voice and speech services (assessment, therapy, and counseling) by speech-language pathologists applying telecommunication technology from a remote location. Recently, owing to the pandemic and the active use of non-face-to-face platforms, interest in the telepractice of voice and speech therapy has increased. Moreover, a growing body of literature advocates its clinical usefulness and non-inferiority to traditional face-to-face intervention. In this review, the existing discussions, guidelines, and preliminary studies on non-face-to-face voice and speech therapy are summarized, and recommendations on tools for telepractice are provided.

중국어 회화문에 대한 의사소통 분석단위에 기초한 접근 (An Approach to Chinese Conversations in the Textbook based on Social Units of Communication)

  • 박찬욱
    • 비교문화연구 / Vol. 49 / pp.127-150 / 2017
  • This paper examines the conversations in a Chinese textbook in light of four of the social units of communication proposed by Hymes (1972), the speech community, speech situation, speech event, and speech act, and considers how the results can be used in classroom activities. Each conversational passage is regarded as a coherent set of speech events composed of multiple speech acts under a particular situational context, and each sentence is assumed to be a linguistic act that can carry illocutionary force as an utterance. Before analyzing the conversations with Hymes's (1972) units, the paper first reviews the characteristics of each unit of analysis. Then, based on these unit concepts, the textbook conversations are analyzed from the perspective of combinations of acts rather than combinations of morphological and syntactic units. The paper further shows that a conversational passage that might be viewed as a combination of complex syntactic units can instead be a combination of a rather small, limited set of analysis units, and adds suggestions on how these results can be used in classroom activities.
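
To make the nesting of Hymes's (1972) units concrete, the following is a minimal sketch of how one conversational passage could be represented as a speech situation containing speech events, each composed of speech acts. The dialogue fragment and act labels are invented; the actual textbook data is not reproduced here.

```python
# A minimal sketch (invented dialogue fragment) of Hymes's nested units:
# speech situation > speech event > speech act.

restaurant_scene = {
    "speech_community": "learners of Chinese using the textbook dialogues",
    "speech_situation": "ordering food at a restaurant",
    "speech_events": [
        {
            "event": "placing an order",
            "speech_acts": [
                {"speaker": "customer", "act": "request"},
                {"speaker": "waiter", "act": "confirm"},
                {"speaker": "customer", "act": "thank"},
            ],
        }
    ],
}

# Seen this way, a dialogue paragraph is a combination of a small set of act
# types (request, confirm, thank, ...) rather than of syntactic units.
print([a["act"] for e in restaurant_scene["speech_events"] for a in e["speech_acts"]])
```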

자동차 환경에서 Oak DSP 코어 기반 음성 인식 시스템 실시간 구현 (A Real-Time Implementation of Speech Recognition System Using Oak DSP core in the Car Noise Environment)

  • 우경호;양태영;이충용;윤대희;차일환
    • 음성과학 / Vol. 6 / pp.219-233 / 1999
  • This paper presents a real-time implementation of a speaker-independent speech recognition system based on a discrete hidden Markov model (DHMM). The system was developed for a car navigation application, targeting an on-chip VLSI speech recognition system built on the fixed-point Oak DSP core of DSP GROUP LTD. The recognition procedure was analyzed in C in order to implement fixed-point real-time algorithms. Based on this analysis, the algorithms were improved so that they operate in real time and the recognition result can be verified as soon as the speech ends, by processing all recognition routines within a frame. Car noise is colored noise concentrated heavily in the low-frequency band below 400 Hz. For noise-robust processing, high-pass filtering and liftering of the distance measure of the feature vectors were applied to the recognition system. Recognition experiments were performed on twelve isolated command words. The recognition rates of the baseline recognizer were 98.68% with the car stopped and 80.7% with the car running; with the noise processing methods, the recognition rate while running improved to 89.04%.
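
A minimal sketch of the two noise-handling steps mentioned in the abstract is given below: a high-pass pre-filter against low-frequency car noise and a lifter that de-weights the lowest cepstral coefficients in the distance measure. The cutoff frequency, filter order, and lifter shape are illustrative assumptions, not the paper's actual fixed-point DSP implementation.

```python
# Illustrative floating-point sketch of the noise-robust front end: high-pass
# filtering below ~400 Hz and a liftered distance on cepstral features.
# Parameters here are guesses, not the values used in the paper.

import numpy as np
from scipy.signal import butter, lfilter

def highpass(signal, fs, cutoff_hz=400.0, order=4):
    """Attenuate low-frequency (car-noise) energy before feature extraction."""
    b, a = butter(order, cutoff_hz / (fs / 2.0), btype="highpass")
    return lfilter(b, a, signal)

def liftered_distance(c_test, c_ref, n_skip=2):
    """Euclidean distance on cepstra with the lowest coefficients de-emphasized."""
    weights = np.ones_like(c_test)
    weights[:n_skip] = 0.0  # drop the coefficients most affected by noise
    return float(np.sqrt(np.sum(weights * (c_test - c_ref) ** 2)))
```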


u-Green City 구현을 위한 상황인지기반 지능형 음성인식 시스템 (Intelligent Speech Recognition System based on Situation Awareness for u-Green City)

  • 조영임;장성순
    • 제어로봇시스템학회논문지 / Vol. 15, No. 12 / pp.1203-1208 / 2009
  • A Green IT based u-City is a u-City that incorporates the Green IT concept. Adopting situation awareness can reduce the processing required for Green IT. For example, recognizing all speech sounds captured by CCTV in a u-City environment takes a great deal of processing time and cost, whereas recognizing only emergency sounds on CCTV requires far less. Therefore, to detect emergency states dynamically through CCTV, we propose an advanced speech recognition system. For this purpose, we adopt a hidden Markov model (HMM) for feature extraction, and we adopt a Wiener filter technique to eliminate noise from the signals coming from CCTV in the u-City environment.
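
Below is a minimal single-channel sketch of the Wiener-filter noise-suppression step described in the abstract, assuming the leading frames of the signal contain noise only; the STFT settings and noise estimate are assumptions, and the HMM-based recognition stage is not shown.

```python
# Simple spectral Wiener-filter sketch: estimate stationary noise from the
# leading frames (an assumption for this sketch) and apply a per-bin gain.

import numpy as np
from scipy.signal import stft, istft

def wiener_denoise(x, fs, noise_seconds=0.5):
    """Suppress stationary noise using a Wiener gain estimated from leading frames."""
    f, t, X = stft(x, fs, nperseg=512)          # hop = 256 samples by default
    power = np.abs(X) ** 2
    n_frames = max(1, int(noise_seconds * fs / 256))
    noise_power = power[:, :n_frames].mean(axis=1, keepdims=True)
    gain = np.maximum(power - noise_power, 0.0) / np.maximum(power, 1e-12)
    _, y = istft(gain * X, fs, nperseg=512)
    return y
```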

4-6세 이중언어아동의 비유창성 특성 연구 (Disfluency Characteristics in 4-6 Age Bilingual Children)

  • 이수복;심현섭;신문자
    • 대한음성학회:학술대회논문집 / 대한음성학회 2007년도 한국음성과학회 공동학술대회 발표논문집 / pp.78-83 / 2007
  • The purpose of the present study was to investigate the disfluency characteristics of Korean-English bilingual children compared with Korean monolingual children matched for chronological age. Twenty-eight children, 14 bilingual and 14 monolingual, participated. The experimental tasks consisted of a play situation and a task situation. The conclusions were as follows. (a) The total disfluency score of the bilingual children was significantly higher than that of the monolingual children, as was their normal disfluency score. The most frequent disfluency type in both groups was interjection, and all children showed higher scores in the task situation than in the play situation; the bilingual children thus differed from the monolingual children both quantitatively and qualitatively in disfluency scores and types. (b) The bilingual children were divided into two groups, 6 Korean-dominant and 8 English-dominant, and all showed more disfluency in their non-dominant language, with interjection again the most frequent type in both groups. (c) The higher the chronological age and the expressive language test score, the lower the disfluency score; the earlier the age of exposure to the second language, the higher the disfluency score. There was no correlation between the length of residence abroad (in months) and disfluency.
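
As an illustration of how disfluency counts by type can be turned into a score, the following sketch tallies types and normalizes per 100 syllables. The categories and the per-100-syllable normalization are common conventions assumed here, not necessarily the exact scoring procedure used in this study.

```python
# Illustrative disfluency scoring: count occurrences per type and report an
# overall rate per 100 syllables. Conventions assumed, not the study's protocol.

from collections import Counter

def disfluency_score(disfluency_types, total_syllables):
    """Return counts per type and an overall rate per 100 syllables."""
    counts = Counter(disfluency_types)
    rate = 100.0 * sum(counts.values()) / max(total_syllables, 1)
    return counts, rate

if __name__ == "__main__":
    sample = ["interjection", "interjection", "revision", "word repetition"]
    print(disfluency_score(sample, total_syllables=250))
```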


대본 내용에 의한 정서음성 수집과정의 정규화에 대하여 (Normalization in Collection Procedures of Emotional Speech by Scriptual Context)

  • 조철우
    • 대한음성학회:학술대회논문집 / 대한음성학회 2006년도 춘계 학술대회 발표논문집 / pp.123-125 / 2006
  • One of the biggest unsolved problems in emotional speech acquisition is how to create or find a situation that elicits a state close to the natural or desired emotion from speakers. We propose a method of collecting emotional speech data through scriptual context. Several contexts were chosen from drama scripts by experts in the area and were divided into six classes according to their content. Two actors, one male and one female, read the text after recognizing the emotional situations in the script.


언어장애인의 명료도에 영향을 미치는 말요인: 문헌연구 (The Role of Speech Factors in Speech Intelligibility: A Review)

  • 김수진
    • 대한음성학회지:말소리 / No. 43 / pp.25-44 / 2002
  • The intelligibility of a spoken message is influenced by a number of factors. Intelligibility is a joint product of the speaker and the listener, and it also varies with the nature of the language context and the context of communication. Thus a single intelligibility score cannot be ascribed to a given individual apart from the listener and the listening situation. Nevertheless, there is a clinical and research need for quantitative and analytic measures of intelligibility. Before such an index can be developed, the crucial factors need to be examined, and the most significant among them are the speech factors of the speaker. This review surveys the literature on the contribution of segmental and suprasegmental factors to speech intelligibility in speakers with hearing impairment, alaryngeal speech, and motor speech disorders.


자율차량 안전을 위한 긴급상황 알림 및 운전자 반응 확인 시스템 설계 (A Design of the Emergency-notification and Driver-response Confirmation System(EDCS) for an autonomous vehicle safety)

  • 손수락;정이나
    • 한국정보전자통신기술학회논문지 / Vol. 14, No. 2 / pp.134-139 / 2021
  • The autonomous vehicle market is currently commercializing Level 3 autonomous vehicles, which still require the driver's attention. Beyond Level 3, the most important aspect of Level 4 autonomous vehicles is vehicle safety, because unlike Level 3, autonomous driving at Level 4 and above must continue even when the driver is inattentive. This paper therefore proposes an Emergency-notification and Driver-response Confirmation System (EDCS) for autonomous vehicle safety, which notifies the driver of an emergency while the driver is inattentive and recognizes the driver's response. The EDCS uses an emergency delivery module to convert the emergency into text and deliver it to the driver as speech, and a driver response confirmation module to recognize the driver's reaction to the emergency and decide whether to hand driving authority over to the driver. In experiments, the HMM of the emergency delivery module learned speech 25% faster than an RNN and 42.86% faster than an LSTM, and the Tacotron2 of the driver response confirmation module converted text to speech about 20 ms faster than Deep Voice and about 50 ms faster than Deep Mind. The EDCS can therefore train its neural network models efficiently and confirm the driver's response in real time.
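
The following is a high-level sketch of how the two modules described above could fit together: an emergency-delivery module that renders the emergency as spoken text, and a driver-response module that decides whether to hand control back. The `tts` and `recognizer` objects are placeholders, not the HMM/Tacotron2 components actually evaluated in the paper.

```python
# High-level sketch of the two EDCS modules; `tts` and `recognizer` are
# placeholder objects, not real library APIs or the paper's implementation.

from dataclasses import dataclass

@dataclass
class EmergencyEvent:
    kind: str       # e.g. "obstacle ahead" (hypothetical label)
    severity: int   # 1 (low) .. 5 (critical)

def notify_driver(event: EmergencyEvent, tts) -> None:
    """Emergency delivery module: turn the emergency into text and speak it."""
    message = f"Warning, severity {event.severity}: {event.kind}. Please respond."
    tts.speak(message)  # placeholder TTS call

def driver_acknowledged(recognizer, timeout_s: float = 3.0) -> bool:
    """Driver response confirmation module: did the driver react in time?"""
    reply = recognizer.listen(timeout=timeout_s)  # placeholder ASR call
    return reply is not None

def handle_emergency(event: EmergencyEvent, tts, recognizer) -> str:
    notify_driver(event, tts)
    if driver_acknowledged(recognizer):
        return "hand driving authority to driver"
    return "keep autonomous control and mitigate"
```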

정상 성인 말속도의 청지각적/음향학적 평가에 관한 기초 연구: 지역에 따른 말속도 차이를 중심으로 (Preliminary study of the perceptual and acoustic analysis on the speech rate of normal adult: Focusing the differences of the speech rate according to the area)

  • 이현정
    • 말소리와 음성과학 / Vol. 6, No. 3 / pp.73-77 / 2014
  • The purpose of this study was to investigate regional differences in speech rate using perceptual and acoustic analyses. Regional variation in overall speech rate and articulation rate was examined across speaking situations (picture description, free conversation, and story retelling) with 14 normal adults (7 from Gyeongnam and 7 from the Honam area). The results show that perceptually rated speech rate differed significantly between the two regional varieties for the picture description task, with the Honam speakers rated as speaking significantly faster than the Gyeongnam speakers, whereas the acoustic analysis showed no difference between the two groups on that task. In contrast, there were significant regional differences in overall speech rate and articulation rate in the other two speaking situations, free conversation and story retelling. This suggests that perceptual evaluation of free conversation and story retelling should be addressed in future research, and that further studies of speech rate are needed under various conditions, including other regions and SLPs with broader backgrounds and experience. SLPs also need more training and experience to assess patients properly and reliably.
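
For reference, the two acoustic measures compared in this study are commonly computed as follows: overall speech rate counts syllables over the whole sample (pauses included), while articulation rate excludes pause time. The 250 ms pause threshold in this sketch is a common convention assumed here, not necessarily the author's setting.

```python
# Sketch of overall speech rate vs. articulation rate in syllables per second.
# The pause threshold is an assumed convention, not the study's parameter.

def speech_rates(n_syllables, total_dur_s, pause_durs_s, pause_threshold_s=0.25):
    """Return (overall speech rate, articulation rate) in syllables per second."""
    pause_time = sum(p for p in pause_durs_s if p >= pause_threshold_s)
    overall_rate = n_syllables / total_dur_s
    articulation_rate = n_syllables / max(total_dur_s - pause_time, 1e-6)
    return overall_rate, articulation_rate

if __name__ == "__main__":
    print(speech_rates(120, total_dur_s=30.0, pause_durs_s=[0.4, 0.8, 0.3]))
```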