• Title/Abstract/Keywords: target utterances

Building a Sentential Model for Automatic Prosody Evaluation

  • 윤규철
    • 말소리와 음성과학 / Vol. 1, No. 4 / pp. 47-59 / 2009
  • This paper proposes an automatic technique for evaluating the prosody of an English sentence uttered by Korean learners of English. The underlying hypothesis is that the consistency of manual prosody scoring is reflected in an abstract space, the prosody evaluation model, constructed from the three physical properties of prosody considered in this paper: the fundamental frequency (F0) contour, the intensity contour, and the segmental durations. The evaluation begins by building a prosody evaluation model for the sentence. To create the model, utterances of the target sentence from native speakers of English and from Korean learners are manually scored for prosody by either native teachers of English or Korean phoneticians. Multiple native utterances are selected from the manual scoring as the "model" native utterances, against which all the Korean learners' utterances, as well as the model utterances themselves, can be semi-automatically evaluated by comparison in terms of the three prosodic aspects [7]. Each learner utterance, when compared with the multiple model native utterances, produces multiple coordinates in a three-dimensional space of prosody evaluation, each axis of which corresponds to one of the three prosodic aspects. The 3D coordinates from all the comparisons form a prosody evaluation model for the particular sentence, and the associated manual scores can reveal regions of particular scores. The model can then be used as a predictive model against which other Korean utterances of the target sentence can be evaluated. The model built from a Korean phonetician's scores appears to support the hypothesis.

Treatment Effect of a Modified Melodic Intonation Therapy (MMIT) in Korean Aphasics

  • Ko, Do-Heung;Jeong, Ok-Ran
    • 음성과학 / Vol. 4, No. 2 / pp. 91-102 / 1998
  • The present study modified the conventional Melodic Intonation Therapy (MIT) in three respects: the number of syllables in adjacent target utterances (ATU), the melody patterns of ATU, and initial listening to the melody and intoned speech with the eyes closed. The modified Melodic Intonation Therapy (MMIT) was applied to two Korean patients with severe aphasia. The patients exhibited severely nonfluent aphasia resulting from a left CVA (cerebrovascular accident). The purpose of the modification was to avoid perseveration and improve reflective listening skills. First, the treatment program avoided ATU with the same number of syllables. Second, four different melody patterns were developed: rising type, falling type, V-type, and inverted V-type. One type of prosodic pattern was preceded and followed by another type of melody. These two variations were intended to decrease perseverative behaviors. Finally, the patients kept their eyes closed while the clinician played and hummed a target melody at the initial stage of the program, in order to improve reflective listening skills. A single-subject alternating treatment design was used. The effects of MMIT were compared with those of the conventional MIT. Varying the number of syllables and the type of melodic pattern decreased perseverative behaviors and produced more correct names. Initial listening to the target melody with the eyes closed seemed to increase the patients' attentiveness and result in more fluent production of target utterances. Probable reasons for the effectiveness of MMIT are discussed.

주파수 특성 기저벡터 학습을 통한 특정화자 음성 복원 (Target Speaker Speech Restoration via Spectral bases Learning)

  • 박선호;유지호;최승진
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 36, No. 3 / pp. 179-186 / 2009
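  • This paper proposes a target-speaker speech restoration algorithm that uses stereo microphones in real environments with noise and reverberation, for cases where training utterances of the target speaker are available. To this end, convolutive blind source separation (CBSS), which separates sources in reverberant environments, is combined with post-processing methods to form a system that removes noise and reverberation from noisy multipath signals and restores only the target speaker's speech. Specifically, non-negative matrix factorization (NMF) is used to learn, from the target speaker's training speech, basis vectors that preserve the speaker's spectral characteristics, and two post-processing stages based on these basis vectors are proposed. First, CBSS, the intermediate stage of the system, takes the multipath signals as input and outputs independent sources (two channels), and the channel closer to the target speaker's speech is selected automatically (channel selection stage). Next, the noise and other interference sources remaining in the selected channel are removed, so that the target speaker's speech is finally restored free of noise and reverberation (restoration stage). Because both post-processing stages operate on basis vectors learned from the target speaker's speech, the spectral characteristics unique to that speaker can be exploited efficiently for restoration. The paper thus presents a way to combine prior information about a source with CBSS, improving on the separation results of conventional CBSS while restoring only the target speaker's speech. Experiments confirm that the proposed method successfully restores the target speaker's speech in noisy, reverberant environments.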
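
The basis-learning and channel-selection steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes non-negative magnitude spectrograms as input, uses the standard multiplicative-update rules for Frobenius-norm NMF, and selects the channel with the lowest reconstruction error under the learned bases. The function names, iteration counts, and the selection criterion are all hypothetical; the abstract does not specify them.

```python
import numpy as np

def learn_spectral_bases(V, n_bases, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative spectrogram V (freq x frames) as W @ H.

    W holds the spectral basis vectors, H their per-frame activations.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def activation_error(V, W, n_iter=200, eps=1e-9, seed=0):
    """Fit activations with the bases W held fixed; return relative error."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

def select_channel(channels, W):
    """Assumed criterion: pick the channel best explained by the bases."""
    errors = [activation_error(V, W) for V in channels]
    return int(np.argmin(errors))
```

A channel whose spectrum is well explained by the target speaker's bases yields a low error, so `select_channel` returns its index. The restoration stage is not sketched here because the abstract gives too little detail about it.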

모방 발화의 음향음성학적 연구(3) -전문 성대 모사자의 자료를 중심으로- (An Acoustic Study on the Voice Imitation (3) - Based on a professional voice imitator's speech -)

  • 안병섭;박미영
    • 대한음성학회지:말소리 / No. 52 / pp. 1-14 / 2004
  • In this study, we investigated the acoustic characteristics of utterances imitated by a professional voice imitator, focusing on properties such as vowel formants and f0 distribution. To identify the patterns of voice imitation by a professional imitator, we compared the imitator's voice data with the target speakers' voice data. The professional imitator, Mr. Bae, produced utterances imitating the voices of former President Kim, the comedian Choi, and the singer Bae. Auditorily, the imitator was judged to imitate all the target speakers' voices successfully. However, acoustic examination showed that the imitator was better at imitating the singer Bae's voice, in that the imitator's and the singer Bae's voices were more alike with respect to vowel formants and f0 distribution. We infer that this is because the imitator's normal voice is very similar to the singer Bae's voice. For the other two target speakers, by contrast, the patterns of vowel formants and f0 distribution in the imitator's imitations differed from those of the target speakers' voices.

운율교육을 위한 운율이식기술 개선 방안 연구 (Improvement of Prosody Transplantation Technology for English Prosody Education and Its Application)

  • 이서배
    • 대한음성학회지:말소리 / No. 61 / pp. 49-62 / 2007
  • This study focused on improving prosody transplantation technology for use in effective prosody education. Issues that made the technology a less acceptable tool for prosody education were addressed. Instead of merely copying the target pitch onto a learner's utterances, the target pitch was rescaled in semitones before the transplantation. In so doing, distortion of the signal was minimized and the transplanted utterance retained a sound quality no different from that of the learner's own utterances. Instead of manual transplantation, an automatic procedure was proposed to increase the reliability and consistency of the outcome and to enable real-time processing. The perceptual performance of the automatic transplantation was evaluated in a perception experiment, which showed that the automatic transplantation was as good as the manual process.

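
The idea of rescaling a target pitch contour in semitones before transplanting it can be illustrated with the sketch below. It is only one plausible reading of the abstract, not the paper's procedure: it assumes that F0 values are given in Hz for voiced frames only, that each speaker's reference pitch is the geometric mean of their contour, and that the target's semitone excursions are re-centered on the learner's reference. All function names are hypothetical.

```python
import math

def hz_to_semitones(f_hz, ref_hz):
    """Distance of f_hz from ref_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)

def semitones_to_hz(st, ref_hz):
    """Inverse of hz_to_semitones."""
    return ref_hz * 2.0 ** (st / 12.0)

def geometric_mean(values):
    """Mean in the log domain, a natural center for pitch values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def rescale_target_pitch(target_f0, learner_f0):
    """Keep the target's semitone excursions, centered on the learner's pitch.

    Both arguments are lists of positive F0 values in Hz (voiced frames only).
    """
    target_ref = geometric_mean(target_f0)
    learner_ref = geometric_mean(learner_f0)
    return [
        semitones_to_hz(hz_to_semitones(f, target_ref), learner_ref)
        for f in target_f0
    ]
```

Working in semitones preserves the musical intervals of the target contour (an octave rise stays an octave rise) while keeping the result within the learner's own register, which is one way to limit the distortion the abstract mentions.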

COLT와 TALOS 활용 동영상 분석으로 살펴본 우수 초등영어수업의 의사소통성 양상 (Analysis of Communicative Features in an Excellent Elementary English Class Using COLT and TALOS)

  • 유희연;김정렬
    • 한국콘텐츠학회논문지 / Vol. 18, No. 2 / pp. 269-279 / 2018
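  • This study transcribed a class video that had been selected as an excellent elementary English class and uploaded to an education portal site, and analyzed it with Part 2 of COLT and the TALOS low-inference analysis to identify its communicativeness and the features of the class. Many studies have used COLT to examine the communicativeness of elementary English classes, but few have used TALOS. In addition, most studies have concluded that elementary English classes are not communicative. We therefore analyzed an elementary English class with Part 2 of COLT and TALOS to assess its communicativeness and to identify features related to the nature of elementary English education. One class video uploaded to the Seoul education portal site as an excellent class was analyzed using COLT (Communicative Orientation of Language Teaching) and TALOS (Target Language Observation Scheme). The results confirmed that the class was communication-centered: it showed a high quantity and quality of student utterances, high student participation, student-initiated utterances, a high proportion of utterances giving unpredictable information, and sustained utterances with a high rate of expansion. The high student participation, the emphasis on affective factors, and the internalization achieved through ample unconscious repetition during enjoyable, immersive activities were prominent in the class and are features related to the characteristics of elementary English education.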

한국어 대화체 음성언어 번역시스템에서의 개념기반 번역시스템 (Concept-based Translation System in the Korean Spoken Language Translation System)

  • 최운천;한남용;김재훈
    • 한국정보처리학회논문지 / Vol. 4, No. 8 / pp. 2025-2037 / 1997
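  • The concept-based translation system, a part of the spoken language translation system, translates speech recognition results into other languages. This paper describes a system that analyzes Korean in the travel-planning domain and translates it into English, Japanese, and Korean. To handle conversational sentences, many of which are ill-formed, concept-based translation attempts translation in units of meaning without using syntactic information such as morphological analysis, and aims to convey the speaker's intention accurately. Using about 280 concepts and the hierarchy among them, it converts the recognition result into a concept structure and then generates output in the other languages. For efficient processing of Korean, a token separator based on reference words and an automatic grammar corrector were developed. A post-processor for each language was also developed to produce natural output sentences.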

화자 적응 기술을 이용한 한국어 화자 확인 (Korean Speaker Verification Using Speaker Adaptation Methods)

  • 최동진;오영환
    • 대한음성학회:학술대회논문집 / Proceedings of the 대한음성학회 2006 Spring Conference / pp. 139-142 / 2006
  • Speaker verification systems can be implemented using speaker adaptation methods if the amount of speech available for each target speaker is too small to train the speaker model. This paper shows experimental results using well-known adaptation methods, namely Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR). Experimental results using Korean speech show that MLLR is more effective than MAP for short enrollment utterances.

F0 변화율로 본 한국어 억양 패턴의 음향 특성 (Korean Intonation Patterns from the Viewpoint of F0 Percentage Change)

  • 이지연;이호영
    • 말소리와 음성과학 / Vol. 5, No. 1 / pp. 123-130 / 2013
  • Previous research on Korean intonation has mainly focused on $F_0$ target frequencies, $F_0$ slope, and the duration of intonation patterns. This study investigated Korean intonation patterns, both boundary and phrasal tones, in relation to the $F_0$ percentage change between pitch targets. We measured the percentage change between the pitch targets of both boundary and phrasal tones. Additionally, the $F_0$ change between the preceding pitch target and the first pitch target of the boundary tone, and the $F_0$ targets of the sequence of two LH phrasal tones ('LH + LH'), were also measured. Two phrasal tones, LHLH and HLH, were compared with 'LH + LH' and with the 'HLH' in the LHLH pattern, respectively. We found that the percentage change between pitch targets in the phrasal tone is fixed to some extent. This helps explain why the slope of the phrasal tone is closely related to the number of syllables and the duration of the phrasal tone, as discussed in previous studies. Since we analyzed the intonation patterns using utterances from a large speech corpus, the results of this paper are expected to be useful in building a larger annotated corpus of Korean.
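
The measure at the center of this abstract, the $F_0$ percentage change between successive pitch targets, can be written out as a short sketch. The abstract does not give the exact formula, so the code assumes the usual definition of percentage change relative to the earlier target; the function names and the example values are hypothetical.

```python
def f0_percentage_change(f0_from, f0_to):
    """Percentage change from one pitch target to the next (values in Hz).

    Assumed definition: change relative to the earlier target.
    """
    return (f0_to - f0_from) / f0_from * 100.0

def tone_percentage_changes(targets):
    """Percentage change between each pair of adjacent pitch targets."""
    return [
        f0_percentage_change(a, b)
        for a, b in zip(targets, targets[1:])
    ]

# Hypothetical LHL sequence of pitch targets in Hz (not data from the study).
changes = tone_percentage_changes([200.0, 250.0, 200.0])
```

Under this definition a rise from 200 Hz to 250 Hz is a 25% change, while the fall back to 200 Hz is a 20% drop, so rises and falls over the same interval are not symmetric.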

Speech Rhythm Metrics for Automatic Scoring of English Speech by Korean EFL Learners

  • 장태엽
    • 대한음성학회지:말소리 / No. 66 / pp. 41-59 / 2008
  • Knowledge of the linguistic rhythm of the target language plays a major role in foreign language proficiency. This study attempts to discover valid rhythm features that can be used in the automatic assessment of non-native English pronunciation. Eight previously proposed and two novel rhythm metrics are investigated with 360 English read-speech tokens obtained from 27 Korean learners and 9 native speakers. Some of the speech-rate-normalized interval measures and above-word-level metrics are found to be effective enough to be applied to automatic scoring, as they are significantly correlated with speakers' proficiency levels. It is also shown that metrics need to be selected dynamically depending on the structure of the target sentences. Results from a preliminary auto-scoring experiment using multiple regression analysis suggest that appropriate control of unexpected input utterances is also desirable for better performance.

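
As an illustration of what a speech-rate-normalized interval measure looks like, the sketch below computes two widely used rhythm metrics from a list of interval durations: the normalized Pairwise Variability Index (nPVI) and Varco. Whether these are among the ten metrics examined in the study is an assumption, since the abstract does not name them. The use of the population standard deviation in `varco` is also an assumption.

```python
import statistics

def npvi(durations):
    """Normalized Pairwise Variability Index over successive intervals.

    Each difference is divided by the mean of the pair, which makes the
    index insensitive to overall speech rate. Needs at least two durations.
    """
    pairs = zip(durations, durations[1:])
    terms = [abs(a - b) / ((a + b) / 2.0) for a, b in pairs]
    return 100.0 * sum(terms) / len(terms)

def varco(durations):
    """Standard deviation of durations as a percentage of their mean."""
    return 100.0 * statistics.pstdev(durations) / statistics.mean(durations)
```

Both functions take durations in any consistent unit (for example, vocalic interval lengths in seconds). Doubling every duration leaves both values unchanged, which is the sense in which they are normalized for speech rate.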