Designing a large recording script for open-domain English speech synthesis

  • Kim, Sunhee;Kim, Hojeong;Lee, Yooseop;Kim, Boryoung;Won, Yongkook;Kim, Bongwan
    • Phonetics and Speech Sciences / v.13 no.3 / pp.65-70 / 2021
  • This paper proposes a method for designing a large recording script for open-domain English speech synthesis. For read-aloud style text, 12 domains and 294 sub-domains were designed using text from five different news media publications. For conversational style text, 4 domains and 36 sub-domains were designed using movie subtitles. The final script consists of 43,013 sentences (27,085 read-aloud style and 15,928 conversational style), comprising 549,683 tokens and 38,356 types. The completed script is analyzed using four criteria: word coverage (type coverage and token coverage), high-frequency vocabulary coverage, phonetic coverage (diphone coverage and triphone coverage), and readability. The type coverage of the script reaches 36.86% despite its low token coverage of 2.97%. The high-frequency vocabulary coverage of the script is 73.82%, and the diphone and triphone coverage of the whole script are 86.70% and 38.92%, respectively. The average readability of whole sentences is 9.03. The analysis shows that the proposed method is effective in producing a large recording script for English speech synthesis, demonstrating good coverage in terms of unique words, high-frequency vocabulary, phonetic units, and readability.
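
The coverage criteria named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: type coverage is taken as the share of source-corpus word types that appear in the script, token coverage as the script's token count divided by the source corpus token count, and diphone coverage as the share of all ordered phone pairs over a given inventory that occur in the script. The function names `word_coverage` and `diphone_coverage` are hypothetical.

```python
def word_coverage(script_tokens, source_tokens):
    """Return (type_coverage, token_coverage) of a script relative to its source corpus.

    Assumed definitions (not taken from the paper):
      type coverage  = shared word types / word types in the source corpus
      token coverage = tokens in the script / tokens in the source corpus
    """
    script_types = set(script_tokens)
    source_types = set(source_tokens)
    type_cov = len(script_types & source_types) / len(source_types)
    token_cov = len(script_tokens) / len(source_tokens)
    return type_cov, token_cov


def diphone_coverage(phone_sequences, inventory):
    """Fraction of all ordered phone pairs over `inventory` seen in the script.

    Assumes every phone in `phone_sequences` belongs to `inventory` and that
    every ordered pair counts as a possible diphone.
    """
    seen = set()
    for phones in phone_sequences:
        # Adjacent phones within one utterance form the diphones.
        seen.update(zip(phones, phones[1:]))
    return len(seen) / (len(inventory) ** 2)


# Toy data, invented for illustration only.
source = "the cat sat on the mat the end".split()
script = "the cat sat".split()
type_cov, token_cov = word_coverage(script, source)
diphone_cov = diphone_coverage(
    [["k", "ae", "t"], ["s", "ae", "t"]], {"k", "ae", "t", "s"}
)
```

Under these assumed definitions a script can cover a large share of the vocabulary while containing only a small share of the running text, which is the pattern the abstract reports.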

The Prosodic Characteristics of Children with Cochlear Implant with Respect to the Articulation Rate, Pause, and Duration (인공와우이식 아동의 운율 특성 - 조음속도와 쉼, 지속시간을 중심으로 -)

  • Oh, Soonyoung;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.4 no.4 / pp.117-127 / 2012
  • This research reports the prosodic characteristics (articulation rate, pause characteristics, and duration) of children with cochlear implants with reference to those of children with normal hearing. Subjects are 8- to 10-year-old children, with the number of each gender balanced at 24. The dialogue speech data comprise four types of sentence patterns. Results show that 1) there is a statistically meaningful difference in articulation rate between the two groups. 2) Pauses are not observed in exclamatory and declarative sentences in children with normal hearing. While imperative sentences show no statistical difference in the number of pauses between the two groups, interrogative sentences do. 3) Declarative, exclamatory, and interrogative sentences show a statistical difference between the two groups in the duration of the sentence-final two-syllable word, while imperative sentences show none. 4) For RFP (the duration ratio of the sentence-final syllable to the penultimate syllable), no statistically meaningful difference exists between the two groups in any sentence type. 5) Lastly, RWS (the ratio of the sentence-final two-syllable word duration to the whole sentence duration) shows a statistical difference between the two groups in imperative sentences, but not in the other sentence types.
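
The two duration ratios defined in this abstract, RFP and RWS, are simple quotients. The sketch below follows the definitions as worded in the abstract; the function names and the sample durations (in seconds) are hypothetical and are not data from the study.

```python
def rfp(final_syllable_dur, penultimate_syllable_dur):
    """RFP: duration of the sentence-final syllable over the penultimate syllable."""
    return final_syllable_dur / penultimate_syllable_dur


def rws(final_word_dur, sentence_dur):
    """RWS: duration of the sentence-final two-syllable word over the whole sentence."""
    return final_word_dur / sentence_dur


# Invented durations in seconds, for illustration only.
example_rfp = rfp(0.30, 0.20)  # final syllable is 1.5 times the penultimate
example_rws = rws(0.50, 2.00)  # final word takes a quarter of the sentence
```

Because both measures are ratios, they are independent of overall speaking rate, which may be why they are reported separately from the raw word duration.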

An Analysis of $H^*$ Production by Korean Learners of English according to the Focus of English Sentences in Comparison with Native Speakers of English and Its Pedagogical Implications (영어 원어민과 비교한 한국인 학습자의 영어 문장 초점에 따른 영어 고성조 구현의 분석과 억양교육에 대한 시사점)

  • Yi, So-Pae
    • Phonetics and Speech Sciences / v.3 no.3 / pp.57-62 / 2011
  • Focused items in English sentences are usually accompanied by changes in acoustic manifestation. This paper investigates the acoustic characteristics of $H^*$ in English utterances produced by native speakers of English and Korean learners of English. To obtain more reliable results, the changes in the acoustic feature values (F0, intensity, syllable duration) were normalized by a median value and the whole duration of each utterance. Acoustic values of sentences with no focused words were compared with those of sentences with focused words within each group (Americans vs. Koreans). Sentences with focused words were also compared between the two groups. Where a significant Group x Focus Location (initial, middle, and final position in a sentence) interaction was obtained, further analysis tested the effect of Group at each Focus Location. The analysis revealed that Korean learners of English produced focused words with lower F0, lower intensity, and shorter syllable duration than native speakers of English. However, the effect of the intensity change caused by focus was not significant within each group. Further analysis of the interaction of Group and Focus Location showed that the change in F0 produced by the Korean group was significantly lower than that of the American group in the middle and final positions of sentences. Implications for intonation training are also discussed.

A Study on Ways to Environmental Values Education from Appropriateness Expression Analysis of Sentences on Environmental Education Textbooks (환경교과서의 당위적 표현 분석을 통한 환경 가치 교육 방안에 대한 고찰)

  • Cho, Seong-Hoa;Choi, Don-Hyung
    • Hwankyungkyoyuk / v.24 no.3 / pp.26-33 / 2011
  • In this study, we discuss approaches to environmental values education based on sentences in environmental education textbooks. Values education is a very important area of environmental education, but school environmental education holds differing ideas about how to teach it. One way is to teach values directly; the other is to teach them indirectly. Many recent studies have argued that the indirect way is better. The purpose of this study is to find good approaches to values education in environmental education. To that end, we analyzed sentences in four middle school environmental education textbooks, limiting the analysis to the body text, and identified immediate appropriateness expressions used for values education. The results are as follows. There are 420 immediate expressions in the textbooks, which is 11% of all expressions. The 6 major units of the textbooks do not differ in their use of these expressions. Most of the final sentences of a lesson are immediate expressions. Textbook authors should therefore consider good methods of values education. One method is to use many interrogative sentences, which help students form values of their own.

Opinion Extraction based on Syntactic Pieces

  • Aoki, Suguru;Yamamoto, Kazuhide
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.76-85 / 2007
  • This paper addresses the task of extracting opinions from given documents and classifying them as positive or negative. We propose a sentence classification method using the notion of a syntactic piece. A syntactic piece is a minimum unit of structure, used as a processing unit in place of n-grams and whole tree structures. We compute its semantic orientation and classify opinion sentences as positive or negative. We conducted an experiment on more than 5000 opinion sentences from multiple domains and showed that our approach attains high performance, with 91% precision.

Meeting Minutes Summarization using Two-step Sentence Extraction (2단계 문장 추출 방법을 이용한 회의록 요약)

  • Lee, Jae-Kul;Park, Seong-Bae;Lee, Sang-Jo
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.6 / pp.741-747 / 2010
  • These days the meeting minutes of many organizations are publicly available, and public interest in these documents is increasing. However, reading and understanding whole documents is time-consuming and tedious, even when the documents can be accessed easily. In addition, what most people want from meeting minutes is to grasp the main issues of the meeting and understand their context rather than to follow the whole discussion. This paper proposes a novel method for summarizing documents that considers the characteristics of meeting minutes. It first extracts the sentences that address the main issues. For each issue expressed in the extracted sentences, the sentences related to that issue are then extracted in the next step. Then, by transforming the extracted sentences into a tree-structured form, the results of the proposed method can be understood more easily than those of existing methods. In the experiments, the proposed method shows a remarkable improvement in performance, which implies that the proposed method is plausible for summarizing meeting minutes.

Statistical Analysis Between Size and Balance of Text Corpus by Evaluation of the effect of Interview Sentence in Language Modeling (언어모델 인터뷰 영향 평가를 통한 텍스트 균형 및 사이즈간의 통계 분석)

  • Jung Eui-Jung;Lee Youngjik
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.87-90 / 2002
  • This paper statistically analyzes the relationship between the size and balance of a text corpus by evaluating the effect of interview sentences on the language model of a Korean broadcast news transcription system. The ultimate purpose of our system is to recognize not interview speech but the anchor's and reporter's speech in broadcast news shows. However, interview sentences make up approximately $15\%$ of the text corpus gathered for constructing the language model. The characteristics of interview sentences differ from those of the anchor's and reporter's sentences in various ways, and they therefore disturb anchor- and reporter-oriented language modeling. In this paper, we evaluate the effect of interview sentences on the language model and statistically analyze the relationship between the size and balance of the text corpus by repeating the same experimental procedure while varying the size of the corpus.

Document Summarization Considering Entailment Relation between Sentences (문장 수반 관계를 고려한 문서 요약)

  • Kwon, Youngdae;Kim, Noo-ri;Lee, Jee-Hyong
    • Journal of KIISE / v.44 no.2 / pp.179-185 / 2017
  • Document summarization aims to generate a summary that is consistent and contains the highly related sentences in a document. In this study, we implemented a document summarization method that extracts highly related sentences from a whole document by considering both similarities and entailment relations between sentences. Accordingly, we propose a new algorithm, TextRank-NLI, which combines a Recurrent Neural Network based Natural Language Inference model with a graph-based ranking algorithm used in single-document extractive summarization. To evaluate the performance of the new algorithm, we conducted experiments using the same datasets as were used for the TextRank algorithm. The results indicate that TextRank-NLI shows a 2.3% improvement in performance compared to TextRank.
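
The general idea of ranking sentences on a graph whose edges reflect both similarity and entailment can be sketched as follows. This is not the authors' TextRank-NLI: the abstract does not say how the two signals are combined, so the linear mix controlled by `alpha` is an assumption, the entailment scores are supplied as a plain matrix rather than produced by an RNN-based inference model, and the function name `rank_sentences` is hypothetical. The ranking itself is a standard damped power iteration in the style of weighted PageRank.

```python
def rank_sentences(similarity, entailment, alpha=0.5, damping=0.85, iters=50):
    """Score sentences on a weighted graph built from two pairwise matrices.

    similarity[i][j] and entailment[i][j] are assumed to be non-negative
    scores for the edge from sentence i to sentence j.
    """
    n = len(similarity)
    # Assumed combination rule: a linear mix of the two signals, no self-loops.
    weights = [
        [
            0.0 if i == j
            else alpha * similarity[i][j] + (1 - alpha) * entailment[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iters):
        updated = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the
            # weight of its edge to i, normalised by j's total outgoing weight.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


# Toy matrices for three sentences, invented for illustration only.
similarity = [[0.0, 0.8, 0.7], [0.8, 0.0, 0.1], [0.7, 0.1, 0.0]]
entailment = [[0.0, 0.5, 0.5], [0.2, 0.0, 0.0], [0.2, 0.0, 0.0]]
scores = rank_sentences(similarity, entailment)
top_sentence = scores.index(max(scores))
```

An extractive summary would then take the highest-scoring sentences in document order; how many to keep is left open here.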

A Study on the Theory of Stomach and Lung in Suwen·Kailun (『소문(素問)·해론(欬論)』의 '취어위(聚於胃), 관어폐(關於肺)' 조문(條文)에 대한 고찰(考察))

  • Baik, Yousang;Kim, Jong-hyun
    • Journal of Korean Medical classics / v.30 no.3 / pp.167-180 / 2017
  • Objectives : The purpose of this paper is to settle the debates surrounding the sentences in Suwen Kailun that deal with flocking to stomach and closure in lung, by studying the assertions of historical doctors and their theories on the topic. Methods : The interpretations of annotators regarding these sentences were studied, and a text DB was searched to collect and analyze materials related to theories of the relationship between stomach and lung. Results : The sentences on flocking to stomach and closure in lung, judging from their contexts, seem to be related to the symptom of Sanjiao ke or Liufu ke. However, they may be pointing to the internal organs' ke as a whole, based on the close relationship between stomach and lung. They could mean either that an abnormality in the mechanism of stomach and lung could cause ke, or that Zhuoqi could accumulate inside the stomach to cause phlegm-fluids, thereby blocking the thorax and causing cough. The theory of Warm disease, too, provides a number of treatment suggestions for stomach and lung damage, such as supporting Yin and dispersing dampness. Conclusions : The study of the sentences regarding flocking to stomach and closure in lung is expected to provide not only an analysis of the sentences but also a perspective and a method for clinical treatments.

A Prosodic Study of Focus in English Relative Sentences (영어 관계사 문장의 초점에 관한 운율 연구)

  • Ahn, Gil-Soon;Jeon, Pyung-Man;Kim, Hyun-Gee
    • Speech Sciences / v.8 no.4 / pp.207-214 / 2001
  • This study describes the focus in nine structure types of English relative clauses (SS, SO, SP, PS, PO, PP, OS, OO, OP), classified according to the grammatical role of both the head that the relative clause modifies and the gap within the relative clause. The informants for this study are 2 middle school students and 4 high school students from four formal classrooms in Korea, and 2 native speakers. To obtain accurate intonation patterns, Visi-Pitch II Model 3300 was used for data analysis. Major findings are as follows: (1) the intonation of English relative clauses showed prosodic prominence at the head, but the English learners in Korea did not show prosodic prominence; (2) the fact that all heads have prosodic prominence indicates that the head in relative clauses has prosodic focus; (3) the fact that the English learners have flat pitch across whole sentences points to a problem in intonation education.
