• Title/Summary/Keyword: Topic Sentence

Method of Extracting the Topic Sentence Considering Sentence Importance based on ELMo Embedding (ELMo 임베딩 기반 문장 중요도를 고려한 중심 문장 추출 방법)

  • Kim, Eun Hee;Lim, Myung Jin;Shin, Ju Hyun
    • Smart Media Journal / v.10 no.1 / pp.39-46 / 2021
  • This study proposes a method of extracting a summary from a news article that takes into account the importance of each sentence in the article. Sentence importance is calculated from features that affect it: the probability that a sentence is a topic sentence, its similarity to the article title and to the other sentences, and its position. We hypothesize that a topic sentence has characteristics distinct from those of ordinary sentences, and train a deep learning-based classification model to obtain a topic sentence probability for each input sentence. In addition, using the pre-trained ELMo language model, the similarity between sentences is calculated from sentence vectors that reflect context information and is extracted as a sentence feature. The topic sentence classification performance of the LSTM and BERT models was 93% accuracy, 96.22% recall, and 89.5% precision. When the importance of each sentence was calculated by combining the extracted features, topic sentence extraction performance improved by about 10% compared to the existing TextRank algorithm.
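
The feature combination described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the weights, the position score, and the function names are hypothetical, the sentence vectors are plain lists standing in for ELMo embeddings, and the topic sentence probabilities are assumed to come from a separately trained classifier. The abstract does not say how the features are actually combined.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_importance(sent_vecs, title_vec, topic_probs,
                        weights=(0.4, 0.3, 0.2, 0.1)):
    """Score each sentence as a weighted sum of four features.

    The weights are hypothetical placeholders, not values from the paper.
    """
    w_prob, w_title, w_others, w_pos = weights
    n = len(sent_vecs)
    scores = []
    for i, vec in enumerate(sent_vecs):
        # Similarity with the article title.
        title_sim = cosine(vec, title_vec)
        # Mean similarity with every other sentence in the article.
        others = [cosine(vec, sent_vecs[j]) for j in range(n) if j != i]
        others_sim = sum(others) / len(others) if others else 0.0
        # Assumed position feature: earlier sentences score higher.
        position = 1.0 - i / n
        scores.append(w_prob * topic_probs[i] + w_title * title_sim
                      + w_others * others_sim + w_pos * position)
    return scores


def extract_topic_sentence(sent_vecs, title_vec, topic_probs):
    """Return the index of the highest-scoring sentence."""
    scores = sentence_importance(sent_vecs, title_vec, topic_probs)
    return max(range(len(scores)), key=scores.__getitem__)
```

In practice the weights would need to be tuned on labeled articles, and the toy vectors replaced by contextual sentence embeddings.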

A Comparative Study on Teaching Chinese and Korean Topic Sentences (주제문을 통한 한국학생의 중국어 학습지도 연구 - 중·한 주제문의 비교를 중심으로)

  • Choo, Chui-Lan
    • Cross-Cultural Studies / v.19 / pp.389-409 / 2010
  • Chinese is a topic-prominent language, so learners of Chinese need to understand its discourse functions. Most Korean students think that Chinese sentences should appear in S-V-O order, and they repeatedly make mistakes when they use Chinese. I think Korean is very similar to Chinese in its discourse functions. Hence, in this paper, I try to find a method of teaching Chinese topic sentences by comparing Chinese with Korean in light of discourse function. I think that when Korean students know how to use Korean topic sentences to explain the discourse functions of Chinese, they will not make such mistakes. With this in mind, chapter 2 presents various topic sentences to show that 'topic' is very important in Chinese sentences, which is why Chinese is called a topic-prominent language. In chapter 3, I analyze sentences that students produced and highlight the reasons for their mistakes. The main reason is that they always think Chinese should appear in S-V-O order; they do not understand why some sentences appear in O-(S)V or S-O-V order. This shows that they do not know what a topic sentence is or how to make one. Sometimes I have them translate the sentences into Korean, but they also produce Korean sentences that follow the Chinese S-V-O order. Therefore, I think that, under these circumstances, having them translate into and speak Korean topic sentences, get a feel for Chinese topic sentences, and then say and make Chinese topic sentences is naturally critical to their training.

Discourse-level Prosody Produced by Korean Learners of English

  • Kim, Boram
    • Phonetics and Speech Sciences / v.6 no.4 / pp.67-77 / 2014
  • This study investigated (1) whether Korean learners of English use discourse-level prosody in L2 production as native speakers of English do, and (2) whether discourse-level prosody is also found in the Korean language, as is evident in the prosody of native speakers of English. The study compared the production of the same 15 sentences in two types of reading materials, sentence-level and discourse-level. It analyzed the onset pitch, sentence mean pitch, and pause length to examine paratone (intonational paragraph) realization in discourse-level speech. The results showed that in L2 discourse-level prosody, the Korean speakers were limited in displaying paratone and did not make a significant difference between sentence-level and discourse-level prosody. On the other hand, in L1 discourse-level text, both English and Korean participants demonstrated paratone using pitch. However, the two groups differed in their use of prosodic cues. In using pauses, the ES group paused longer before both the orthographically marked and unmarked topic sentences. The KS group paused longer only before the orthographically marked topic sentence in both L1 and L2 text reading. In the comparison of sentence-level and discourse-level prosody, the topic sentences were marked by different prosodic cues: the English participants used a higher sentence mean pitch, and the Korean participants used a higher onset pitch.

A Topic Classification System Based on Clue Expressions for Person-Related Questions and Passages (단서표현 기반의 인물관련 질의-응답문 문장 주제 분류 시스템)

  • Lee, Gyoung Ho;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.4 no.12 / pp.577-584 / 2015
  • In general, a Q&A system retrieves passages by matching the terms of a question in order to find an answer to the question. However, it is difficult for a Q&A system to find a correct answer because too many passages are retrieved and term matching alone is not enough to rank them by their relevance to the question. To alleviate this problem, we introduce a topic for a sentence and adopt it for ranking in a Q&A system. We define a set of person-related topic classes and clue expressions that can indicate the topic of a sentence. The topic classification system proposed in this paper determines a target topic for an input sentence by using clue expressions, which are manually collected from a corpus. We explain the architecture of the topic classification system and evaluate the performance of its components.
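
As a rough illustration of clue-expression matching as described in this abstract, the sketch below assigns a topic class to a sentence by counting hand-written clue expressions. The topic classes, the clue expressions, and the function name are all hypothetical examples; the abstract does not list the paper's actual classes or clues, and the real system may score matches differently.

```python
# Hypothetical person-related topic classes and clue expressions,
# standing in for clues manually collected from a corpus.
CLUES = {
    "birth": ["was born", "born in"],
    "education": ["graduated from", "studied at"],
    "career": ["worked as", "was appointed"],
}


def classify_topic(sentence, clues=CLUES):
    """Return the topic class whose clue expressions occur most often, or None."""
    text = sentence.lower()
    # Count occurrences of each class's clue expressions in the sentence.
    counts = {
        topic: sum(text.count(expr) for expr in exprs)
        for topic, exprs in clues.items()
    }
    best = max(counts, key=counts.get)
    # No clue matched at all: leave the sentence unclassified.
    return best if counts[best] > 0 else None
```

A sentence such as "She graduated from Seoul National University." would be assigned the hypothetical class "education", and a sentence with no matching clue is left unclassified.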

Sentence Abstraction : Sentence Revision with Concept Abstraction (문장추상화 : 개념추상화를 도입한 문장교열)

  • Kim, Gon;Yang, Jaegun;Bae, Jaehak;Lee, Jonghyuk
    • The KIPS Transactions:PartB / v.11B no.5 / pp.563-572 / 2004
  • Sentence abstraction is a simplification of a sentence that preserves its communicative function. It accomplishes sentence revision and concept abstraction simultaneously. Sentence revision is a method that resolves the discrepancy between a person's thoughts and the semantics expressed in sentences. Concept abstraction is an expression of general ideas acquired from the common elements of concepts. Sentence abstraction selects the main constituents of given sentences and describes their upper concepts by detecting their semantic information. This enables sentence revision and concept abstraction simultaneously. In this paper, a syntactic parser LGPI+ and an ontology OfN are utilized for sentence abstraction. The sentence abstracter SABOT makes use of LGPI+ and OfN. SABOT processes the result of parsing and selects the candidate words for sentence abstraction. This paper computes the sentence recall of the main sentences and the topic hit ratio of the selected sentences with the text understanding system using sentence abstraction. The sources are 58 paragraphs in 23 stories. As a result, the sentence recall is about 54 ~ 72% and the topic hit ratio is about 76 ~ 86%. This paper verified that sentence abstraction enables sentence revision that can select the topic sentences of a given text efficiently and concept abstraction that can improve the depth of text understanding.

Corpus-based analysis of the usage of Korean markers -(n)un and -i/ka in editorial texts

  • Kim, Kyoung-Young
    • Language and Information / v.19 no.2 / pp.19-36 / 2015
  • The aim of this paper is to investigate the usage of Korean markers -(n)un and -i/ka in editorial texts focusing on information structure. Noun phrases ending with the markers -(n)un and -i/ka were annotated semi-automatically using a corpus obtained from an online newspaper. Two important factors to determine the choice of markers were examined with the annotated data: referential givenness/newness and position in a sentence. Referential givenness and newness were adopted as indicators of information structure, topic and focus respectively. In addition to quantitative analysis, qualitative analysis was conducted on the selected data. The results suggest that both the marker -(n)un and -i/ka could carry a topic and a focus reading. Sentence position also played a crucial role in determining the marker, and the marker -i/ka was used more frequently in a later position of a sentence than the marker -(n)un.

Focus, Contrastive Topic and Theories of Focus

  • Wee, Hae-Kyung
    • Language and Information / v.5 no.1 / pp.87-105 / 2001
  • This paper categorizes currently available theories of focus into two major types: a 'discourse structure approach' (DSA) and a 'sentence structure approach' (SSA). The former, DSA, refers to a type of approach that analyzes focus only in terms of the discourse structure in which a focused sentence occurs. The alternative semantics approach, which is the most widely available theory of focus, belongs to this type. The latter, SSA, refers to a type of theory that analyzes focus in terms of sentence-internal structure. This study supports the SSA by revealing some empirical problems of the DSA that arise in analyzing two different kinds of focus, the A-accented focus and the B-accented focus (contrastive topic), and provides a brief sketch of a comprehensive analysis of focus and contrastive topic.

Building a Korean-English Parallel Corpus by Measuring Sentence Similarities Using Sequential Matching of Language Resources and Topic Modeling (언어 자원과 토픽 모델의 순차 매칭을 이용한 유사 문장 계산 기반의 위키피디아 한국어-영어 병렬 말뭉치 구축)

  • Cheon, JuRyong;Ko, YoungJoong
    • Journal of KIISE / v.42 no.7 / pp.901-909 / 2015
  • In this paper, we propose a method for building a Korean-English parallel corpus from Wikipedia by finding similar sentences based on language resources and topic modeling. We first apply language resources (the Wiki-dictionary, numbers, and the Daum online dictionary) sequentially to match words. We construct the Wiki-dictionary from Wikipedia titles. To take advantage of Wikipedia, we use translation probabilities in the Wiki-dictionary for word matching. In addition, we improve the accuracy of the sentence similarity measure by using word distributions based on topic modeling. In the experiments, a previous study achieved an F1-score of 48.4% with only language resources based on linear combination, and 51.6% when topic modeling over entire word distributions was additionally considered. Our proposed method, which adds translation probability to sequential matching of language resources, achieved 58.3%, which is 9.9 percentage points better than the previous study. When sequential matching of language resources was combined with topic modeling that considers important word distributions, the proposed system achieved 59.1%, which is 7.5 percentage points better than the previous study.

Analysis of Question and Sentence in High Environmental Science Textbook (고등학교 환경과학 교과서의 질문과 문장 내용 분석)

  • Lee, Bong-Hun;Moon, Seong-Bae;Moon, Jung-Dae
    • Journal of Environmental Science International / v.6 no.3 / pp.213-218 / 1997
  • The question style in a high school environmental science textbook was examined in terms of the placement, frequency, and type of questions, and the kind of scientific inquiry process elicited by the questions in each topic of the textbook was analyzed using the Textbook Questioning Strategy Assessment Instrument (TQSAI). The average number of questions per topic was only 0.6. The total number of questions in the textbook was very small: there were 8 non-experiential questions and 3 experiential questions. The total number of sentences was 1,236, and the ratio of the number of questions to the number of sentences was 0.9%. Non-experiential questions were more frequent than experiential ones. In the action part of the textbook, there were more kinds of question styles than in the main part.

Passage Retrieval and Calculation Method of Topic Field by Using Field-Associated Terms (분야연상어를 이용한 화제분야의 계산방법과 단락검색)

  • Lee Samuel-Sangkon
    • The KIPS Transactions:PartB / v.12B no.1 s.97 / pp.57-68 / 2005
  • It is important to segment a text without depending on any text-embedded auxiliary information. This paper presents a technique for dividing a text into field-coherent passages. The method is based on extracting field-associated terms from the text and measuring how the topics grow, shrink, and shift from sentence to sentence. We propose measures of topic continuity and topic transition and suggest how they can be used to find the boundaries between passages. After collecting 12,500 documents, we obtained 88% average precision and 78% recall on the Korean training set.
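
The idea of tracking topic continuity from sentence to sentence can be sketched as below. This is a minimal illustration, not the paper's measures: the term-to-field dictionary is a hypothetical toy, continuity is assumed here to be the Jaccard overlap of the field sets of adjacent sentences, and the threshold is arbitrary.

```python
import re

# Hypothetical field-associated terms: each term points to one topic field.
FIELD_TERMS = {
    "pitcher": "sports",
    "inning": "sports",
    "election": "politics",
    "senate": "politics",
}


def sentence_fields(sentence, field_terms):
    """Return the set of fields evoked by the field-associated terms in a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {field_terms[w] for w in words if w in field_terms}


def continuity(fields_a, fields_b):
    """Assumed continuity measure: Jaccard overlap of two field sets."""
    if not fields_a or not fields_b:
        return 0.0
    return len(fields_a & fields_b) / len(fields_a | fields_b)


def find_boundaries(sentences, field_terms=FIELD_TERMS, threshold=0.5):
    """Return indices i where a new passage is assumed to start at sentence i."""
    fields = []
    for sentence in sentences:
        current = sentence_fields(sentence, field_terms)
        # A sentence with no field-associated term inherits the previous fields,
        # so it does not trigger a boundary by itself.
        if not current and fields:
            current = fields[-1]
        fields.append(current)
    return [
        i for i in range(1, len(fields))
        if continuity(fields[i - 1], fields[i]) < threshold
    ]
```

With the toy dictionary above, a text that moves from two sentences about baseball to two about politics would yield a single boundary at the third sentence (index 2).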