• Title/Abstract/Keywords: news paper articles

Search results: 149 items (processing time: 0.022 s)

Entity Linking For Tweets Using User Model and Real-time News Stream (유저 모델과 실시간 뉴스 스트림을 사용한 트윗 개체 링킹)

  • Jeong, Soyoon;Park, Youngmin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science / Vol. 26, No. 4 / pp. 435-452 / 2015
  • Recent research on Entity Linking (EL) has attempted to disambiguate entities by using a knowledge base to handle semantic relatedness and up-to-date information. However, EL for tweets using a knowledge base is still unsatisfactory, mainly because tweet data mostly consist of short, noisy contexts and real-time issues. In this paper, we propose an approach to building an EL system that links ambiguous entities to the corresponding entries in a given knowledge base by exploring news articles and the user history. Using news articles, the system can overcome the problem of Wikipedia coverage (i.e., not handling real-time issues). In addition, given that users usually post tweets related to their particular interests, the system, by referring to the user history, works robustly and effectively with a small amount of tweet data. We created a dataset of Korean tweets containing ambiguous entities, randomly selected from tweets extracted over a seven-day period, and evaluated the system using this dataset. We use accuracy as the metric (the number of correct answers given by the system divided by the size of the dataset). The experimental results show that our system achieves an accuracy of 67.7% and outperforms EL methods that exclusively use a knowledge base.
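
The accuracy metric described in this abstract (correct answers divided by dataset size) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `accuracy` and the toy knowledge-base IDs are hypothetical and do not come from the paper.

```python
def accuracy(predicted, gold):
    """Fraction of mentions whose predicted entry matches the gold entry."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("empty dataset")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Hypothetical toy data: knowledge-base entry IDs for four ambiguous mentions.
gold = ["KB:apple_inc", "KB:jaguar_car", "KB:amazon_company", "KB:mercury_planet"]
predicted = ["KB:apple_inc", "KB:jaguar_animal", "KB:amazon_company", "KB:mercury_planet"]

print(accuracy(predicted, gold))  # 3 of 4 mentions are linked correctly
```

Under this definition, the reported 67.7% would mean that roughly two out of three ambiguous mentions in the dataset were linked to the correct entry.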

Critical Discourse Analysis of '5.18' in 'Honam' and 'Yeongnam' Local Newspapers by Using Corpus (코퍼스를 이용한 '호남'과 '영남' 지역신문에서의 '5.18'에 대한 비판적 담화분석)

  • Lee, Sukeui;Jin, Duhyeon
    • Korean Linguistics / Vol. 76 / pp. 83-112 / 2017
  • In this paper, newspaper articles were collected through '5.18' keyword search results, and a news corpus was constructed from the collected data. The ideological differences regarding '5.18' in the articles of 'Honam' and 'Yeongnam' local newspapers were investigated, and these differences in local newspaper discourse were analyzed through objective figures. The subjects of the newspaper articles and the frequencies of nouns and predicates were analyzed, and the use and meaning of the intended vocabulary were examined. Analysis of the article titles showed that the discourse written in 'Honam' emphasized the necessity of re-recognition of 5.18. In both regions, the word 'Gwangju' is often used. However, 'Gwangju' in 'Honam' newspapers means a spiritual space, not a physical space. In Honam regional newspapers, there are many words describing the events, such as 'shoot' and 'fire', which call for recollection and memory of '5.18'. In newspaper discourse analysis, contrastive analysis between local newspapers has received very little attention; this study was therefore conducted to analyze the discourse among local newspapers.

Design of a Korean Question-Answering System for News Item Retrieval (우리말 신문기사 검색을 위한 질문응답시스템 구현에 관한 연구)

  • Chung, Young-Mee
    • Journal of the Korean Society for Information Management / Vol. 4, No. 1 / pp. 3-23 / 1987
  • This paper describes a question-answering system that can automatically analyze input texts and questions in Korean natural language. The texts used for the research were newspaper articles in the specific domain of sports news. The system consists of a set of COBOL programs and an associated set of data files containing a lexicon, case grammar, linguistic rules, and a database. The system employs two retrieval functions: fact retrieval and passage retrieval. Therefore, input questions can be answered in the form of either sentences or factual data.

Sentence Compression of Headline-style Abstract for Displaying in Small Devices (작은 화면 기기에서의 출력을 위한 신문기사 헤드라인 형식의 문장 축약 시스템)

  • Lee, Kong-Joo
    • The KIPS Transactions: Part B / Vol. 12B, No. 6 / pp. 691-696 / 2005
  • In this paper, we present a pilot system that can compress a Korean sentence automatically using knowledge extracted from news articles and their headlines. A set of compressed sentences can be presented as an abstract of a document. Because a compressed sentence is in headline style, it can be easily displayed on small devices, such as mobile phones and other handheld devices. A preliminary experiment has shown our compression system to be promising.

Company Name Discrimination in Tweets using Topic Signatures Extracted from News Corpus

  • Hong, Beomseok;Kim, Yanggon;Lee, Sang Ho
    • Journal of Computing Science and Engineering / Vol. 10, No. 4 / pp. 128-136 / 2016
  • It is impossible for any human being to analyze the more than 500 million tweets that are generated per day. Lexical ambiguities on Twitter make it difficult to retrieve the desired data and relevant topics. Most of the solutions for the word sense disambiguation problem rely on knowledge base systems. Unfortunately, it is expensive and time-consuming to manually create a knowledge base system, resulting in a knowledge acquisition bottleneck. To solve the knowledge-acquisition bottleneck, a topic signature is used to disambiguate words. In this paper, we evaluate the effectiveness of various features of newspapers on the topic signature extraction for word sense discrimination in tweets. Based on our results, topic signatures obtained from a snippet feature exhibit higher accuracy in discriminating company names than those from the article body. We conclude that topic signatures extracted from news articles improve the accuracy of word sense discrimination in the automated analysis of tweets.
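
The idea of a topic signature, a weighted list of terms characteristic of one sense of a word, can be sketched as below. The abstract does not state how the authors weight terms, so the smoothed log-ratio used here is an assumption, and the function names and the toy news snippets are invented for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification)."""
    return text.lower().split()


def topic_signature(target_docs, other_docs, top_k=5):
    """Return the top_k terms most characteristic of target_docs, with weights.

    The weight is an add-one smoothed log-ratio of relative frequencies
    (an assumed weighting, not taken from the paper).
    """
    target = Counter(t for d in target_docs for t in tokenize(d))
    other = Counter(t for d in other_docs for t in tokenize(d))
    n_target = sum(target.values())
    n_other = sum(other.values())
    vocab = len(set(target) | set(other))
    weights = {}
    for term, count in target.items():
        p_target = (count + 1) / (n_target + vocab)
        p_other = (other[term] + 1) / (n_other + vocab)
        weights[term] = math.log(p_target / p_other)
    top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return dict(top)


def score(tweet, signature):
    """Sum of signature weights for the distinct terms in the tweet."""
    return sum(signature.get(t, 0.0) for t in set(tokenize(tweet)))


# Hypothetical news snippets for the two senses of "apple".
company_news = [
    "apple unveils new iphone at launch event",
    "apple shares rise after iphone sales report",
]
fruit_news = [
    "apple harvest begins in orchard this autumn",
    "farmers sell fresh apple juice at market",
]

company_sig = topic_signature(company_news, fruit_news)
fruit_sig = topic_signature(fruit_news, company_news)

tweet = "just bought the new iphone from apple"
is_company = score(tweet, company_sig) > score(tweet, fruit_sig)
print(is_company)
```

The ambiguous word itself ("apple") occurs equally often in both sets of snippets and so receives zero weight; the decision rests on context terms such as "iphone".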

Requirement Analysis of Korean Public Alert Service using News Data (뉴스 데이터를 활용한 재난문자 요구사항 분석)

  • Lee, Hyunji;Byun, Yoonkwan;Chang, Sekchin;Choi, Seong Jong
    • Journal of Broadcast Engineering / Vol. 25, No. 6 / pp. 994-1003 / 2020
  • In this paper, we investigated current issues with the KPAS (Korean Public Alert Service) through news analysis. News articles from May 15, 2005 to April 30, 2020 were collected with the keyword 'KPAS' through the News Big-Data System provided by the Korea Press Foundation. The results of the content analysis are as follows. First, the issues on alert presentation were categorized by alarm sound, message content, alert level, transmission frequency, delay, reception range, time of alert, and language. Issues on the inability to receive KPAS messages were categorized into authority, mobile, sending standard, mobile communication infrastructure, etc. For the last two to three years, news on the inability issues had decreased, while news on the presentation issues had increased. This tells us that the public demand for improvement in the KPAS lies in the presentation issues. The demand for societal resolutions to the presentation issues, especially on message content, transmission frequency, and reception range, has soared.

A Korean Text Summarization System Using Aggregate Similarity (도합유사도를 이용한 한국어 문서요약 시스템)

  • 김재훈;김준홍
    • Korean Journal of Cognitive Science / Vol. 12, No. 1-2 / pp. 35-42 / 2001
  • In this paper, a document is represented as a weighted graph called a text relationship map. In the graph, a node represents a vector of nouns in a sentence, edges completely connect the nodes, and the weight on an edge is the similarity between its two nodes. The similarity is based on the word overlap between the corresponding nodes. The importance of a node, called an aggregate similarity in this paper, is defined as the sum of the weights on the links connecting it to other nodes on the map. In this paper, we present a Korean text summarization system using the aggregate similarity. To evaluate our system, we used two test collections: one collection (PAPER-InCon) consists of 100 papers in the field of computer science; the other collection (NEWS) is composed of 105 newspaper articles and was built by KORDIC. At a compression rate of 20%, we achieved a recall of 46.6% (PAPER-InCon) and 30.5% (NEWS) and a precision of 76.9% (PAPER-InCon) and 42.3% (NEWS).
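
The aggregate-similarity ranking described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' system: nouns are assumed to be already extracted for each sentence (Korean morphological analysis is out of scope here), the word-overlap similarity is taken to be a Dice-style coefficient (the abstract does not give the exact formula), and all function names and the toy document are hypothetical.

```python
import math


def overlap_similarity(a, b):
    """Word-overlap similarity between two noun sets (Dice-style; an assumption)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def aggregate_similarities(sentences):
    """Sum, for each node, the edge weights to every other node in the map."""
    n = len(sentences)
    scores = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            w = overlap_similarity(sentences[i], sentences[j])
            scores[i] += w
            scores[j] += w
    return scores


def summarize(sentences, compression_rate=0.2):
    """Return indices of the highest-scoring sentences, in document order."""
    k = max(1, math.ceil(len(sentences) * compression_rate))
    scores = aggregate_similarities(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


def precision_recall(selected, reference):
    """Precision and recall of selected sentence indices against a reference set."""
    selected, reference = set(selected), set(reference)
    hits = len(selected & reference)
    return hits / len(selected), hits / len(reference)


# Hypothetical five-sentence document, one noun set per sentence.
doc = [
    {"system", "summary", "document"},
    {"document", "graph", "sentence"},
    {"graph", "node", "sentence", "document"},
    {"weather", "rain"},
    {"node", "edge", "weight"},
]

print(summarize(doc, compression_rate=0.2))
print(summarize(doc, compression_rate=0.4))
```

In the toy document, the third sentence shares nouns with three others and therefore has the largest aggregate similarity, while the sentence about weather shares none and scores zero.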

Relative Clauses in a Modern Diachronic Corpus of Singapore English

  • Lee, Kit Mun
    • Asia Pacific Journal of Corpus Research / Vol. 1, No. 1 / pp. 31-60 / 2020
  • This paper investigates changes in relativization in Singapore English broadsheet newspapers from 1993 to 2016. One of the first diachronic studies in Singapore English (SgE), it also explores corresponding data from the diachronic Siena-Bologna (SiBol) news corpus. As SgE is in the endonormative stabilization phase in Schneider's (2007) Dynamic Model of postcolonial Englishes, divergence from British English (BrE) is to be expected. In this study, the dataset is a new Singapore English Newspaper (SEN) corpus compiled from local news articles in 1993, 2005 and 2016, and the corpus tool employed is Sketch Engine. The results reveal changes in relativization practices in SEN over the given period, many of which follow a pattern similar to those identified in SiBol, albeit at varying rates of change. The most significant of these is a sharp decline in the 'which' relativizer in restrictive relative clauses with non-animate antecedents, complemented by a rise in 'that'. The change has been so rapid that although 'which' relative clauses were more common than 'that' clauses in 1993, 'that' has subsequently overtaken 'which' in both corpora. One shift in SEN that differs from SiBol is the increase in the frequency of non-restrictive relative clauses in SgE. The likely motivators for the changes in the two varieties are identified as colloquialization, densification and prescriptivism. The effects each of these factors could have had on the varieties are discussed, as well as the implications of the findings for our understanding of the evolutionary status of SgE as a postcolonial variety.

News Data Analysis Using Acoustic Model Output of Continuous Speech Recognition (연속음성인식의 음향모델 출력을 이용한 뉴스 데이터 분석)

  • Lee, Kyong-Rok
    • The Journal of the Korea Contents Association / Vol. 6, No. 10 / pp. 9-16 / 2006
  • In this paper, the acoustic model output of CSR (Continuous Speech Recognition) was used to analyze news data. The news database used in this experiment consisted of 2,093 articles. Because of the low efficiency of the language model, conventional Korean CSR is not appropriate for the analysis of news data. This problem could be handled successfully by introducing post-processing of the acoustic model's recognition results. The acoustic model is more robust than the language model in the Korean environment. The result of the post-processing was stored as a KIF (keyword information file). When the threshold of the acoustic model's output level was 100, 86.9% of all target morphemes were included in the post-processing result. Under the same condition, applying length-information-based normalization, 81.25% of all target morphemes were recognized. The purpose of the normalization was to compensate for long morphemes. According to the experimental results, 75.13% of all target morphemes were recognized. A KIF (314 MB) was produced from the original news data (5,040 MB). The reduction in the absolute amount of information was approximately 93.8%.
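
The reported reduction of approximately 93.8% follows from the two sizes given in the abstract (314 MB produced from 5,040 MB). The short check below reproduces that arithmetic; the function name `reduction_rate` is a hypothetical helper, not something from the paper.

```python
def reduction_rate(original_mb, reduced_mb):
    """Percentage by which the data volume shrinks."""
    return (1 - reduced_mb / original_mb) * 100


rate = reduction_rate(5040, 314)
print(f"{rate:.1f}%")  # prints "93.8%"
```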

Interpreting Discourse Metaphors in Media: Focusing on News Coverage of Election Campaign

  • Ban, Hyun;Noh, Bokyung
    • International Journal of Advanced Culture Technology / Vol. 10, No. 3 / pp. 104-110 / 2022
  • This paper aims to analyze discourse metaphors in the Seoul mayoral by-election, focusing mainly on the election campaign and related news articles. The 2021 Seoul mayoral by-election was held because the former mayor died in an apparent suicide after he was accused of years of sexual harassment of a former secretary. In the run-up to the by-election, the newly coined term 'alleged victim' used by the ruling party caused a major controversy: the party attempted to deny the authenticity of the secretary's claim by calling her 'an alleged victim' instead of 'a victim' in order to defend the former mayor, a member of the ruling party, implying that the woman's claim was just an allegation with no proof. This paper therefore examines how the term 'alleged victim' was reported in news stories in two Korean quality newspapers, a conservative newspaper (Chosun Ilbo) and a liberal newspaper (Hankyoreh), from March 1 to April 1, 2021, and analyzes them within the framework of Lakoff and Johnson's (1980) Conceptual Metaphor Theory. The findings are as follows: (i) the conservative newspaper reported on this issue much more than the liberal newspaper; (ii) both quality newspapers follow the metaphor principles of Conceptual Metaphor Theory; (iii) the conservative newspaper is more likely to follow the Strict Father model (a conservative model), while the liberal newspaper is more likely to follow the Nurturant Parent model (a liberal model), indicating that each newspaper's ideology is well represented by the models of Conceptual Metaphor Theory.