• Title/Summary/Keyword: news topic

Topic Modeling of News Article about International Construction Market Using Latent Dirichlet Allocation (Latent Dirichlet Allocation 기법을 활용한 해외건설시장 뉴스기사의 토픽 모델링(Topic Modeling))

  • Moon, Seonghyeon; Chung, Sehwan; Chi, Seokho
    • KSCE Journal of Civil and Environmental Engineering Research / v.38 no.4 / pp.595-599 / 2018
  • A sufficient understanding of the overseas construction market is crucial for profitability in international construction projects. Many researchers have treated news articles as a good data source for assessing market conditions, since they carry market information such as political, economic, and social issues. Because this text data is unstructured and very large, various text-mining techniques have been studied to reduce the manpower, time, and cost needed to summarize it. However, extracting the needed information from news articles remains limited because the data covers many different topics. This research aims to overcome these problems and contribute to summarizing market status by performing topic modeling with Latent Dirichlet Allocation. Assuming that 10 topics exist in the corpus, the identified topics included projects for user convenience (topic-2), private support to solve poverty problems in Africa (topic-4), and so on. By grouping the topics in the news articles, the results could improve the extraction of useful information and the summarization of market status.
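
The topic-modeling step described in this abstract can be illustrated with a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling. This is not the authors' implementation: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the four-document toy corpus are assumptions made for illustration, and the toy run uses 2 topics rather than the 10 assumed in the paper.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative, not optimized).

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where doc_topic[d][k] and topic_word[k][v] are smoothed probabilities.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic_counts[d][z] += 1
            topic_word_counts[z][word_id[w]] += 1
            topic_totals[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][z] -= 1
                topic_word_counts[z][wid] -= 1
                topic_totals[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic_counts[d][k] + alpha)
                    * (topic_word_counts[k][wid] + beta)
                    / (topic_totals[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic_counts[d][z] += 1
                topic_word_counts[z][wid] += 1
                topic_totals[z] += 1

    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic_counts, docs)
    ]
    topic_word = [
        [(c + beta) / (topic_totals[k] + vocab_size * beta) for c in row]
        for k, row in enumerate(topic_word_counts)
    ]
    return doc_topic, topic_word, vocab


# Invented toy headlines, already tokenized (not data from the paper).
docs = [
    ["bridge", "contract", "bid", "highway"],
    ["highway", "bid", "contract", "tender"],
    ["election", "policy", "sanction", "government"],
    ["government", "policy", "election", "tariff"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, row in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda i: -row[i])[:3]
    print(f"topic-{k}:", [vocab[i] for i in top])
```

In practice a library implementation would normally be preferred for a real news corpus; the sketch only shows how each token is repeatedly reassigned to a topic based on document-level and corpus-level counts.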

An Exploratory Study of Health Inequality Discourse Using Korean Newspaper Articles: A Topic Modeling Approach

  • Kim, Jin-Hwan
    • Journal of Preventive Medicine and Public Health / v.52 no.6 / pp.384-392 / 2019
  • Objectives: This study aimed to explore the health inequality discourse in the Korean press by analyzing newspaper articles using a relatively new content analysis technique. Methods: This study used the search term "health inequality" to collect articles containing that term that were published between 2000 and 2018. The collected articles went through pre-processing and topic modeling, and the contents and temporal trends of the extracted topics were analyzed. Results: A total of 1038 articles were identified, and 5 topics were extracted. As the number of studies on health inequality has increased over the past 2 decades, so too has the number of news articles regarding health inequality. The extracted topics were public health policies, social inequalities in health, inequality as a social problem, healthcare policies, and regional health gaps. The total number of occurrences of each topic increased every year, and the trend observed for each theme was influenced by events related to its contents, such as elections. Finally, the frequency of appearance of each topic differed depending on the type of news source. Conclusions: The results of this study can be used as preliminary data for future attempts to address health inequality in Korea. To make addressing health inequality part of the public agenda, the media's perspective and discourse regarding health inequality should be monitored to facilitate further strategic action.

Topic and Source Diversity of the Front Page in the New York Times, Chicago Tribune and the Los Angeles Times from 1950 to 2000 (20세기 하반기의 미 신문 1면 보도에 대한 다양성 분석: 뉴스 토픽과 정보원의 분포를 중심으로)

  • Shim, Hoon
    • Korean Journal of Communication and Information / v.30 / pp.175-201 / 2005
  • This study investigates the diversity of news topics and sources in the New York Times, Chicago Tribune, and the Los Angeles Times in the second half of the twentieth century. In probing the conventional traits of the contemporary press, the researcher traced the changing patterns and trends of news values in news-gathering routines in order to evaluate the journalistic role conception in terms of social responsibility theory. Findings indicated that the ideal of the American press as a neutral transmitter has been consistently undermined by source and topic bias, without any significant change during the last five decades. The data, however, revealed an evident shift in the contemporary press from heavy reliance on official sources to business/economic sources. In addition, news topics such as business, health, and education have replaced conventionally popular topics such as crime and accidents. By contrast, unconventional topics such as poverty, labor, and minorities still fail to receive much attention from the target papers.

Topic Modeling and Keyword Network Analysis of News Articles Related to Nurses before and after "the Thanks to You Challenge" during the COVID-19 Pandemic (COVID-19 '덕분에 챌린지' 전후 간호사 관련 뉴스 기사의 토픽 모델링 및 키워드 네트워크 분석)

  • Yun, Eun Kyoung; Kim, Jung Ok; Byun, Hye Min; Lee, Guk Geun
    • Journal of Korean Academy of Nursing / v.51 no.4 / pp.442-453 / 2021
  • Purpose: This study was conducted to assess public awareness and policy challenges faced by practicing nurses. Methods: After collecting nurse-related news articles published before and after 'the Thanks to You Challenge' campaign (between December 31, 2019, and July 15, 2020), keywords were extracted via preprocessing. A three-step method (keyword analysis, latent Dirichlet allocation topic modeling, and keyword network analysis) was used to examine the text and the structure of the selected news articles. Results: The top 30 keywords with similar occurrences were collected before and after the campaign. The five dominant topics before the campaign were: pandemic, infection of medical staff, local transmission, medical resources, and return of overseas Koreans. After the campaign, the topics 'infection of medical staff' and 'return of overseas Koreans' disappeared, but 'the Thanks to You Challenge' emerged as a dominant topic. A keyword network analysis revealed that the word 'nurse' was linked with keywords like 'thanks' and 'campaign' through the word 'sacrifice'. These words formed interrelated domains of 'the Thanks to You Challenge' topic. Conclusion: The findings of this study can provide useful information for understanding various issues and social perspectives on COVID-19 nursing. The major themes of news reports lagged behind the real problems faced by nurses in the COVID-19 crisis. While the press tends to focus on heroism and society as a whole, issues and policies mutually beneficial to the public and to nursing need to be further explored and enhanced by nurses.
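
A keyword network of the kind mentioned in this abstract is commonly built from co-occurrence counts. The sketch below assumes document-level co-occurrence as the edge weight, which the abstract does not specify; the function name `cooccurrence_edges` and the three toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs, keywords):
    """Count, for each keyword pair, the documents in which both appear."""
    keep = set(keywords)
    edges = Counter()
    for doc in docs:
        # Sort so that each pair is always stored in the same order.
        present = sorted(keep.intersection(doc))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Invented toy articles, already tokenized (not data from the paper).
docs = [
    ["nurse", "sacrifice", "thanks"],
    ["nurse", "sacrifice", "campaign"],
    ["sacrifice", "thanks", "campaign"],
]
edges = cooccurrence_edges(docs, ["nurse", "sacrifice", "thanks", "campaign"])
for pair, weight in edges.most_common():
    print(pair, weight)
```

The resulting weighted edge list can be passed to a graph library for centrality measures or visualization.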

A Study on Children's Images during the Liberation Period Using Topic Modeling: With a focus on The Children's News (토픽 모델링을 이용한 해방기 아동상 연구 - 「어린이신문」을 중심으로 -)

  • Jang, Seok-Eun; Lee, Hye-Eun
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.33 no.3 / pp.157-178 / 2022
  • This study explores children's images in The Children's News, a children's newspaper published during the Liberation period. For this purpose, frequency analysis, topic modeling, and time series analysis were performed on the issues from the first issue of December 1, 1945 to the 86th issue of December 13, 1947, except for No. 34, which has not survived. Frequency analysis showed that keywords related to country, school, and family appeared frequently. Topic modeling revealed several images of children in these topics, including children with patriotism, children with scientific literacy, children with artistic refinement, and children as social beings. The time series analysis shows that the share of patriotism-related topics was high in the early days of the Liberation period, when The Children's News was first published, but the image of children diversified as the share of topics such as science and art gradually increased.

Trend Analysis of Pet Plants Before and After COVID-19 Outbreak Using Topic Modeling: Focusing on Big Data of News Articles from 2018 to 2021

  • Park, Yumin; Shin, Yong-Wook
    • Journal of People, Plants, and Environment / v.24 no.6 / pp.563-572 / 2021
  • Background and objective: The ongoing COVID-19 pandemic has restricted daily life, forcing people to spend time indoors. With growing interest in mental health and residential environments, 'pet plants' have received attention during the unprecedented social distancing measures. This study aims to analyze the change in trends of pet plants before and during the COVID-19 pandemic and to provide basic data for studies related to pet plants and directions for future development. Methods: A total of 2,016 news articles containing the keyword 'pet plants' were collected from Naver News for January 1, 2018 to August 15, 2019 (609 articles) and January 1, 2020 to August 15, 2021 (1,407 articles). The texts were tokenized into words using the KoNLPy package, yielding 63,597 words. The analyses included keyword frequency and topic modeling based on Latent Dirichlet Allocation (LDA) to identify the inherent meanings of related words and each topic. Results: Topic modeling generated three topics in each period (before and during COVID-19), and the results showed that pet plants in daily life have become objects of 'emotional support' and 'healing' during social distancing. In particular, pet plants, which had been distributed as a way to prevent solitary deaths and depression among seniors living alone, are now used more widely to help resolve the social isolation of the general public during COVID-19. The new term 'plant butler' became a trend, and people increasingly shared their hobbies and information about pet plants and communicated with others online. Conclusion: Based on these findings, the trend data on pet plants before and after the outbreak of COVID-19 can provide a basis for stimulating research on pet plants and for setting the direction of related industries, considering the continued popularity of indoor gardening and green hobbies.

Tweets analysis using a Dynamic Topic Modeling : Focusing on the 2019 Koreas-US DMZ Summit (트윗의 타임 시퀀스를 활용한 DTM 분석 : 2019 남북미정상회동 이벤트를 중심으로)

  • Ko, EunJi; Choi, SunYoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.308-313 / 2021
  • In this study, tweets about the 2019 Koreas-US DMZ Summit were collected with their time sequence and analyzed using a sequential topic modeling method, Dynamic Topic Modeling (DTM). In microblogging services such as Twitter, unstructured data mixing news and opinions about a single event is generated simultaneously and on a large scale, and information and reactions are produced in the same message format. Therefore, to grasp a topic trend, contextual meaning can be found only through pattern analysis that reflects the characteristics of sequential data. After obtaining topic coherence scores and evaluating Latent Dirichlet Allocation (LDA), DTM was computed; 30 topics related to news reports and opinions were derived, and the occurrence probability and keywords of each topic evolved dynamically. In conclusion, the study found that DTM is a suitable model for analyzing the trend of integrated topics in a specific event over time.
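
The abstract does not say which coherence measure was used to evaluate the LDA model. As one possibility, the sketch below computes the UMass coherence of a topic's top words from document co-occurrence counts; the choice of UMass, the function name `umass_coherence`, and the toy documents are all assumptions.

```python
import math


def umass_coherence(topic_words, docs):
    """UMass coherence: sum over ordered word pairs of
    log((D(w_i, w_j) + 1) / D(w_j)), where D counts documents.

    topic_words should be ordered from most to least probable.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            base = doc_freq(w_j)
            if base == 0:
                raise ValueError(f"{w_j!r} does not occur in the corpus")
            score += math.log((doc_freq(w_i, w_j) + 1) / base)
    return score


# Invented toy tweets, already tokenized (not data from the paper).
docs = [
    ["summit", "dmz", "meeting"],
    ["summit", "dmz", "border"],
    ["weather", "rain", "forecast"],
    ["rain", "forecast", "storm"],
]
coherent = umass_coherence(["summit", "dmz"], docs)
mixed = umass_coherence(["summit", "rain"], docs)
print(coherent, mixed)
```

Scores closer to zero indicate words that tend to appear together; comparing the average score across candidate topic counts is one common way to choose the number of topics.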

Keyword Reorganization Techniques for Improving the Identifiability of Topics (토픽 식별성 향상을 위한 키워드 재구성 기법)

  • Yun, Yeoil; Kim, Namgyu
    • Journal of Information Technology Services / v.18 no.4 / pp.135-149 / 2019
  • Recently, much research has aimed to extract meaningful information from large amounts of text data. Among the various approaches to extracting information from text, topic modeling, which expresses latent topics as groups of keywords, is the most widely used. Topic modeling presents several topic keywords ranked by term/topic weight, and the quality of those keywords is usually evaluated through coherence, which reflects the similarity of the keywords. However, a topic quality evaluation method based only on keyword similarity has limitations, because it is difficult to describe the content of a topic accurately enough with just a set of similar words. In this research, therefore, we propose a method for reorganizing topic keywords to improve the identifiability of topics. To reorganize topic keywords, each document is first labeled with one representative topic extracted by traditional topic modeling. After that, classification rules for assigning each document to its corresponding label are generated, and new topic keywords are extracted based on those rules. To evaluate the performance of our method, we performed an experiment on 1,000 news articles. The experiment confirmed that the keywords extracted by the proposed method have better identifiability than traditional topic keywords.

Keyword Extraction from News Corpus using Modified TF-IDF (TF-IDF의 변형을 이용한 전자뉴스에서의 키워드 추출 기법)

  • Lee, Sung-Jick; Kim, Han-Joon
    • The Journal of Society for e-Business Studies / v.14 no.4 / pp.59-73 / 2009
  • Keyword extraction is an essential technique for text mining applications such as information retrieval, text categorization, summarization, and topic detection. A set of keywords extracted from large-scale electronic document data is used as significant features for text mining algorithms and contributes to improving the performance of document browsing, topic detection, and automated text classification. This paper presents a keyword extraction technique that can be used to detect topics for each news domain from a large document collection gathered from internet news portal sites. We used six variants of the traditional TF-IDF weighting model. On top of the TF-IDF model, we propose a word filtering technique called 'cross-domain comparison filtering'. To demonstrate the effectiveness of our method, we analyzed the usefulness of keywords extracted from Korean news articles and presented the changes in the keywords of each news domain over time.
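
The abstract does not define the six TF-IDF variants or the details of 'cross-domain comparison filtering'. The sketch below shows only a plain TF-IDF ranking per news domain, followed by a hypothetical filter that drops candidates also ranked highly in another domain. The function name `domain_keywords`, the filter rule, and the toy corpus are assumptions, not the paper's method.

```python
import math
from collections import Counter


def domain_keywords(corpus, top_n=4):
    """Rank words per domain by TF-IDF, then drop cross-domain overlaps.

    corpus maps a domain name to a list of tokenized documents.
    TF is counted within the domain; IDF is computed over all documents.
    """
    all_docs = [doc for docs in corpus.values() for doc in docs]
    n_docs = len(all_docs)
    doc_freq = Counter(w for doc in all_docs for w in set(doc))

    ranked = {}
    for domain, docs in corpus.items():
        term_freq = Counter(w for doc in docs for w in doc)
        scores = {
            w: term_freq[w] * math.log(n_docs / doc_freq[w]) for w in term_freq
        }
        ranked[domain] = sorted(scores, key=lambda w: (-scores[w], w))[:top_n]

    # Hypothetical filter: a word that is a top candidate in another
    # domain as well is treated as generic and removed.
    filtered = {}
    for domain, words in ranked.items():
        others = {w for d, ws in ranked.items() if d != domain for w in ws}
        filtered[domain] = [w for w in words if w not in others]
    return filtered


# Invented toy articles, already tokenized (not data from the paper).
corpus = {
    "sports": [["match", "goal", "reporter"], ["goal", "league", "reporter"]],
    "economy": [["market", "stock", "reporter"], ["stock", "rate", "reporter"]],
}
keywords = domain_keywords(corpus, top_n=4)
print(keywords)
```

In this toy run the word "reporter" appears in every document, so it is a candidate in both domains and the filter removes it, leaving only domain-specific words.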

Topic Modeling on the Adolescent Problem Using Text Mining (텍스트 마이닝을 이용한 청소년 문제 토픽 모델링)

  • Cho, Ju-Yeon; Cho, Kyoung Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.12 / pp.1589-1595 / 2018
  • The purpose of this research is to search for and identify trends in adolescent problems on internet news sites. Among domestic internet news sites, 8,110 articles on adolescent problems published from 1993 to 2018 on the three top-ranked sites, 'The Chosunilbo', 'The Dong-A Ilbo', and 'Korea Joongang Daily', were analyzed. Through this study, we were able to identify the topics of adolescent problems on internet news sites over the last 26 years and found that the trend of articles has changed with the environment, policies, and culture related to adolescent problems. This study is meaningful in that it starts from a method for examining social trends in existing adolescent problems, expands the scope of adolescent problems and counseling, uses quantitative analysis methods, and provides new information that takes diversity into account.