• Title/Summary/Keyword: news text

Text Mining and Visualization of Papers Reviews Using R Language

  • Li, Jiapei;Shin, Seong Yoon;Lee, Hyun Chang
    • Journal of information and communication convergence engineering / v.15 no.3 / pp.170-174 / 2017
  • Nowadays, people share and discuss scientific papers on social media and Web 2.0 platforms such as online forums, blogs, Twitter, Facebook, and scholarly communities. In addition to metrics such as citation, download, and recommendation counts, paper review text is an effective resource for studying scientific impact. Social media tools also improve the research process by recording a series of online scholarly behaviors. This paper studies the large volume of paper reviews generated on social media platforms to uncover implicit information about research papers. We implemented text mining on review texts using the R language and present the results. We found that the Zika virus was a research hotspot and that association research methods were widely used in 2016. We also mined the news reviews of one paper and derived the public opinion on it.

News based Stock Market Sentiment Lexicon Acquisition Using Word2Vec (Word2Vec을 활용한 뉴스 기반 주가지수 방향성 예측용 감성 사전 구축)

  • Kim, Daye;Lee, Youngin
    • The Journal of Bigdata / v.3 no.1 / pp.13-20 / 2018
  • Stock market prediction has long been a dream for researchers and the public alike. Forecasting the ever-changing stock market, though, has proved a Herculean task. This study proposes a novel stock market sentiment lexicon acquisition system that can predict the growth (or decline) of a stock market index based on economic news. For this purpose, we collected three years of economic news, from January 2015 to December 2017, and adopted the Word2Vec model to take the context of words into account. To evaluate the result, we performed sentiment analysis on the collected news data with the automatically constructed lexicon and compared the outcome with the closing values of the KOSPI (Korea Composite Stock Price Index), the South Korean stock market index.
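The evaluation idea in this abstract, scoring news with a sentiment lexicon and comparing the result with index movements, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy English lexicon, its weights, the whitespace tokenizer, and the function names `sentiment_score` and `direction_agreement` are hypothetical and are not taken from the paper, which builds its lexicon with Word2Vec from Korean economic news.

```python
# Hypothetical toy lexicon: word -> polarity weight (illustrative values only).
LEXICON = {
    "surge": 1.0, "rally": 1.0, "growth": 0.5,
    "slump": -1.0, "crisis": -1.0, "decline": -0.5,
}


def sentiment_score(text, lexicon=LEXICON):
    """Sum the lexicon weights of all whitespace-separated tokens in the text."""
    tokens = text.lower().split()
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def direction_agreement(daily_scores, daily_index_changes):
    """Fraction of days on which the sign of the news score matches the sign
    of the index change. Days with a zero score or zero change are skipped."""
    pairs = [
        (s, c) for s, c in zip(daily_scores, daily_index_changes)
        if s != 0 and c != 0
    ]
    if not pairs:
        return 0.0
    hits = sum(1 for s, c in pairs if (s > 0) == (c > 0))
    return hits / len(pairs)
```

Given one aggregated score per trading day and the matching daily index changes, `direction_agreement` provides a rough check of whether the lexicon tracks market direction. A real pipeline would need a Korean morphological tokenizer and a lexicon learned from data.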

Political Information Filtering on Online News Comment (정보 중립성 확보를 위한 인터넷 뉴스 댓글의 정치성향 분석)

  • Choi, Hyebong;Kim, Jaehong;Lee, Jihyun;Lee, Mingu
    • The Journal of the Convergence on Culture Technology / v.6 no.4 / pp.575-582 / 2020
  • We propose a method to estimate the political preference of users who write comments on internet news. We collected and analyzed a massive amount of news comment data from internet news to extract features that effectively characterize users' political preference. We expect the method to help users obtain unbiased information from internet news and online discussion by providing the estimated political stance of news comment writers. Through comprehensive tests, we demonstrate the effectiveness of the two proposed methods, a lexicon-based algorithm and a similarity-based algorithm.

AI-based system for automatically detecting food risk information from news data (뉴스 데이터로부터 식품위해정보 자동 추출을 위한 인공지능 기술)

  • Baek, Yujin;Lee, Jihyeon;Kim, Nam Hee;Lee, Hunjoo;Choo, Jaegul
    • Food Science and Industry / v.54 no.3 / pp.160-170 / 2021
  • Recent advances in communication technologies accelerate the spread of food safety issues once they are reported by the news media. To respond to those safety issues and take steps in a timely manner, it is important to automatically detect related information from news data. This work presents an AI-based system that detects risk information within a food-related news article. Experts in food safety participated in labeling risk information in food-related news articles; we acquired 43,527 articles in which food names and risk information are marked as labels. Given a news document, our system automatically detects food names and risk information by analyzing similarities between words within the text, leveraging learned word embedding vectors. Our AI-based system shows higher detection accuracy than a non-AI rule-based system, achieving an absolute gain of +32.94% in F1 for the food name category and +41.53% for the risk information category.

How Content Affects Clicks: A Dynamic Model of Online Content Consumption

  • Inyoung Chae;Da Young Kim
    • Asia pacific journal of information systems / v.31 no.4 / pp.606-632 / 2021
  • With many consumers being exposed to news via social media platforms, news organizations are challenged to attract visitors and generate revenue during visits to their websites. They therefore need detailed information on how to write articles and headlines to increase visitors' engagement with the content to drive advertising revenues. For those news organizations whose business model depends mainly on advertisements, rather than subscriptions, it is particularly crucial to understand what makes the website attractive to their visitors, what drives users to stay on the website, and what factors affect a user's exit decision. The current research examines individual news consumers' choices to find patterns of increase or decrease in user engagement relative to a variety of topics, as well as to the mood or tone of the content. Using clickstream data from a major news organization, the authors develop a user-level dynamic model of clickstream behavior that takes into account the content of both headlines and stories that visitors read. The authors find that readers appear to exhibit state dependence in the tone of the articles that they read. They also show how the topics expressed in headlines can affect the amount of content readers consume when visiting the news organization to a much larger degree than the topics expressed in the content of the article. Online publishers can make use of such findings to present visitors with content that is likely to maintain and/or increase their engagement and consequently drive advertising revenue.

Topic Modeling of News Article about International Construction Market Using Latent Dirichlet Allocation (Latent Dirichlet Allocation 기법을 활용한 해외건설시장 뉴스기사의 토픽 모델링(Topic Modeling))

  • Moon, Seonghyeon;Chung, Sehwan;Chi, Seokho
    • KSCE Journal of Civil and Environmental Engineering Research / v.38 no.4 / pp.595-599 / 2018
  • A sufficient understanding of the overseas construction market is crucial to securing profitability in international construction projects. Many researchers have considered news articles a good data source for assessing market conditions, since they include market information on political, economic, and social issues. Because such text data is unstructured and voluminous, various text-mining techniques have been studied to reduce the manpower, time, and cost needed to summarize it. However, extracting the needed information from news articles remains limited because the data covers many different topics. This research aims to overcome these problems and contribute to summarizing market status by performing topic modeling with Latent Dirichlet Allocation. Assuming that 10 topics exist in the corpus, the topics identified included projects for user convenience (topic-2), private support for solving poverty problems in Africa (topic-4), and so on. Grouping the news articles by topic could improve the extraction of useful information and the summarization of market status.

Crisis Prediction of Regional Industry Ecosystem based on Text Sentiment Analysis Using News Data - Focused on the Automobile Industry in Gwangju - (뉴스 데이터를 활용한 텍스트 감성분석에 따른 지역 산업생태계 위기 예측 - 광주 지역 자동차 산업을 중심으로 -)

  • Kim, Hyun-Ji;Kim, Sung-Jin;Kim, Han-Gook
    • The Journal of the Korea Contents Association / v.20 no.8 / pp.1-9 / 2020
  • As the aging of regional industry ecosystems has gradually become a serious problem, research on measuring and reversing their decline has been actively conducted. However, little research has been done on crises in regional industry ecosystems. A crisis emerges abruptly over a short period, and responding after the fact is often impossible, so the response must come before the crisis occurs. In other words, it is necessary to detect a crisis early and respond proactively from a long-term perspective. It is therefore necessary to develop a predictive model that can proactively recognize and respond to crises in the regional industry ecosystem. This study used large-scale news data to examine whether the risk to a regional industry and market can be predicted from the sentiment scores of news. News sentiment analysis was performed using the Google sentiment analysis API, and the scores were aggregated by month to check their correlation with actual events.

Analysis of the Relations between Social Issues and Prices Using Text Mining - Avian Influenza and Egg Prices - (뉴스기사 분석을 통한 사회이슈와 가격에 관한 연구 - 조류인플루엔자와 달걀가격 중심으로 -)

  • Han, Mu Moung Cho;Kim, Yangsok;Lee, Choong Kwon
    • Smart Media Journal / v.7 no.1 / pp.45-51 / 2018
  • Avian influenza (AI) is notorious for its rapid infection rate and has a serious impact on consumers and producers alike, especially on poultry farms. The AI outbreak that occurred nationwide at the end of 2016 devastated the livestock farming industry. As a result, the prices of eggs and egg products skyrocketed, and the event was heavily reported by the media. The purpose of this study was to investigate the correlation between egg price fluctuations and keyword changes in online news articles reflecting social issues. To this end, we analyzed 682 AI-related online news articles published in South Korea over fourteen weeks from November 2016. The results of this study are expected to contribute to understanding the relationship between the actual price of eggs and the keywords from news articles related to social issues.
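The kind of analysis this abstract describes, relating weekly keyword frequencies in news to a price series, can be sketched with a Pearson correlation. This is only an illustration under stated assumptions: the abstract does not specify the correlation measure or the keyword-counting procedure, and the helper names `weekly_keyword_counts` and `pearson` are hypothetical.

```python
import math


def weekly_keyword_counts(articles_by_week, keyword):
    """Count occurrences of a keyword in each week's list of article texts."""
    keyword = keyword.lower()
    return [
        sum(article.lower().count(keyword) for article in week)
        for week in articles_by_week
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

With fourteen weekly keyword counts and the fourteen matching weekly egg prices, `pearson(counts, prices)` returns a value between -1 and 1. A high value indicates co-movement only, not causation.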

Comparison Between Optimal Features of Korean and Chinese for Text Classification (한중 자동 문서분류를 위한 최적 자질어 비교)

  • Ren, Mei-Ying;Kang, Sinjae
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.4 / pp.386-391 / 2015
  • This paper proposes optimal features for text classification based on Korean and Chinese linguistic characteristics. Experiments were conducted to discover which feature performs best among n-grams, which are known to be language independent; morphemes, which are language dependent; and feature sets combining n-grams and morphemes. An SVM classifier and Internet news articles were used for text classification. As a result, bi-grams were the best feature for Korean text categorization, with the highest F1-measure of 87.07%, and for Chinese document classification the combined feature set 'uni-gram+noun+verb+adjective+idiom' showed the best performance, with the highest F1-measure of 82.79%.
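To make the n-gram features mentioned above concrete, here is a minimal sketch of n-gram extraction. It assumes character-level n-grams with whitespace removed; the abstract does not state the n-gram unit, and the function names `char_ngrams` and `ngram_features` are hypothetical. The resulting counts could serve as input features for a classifier such as an SVM.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all contiguous character n-grams of the text, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def ngram_features(text, n=2):
    """Bag-of-n-grams feature counts for one document."""
    return Counter(char_ngrams(text, n))
```

Because this works on characters rather than morphemes, it needs no language-specific analyzer, which is the sense in which n-grams are described as language independent.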

Quantitative Text Mining for Social Science: Analysis of Immigrant in the Articles (사회과학을 위한 양적 텍스트 마이닝: 이주, 이민 키워드 논문 및 언론기사 분석)

  • Yi, Soo-Jeong;Choi, Doo-Young
    • The Journal of the Korea Contents Association / v.20 no.5 / pp.118-127 / 2020
  • The paper introduces trends and methodological challenges in quantitative Korean text analysis through case studies of academic and news media articles on "migration" and "immigration" published in 2017-2019. Quantitative text analysis is based on natural language processing (NLP) technology and has become an essential tool for social science. It is a part of data science that converts documents into structured data, performs hypothesis discovery and verification on that data, and visualizes it. Furthermore, we examined statistical models commonly applied in social science to quantitative text analysis, using NLP with R programming and Quanteda.