• Title/Summary/Keyword: news text


Text Mining and Network Analysis of News Articles for Deriving Socio-Economic Damage Types of Heat Wave Events in Korea: 2012~2016 Cases (뉴스 기사 텍스트 마이닝과 네트워크 분석을 통한 폭염의 사회·경제적 영향 유형 도출: 2012~2016년 사례)

  • Jung, Jae In; Lee, Kyoungjun; Kim, Seungbum
    • Atmosphere / v.30 no.3 / pp.237-248 / 2020
  • To prepare effectively for damage caused by weather events, it is important to proactively identify the possible impacts of weather phenomena on domestic society and the economy. This paper uses text mining and network analysis to build a database of the types and levels of damage caused by heat waves. We collect news articles about heat waves from the SBS news website and determine their primary and secondary effects through network analysis. In addition, we estimate how much influence each factor has based on the frequency with which each impact keyword is mentioned. As a result, the types of impacts caused by heat waves are efficiently derived. Among these types, we find that people in South Korea are mainly interested in algae and heat-related illness. Because this analysis technique can be applied not only to news articles but also to social media content, such as Twitter and Facebook, it is expected to be a useful tool for building weather impact databases.
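The frequency and network steps described in this abstract can be illustrated with a minimal sketch. The keyword set, the toy articles, and the `analyze` helper are all hypothetical and are not taken from the paper; the sketch only assumes that influence is approximated by how many articles mention a keyword and that network edges are keyword pairs appearing in the same article.

```python
from collections import Counter
from itertools import combinations

# Hypothetical impact keywords and toy articles (not from the paper).
IMPACT_KEYWORDS = {"algae", "heatstroke", "blackout", "livestock"}

articles = [
    "heat wave spreads algae in rivers and raises heatstroke cases",
    "heatstroke patients rise as blackout hits the city",
    "algae bloom worsens and livestock deaths increase",
]


def analyze(docs, keywords):
    """Return per-keyword article counts and keyword-pair co-occurrence counts."""
    freq = Counter()   # number of articles mentioning each keyword
    edges = Counter()  # number of articles mentioning both keywords of a pair
    for doc in docs:
        # Keywords present in this article, sorted so each pair has a stable order.
        present = sorted(keywords & set(doc.lower().split()))
        freq.update(present)
        edges.update(combinations(present, 2))
    return freq, edges


freq, edges = analyze(articles, IMPACT_KEYWORDS)
for keyword, count in freq.most_common():
    print(keyword, count)
for pair, weight in edges.items():
    print(pair, weight)
```

The edge counts could then be passed to a graph library as weighted edges; that step is omitted here to keep the sketch dependency-free.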

Performance Evaluations of Text Ranking Algorithms

  • Kim, Myung-Hwi; Jang, Beakcheol
    • Journal of the Korea Society of Computer and Information / v.25 no.2 / pp.123-131 / 2020
  • Text ranking algorithms are a representative method for keyword extraction, and their importance is widely emphasized. In this paper, we compare the performance of the TF-IDF, SMART, INQUERY, and CCA algorithms, which are used for text ranking in recent research and experiments. After explaining each algorithm, we compare their performance on data collected from news and Twitter. Experimental results show that all four algorithms extract specific words from news data equally well. On Twitter data, however, CCA extracts specific words best and INQUERY performs worst. We also analyze the accuracy of the algorithms using six comparison metrics. The results show that CCA is the most accurate on the news data. On Twitter data, TF-IDF and CCA show similarly good performance.
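Of the four algorithms compared, TF-IDF is the simplest to illustrate. The sketch below is a generic textbook variant (relative term frequency multiplied by the natural log of inverse document frequency), not necessarily the exact weighting used in the paper; the `tfidf_rank` function and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_rank(docs):
    """Rank the terms of each document by TF-IDF, highest score first."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    ranked = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked.append(sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])))
    return ranked


# Toy documents (hypothetical, not from the paper's dataset).
docs = [
    "heat wave news heat",
    "election news today",
    "sports news today",
]
ranked = tfidf_rank(docs)
for doc, terms in zip(docs, ranked):
    print(doc, "->", terms[0][0])
```

A term that appears in every document, such as "news" here, gets an inverse document frequency of log(1) = 0 and therefore drops to the bottom of each ranking.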

CNN-based Skip-Gram Method for Improving Classification Accuracy of Chinese Text

  • Xu, Wenhua; Huang, Hao; Zhang, Jie; Gu, Hao; Yang, Jie; Gui, Guan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.12 / pp.6080-6096 / 2019
  • Text classification is one of the fundamental techniques in natural language processing. Numerous studies are based on text classification, such as news subject classification, question answering system classification, and movie review classification. Traditional text classification methods extract features and then classify them. However, these methods are too complex to operate, and their accuracy is not sufficiently high. Recently, a convolutional neural network (CNN) based one-hot method has been proposed for text classification to solve this problem. In this paper, we propose an improved method for Chinese text classification that uses a CNN-based skip-gram method, and we evaluate it on the Sogou news corpus. Experimental results indicate that the CNN with the skip-gram model performs more efficiently than the CNN-based one-hot method.

A News Video Mining based on Multi-modal Approach and Text Mining (멀티모달 방법론과 텍스트 마이닝 기반의 뉴스 비디오 마이닝)

  • Lee, Han-Sung; Im, Young-Hee; Yu, Jae-Hak; Oh, Seung-Geun; Park, Dai-Hee
    • Journal of KIISE:Databases / v.37 no.3 / pp.127-136 / 2010
  • With the rapid growth of information and computer communication technologies, the number of digital documents, including multimedia data, has recently exploded. In particular, news video databases and news video mining have become the subject of extensive research aimed at developing effective and efficient tools for manipulating and analyzing news videos, because of their information richness. However, much of this research focuses on browsing, retrieval, and summarization of news videos. Discovering and analyzing the plentiful latent semantic knowledge in news videos is still at a relatively early stage. In this paper, we propose a news video mining system based on a multi-modal approach and text mining, which uses the visual-textual information of news video clips and their scripts. The proposed system automatically and systematically constructs a taxonomy of news video stories with a hierarchical clustering algorithm, which is one of the text mining methods. It then analyzes the topics of news video stories from multiple angles by means of a time-cluster trend graph, a weighted cluster growth index, and network analysis. To validate our approach, we analyzed the news videos on "The Second Summit of South and North Korea in 2007".

Development of a Fake News Detection Model Using Text Mining and Deep Learning Algorithms (텍스트 마이닝과 딥러닝 알고리즘을 이용한 가짜 뉴스 탐지 모델 개발)

  • Dong-Hoon Lim; Gunwoo Kim; Keunho Choi
    • Information Systems Review / v.23 no.4 / pp.127-146 / 2021
  • Fake news spreads and is reproduced rapidly, regardless of its authenticity, owing to the characteristics of modern society, the so-called information age. Assuming that 1% of all news is fake, the economic cost is reported to be about 30 trillion Korean won. This shows that fake news is a very important social and economic issue. Therefore, this study aims to develop an automated detection model to quickly and accurately verify the authenticity of news. To this end, this study crawled news data whose authenticity had been verified and developed fake news prediction models using word embeddings (Word2Vec, fastText) and deep learning algorithms (LSTM, BiLSTM). Experimental results show that the prediction model using BiLSTM with Word2Vec achieved the best accuracy of 84%.

Study on Effective Extraction of New Coined Vocabulary from Political Domain Article and News Comment (정치 도메인에서 신조어휘의 효과적인 추출 및 의미 분석에 대한 연구)

  • Lee, Jihyun; Kim, Jaehong; Cho, Yesung; Lee, Mingu; Choi, Hyebong
    • The Journal of the Convergence on Culture Technology / v.7 no.2 / pp.149-156 / 2021
  • Text mining is a useful tool for discovering public opinion and perception regarding political issues from big data. It is very common for social media users to express their opinions with newly coined words such as slang and emoji. However, those new words are not effectively captured by traditional text mining methods that process text data using a language dictionary. In this study, we propose effective methods to extract newly coined words that connote the political stance and opinions of users. Using various text mining techniques, we attempt to discover the context and political meaning of the new words.

Analysis of News Regarding New Southeastern Airport Using Text Mining Techniques (텍스트 마이닝 기법을 활용한 동남권 신공항 신문기사 분석)

  • Han, Mu Moung Cho; Kim, Yang Sok; Lee, Choong Kwon
    • Smart Media Journal / v.6 no.1 / pp.47-53 / 2017
  • Social issues are important factors in deciding government policy, and newspapers are critical channels that reflect them. Analyzing news articles can contribute to understanding social issues, but it is very difficult to analyze large volumes of unstructured news data manually. Therefore, this study aims to analyze the different views among stakeholders of a specific social issue by using text analysis, word cloud analysis, and associative analysis methods, which systematically transform unstructured news data into structured data. We analyzed a total of 115 news articles and 6,772 comments collected over two weeks from the selected newspapers (Chosun-Il-bo, Joongang-Il-bo, Donga-Il-bo, Maeil Newspaper, Busan-Il-bo). We found significant differences in tone between newspapers. While nationwide daily newspapers focus on political relations with local areas, local daily newspapers tend to write articles that represent local governments' interests.

Emerging Gender Issues in Korean Online Media: A Temporal Semantic Network Analysis Approach

  • Lee, Young-Joo; Park, Ji-Young
    • Journal of Contemporary Eastern Asia / v.18 no.2 / pp.118-141 / 2019
  • In South Korea, as awareness of gender equality has increased since the 1990s, policies for gender equality and social awareness of equality have been established. Until recently, however, the gap between men and women in social and economic activities has not reached the globally desired level, and this has led to social conflict throughout the country. In this study, we analyze the content of online news comments to understand the public perception of gender equality and the details of gender conflict, and to grasp how emerging issues on gender equality appear and spread. We collected text data from online news that included the word 'gender equality' posted from January 2012 to June 2017 and also collected comments on each selected news item. Through text mining and temporal semantic network analysis, we tracked the changes in discourse on gender equality and conflict. Results revealed that gender conflicts are increasing in online media, and the focus of conflict is shifting from 'position and role inequality' to 'opportunity inequality'.

Development of Real Time Information Service Model Using Smart Phone Lock Screen (스마트 폰 잠금 화면을 통한 실시간 정보제공 서비스 모델의 개발)

  • Oh, Sung-Jin; Jang, Jin-Wook
    • Journal of Information Technology Services / v.13 no.3 / pp.323-331 / 2014
  • This research proposes a real-time service model that uses the lock screen of smart devices, the screen most frequently exposed to device users. The lock screen has immense potential because of how long it is exposed to the user, and the effect can be maximized by offering useful information content on it. The service model provides real-time keywords together with abridged sentences: it matches each real-time keyword with news using a text matching algorithm and extracts a key sentence from the news to provide the user with a short sentence. A user evaluation was conducted after adding a function that lets users move from the key sentence matched to a real-time query on the lock screen to the original news article. In an average of 5.71% of cases, users felt that the provided key sentence was insufficient and moved to the original article. This confirmed that, in most cases, the key sentence extracted by the algorithm alone was enough for users to accurately determine why a keyword was trending in real time.

Pilot Experiment for Named Entity Recognition of Construction-related Organizations from Unstructured Text Data

  • Baek, Seungwon; Han, Seung H.; Jung, Wooyong; Kim, Yuri
    • International conference on construction engineering and project management / 2022.06a / pp.847-854 / 2022
  • The aim of this study is to develop a Named Entity Recognition (NER) model to automatically identify construction-related organizations in news articles. News articles were collected using a web crawling technique, and construction-related organizations were labeled in a total of 1,000 news articles. The Bidirectional Encoder Representations from Transformers (BERT) model was used to recognize clients, constructors, consultants, engineers, and others. In this pilot experiment, the best average F1 score of the NER model was 0.692. The result of this study is expected to contribute to the establishment of international business strategies by collecting timely information and analyzing it automatically.
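For readers unfamiliar with the metric, the following minimal sketch shows how an entity-level F1 score can be computed from exact-match spans. The abstract does not say how the reported 0.692 was averaged, so this only illustrates the basic formula; the `f1_score` helper, the span format `(start, end, label)`, and the toy labels are assumptions.

```python
def f1_score(gold, predicted):
    """Entity-level F1: an entity counts as correct only if span and label both match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations as (start_token, end_token, label) tuples.
gold = {(0, 2, "CLIENT"), (5, 7, "CONSTRUCTOR"), (9, 10, "CONSULTANT")}
pred = {(0, 2, "CLIENT"), (5, 7, "ENGINEER")}

score = f1_score(gold, pred)
print(round(score, 3))
```

In this toy case one of two predictions is correct (precision 0.5) and one of three gold entities is found (recall 1/3), which gives an F1 of 0.4.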
