• Title/Summary/Keyword: Co-occurrence Networks

An Investigation on the Periodical Transition of News related to North Korea using Text Mining (텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰)

  • Park, Chul-Soo
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.63-88 / 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of how North Korea is represented in South Korean mass media. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and methods of analysis of the text mining techniques, confirmed by analysis of the data. In this study, the R program, free software for statistical computing and graphics, was used to apply the text mining technique. Text mining methods make it possible to highlight the most frequently used keywords in a body of text and to create a word cloud, also referred to as a text cloud or tag cloud. This study proposes a procedure for finding meaningful tendencies based on a combination of word clouds and co-occurrence networks. It aims to explore more objectively the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea in newspaper big data from November 1, 2016 to May 23, 2019. We divided the data into three periods reflecting recent inter-Korean relations. The period before January 1, 2018 was set as the Before Phase of Peace Building. The period from January 1, 2018 to February 24, 2019 was set as the Peace Building Phase, during which Kim Jong-un's New Year's message and the Pyeongchang Olympics created an atmosphere of peace on the Korean peninsula. The third period, after the Hanoi Peace summit, was marked by silence in the relationship between North Korea and the United States and was therefore called the Depression Phase of Peace Building. This study analyzes news articles related to North Korea from the Korea Press Foundation database (www.bigkinds.or.kr) through text mining, to investigate the characteristics of the Kim Jong-un regime's South Korea policy and unification discourse.
The main results of this study show that trends in the North Korean national policy agenda can be discovered using clustering and visualization algorithms. In particular, the study examines changes in international circumstances, domestic conflicts, living conditions in North Korea, the South's aid projects for the North, conflicts between the two Koreas, the North Korean nuclear issue, and the North Korean refugee problem through co-occurrence word analysis. It also offers an analysis of South Korean attitudes toward North Korea in terms of semantic prosody. In the Before Phase of Peace Building, the top-ranked keywords were, in order, 'Missiles', 'North Korea Nuclear', 'Diplomacy', 'Unification', and 'South-North Korean'. In the Peace Building Phase, they were 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. In the Depression Phase of Peace Building, they were 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. There are 16 words adopted in all three periods. The order is as follows: 'missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect that the results of this study will contribute to analyzing trends in news content about North Korea associated with North Korea's provocations, and that future research on North Korean trends will build on these results. We will continue to study model development for North Korea risk measurement that can anticipate and respond to North Korea's behavior in advance. We expect that text mining analysis methods and scientific data analysis techniques will be applied to the field of North Korea and unification research.
Through these academic studies, I hope to see a lot of studies that make important contributions to the nation.
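
The combination of keyword frequencies and co-occurrence networks described in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the R workflow the authors report using. The function name `cooccurrence_edges`, the sample tokens, and the choice to count co-occurrence at the level of a whole document are all assumptions made for the example; the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, min_count=1):
    """Count how often each unordered word pair appears in the same document."""
    pair_counts = Counter()
    for tokens in documents:
        # Use each distinct word once per document, sorted so (a, b) == (b, a).
        unique_words = sorted(set(tokens))
        pair_counts.update(combinations(unique_words, 2))
    # Keep only pairs that co-occur at least min_count times.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical, already-tokenized articles (not data from the paper).
docs = [
    ["missile", "sanctions", "diplomacy"],
    ["missile", "sanctions"],
    ["peace", "diplomacy", "unification"],
]

# Word frequencies are what a word cloud would visualize.
word_freq = Counter(word for doc in docs for word in doc)

# Weighted edges of the co-occurrence network.
edges = cooccurrence_edges(docs, min_count=2)
```

Here `word_freq` holds the counts a word cloud would display, and `edges` holds the weighted links of the network; raising `min_count` prunes weak links before visualization.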

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as the development of information and communication technology has produced many kinds of online news media, the volume of news about events occurring in society has increased greatly. Automatically summarizing key events from massive amounts of news data will therefore help users survey many events at a glance. In addition, building and providing an event network based on the relevance between events can greatly help readers understand current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, then preprocessed them using NPMI and Word2Vec to retain only meaningful words and merge synonyms. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the topic distribution by date, and events were detected by finding peaks in that distribution. A total of 32 topics were extracted from the topic modeling, and the time at which each event occurred was inferred from the point at which the corresponding topic distribution surged. As a result, a total of 85 events were detected, of which 16 were retained and presented after filtering with a Gaussian smoothing technique. We also calculated a relevance score between the detected events in order to construct the event network. The relevance between events was computed as the cosine coefficient between co-occurring events, and related events were connected. Finally, we set up the event network by assigning each event to a vertex and the relevance score between events to the edge connecting the corresponding vertices.
The event network constructed with our method helped us sort the major political and social events that occurred in Korea over the past year into chronological order and, at the same time, identify which events are related to one another. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to analyze large amounts of data easily and to identify relevance between events that was difficult to detect with existing methods. In text preprocessing, we applied various text mining techniques together with Word2vec to improve the accuracy of extracting proper nouns and compound nouns, which has been difficult in the analysis of Korean texts. The event detection and network construction techniques in this study have the following advantages in practical application. First, LDA topic modeling, an unsupervised learning method, can easily extract topics, topic words, and their distributions from huge amounts of data. Also, by using the date information of the collected news articles, the distribution of each topic can be expressed as a time series. Second, by calculating relevance scores from the co-occurrence of topics and constructing an event network, we can present the connections between events in a summarized form, which is difficult to grasp with existing event detection. This is supported by the fact that the inter-event relevance-based event network proposed in this study was in fact constructed in order of occurrence time. The event network also makes it possible to identify the event that served as the starting point for a series of events. One limitation of this study is that LDA topic modeling yields different results depending on the initial parameters and the number of topics, and that topic and event names must be assigned by the subjective judgment of the researcher.
Also, since each topic is assumed to be exclusive and independent, it does not take into account the relevance between themes. Subsequent studies need to calculate the relevance between events that are not covered in this study or those that belong to the same subject.
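
The pipeline summarized in this abstract (smooth each topic's daily share, treat surges as events, then link events by cosine relevance) can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names, the kernel width `sigma`, the peak `threshold`, the relevance cutoff of 0.5, and the toy vectors are all hypothetical, and the abstract does not say how the vectors compared by the cosine coefficient were built.

```python
import math
from itertools import combinations


def gaussian_smooth(series, sigma=1.0):
    """Smooth a daily topic-share series with a normalized Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    smoothed = []
    for i in range(len(series)):
        total, weight_sum = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(series), i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
            total += w * series[j]
            weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed


def find_peaks(series, threshold):
    """Return interior indices that exceed the threshold and their neighbours."""
    return [
        i for i in range(1, len(series) - 1)
        if series[i] > threshold
        and series[i] > series[i - 1]
        and series[i] >= series[i + 1]
    ]


def cosine(u, v):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical daily share of one LDA topic; the surge on day 3 is the event.
topic_share = [0.1, 0.1, 0.1, 0.8, 0.1, 0.1, 0.1]
event_days = find_peaks(gaussian_smooth(topic_share), threshold=0.2)

# Hypothetical event vectors (assumed here to be co-occurrence profiles).
events = {"A": [1, 1, 0], "B": [1, 1, 0], "C": [0, 0, 1]}
# Vertices are events; an edge carries the relevance score when it passes 0.5.
event_edges = {}
for a, b in combinations(sorted(events), 2):
    score = cosine(events[a], events[b])
    if score > 0.5:
        event_edges[(a, b)] = score
```

Smoothing before peak detection is what suppresses one-day noise spikes, which is one plausible reading of how the candidate events were filtered; the cutoff on the cosine score controls how dense the resulting event network is.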

Semantic Network Analysis of Presidential Debates in 2007 Election in Korea (제17대 대통령 후보 합동 토론 언어네트워크 분석 - 북한 관련 이슈를 중심으로)

  • Park, Sung-Hee
    • Korean journal of communication and information / v.45 / pp.220-254 / 2009
  • Presidential TV debates serve as an important instrument for general viewers to evaluate the candidates’ character, to examine their policies, and finally to make the important political decision of casting a ballot. Every word candidates utter over the course of an election campaign exerts some influence by delivering their ideas and by creating clashes with their respective opponents. This study focuses on the conceptual venue in which these clashes take place, termed ‘stasis’ by ancient rhetoricians, and examines the word choices made by each candidate and the manner in which the candidates form stasis, call for evidence, educate the public, and finally create a legitimate form of political argumentation. The study applied computer-based content analysis using KrKwic and UCINET software to analyze semantic networks among the candidates. The results showed that the three major candidates, namely Lee Myung Bak, Jung Dong Young, and Lee Hoi Chang, displayed distinct patterns of language use, selecting words that were often neglected by their opponents. The absence of stasis and the lack of a shared language significantly undermined the effects of the debates. Central questions regarding North Korea issues failed to meet basic requirements, and the respondents failed to engage in an effective argumentation process.

Using Text Mining for the Analysis of Research Trends Related to Laws Under the Ministry of Oceans and Fisheries (텍스트 마이닝을 활용한 해양수산부 법률 관련 연구동향 분석연구)

  • Hwang, Kyu Won;Lee, Moon Suk;Yun, So Ra
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.4 / pp.549-566 / 2022
  • Recently, artificial intelligence (AI) technology has progressed rapidly, and the number of industries using it is increasing significantly. Further, analysis using text mining, an application of artificial intelligence, is being actively developed in social science research. About 125 laws, including joint laws, have been enacted under the Ministry of Oceans and Fisheries in sectors including the marine environment, fisheries, ships, fishing villages, and ports. Research on the laws under the Ministry of Oceans and Fisheries has been conducted steadily and is increasing in volume. In this study, domestic research trends were analyzed through text mining, targeting research papers related to laws of the Ministry of Oceans and Fisheries. First, topic modeling, a type of text mining, was performed to identify latent topics. Second, co-occurrence network analysis was performed on the keywords of research papers dealing with specific laws to derive the key themes covered. Finally, author network analysis was performed to explore social networks among authors. The results showed that key topics changed by period, and subjects were explored for the Ship Safety Law, Marine Environment Management Law, Fisheries Law, and others. Furthermore, core researchers were identified based on author network analysis, and tendencies in joint research among authors were identified. Through this study, changes in research topics related to the laws of the Ministry of Oceans and Fisheries were identified up to the present, and it is expected that future research topics will diversify further and that both quantitative and qualitative research in the field of oceans and fisheries will grow.

Operation Measures of Sea Fog Observation Network for Inshore Route Marine Traffic Safety (연안항로 해상교통안전을 위한 해무관측망 운영방안에 관한 연구)

  • Joo-Young Lee;Kuk-Jin Kim;Yeong-Tae Son
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.2 / pp.188-196 / 2023
  • Among marine accidents caused by bad weather, restricted visibility due to sea fog leads to accidents such as ship stranding and hull bottom damage, along with the resulting casualties, and such accidents continue to occur every year. In addition, low visibility at sea is emerging as a social problem: it causes considerable inconvenience to islanders who rely on passenger ships, because sailings are delayed and controlled collectively even when conditions differ between regions. Moreover, such control measures are becoming more problematic because visibility cannot be objectively quantified, owing to regional deviations and criteria for judging observations that differ from person to person. Currently, the VTS of each port controls ship operations when the visibility distance is less than 1 km, but objective data collection is limited because the assessment of visibility in sea fog depends on a visibility meter or visual observation. To address these obstacles to marine traffic safety, the government is building marine weather signal signs and sea fog observation networks for sea fog detection and prediction, but the system for observing locally occurring sea fog remains very inadequate in practice. Accordingly, this paper examines domestic and foreign policy trends for solving the social problems caused by low visibility at sea and, by investigating and analyzing those problems, provides basic data on the need for government support to ensure maritime traffic safety in sea fog. It also aims to establish a more stable maritime traffic operation system by blocking in advance the marine safety risks that may ultimately arise from sea fog.

The Framework of Research Network and Performance Evaluation on Personal Information Security: Social Network Analysis Perspective (개인정보보호 분야의 연구자 네트워크와 성과 평가 프레임워크: 소셜 네트워크 분석을 중심으로)

  • Kim, Minsu;Choi, Jaewon;Kim, Hyun Jin
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.177-193 / 2014
  • Over the past decade, there has been a rapid diffusion of electronic commerce and a rising number of interconnected networks, resulting in an escalation of security threats and privacy concerns. Electronic commerce has a built-in trade-off between the necessity of providing at least some personal information to consummate an online transaction, and the risk of negative consequences from providing such information. More recently, the frequent disclosure of private information has raised concerns about privacy and its impacts. This has motivated researchers in various fields to explore information privacy issues to address these concerns. Accordingly, the need for information privacy policies and technologies for collecting and storing data has increased, as has information privacy research in fields such as medicine, computer science, business, and statistics. The occurrence of various information security accidents has made finding experts in the information security field an important issue. Objective measures for finding such experts are required, as the process is currently rather subjective. Based on social network analysis, this paper focused on a framework to evaluate the process of finding experts in the information security field. We collected data from the National Discovery for Science Leaders (NDSL) database, initially collecting about 2000 papers covering the period between 2005 and 2013. Outliers and the data of irrelevant papers were dropped, leaving 784 papers to test the suggested hypotheses. The co-authorship network data for co-author relationship, publisher, affiliation, and so on were analyzed using social network measures including centrality and structural hole. The results of our model estimation are as follows. With the exception of Hypothesis 3, which deals with the relationship between eigenvector centrality and performance, all of our hypotheses were supported.
In line with our hypothesis, degree centrality (H1) was supported with its positive influence on the researchers' publishing performance (p<0.001). This finding indicates that as the degree of cooperation increased, the more the publishing performance of researchers increased. In addition, closeness centrality (H2) was also positively associated with researchers' publishing performance (p<0.001), suggesting that, as the efficiency of information acquisition increased, the more the researchers' publishing performance increased. This paper identified the difference in publishing performance among researchers. The analysis can be used to identify core experts and evaluate their performance in the information privacy research field. The co-authorship network for information privacy can aid in understanding the deep relationships among researchers. In addition, extracting characteristics of publishers and affiliations, this paper suggested an understanding of the social network measures and their potential for finding experts in the information privacy field. Social concerns about securing the objectivity of experts have increased, because experts in the information privacy field frequently participate in political consultation, and business education support and evaluation. In terms of practical implications, this research suggests an objective framework for experts in the information privacy field, and is useful for people who are in charge of managing research human resources. This study has some limitations, providing opportunities and suggestions for future research. Presenting the difference in information diffusion according to media and proximity presents difficulties for the generalization of the theory due to the small sample size. Therefore, further studies could consider an increased sample size and media diversity, the difference in information diffusion according to the media type, and information proximity could be explored in more detail. 
Moreover, previous network research has commonly observed a causal relationship between the independent and dependent variables (Kadushin, 2012). In this study, degree centrality as an independent variable might have a causal relationship with performance as a dependent variable. However, in network analysis research, network indices can be computed only after the network relationship is created. An annual analysis could help mitigate this limitation.
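
The two centrality measures that this abstract reports as supported (degree centrality for H1, closeness centrality for H2) can be illustrated on a toy co-authorship network. The sketch below is an assumption-laden example, not the authors' procedure: the function names, the author labels, and the paper list are hypothetical, closeness is computed within the reachable component only, and eigenvector centrality and structural holes are left out for brevity.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected graph linking every pair of authors on a paper."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of the other researchers each author has co-published with."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph, source):
    """Reachable authors divided by the sum of shortest-path lengths (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical author lists, one per paper (labels are placeholders).
papers = [["A", "B"], ["A", "C"], ["A", "B", "D"]]
graph = coauthor_graph(papers)
degree = degree_centrality(graph)
closeness = {node: closeness_centrality(graph, node) for node in graph}
```

In this toy network, author `A` has co-published with everyone and therefore scores highest on both measures, while `C` is connected only through `A`. In a study like the one described, such scores would then be related to each researcher's publishing performance.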