• Title/Summary/Keyword: Co-occurrence Networks

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

  • Kam, Miah;Song, Min
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.53-77 / 2012
  • This study analyzes differences in content and tone of argument among three major Korean newspapers: the Kyunghyang Shinmun, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that Korean newspapers explicitly convey their own tone of argument when covering sensitive issues and topics. This can be problematic when readers are unaware of a paper's tone, because both content and tone can easily influence them. A tool that informs readers of a newspaper's tone of argument is therefore desirable. This study presents the results of clustering and classification techniques applied as part of a text mining analysis. We focus on six main sections (Culture, Politics, International, Editorial-opinion, Eco-business, and National issues) and attempt to identify differences and similarities among the newspapers. The basic unit of analysis is a paragraph of a news article. A keyword-network analysis tool is used to visualize relationships among keywords so that the differences are easier to see. Articles were gathered from KINDS, the Korean integrated news database system, which preserves articles from all three newspapers and is open to the public. About 3,030 articles from 2008 to 2012 were used. Articles in the International, National issues, and Politics sections were collected by specific issue: the International section with the keyword 'Nuclear weapon of North Korea,' the National issues section with the keyword '4-major-river,' and the Politics section with the keyword 'Tonghap-Jinbo Dang.' In addition, all Eco-business, Culture, and Editorial-opinion articles from April 2012 to May 2012 were collected.
All collected data were split into paragraphs. Stop words were removed using the Lucene Korean Module. We counted keyword co-occurrences from the list of keyword pairs appearing in the same paragraph and built a co-occurrence matrix from that list. Once the co-occurrence matrix was built, a cosine coefficient matrix was used as input for PFNet (Pathfinder Network). To find the significant keywords in each newspaper, we examined the 10 most frequent keywords and the keyword networks of the 20 most frequent keywords, which show the relationships among keywords in detail. We used NodeXL to visualize the PFNet. After drawing all the networks, we compared them with the classification results. Classification was performed first to identify how one newspaper's tone of argument differs from the others. To analyze tone, all paragraphs were divided into two classes, Positive and Negative. A supervised learning technique was used to classify the tone of the collected paragraphs and articles: the Naïve Bayesian classifier provided in the MALLET package. Classification results were evaluated with Precision, Recall, and F-value. The results show that three sections (Culture, Eco-business, and Politics) differed in content and tone of argument among the three newspapers. In addition, for National issues, the tones of argument on the 4-major-rivers project differed from one another. The three newspapers appear to have their own specific tone of argument in those sections. Keyword networks for the same section and period also had different shapes across newspapers.
This means that the frequently occurring keywords differ among the newspapers and that their content is built from different keywords. The Positive-Negative classification also showed that it is possible to classify a newspaper's tone of argument relative to the others. These results indicate that the approach in this study is promising as the basis for a new tool that identifies differences in newspapers' tones of argument.
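
The co-occurrence and cosine steps described in this abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the toy paragraphs, the function names, and the choice to leave the matrix diagonal at zero are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_counts(paragraphs):
    """Count how often each unordered keyword pair shares a paragraph."""
    counts = Counter()
    for keywords in paragraphs:
        # Deduplicate so a pair is counted at most once per paragraph.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence_matrix(counts):
    """Expand pair counts into a symmetric matrix stored as nested dicts."""
    vocab = sorted({word for pair in counts for word in pair})
    matrix = {w: {v: 0 for v in vocab} for w in vocab}
    for (a, b), n in counts.items():
        matrix[a][b] = n
        matrix[b][a] = n
    return vocab, matrix


def cosine(u, v):
    """Cosine coefficient of two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def cosine_matrix(vocab, matrix):
    """Compare every pair of keywords by their co-occurrence rows."""
    rows = {w: [matrix[w][v] for v in vocab] for w in vocab}
    return {a: {b: cosine(rows[a], rows[b]) for b in vocab} for a in vocab}


# Hypothetical keyword lists, one per paragraph (stop words already removed).
paragraphs = [
    ["river", "project", "budget"],
    ["river", "project"],
    ["budget", "election"],
]
counts = cooccurrence_counts(paragraphs)
vocab, matrix = cooccurrence_matrix(counts)
similarity = cosine_matrix(vocab, matrix)
```

In this toy data 'project' and 'river' co-occur in two paragraphs; their cosine coefficient comes from comparing their whole co-occurrence rows, not from that single count.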

Student Group Division Algorithm based on Multi-view Attribute Heterogeneous Information Network

  • Jia, Xibin;Lu, Zijia;Mi, Qing;An, Zhefeng;Li, Xiaoyong;Hong, Min
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.3836-3854 / 2022
  • Dividing students into groups helps universities manage students on the basis of group profiles. With the widespread use of student smart cards on campus, especially where students live in campus residence halls, students' daily activities are recorded with information such as card swipe time and location. It is therefore feasible to characterize students by their daily activity data and to group them using objective measures of campus behavior together with the regular student attributes collected in the management system. However, feature representation is challenging because student data take diverse forms. To represent students' behavior effectively and comprehensively for group division, we propose using smart card activity data and student attributes as input, taking into account activity and attribute relationship types from different perspectives. Specifically, we propose a novel student group division method based on a multi-view student attribute heterogeneous information network (MSA-HIN). The nodes of the MSA-HIN represent students with their multi-dimensional attribute information, and the edges characterize different relationships between students, such as co-major, co-occurrence, and co-borrowing of books. Based on the MSA-HIN, embedded representations of students are learned and a deep graph clustering algorithm is applied to divide students into groups. Comparative experiments were conducted on a real-life campus dataset collected from a university. The results demonstrate that our method effectively reveals the variability of student attributes and relationships and accordingly achieves the best clustering results for group division.
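
As a rough illustration of the multi-view edges mentioned in this abstract, the sketch below builds one edge set per relationship type. The record layouts, the names `pairs_sharing_key` and `build_views`, and the rule that two students "co-occur" when they swipe at the same place in the same hour are assumptions for the example; the paper's actual network construction, embedding, and deep graph clustering are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def pairs_sharing_key(records):
    """Given (student, key) records, return all student pairs that share a key."""
    groups = defaultdict(set)
    for student, key in records:
        groups[key].add(student)
    pairs = set()
    for members in groups.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def build_views(majors, swipes, loans):
    """Build one edge set per relationship type (one 'view' each)."""
    return {
        "co-major": pairs_sharing_key(majors.items()),
        # Assumed rule: same location within the same hour counts as co-occurrence.
        "co-occurrence": pairs_sharing_key(
            (student, (place, hour)) for student, place, hour in swipes
        ),
        "co-borrowing": pairs_sharing_key(loans),
    }


# Hypothetical campus records.
majors = {"s1": "CS", "s2": "CS", "s3": "Math"}
swipes = [("s1", "canteen", 12), ("s3", "canteen", 12), ("s2", "library", 9)]
loans = [("s2", "book-A"), ("s3", "book-A")]

views = build_views(majors, swipes, loans)
```

Keeping each relationship type in its own edge set preserves the distinction between views, which a single merged graph would lose.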

Discovering Meaningful Trends in the Inaugural Addresses of North Korean Leader Via Text Mining (텍스트마이닝을 활용한 북한 지도자의 신년사 및 연설문 트렌드 연구)

  • Park, Chul-Soo
    • Journal of Information Technology Applications and Management / v.26 no.3 / pp.43-59 / 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of North Korean New Year addresses, which are among the most important and authoritative documents publicly announced by the North Korean government. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and method of analysis of the text mining techniques, confirmed by analysis of the data. We propose a procedure to find meaningful tendencies based on a combination of text mining, cluster analysis, and co-occurrence networks. To demonstrate the applicability and effectiveness of the proposed procedure, we analyzed the inaugural addresses of Kim Jung Un of North Korea from 2017 to 2019. The main results show that trends in the North Korean national policy agenda can be discovered with clustering and visualization algorithms. We found that the uncovered semantic structures of the North Korean New Year addresses closely follow major changes in the North Korean government's positions toward its own people as well as toward outside audiences such as the USA and South Korea.

Knowledge Structure of the Korean Journal of Occupational Health Nursing through Network Analysis (네트워크분석을 통한 직업건강간호학회지 논문의 지식구조 분석)

  • Kwon, Sun Young;Park, Eun Jung
    • Korean Journal of Occupational Health Nursing / v.24 no.2 / pp.76-85 / 2015
  • Purpose: The purpose of this study was to identify the knowledge structure of the Korean Journal of Occupational Health Nursing from 1991 to 2014. Methods: A total of 400 articles published between 1991 and 2014 were collected, and 1,369 keywords (noun phrases) were extracted from the articles and standardized for analysis. A co-occurrence matrix was generated using a cosine similarity measure, and the network was then analyzed and visualized using PFNet. NodeXL was also applied to visualize intellectual interchanges among keywords. Results: According to the content analysis and the cluster analysis of author keywords from the Korean Journal of Occupational Health Nursing articles, the 7 most important research topics of the journal were 'Workers & Work-related Health Problem', 'Recognition & Preventive Health Behaviors', 'Health Promotion & Quality of Life', 'Occupational Health Nursing & Management', 'Clinical Nursing Environment', 'Caregivers and Social Support', and 'Job Satisfaction, Stress & Performance'. Newly emerging topics were observed in 4-year periods as research trends. Conclusion: This study identified the knowledge structure of the Korean Journal of Occupational Health Nursing. The network analysis will be useful for identifying the knowledge structure as well as for obtaining a general view of current research trends. Furthermore, the results could be used to set research directions for the Korean Journal of Occupational Health Nursing.
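
The abstract names PFNet but not its parameters. As a hedged sketch, the code below assumes the commonly used setting PFNET(r = ∞, q = n − 1), in which a direct link survives only if no indirect path has a smaller "largest single step". Converting similarity to distance as 1 − similarity is also an assumption, and the three-keyword similarity matrix is invented for illustration.

```python
def pathfinder_network(dist):
    """Return the links kept by PFNET(r=inf, q=n-1) for a symmetric distance matrix.

    dist[i][j] is the distance between nodes i and j (0 on the diagonal).
    """
    n = len(dist)
    # minimax[i][j]: the smallest achievable "largest step" over all paths i -> j.
    minimax = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    # A direct link is kept only if no indirect path beats it.
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dist[i][j] <= minimax[i][j]
    ]


# Hypothetical cosine similarities among three keywords (indices 0, 1, 2).
sim = [
    [1.0, 0.9, 0.3],
    [0.9, 1.0, 0.8],
    [0.3, 0.8, 1.0],
]
dist = [
    [0.0 if i == j else 1.0 - sim[i][j] for j in range(3)]
    for i in range(3)
]
kept = pathfinder_network(dist)
```

Here the weak direct link between keywords 0 and 2 is pruned because the path through keyword 1 never takes a step as long as that link, which is how this kind of pruning thins a dense similarity matrix into a readable map.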

Research Trends of Studies Related to the Nature of Science in Korea Using Semantic Network Analysis (언어 네트워크 분석을 이용한 과학의 본성에 관한 국내연구 동향)

  • Lee, Sang-Gyun
    • Journal of the Korean Society of Earth Science Education / v.9 no.1 / pp.65-87 / 2016
  • The purpose of this study is to examine Korean science education journals in order to analyze research trends on the nature of science in Korea. The subjects of the study are papers in journals at the Korean Citation Index level (KCI-listed and KCI listing candidates) that can be retrieved with the key phrase "Nature of science" in Korean through the RISS service. Descriptive statistical analysis was used to count the research articles and classify them by year and by journal. Semantic network analysis was also conducted, covering word cloud analysis of keyword frequency, centrality analysis, co-occurrence analysis, and cluster dendrogram analysis across the research articles. The results show that 91 research papers were published in 25 journals from 1991 to 2015. Specifically, 2 major journals published more than 50% of all the papers. In addition, key phrases such as 'Analysis', 'recognition', 'lessons', 'science textbook', 'History of Science', and 'influence' were the most frequently used among the studies. Finally, small language networks appeared together as follows: [Nature of science - high school student - recognize], [Explicit - lesson - effect], [elementary school - science textbook - analysis]. Research topics have gradually diversified. However, many studies still focus on analysis and research aspects, and there has been little research on teaching and learning methods.

The Dynamics of Research Output by Indonesian Scientist, Period of 1945-2021

  • Prakoso Bhairawa, Putera;Ida, Widianingsih;Sinta, Ningrum;Suryanto, Suryanto;Yan, Rianto
    • Asian Journal of Innovation and Policy / v.11 no.3 / pp.397-420 / 2022
  • This research applied a bibliometric analysis to determine the dynamics of research topics in ten percent of the research output (international publications) generated by Indonesian scientists in the period 1945-2021. The study uses VOSviewer version 1.6.18 for analysis and visualization of bibliometric networks. The results indicate that 50.24% of Indonesian international publications are published in the form of articles, with Agricultural and Biological Sciences, Medicine, and Earth and Planetary Sciences as the dominant subject areas. Regarding authors, Tjia, MO from Bandung Institute of Technology was the top author in terms of the number of publications produced for two periods. The article entitled "Global, regional, and national prevalence of overweight and obesity in children and adults during 1980-2013: A systematic analysis for the Global Burden of Disease Study 2013" (Ng et al., 2014) was the most cited one.

Investigation of AI-based dual-model strategy for monitoring cyanobacterial blooms from Sentinel-3 in Korean inland waters

  • Hoang Hai Nguyen;Dalgeun Lee;Sunghwa Choi;Daeyun Shin
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.168-168 / 2023
  • The frequent occurrence of cyanobacterial harmful algal blooms (CHABs) in inland waters under climate change seriously damages ecosystems and human health and is becoming a major problem in South Korea. Satellite remote sensing has been suggested for effectively monitoring CHABs over larger water bodies, since the traditional method based on sparse in-situ networks is limited in spatial coverage. However, using satellite reflectances as a standalone variable in common CHAB dual-models, which rely on both chlorophyll-a (Chl-a) and phycocyanin or cyanobacteria cells (Cyano-cell), is not fully effective because their seasonal variation is strongly affected by surrounding meteorological and bio-environmental factors. With the development of Artificial Intelligence (AI), monitoring CHABs from space while analyzing the effects of environmental factors has become feasible. This study aimed to investigate the potential application of AI in the dual-model strategy (with Chl-a and Cyano-cell as output parameters) for monitoring the seasonal dynamics of CHABs from satellites over Korean inland waters. The Sentinel-3 satellite was selected for its variety of spectral bands and its unique band (620 nm), which is sensitive to cyanobacteria. Using AI-based feature selection, we analyzed the relationships between the two output parameters and the major parameters (satellite water-leaving reflectances at different spectral bands), together with auxiliary (meteorological and bio-environmental) parameters, to select the most important ones. Several AI models were then employed to model Chl-a and Cyano-cell concentrations from the selected parameters.
Performance evaluation of the AI models and their comparison to traditional semi-analytical models were conducted to demonstrate whether AI models (using water-leaving reflectances and environmental variables) outperform traditional models (using water-leaving reflectances only) and which AI models are superior for monitoring CHABs from Sentinel-3 satellite over a Korean inland water body.

Analyzing Research Trends of Domestic Artificial Intelligence Research Using Network Analysis and Dynamic Topic Modelling (네트워크 분석과 동적 토픽모델링을 활용한 국내 인공지능 분야 연구동향 분석)

  • Jung, Woojin;Oh, Chanhee;Zhu, Yongjun
    • Journal of the Korean Society for Library and Information Science / v.55 no.4 / pp.141-157 / 2021
  • In this study, we aimed to understand research trends in domestic artificial intelligence research. To this end, we applied network analysis and dynamic topic modeling to domestic research papers on artificial intelligence. Among the papers indexed in KCI (Korea Citation Index) by 2020, we collected the metadata and abstracts of 2,552 papers whose titles or indexed keywords include 'artificial intelligence' in both Korean and English. Keywords, affiliations, subject fields, and abstracts were extracted and preprocessed for further analysis. We identified the main keywords in the field by analyzing keyword co-occurrence networks, and we examined the degree and characteristics of research collaboration between domestic and foreign institutions and between industry and universities by analyzing institutional collaboration networks. Dynamic topic modeling was performed on 1,845 abstracts written in Korean, and 13 topics were obtained from the labeling process. This study broadens the understanding of domestic artificial intelligence research by identifying research trends through dynamic topic modeling of abstracts, as well as the degree and characteristics of research collaboration through institutional collaboration networks built from author affiliation information. In addition, the results can be used by governmental institutions to make policies suited to the artificial intelligence era.

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has not only affected many parts of daily life but has also had a huge impact on many areas, including the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are reported to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which are known to cause enormous disruption to the behavior and psychological well-being of many people. Long-term negative emotions can reduce people's immunity and upset their physical balance, so it is essential to understand people's psychological state during COVID-19. This study suggests a method of monitoring media news that reflects the current situation, in which the prolonged COVID-19 pandemic calls for psychological as well as physical quarantine. It also shows how a simpler method of analyzing social media networks applies to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among the various major media, news headlines are considered important in communication science as a summary of the core content that the media want to convey to their audiences. The news data used in this study were collected using "Bigkinds", a service built by integrating big data technology. With the collected news data, keywords were classified through text mining, and the relationships between words were visualized through semantic network analysis. Text mining was performed using the KrKwic program, a Korean semantic network analysis tool, and word frequencies were calculated to identify keywords.
The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized in a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared frequently in relation to COVID-19 emotions. In addition, UCINET, a specialized social network analysis program, was used for connection centrality and cluster analysis, and the graph was visualized using Net Draw. The analysis of connection centrality showed that the most central keywords in the keyword-centric network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The network of co-occurrence frequencies among the keywords appearing in news headlines was visualized as a graph. The thickness of a line in the graph is proportional to the co-occurrence frequency, so a pair of words that frequently appear together is connected by a thick line. The 'COVID-blue' pair is drawn with the thickest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs with relatively thick lines. 'Blue' in relation to COVID-19 means depression, which confirms that COVID-19 and depression are keywords that currently deserve attention. The research methodology used in this study makes it possible to measure social phenomena and changes quickly while reducing costs. By analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at a glance, the study can provide medical policy managers with a variety of perspectives when identifying and researching the phenomenon.
The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
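
The centrality and line-thickness ideas in this abstract can be illustrated with a small sketch. It assumes that "connection centrality" corresponds to normalized degree centrality (the number of distinct neighbors divided by n − 1) and that line width scales linearly with co-occurrence frequency. The headlines, function names, and `max_width` parameter are hypothetical; the study itself used UCINET and Net Draw, not this code.

```python
from collections import Counter, defaultdict
from itertools import combinations


def headline_network(headlines):
    """Count keyword pairs that appear together in the same headline."""
    edges = Counter()
    for words in headlines:
        for a, b in combinations(sorted(set(words)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Normalized degree: distinct neighbors divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    return {word: len(nbrs) / (n - 1) for word, nbrs in neighbors.items()}


def line_widths(edges, max_width=5.0):
    """Scale each edge's drawing width in proportion to its co-occurrence count."""
    if not edges:
        return {}
    top = max(edges.values())
    return {pair: max_width * count / top for pair, count in edges.items()}


# Hypothetical headline keyword lists.
headlines = [
    ["COVID", "blue", "anxiety"],
    ["COVID", "blue"],
    ["COVID", "emotion"],
]
edges = headline_network(headlines)
centrality = degree_centrality(edges)
widths = line_widths(edges)
```

In this toy data the 'COVID'-'blue' pair co-occurs most often and therefore receives the full `max_width`, while 'COVID' is linked to every other keyword and has the highest centrality.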

Supragingival Plaque Microbial Community Analysis of Children with Halitosis

  • Ren, Wen;Zhang, Qun;Liu, Xuenan;Zheng, Shuguo;Ma, Lili;Chen, Feng;Xu, Tao;Xu, Baohua
    • Journal of Microbiology and Biotechnology / v.26 no.12 / pp.2141-2147 / 2016
  • As one of the most complex human-associated microbial habitats, the oral cavity harbors hundreds of bacteria. Halitosis is a prevalent oral condition that is typically caused by bacteria. The aim of this study was to analyze the microbial communities and predict functional profiles in supragingival plaque from healthy individuals and those with halitosis. Ten preschool children were enrolled in this study; five with halitosis and five without. Supragingival plaque was isolated from each participant and 16S rRNA gene pyrosequencing was used to identify the microbes present. Samples were primarily composed of Actinobacteria, Bacteroidetes, Proteobacteria, Firmicutes, Fusobacteria, and Candidate phylum TM7. The ${\alpha}$ and ${\beta}$ diversity indices did not differ between healthy and halitosis subjects. Fifteen operational taxonomic units (OTUs) were identified with significantly different relative abundances between healthy and halitosis plaques, and included the phylotypes of Prevotella sp., Leptotrichia sp., Actinomyces sp., Porphyromonas sp., Selenomonas sp., Selenomonas noxia, and Capnocytophaga ochracea. We suggest that these OTUs are candidate halitosis-associated pathogens. Functional profiles were predicted using PICRUSt, and nine level-3 KEGG Orthology groups were significantly different. Hub modules of co-occurrence networks implied that microbes in halitosis dental plaque were more highly conserved than microbes of healthy individuals' plaque. Collectively, our data provide a background for the oral microbiota associated with halitosis from supragingival plaque, and help explain the etiology of halitosis.
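
The abstract does not say which α and β diversity indices were used. Purely as an illustration, the sketch below computes the Shannon index (a common α diversity measure) and the Bray-Curtis dissimilarity (a common β diversity measure) from OTU count vectors. The choice of indices and the counts are assumptions, not values from the study.

```python
from math import log


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over the non-zero OTU proportions."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same OTU list."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical OTU counts for two plaque samples (same OTU order in both).
sample_a = [10, 10, 10, 10]
sample_b = [25, 5, 5, 5]

alpha_a = shannon_index(sample_a)
alpha_b = shannon_index(sample_b)
beta_ab = bray_curtis(sample_a, sample_b)
```

The evenly distributed sample has the higher Shannon index, and the Bray-Curtis value lies between 0 (identical composition) and 1 (no shared OTUs).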