• Title/Summary/Keyword: keyword graph

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has affected not only many parts of daily life but also broad areas such as the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are reported to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies an epidemic raises fear and anxiety, which are known to severely disrupt the behavior and psychological well-being of many people. Long-term negative emotions can reduce people's immunity and upset their physical balance, so it is essential to understand people's psychological state during COVID-19. This study proposes a method of monitoring media news that reflects the current situation, in which the prolonged COVID-19 crisis calls for psychological as well as physical quarantine. It also presents how an easier method of analyzing social media networks applies to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making. News plays a major role in setting the policy agenda. Within major media, news headlines are considered important in communication science because they summarize the core content that the media wants to convey to its audience. The news data used in this study were collected with "Bigkinds", a service built by integrating big data technology. Keywords were classified from the collected news data through text mining, and the relationships between words were visualized through semantic network analysis of the keywords. Text mining was performed with KrKwic, a Korean semantic network analysis tool, and word frequencies were calculated to identify keywords. The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized as a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared frequently in relation to COVID-19 emotions. In addition, UCINET, a specialized social network analysis program, was used for degree (connection) centrality and cluster analysis, and the resulting graph was visualized with NetDraw. The centrality analysis showed that the most central keywords in the keyword network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The co-occurrence frequency network among the keywords appearing in the news headlines was visualized as a graph. The thickness of each line in the graph is proportional to the co-occurrence frequency, so two words that frequently appear together are joined by a thick line. The 'COVID-blue' pair is drawn with the thickest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs are drawn with relatively thick lines. 'Blue' in relation to COVID-19 means depression, which confirms that COVID-19 and depression are keywords that deserve attention now. The research methodology used in this study makes it possible to measure social phenomena and changes quickly while reducing costs. By analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at once, the study can give medical policy managers a variety of perspectives when identifying and researching the phenomenon. The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
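
The headline co-occurrence network and the degree (connection) centrality described in this abstract can be illustrated with a small sketch. The study itself used KrKwic, UCINET, and NetDraw; the plain-Python code below is only an assumed stand-in, and the tokenized headlines are hypothetical, not data from the paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized headlines (not data from the study).
headlines = [
    ["covid", "blue", "anxiety"],
    ["covid", "blue", "psychology"],
    ["covid", "anxiety"],
]

def cooccurrence(docs):
    """Count how often each unordered keyword pair shares a headline."""
    pairs = Counter()
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += 1
    return pairs

def degree_centrality(pairs):
    """Normalized degree: distinct neighbors divided by (n - 1)."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {word: len(ns) / (n - 1) for word, ns in neighbors.items()}

edges = cooccurrence(headlines)        # edge weight = co-occurrence count
centrality = degree_centrality(edges)  # higher value = more central keyword
```

In a drawing of this graph, the count stored in `edges` would set the line thickness, so the pair with the largest count would be drawn boldest.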

Design and Implemantation of Information Retrieval System based on Semantic Information (의미정보기반 검색시스템의 설계 및 구현)

  • Park, Chang-Keun;Yang, Gi-Chul
    • Proceedings of the Korea Contents Association Conference / 2004.11a / pp.265-268 / 2004
  • The keyword matching technique used in most information retrieval systems is ill-suited to efficiently processing geometrically increasing volumes of information. This problem can be addressed by using semantic information, and this paper introduces an efficient method of semantic processing. The technique uses conceptual graphs to represent semantic information and applies them to information retrieval. The implemented system can perform exact matching and partial matching. Partial matching has two types: syntactic partial matching and semantic partial matching. Semantic similarities are measured by the subclass relations in the ontology. The technique can be used not only in information retrieval but also in various applications, such as the implementation of dynamic hyperlinks.
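
One way to picture similarity measured by subclass relations is to count the subclass edges separating two concepts in the hierarchy. The abstract does not give the paper's formula, so the inverse-path-length measure below is an assumption, and the toy ontology and the names `PARENT`, `ancestors`, and `similarity` are hypothetical.

```python
# Hypothetical toy ontology: child -> parent (subclass-of relation).
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

def ancestors(concept):
    """Return [concept, parent, grandparent, ...] up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def similarity(a, b):
    """Assumed measure: 1 / (1 + subclass edges between a and b
    through their nearest common ancestor); 0.0 if none exists."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return 1 / (1 + steps_a + up_b.index(node))
    return 0.0
```

Under this measure an exact match scores 1.0, siblings such as "dog" and "cat" score lower, and concepts that meet only at the root score lowest, which is the kind of graded result a semantic partial match needs.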

Development of Ontology Viewer System for the Oriental Medicine (한의학 약재 온톨로지 뷰어 시스템 개발)

  • Ryu, Dong-Ho;Cha, Seung-Jun;Yu, Jeong-Youn;Song, Mi-Young;Lee, Kyu-Chul
    • Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.154-158 / 2008
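  • As the volume of information to be processed keeps growing, many fields are building ontologies in an attempt to obtain more accurate results based on them. The field of Oriental medicine is likewise attempting to manage medicinal-herb information using ontologies. In a medicinal-herb ontology it is important to understand the relationships among herbs, but existing search is keyword-based, which makes those relationships hard to identify. Ontology viewers for examining the structural content of an ontology do exist, but they make it hard to navigate the hierarchy-centered structure of a herb ontology, and because the various properties are spread evenly over the graph regardless of property type, it is hard to distinguish herbs by property. We therefore modified and extended an existing viewer to develop one that shows the hierarchical structure of the Oriental-medicine herb ontology and the classification of herbs by property. By systematizing and managing resources across the whole field of Oriental medicine step by step, this viewer system could serve as a foundation for a user-centered, integrated, and modernized traditional-medicine information service.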

Text-mining Based Graph Model for Keyword Extraction from Patent Documents (특허 문서로부터 키워드 추출을 위한 위한 텍스트 마이닝 기반 그래프 모델)

  • Lee, Soon Geun;Leem, Young Moon;Um, Wan Sup
    • Journal of the Korea Safety Management & Science / v.17 no.4 / pp.335-342 / 2015
  • Growing interest in patents has led many individuals and companies to apply for patents in various areas. Patent applications are stored as electronic documents, and searching and categorizing these documents are major issues in data mining. Keyword extraction, which retrieves the representative keywords of a document, is especially important. Most techniques for it are based on the vector space model. This model relies simply on the frequency of terms in documents: it weights terms by their frequency and selects keywords in order of weight. It therefore has the limitation that it cannot reflect the relations between keywords. This paper proposes an improved way to extract more representative keywords by overcoming this limitation. The proposed model first prepares a candidate set using the vector model, then builds a graph that represents the relations between pairs of candidate keywords in the set, and finally selects the keywords based on this relationship graph.
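
The two-stage idea in this abstract (frequency-based candidates first, then selection on a relationship graph) might be sketched as follows. The abstract does not say how edges are built or scored, so sentence-level co-occurrence and a weighted-degree ranking are assumptions here, and the function name, parameters, and sample sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def extract_keywords(sentences, n_candidates=4, n_keywords=2):
    # Stage 1: candidate set by raw term frequency, a simple
    # vector-space-style weight.
    tf = Counter(term for sent in sentences for term in sent)
    candidates = {term for term, _ in tf.most_common(n_candidates)}

    # Stage 2: relationship graph over candidates. Assumed edge rule:
    # two candidates are linked each time they share a sentence.
    strength = Counter()
    for sent in sentences:
        for a, b in combinations(sorted(set(sent) & candidates), 2):
            strength[a] += 1
            strength[b] += 1

    # Rank by graph strength, then frequency, then alphabetically.
    ranked = sorted(candidates, key=lambda t: (-strength[t], -tf[t], t))
    return ranked[:n_keywords]

# Hypothetical tokenized sentences from a patent-like text.
sentences = [
    ["battery", "electrode", "lithium"],
    ["battery", "electrode", "housing"],
    ["battery", "lithium"],
    ["housing", "screw"],
    ["housing"],
]
keywords = extract_keywords(sentences)
```

In this toy input "housing" is as frequent as "battery" but rarely appears alongside other candidates, so a purely frequency-based ranking would select it while the graph-based ranking prefers "electrode".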

Korean Collective Intelligence in Sharing Economy Using R Programming: A Text Mining and Time Series Analysis Approach (R프로그래밍을 활용한 공유경제의 한국인 집단지성: 텍스트 마이닝 및 시계열 분석)

  • Kim, Jae Won;Yun, You Dong;Jung, Yu Jin;Kim, Ki Youn
    • Journal of Internet Computing and Services / v.17 no.5 / pp.151-160 / 2016
  • The purpose of this research is to investigate current Korean popular attitudes toward and social perceptions of the term 'sharing economy' from a creative or socio-economic point of view. By applying text mining within a big data analysis approach, this study discovers and interprets the objective, tangible annual changes and patterns of sociocultural collective intelligence in Korea over the last five years. Through web crawling and Google searches, this study collected a significant amount of time-series web metadata on the theme of the sharing economy from the World Wide Web for 2010 to 2014. The large volume of raw data on the sharing economy was then processed into meaningful, value-added word cloud graphs and figures using the word cloud functions of R. Although accumulated data and collective intelligence about the sharing economy are still lacking, it is worth noting that this study carried out preliminary research on time-series big data analysis from the perspective of knowledge management and processing. The results of this study can therefore be used as fundamental data to help understand the academic and industrial aspects of future sharing economy-related markets and consumer behavior.

Analysis of Waterpark Status and Recognition Using Big Data Analysis (빅데이터 분석을 활용한 워터파크 현황 및 인식 분석)

  • Kim, Jae-Hwan;Lee, Jae-Moon
    • Journal of Digital Convergence / v.15 no.10 / pp.525-535 / 2017
  • This study aims to examine consumer perception and the current status of water parks. Naver and Daum were used as data collection channels, and the keyword 'water park' was used for data retrieval. The analysis period covered two years, from January 1, 2015 to December 31, 2016. First, in the frequency analysis, the top 5 terms were hidden camera, Lotte water park, arrest, suspect, and Gimhae in 2015, and Lotte water park, swimming, summer, opening, and admission ticket in 2016. Second, in the degree centrality analysis, the top 5 were hidden camera, arrest, suspect, female, and shower room in 2015, and swimming, Lotte water park, summer, One Mount, and admission ticket in 2016. Third, in the N-gram network graph, the top 5 pairs were water park/hidden camera, hidden camera/hidden camera, suspect/arrest, Gimhae/Lotte water park, and water park/suspect in 2015, and One Mount/water park, Gimhae/Lotte water park, water park/admission ticket, water park/water park, and water park/opening in 2016. Fourth, the CONCOR analysis formed three groups in 2015 and two groups in 2016.
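
The N-gram pairs reported in this abstract are counts of adjacent terms. A minimal sketch of bigram counting, assuming already-tokenized documents (the tokens below are hypothetical and not the study's data), could look like this:

```python
from collections import Counter

def bigram_counts(docs):
    """Count adjacent token pairs (bigrams) across all documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Hypothetical tokenized posts.
docs = [
    ["water park", "hidden camera", "suspect", "arrest"],
    ["water park", "hidden camera"],
    ["suspect", "arrest"],
]
counts = bigram_counts(docs)
```

Each counted pair can then become a directed edge in a network graph, with the count as its weight.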

Presidential Candidate's Speech based on Network Analysis : Mainly on the Visibility of the Words and the Connectivity between the Words (18대 대통령 선거 후보자의 연설문 네트워크 분석: 단어의 가시성(visibility)과 단어 간 연결성(connectivity)을 중심으로)

  • Hong, Ju-Hyun;Yun, Hae-Jin
    • The Journal of the Korea Contents Association / v.14 no.9 / pp.24-44 / 2014
  • This study explores, from a communication perspective, the political meaning of the speeches and statements of the candidates who ran in the 18th presidential election. The visibility of words and the connectivity between words are analyzed in terms of structure and of vision and policy. The visibility of words is analyzed based on the frequency with which words are mentioned in the speeches or statements. The connectivity between words is analyzed through network analysis and expressed as a graph. For candidate Park, the key words are the happiness of the people and appointment. The key words for candidate Moon are regime change and the Korean Peninsula, and the key words for candidate Ahn are the people and change. Methodologically, this study contributes to the study of candidates' discourse by using network analysis to explore the connectivity of words scientifically. Theoretically, it uses the results of network analysis to reveal the leadership components in the speeches and statements. In conclusion, this study highlights the extension of communication studies.

GO Guide : Browser & Query Translation for Biological Ontology (GO Guide : 생물학 온톨로지를 위한 브라우저 및 질의 변환)

  • Jung Jun-Won;Park Hyoung-Woo;Im Dong-Hhyuk;Lee Kang-Pyo;Kim Hyoung-Joo
    • Journal of KIISE:Computing Practices and Letters / v.12 no.3 / pp.183-191 / 2006
  • As genetic research becomes more active, the field of biology needs structured data about genes. The Gene Ontology Consortium has therefore built genetic information in OWL, the ontology description language published by the W3C. However, previous browsers for Gene Ontology support only simple search mechanisms based on keywords, trees, and graphs, and cannot retrieve high-quality information that takes various relationships into account. In this paper, we suggest a browsing technique that integrates various search methods to support researchers who actually conduct experiments in biology. In addition, instead of requiring the user to type a query, we propose a query generation technique that constructs the query while browsing and a query translation technique that translates the generated query into a SeRQL query. This is convenient for users and enables them to obtain high-quality information. The GO Guide browser shows that the information in Gene Ontology can be used efficiently.

Analysis of Educational Issues through Topic Modeling of National Petitions Text (국민청원글의 토픽 모델링을 통한 교육이슈 분석)

  • Shim, Jaekwoun
    • Journal of The Korean Association of Information Education / v.25 no.4 / pp.633-640 / 2021
  • Education-related issues are social problems in which various groups and situations are intricately linked. It is difficult to find issues by analyzing social phenomena related to education. Korean-language text can be analyzed quantitatively. With the development of text analysis techniques, research results have recently been achieved, and these techniques can be fully used to derive educational issues from Korean text data. In this study, petitions in the childcare/education category were collected from the online board of the Blue House National Petition website, and text analysis was used to derive issues in the education field. The analysis derived 6 topics using Latent Dirichlet Allocation (LDA), one of the topic modeling techniques. The association rules of major keywords were analyzed and visualized as graphs. Complementing the existing questionnaire-based derivation of educational issues, the study offers implications for future research directions and policies in that issues can be adequately discovered through text-based analysis methods.
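
The association rules between keywords mentioned in this abstract are usually scored by support and confidence. The sketch below uses those standard definitions; the abstract does not state which measures or thresholds the study applied, and the petition keyword sets are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from co-occurrence.
    Assumes the antecedent occurs in at least one transaction."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical keyword sets, one per petition.
petitions = [
    ["school", "teacher"],
    ["school", "childcare"],
    ["school", "teacher", "exam"],
    ["childcare"],
]
rule_support = support(petitions, ["school", "teacher"])
rule_confidence = confidence(petitions, ["school"], ["teacher"])
```

A rule such as school -> teacher can then be drawn as a directed edge whose label or thickness reflects its confidence.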

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People now create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of amount of data (volume), data input and output speed (velocity), and variety of data types (variety). If someone intends to discover the trend of an issue in SNS Big Data, this information can serve as an important new source of value because it covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) it provides the topic keyword set that corresponds to the daily ranking; (2) it visualizes the daily time-series graph of a topic over a month; (3) it shows the importance of a topic through a treemap based on a scoring system and frequency; (4) it visualizes the daily time-series graph of a keyword when that keyword is searched. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. In addition, such analysis requires the latest big data technology to process large amounts of real-time data rapidly, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single-node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no fixed schema or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-in libraries, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time-series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.