• Title/Summary/Keyword: Text mining analysis


A Study of Information Literacy Curriculum Using Topic Modeling (토픽모델링을 활용한 정보활용교육 연구주제 분석 및 교육내용 제안)

  • Jihye, Yun;Yoo Kyung, Jeong
    • Journal of the Korean Society for Information Management / v.39 no.4 / pp.1-21 / 2022
  • The aim of this study is to identify research topics and suggest an information literacy curriculum by analyzing research articles on information literacy. For this purpose, we applied the topic modeling technique to 97 scientific articles and identified the core contents of information literacy education, such as media literacy, information literacy instruction, and the use of information resources. Based on the analysis results, we suggested an information literacy curriculum by considering the Big 6 model, the information literacy standards of the American Association of School Librarians, and the Association of College and Research Libraries' information literacy competencies. This study is significant in that it considered 'use of information resources' and 'information ethics' in suggesting information literacy education.

Analysis of Waterpark Status and Recognition Using Big Data Analysis (빅데이터 분석을 활용한 워터파크 현황 및 인식 분석)

  • Kim, Jae-Hwan;Lee, Jae-Moon
    • Journal of Digital Convergence / v.15 no.10 / pp.525-535 / 2017
  • This study aims to examine consumer perception and the current status of water parks. Naver and Daum were used as data collection channels, and the keyword 'water park' was used for data retrieval. The analysis period covered two years, from January 1, 2015 to December 31, 2016. First, in the frequency analysis, the top 5 keywords were hidden cameras, Lotte water park, arrests, suspects, and Gimhae in 2015, and Lotte water park, swimming, summer, opening, and admission ticket in 2016. Second, in the degree centrality analysis, the top 5 were hidden camera, arrest, suspect, female, and shower room in 2015, and swimming, Lotte water park, summer, One Mount, and admission ticket in 2016. Third, in the N-gram network graph, the top 5 pairs were water park/hidden camera, hidden camera/hidden camera, suspect/arrest, Gimhae/Lotte water park, and water park/suspect in 2015, and One Mount/water park, Gimhae/Lotte water park, water park/admission ticket, water park/water park, and water park/opening in 2016. Fourth, the CONCOR analysis formed three groups in 2015 and two groups in 2016.
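The frequency and N-gram steps named in this abstract can be illustrated with a minimal sketch. It assumes documents have already been tokenized into keyword lists; the function names `top_terms` and `top_bigrams` and the toy documents are hypothetical and are not taken from the study, which does not describe its tooling at this level.

```python
from collections import Counter

def top_terms(docs, k=5):
    """Return the k most frequent tokens across all tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    return counts.most_common(k)

def top_bigrams(docs, k=5):
    """Return the k most frequent adjacent token pairs (2-grams)."""
    counts = Counter()
    for doc in docs:
        # zip(doc, doc[1:]) yields each token paired with its successor
        counts.update(zip(doc, doc[1:]))
    return counts.most_common(k)

# Invented toy documents, for illustration only (not the study's data)
docs = [
    ["water park", "hidden camera", "suspect", "arrest"],
    ["water park", "hidden camera", "shower room"],
    ["water park", "admission ticket", "opening"],
]

print(top_terms(docs, 3))
print(top_bigrams(docs, 3))
```

Ranking the pair counts is what produces "top 5" pair lists like the ones reported; a network graph would then use the pairs as weighted edges.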

A Study on the Perception of Pit and Fissure Sealant using Unstructured Big Data (비정형 빅데이터를 이용한 치면열구전색(치아홈메우기)에 대한 인식분석)

  • Han-A Cho
    • Journal of Korean Dental Hygiene Science / v.6 no.2 / pp.101-114 / 2023
  • Background: This study aimed to explore the overall perception of pit and fissure sealants and suggest methods to revitalize their current stagnation. Methods: To determine the social perception of the change in coverage policy for pit and fissure sealants, we divided the study span into five time periods: the first period (December 1, 2009 to November 30, 2010), the second period (December 1, 2010 to September 30, 2012), the third period (October 1, 2012 to May 5, 2013), the fourth period (May 6, 2013 to September 30, 2017), and the fifth period (October 1, 2017 to December 31, 2022). We utilized text mining, an unstructured big data analysis method. Keywords were collected and analyzed using Textom, and frequency analysis of the top 30 keywords, analysis of the structural features of the semantic network, centrality analysis, QAP correlation analysis, and co-occurrence analysis were conducted. Results: The frequency analysis showed that the top keywords for each time period were 'Cavities', 'Treatment', and 'Children'. In the structural features of the semantic network of pit and fissure sealants by time period, the density index was found to be around 1.00 for all time periods. The QAP correlation analysis showed the highest correlation between the first and second periods and between the fourth and fifth periods, with a correlation coefficient of 0.834. The co-occurrence analysis showed that 'cavities' and 'prevention' were the top two words across all time periods. Conclusion: This study showed that pit and fissure sealants are well accepted by society as a preventive treatment for caries. However, awareness of health education related to these sealants was found to be low. Efforts to revitalize the stagnant use of pit and fissure sealants need to be strengthened with effective education.

International Research Trend on Mountainous Sediment-related Disasters Induced by Earthquakes (지진 유발 산지토사재해 관련 국외 연구동향 분석)

  • Lee, Sang-In;Seo, Jung-Il;Kim, Jin-Hak;Ryu, Dong-Seop;Seo, Jun-Pyo;Kim, Dong-Yeob;Lee, Chang-Woo
    • Journal of Korean Society of Forest Science / v.106 no.4 / pp.431-440 / 2017
  • The 2016 Gyeongju Earthquake ($M_L$ 5.8), which occurred on September 12, 2016, and the 2017 Pohang Earthquake ($M_L$ 5.4), which occurred on November 15, 2017, caused unprecedented damage in South Korea. It is necessary to establish basic data on earthquake-induced mountainous sediment-related disasters worldwide. In this study, we analyzed previous international studies on earthquake-induced mountainous sediment-related disasters, classified research areas according to research themes using text mining and co-word analysis in the VOSviewer program, and finally examined spatio-temporal research trends by research area. The results showed that related research has increased rapidly since 2005, which seems to have been affected by recent large-scale earthquakes in China, Taiwan, and Japan. In addition, the research on mountainous sediment-related disasters induced by earthquakes was classified into four subjects: (i) mechanisms of disaster occurrence; (ii) rainfall parameters controlling disaster occurrence; (iii) prediction of potential disaster areas using aerial and satellite photographs; and (iv) disaster risk mapping through the modeling of disaster occurrence. These research areas are considered to be strongly correlated with each other. After the threshold year (i.e., 2012-2013), when the cumulative number of research papers reached 50% of the total published since 1987, the proportions per unit year of all research areas increased. In particular, the proportion of the research area related to prediction of potential disaster areas using aerial and satellite photographs increased markedly compared with the other three research areas. These trends are associated with the rapidly increasing number of research papers with study sites in China, and research papers with study sites in Taiwan, Japan, and the United States have also contributed to increases in all research areas. The results could be used as basic data to present future research directions related to mountainous sediment-related disasters induced by earthquakes in South Korea.

A Study on Sentiment Score of Healthcare Service Quality on the Hospital Rating (의료 서비스 리뷰의 감성 수준이 병원 평가에 미치는 영향 분석)

  • Jee-Eun Choi;Sodam Kim;Hee-Woong Kim
    • Information Systems Review / v.20 no.2 / pp.111-137 / 2018
  • Considering the increase in health insurance benefits and the elderly population of the baby boomer generation, the amount consumed by health care in 2020 is expected to account for 20% of US GDP. As the healthcare industry develops, competition among the medical services of hospitals intensifies, and the need for hospitals to manage the quality of medical services increases. In addition, interest in online reviews of hospitals has increased as online reviews have become a tool to predict hospital quality. Consumers tend to refer to online reviews even when choosing healthcare service providers and after evaluating service quality online. This study aims to analyze the effect of the sentiment score of healthcare service quality on hospital rating using Yelp hospital reviews. This study first classifies a large amount of text data collected online into the five service quality measurement indexes of SERVQUAL theory. The sentiment scores of reviews are then derived by SERVQUAL dimension, and an econometric analysis is conducted to determine the effects of the sentiment scores of the five service quality dimensions on hospital reviews. The results shed light on the means of managing online hospital reputation to benefit managers in the healthcare and medical industry.

Complexity Metrics for Analysis Classes in the Unified Software Development Process (Unified Process의 분석 클래스에 대한 복잡도 척도)

  • 김유경;박재년
    • The KIPS Transactions:PartD / v.8D no.1 / pp.71-80 / 2001
  • Object-oriented (OO) methodology, which uses concepts such as encapsulation, inheritance, polymorphism, and message passing, demands metrics different from those of structured methodology. There are many studies of OO software metrics, such as program complexity or design metrics. However, metrics for analysis classes are needed to decrease complexity in the analysis phase and thereby greatly reduce the effort and cost of system development. In this paper, we propose new metrics to measure the complexity of analysis classes drawn out in the analysis phase of the Unified Process. The collaboration complexity, denoted by CC, is the maximum number of collaborations that can be achieved with each collaborator, and it determines the potential complexity. The interface complexity, denoted by IC, indicates the difficulty of understanding the interfaces among collaborators. We prove mathematically that the suggested metrics satisfy OO characteristics such as class size and inheritance, and we verify them theoretically against Weyuker's nine properties. Moreover, we show the computation results for the analysis classes of a system that automatically responds to its users' questions using a text mining technique. Compared with CBO and WMC, CC and IC represent the complexity better. We expect that reviewing the complexity of analysis classes in the first stage of the SDLC (Software Development Life Cycle) will help develop cost-effective OO software.


Study on U-City Service Issue and Trends based Text Mining - Using the Network Analysis and Information Measure Method - (텍스트 마이닝에 기반한 U-City 서비스 이슈 및 동향분석 - 네트워크분석 및 정보량계측기법을 활용하여 -)

  • Jeong, Dawoon;Yoo, Jisong;Yi, Mi-Sook;Shin, Dong Bin
    • Spatial Information Research / v.23 no.3 / pp.35-44 / 2015
  • Recently, the government has aimed to discover and provide services to citizens as part of its development strategy for activating the U-City. This study therefore aims to offer a direction for service discovery by analyzing service issues and trends. The target data are newspaper articles about U-City services from 2009 to 2014, of which 723 were prepared for analysis. Keyword frequency analysis was performed first, and its results were used for network analysis and information measurement. The network analysis reports results through "Degree Centrality", "Betweenness Centrality", and "Closeness Centrality". As a result, "Information", "IT", "Environment", "Technology", and "Center" ranked higher than the other keywords. These five keywords have been important factors driving the U-City over the past six years. The information measurement results showed that the U-City had so far emphasized building infrastructure, and they identified a trend of provision centered on public services. The service fields were "Tour (2009)", "Crime Prevention and Disaster Prevention (2010)", "Facility Management (2011)", "Administration (2012)", and "Facility Management (2013, 2014)". The results of this study offer implications for citizen participation. Accordingly, service fields that build on the existing infrastructure should be discovered and provided. Finally, this study is expected to serve as a reference for local government U-City planning.
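As a rough illustration of the degree centrality measure named in this abstract, the sketch below builds a keyword co-occurrence network and normalizes each keyword's degree by the number of other nodes. The function name `degree_centrality`, the rule that keywords co-occurring in one article are linked, and the toy articles are all assumptions made for illustration; the study's actual network construction is not described here.

```python
from collections import defaultdict
from itertools import combinations

def degree_centrality(docs):
    """Degree centrality on a keyword co-occurrence network.

    Two keywords are linked if they appear in the same document
    (an assumed linking rule). Centrality is degree / (n - 1),
    where n is the number of keywords in the network.
    """
    neighbors = defaultdict(set)
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Invented toy articles, for illustration only (not the study's data)
articles = [
    ["Information", "IT", "Center"],
    ["Information", "Technology"],
    ["IT", "Environment"],
]

centrality = degree_centrality(articles)
for keyword, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {score:.2f}")
```

Betweenness and closeness centrality require shortest-path computations and are usually taken from a graph library rather than written by hand.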

The Identification Framework for source code author using Authorship Analysis and CNN (작성자 분석과 CNN을 적용한 소스 코드 작성자 식별 프레임워크)

  • Shin, Gun-Yoon;Kim, Dong-Wook;Hong, Sung-sam;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.19 no.5 / pp.33-41 / 2018
  • As Internet technology has developed, various programs are being created, and various code is therefore being written by many authors. In this context, some authors pass off programs or code written by others as their own, use other writers' code indiscriminately, or fail to indicate exactly which code they have used. This makes it increasingly difficult to protect code. In this paper, we propose an author identification framework using Authorship Analysis theory and Natural Language Processing (NLP) based on a Convolutional Neural Network (CNN). We apply Authorship Analysis theory to extract features for author identification from source code and combine them with features used in text mining to perform author identification using machine learning. In addition, we apply a CNN-based natural language processing method to source code for code author classification. To identify an author, features specific to that author are needed, and the CNN-based NLP method can be applied to languages with a special structure, such as source code, to identify the author. The identification accuracy based on Authorship Analysis theory is 95.1%, and the identification accuracy with the CNN is 98%.

Trend Forecasting and Analysis of Quantum Computer Technology (양자 컴퓨터 기술 트렌드 예측과 분석)

  • Cha, Eunju;Chang, Byeong-Yun
    • Journal of the Korea Society for Simulation / v.31 no.3 / pp.35-44 / 2022
  • In this study, we analyze and forecast quantum computer technology trends. Previous analyses of quantum computer technology trends have focused mainly on technology-centered application fields. Therefore, for a more market-driven technical analysis and prediction, this paper analyzes important quantum computer technologies and performs future signal detection and prediction. To this end, we analyze words used in news articles to identify rapid market changes and public interest. This paper extends the conference presentation of Cha & Chang (2022). The research is conducted by collecting domestic news articles from 2019 to 2021. First, we organize the main keywords through text mining. Next, we explore future quantum computer technologies through analysis of Term Frequency - Inverse Document Frequency (TF-IDF), the Key Issue Map (KIM), and the Key Emergence Map (KEM). Finally, the relationship between future technologies and supply and demand is identified through random forests, decision trees, and correlation analysis. The results show that interest in artificial intelligence was the highest in the frequency, keyword diffusion, and visibility analyses. The rate of mention of cyber-security in news articles is rising far faster than that of other technologies. Quantum communication, resistant cryptography, and augmented reality also showed a high rate of increase in interest. These results show that expectations are high for applying trend technology in the market. The results of this study can be applied to identifying areas of interest in the quantum computer market and establishing a response system related to technology investment.
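The TF-IDF weighting mentioned in this abstract can be sketched as follows. This is one common variant (relative term frequency multiplied by the natural log of inverse document frequency); the abstract does not state which weighting the authors used, and the function name `tf_idf` and the toy articles are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF per document for tokenized documents.

    tf(t, d) = count of t in d / length of d
    idf(t)   = ln(N / number of documents containing t)
    This is an assumed variant; libraries often add smoothing.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears
    df = Counter(tok for doc in docs for tok in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores

# Invented toy articles, for illustration only (not the study's data)
articles = [
    ["quantum", "computer", "security"],
    ["quantum", "communication"],
    ["quantum", "computer"],
]

for i, weights in enumerate(tf_idf(articles)):
    print(i, weights)
```

A term that appears in every document, such as "quantum" in this toy set, receives a weight of zero, which is why TF-IDF tends to surface keywords that distinguish some articles from the rest.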

Information types and characteristics within the Wireless Emergency Alert in COVID-19: Focusing on Wireless Emergency Alerts in Seoul (코로나 19 하에서 재난문자 내의 정보유형 및 특성: 서울특별시 재난문자를 중심으로)

  • Yoon, Sungwook;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.45-68 / 2022
  • The central and local governments of the Republic of Korea provided information necessary for disaster response through wireless emergency alerts (WEAs) in order to overcome the pandemic situation in which COVID-19 spread rapidly. Among all channels for delivering disaster information, the wireless emergency alert is the most efficient, and because it adopts the CBS (Cell Broadcast Service) method, which broadcasts directly to mobile phones, it has the advantage that recipients can easily access disaster information on their phones without the effort of searching. In this study, the characteristics of wireless emergency alerts sent in Seoul over thirteen months (January 2020 to January 2021) were derived through various text mining methodologies, and the various types of information contained in the alerts were analyzed. In addition, population mobility by age in the districts of Seoul was used to examine how the alerts influenced people's movement behavior. After classifying the key words and information included in each message, text analysis was performed using a document cluster analysis technique based on the included words, so that each individual message sent could serve as a unit of analysis. The number of WEAs sent in Seoul has grown dramatically since the spread of COVID-19. In January 2020, only 10 WEAs were sent in Seoul, but the number of WEAs increased 5 times in March, and 7.7 times over the previous months. Since basic and regional local governments were authorized to send wireless emergency alerts independently, sending behavior differs for each local government. Although most basic local governments increased the transmission of WEAs as the number of confirmed COVID-19 cases increased, the rate at which WEAs increased with the number of confirmed cases differed by region. Using a structured econometric model, the effect of the disaster information included in wireless emergency alerts on population mobility was measured by dividing it into a baseline effect and an accumulating effect. Six types of disaster information (date, order, online URL, symptom, location, and normative guidance) were identified in WEAs and analyzed through econometric modelling. It was confirmed that the types of information that significantly change population mobility differ by age. The population mobility of people in their 60s and 70s decreased when wireless emergency alerts included information related to date and order. Because date and order information appears in WEAs that give information about confirmed COVID-19 cases, these results show that the population mobility of older people decreased as they reacted to messages reporting confirmed cases. Online information (URL) decreased the population mobility of people in their 20s, and information related to symptoms reduced the population mobility of people in their 30s. On the other hand, it was confirmed that normative words encouraging compliance with quarantine policies did not cause significant changes in population mobility at any age. This means that only meaningful information that is useful for disaster response should be included in wireless emergency alerts. Repeated sending of wireless emergency alerts reduces the magnitude of the impact of disaster information on population mobility. This indirectly shows that, under the prolonged pandemic, people grew tired of receiving repetitive WEAs with similar content and began to react less. In order to use WEAs effectively for quarantine and for overcoming disaster situations, it is necessary to reduce the fatigue of the people who receive WEAs by sending them only in necessary situations, and to raise awareness of WEAs.