• Title/Summary/Keyword: frequency-based text analysis (빈도 기반 텍스트 분석)

Knowledge Trend Analysis of Uncertainty in Biomedical Scientific Literature (생의학 학술 문헌의 불확실성 기반 지식 동향 분석에 관한 연구)

  • Heo, Go Eun;Song, Min
    • Journal of the Korean Society for Information Management / v.36 no.2 / pp.175-199 / 2019
  • Uncertainty refers to an incomplete state of knowledge about a proposition, arising from a lack of consensus between information and existing knowledge. As the volume of academic literature grows exponentially, new knowledge is discovered as research develops. Although the passage of time may be an important factor in identifying patterns of uncertainty in scientific knowledge, existing studies have characterized uncertainty only by its frequency within a particular discipline and have not taken time into account. In this study, we therefore identify and analyze the words that signal uncertainty in the scientific literature and investigate the flow of knowledge. We examine patterns in biomedical knowledge, such as representative entity pairs, predicate types, and entities, over time. We also test significance using linear regression analysis. Seven of the 17 entity pairs show a statistically significant decreasing pattern, and all 10 representative predicates decrease significantly over time. We analyze the relative importance of representative entities by year and identify entities that display significant rising or falling patterns.
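
The trend test mentioned in this abstract can be illustrated with a minimal sketch. It fits an ordinary least-squares line to yearly relative frequencies and returns the slope with its t-statistic. The yearly values and the function name `trend_t_statistic` are made up for illustration; the paper's actual data and test settings are not given in the abstract.

```python
import math

def trend_t_statistic(years, values):
    """Fit values = a + b * year by least squares.

    Returns (slope, t-statistic of the slope).
    """
    n = len(years)
    if n < 3:
        raise ValueError("need at least three points")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return slope, t_stat

# Hypothetical yearly relative frequencies of one uncertainty cue.
years = [2010, 2011, 2012, 2013, 2014, 2015]
freq = [0.30, 0.27, 0.26, 0.22, 0.21, 0.18]
slope, t_stat = trend_t_statistic(years, freq)
print(f"slope={slope:.4f}, t={t_stat:.2f}")
```

A negative slope indicates a decreasing pattern. Whether it is significant is decided by comparing the absolute t-statistic with the critical value of the t distribution with n - 2 degrees of freedom.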

A Study on the Factors of Well-aging through Big Data Analysis : Focusing on Newspaper Articles (빅데이터 분석을 활용한 웰에이징 요인에 관한 연구 : 신문기사를 중심으로)

  • Lee, Chong Hyung;Kang, Kyung Hee;Kim, Yong Ha;Lim, Hyo Nam;Ku, Jin Hee;Kim, Kwang Hwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.5 / pp.354-360 / 2021
  • People hope to live healthy, happy, and satisfying lives by striking a good work-life balance. There is therefore growing interest in well-aging, which means living happily into a healthy old age without worry. This study identified important factors related to well-aging by analyzing news articles published in Korea. Using Python-based web crawling, 1,199 articles published up to November 2020 were collected from the news service of the portal site Daum, and the 374 articles that matched the subject of the study were selected. Frequency analysis through text mining identified 'elderly', 'health', 'skin', 'well-aging', 'product', 'person', 'aging', 'female', 'domestic', and 'retirement' as important keywords. In addition, a social network analysis of 45 important keywords revealed strong connections, in descending order, for 'skin-wrinkle', 'skin-aging', and 'old-health'. The CONCOR analysis grouped the 45 main keywords into eight clusters, including 'life and happiness', 'disease and death', 'nutrition and exercise', 'healing', 'health', and 'elderly services'.
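
As a rough illustration of the two counting steps in this abstract, the sketch below computes keyword frequencies and within-document keyword co-occurrence counts, which are the raw inputs to a keyword network. The tokenized documents are invented, and the function names are hypothetical; the study's crawler, tokenizer, and network software are not described here.

```python
from collections import Counter
from itertools import combinations

def keyword_frequencies(docs):
    """Count how often each keyword appears across tokenized documents."""
    return Counter(token for doc in docs for token in doc)

def cooccurrence_counts(docs):
    """Count keyword pairs that appear together in the same document."""
    pairs = Counter()
    for doc in docs:
        # Sort so each unordered pair has a single canonical key.
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs

# Made-up tokenized articles, for illustration only.
docs = [
    ["skin", "wrinkle", "aging"],
    ["skin", "aging", "health"],
    ["elderly", "health", "retirement"],
    ["skin", "wrinkle", "product"],
]
freq = keyword_frequencies(docs)
pairs = cooccurrence_counts(docs)
print(freq.most_common(3))
print(pairs.most_common(3))
```

The pair counts can serve as edge weights in a keyword network, where heavier edges correspond to stronger connections.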

Developing an Intelligent System for the Analysis of Signs Of Disaster (인적재난사고사례기반의 새로운 재난전조정보 등급판정 연구)

  • Lee, Young Jai
    • Journal of Korean Society of societal Security / v.4 no.2 / pp.29-40 / 2011
  • The objective of this paper is to develop an intelligent decision support system that advises on disaster countermeasures and the severity of incidents based on collected and analyzed signs of disasters. Concepts from ontology, text mining, and case-based reasoning are adapted to design the system. Its functions include a term-document matrix, frequency normalization, confidence, association rules, and criteria for judgment. Qualitative data collected from the signs of new incidents are processed by these functions and then compared with, and reasoned against, similar past disaster cases. The system reports how dangerous the new signs of disaster are and gives the disaster manager a few countermeasures. It will help decision-makers judge how dangerous the signs of disaster are and carry out specific countermeasures in advance, so that the disaster can be prevented.
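
Three of the functions named in this abstract (term-document matrix, frequency normalization, and rule confidence) can be sketched as follows. This is a minimal sketch under assumed definitions: rows are normalized by document length, and confidence is the share of documents containing the antecedent that also contain the consequent. The sample terms and all function names are hypothetical, not taken from the paper.

```python
def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term j in document i."""
    vocab = sorted({t for doc in docs for t in doc})
    index = {t: j for j, t in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, doc in enumerate(docs):
        for t in doc:
            matrix[i][index[t]] += 1
    return vocab, matrix

def normalize_rows(matrix):
    """Divide each count by its document length so every non-empty row sums to 1."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([c / total if total else 0.0 for c in row])
    return out

def confidence(docs, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent over documents."""
    with_a = [d for d in docs if antecedent in d]
    if not with_a:
        return 0.0
    return sum(1 for d in with_a if consequent in d) / len(with_a)

# Invented sign-of-disaster reports, already tokenized.
docs = [
    ["crack", "leak", "crack"],
    ["leak", "smoke"],
    ["crack", "smoke"],
]
vocab, matrix = term_document_matrix(docs)
normalized = normalize_rows(matrix)
print(vocab, matrix[0], normalized[0])
print(confidence(docs, "crack", "leak"))
```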

Study on Application of Big Data in Packaging (패키징(Packaging) 분야에서의 빅데이터(Big data) 적용방안 연구)

  • Kang, WookGeon;Ko, Euisuk;Shim, Woncheol;Lee, Hakrae;Kim, Jaineung
    • KOREAN JOURNAL OF PACKAGING SCIENCE & TECHNOLOGY / v.23 no.3 / pp.201-209 / 2017
  • Big data, an element of the Fourth Industrial Revolution, has drawn attention since the Fourth Industrial Revolution was discussed at the 2016 World Economic Forum. Big data is used in various fields because it can predict the near future and create new business. However, its use and study in the field of packaging are lacking. Today, packaging is expected to carry marketing elements that affect consumer choice, and big data is actively used in marketing, where it can be applied to analyze sales information and consumer reactions and produce meaningful results. This study therefore proposes a method of applying big data in the field of packaging, with a focus on marketing. It suggests using private data and community data to analyze the interaction between consumers and products. Social big data makes it possible to understand preferred packaging and consumer perceptions and emotions within the same product line. It can also be used to analyze the effect of packaging among the various components of a product. Because packaging is only one of many components of a product, the impact of a single packaging element is not easy to isolate. Nevertheless, this study presents the possibility of using big data to analyze consumers' perceptions of and feelings about packaging.

Analyzing the Effect of Characteristics of Dictionary on the Accuracy of Document Classifiers (용어 사전의 특성이 문서 분류 정확도에 미치는 영향 연구)

  • Jung, Haegang;Kim, Namgyu
    • Management & Information Systems Review / v.37 no.4 / pp.41-62 / 2018
  • As the volume of unstructured data grows through social media, Internet news articles, and blogs, text analysis and related studies are becoming more important. Because text analysis is mostly performed on a specific domain or topic, constructing and applying a domain-specific dictionary has also become more important. The quality of the dictionary directly affects the results of unstructured data analysis, and it matters all the more because the dictionary presents the perspective of the analysis. Most studies on text analysis in the literature have emphasized the importance of dictionaries for obtaining clean, high-quality results. However, the effects of dictionaries have not been rigorously verified, even though the dictionary is known as the most essential factor in text analysis. In this paper, we generate three dictionaries in different ways from 39,800 news articles, and we analyze and verify the effect of each dictionary on the accuracy of document classification by defining the concept of Intrinsic Rate. The three methods are: 1) a batch construction method that builds a dictionary based on term frequency across all documents; 2) a method that extracts terms by category and integrates them; and 3) a method that extracts features for each category and integrates them. We compared the accuracy of three artificial neural network-based document classifiers to evaluate the quality of the dictionaries. In the experiment, accuracy tended to increase when the Intrinsic Rate was high, which suggests that the accuracy of document classification can be improved by increasing the Intrinsic Rate of the dictionary.
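
The difference between the first two construction methods can be sketched as follows, assuming that "building a dictionary" means keeping the most frequent terms. The sketch covers only methods 1) and 2); the abstract does not define the Intrinsic Rate or the feature extraction of method 3), so neither is computed here. The sample documents and function names are hypothetical.

```python
from collections import Counter

def batch_dictionary(docs, size):
    """Method 1 (assumed form): top `size` terms by frequency over the whole corpus."""
    counts = Counter(t for tokens, _ in docs for t in tokens)
    return {t for t, _ in counts.most_common(size)}

def per_category_dictionary(docs, size_per_category):
    """Method 2 (assumed form): top terms within each category, then their union."""
    by_category = {}
    for tokens, category in docs:
        by_category.setdefault(category, Counter()).update(tokens)
    merged = set()
    for counts in by_category.values():
        merged |= {t for t, _ in counts.most_common(size_per_category)}
    return merged

# Invented (tokens, category) pairs standing in for news articles.
docs = [
    (["market", "stock", "market"], "economy"),
    (["stock", "bank"], "economy"),
    (["market", "market"], "economy"),
    (["match", "goal", "goal"], "sports"),
]
batch = batch_dictionary(docs, size=2)
merged = per_category_dictionary(docs, size_per_category=1)
print(sorted(batch), sorted(merged))
```

In this toy corpus the batch dictionary is dominated by the larger category, while the per-category dictionary keeps a term from the smaller one.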

A Comparative Study on the Social Awareness of Metaverse in Korea and China: Using Big Data Analysis (한국과 중국의 메타버스에 관한 사회적 인식의 비교연구: 빅데이터 분석의 활용)

  • Ki-youn Kim
    • Journal of Internet Computing and Services / v.24 no.1 / pp.71-86 / 2023
  • The purpose of this exploratory study is to compare how the public in Korean and Chinese societies perceives the metaverse, using big data analysis. Owing to the environmental impact of the COVID-19 pandemic, technological progress, and the expansion of new consumer bases such as Generation Z and Generation Alpha, worldwide interest in the metaverse has grown, and related academic studies have been in full swing since 2021. In particular, Korea and China have emerged as leading countries in the metaverse industry. It is timely to examine the difference in social awareness using the big data accumulated in both countries at a time when mentions of the metaverse have skyrocketed. The analysis identifies the importance of keywords by analyzing word frequency, N-grams, and TF-IDF on cleaned data through text mining, and it analyzes the density and centrality of semantic networks to determine the strength of connections between words and their semantic relevance. Python 3.9 (Anaconda data science platform 3) and Textom 6 were used, and UCINET 6.759 was used for analysis and visualization in the semantic network analysis and structural CONCOR analysis. As a result, four blocks, each a group of similar words, were derived. These blocks represent different perspectives that reflect the types of social perception of the metaverse in both countries. Studies on the metaverse are increasing, but comparative studies between countries from a cross-cultural perspective have not yet been conducted. As a precedent, this study can provide theoretical grounds and meaningful insights for future studies.
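
The TF-IDF weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes one common variant, relative term frequency multiplied by log(N / df); the exact formula used by Textom is not stated in the abstract and may differ. The sample documents and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in docs for t in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            t: (c / len(doc)) * math.log(n_docs / df[t])
            for t, c in counts.items()
        })
    return weights

# Invented tokenized posts about the metaverse.
docs = [
    ["metaverse", "game", "platform"],
    ["metaverse", "education"],
    ["metaverse", "game", "game"],
]
weights = tf_idf(docs)
for w in weights:
    print(sorted(w.items(), key=lambda kv: kv[1], reverse=True))
```

Under this variant a word that occurs in every document, such as the search term itself, gets weight zero, which is why TF-IDF rankings can differ from raw frequency rankings.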

Trend Analysis of Barrier-free Academic Research using Text Mining and CONCOR (텍스트 마이닝과 CONCOR을 활용한 배리어 프리 학술연구 동향 분석)

  • Jeong-Ki Lee;Ki-Hyok Youn
    • Journal of Internet of Things and Convergence / v.9 no.2 / pp.19-31 / 2023
  • The importance of the barrier-free concept is being highlighted worldwide. This study used text mining to identify trends in barrier-free research, with the aim of supporting research and policies that create a barrier-free environment. The data analyzed are 227 papers published in domestic academic journals from 1996, when barrier-free research began, to 2022. The researchers converted the title, keywords, and abstract of each paper into text, and then analyzed the patterns of the papers and the meaning of the data. The results are summarized as follows. First, barrier-free research began to increase after 2009, with an annual average of 17.1 papers published. This is related to the implementation guidelines for the barrier-free certification system that took effect on July 15, 2008. Second, the text mining produced the following results. i) The word frequency analysis of top keywords identified important keywords such as barrier free, disabled, design, universal design, access, elderly, certification, improvement, evaluation, space, facility, and environment. ii) In the TF-IDF analysis, the main keywords were universal design, design, certification, house, access, elderly, installation, disabled, park, evaluation, architecture, and space. iii) In the N-gram analysis, barrier free+certification, barrier free+design, barrier free+barrier free, elderly+disabled, disabled+elderly, disabled+convenience facilities, the disabled+the elderly, society+the elderly, convenience facilities+installation, certification+evaluation index, physical+environment, life+quality, and others appeared as related terms. Third, in the CONCOR analysis, cluster 1 was barrier-free issues and challenges, cluster 2 was universal design and space utilization, cluster 3 was improving accessibility for the disabled, and cluster 4 was barrier-free certification and evaluation. Based on these results, this study presents policy implications for vitalizing barrier-free research and establishing a desirable barrier-free environment.
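
The N-gram step in this abstract can be sketched as a count of adjacent keyword pairs, written with "+" as in the results above. This is a minimal sketch assuming contiguous bigrams over already-tokenized text; the sample token lists and the function name `ngram_counts` are hypothetical.

```python
from collections import Counter

def ngram_counts(docs, n=2):
    """Count contiguous n-grams, joined with '+', across tokenized documents."""
    counts = Counter()
    for doc in docs:
        for i in range(len(doc) - n + 1):
            counts["+".join(doc[i:i + n])] += 1
    return counts

# Invented keyword sequences, for illustration only.
docs = [
    ["barrier free", "certification", "evaluation index"],
    ["barrier free", "design"],
    ["barrier free", "certification"],
    ["elderly", "disabled"],
    ["disabled", "elderly"],
]
bigrams = ngram_counts(docs)
print(bigrams.most_common(3))
```

Because n-grams are ordered, elderly+disabled and disabled+elderly are counted as separate entries, which matches how both orders appear in the reported results.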

Analysis of University Department Name using the R (R을 이용한 대학의 학과 명칭 분석)

  • Ban, ChaeHoon;Kim, Dong Hyun;Ha, JongSoo
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.6 / pp.829-834 / 2018
  • As information technology advances, big data is becoming more important and is being exploited across various industries. R is a language and environment for analyzing big data. Universities, the highest level of academic organization, continue to open and maintain departments in anticipation of evolving trends. Analyzing the names of the departments opened at universities therefore makes it possible to identify the requirements and needs behind recent trends. In this paper, we analyze the names of the departments offered at four-year universities using R. To do this, we collect the department names and measure their frequency to find out which majors appear most often at universities.

A Convergence Study for Development of Psychological Language Analysis Program: Comparison of Existing Programs and Trend Analysis of Related Literature (심리학적 언어분석 프로그램 개발을 위한 융합연구: 기존 프로그램의 비교와 관련 문헌의 동향 분석)

  • Kim, Youngjun;Choi, Wonil;Kim, Tae Hoon
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.1-18 / 2021
  • While content word-based frequency analysis has obvious limitations in handling intentional deception or irony, KLIWC has evolved toward functional word analysis and KrKwic has evolved as a way to visualize co-occurrence frequencies. However, after more than 10 years of development, several issues still need improvement. We therefore set out to develop a new psychological language analysis program by analyzing KLIWC and KrKwic. First, the two programs were analyzed. In particular, the morpheme classifications of KLIWC and of Korean morpheme analyzers were compared to enhance functional word analysis, and the psychological dictionary was analyzed to strengthen psychological analysis. The analysis showed that the Hannanum part-of-speech analyzer was the most finely subdivided overall, but KLIWC was more subdivided for personal pronouns and KKMA for endings, suggesting the integrated use of multiple part-of-speech analyzers to strengthen functional word analysis. Second, the research trends of studies that analyzed texts with these programs were examined. The two programs were used in various academic fields, including interdisciplinary studies. In particular, KrKwic was used widely for the analysis of papers and reports, and KLIWC was used widely for comparative studies of writers' thoughts, emotions, and personality. Based on these results, the necessity and direction of development of a new psychological language analysis program are suggested.

Technology Development Strategy of Piggyback Transportation System Using Topic Modeling Based on LDA Algorithm

  • Jun, Sung-Chan;Han, Seong-Ho;Kim, Sang-Baek
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.261-270 / 2020
  • In this study, we identify promising technologies for the Piggyback transportation system by analyzing relevant patent information. To do so, we first build a patent database by extracting relevant technology keywords from pioneering research papers on the Piggyback flatcar system. We then employ text mining to identify frequently occurring words in the patent database and, using these words, apply the LDA (Latent Dirichlet Allocation) algorithm to identify "topics" that correspond to "key" technologies for the Piggyback system. Finally, we employ the ARIMA model to forecast the trends of these "key" technologies and identify the promising technologies for the Piggyback system, combining the patent analysis with a keyword search method. The results show that a data-driven integrated management system, an operation planning system, and special cargo (especially fluid and gas) handling/storage technologies are the "key" promising technologies for the future of the Piggyback system, and that data reception/analysis techniques must be developed to improve system performance. The proposed procedure and analysis method provide useful insights for developing the R&D strategy and technology roadmap for the Piggyback system.
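
To make the LDA step in this abstract more concrete, here is a small sketch of topic inference by collapsed Gibbs sampling, one standard way to fit LDA. It is not the authors' implementation: the abstract does not say which inference method or software was used, and the hyperparameters `alpha` and `beta`, the toy documents, and the function name `lda_gibbs` are assumptions. The ARIMA forecasting step is not covered.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word, vocab) as count tables.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * v for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, z = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                probs = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta) / (topic_total[k] + v * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=probs)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab

# Invented keyword lists standing in for patent abstracts.
docs = [
    ["cargo", "storage", "gas", "cargo"],
    ["data", "management", "system", "data"],
    ["gas", "storage", "cargo"],
    ["system", "data", "planning"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2, n_iter=100)
for k, row in enumerate(topic_word):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

The highest-count words in each row of `topic_word` give a rough label for that topic, and tracking the topic shares per year would provide a series that a forecasting model could then be fitted to.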