• Title/Summary/Keyword: word-cloud

Search Result 179

A Study on Unstructured text data Post-processing Methodology using Stopword Thesaurus (불용어 시소러스를 이용한 비정형 텍스트 데이터 후처리 방법론에 관한 연구)

  • Won-Jo Lee
    • The Journal of the Convergence on Culture Technology / v.9 no.6 / pp.935-940 / 2023
  • Most text data collected through web scraping for artificial intelligence and big data analysis is large and unstructured, so it must be refined before analysis. The data becomes analyzable structured data through a heuristic pre-processing refinement step and a machine post-processing refinement step. In this study, the machine post-processing step uses a Korean dictionary and a stopword dictionary to extract vocabulary for the frequency analysis behind a word cloud. "User-defined stopwords" are then used to efficiently remove stopwords that the dictionaries did not remove. We propose a methodology for applying a "thesaurus" to this step and examine the pros and cons of the proposed refinement method through a case analysis with R's word cloud technique, in which the "user-defined stopword thesaurus" technique complements the problems of the existing "stopword dictionary" method. We present a comparative verification and suggest that the proposed methodology is effective in practical application.
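
The post-processing idea in this abstract can be illustrated with a minimal Python sketch (the paper itself works in R). The thesaurus entries, the base stopword list, and the function names below are hypothetical examples, not taken from the paper; the sketch only assumes that a "stopword thesaurus" groups variant forms under one stopword so that all of them are removed before frequency counting.

```python
from collections import Counter

# Hypothetical user-defined stopword thesaurus: each canonical stopword
# maps to variant forms that a plain stopword dictionary might miss.
STOPWORD_THESAURUS = {
    "study": {"studies", "research"},
    "use": {"used", "using"},
}


def build_stopword_set(thesaurus):
    """Flatten the thesaurus into a single set of tokens to remove."""
    stopwords = set()
    for canonical, variants in thesaurus.items():
        stopwords.add(canonical)
        stopwords.update(variants)
    return stopwords


def word_frequencies(tokens, base_stopwords, thesaurus):
    """Count lowercase tokens that survive the base list and the thesaurus."""
    removed = set(base_stopwords) | build_stopword_set(thesaurus)
    lowered = (token.lower() for token in tokens)
    return Counter(token for token in lowered if token not in removed)


tokens = "This study used text mining and studies using word cloud text".split()
freqs = word_frequencies(tokens, {"this", "and"}, STOPWORD_THESAURUS)
print(freqs.most_common(3))
```

The resulting counts are what a word cloud renderer would take as input; the rendering step is left out here.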

Intelligent Wordcloud Using Text Mining (텍스트 마이닝을 이용한 지능적 워드클라우드)

  • Kim, Yeongchang;Ji, Sangsu;Park, Dongseo;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.325-326 / 2019
  • This paper proposes an intelligent word cloud that improves on the existing method, which builds word clouds from noun frequencies obtained by text mining. We propose a method to visually show word clouds focused on other parts of speech, such as verbs, by effectively adding newly coined words and the like to the dictionary used to extract nouns in text mining. In the experiment, the KoNLP package was used to extract noun frequencies, and 80 new words that it did not support were added manually after examining their frequency.


An Overview of Data Security Algorithms in Cloud Computing

  • D. I. George Amalarethinam;S. Edel Josephine Rajakumari
    • International Journal of Computer Science & Network Security / v.23 no.5 / pp.65-72 / 2023
  • Cloud computing is one of the current research areas in computer science. Recently, "cloud" has become the buzzword used everywhere in the IT industry; it introduced the notion of 'pay as you use' and revolutionized developments in IT. The rapid growth of modern cloud computing enables 24×7 access to e-resources from anywhere at any time. It offers storage as a service, in which users' data can be stored on a cloud managed by a third party called the Cloud Service Provider (CSP). Since users' data are managed by a third party, they must be encrypted to ensure confidentiality and privacy. Different types of cryptographic algorithms are used for cloud security; this article discusses these algorithms and their security measures.

Analysis of key words published with the Korea Society of Emergency Medical Services journal using text mining (텍스트마이닝을 이용한 한국응급구조학회지 중심단어 분석)

  • Kwon, Chan-Yang;Yang, Hyun-Mo
    • The Korean Journal of Emergency Medical Services / v.24 no.1 / pp.85-92 / 2020
  • Purpose: The purpose of this study was to analyze the English abstract key words found within the Korea Society of Emergency Medical Services journal using text mining techniques, to determine how well these terms adhere to Medical Subject Headings (MeSH), and to identify key word trends. Methods: We analyzed 212 papers that were published from 2012 to 2019. Web scraping and frequency analysis of key words were conducted in R using its basic and text mining packages. Additionally, the Word Clouds package was used for visualization. Results: The average number of key words used per study was 3.9. Word cloud visualization revealed that CPR was most prominent in the first half of the period and emergency medical technician was most frequently used during the second half. A total of 542 (64.9%) key words exactly matched MeSH listed words, and a total of 293 (35%) key words did not. Conclusion: Researchers should obey submission rules. Further, journals should update their respective submission rules. MeSH key words that are frequently cited should be suggested for use.
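
The exact-match comparison reported in this abstract can be sketched as follows. This is an illustrative Python version (the study used R); the sample key words and the two-term vocabulary are toy stand-ins rather than the study's data or the real MeSH file, and case-insensitive matching is an assumption, since the abstract does not say how case was handled. The last lines only check that the reported counts are consistent with each other: 542 and 293 sum to 835 key words.

```python
def match_rate(keywords, controlled_vocabulary):
    """Return (matched, unmatched, matched share) for exact, case-insensitive matches."""
    vocab = {term.casefold() for term in controlled_vocabulary}
    matched = sum(1 for keyword in keywords if keyword.casefold() in vocab)
    unmatched = len(keywords) - matched
    share = matched / len(keywords) if keywords else 0.0
    return matched, unmatched, share


# Toy stand-ins for author key words and a controlled vocabulary (hypothetical).
sample_keywords = [
    "Cardiopulmonary Resuscitation",
    "CPR",
    "emergency medical technicians",
    "paramedic student",
]
sample_vocab = {"Cardiopulmonary Resuscitation", "Emergency Medical Technicians"}
result = match_rate(sample_keywords, sample_vocab)

# Consistency check on the counts reported in the abstract.
reported_matched, reported_unmatched = 542, 293
reported_share = reported_matched / (reported_matched + reported_unmatched)
print(result, round(reported_share * 100, 1))
```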

Analysis of Inauguration Address of Previous Korean Presidents Based on Network (네트워크 기반 대한민국 역대 대통령 취임사 분석)

  • Kim, Hak Yong
    • The Journal of the Korea Contents Association / v.21 no.11 / pp.11-19 / 2021
  • The presidential inaugural address is a very useful means of presenting the national vision and conveying the president's political philosophy and policy direction to the people. For this reason, analyzing the address helps in understanding the president and the period of the presidency. The address can be analyzed in various academic fields, but this study treated it purely as content and analyzed it as a network. Word cloud analysis based on the frequency of words appearing in the address is widely used. Network-based analysis is a useful method because it can derive the context contained in sentences. We built a network of all the inaugural addresses of past presidents of the Republic of Korea and presented its structural factors. The presidents and their political directions were characterized by comparing the key words derived from the network with those from the word cloud. We also constructed a network for each president's inaugural address and presented the characteristics of each address by comparing its key words with closeness centrality, a structural factor of the network. The network-based analysis of past presidential inaugural addresses is expected to serve ultimately as data for understanding and evaluating presidents.
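
Closeness centrality, the structural factor named in this abstract, can be sketched in plain Python as below. The sketch assumes an unweighted word network in which two words are linked when they appear in the same sentence, and it uses the common definition (number of reachable nodes divided by the sum of shortest-path distances); the abstract does not state the paper's exact network construction or normalization, and the sample sentences are invented.

```python
from collections import deque
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Link every pair of distinct words that share a sentence."""
    graph = {}
    for words in sentences:
        unique = sorted(set(words))
        for word in unique:
            graph.setdefault(word, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def closeness_centrality(graph, source):
    """Reachable node count divided by the sum of BFS shortest-path distances."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Invented tokenized sentences, for illustration only.
sentences = [
    ["nation", "people", "economy"],
    ["people", "peace"],
    ["peace", "unification"],
]
graph = build_cooccurrence_graph(sentences)
scores = {word: closeness_centrality(graph, word) for word in graph}
most_central = max(scores, key=scores.get)
print(most_central, round(scores[most_central], 3))
```

A word with high closeness sits a short distance from every other word in the network, which is a different signal from raw frequency in a word cloud.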

Text Mining of Successful Casebook of Agricultural Settlement in Graduates of Korea National College of Agriculture and Fisheries - Frequency Analysis and Word Cloud of Key Words - (한국농수산대학 졸업생 영농정착 성공 사례집의 Text Mining - 주요단어의 빈도 분석 및 word cloud -)

  • Joo, J.S.;Kim, J.S.;Park, S.Y.;Song, C.Y.
    • Journal of Practical Agriculture & Fisheries Research / v.20 no.2 / pp.57-72 / 2018
  • In order to extract meaningful information from the excellent farming settlement cases of young farmers published by KNCAF, we studied the key words with text mining and created a word cloud for visualization. First, in the text mining results for the entire sample, 'CEO', 'corporate executive', 'think', 'self', 'start', 'mind', and 'effort' were high-frequency words among the top 50 core words. This shows that the graduates' ability to think, judge, and push ahead on their own reflects their capacity to be managers. It also expresses how they manage to achieve their dreams without giving up. The high frequency of words such as 'father' and 'parent' is due to the high ratio of parents' cooperation and succession. Also, 'KNCAF', 'university', 'graduation', and 'study' reflect their high educational awareness, and 'organic farming' and 'eco-friendly' reflect their interest in eco-friendly agriculture. In addition, words related to the 6th industry such as 'sales' and 'experience' represent their efforts to revitalize farming and fishing villages. Meanwhile, 'internet', 'blog', 'online', 'SNS', 'ICT', 'composite', and 'smart' were not included in the top 50. However, the fact that these words were extracted without omission shows that young farmers are increasingly interested in making agriculture and fisheries more scientific and high-tech. Next, when the top 50 key words were grouped by crop, 'facilities' was extracted as a main word in livestock, vegetables, and aquatic crops, and 'equipment' and 'machine' in food crops. 'Eco-friendly' and 'organic' appeared in vegetable crops and food crops, and 'organic' appeared in fruit crops. 'Worm', associated with eco-friendly farming methods, appeared in food crops, and 'certification', which denotes excellent agricultural and marine products, appeared only in fishery crops. 'Production', which is related to the '6th industry', appeared in all crops, 'processing' and 'distribution' appeared in fruit crops, and 'experience' appeared in vegetable crops, food crops, and fruit crops. To visualize the words extracted by text mining, we created word clouds for the entire sample and for each crop sample. As a result, we were able to judge the meaning of the excellent practices, which are unstructured text, by character size.

Evaluation of Facilitating Factors for Cloud Service by Delphi Method (델파이 기법을 이용한 클라우드 서비스의 개념 정의와 활성화 요인 분석)

  • Suh, Jung-Han;Chang, Suk-Gwon
    • Journal of Information Technology Services / v.11 no.2 / pp.107-118 / 2012
  • Recently, as cloud computing has begun to receive great attention from people all over the world, it has become the most popular buzzword in IT magazines and journals and is heard across many different services and fields. However, the notion of the cloud service remains vaguely defined relative to the growing attention it receives. Generally, the cloud service can be understood as a specific service model based on cloud computing, but the four terms 'cloud', 'cloud computing', 'cloud computing service', and 'cloud service' are often used without distinguishing their notions and characteristics, which makes it difficult to define the exact nature of the cloud service. To explore and analyze the cloud service systematically, an accurate concept and scope must first be established. Therefore, this study first clarifies the definition through the Delphi method with an expert group and then provides the foundation needed for related research, such as establishing business models, value chains, and policies for its activation. For the Delphi study, 16 experts from different fields, such as industry, academia, and the research sector, participated in several surveys. The results identify the characteristics of the cloud service as follows: Pay per use, Scalability, Internet centric Virtualization. The scope is defined as including Grid Computing, Utility Computing, Server Based Computing, and Network Computing.

Research Trend Analysis by using Text-Mining Techniques on the Convergence Studies of AI and Healthcare Technologies (텍스트 마이닝 기법을 활용한 인공지능과 헬스케어 융·복합 분야 연구동향 분석)

  • Yoon, Jee-Eun;Suh, Chang-Jin
    • Journal of Information Technology Services / v.18 no.2 / pp.123-141 / 2019
  • The goal of this study is to review the major research trends in convergence studies of AI and healthcare technologies. For the study, 15,260 English articles on AI and healthcare related topics, covering 55 years from 1963, were collected from Scopus and analyzed with text mining techniques. As a result, seven key research topics were defined: "AI for Clinical Decision Support System (CDSS)", "AI for Medical Image", "Internet of Healthcare Things (IoHT)", "Big Data Analytics in Healthcare", "Medical Robotics", "Blockchain in Healthcare", and "Evidence Based Medicine (EBM)". The results of this study can be used by researchers and government to set up and develop appropriate healthcare R&D strategies. The text mining techniques applied in this study were Text Analysis, Frequency Analysis, Topic Modeling with LDA (Latent Dirichlet Allocation), Word Cloud, and Ego Network Analysis.

A Design and Implementation of Disaster Text Crawling and Visualization Application (재난 문자 크롤링 및 시각화 애플리케이션 설계 및 구현)

  • Lee, Won Joo;Park, Bong Kyun;Park, Mun Kyu
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.89-90 / 2021
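  • In this paper, we design and implement a disaster text message crawling and data visualization application based on Python and the Selenium library. The application crawls disaster text message data from the web and visualizes it according to its frequency. Using the application, we access the National Disaster and Safety Portal (국민재난안전포털), crawl disaster text message data, and use a Word Cloud to visualize the frequency of disaster text messages by region. By visualizing the regional frequency of disaster messages so that it can be seen at a glance, the application makes it easy to deliver a region's disaster information to people who do not check disaster messages closely.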


Analysis of Laughter Therapy Trend Using Text Network Analysis and Topic Modeling

  • LEE, Do-Young
    • Journal of Wellbeing Management and Applied Psychology / v.5 no.4 / pp.33-37 / 2022
  • Purpose: This study aims to understand the trends and central concepts of domestic research on laughter therapy. For the analysis, this study used a total of 72 theses, published from 2007 to 2021, identified by searching for the keyword 'laughter therapy'. Research design, data and methodology: This study built and analyzed a keyword co-occurrence network, analyzed the types of research through topic modeling, and examined the visualized word cloud and sociogram. The keyword data, cleaned through preprocessing, were converted to a 1-mode matrix and analyzed by centrality analysis and topic modeling using the NetMiner (version 4.4) program. Results: The keywords that appeared most over the last 14 years were laughter therapy, depression, the elderly, and stress. The five topics identified in the thesis data from 2007 to 2021 were therapy, cognitive behavior, quality of life, stress, and the elderly. Conclusions: This study identified the flow and trends of research topics in domestic laughter therapy over the last 14 years; research on laughter therapy that reflects changes over time should continue in the future.
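
The 1-mode conversion mentioned in this abstract can be illustrated with a small Python sketch. The study used NetMiner, whose internals are not described in the abstract; the code below only assumes the usual meaning of the step, namely projecting a thesis-by-keyword (2-mode) listing onto keyword-by-keyword co-occurrence counts. The sample keyword lists are invented.

```python
from collections import Counter
from itertools import combinations


def one_mode_cooccurrence(doc_keywords):
    """Project per-document keyword lists onto weighted keyword pairs."""
    pair_counts = Counter()
    for keywords in doc_keywords:
        # Each unordered pair is counted once per document it appears in.
        for a, b in combinations(sorted(set(keywords)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Invented keyword lists for three theses, for illustration only.
doc_keywords = [
    ["laughter therapy", "depression", "elderly"],
    ["laughter therapy", "stress"],
    ["laughter therapy", "depression"],
]
pairs = one_mode_cooccurrence(doc_keywords)
strongest_pair, weight = pairs.most_common(1)[0]
print(strongest_pair, weight)
```

The weighted pairs form the edge list of the keyword co-occurrence network, on which centrality measures can then be computed.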