• Title/Summary/Keyword: wordcloud analysis


100 Article Paper Text Minning Data Analysis and Visualization in Web Environment (웹 환경에서 100 논문에 대한 텍스트 마이닝, 데이터 분석과 시각화)

  • Li, Xiaomeng;Li, Jiapei;Lee, HyunChang;Shin, SeongYoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.157-158 / 2017
  • Big data from articles can be analyzed and text-mined with Python, a programming language that is easy to use. We study and use Python to create a web environment in which the analysis results are shown directly in the browser. This paper covers 100 articles from Altmetric, which tracks a range of sources. Large data sets need to be collected and analyzed with an effective method. Once the results are available, Python's wordcloud is used to produce an image that directly shows the most frequent words.

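
The frequency-then-wordcloud workflow described in the abstract above can be sketched in Python roughly as follows. This is a minimal sketch, not the authors' code: the sample text and the helper names `word_frequencies` and `render_cloud` are hypothetical, and the rendering step assumes the third-party `wordcloud` package is installed.

```python
import re
from collections import Counter


def word_frequencies(text, min_length=3):
    """Count lowercase alphabetic tokens of at least min_length characters."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if len(t) >= min_length)


def render_cloud(freqs, path="cloud.png"):
    """Render the counts as an image (assumes the `wordcloud` package)."""
    from wordcloud import WordCloud  # third-party, imported lazily

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(dict(freqs))
    cloud.to_file(path)


# Hypothetical sample text standing in for the collected article data.
sample = "Data analysis of data: text mining turns text data into insight."
freqs = word_frequencies(sample)
top_words = freqs.most_common(2)
```

Calling `render_cloud(freqs)` would write the image to disk. The counting step is kept separate so it can be checked without the plotting dependency.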

A Study on the Use of Stopword Corpus for Cleansing Unstructured Text Data (비정형 텍스트 데이터 정제를 위한 불용어 코퍼스의 활용에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.891-897 / 2022
  • In big data analysis, raw text data mostly exists in various unstructured forms, so it becomes structured data that can be analyzed only after heuristic pre-processing and computer-based post-processing cleansing. In this study, unnecessary elements are therefore removed from the collected raw data in pre-processing so that the wordcloud function of the R program, one of the text data analysis techniques, can be applied, and stopwords are removed in post-processing. A wordcloud case study was then conducted, which calculates word frequencies and presents high-frequency words as key issues. To address the problems of the existing "nested stopword source code" method of stopword processing in R's word cloud technique, we propose using a "general stopword corpus" and a "user-defined stopword corpus" and conduct a case analysis. The advantages and disadvantages of the proposed "unstructured data cleansing process model" are compared, verified, and presented, along with a practical application of word cloud visualization analysis using the "proposed external corpus cleansing technique".
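
The idea of keeping stopwords in external corpora rather than nesting them in the analysis code can be illustrated with a short sketch. The paper works in R; the Python below is only an analogy under stated assumptions: the two stopword sets, the sample sentence, and the helper `clean_tokens` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the two external corpora named in the abstract.
GENERAL_STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}
USER_STOPWORDS = {"study", "analysis"}


def clean_tokens(text, *stopword_sets):
    """Tokenize, then drop every token found in any supplied stopword set."""
    stopwords = set().union(*stopword_sets)
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in stopwords]


raw = "The study of text cleansing is a study in text analysis."
tokens = clean_tokens(raw, GENERAL_STOPWORDS, USER_STOPWORDS)
counts = Counter(tokens)
```

Because the stopword sets are passed in as data, a domain-specific list can be swapped or extended without touching the cleansing function.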

A Study on Improving the Satisfaction of Non-face-to-face Video Lectures Using IPA Analysis (IPA 분석법을 활용한 비대면 동영상 강의 만족도 제고 방안 연구)

  • Jung, Dae-Hyun;Kim, Jin-Sung
    • The Journal of Information Systems / v.29 no.4 / pp.45-56 / 2020
  • Purpose: This study aims to present a direction for efficient e-learning education through a survey of the importance and satisfaction ratings given by learners of non-face-to-face video lectures. By comparing satisfaction against importance through IPA (importance-performance analysis), we propose improvement measures for inadequate teaching methods. Design/methodology/approach: For the IPA, we conducted an online survey at four universities and analyzed 154 samples. SPSS was used for the analysis, and learners' suggestions about the non-face-to-face lecture format were analyzed with the wordcloud method in R to derive implications for improving the quality of education. Findings: In the overall satisfaction survey for non-face-to-face classes, the factors with the greatest dissatisfaction were, in order, complaints about the adequacy of learning materials and activities (quizzes, discussions, assignments, etc.), complaints about how to use the produced content, and complaints about announcements on class management (lecture schedule, lecture method). The dissatisfaction factors were clear in non-face-to-face classes where interactive communication was impossible or insufficient. In addition to the lack of quick Q&A, there also appears to have been some degree of neglect.
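
As a rough illustration of how IPA sorts survey items, the sketch below places each item in one of the four conventional quadrants by comparing its importance and satisfaction means with the overall means. The item names and scores are hypothetical and not taken from the study, and using the grand means as cut-offs is one common convention, assumed here.

```python
from statistics import mean

# Hypothetical survey means on a 5-point scale: (importance, satisfaction).
items = {
    "learning materials": (4.5, 3.0),
    "content usability": (4.2, 4.1),
    "class announcements": (3.4, 3.1),
    "video quality": (3.5, 4.3),
}


def ipa_quadrant(importance, satisfaction, imp_cut, sat_cut):
    """Return the conventional IPA quadrant label for one item."""
    if importance >= imp_cut:
        return "keep up the good work" if satisfaction >= sat_cut else "concentrate here"
    return "possible overkill" if satisfaction >= sat_cut else "low priority"


# Cut-offs: grand means across all items (an assumed convention).
imp_cut = mean(i for i, _ in items.values())
sat_cut = mean(s for _, s in items.values())
quadrants = {
    name: ipa_quadrant(i, s, imp_cut, sat_cut) for name, (i, s) in items.items()
}
```

Items that land in "concentrate here" (high importance, low satisfaction) are the ones an improvement plan would target first.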

Text Mining Analysis Technique on ECDIS Accident Report (텍스트 마이닝 기법을 활용한 ECDIS 사고보고서 분석)

  • Lee, Jeong-Seok;Lee, Bo-Kyeong;Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety / v.25 no.4 / pp.405-412 / 2019
  • SOLAS requires that ECDIS be installed on ships of more than 500 gross tonnage engaged in international navigation by the first inspection after July 1, 2018. Several accidents related to the use of ECDIS have occurred since its installation as a new major navigation instrument. Twelve incident reports issued by MAIB, BSU, BEAmer, DMAIB, and DSB were analyzed, and the causes of the accidents were found to be related to the operation by the navigator and to the ECDIS system. The text was analyzed with the R program to quantify the words related to the causes of the accidents. We used text mining techniques such as Wordcloud, Wordnetwork, and Wordweight to represent the importance of words according to their frequency. Wordcloud uses the N-gram model to express the frequency of words in cloud form. In the uni-gram analysis, the word "ECDIS" occurred most often, and in the bi-gram analysis the term "Safety Contour" was used most frequently. Based on the bi-gram analysis, the causative words were classified into those concerning the officer and those concerning the ECDIS system, and the related words were represented with Wordnetwork. Finally, the words related to the officer and to the ECDIS system were compiled into word corpora, and Wordweight was applied to analyze the change in corpus frequency by year. Trend-line analysis of the corpus variation shows that the officer corpus has decreased more recently, while the ECDIS system corpus is gradually increasing.
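
The uni-gram and bi-gram counting mentioned in the abstract can be sketched as follows. The study used R; this Python version is only an illustration, and the sample sentence is invented for the example rather than quoted from any accident report.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented sample text; not taken from an actual report.
report = "The safety contour was not set. The officer ignored the safety contour alarm."
tokens = re.findall(r"[a-z]+", report.lower())

unigrams = Counter(ngrams(tokens, 1))
# Note: this simple version lets bi-grams span sentence boundaries.
bigrams = Counter(ngrams(tokens, 2))
```

A multi-word term such as "safety contour" only surfaces in the bi-gram counts, which is why the two levels can rank different items first.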

Necessity of the Physical Distribution Cooperation to Enhance Competitive Capabilities of Healthcare SCM -Bigdata Business Model's Viewpoint- (의료 SCM 경쟁역량 강화를 위한 물류공동화 도입 필요성 -빅데이터 비즈니스 모델 관점-)

  • Park, Kwang-O;Jung, Dae-Hyun;Kwon, Sang-Min
    • Management & Information Systems Review / v.39 no.3 / pp.17-35 / 2020
  • The purpose of this study is to develop business models for current situational scenarios that reflect customer needs, and to emphasize the need for a logistics cooperation system by analyzing big data to strengthen SCM competitive capabilities. The healthcare SCM competitiveness factors relevant to the intent to use logistics cooperation were divided into product quality, price leadership, hand-over speed, and process flexibility. In the wordcloud results for the major considerations in achieving work efficiency between medical institutes, words such as unexpected situations, information sharing, delivery, real-time, and convenience were mentioned frequently. This can be read as expressing the need for a system that can respond immediately to emergency situations on weekends. In addition to the pursuit of communication and convenience, the importance of real-time information sharing that contributes to efficient inventory management was evident. Accordingly, a business model should aim to enhance real-time visibility of the logistics pipeline using on-site big data analysis. Analysis of the effects of supply chain network adaptability on healthcare SCM competitiveness showed that competitive capabilities can be obtained by implementing logistics cooperation. Stronger partnerships such as logistics cooperation will lead to SCM competitive capabilities. SCM competitiveness will need to be strengthened by seeking a strategic approach that promotes mutual partnerships among companies using the joint logistics system of medical institutes. In particular, ways to utilize HCSM through big data analysis should be sought as a logistics cooperation system is built.

A Study on the Development of Korean Defense Standards through Text Mining-Based Trend Analysis of United States Defense Standards (텍스트 마이닝 기반의 미국 국방 표준 동향 분석을 통한 한국 국방 표준의 발전 방안 연구)

  • Chae, Soohwan;Shim, Bohyun;Yeom, Seulki;Hong, Seongdon
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.3 / pp.651-660 / 2021
  • This study examined the trend of standards established in the United States to find points that can be applied to Korean defense standards. The titles of various United States defense standard documents registered on the web were selected for this research. A wordcloud was created after analyzing the frequency of words appearing in the titles using text mining. The trend of words appearing in MIL-STD by era was obtained. This study identified words that appear often because of the format of the document itself, words that appear regularly across eras, words that were used frequently in the past but are little used now, and words that received little attention in the past but appear recurrently now. In addition, the characteristics of each document type were derived from the wordclouds produced for various defense documents. In conclusion, Korean defense standards also need to consider the safe and efficient management, transport, and load design of hazardous materials. Furthermore, the quality of defense standards can be expected to improve if a defense standard document system focused on efficient management is established.

A case study of Digital humanities lecture on Marcel Proust's À La Recherche du temps perdu (마르셀 프루스트의 『잃어버린 시간을 찾아서』에 대한 디지털인문학적 강의 운영 사례 연구)

  • Jinyoung MIN
    • The Journal of the Convergence on Culture Technology / v.9 no.4 / pp.269-275 / 2023
  • Interest in À la recherche du temps perdu increased in 2021, the 150th anniversary of Proust's birth, and in 2022, the 100th anniversary of his death. We took a digital humanities approach to make these seven novels, which are known to be difficult, more accessible to Korean students majoring in French literature. The students used big data analysis tools and looked for clues to understanding the works in the visualized data. We picked out the main characters and places that appear in his works with Wordcloud, and checked awareness of Proust in Korea and abroad through various big data analysis sites such as Big Kinds and Textom. The students commented that, through the methodology of digital humanities, they gradually broadened their understanding of Proust's 『In Search of Lost Time』 rather than giving up on it as too difficult. This study confirmed that applying big data analysis and digital humanities is an appropriate teaching method for helping students broaden their understanding of French literature.

Analysis of Keywords and Language Networks of Pedagogical Problems in the Secondary-School Teacher's Employment Exam : Focusing on the 2019~2022 School Year Exam

  • Kwon, Choong-Hoon
    • Journal of the Korea Society of Computer and Information / v.27 no.7 / pp.115-124 / 2022
  • The purpose of this study is to analyze and present the keywords, trends, and keyword language networks for each year of the pedagogy paper in the secondary-school teacher's employment exam for the 2019~2022 school years. The main research methods were text mining and language network analysis, and the analysis programs included KrKwic, Wordcloud Maker, Ucinet6, and NetDraw. The results are as follows. First, keywords such as teacher, student, curriculum, class, and evaluation appeared in the top rankings, and keywords that reflect the recent shift to online classes during the COVID-19 situation (online, wiki, discussion method, information, etc.) also tended to appear. The keywords with a high frequency of occurrence in the combined four-year text were student (44), teacher (39), class (27), school (18), curriculum (16), online (10), and discussion method (8). Second, the overall language network of the high-frequency keywords across the four years showed a significant level of density (0.566), total number of links (492), and average degree of links (16.4). Degree centrality was highest for teacher (199.0), class (197.0), student (185.0), and school (150.0), in that order. Betweenness centrality was highest for teacher (30.859), class (18.956), student (16.054), and school (15.745), in that order. The results of this study are expected to serve as reference data for pre-service teachers, related institutions and persons, and the teachers and administrators of secondary-school teacher training institutions.
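
The density and average-degree figures in the abstract above are tied to the link count by two simple formulas: density is links divided by n(n-1), and average degree is links divided by n. The sketch below checks them under two assumptions that the abstract does not state: that the network has 30 keyword nodes, and that the 492 links are counted in both directions.

```python
def density(links, nodes):
    """Share of possible directed ties that are present: L / (n * (n - 1))."""
    return links / (nodes * (nodes - 1))


def average_degree(links, nodes):
    """Mean number of ties per node: L / n."""
    return links / nodes


LINKS = 492  # total number of links, as reported in the abstract
NODES = 30   # assumption: node count is not given in the abstract

d = density(LINKS, NODES)
k = average_degree(LINKS, NODES)
```

With these assumed inputs both reported values are reproduced to the stated precision, which suggests the three figures are mutually consistent. It does not confirm the actual node count used in the study.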

Analyzing Research Trends of Food Tourism Using Text Mining Techniques (텍스트마이닝 기법을 활용한 국내 음식관광 연구 동향 분석)

  • Shin, Seo-Young;Lee, Bum-Jun
    • Journal of the Korean Society of Food Culture / v.35 no.1 / pp.65-78 / 2020
  • The objective of this study was to review and evaluate the growing body of food tourism research and thereby identify its trends. Using a text mining technique, this paper traced the trends in the food tourism literature published from 2004 to 2018. The study reviewed 201 articles in the KCI database whose abstracts include the words 'food' and 'tourism'. The Wordcloud analysis showed that the research subjects were predominantly 'Festival', 'Region', 'Culture', and 'Tourist', with slight differences in frequency by time period. Based on main path analysis, we extracted the meaningful paths between domestically published cited references, resulting in a total of 12 networks from 2004 to 2018. The text network analysis, displayed as a sociogram (a visualization tool), indicated that the words with high centrality showed both similarities and differences across time periods. This study offers a new perspective for understanding the overall flow of the relevant research.

Analysis of Descriptive Lecture Evaluation on Liberal Arts ICT utilization using Topic Modeling (토픽 모델링을 활용한 교양 ICT 활용과정 서술형 강의평가 분석)

  • Kim, HyoSook
    • Journal of Platform Technology / v.8 no.1 / pp.33-40 / 2020
  • The purpose of this study is to identify the factors behind students' selection of an elective ICT utilization lecture and to find the positive and negative elements of the lecture by applying topic modeling, a text mining technique, to narrative lecture evaluations. To this end, data pre-processing, keyword frequency analysis, wordcloud visualization, and topic modeling were carried out on the 'reasons for selecting the lecture,' 'improvements to be made to the lecture,' and 'what I liked about the lecture' categories for the ICT utilization lecture offered in the second semester of 2019 at M University. The results show that students mostly registered for the lecture to obtain a certificate, and that being able to pursue certification while taking the lecture is a positive element. The negative elements, on the other hand, included the inconvenience of the classroom environment.
