• Title/Summary/Keyword: Keywords Analysis

Exploring Issues Related to the Metaverse from the Educational Perspective Using Text Mining Techniques - Focusing on News Big Data (텍스트마이닝 기법을 활용한 교육관점에서의 메타버스 관련 이슈 탐색 - 뉴스 빅데이터를 중심으로)

  • Park, Ju-Yeon;Jeong, Do-Heon
    • Journal of Industrial Convergence / v.20 no.6 / pp.27-35 / 2022
  • This study analyzes metaverse-related issues in news big data from an educational perspective, explores their characteristics, and draws implications for the educational applicability of the metaverse and for future education. To this end, 41,366 metaverse-related items retrieved from portal sites were collected; the weights of all extracted keywords were calculated and ranked using TF-IDF, a representative term-weighting model, and a word cloud visualization was then produced. In addition, major topics were analyzed using topic modeling (LDA), a probability-based text mining technique. The results identified topics such as platform industry, future talent, and extension in technology as core issues of the metaverse from an educational perspective. A secondary analysis under three key themes (technology, job, and education) further showed that the metaverse raises issues of education platform innovation, future job innovation, and future competency innovation for future education. The study is meaningful in that it analyzes a vast amount of news big data in stages to draw out issues from an educational perspective and to provide implications for future education.
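
The TF-IDF ranking step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant or tooling the authors used, so the weighting below (relative term frequency times log inverse document frequency, summed over documents) is an assumption, and the toy corpus and the name `tfidf_ranking` are hypothetical.

```python
import math
from collections import Counter


def tfidf_ranking(docs):
    """Rank terms by their TF-IDF weight summed over tokenized documents.

    Assumes every document is a non-empty list of tokens.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for term, count in tf.items():
            idf = math.log(n_docs / df[term])
            scores[term] += (count / len(doc)) * idf
    # Highest weight first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus, not data from the study.
docs = [
    "metaverse platform education platform".split(),
    "metaverse job talent".split(),
    "metaverse education technology".split(),
]
ranking = tfidf_ranking(docs)
for term, weight in ranking:
    print(f"{term}: {weight:.3f}")
```

A term that occurs in every document, such as "metaverse" here, receives zero weight under this variant, which is why TF-IDF tends to surface distinguishing keywords rather than the search term itself.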

A Study on the Research Trends of Archival Preservation Papers in Korea from 2000 to 2021 (국내 기록보존 연구동향 분석: 2000~2021년 학술논문을 중심으로)

  • Yonwhee, Na;Heejin, Park
    • Journal of Korean Society of Archives and Records Management / v.22 no.4 / pp.175-196 / 2022
  • This study aims to determine the research trends in archival preservation through keyword analysis, understand the current research status, and identify the research topics' changes over time. The degree and betweenness centrality analyses were conducted and visualized on 463 "archival preservation studies" articles published from 2000 to 2021 in various academic journals, using NetMiner 4.0. The collected research papers were divided into three time periods according to when they were published: the first period (2000-2007), the second period (2008-2014), and the third period (2015-2021). The subject keywords for the research papers on archival preservation in Korea that have influence and expandability are as follows. Across all periods, these were "electronic records" and "long-term preservation." In addition, if taken separately per period, the "OAIS reference model" and "electronic records" dominated the first and second periods, respectively, while the "records management standard table" and "long-term preservation" both dominated the third period. A conceptual framework and theory-oriented study for archival preservation, such as "digital preservation," "digitalization," and the "OAIS reference model," dominated the first period. During the second period, more research focused on procedures and practical applications related to conservation activities, such as "electronic record," "appraisal," and "DRAMBORA." In contrast, the majority of the research in the third period was on technical implementation according to the changes in the records management environment, such as "data set," "administrative information system," and "social media."
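
As a rough illustration of the degree and betweenness centrality measures named in this abstract, the sketch below computes both on a small keyword co-occurrence graph. The study used NetMiner 4.0; this pure-Python version (using Brandes' algorithm for betweenness) is a stand-in under that assumption, and the edges are hypothetical, not the study's network.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (keyword, keyword) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once per direction.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical co-occurrence edges between subject keywords.
edges = [
    ("electronic records", "long-term preservation"),
    ("electronic records", "OAIS reference model"),
    ("electronic records", "appraisal"),
    ("long-term preservation", "OAIS reference model"),
]
graph = build_graph(edges)
degree = degree_centrality(graph)
betweenness = betweenness_centrality(graph)
```

In this toy graph "electronic records" is the only keyword that lies on shortest paths between other keywords, so it alone has non-zero betweenness; degree centrality captures how many keywords a term co-occurs with, while betweenness captures how often it bridges others.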

Analysis of domestic and foreign future automobile research trends based on topic modeling (토픽모델링 기반의 국내외 미래 자동차 연구동향 비교 분석: CASE 키워드 중심으로)

  • Jeong, Ho Jeong;Kim, Keun-Wook;Kim, Na-Gyeong;Chang, Won-Jun;Jeong, Won-Oong;Park, Dae-Yeong
    • Journal of Digital Convergence / v.20 no.5 / pp.463-476 / 2022
  • Since industrialization, the automobile industry has grown around the internal combustion engine, but it now faces major change with the recent Fourth Industrial Revolution. Most companies are preparing for the transition to electric vehicles and autonomous driving. In this study, topic modeling based on the LDA algorithm was therefore performed on 4,002 domestic papers and 68,372 overseas papers containing keywords related to CASE (Connectivity, Autonomous, Sharing, Electrification), which represents future automobile trends. The analysis found that domestic research mainly focuses on macroscopic aspects such as traffic infrastructure, urban traffic efficiency, and traffic policy. Based on this, the study argues that government technical support for MaaS (Mobility-as-a-Service) is required in the domestic shared car sector, and it presents the need for opening data by means of transportation. These results are expected to serve as basic data for the future automobile industry.

An Exploratory Study on the Learning Community: Focusing on the Covid19 Untact Era (배움공동체에 대한 탐색적 연구 : covid19 언택트시대를 중심으로)

  • Jeong, Su-Jeong;Im, Hong-Nam;Park, Hong-Jae
    • Journal of Convergence for Information Technology / v.12 no.5 / pp.237-245 / 2022
  • This study examines the social discourse on the characteristics of the learning community in the untact era, and discusses the directions that learning communities for children could explore and consider in the pandemic situation and beyond. For this purpose, big data for one year, from January 20, 2020 to January 20, 2021, were collected through internet portal sites (including Google News, Daum, Naver, and other news surfaces) using the two keywords "untact" and "learning community", and analyzed with word frequency and network analysis methods. The analysis results show that several important terms, such as 'village education community', 'operation', 'activity', 'corona 19', 'support', and 'online', are closely related to the learning community in the untact era. The findings also have implications for developing the learning community as an alternative model to fill existing gaps in public care and education for children during the prolonged pandemic and afterwards. In conclusion, the findings highlight that identifying key terms and concepts through word frequency analysis is a meaningful way to examine social trends and issues related to the learning community.
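
The word frequency and network analysis described in this abstract typically starts from term counts and a co-occurrence table. The sketch below shows one simple way to build both, assuming co-occurrence means "appears in the same document"; the abstract does not specify the window or tooling, and the sample documents and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequency_and_cooccurrence(docs):
    """Return term frequencies and document-level co-occurrence counts."""
    freq = Counter(term for doc in docs for term in doc)
    cooc = Counter()
    for doc in docs:
        # Each unordered pair is counted once per document, keyed in sorted order.
        for a, b in combinations(sorted(set(doc)), 2):
            cooc[(a, b)] += 1
    return freq, cooc


# Hypothetical pre-tokenized news items, not data from the study.
docs = [
    ["learning community", "online", "support"],
    ["learning community", "village education community", "activity"],
    ["learning community", "online", "activity"],
]
freq, cooc = frequency_and_cooccurrence(docs)
print(freq.most_common(3))
print(cooc.most_common(3))
```

The co-occurrence counts can be read as weighted edges of a keyword network, which is the usual input for the network analysis step.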

The Analysis of Research Trends in Social Service Quality Using Text Mining and Topic Modeling (텍스트 마이닝과 토픽모델링 활용한 사회서비스 품질의 학술연구 동향 분석)

  • Lee, Hae-Jung;Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence / v.8 no.3 / pp.29-40 / 2022
  • The aim of this study was to analyze research trends of social service quality from 2007 to 2020 based on text mining and topic modeling. Our focus was to provide foundational materials for social service improvement by discovering the latent meaning of relevant research papers. We collected 97 scholarly articles on social service, social welfare service, and quality from RISS, and implemented two segments of text mining analysis. Our results showed that the first section included 38 papers and the second 59, indicating 6.9 articles annually. Word frequency results demonstrated that the common keywords of both sections were 'service', 'quality', 'social service', 'satisfaction', 'users', 'quality control', 'reuse', 'policy', 'voucher', etc. TF-IDF suggested that 'social service', 'satisfaction', 'users', 'customer satisfaction', 'revisiting', 'voucher', 'quality', 'assisted living facility', 'quality control', 'community service investment business', etc., were represented in both categories. Lastly, topic modeling analysis revealed that the first segment displayed 'types of care services', 'service costs', 'reuse', 'users based', and 'job creation', whereas the second presented 'service quality', 'public value', 'management system of human resources', 'service provision system', and 'service satisfaction'. Future directions of social service quality were discussed based on the results.

Ensemble Learning-Based Prediction of Good Sellers in Overseas Sales of Domestic Books and Keyword Analysis of Reviews of the Good Sellers (앙상블 학습 기반 국내 도서의 해외 판매 굿셀러 예측 및 굿셀러 리뷰 키워드 분석)

  • Do Young Kim;Na Yeon Kim;Hyon Hee Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.4 / pp.173-178 / 2023
  • As Korean literature spreads around the world, its position in the overseas publishing market has become important. As demand in the overseas publishing market continues to grow, it is essential to predict future book sales and to analyze the characteristics of books that have been highly favored by overseas readers in the past. In this study, we propose an ensemble learning-based prediction model and analyze the characteristics of books published overseas over the past 5 years that are classified as good sellers, i.e., books with cumulative sales of more than 5,000 copies. We applied five ensemble learning models (XGBoost, Gradient Boosting, AdaBoost, LightGBM, and Random Forest) and compared them with other machine learning algorithms (Support Vector Machine, Logistic Regression, and Deep Learning). Our experimental results showed that the ensemble algorithms outperform the other approaches in handling imbalanced data. In particular, the LightGBM model obtained an AUC value of 99.86%, the best prediction performance. Among the features used for prediction, the most important is the author's number of overseas publications, and the second most important is publication in countries with the largest publication market size. The number of evaluation participants is also an important feature. In addition, text mining was performed on reviews of the four best-selling books among the good sellers. Many reviews focused on stories, characters, and writers, and because the keyword "translation" appears frequently in low-rated reviews, support for translation appears to be needed.
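
To make the AUC figure in this abstract concrete, the sketch below computes AUC from its standard pairwise definition: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. It illustrates the metric only, not the authors' evaluation pipeline, and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 8 ordinary titles (0) and 2 good sellers (1).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.05, 0.40, 0.25, 0.70, 0.80, 0.60]

auc = roc_auc(labels, scores)
# Accuracy of a trivial model that always predicts the majority class.
majority_accuracy = labels.count(0) / len(labels)
print(auc, majority_accuracy)
```

On imbalanced data a model that always predicts the majority class already reaches high accuracy (0.8 in this toy sample), which is one reason a ranking metric such as AUC is commonly reported instead.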

Abbreviation Disambiguation using Topic Modeling (토픽모델링을 이용한 약어 중의성 해소)

  • Woon-Kyo Lee;Ja-Hee Kim;Junki Yang
    • Journal of the Korea Society for Simulation / v.32 no.1 / pp.35-44 / 2023
  • Recently, many studies have used text analysis to examine trends, including research trends. When documents for such analysis are collected by searching for an abbreviation, the abbreviation must be disambiguated. In many studies, researchers classify documents by hand, reading them one by one to find the data needed for the study. Most existing work on abbreviation disambiguation aims to clarify word meaning and relies on supervised learning. Such methods are not well suited to classifying documents retrieved by an abbreviation search in order to find research data, and related studies are scarce. This paper proposes a method for semi-automatically classifying documents collected by abbreviation by performing topic modeling with Non-Negative Matrix Factorization, an unsupervised learning method, in the data pre-processing step. To verify the proposed method, papers were collected from an academic database using the abbreviation 'MSA'. The proposed method found 316 papers related to Micro Services Architecture among 1,401 papers, with a measured document classification accuracy of 92.36%. The proposed method is expected to reduce the time and cost that researchers spend on manual work.
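
The idea of separating documents that share an abbreviation by topic can be sketched with a small Non-Negative Matrix Factorization. The abstract does not describe the authors' implementation, so the version below is an assumption: it uses the Lee-Seung multiplicative updates on a hypothetical document-term count matrix, and the vocabulary, counts, and parameter choices are illustrative only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into W (docs x k) and H (k x terms), all nonnegative.

    Uses multiplicative updates that reduce the squared reconstruction error.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


# Hypothetical counts. Columns: microservice, container, api, alignment, sequence, protein.
# Rows 0-1 resemble software papers, rows 2-3 resemble bioinformatics papers,
# all of which could be retrieved by the same abbreviation.
v = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
]
w, h = nmf(v, k=2)
# Assign each document to the topic with the largest weight in W.
topics = [max(range(2), key=lambda t: w[i][t]) for i in range(len(w))]
print(topics)
```

With two well-separated vocabularies, the first two documents should land in one topic and the last two in the other; which topic index each group receives depends on the random initialization. A researcher would then inspect the top terms of each row of `h` to decide which topic corresponds to the intended meaning, which matches the semi-automatic nature of the method described in the abstract.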

Analysis of Resident's Satisfaction and Its Determining Factors on Residential Environment: Using Zigbang's Apartment Review Bigdata and Deeplearning-based BERT Model (주거환경에 대한 거주민의 만족도와 영향요인 분석 - 직방 아파트 리뷰 빅데이터와 딥러닝 기반 BERT 모형을 활용하여 - )

  • Kweon, Junhyeon;Lee, Sugie
    • Journal of the Korean Regional Science Association / v.39 no.2 / pp.47-61 / 2023
  • Satisfaction with the residential environment is a major factor influencing residential choice and migration, and is directly related to the quality of life in the city. As online real estate services grow, people's evaluations of the residential environment can be checked easily, making it possible to analyze their satisfaction and its determining factors based on those evaluations. This means that a larger volume of evaluations can be used more efficiently than with previous methods such as surveys. This study analyzed residential environment reviews from about 30,000 apartment residents, collected from 'Zigbang', an online real estate service in Seoul. Each Zigbang apartment review consists of a rating on a 5-point scale and a free-text evaluation written by the resident. First, the study labeled apartment reviews as positive or negative based on the scores of recommended reviews that include a comprehensive evaluation of the apartment. Next, to classify them automatically, a model was developed using Bidirectional Encoder Representations from Transformers (BERT), a deep learning-based natural language processing model. Then, SHapley Additive exPlanations (SHAP) was used to extract the word tokens that play an important role in classifying reviews, from which the determining factors of residential environment evaluations were derived. Furthermore, by analyzing related keywords with Word2Vec, the study suggested priority considerations for improving satisfaction with the residential environment. The study is meaningful in that it proposes a model that automatically classifies satisfaction with the residential environment as positive or negative using deep learning and apartment review big data, which is qualitative evaluation data from residents, and thereby derives the determining factors. The results can serve as basic data for improving satisfaction with the residential environment, for future evaluation of the residential environment near apartment complexes, and for the design and evaluation of new complexes and infrastructure.

Maritime Safety Tribunal Ruling Analysis using SentenceBERT (SentenceBERT 모델을 활용한 해양안전심판 재결서 분석 방법에 대한 연구)

  • Bori Yoon;SeKil Park;Hyerim Bae;Sunghyun Sim
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.7 / pp.843-856 / 2023
  • The global surge in maritime traffic has resulted in an increased number of ship collisions, leading to significant economic, environmental, physical, and human damage. The causes of these maritime accidents are multifaceted, often arising from a combination of crew judgment errors, negligence, complexity of navigation routes, weather conditions, and technical deficiencies in the vessels. Given the intricate nuances and contextual information inherent in each incident, a methodology capable of deeply understanding the semantics and context of sentences is imperative. Accordingly, this study utilized the SentenceBERT model to analyze maritime safety tribunal decisions over the last 20 years in the Busan Sea area, which encapsulated data on ship collision incidents. The analysis revealed important keywords potentially responsible for these incidents. Cluster analysis based on the frequency of specific keyword appearances was conducted and visualized. This information can serve as foundational data for the preemptive identification of accident causes and the development of strategies for collision prevention and response.

NFT(Non-Fungible Token) Patent Trend Analysis using Topic Modeling

  • Sin-Nyum Choi;Woong Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.41-48 / 2023
  • In this paper, we propose an analysis of recent trends in the NFT (Non-Fungible Token) industry using topic modeling techniques, focusing on their universal application across various industrial fields. For this study, patent data was used to understand industry trends. We collected data on 371 domestic and 454 international NFT-related patents registered in the patent information search service KIPRIS from 2017, when the first NFT standard was introduced, to October 2023. In the preprocessing stage, stopwords and lemmas were removed, and only nouns were extracted. For the analysis, the top 50 words by frequency were listed, and their corresponding TF-IDF values were examined to derive the key terms of the industry trends. Next, using the LDA algorithm, we identified four major latent topics within the patent data, both domestically and internationally. We analyzed these topics and presented our findings on NFT industry trends, supported by real-world industry cases. While previous reviews presented trends from an academic perspective using paper data, this study is significant in that it provides practical trend information based on data rooted in field practice. It is expected to be a useful reference for professionals in the NFT industry for understanding market conditions and generating new items.