• Title/Summary/Keyword: topic modeling


Recent Research Trend Analysis for the Journal of Society of Korea Industrial and Systems Engineering Using Topic Modeling (토픽모델링을 활용한 한국산업경영시스템학회지의 최근 연구주제 분석)

  • Dong Joon Park;Pyung Hoi Koo;Hyung Sool Oh;Min Yoon
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.3
    • /
    • pp.170-185
    • /
    • 2023
  • The advent of big data has brought about the need for analytics. Natural language processing (NLP), a field within big data, has received a lot of attention. Among NLP techniques, topic modeling is widely applied to identify key topics in various academic journals. The Korean Society of Industrial and Systems Engineering (KSIE) has published academic journals since 1978. To enhance its status, it is imperative to recognize the diversity of research domains. We have already discovered eight major research topics for papers published by KSIE from 1978 to 1999. As a follow-up study, we aim to identify the major topics of research papers published by KSIE from 2000 to 2022. We performed topic modeling on 1,742 research papers from this period using LDA and BERTopic, the latter of which has recently attracted attention. BERTopic outperformed LDA by providing a set of coherent topic keywords that can effectively distinguish the 36 topics found in this study. In terms of visualization techniques, pyLDAvis presented better two-dimensional scatter plots for the intertopic distance map than BERTopic. However, BERTopic provided much more diverse visualization methods to explore the relevance of the 36 topics. BERTopic was also able to classify hot and cold topics by presenting 'topic over time' graphs that can identify topic trends over time.

Analysis of Municipal Ordinances for Smart Cities of Municipal Governments: Using Topic Modeling (지방자치단체의 스마트시티 조례 분석: 토픽모델링을 활용하여)

  • Hyungjun Seo
    • Informatization Policy
    • /
    • v.30 no.1
    • /
    • pp.41-66
    • /
    • 2023
  • This study aims to reveal the direction of municipal ordinances for smart cities by applying topic modeling to 74 municipal ordinances from 72 municipal governments. The most frequent keywords relate to the establishment and operation of the Smart City Committee. Topic modeling with Latent Dirichlet Allocation (LDA) classifies municipal ordinances for smart cities into eight topics: Topic 1 (security for process of smart cities), Topic 2 (promotion of smart city industry), Topic 3 (composition of a smart city consultative body for local residents), Topic 4 (support system for smart cities), Topic 5 (management for personal information), Topic 6 (use of smart city data), Topic 7 (implementation for intelligent public administration), and Topic 8 (smart city promotion). As for topic categorization by region, Topics 5, 6, and 8, which are mostly related to the practical operation of smart cities, account for a significant portion of municipal ordinances for smart cities in the Seoul metropolitan area, whereas Topics 2, 3, and 4, which are mostly related to the initial implementation of smart cities, account for a significant portion of municipal ordinances for smart cities in provincial areas.

How Are the Direction and the Intensity of Indirect Social Information such as Likes and Dislikes Related to the Deliberative Quality of Online News Content Comments? A Topic Diversity Analysis Using Topic Modeling ('좋아요'와 '싫어요'같은 간접적 사회적 정보의 방향과 강도는 온라인 뉴스 콘텐츠 댓글의 숙의의 질과 어떤 관련이 있는가? 토픽 모델링을 이용한 토픽 다양성 분석)

  • Min, Jin Young;Lee, Ae Ri
    • The Journal of Information Systems
    • /
    • v.30 no.4
    • /
    • pp.303-327
    • /
    • 2021
  • Purpose: Online comments on news content have become social information and are understood in terms of deliberative democracy. Although related research has focused on the relationship between online comments and their deliberative quality, the social information provided by online comments consists not only of direct information, such as the comments themselves, but also of indirect information, such as 'likes' and 'dislikes'. Therefore, research on online comments and deliberative quality should study this direct and indirect information together, and should also consider the direction and degree of the indirect information. Design/methodology/approach: This study distinguishes comments by their attached 'likes' and 'dislikes', identifies highly supported and highly unsupported comments by the intensity of 'likes' and 'dislikes', and investigates the relationship between their existence and deliberative quality, measured as topic diversity. We applied topic modeling to 2,390 news articles and their 74,385 comments collected from five news sites. Findings: The topic diversities of the supported and unsupported comments are both related to the topic diversity of all comments, but the relationship is stronger for supported comments. Furthermore, the existence of highly supported and unsupported comments leads to less diversity across all comments compared to the case where those comments are absent. In particular, when only highly supported comments are present, topic diversity is lower than in the opposite case.
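
The abstract does not state how topic diversity was computed. As one possible illustration only, the sketch below scores diversity as the Shannon entropy of a comment set's topic proportions. The function name `topic_diversity` and the toy distributions are assumptions made for this example, not the authors' method or data.

```python
import math

def topic_diversity(topic_probs):
    """Shannon entropy (in nats) of a topic distribution; higher means more diverse."""
    total = sum(topic_probs)
    if total <= 0:
        raise ValueError("topic_probs must contain positive mass")
    # Normalize so the weights sum to 1, then skip zero entries (0 * log 0 is taken as 0).
    probs = [p / total for p in topic_probs]
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical topic proportions: one concentrated on a single topic, one spread evenly.
concentrated = [0.85, 0.05, 0.05, 0.05]
even = [0.25, 0.25, 0.25, 0.25]

print(topic_diversity(concentrated) < topic_diversity(even))
```

Under this measure, a comment set dominated by one topic scores lower than one whose comments are spread evenly over topics, which is the sense of "less diversity" used in the findings.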

Research Trends on Doctor's Job Competencies in Korea Using Text Network Analysis (텍스트네트워크 분석을 활용한 국내 의사 직무역량 연구동향 분석)

  • Kim, Young Jon;Lee, Jea Woog;Yune, So Jung
    • Korean Medical Education Review
    • /
    • v.24 no.2
    • /
    • pp.93-102
    • /
    • 2022
  • We use the concept of the "doctor's role" as a guideline for developing medical education programs for medical students, residents, and doctors. Therefore, we should regularly reflect on the times and social needs to develop a clear sense of that role. The objective of the present study was to understand the knowledge structure related to doctor's job competencies in Korea. We analyzed research trends related to doctor's job competencies in Korea Citation Index journals using text network analysis through an integrative approach focusing on identifying social issues. We finally selected 1,354 research papers related to doctor's job competencies from 2011 to 2020, and we analyzed 2,627 words through data pre-processing with the NetMiner ver. 4.2 program (Cyram Inc., Seongnam, Korea). We conducted keyword centrality analysis, topic modeling, frequency analysis, and linear regression analysis using NetMiner ver. 4.2 (Cyram Inc.) and IBM SPSS ver. 23.0 (IBM Corp., Armonk, NY, USA). As a result of the study, words such as "family," "revision," and "rejection" appeared frequently. In topic modeling, we extracted five potential topics: "topic 1: Life and death in medical situations," "topic 2: Medical practice under the Medical Act," "topic 3: Medical malpractice and litigation," "topic 4: Medical professionalism," and "topic 5: Competency development education for medical students." Although there were no statistically significant changes in the research trends for each topic over time, it is nonetheless known that social changes could affect the demand for doctor's job competencies.
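
The study ran its keyword centrality analysis in NetMiner, and the abstract does not say which centrality index was used. The sketch below shows the general idea with normalized degree centrality on a keyword co-occurrence network in plain Python. The function name `degree_centrality` and the toy keyword lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def degree_centrality(documents):
    """Normalized degree centrality of each keyword in a co-occurrence network."""
    neighbors = defaultdict(set)
    for keywords in documents:
        unique = sorted(set(keywords))
        for word in unique:
            neighbors[word]  # register the node even if it has no co-occurring keyword
        # Link every pair of distinct keywords that appear in the same document.
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {word: 0.0 for word in neighbors}
    # Divide by the maximum possible number of neighbors (n - 1).
    return {word: len(adj) / (n - 1) for word, adj in neighbors.items()}

# Toy keyword lists (hypothetical), one list per paper.
papers = [
    ["ethics", "education", "competency"],
    ["education", "law"],
    ["competency", "education"],
]
centrality = degree_centrality(papers)
print(max(centrality, key=centrality.get))
```

In this toy network "education" co-occurs with every other keyword, so it receives the highest score.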

A Study on the Topic Modeling Analysis of Book Reports on Personality Types and Interest Types (성격유형과 흥미유형에 따른 독서 감상문 토픽 분석 연구)

  • Jeong-Hoon Lim
    • Journal of the Korean Society for Information Management
    • /
    • v.40 no.1
    • /
    • pp.175-198
    • /
    • 2023
  • This study aimed to investigate the difference in response to reading as shown in book reports by personality type and interest type. For this purpose, personality type analysis data, interest type analysis data, and book report data written in subject reading activities were collected from 81 third graders at D Science High School in Daejeon. Topic analysis was conducted on the collected book reports, and the probability of a topic being mentioned was statistically tested according to personality type (thinking type, feeling type) and interest type (investigative type, types other than investigative). Subsequently, the conceptual connection structure of words was measured by keyword network analysis, and the analysis results of topic modeling were complemented by the centrality index. As a result of the study, the topic regression analysis showed statistically significant differences between thinking type (T) and feeling type (F) in topic 2 (understanding and studying) and topic 3 (reading and thinking), and statistically significant differences between investigative type and non-investigative type in topic 2 (understanding and studying). The results of this study can be used as a basis for tailored book recommendations and personalized reading education.

Analysis on Topic Trends and Topic Modeling of KSHSM Journal Papers using Text Mining (텍스트마이닝을 활용한 보건의료산업학회지의 토픽 모델링 및 토픽트렌드 분석)

  • Cho, Kyoung-Won;Bae, Sung-Kwon;Woo, Young-Woon
    • The Korean Journal of Health Service Management
    • /
    • v.11 no.4
    • /
    • pp.213-224
    • /
    • 2017
  • Objectives: The purpose of this study was to analyze representative topics and topic trends of papers in the Korean Society and Health Service Management (KSHSM) Journal. Methods: We collected English abstracts and keywords of 516 papers published in the KSHSM Journal from 2007 to 2017. We used Python web scraping programs to collect the papers from the Korea Citation Index web site, and RStudio software for topic analysis based on the latent Dirichlet allocation algorithm. Results: Perplexity analysis indicated that 9 was the best number of topics, and the resulting 9 topics for all the papers were extracted using the Gibbs sampling method. We refined the 9 topics to 5 topics by carefully considering the meaning of each topic and analyzing the intertopic distance map. In the topic trend analysis from 2007 to 2017, we verified that 'Health Management' and 'Hospital Service' were the two representative topics; 'Hospital Service' was the prevalent topic until 2011, but the ratios of the two topics became similar from 2012. Conclusions: We found that 5 was the best number of topics and that the topic trends reflected the main issues of the KSHSM Journal, such as the name revision of the society in 2012.
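
Choosing the number of topics by perplexity can be sketched as follows. Perplexity is the exponential of the negative average per-token log-likelihood on held-out text, and the candidate with the lowest value is preferred. The function names and all numbers below are invented for illustration; they are not the study's values, and the study itself ran this analysis in R.

```python
import math

def perplexity(log_likelihood, n_tokens):
    """Perplexity = exp(-log-likelihood / number of tokens); lower is better."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return math.exp(-log_likelihood / n_tokens)

def best_num_topics(held_out_log_likelihoods, n_tokens):
    """Return the candidate topic count whose model has the lowest perplexity."""
    scores = {
        k: perplexity(ll, n_tokens)
        for k, ll in held_out_log_likelihoods.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical held-out log-likelihoods for models fitted with k topics.
toy_log_likelihoods = {2: -6900.0, 4: -6700.0, 6: -6500.0, 8: -6600.0}
best_k, scores = best_num_topics(toy_log_likelihoods, n_tokens=1000)
print(best_k)
```

A perplexity minimum is only a starting point. As the abstract notes, the authors then reduced the topic count further by inspecting topic meanings and the intertopic distance map.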

Classification of Public Perceptions toward Smog Risks on Twitter Using Topic Modeling (Topic Modeling을 이용한 Twitter상에서 스모그 리스크에 관한 대중 인식 분류 연구)

  • Kim, Yun-Ki
    • Journal of Cadastre & Land InformatiX
    • /
    • v.47 no.1
    • /
    • pp.53-79
    • /
    • 2017
  • The main purpose of this study was to detect and classify public perceptions toward smog disasters on Twitter using topic modeling. To help achieve this objective and to identify gaps in the literature, this research carried out a literature review on public opinions toward smog disasters and on topic modeling. The literature review indicated that there are huge gaps in the related literature. The author formulated five research questions to fill these gaps and then performed research steps such as data extraction, word cloud analysis on the cleaned data, building the network of terms, correlation analysis, hierarchical cluster analysis, topic modeling with LDA, and stream graphs to answer those research questions. The results revealed huge differences between New York and London in the most frequent terms, the shapes of the term networks, the types of correlation, and the patterns of change in smog-related topics. The author therefore found positive answers to four of the five research questions and a partially positive answer to Research Question 4. Finally, on the basis of the results, the author suggested policy implications and recommendations for future study.

Research Trend Analysis for Smart Grids Using Dynamic Topic Modeling (동적 토픽분석을 활용한 스마트그리드 연구동향 분석)

  • Na, Sang-Tae;Ahn, Joo-Eon;Jung, Min-Ho;Kim, Ja-Hee
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.66 no.4
    • /
    • pp.613-620
    • /
    • 2017
  • The power grid has been changing into a smart grid system that combines existing power technology with ICT to meet growing needs regarding power grid complexity, demand, reliability, security, and efficiency. This study analyzes research trends in smart grid technology in the period since the introduction of the smart grid system and compares them with industrial trends to grasp the progress and characteristics of smart grid technology and look for ways to innovate the technology. To do this, we analyze the research trends using dynamic topic modeling, which is capable of time-series analysis of research topics. Next, we compare the research trends with industrial trends analyzed by Gartner's experts to demonstrate that smart grid research is evolving to the level of industrialization. Because the results of this study are a quantitative analysis obtained through data mining, they provide more objective analysis results and are expected to be used in many fields, for example by companies that want to participate in the industry and by government agencies that need to establish policies.

An Exploratory Study of Health Inequality Discourse Using Korean Newspaper Articles: A Topic Modeling Approach

  • Kim, Jin-Hwan
    • Journal of Preventive Medicine and Public Health
    • /
    • v.52 no.6
    • /
    • pp.384-392
    • /
    • 2019
  • Objectives: This study aimed to explore the health inequality discourse in the Korean press by analyzing newspaper articles using a relatively new content analysis technique. Methods: This study used the search term "health inequality" to collect articles containing that term that were published between 2000 and 2018. The collected articles went through pre-processing and topic modeling, and the contents and temporal trends of the extracted topics were analyzed. Results: A total of 1038 articles were identified, and 5 topics were extracted. As the number of studies on health inequality has increased over the past 2 decades, so too has the number of news articles regarding health inequality. The extracted topics were public health policies, social inequalities in health, inequality as a social problem, healthcare policies, and regional health gaps. The total number of occurrences of each topic increased every year, and the trend observed for each theme was influenced by events related to its contents, such as elections. Finally, the frequency of appearance of each topic differed depending on the type of news source. Conclusions: The results of this study can be used as preliminary data for future attempts to address health inequality in Korea. To make addressing health inequality part of the public agenda, the media's perspective and discourse regarding health inequality should be monitored to facilitate further strategic action.

Topic Modeling Analysis of Beauty Industry using BERTopic and LDA

  • YANG, Hoe-Chang;LEE, Won-Dong
    • The Journal of Economics, Marketing and Management
    • /
    • v.10 no.6
    • /
    • pp.1-7
    • /
    • 2022
  • Purpose: The purpose of this study is to identify the research trends of degree papers related to the beauty industry and to provide information that can contribute to the development of the domestic beauty industry and to the direction of research on the beauty industry. Research design, data and methodology: This study used 154 academic papers and 189 academic papers with English abstracts out of 299 academic papers. All of these papers were found by searching for the keyword "beauty industry" in ScienceON on August 15, 2022. For the analysis, BERTopic and LDA (Latent Dirichlet Allocation) analyses were conducted using Python 3.7. In addition, OLS regression analysis was conducted to understand the annual increasing or decreasing trend of each topic derived in the trend analysis. Results: The word frequency analysis showed that satisfaction, management, behavior, and service occurred frequently. In addition, the word co-occurrence frequency analysis showed that 'service', 'satisfaction' and 'customer' were frequently associated with program and relationship. The topic modeling yielded six topics: 'Beauty shop', 'Health education', 'Cosmetics', 'Customer satisfaction', 'Beauty education', and 'Beauty business'. The trend analysis of each topic confirmed that 'Beauty education' and 'Health education' are receiving more attention over time. Conclusions: Future studies must resolve the extreme polarization between the structure of the small beauty industry and beauty stores. Furthermore, research should point to various ways to improve the performance of internal personnel. Ways to maximize product capabilities, such as competitive cosmetics and brands, also need attention.
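
The OLS trend step described in the abstract can be sketched as a simple regression of a topic's yearly document count on the year, where a positive slope suggests a growing topic and a negative slope a declining one. This is a minimal sketch in plain Python with made-up counts. The function name `ols_slope` is hypothetical, and the sketch omits the significance testing a full analysis would include.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b * x fitted to paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical yearly paper counts for two topics.
years = [2018, 2019, 2020, 2021, 2022]
rising_topic = [2, 3, 5, 6, 9]
falling_topic = [8, 7, 5, 4, 1]

for name, counts in [("rising", rising_topic), ("falling", falling_topic)]:
    slope = ols_slope(years, counts)
    print(name, "increasing" if slope > 0 else "decreasing")
```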