• Title/Abstract/Keywords: topic modeling

Search results: 828 items (processing time: 0.032 s)

An Ontology-Based Labeling of Influential Topics Using Topic Network Analysis

  • Kim, Hyon Hee;Rhee, Hey Young
    • Journal of Information Processing Systems
    • /
    • Vol. 15, No. 5
    • /
    • pp.1096-1107
    • /
    • 2019
  • In this paper, we present an ontology-based approach to labeling influential topics in scientific articles. First, to find influential topics in scientific articles, topic modeling is performed, and then social network analysis is applied to the selected topic models. Abstracts of research papers related to data mining published over the 20 years from 1995 to 2015 are collected and analyzed in this research. Second, to interpret and explain the selected influential topics, the UniDM ontology is constructed from Wikipedia and serves as the concept hierarchy for the topic models. Our experimental results show that the subjects of data management and queries fall in the topic most interrelated with other topics, followed by recommender systems and text mining. Also, the subjects of recommender systems and context-aware systems belong to the most influential topic, and the subject of the k-nearest neighbor classifier belongs to the topic closest to other topics. The proposed framework provides a general model for interpreting topics in topic models, which plays an important role in overcoming ambiguous and arbitrary interpretation of topics in topic modeling.

토픽 모델링을 이용한 건설현장 추락재해 분석 (Falling Accidents Analysis in Construction Sites by Using Topic Modeling)

  • 류한국
    • 한국융합학회논문지
    • /
    • Vol. 10, No. 7
    • /
    • pp.175-182
    • /
    • 2019
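  • This study used topic modeling, a machine learning technique, to classify topics of falling accidents occurring at construction sites and to analyze the accident factors associated with each topic. To apply topic modeling based on latent Dirichlet allocation (LDA), the text data were preprocessed, and the model was evaluated with the perplexity score to increase its reliability. Most of the falling accidents identified in common across the topics happened to daily workers at small-scale workplaces. Most of the accidents were judged to be caused by failure to wear safety equipment, inadequate site housekeeping, and safety equipment that did not work properly because of its performance or the way it was worn. To prevent and reduce falling accidents, safety education tailored to small-scale workplaces, keeping the workplace tidy, and checking that personal safety equipment is worn properly and performs adequately were found to be important.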
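
The abstract above does not say how the perplexity score was computed. As a minimal sketch of the standard definition only, the following computes the perplexity of a fitted topic model on a few documents. The function name, the toy vocabulary, and all probabilities are hypothetical and are not taken from the paper.

```python
import math

def perplexity(docs, doc_topic, topic_word):
    """Perplexity = exp(-(sum of log p(word | doc)) / (number of tokens)).

    docs       : list of token lists
    doc_topic  : doc_topic[d][k] = p(topic k | document d)
    topic_word : topic_word[k][w] = p(word w | topic k)
    Assumes every token has non-zero probability (e.g. a smoothed model).
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for word in doc:
            # p(w | d) is a mixture over topics: sum_k p(k | d) * p(w | k)
            p = sum(theta * topic_word[k].get(word, 0.0)
                    for k, theta in enumerate(doc_topic[d]))
            log_likelihood += math.log(p)
            n_tokens += 1
    return math.exp(-log_likelihood / n_tokens)

# Hypothetical two-topic model over a four-word vocabulary.
topic_word = [
    {"fall": 0.5, "ladder": 0.5},
    {"helmet": 0.5, "belt": 0.5},
]
docs = [["fall", "ladder"], ["helmet", "belt"]]

sharp = perplexity(docs, [[1.0, 0.0], [0.0, 1.0]], topic_word)
vague = perplexity(docs, [[0.5, 0.5], [0.5, 0.5]], topic_word)
print(sharp, vague)
```

Lower perplexity means the model assigns higher probability to the text, so in this toy case the model with sharply assigned topics scores better than the one that spreads each document evenly over both topics.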

Topic Modeling Analysis of Social Media Marketing using BERTopic and LDA

  • YANG, Woo-Ryeong;YANG, Hoe-Chang
    • 산경연구논집
    • /
    • Vol. 13, No. 9
    • /
    • pp.37-50
    • /
    • 2022
  • Purpose: This study explores and compares research trends in Korean and overseas academic papers on social media marketing and presents new academic perspectives on the future direction of research in Korea. Research design, data and methodology: We used English abstracts of research papers (Korean: 1,349; overseas: 5,036) for word frequency analysis, topic modeling, and trend analysis for each topic. Results: Word frequency and co-occurrence frequency analysis showed that Korean research focused on the experiential values of users, while overseas research focused on platforms and content. Topic modeling then yielded 13 topics for Korean research and 12 topics for overseas research. Trend analysis showed that Korean studies differed from overseas studies in applying marketing methods to specific industries and in their interest in the short-term performance of social media marketing. Conclusions: We found that long-term strategies for social media marketing and academic interest in the industry as a whole will be necessary in future research. Data mining techniques will also be necessary to generate more general results by quantifying various real-world phenomena. Finally, we expect that continuous and varied academic approaches to volatile social media will be effective in deriving practical implications.

슈퍼앱 리뷰 토픽모델링을 통한 서비스 강화 방안 연구 (Research on Service Enhancement Approach based on Super App Review Data using Topic Modeling)

  • 유제원;송지훈
    • 한국산업융합학회 논문집
    • /
    • Vol. 27, No. 2_2
    • /
    • pp.343-356
    • /
    • 2024
  • A super app is an application that provides a variety of services in a unified interface within a single platform. With the acceleration of digital transformation, super apps are becoming more prevalent. This study aims to suggest service enhancement measures by analyzing user review data from before and after the transition to a super app. To this end, user review data from a payment-based super app (Shinhan Play) were collected and studied via topic modeling. Moreover, a matrix for assessing the importance and usefulness of topics is introduced, which relies on the eigenvector centrality of the inter-topic network obtained through topic modeling and on the number of review recommendations. This allowed us to identify and categorize topics with high utility and impact. Prior to the transition, the factors contributing to user satisfaction included 'payment service,' 'additional service,' and 'improvement.' Following the transition, user satisfaction was associated with 'payment service' and 'integrated UX.' Conversely, dissatisfaction factors before the transition encompassed issues related to 'signup/installation,' 'payment error/response,' 'security authentication,' and 'security error.' Following the transition, user dissatisfaction arose from concerns regarding 'update/error response' and 'UX/UI.' The research results are expected to serve as a basis for establishing strategies to strengthen service competitiveness by making super app services more user-oriented.
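
The abstract above mentions the eigenvector centrality of an inter-topic network but does not describe how the network was built or scored. As a rough illustration of the measure itself, the sketch below computes eigenvector centrality by power iteration on a small undirected graph. The adjacency matrix and the function name are hypothetical, not the authors' data or code.

```python
def eigenvector_centrality(adj, max_iter=200, tol=1e-12):
    """Power iteration on (A + I) for a symmetric adjacency matrix.

    Adding the identity leaves the eigenvectors unchanged and avoids
    oscillation on bipartite graphs. Assumes a connected graph.
    Returns a vector normalized to unit Euclidean length.
    """
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        # One multiplication by (A + I): a node keeps its own score
        # and adds the scores of its neighbors.
        nxt = [x[i] + sum(adj[i][j] * x[j] for j in range(n))
               for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        nxt = [v / norm for v in nxt]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x

# Hypothetical inter-topic network with four topics:
# topic 0 is linked to every other topic, topics 1 and 2 are linked
# to each other, and topic 3 is linked only to topic 0.
adjacency = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

centrality = eigenvector_centrality(adjacency)
ranking = sorted(range(len(centrality)), key=lambda i: -centrality[i])
print(centrality, ranking)
```

A topic scores high when it is connected to other high-scoring topics, which is why topic 0 ranks first here and topic 3, attached only to topic 0, ranks last.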

LDA를 사용한 COVID-19 관련 국내 논문의 연구 토픽 분석 (Research Topic Analysis of the Domestic Papers Related to COVID-19 Using LDA)

  • 김은회;서유화
    • 한국정보전자통신기술학회논문지
    • /
    • Vol. 15, No. 5
    • /
    • pp.423-432
    • /
    • 2022
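  • This paper helps academic researchers grasp the overall research trends in COVID-19-related papers. It presents the results of an LDA topic modeling analysis of information on a total of 10,599 COVID-19-related papers, published from January 2020 to July 2022, collected from the KCI site. In addition, so that researchers can easily identify the topics in their own fields of interest, the LDA topic modeling results are analyzed by major research category, and for each topic the detailed research categories in which the most research is conducted are analyzed. Understanding how research topics trend over time is very important for grasping research trends. To that end, this paper uses time series decomposition to analyze and present the trends of the topics.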
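
The abstract above does not specify which decomposition method was used. As one simple illustration, classical decomposition estimates the trend component with a centered moving average. The sketch below applies that idea to a made-up series of monthly topic shares. The function name and the numbers are hypothetical.

```python
def moving_average_trend(series, window=3):
    """Centered moving average, the trend estimate of classical decomposition.

    `window` must be odd. Positions without a full window on both
    sides are returned as None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        trend[t] = sum(series[t - half:t + half + 1]) / window
    return trend

# Hypothetical monthly share of papers assigned to one topic.
topic_share = [0.10, 0.14, 0.12, 0.18, 0.20, 0.25]

trend = moving_average_trend(topic_share, window=3)
print(trend)
```

The smoothed values rise steadily even though the raw series dips in the third month, which is the kind of underlying direction a trend component is meant to expose.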

The Impact of Topic Distribution on Review Sentiment: A Comparative Study between South Korea and the U.S.

  • Cho, Mina;Hwang, Dugmee;Jeon, Seongmin
    • 한국벤처창업학회: Conference Proceedings
    • /
    • 한국벤처창업학회 2022 Spring Conference
    • /
    • pp.123-126
    • /
    • 2022
  • Online reviews offer valuable information to businesses by reflecting consumer experiences with their products and services. Two important aspects of online reviews are, first, the topics consumers choose to address and, second, the sentiments expressed in their reviews. Building upon previous literature that shows online reviews are context-dependent, we examine the impact of topic distribution on review sentiment in South Korea and the U.S. during pre- and post-pandemic periods. After performing topic modeling on Airbnb app review data, we measure the contribution of each topic to review sentiment using SHAP values. Our results indicate variations in topic distribution trends between 2018 and 2021. Also, the order and magnitude of topics' impact on review sentiment change between pre- and post-pandemic periods for both countries. This study can help businesses understand how the topics and sentiments associated with their products and services changed after the pandemic, and can also help them identify areas for improvement.

Impact of Topic Distribution on Review Sentiment: A Comparative Study between South Korea and the U.S.

  • Mina Cho;Dugmee Hwang;SeongMin Jeon
    • Asia pacific journal of information systems
    • /
    • Vol. 32, No. 3
    • /
    • pp.514-536
    • /
    • 2022
  • Online reviews offer valuable information to businesses by reflecting consumer experiences with their products and services. Two crucial aspects of online reviews are the topics consumers choose to address and the sentiments expressed in their reviews. Building upon previous literature that shows online reviews are context-dependent, we employ the Expectation-Confirmation Theory (ECT) to examine the impact of topic distribution on review sentiment in South Korea and the U.S. during pre- and post-pandemic periods. After applying topic modeling to Airbnb app review data, we measure the contribution of each topic to review sentiment using SHAP values. Our results indicate variations in topic distribution trends between 2018 and 2021. In addition, the order and magnitude of topics' impact on review sentiment change between pre- and post-pandemic periods for both countries. This study can help businesses understand how the topics and sentiments associated with their products and services changed after the pandemic and thus identify areas for improvement.

언어 자원과 토픽 모델의 순차 매칭을 이용한 유사 문장 계산 기반의 위키피디아 한국어-영어 병렬 말뭉치 구축 (Building a Korean-English Parallel Corpus by Measuring Sentence Similarities Using Sequential Matching of Language Resources and Topic Modeling)

  • 천주룡;고영중
    • 정보과학회 논문지
    • /
    • Vol. 42, No. 7
    • /
    • pp.901-909
    • /
    • 2015
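  • This paper studies the construction of a Korean-English parallel corpus from Wikipedia. To this end, we propose a similar-sentence computation method based on sequential matching of language resources and a topic model. First, for language-resource matching, a Wiki dictionary composed of Wikipedia titles, numbers, and the Daum online dictionary were applied sequentially to word matching. In addition, to exploit the characteristics of Wikipedia, translation probabilities estimated from the Wiki dictionary were additionally applied to word matching. Accuracy was further improved by applying word distributions extracted from the topic model to the similarity computation. In the experiments, the similar-sentence computation from prior work, which linearly combines language resources only, achieved an F1-score of 48.4%, and the combination of language resources with a topic model that considers all word distributions achieved 51.6%. In contrast, the proposed method, which adds translation probabilities to the language resources and applies sequential matching, achieved 58.3%, an improvement of 9.9 percentage points, and the method that additionally applies a topic model considering only important word distributions achieved 59.1%, an improvement of 7.5 percentage points.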

Analysis of Secondary Battery Trends Using Topic Modeling: Focusing on Solid-State Batteries

  • Chunghyun Do;Yong Jin Kim
    • Asian Journal of Innovation and Policy
    • /
    • Vol. 12, No. 3
    • /
    • pp.345-362
    • /
    • 2023
  • As electric vehicles become more widely adopted, the secondary battery market is experiencing rapid growth. However, lithium-ion batteries, which constitute the majority of secondary batteries, present high risks of fire and explosion. Solid-state batteries are thus garnering attention as next-generation batteries since they eliminate fire hazards and significantly reduce the risk of explosions. Against this background, the study aimed to analyze research trends and provide insights by examining 2,927 domestic papers related to solid-state batteries over the past decade (2013-2022). Specifically, we used topic modeling to extract major keywords associated with solid-state battery research and to explore the network characteristics across major topics. The changes in research on solid-state batteries were analyzed in depth by calculating topic dominance by year. The findings provide an overview of the emerging trends in domestic solid-state battery research and might serve as a valuable reference in shaping long-term research directions.

토픽모델링을 활용한 한국산업경영시스템학회지의 최근 연구주제 분석 (Recent Research Trend Analysis for the Journal of Society of Korea Industrial and Systems Engineering Using Topic Modeling)

  • 박동준;구평회;오형술;윤 민
    • 산업경영시스템학회지
    • /
    • Vol. 46, No. 3
    • /
    • pp.170-185
    • /
    • 2023
  • The advent of big data has brought about the need for analytics. Natural language processing (NLP), a field of big data, has received a lot of attention. Topic modeling, an NLP technique, is widely applied to identify key topics in various academic journals. The Korean Society of Industrial and Systems Engineering (KSIE) has published academic journals since 1978. To enhance its status, it is imperative to recognize the diversity of research domains. We have already discovered eight major research topics for papers published by KSIE from 1978 to 1999. As a follow-up study, we aim to identify major topics of research papers published in KSIE from 2000 to 2022. We performed topic modeling on 1,742 research papers from this period using LDA and BERTopic, which has recently attracted attention. BERTopic outperformed LDA by providing a set of coherent topic keywords that can effectively distinguish the 36 topics found in this study. In terms of visualization techniques, pyLDAvis presented better two-dimensional scatter plots for the intertopic distance map than BERTopic. However, BERTopic provided much more diverse visualization methods to explore the relevance of the 36 topics. BERTopic was also able to classify hot and cold topics by presenting 'topic over time' graphs that can identify topic trends over time.