• 제목/요약/키워드: topic modeling analysis

검색결과 663건 처리시간 0.026초

귀납적 사회과학연구 방법론을 위한 토픽모델링의 확장 및 사례분석 (Extension and Case Analysis of Topic Modeling for Inductive Social Science Research Methodology)

  • 김근형
    • 한국정보시스템학회지:정보시스템연구
    • /
    • 제31권4호
    • /
    • pp.25-45
    • /
    • 2022
  • Purpose In this paper, we propose the method to extend topic modeling techniques in order to derive data-based research hypotheses when establishing research hypotheses for social sciences, As a concept in contrast to the existing deductive hypothesis establishment methodology for the social science research, the topic modeling technique was expanded to enable the so-called inductive hypothesis establishment methodology, and an analysis case of the Seongsan Ilchulbong online review based on the proposed methodology was presented. Design/methodology/approach In this paper, an extension architecture and extension algorithm in the form of extending the existing topic modeling were proposed. The extended architecture and algorithm include data processing method based on topic ratio in document, correlation analysis and regression analysis of processed data for topics derived by existing topic modeling. In addition, in this paper, an analysis case of the online review of Seongsan Ilchulbong Peak was presented by applying the extended topic modeling algorithm. An exploratory analysis was performed on the Seongsan Ilchulbong online reviews through the basic text analysis. The data was transformed into 5-point scale to enable correlation and regression analysis based on the topic ratio in each online review. A regression analysis was performed using the derived topics as the independent variable and the review rating as the dependent variable, and hypotheses could be derived based on this, which enable the so-called inductive hypothesis establishment. Findings This paper is meaningful in that it confirmed the possibility of deriving a causal model and setting an inductive hypothesis through an extended analysis of topic modeling.

Research trends in the Korean Journal of Women Health Nursing from 2011 to 2021: a quantitative content analysis

  • Ju-Hee Nho;Sookkyoung Park
    • 여성건강간호학회지
    • /
    • 제29권2호
    • /
    • pp.128-136
    • /
    • 2023
  • Purpose: Topic modeling is a text mining technique that extracts concepts from textual data and uncovers semantic structures and potential knowledge frameworks within context. This study aimed to identify major keywords and network structures for each major topic to discern research trends in women's health nursing published in the Korean Journal of Women Health Nursing (KJWHN) using text network analysis and topic modeling. Methods: The study targeted papers with English abstracts among 373 articles published in KJWHN from January 2011 to December 2021. Text network analysis and topic modeling were employed, and the analysis consisted of five steps: (1) data collection, (2) word extraction and refinement, (3) extraction of keywords and creation of networks, (4) network centrality analysis and key topic selection, and (5) topic modeling. Results: Six major keywords, each corresponding to a topic, were extracted through topic modeling analysis: "gynecologic neoplasms," "menopausal health," "health behavior," "infertility," "women's health in transition," and "nursing education for women." Conclusion: The latent topics from the target studies primarily focused on the health of women across all age groups. Research related to women's health is evolving with changing times and warrants further progress in the future. Future research on women's health nursing should explore various topics that reflect changes in social trends, and research methods should be diversified accordingly.

독후감 텍스트의 토픽모델링 적용에 관한 탐색적 연구 (A Study on the Application of Topic Modeling for the Book Report Text)

  • 이수상
    • 한국도서관정보학회지
    • /
    • 제47권4호
    • /
    • pp.1-18
    • /
    • 2016
  • 이 연구는 독후감 텍스트의 주제분석에 토픽모델링의 활용방안을 탐색하는 것을 목적으로 하고 있다. 텍스트의 주제분석 방안으로서 토픽모델링 분석방법을 이해하고, R에서 제공하는 "topicmodels" 패키지의 LDA 함수를 사용하여 23건의 사례 독후감 텍스트들을 대상으로 실제의 분석작업을 수행하였다 토픽모델링 분석결과 16개의 토픽들을 추출하였고 토픽과 구성 단어들의 관계에서 토픽 네트워크 사례 독후감과 토픽들의 관계에서 독후감 네트워크를 구성하였다. 이후 토픽 네트워크와 독후감 네트워크를 대상으로 중심성 분석을 수행하였으며 분석결과는 다음과 같다. 첫째 16개의 토픽들이 1개의 컴포넌트를 가지는 네트워크로 나타났다. 이것은 16개 토픽들이 상호 연관되어 있다는 것을 의미한다. 둘째, 독후감 네트워크에서는 연결정도 중심성이 높은 독후감들과 낮은 독후감들로 구분이 되었다. 전자의 독후감들은 다른 독후감들과 주제적으로 유사성을 가지며 후자의 독후감들은 다른 독후감들과 주제적으로 상이성을 가지는 것으로 해석하였다. 토픽모델링의 결과를 네트워크 분석과 결합함으로써 독후감의 주제파악에 유용한 결과들을 얻게 되었다.

Topic Analysis of Scholarly Communication Research

  • Ji, Hyun;Cha, Mikyeong
    • Journal of Information Science Theory and Practice
    • /
    • 제9권2호
    • /
    • pp.47-65
    • /
    • 2021
  • This study aims to identify specific topics, trends, and structural characteristics of scholarly communication research, based on 1,435 articles published from 1970 to 2018 in the Scopus database through Latent Dirichlet Allocation topic modeling, serial analysis, and network analysis. Topic modeling, time series analysis, and network analysis were used to analyze specific topics, trends, and structures, respectively. The results were summarized into three sets as follows. First, the specific topics of scholarly communication research were nineteen in number, including research resource management and research data, and their research proportion is even. Second, as a result of the time series analysis, there are three upward trending topics: Topic 6: Open Access Publishing, Topic 7: Green Open Access, Topic 19: Informal Communication, and two downward trending topics: Topic 11: Researcher Network and Topic 12: Electronic Journal. Third, the network analysis results indicated that high mean profile association topics were related to the institution, and topics with high triangle betweenness centrality, such as Topic 14: Research Resource Management, shared the citation context. Also, through cluster analysis using parallel nearest neighbor clustering, six clusters connected with different concepts were identified.

Topic Modeling and Sentiment Analysis of Twitter Discussions on COVID-19 from Spatial and Temporal Perspectives

  • AlAgha, Iyad
    • Journal of Information Science Theory and Practice
    • /
    • 제9권1호
    • /
    • pp.35-53
    • /
    • 2021
  • The study reported in this paper aimed to evaluate the topics and opinions of COVID-19 discussion found on Twitter. It performed topic modeling and sentiment analysis of tweets posted during the COVID-19 outbreak, and compared these results over space and time. In addition, by covering a more recent and a longer period of the pandemic timeline, several patterns not previously reported in the literature were revealed. Author-pooled Latent Dirichlet Allocation (LDA) was used to generate twenty topics that discuss different aspects related to the pandemic. Time-series analysis of the distribution of tweets over topics was performed to explore how the discussion on each topic changed over time, and the potential reasons behind the change. In addition, spatial analysis of topics was performed by comparing the percentage of tweets in each topic among top tweeting countries. Afterward, sentiment analysis of tweets was performed at both temporal and spatial levels. Our intention was to analyze how the sentiment differs between countries and in response to certain events. The performance of the topic model was assessed by being compared with other alternative topic modeling techniques. The topic coherence was measured for the different techniques while changing the number of topics. Results showed that the pooling by author before performing LDA significantly improved the produced topic models.

Research trends in dental hygiene based on topic modeling and semantic network analysis

  • Yun-Jeong Kim;Jae-Hee Roh
    • 한국치위생학회지
    • /
    • 제22권6호
    • /
    • pp.495-502
    • /
    • 2022
  • Objectives: The purpose of this study was to analyze research trends in dental hygiene using topic modeling and semantic network analysis. Methods: A total of 261 published studies were collected 686 key words from the Research Information Sharing Service (RISS) by 2019-2021. Topic modeling and semantic network analysis were performed using Textom. Results: The most frequently and frequency-inverse document frequently key words were 'dental hygienist', 'oral health', 'elderly', 'periodontal disease', 'dental hygiene'. N-gram of key words show that 'dental hygienist-emotional labor', 'dental hygienist-elderly', 'dental hygienist-job performance', 'oral health-quality of life', 'oral health-periodontal disease' etc. were frequently. Key words with high degree centrality were 'dental hygienist (0.317)', 'oral health (0.239)', 'elderly (0.127)', 'job satisfaction (0.057)', 'dental care (0.049)'. Extracted topics were 5 by topic modeling. Conclusions: Results from the current study could be available to know research trends in dental hygiene and it is necessary to improve more detailed and qualitative analysis in follow-up study.

토픽 모델링을 이용한 아웃도어웨어 연구 동향 분석 (Analysis of outdoor-wear research trends using topic modeling)

  • 한기향;이민선
    • 복식문화연구
    • /
    • 제31권1호
    • /
    • pp.53-69
    • /
    • 2023
  • This study aims to analyze research trends regarding outdoor wear. For this purpose, the data-collection period was limited to January 2002-October 2022, and the collection consisted of titles of papers, academic names, abstracts, and publication years from the Research Information Sharing Service (RISS). Frequency analysis was conducted on 227 papers in total to check academic journals and annual trends, and LDA topic-modeling analysis was conducted using 20,964 tokens. Data pre-processing was performed prior to topic-modeling analysis; after that, topic-modeling analysis, core topic derivation, and visualization were performed using a Python algorithm. A total of eight topics were obtained from the comprehensive analysis: experiential marketing and lifestyle, property and evaluation of outdoor wear, design and patterns of outdoor wear, outdoor-wear purchase behavior, color, designs and materials of outdoor wear, promotional strategies for outdoor wear, purchase intention and satisfaction depending on the brand image of outdoor wear, differences in outdoor wear preferences by consumer group. The results of topic-modeling analysis revealed that the topic, which includes a study on the design and material of outdoor wear and the pattern of jackets related to the overall shape, was the highest at 30.9% of the total topics. The next highest topic was also the design and color of outdoor wear, indicating that design-related research was the main research topic in outdoor wear research. It is hoped that analyzing outdoor wear research will help comprehend the research conducted thus far and reveal future directions.

R&D Perspective Social Issue Packaging using Text Analysis

  • Wong, William Xiu Shun;Kim, Namgyu
    • 한국IT서비스학회지
    • /
    • 제15권3호
    • /
    • pp.71-95
    • /
    • 2016
  • In recent years, text mining has been used to extract meaningful insights from the large volume of unstructured text data sets of various domains. As one of the most representative text mining applications, topic modeling has been widely used to extract main topics in the form of a set of keywords extracted from a large collection of documents. In general, topic modeling is performed according to the weighted frequency of words in a document corpus. However, general topic modeling cannot discover the relation between documents if the documents share only a few terms, although the documents are in fact strongly related from a particular perspective. For instance, a document about "sexual offense" and another document about "silver industry for aged persons" might not be classified into the same topic because they may not share many key terms. However, these two documents can be strongly related from the R&D perspective because some technologies, such as "RF Tag," "CCTV," and "Heart Rate Sensor," are core components of both "sexual offense" and "silver industry." Thus, in this study, we attempted to discover the differences between the results of general topic modeling and R&D perspective topic modeling. Furthermore, we package social issues from the R&D perspective and present a prototype system, which provides a package of news articles for each R&D issue. Finally, we analyze the quality of R&D perspective topic modeling and provide the results of inter- and intra-topic analysis.

한국산업경영시스템학회지 연구 주제의 토픽모델링 분석 비교: 1978년~99년 논문을 중심으로 (Topic Modeling Analysis Comparison for Research Topic in Korean Society of Industrial and Systems Engineering: Concentrated on Research Papers from 1978~1999)

  • 박동준;오형술;김호균;윤민
    • 산업경영시스템학회지
    • /
    • 제44권4호
    • /
    • pp.113-127
    • /
    • 2021
  • Topic modeling has been receiving much attention in academic disciplines in recent years. Topic modeling is one of the applications in machine learning and natural language processing. It is a statistical modeling procedure to discover topics in the collection of documents. Recently, there have been many attempts to find out topics in diverse fields of academic research. Although the first Department of Industrial Engineering (I.E.) was established in Hanyang university in 1958, Korean Institute of Industrial Engineers (KIIE) which is truly the most academic society was first founded to contribute to research for I.E. and promote industrial techniques in 1974. Korean Society of Industrial and Systems Engineering (KSIE) was established four years later. However, the research topics for KSIE journal have not been deeply examined up until now. Using topic modeling algorithms, we cautiously aim to detect the research topics of KSIE journal for the first half of the society history, from 1978 to 1999. We made use of titles and abstracts in research papers to find out topics in KSIE journal by conducting four algorithms, LSA, HDP, LDA, and LDA Mallet. Topic analysis results obtained by the algorithms were compared. We tried to show the whole procedure of topic analysis in detail for further practical use in future. We employed visualization techniques by using analysis result obtained from LDA. As a result of thorough analysis of topic modeling, eight major research topics were discovered including Production/Logistics/Inventory, Reliability, Quality, Probability/Statistics, Management Engineering/Industry, Engineering Economy, Human Factor/Safety/Computer/Information Technology, and Heuristics/Optimization.

Trend Analysis of Data Mining Research Using Topic Network Analysis

  • Kim, Hyon Hee;Rhee, Hey Young
    • 한국컴퓨터정보학회논문지
    • /
    • 제21권5호
    • /
    • pp.141-148
    • /
    • 2016
  • In this paper, we propose a topic network analysis approach which integrates topic modeling and social network analysis. We collected 2,039 scientific papers from five top journals in the field of data mining published from 1996 to 2015, and analyzed them with the proposed approach. To identify topic trends, time-series analysis of topic network is performed based on 4 intervals. Our experimental results show centralization of the topic network has the highest score from 1996 to 2000, and decreases for next 5 years and increases again. For last 5 years, centralization of the degree centrality increases, while centralization of the betweenness centrality and closeness centrality decreases again. Also, clustering is identified as the most interrelated topic among other topics. Topics with the highest degree centrality evolves clustering, web applications, clustering and dimensionality reduction according to time. Our approach extracts the interrelationships of topics, which cannot be detected with conventional topic modeling approaches, and provides topical trends of data mining research fields.