• Title/Abstract/Keywords: Topics Modeling analysis

Search results: 441 items (processing time: 0.027 seconds)

Topic Analysis of Scholarly Communication Research

  • Ji, Hyun;Cha, Mikyeong
    • Journal of Information Science Theory and Practice / Vol. 9, No. 2 / pp.47-65 / 2021
  • This study identifies the specific topics, trends, and structural characteristics of scholarly communication research, based on 1,435 articles published from 1970 to 2018 and indexed in the Scopus database. Latent Dirichlet Allocation topic modeling, time series analysis, and network analysis were used to analyze topics, trends, and structures, respectively. The results fall into three sets. First, nineteen specific topics were identified, including research resource management and research data, and their shares of the research were roughly even. Second, the time series analysis showed three upward-trending topics (Topic 6: Open Access Publishing, Topic 7: Green Open Access, and Topic 19: Informal Communication) and two downward-trending topics (Topic 11: Researcher Network and Topic 12: Electronic Journal). Third, the network analysis indicated that topics with high mean profile association were related to the institution, and that topics with high triangle betweenness centrality, such as Topic 14: Research Resource Management, shared the citation context. In addition, cluster analysis using parallel nearest neighbor clustering identified six clusters connected with different concepts.

독후감 텍스트의 토픽모델링 적용에 관한 탐색적 연구 (A Study on the Application of Topic Modeling for the Book Report Text)

  • 이수상
    • 한국도서관정보학회지 / Vol. 47, No. 4 / pp.1-18 / 2016
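  • This study explores how topic modeling can be used for the subject analysis of book report texts. After reviewing topic modeling as a method for the subject analysis of text, the study ran an actual analysis on 23 sample book reports using the LDA function of the "topicmodels" package for R. The analysis extracted 16 topics. A topic network was built from the relations between topics and their constituent words, and a book report network was built from the relations between the sample book reports and the topics. Centrality analysis was then performed on both networks, with the following results. First, the 16 topics formed a network with a single component, which means that the 16 topics are interrelated. Second, in the book report network, the reports separated into those with high degree centrality and those with low degree centrality. The former were interpreted as topically similar to other book reports and the latter as topically dissimilar. Combining the topic modeling results with network analysis yielded findings that are useful for identifying the subjects of book reports.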

한국산업경영시스템학회지 연구 주제의 토픽모델링 분석 비교: 1978년~99년 논문을 중심으로 (Topic Modeling Analysis Comparison for Research Topic in Korean Society of Industrial and Systems Engineering: Concentrated on Research Papers from 1978~1999)

  • 박동준;오형술;김호균;윤민
    • 산업경영시스템학회지 / Vol. 44, No. 4 / pp.113-127 / 2021
  • Topic modeling has received much attention across academic disciplines in recent years. An application of machine learning and natural language processing, it is a statistical modeling procedure for discovering topics in a collection of documents, and there have recently been many attempts to identify topics in diverse fields of academic research. Although the first Department of Industrial Engineering (I.E.) was established at Hanyang University in 1958, the Korean Institute of Industrial Engineers (KIIE), the society most academic in character, was founded only in 1974 to contribute to I.E. research and promote industrial techniques. The Korean Society of Industrial and Systems Engineering (KSIE) was established four years later. However, the research topics of the KSIE journal have not been examined in depth until now. Using topic modeling algorithms, we cautiously aim to detect the research topics of the KSIE journal for the first half of the society's history, from 1978 to 1999. We used the titles and abstracts of research papers to find topics in the KSIE journal with four algorithms: LSA, HDP, LDA, and LDA Mallet. The topic analysis results obtained by the algorithms were compared. We tried to show the whole procedure of topic analysis in detail for practical use in the future, and we applied visualization techniques to the results obtained from LDA. The analysis discovered eight major research topics: Production/Logistics/Inventory, Reliability, Quality, Probability/Statistics, Management Engineering/Industry, Engineering Economy, Human Factor/Safety/Computer/Information Technology, and Heuristics/Optimization.

Topic Modeling and Sentiment Analysis of Twitter Discussions on COVID-19 from Spatial and Temporal Perspectives

  • AlAgha, Iyad
    • Journal of Information Science Theory and Practice / Vol. 9, No. 1 / pp.35-53 / 2021
  • The study reported in this paper aimed to evaluate the topics and opinions of COVID-19 discussion found on Twitter. It performed topic modeling and sentiment analysis of tweets posted during the COVID-19 outbreak, and compared these results over space and time. In addition, by covering a more recent and longer period of the pandemic timeline, it revealed several patterns not previously reported in the literature. Author-pooled Latent Dirichlet Allocation (LDA) was used to generate twenty topics that discuss different aspects of the pandemic. Time-series analysis of the distribution of tweets over topics was performed to explore how the discussion on each topic changed over time, and the potential reasons behind the change. In addition, spatial analysis of topics was performed by comparing the percentage of tweets in each topic among the top tweeting countries. Afterward, sentiment analysis of tweets was performed at both temporal and spatial levels. Our intention was to analyze how sentiment differs between countries and in response to certain events. The performance of the topic model was assessed by comparing it with alternative topic modeling techniques. Topic coherence was measured for the different techniques while changing the number of topics. Results showed that pooling tweets by author before performing LDA significantly improved the produced topic models.
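The author-pooling step mentioned in the abstract above can be sketched as follows. This is only a minimal illustration of the general idea (merging each author's short tweets into one longer pseudo-document before a topic model is fitted), not the paper's implementation. The function name `pool_by_author`, the `(author, text)` tuple layout, and the sample tweets are all hypothetical.

```python
from collections import defaultdict


def pool_by_author(tweets):
    """Merge each author's tweets into a single pseudo-document.

    `tweets` is an iterable of (author, text) pairs (a hypothetical layout).
    Returns a dict mapping each author to the space-joined text of all
    of that author's tweets, in their original order.
    """
    pooled = defaultdict(list)
    for author, text in tweets:
        pooled[author].append(text)
    return {author: " ".join(texts) for author, texts in pooled.items()}


# Made-up sample data, for illustration only.
tweets = [
    ("user_a", "vaccine rollout starts"),
    ("user_b", "lockdown extended again"),
    ("user_a", "second vaccine dose booked"),
]
pooled_docs = pool_by_author(tweets)
for author, doc in pooled_docs.items():
    print(author, "->", doc)
```

The pooled documents, rather than the individual tweets, would then be passed to whichever LDA implementation is in use; longer documents give the model more word co-occurrence evidence per document.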

토픽 모델링을 활용한 다문화 연구의 이슈 추적 연구 (A Study on Issue Tracking on Multi-cultural Studies Using Topic Modeling)

  • 박종도
    • 한국문헌정보학회지 / Vol. 53, No. 3 / pp.273-289 / 2019
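  • To identify research trends in multicultural studies in Korea, this paper collected domestic scholarly literature related to multiculturalism and analyzed its topics through topic modeling based on LDA (Latent Dirichlet Allocation). The central research topics in Korean multicultural research were tracked by period to observe how they changed. As a result, 'multicultural social integration' and 'multicultural education in schools' were observed as hot topics, and topics related to 'cultural identity and nationalism' were observed as cold topics.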

텍스트 마이닝 기반의 자산관리 핀테크 기업 핵심 요소 분석: 사용자 리뷰를 바탕으로 (An Analysis of Key Elements for FinTech Companies Based on Text Mining: From the User's Review)

  • 손애린;신왕수;이준기
    • 한국정보시스템학회지:정보시스템연구 / Vol. 29, No. 4 / pp.137-151 / 2020
  • Purpose: Domestic asset management fintech companies are expected to grow by leaps and bounds with the implementation of the "Data bills." In contrast to the market fever, however, academic research is insufficient. We therefore analyze user reviews of asset management fintech companies to derive the strengths and the points to be improved in the services provided so far, and to identify the key elements of such companies. Design/methodology/approach: To analyze large amounts of review text data, this study applied text mining techniques. Bank Salad and Toss, two domestic asset management application services, were selected for the study. App reviews were crawled from the online app store and preprocessed using natural language processing techniques. Topic Modeling and Aspect-Sentiment Analysis were used as analysis methods. Findings: The analysis derived the elements that asset management fintech companies should have. Topic Modeling derived 7 topics each for Bank Salad and Toss, covering function and usage as well as stability and marketing. Sentiment Analysis showed that users responded positively to function-related topics, but negatively to usage-related and stability-related topics. From these results, we extracted the key elements needed for asset management fintech companies.

텍스트마이닝을 활용한 보건의료산업학회지의 토픽 모델링 및 토픽트렌드 분석 (Analysis on Topic Trends and Topic Modeling of KSHSM Journal Papers using Text Mining)

  • 조경원;배성권;우영운
    • 보건의료산업학회지 / Vol. 11, No. 4 / pp.213-224 / 2017
  • Objectives: The purpose of this study was to analyze the representative topics and topic trends of papers in the Korean Society and Health Service Management (KSHSM) Journal. Methods: We collected the English abstracts and keywords of 516 papers published in the KSHSM Journal from 2007 to 2017. We used Python web scraping programs to collect the papers from the Korea Citation Index web site, and RStudio software for topic analysis based on the latent Dirichlet allocation algorithm. Results: Perplexity analysis indicated 9 as the best number of topics, and the resulting 9 topics for all the papers were extracted using the Gibbs sampling method. We refined the 9 topics to 5 by closely considering the meaning of each topic and analyzing the intertopic distance map. In the topic trend analysis from 2007 to 2017, we verified that 'Health Management' and 'Hospital Service' were the two representative topics; 'Hospital Service' was the prevalent topic until 2011, but the shares of the two topics became similar from 2012. Conclusions: We found that 5 was the best number of topics and that the topic trends reflected the main issues of the KSHSM Journal, such as the revision of the society's name in 2012.
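The perplexity-based choice of the number of topics mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' RStudio workflow: it assumes that a held-out log-likelihood has already been computed for each candidate number of topics, and the function names, candidate values, and log-likelihood figures are all made up for the example.

```python
import math


def perplexity(log_likelihood, n_tokens):
    """Perplexity = exp(-log-likelihood / number of tokens); lower is better."""
    return math.exp(-log_likelihood / n_tokens)


def best_topic_count(log_likelihoods, n_tokens):
    """Return the candidate topic count with the lowest perplexity.

    `log_likelihoods` maps a candidate number of topics to the held-out
    log-likelihood of a model fitted with that many topics (hypothetical input).
    """
    scores = {k: perplexity(ll, n_tokens) for k, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores


# Made-up held-out log-likelihoods for three candidate topic counts.
held_out_ll = {5: -7400.0, 10: -7100.0, 15: -7250.0}
best_k, scores = best_topic_count(held_out_ll, n_tokens=1000)
print("best number of topics:", best_k)
```

In practice the log-likelihoods would come from the topic modeling library itself, and a perplexity minimum is usually treated as a starting point that is then checked against the interpretability of the topics, as the abstract's refinement from 9 to 5 topics suggests.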

토픽모델링을 활용한 농촌연구 동향분석 (An Analysis on the Rural Research Trends using Topic Modeling)

  • 김가은;정유경;임영훈
    • 농촌계획 / Vol. 29, No. 4 / pp.81-92 / 2023
  • The purpose of this study is to identify rural research topics, differences in research topics over time, and key mediators through the analysis of academic research trends using topic modeling. This study analyzed a total of 1,183 articles published in the Journal of Rural Planning and Rural Society over a 23-year period (2000-2022). We categorized rural research into 30 topics, examined the proportion of research in each topic, and identified major changes in research topics over time. We also identified key words that mediate between research topics. The study found that, first, rural research trends can be categorized into five types (resources and utilization, area/space, people, ecosystem/environment, and tourism), with area/space being the most studied. Subtopics include rural amenities, rural disappearance/village miniaturization, and rural landscape management. Second, the research topics differed by period. In the first period (2003-2007), the main research topics were rural amenities and agricultural production-based climate vulnerability assessment. In the second period (2008-2012), the main research topics were rural extinction and village depopulation, and rural landscape management. In the third period (2013-2017), the main research topics were rural sixth industrialization and rural ecotourism. In the fourth period (2018-2022), rural development planning and rural life services (life SOC) were the main research topics. The significance of this study is that it extends the existing method of analyzing research trends and provides basic data to enhance comprehensive insights into and understanding of rural research.

빅데이터 연구동향 분석: 토픽 모델링을 중심으로 (Research Trends Analysis of Big Data: Focused on the Topic Modeling)

  • 박종순;김창식
    • 디지털산업정보학회논문지 / Vol. 15, No. 1 / pp.1-7 / 2019
  • The objective of this study is to examine the trends in big data. Research abstracts were extracted from 4,019 articles, published between 1995 and 2018, on Web of Science and were analyzed using topic modeling and time series analysis. The 20 single-term topics that appeared most frequently were as follows: model, technology, algorithm, problem, performance, network, framework, analytics, management, process, value, user, knowledge, dataset, resource, service, cloud, storage, business, and health. The 20 multi-term topics were as follows: sense technology architecture (T10), decision system (T18), classification algorithm (T03), data analytics (T17), system performance (T09), data science (T06), distribution method (T20), service dataset (T19), network communication (T05), customer & business (T16), cloud computing (T02), health care (T14), smart city (T11), patient & disease (T04), privacy & security (T08), research design (T01), social media (T12), student & education (T13), energy consumption (T07), supply chain management (T15). The time series data indicated that the 40 single-term topics and multi-term topics were hot topics. This study provides suggestions for future research.

지역신문기사 자료와 토픽모델링을 이용한 해변 관련 계절별 현안분석 (Seasonal analysis of Beach-related Issues using Local Newspaper Articles and Topic Modeling)

  • 유무상;정수연;김건후;손철
    • 지역연구 / Vol. 34, No. 4 / pp.19-34 / 2018
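  • The purpose of this study is to analyze seasonal issues using local newspaper articles from 2004 to 2017 that have 'beach' and 'bathing beach' as keywords. Topic modeling and time series regression analysis were performed using open-source programs. Topic modeling yielded 35 topics for spring, 47 for summer, 36 for fall, and 35 for winter. The themes common to all seasons were bathing beaches, festivals and events, incidents/accidents and environmental problems, tourist destinations, development and property sales, administrative policy, and weather. The time series regression analysis identified 5 rising and 2 falling topics among the 35 spring topics, 6 rising and 3 falling topics among the 47 summer topics, 4 rising and 3 falling topics among the 36 fall topics, and 3 rising and 3 falling topics among the 35 winter topics. In each season, topics that were neither rising nor falling were classified as neutral. The study concludes that, for places such as beaches whose use differs by season, season-specific topic modeling can produce more useful results for the analysis of local issues and enable a more detailed diagnosis.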
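The rising/falling/neutral classification described in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it fits an ordinary least squares line to a topic's yearly share and labels the topic by the sign of the slope, using a fixed cutoff (`threshold`) as an assumed stand-in for the statistical significance test that a time series regression would normally supply. All names and sample figures are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def classify_trend(years, shares, threshold=0.005):
    """Label a topic 'rising', 'falling', or 'neutral' from its yearly shares.

    `threshold` is an assumed cutoff on the slope (share change per year);
    a real analysis would more likely test the slope's significance.
    """
    slope = ols_slope(years, shares)
    if slope > threshold:
        return "rising"
    if slope < -threshold:
        return "falling"
    return "neutral"


# Made-up yearly topic shares, for illustration only.
years = [2014, 2015, 2016, 2017]
topic_shares = {
    "topic_1": [0.10, 0.12, 0.15, 0.17],
    "topic_2": [0.20, 0.17, 0.15, 0.12],
    "topic_3": [0.10, 0.11, 0.10, 0.11],
}
labels = {name: classify_trend(years, s) for name, s in topic_shares.items()}
print(labels)
```

For a seasonal analysis like the one described, the same classification would be run separately on the topics of each season's model.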