• Title/Summary/Keyword: Topic Time


Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

  • Jeong, Young-Seob;Jin, Sou-Young;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.81-98 / 2013
  • Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. In traditional approximation algorithms, every random variable is sampled or updated for a single predefined burn-in period; our method is instead based on the observation that the random variable nodes in a topic model converge after different periods. During the iterative approximation process, the proposed method allows each random variable node to be terminated, or deactivated, once it has converged. Therefore, compared with traditional approaches, in which every node is usually deactivated at the same time, the proposed method makes inference more efficient in terms of time and memory. We do not propose a new approximation algorithm, but rather a new process applicable to existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method and discuss the tradeoff between the efficiency of the approximation process and parameter consistency.
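As a rough illustration of the idea in the abstract above, the following Python sketch compares a scheme in which every node keeps updating until all have converged with one in which each node is deactivated as soon as it converges on its own. It is a toy under stated assumptions, not the authors' method: the fixed-point update, the per-node step-size tolerance `tol`, and the function names `run_simultaneous` and `run_with_deactivation` are all hypothetical stand-ins for a real sampler and a real convergence test.

```python
def run_simultaneous(targets, rates, tol=1e-6, max_iters=10_000):
    """Baseline: all nodes keep updating until every node has converged."""
    values = [0.0] * len(targets)
    updates = 0  # total number of node updates performed
    for _ in range(max_iters):
        steps = [r * (t - v) for r, t, v in zip(rates, targets, values)]
        values = [v + s for v, s in zip(values, steps)]
        updates += len(targets)
        if all(abs(s) < tol for s in steps):
            break
    return values, updates


def run_with_deactivation(targets, rates, tol=1e-6, max_iters=10_000):
    """Each node is deactivated individually once its own update is tiny.

    Assumption: a small step size stands in for the convergence test.
    """
    values = [0.0] * len(targets)
    active = set(range(len(targets)))
    updates = 0
    for _ in range(max_iters):
        if not active:
            break
        for i in list(active):
            step = rates[i] * (targets[i] - values[i])
            values[i] += step
            updates += 1
            if abs(step) < tol:
                active.discard(i)  # node i has converged: stop updating it
    return values, updates


if __name__ == "__main__":
    targets, rates = [1.0, 2.0, 3.0], [0.9, 0.5, 0.1]
    _, n_sim = run_simultaneous(targets, rates)
    _, n_deact = run_with_deactivation(targets, rates)
    print("updates (simultaneous):", n_sim)
    print("updates (per-node deactivation):", n_deact)
```

Because the fast-converging nodes stop early, the second function performs fewer updates in total; the tolerance controls the tradeoff between saved work and how close each node is to its converged value.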

Topic-Network based Topic Shift Detection on Twitter (트위터 데이터를 이용한 네트워크 기반 토픽 변화 추적 연구)

  • Jin, Seol A;Heo, Go Eun;Jeong, Yoo Kyung;Song, Min
    • Journal of the Korean Society for Information Management / v.30 no.1 / pp.285-302 / 2013
  • This study identified topic shifts and patterns over time by analyzing a large amount of Twitter data, which is characterized by high accessibility and brevity. First, we extracted keywords for a certain product and used them to build a topic network through co-word analysis; the network's nodes and edges allow for an intuitive understanding of the keywords associated with each topic. We conducted temporal analysis of term co-occurrence as well as topic modeling to examine the results of the network analysis. In addition, comparing topic shifts on Twitter with the corresponding retrieval results from newspapers confirms that Twitter responds immediately to news media and spreads negative issues quickly. Our findings suggest that companies can use the proposed technique to identify the public's negative opinions as quickly as possible and apply them to timely decision making and effective responses to their customers.

Recent Research Trend Analysis for the Journal of Society of Korea Industrial and Systems Engineering Using Topic Modeling (토픽모델링을 활용한 한국산업경영시스템학회지의 최근 연구주제 분석)

  • Dong Joon Park;Pyung Hoi Koo;Hyung Sool Oh;Min Yoon
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.170-185 / 2023
  • The advent of big data has brought about the need for analytics. Natural language processing (NLP), a field of big data, has received a lot of attention. Topic modeling, an NLP technique, is widely applied to identify key topics in various academic journals. The Korean Society of Industrial and Systems Engineering (KSIE) has published academic journals since 1978. To enhance its status, it is imperative to recognize the diversity of its research domains. We have already discovered eight major research topics for papers published by KSIE from 1978 to 1999. As a follow-up study, we aim to identify the major topics of research papers published in KSIE from 2000 to 2022. We performed topic modeling on 1,742 research papers from this period using LDA and BERTopic, which has recently attracted attention. BERTopic outperformed LDA by providing sets of coherent topic keywords that effectively distinguish the 36 topics found in this study. In terms of visualization techniques, pyLDAvis presented better two-dimensional scatter plots for the intertopic distance map than BERTopic. However, BERTopic provided much more diverse visualization methods for exploring the relevance of the 36 topics. BERTopic was also able to classify hot and cold topics by presenting 'topic over time' graphs that identify topic trends over time.

Research of Patent Technology Trends in Textile Materials: Text Mining Methodology Using DETM & STM (섬유소재 분야 특허 기술 동향 분석: DETM & STM 텍스트마이닝 방법론 활용)

  • Lee, Hyun Sang;Jo, Bo Geun;Oh, Se Hwan;Ha, Sung Ho
    • The Journal of Information Systems / v.30 no.3 / pp.201-216 / 2021
  • Purpose: The purpose of this study is to analyze trends in patent technology for textile materials using a text mining methodology based on the Dynamic Embedded Topic Model (DETM) and the Structural Topic Model (STM). By identifying technology trends, this study is expected to have a positive impact on revitalizing and developing the textile materials industry. Design/methodology/approach: The data used in this study are 866 domestic patent texts on textile materials from 1974 to 2020. To analyze technology trends from various aspects, the Dynamic Embedded Topic Model and the Structural Topic Model were used. The word embedding technique used in DETM is GloVe. For stable training of the topic model, amortized variational inference was performed based on a recurrent neural network. Findings: The analysis found that 'manufacture' topics had the largest share among the six topics. Keyword trend analysis found that natural and nanotechnology have recently been attracting attention. The metadata analysis showed that manufacture technologies had a high probability of patent registration over the entire time series, but the results for recent years showed an increasing trend for elasticity and safety technologies.

Method of Extracting the Topic Sentence Considering Sentence Importance based on ELMo Embedding (ELMo 임베딩 기반 문장 중요도를 고려한 중심 문장 추출 방법)

  • Kim, Eun Hee;Lim, Myung Jin;Shin, Ju Hyun
    • Smart Media Journal / v.10 no.1 / pp.39-46 / 2021
  • This study concerns a method of extracting a summary from a news article by considering the importance of each sentence in the article. We propose a method of calculating sentence importance that uses the topic sentence probability, the similarity with the article title and with other sentences, and the sentence position as features that affect sentence importance. We hypothesize that a topic sentence has characteristics distinct from those of ordinary sentences, and we train a deep learning-based classification model to obtain a topic sentence probability for each input sentence. In addition, using the pre-trained ELMo language model, the similarity between sentences is calculated from sentence vectors that reflect context information and is extracted as a sentence feature. The topic sentence classification of the LSTM and BERT models achieved 93% accuracy, 96.22% recall, and 89.5% precision. When the importance of each sentence was calculated by combining the extracted sentence features, topic sentence extraction performance improved by about 10% compared with the existing TextRank algorithm.

Semantic Visualization of Dynamic Topic Modeling (다이내믹 토픽 모델링의 의미적 시각화 방법론)

  • Yeon, Jinwook;Boo, Hyunkyung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.131-154 / 2022
  • Recently, research on unstructured data analysis has been actively conducted with the development of information and communication technology. In particular, topic modeling is a representative technique for discovering core topics from massive text data. In the early stages of topic modeling, most studies focused only on topic discovery. As the field matured, studies on how topics change over time began to be carried out. Accordingly, interest in dynamic topic modeling, which handles changes in the keywords constituting a topic, is also increasing. Dynamic topic modeling identifies major topics from the data of the initial period and tracks the change and flow of topics by using the topic information of the previous period to derive topics in subsequent periods. However, the results of dynamic topic modeling are very difficult to understand and interpret. The results of traditional dynamic topic modeling simply reveal changes in keywords and their rankings, which is insufficient to represent how the meaning of a topic has changed. Therefore, in this study, we propose a method to visualize topics by period by reflecting the meaning of the keywords in each topic. In addition, we propose a method for intuitively interpreting changes in topics and the relationships among topics. The method of visualizing topics by period is as follows. In the first step, dynamic topic modeling is performed to derive the top keywords of each period and their weights from the text data. In the second step, we derive vectors for the top keywords of each topic from a pre-trained word embedding model, perform dimension reduction on the extracted vectors, and then form a semantic vector for each topic as the weighted sum of its keyword vectors, using the topic weight of each keyword. In the third step, we visualize the semantic vector of each topic using matplotlib and analyze the relationships among the topics based on the visualized result. Changes in a topic can be interpreted as follows. From the result of dynamic topic modeling, we identify the top 5 rising keywords and the top 5 descending keywords for each period to show how the topic changes. Many existing topic visualization studies visualize the keywords of each topic, whereas our approach attempts to visualize each topic itself. To evaluate the practical applicability of the proposed methodology, we performed an experiment on 1,847 abstracts of artificial intelligence-related papers, divided into three periods (2016-2017, 2018-2019, 2020-2021). We selected seven topics based on the consistency score and used a pre-trained Word2vec word embedding model trained on Wikipedia. Based on the proposed methodology, we generated a semantic vector for each topic and, by reflecting the meaning of keywords, visualized and interpreted the topics by period. Through these experiments, we confirmed that the rise and fall of the topic weight of a keyword can be used to interpret the semantic change of the corresponding topic and to grasp the relationships among topics. In this study, to overcome the limitations of dynamic topic modeling results, we used word embedding and dimension reduction techniques to visualize topics by period. The results are meaningful in that they broaden the scope of topic understanding through the visualization of dynamic topic modeling results. In addition, the study makes an academic contribution in that it lays the foundation for follow-up studies that use various word embeddings and dimensionality reduction techniques to improve the performance of the proposed methodology.
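The second step described in the abstract above, forming a topic's semantic vector as a weighted sum of its keyword vectors, might look roughly like the following Python sketch. It is only an illustration: the function name `topic_semantic_vector`, the example keywords, their weights, and the 2-D vectors (standing in for dimension-reduced word embeddings) are all hypothetical and are not taken from the paper.

```python
def topic_semantic_vector(keyword_weights, keyword_vectors):
    """Return the weighted sum of keyword vectors for one topic.

    keyword_weights: dict mapping keyword -> topic weight of that keyword
    keyword_vectors: dict mapping keyword -> reduced embedding (list of floats)
    """
    dim = len(next(iter(keyword_vectors.values())))
    semantic = [0.0] * dim
    for word, weight in keyword_weights.items():
        for d, component in enumerate(keyword_vectors[word]):
            semantic[d] += weight * component
    return semantic


if __name__ == "__main__":
    # Hypothetical top keywords of one topic in one period.
    weights = {"network": 0.5, "learning": 0.25}
    # Hypothetical 2-D vectors after dimension reduction.
    vectors = {"network": [2.0, 0.0], "learning": [0.0, 4.0]}
    print(topic_semantic_vector(weights, vectors))
```

Computing one such vector per topic and period gives points that can then be drawn on a 2-D scatter plot, which is the role the abstract assigns to matplotlib in the third step.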

Performance Analysis of TNS System for Improving DDS Discovery (DDS 검색 방식 개선을 위한 TNS 시스템 성능 분석)

  • Yoon, Gunjae;Choi, Jeonghyun;Choi, Hoon
    • The Journal of Korean Institute of Next Generation Computing / v.14 no.6 / pp.75-86 / 2018
  • The DDS (Data Distribution Service) specification defines a discovery method for finding participants and endpoints in a DDS network. The standard discovery mechanism uses multicast and finds all the endpoints in the network. Because it relies on multicast, discovery may fail in a network with multiple segments. Another problem is that memory is wasted storing information about all the endpoints. The Topic Name Service (TNS) solves these problems by unicasting only the endpoints that are required for communication. However, extra delay is inevitable in the components of TNS, i.e., a front-end server, topic name servers, and a terminal server. In this paper, we analyze the performance of TNS. We measure the delay times in the TNS servers and the time required to receive endpoint information. The time to finish discovery and the number of endpoints received are compared with those of the standard discovery method.

Macroscopic and microscopic mass transfer in silicon Czochralski method

  • Kakimoto, Koichi
    • Journal of the Korean Crystal Growth and Crystal Technology / v.9 no.4 / pp.381-383 / 1999
  • The first topic of this paper is to clarify how oxygen and heat are transferred in silicon melt under cusp-shaped magnetic fields. We obtained an asymmetric temperature distribution using a time-dependent, three-dimensional calculation. The second topic is a molecular dynamics simulation, which was carried out to estimate the diffusion constants of oxygen in silicon melt.


Study of Active Galactic Nuclei and Gravitational Wave Sources with Time-series Observation

  • Kim, Joonho;Im, Myungshin
    • The Bulletin of The Korean Astronomical Society / v.46 no.2 / pp.39.1-39.1 / 2021
  • In this presentation, we report a study of two kinds of energetic astronomical phenomena, active galactic nuclei (AGNs) and gravitational wave (GW) sources, using time-series observation. They emit large amounts of energy and play an important role in the history of the Universe. First, the intra-night variability of AGNs is studied using the Korea Microlensing Telescope Network (KMTNet). The second topic is photometric reverberation mapping, which is applied to 11 AGNs in medium bands with the Lee Sang Gak Telescope. Last, three gravitational wave events were followed up with various optical telescopes. Each topic will be addressed in detail in the presentation.


Research Trends on Doctor's Job Competencies in Korea Using Text Network Analysis (텍스트네트워크 분석을 활용한 국내 의사 직무역량 연구동향 분석)

  • Kim, Young Jon;Lee, Jea Woog;Yune, So Jung
    • Korean Medical Education Review / v.24 no.2 / pp.93-102 / 2022
  • We use the concept of the "doctor's role" as a guideline for developing medical education programs for medical students, residents, and doctors. Therefore, we should regularly reflect on the times and social needs to develop a clear sense of that role. The objective of the present study was to understand the knowledge structure related to doctor's job competencies in Korea. We analyzed research trends related to doctor's job competencies in Korea Citation Index journals using text network analysis, through an integrative approach focused on identifying social issues. We selected 1,354 research papers related to doctor's job competencies published from 2011 to 2020 and analyzed 2,627 words after data pre-processing with NetMiner ver. 4.2 (Cyram Inc., Seongnam, Korea). We conducted keyword centrality analysis, topic modeling, frequency analysis, and linear regression analysis using NetMiner ver. 4.2 (Cyram Inc.) and IBM SPSS ver. 23.0 (IBM Corp., Armonk, NY, USA). The results showed that words such as "family," "revision," and "rejection" appeared frequently. In topic modeling, we extracted five potential topics: "topic 1: Life and death in medical situations," "topic 2: Medical practice under the Medical Act," "topic 3: Medical malpractice and litigation," "topic 4: Medical professionalism," and "topic 5: Competency development education for medical students." Although there were no statistically significant changes in the research trends for each topic over time, it is nonetheless known that social changes could affect the demand for doctor's job competencies.