• Title/Summary/Keyword: Topic Data

Investigation of Topic Trends in Computer and Information Science by Text Mining Techniques: From the Perspective of Conferences in DBLP (텍스트 마이닝 기법을 이용한 컴퓨터공학 및 정보학 분야 연구동향 조사: DBLP의 학술회의 데이터를 중심으로)

  • Kim, Su Yeon;Song, Sung Jeon;Song, Min
    • Journal of the Korean Society for Information Management / v.32 no.1 / pp.135-152 / 2015
  • The goal of this paper is to explore the field of Computer and Information Science with text mining techniques by mining the related conference data available in DBLP (Digital Bibliography & Library Project). Although bibliometric analysis is the most prevalent approach to investigating the dynamics of a research field, we attempt to understand those dynamics by using Latent Dirichlet Allocation (LDA)-based multinomial topic modeling. For this study, we collect 236,170 documents from 353 conferences related to Computer and Information Science in DBLP, aiming to cover the field as broadly as possible. We analyze the topic modeling results together with the datasets collected over the period 2000 to 2011, including the top authors and top conferences per topic. We identify four different patterns in topic trends in the field during this period: growing (network related topics), shrinking (AI and data mining related topics), continuing (web, text mining, information retrieval, and database related topics), and fluctuating (HCI, information system, and multimedia system related topics).
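
As a rough illustration of LDA-based topic modeling, the sketch below fits LDA with collapsed Gibbs sampling on documents given as lists of word ids. The abstract does not say which inference method or hyperparameters the authors used; the function name `lda_gibbs`, the values of `alpha` and `beta`, and the toy corpus are all assumptions made for this example.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative; hyperparameters are assumed)."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample from the conditional topic distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

# Hypothetical toy corpus: word ids 0-2 and 3-5 tend to appear together.
toy_docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 2], [4, 5, 3, 5]]
doc_topic, topic_word = lda_gibbs(toy_docs, num_topics=2, vocab_size=6)
```

The per-document rows of `doc_topic` can then be aggregated by publication year to trace how each topic's share changes over time.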

Analysis of Issues Related to Artificial Intelligence Based on Topic Modeling (토픽모델링을 활용한 인공지능 관련 이슈 분석)

  • Noh, Seol-Hyun
    • Journal of Digital Convergence / v.18 no.5 / pp.75-87 / 2020
  • This study identifies the new value that can be created through the convergence of artificial intelligence technology (AIT) with all industries by deriving and thoroughly analyzing major issues related to artificial intelligence (AI). It analyzes domestic articles related to AI using a topic modeling method based on the LDA algorithm. Keywords were extracted from 3,889 articles from eleven metropolitan newspapers, eight business newspapers, and major broadcasting companies; the articles were selected by searching for the keyword "artificial intelligence". Keywords were extracted by optimizing the relevance parameter λ to improve pointwise mutual information (PMI), which measures the association among the keywords of each topic, and topic names were inferred from the keywords based on valid evidence. The extracted topics broadly reflected changes occurring across society, the economy, industry, and culture, as well as the government's support policy and vision.
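
The two measures named in this abstract can be sketched as follows, assuming that "relevance parameter λ" refers to the term-relevance score used in LDAvis-style keyword ranking, λ·log p(w|t) + (1−λ)·log(p(w|t)/p(w)). The abstract does not spell out the formula, so this reading, the function names, and the probabilities below are assumptions.

```python
import math

def relevance(p_w_given_t, p_w, lam):
    """Term relevance: lam * log p(w|t) + (1 - lam) * log(p(w|t) / p(w))."""
    return lam * math.log(p_w_given_t) + (1 - lam) * math.log(p_w_given_t / p_w)

def pmi(p_xy, p_x, p_y):
    """Pointwise mutual information of two keywords from their probabilities."""
    return math.log(p_xy / (p_x * p_y))

def rank_terms(terms, lam):
    """Sort terms of one topic by relevance; terms maps word -> (p(w|t), p(w))."""
    return sorted(terms, key=lambda w: relevance(*terms[w], lam), reverse=True)

# Hypothetical probabilities for one topic: "data" is frequent everywhere,
# "robot" is rarer overall but concentrated in this topic.
topic_terms = {"data": (0.10, 0.08), "robot": (0.04, 0.005)}
top_by_frequency = rank_terms(topic_terms, lam=1.0)  # ranks by p(w|t) only
top_by_lift = rank_terms(topic_terms, lam=0.0)       # ranks by p(w|t) / p(w) only
```

Lower values of λ favor words that are distinctive to a topic, while higher values favor words that are simply frequent in it; tuning λ against a coherence measure such as PMI is one way to pick the keyword list.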

Analysis of Research Trends in Elementary Information Education in Korea using Topic Modeling (토픽 모델링을 활용한 국내 초등 정보교육 연구동향 분석)

  • Shim, Jaekwoun
    • Journal of The Korean Association of Information Education / v.25 no.2 / pp.347-354 / 2021
  • As interest in artificial intelligence education for elementary school students has recently increased, existing research on elementary information education needs to be analyzed from a macroscopic point of view in order to understand the current situation and to provide implications for subsequent research. This study analyzed the Journal of The Korean Association of Information Education to examine research trends in elementary information education in Korea. All papers published from the journal's first issue through 2020 were selected as data, and 11 research topics were derived through topic modeling. Topic T1 had the highest proportion, accounting for about 38%, and keywords such as education, research, analysis, elementary school, and information were derived in order of their contribution to topic T1. A regression analysis of the topics by year showed that the research trend is shifting toward computational thinking, software education, and artificial intelligence education. The significance of this study is that text data related to elementary information education is objectively clustered.
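
One simple way to read "regression analysis of the topics by year" is an ordinary least-squares fit of each topic's yearly share against the publication year, where a positive slope marks a rising topic. The abstract does not specify the regression model, so this is only an assumed interpretation, and the shares below are made-up numbers.

```python
def trend_slope(years, shares):
    """Least-squares slope of a topic's share regressed on publication year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(shares) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, shares))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx

# Hypothetical yearly shares of one topic (not data from the paper).
years = [2016, 2017, 2018, 2019, 2020]
rising_topic = [0.02, 0.04, 0.06, 0.08, 0.10]
falling_topic = [0.10, 0.09, 0.07, 0.06, 0.05]

rising_slope = trend_slope(years, rising_topic)
falling_slope = trend_slope(years, falling_topic)
```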

Analysis of Research Topics in Archival Studies: Focusing on Academic Papers in Archival Science, Library and Information Science, and History from 2002 to 2023 (국내 기록분야 연구주제 분석: 2002~2023년간 기록관리학, 문헌정보학, 역사학 학술논문을 중심으로)

  • SeonWook Kim
    • Journal of Korean Society of Archives and Records Management / v.23 no.4 / pp.91-111 / 2023
  • This study aims to analyze research topics within the domain of archival studies by examining bibliographic information from academic papers in archival science, library and information science, and history. After collecting 1,173 academic papers, network analysis was performed on author keyword data, topic modeling was conducted on abstract data, and the results were organized over time. The network analysis of author keywords confirmed that the research topic network changed actively in response to changes in major laws and policies. Moreover, topic modeling of the abstracts showed that the subjects of the full set of papers fell into three topics: "Records Management," "Archiving," and "National Records Policy." Notably, "Records Management" and "National Records Policy" were relatively dominant from 2002 to 2009, but since 2009 the topics have grown in a quantitatively balanced way, peaking in 2019.

Topic Modeling of Suicide Papers using Text Mining (텍스트마이닝을 활용한 자살 관련 논문 토픽 모델링)

  • Cho, Kyoung Won;Kim, Ha-young;Kim, Mi-ri;Woo, Young Woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.275-277 / 2019
  • The purpose of this study is to classify the topics of the suicide-related papers published so far and to identify the proportions of the main topics and their trends over the past 20 years. For this purpose, a text mining technique used in big data analysis was applied to the database of the Korea Citation Index (KCI), where information about papers is shared most actively. By capturing how suicide-related research has changed over time, this study will serve as basic data for setting the future academic direction of suicide research.

An Analysis of Key Elements for FinTech Companies Based on Text Mining: From the User's Review (텍스트 마이닝 기반의 자산관리 핀테크 기업 핵심 요소 분석: 사용자 리뷰를 바탕으로)

  • Son, Aelin;Shin, Wangsoo;Lee, Zoonky
    • The Journal of Information Systems / v.29 no.4 / pp.137-151 / 2020
  • Purpose: Domestic asset management fintech companies are expected to grow by leaps and bounds with the implementation of the "Data bills." Despite this market enthusiasm, however, academic research remains insufficient. We therefore analyze user reviews of asset management fintech companies to derive the strengths of the services provided so far and the points that need improvement, and to identify the key elements of such companies. Design/methodology/approach: To analyze large amounts of review text data, this study applied text mining techniques. Bank Salad and Toss, two domestic asset management application services, were selected for the study. App reviews were crawled from the online app store and preprocessed using natural language processing techniques. Topic Modeling and Aspect-Sentiment Analysis were used as analysis methods. Findings: The analysis derived the elements that asset management fintech companies should have. Topic Modeling produced 7 topics each for Bank Salad and Toss, covering function and usage as well as stability and marketing. Sentiment Analysis showed that users responded positively to function-related topics but negatively to usage-related and stability-related topics. From these results, we extracted the key elements needed for asset management fintech companies.

A Big Data Analysis on Research Keywords, Centrality, and Topics of International Trade using the Text Mining and Social Network (텍스트 마이닝과 소셜 네트워크 기법을 활용한 국제무역 키워드, 중심성과 토픽에 대한 빅데이터 분석)

  • Chae-Deug Yi
    • Korea Trade Review / v.47 no.4 / pp.137-159 / 2022
  • This study aims to analyze international trade papers published in Korea from 2002 to 2022 in order to understand the main subjects and direction of research in Korea's international trade field. As its research methodology, the study uses big data analysis such as text mining and Social Network Analysis, including frequency analysis, several centrality analyses, and topic analysis. The empirical results show that the most frequent keywords are trade, export, tariff, market, industry, and the performance of firm. Over time, however, logistics, e-business, value and chain, and innovation have tended to appear more often. The degree and closeness centrality analyses show that the more frequent keywords also rank higher in degree and closeness centrality. In contrast, the ordering by eigenvector centrality differs from the ordering by degree and closeness centrality. Unlike the frequency analysis, the ego network shows that density is highest for business, sale, exchange, and integration, in that order. The topic analysis shows that export, trade, tariff, logistics, innovation, industry, value, and chain have high probabilities of being included in several topics.
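
Degree and closeness centrality on a keyword network can be sketched as below. The graph, the function names, and the normalization (degree divided by n−1, closeness as reachable nodes over summed shortest-path lengths) are assumptions for illustration; the abstract does not say which software or normalization was used, and the sketch assumes an undirected, connected network.

```python
from collections import deque

def degree_centrality(graph):
    """Share of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}

def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path lengths (via BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

# Hypothetical keyword co-occurrence network (undirected adjacency sets).
keyword_graph = {
    "trade": {"export", "tariff", "market"},
    "export": {"trade", "tariff"},
    "tariff": {"trade", "export"},
    "market": {"trade"},
}
degree = degree_centrality(keyword_graph)
closeness = closeness_centrality(keyword_graph)
```

In this toy network "trade" scores highest on both measures, which mirrors the abstract's observation that frequent keywords also tend to be central.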

A Topic Analysis of Abstracts in Journal of Korean Data Analysis Society (한국자료분석학회지에 대한 토픽분석)

  • Kang, Changwan;Kim, Kyu Kon;Choi, Seungbae
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2907-2915 / 2018
  • The Journal of the Korean Data Analysis Society, founded in 1998, has played the role of a major applied journal. In this study, we examined the scope of the journal by analyzing 10 years of abstracts. Abstract data was crawled from the online journal site (kdas.jems.or.kr) and analyzed with a topic model. As a result, we found 18 topics in 2680 abstracts covering a range of content, for example nursing, marketing, economics, regression, factor analysis, data mining, and statistical inference. Topic1 (regression) was the most frequent, with 460 documents, which confirms the usefulness of regression in applied science. We confirmed 10 significant association rules using Fisher's exact test. To explore the trend of topics, we also conducted the topic analysis separately for two periods, 2006-2011 and 2012-2016. We found that control studies became more frequent than survey studies over time, and that regression and factor analysis were frequent regardless of period.
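
To show how Fisher's exact test can check whether two topics co-occur more often than chance, here is a minimal two-sided implementation for a 2x2 table. The table layout (documents with or without each topic) and the counts are assumptions; the abstract does not describe how the association rules were tabulated.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    # Sum all tables that are at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical counts: rows = topic A present/absent, columns = topic B present/absent.
p_weak = fisher_exact_two_sided(3, 1, 1, 3)
p_strong = fisher_exact_two_sided(4, 0, 0, 4)
```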

Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

  • Jeong, Young-Seob;Jin, Sou-Young;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.81-98 / 2013
  • Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. In traditional approximation algorithms, every random variable is sampled or obtained under a single predefined burn-in period; our method is based on the observation that the random variable nodes in a topic model all have different periods of convergence. During the iterative approximation process, the proposed method allows each random variable node to be terminated, or deactivated, once it has converged. Therefore, compared to traditional approaches in which every node is usually deactivated at the same time, the proposed method achieves inference efficiency in terms of time and memory. We do not propose a new approximation algorithm, but a new process applicable to existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method and discuss the tradeoff between the efficiency of the approximation process and parameter consistency.
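
The core idea, deactivating each variable on its own schedule instead of stopping all of them together, can be illustrated with a toy example. This is not the authors' algorithm: it swaps the sampler for a deterministic fixed-point update, and the convergence test, `rates`, and `targets` are all hypothetical choices made for this sketch.

```python
def iterate_with_deactivation(rates, targets, tol=1e-6, max_iters=1000):
    """Toy iteration in which each variable is frozen as soon as it converges."""
    values = [0.0] * len(targets)
    active = set(range(len(targets)))
    stopped_at = {}  # variable index -> iteration at which it was deactivated
    updates = 0      # total number of per-variable updates performed

    for iteration in range(1, max_iters + 1):
        if not active:
            break
        for i in list(active):
            new_value = values[i] + rates[i] * (targets[i] - values[i])
            updates += 1
            # Deactivate this variable once its update becomes negligible.
            if abs(new_value - values[i]) < tol:
                active.discard(i)
                stopped_at[i] = iteration
            values[i] = new_value
    return values, stopped_at, updates

# Three variables that converge at very different speeds (assumed rates).
values, stopped_at, updates = iterate_with_deactivation(
    rates=[0.9, 0.5, 0.1], targets=[1.0, 1.0, 1.0]
)
# Cost if every variable had been updated until the slowest one converged.
simultaneous_updates = len(values) * max(stopped_at.values())
```

Comparing `updates` with `simultaneous_updates` shows the saving from stopping fast-converging variables early; the tradeoff noted in the abstract is that a variable frozen early no longer reacts to later changes in the others.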

Topic-Network based Topic Shift Detection on Twitter (트위터 데이터를 이용한 네트워크 기반 토픽 변화 추적 연구)

  • Jin, Seol A;Heo, Go Eun;Jeong, Yoo Kyung;Song, Min
    • Journal of the Korean Society for Information Management / v.30 no.1 / pp.285-302 / 2013
  • This study identified topic shifts and patterns over time by analyzing a large volume of Twitter data, which is characterized by high accessibility and brevity. First, we extracted keywords for a certain product and used them to represent a topic network, which allows intuitive understanding of the keywords associated with each topic through nodes and edges built by co-word analysis. We conducted temporal analysis of term co-occurrence as well as topic modeling to examine the results of the network analysis. In addition, comparing topic shifts on Twitter with the corresponding retrieval results from newspapers confirms that Twitter responds immediately to news media and spreads negative issues quickly. Our findings suggest that companies can use the proposed technique to identify the public's negative opinions as quickly as possible and apply it to timely decision making and effective responses to their customers.
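
Co-word analysis of the kind described here can be sketched by counting how often two keywords appear in the same tweet; each pair becomes a weighted edge of the topic network. The tokenized tweets and the function name `coword_edges` are hypothetical, and the paper's actual preprocessing and weighting are not given in the abstract.

```python
from collections import Counter
from itertools import combinations

def coword_edges(documents):
    """Count, for each keyword pair, the number of documents containing both."""
    edges = Counter()
    for tokens in documents:
        # Each unordered pair is counted once per document, in sorted order.
        for first, second in combinations(sorted(set(tokens)), 2):
            edges[(first, second)] += 1
    return edges

# Hypothetical tokenized tweets about a product.
tweets = [
    ["battery", "recall", "phone"],
    ["battery", "phone"],
    ["recall", "refund"],
]
edges = coword_edges(tweets)
strongest_pair, strongest_weight = edges.most_common(1)[0]
```

Running the same count separately for each time window gives one network per period, which is one way to compare how keyword associations shift over time.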