• Title/Summary/Keyword: Topic Evaluation


Review of Wind Energy Publications in Korea Citation Index using Latent Dirichlet Allocation (잠재디리클레할당을 이용한 한국학술지인용색인의 풍력에너지 문헌검토)

  • Kim, Hyun-Goo;Lee, Jehyun;Oh, Myeongchan
    • New & Renewable Energy / v.16 no.4 / pp.33-40 / 2020
  • The research topics of more than 1,900 wind energy papers registered in the Korea Citation Index (KCI) were modeled into 25 topics using latent Dirichlet allocation (LDA), and their consistency was cross-validated through principal component analysis (PCA) of the document-word matrix. Key research topics in the wind energy field were identified as "offshore, wind farm," "blade, design," "generator, voltage, control," "dynamic, load, noise," and "performance test." As a new method to determine the similarity between research topics in journals, a systematic evaluation method was proposed to analyze the correlation between topics by constructing a journal-topic matrix (JTM) and clustering journals based on their topic similarity. By evaluating 24 journals that published more than 20 wind energy papers, it was confirmed that they were classified into meaningful clusters of mechanical engineering, electrical engineering, marine engineering, and renewable energy. It is expected that the proposed systematic method can be applied to the evaluation of the specificity of subsequent journals.
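The journal-topic matrix idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the matrix is weighted or which similarity measure is used, so the row normalization, the cosine similarity, the function names, and the toy data below are all assumptions for illustration, not the authors' method.

```python
from collections import Counter, defaultdict
from math import sqrt


def journal_topic_matrix(papers, n_topics):
    """Build row-normalized journal-topic vectors from (journal, topic) pairs.

    Assumption: each paper contributes one count to its dominant topic.
    """
    counts = defaultdict(Counter)
    for journal, topic in papers:
        counts[journal][topic] += 1
    matrix = {}
    for journal, topic_counts in counts.items():
        total = sum(topic_counts.values())
        matrix[journal] = [topic_counts[t] / total for t in range(n_topics)]
    return matrix


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy data: (journal name, dominant topic index of one paper).
papers = [
    ("Journal A", 0), ("Journal A", 0), ("Journal A", 1),
    ("Journal B", 0), ("Journal B", 1),
    ("Journal C", 2), ("Journal C", 2),
]
jtm = journal_topic_matrix(papers, n_topics=3)
sim_ab = cosine(jtm["Journal A"], jtm["Journal B"])
sim_ac = cosine(jtm["Journal A"], jtm["Journal C"])
```

In this toy case Journals A and B share topics and score close to 1, while Journal C shares none with A and scores 0. A clustering step could then group journals by these pairwise similarities.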

Counting Research Publications, Citations, and Topics: A Critical Assessment of the Empirical Basis of Scientometrics and Research Evaluation

  • Wolfgang G. Stock;Gerhard Reichmann;Isabelle Dorsch;Christian Schlogl
    • Journal of Information Science Theory and Practice / v.11 no.2 / pp.37-66 / 2023
  • Scientometrics and research evaluation describe and analyze research publications when conducting publication, citation, and topic analyses. However, what exactly is a (scientific, academic, scholarly or research) publication? This article demonstrates that there are many problems when it comes to looking in detail at quantitative publication analyses, citation analyses, altmetric analyses, and topic analyses. When is a document a publication and when is it not? We discuss authorship and contribution, formally and informally published documents, as well as documents in between (preprints, research data) and the characteristics of references, citations, and topics. What is a research publication? Is there a commonly accepted criterion for distinguishing between research and non-research? How complete and unbiased are data sources for research publications and sources for altmetrics? What is one research publication? What is the unit of a publication that causes us to count it as "1"? In this regard, we report problems related to multi-author publications and their counting, weighted document types, the unit and weighting of citations and references, the unit of topics, and counting problems, not only at the article and individual researcher level (micro-level), but also at the meso-level (e.g., institutions) and macro-level (e.g., countries). Our results suggest that scientometric counting units are not reliable and clear. Many scientometric and research evaluation studies must therefore be used with the utmost caution.

Analysis of outdoor-wear research trends using topic modeling (토픽 모델링을 이용한 아웃도어웨어 연구 동향 분석)

  • Kihyang Han;Minsun Lee
    • The Research Journal of the Costume Culture / v.31 no.1 / pp.53-69 / 2023
  • This study aims to analyze research trends regarding outdoor wear. For this purpose, the data-collection period was limited to January 2002-October 2022, and the collection consisted of titles of papers, academic names, abstracts, and publication years from the Research Information Sharing Service (RISS). Frequency analysis was conducted on 227 papers in total to check academic journals and annual trends, and LDA topic-modeling analysis was conducted using 20,964 tokens. Data pre-processing was performed prior to topic-modeling analysis; after that, topic-modeling analysis, core topic derivation, and visualization were performed in Python. A total of eight topics were obtained from the comprehensive analysis: experiential marketing and lifestyle; property and evaluation of outdoor wear; design and patterns of outdoor wear; outdoor-wear purchase behavior; color, designs and materials of outdoor wear; promotional strategies for outdoor wear; purchase intention and satisfaction depending on the brand image of outdoor wear; and differences in outdoor wear preferences by consumer group. The results of topic-modeling analysis revealed that the topic covering the design and material of outdoor wear and the pattern of jackets related to the overall shape had the largest share, at 30.9% of the total. The next largest topic also concerned the design and color of outdoor wear, indicating that design-related research was the main research topic in outdoor wear research. It is hoped that analyzing outdoor wear research will help comprehend the research conducted thus far and reveal future directions.

Evolutionary Topic Maps (진화연산을 통해 만들어지는 토픽맵)

  • Kim, Ju-Ho;Hong, Won-Wook;McKay, Robert Ian
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.685-689 / 2009
  • Evolutionary computation is not only widely used in optimization and machine learning, but is also being applied to create novel structures and entities. This paper proposes evolutionary topic maps that can suggest new and creative knowledge not easily producible by humans. An interactive evolutionary computation method is applied to topic maps in order to incorporate human evaluation of the feasibility of intermediate topic maps. Evolutionary topic maps are creativity support tools that help users encounter new and creative knowledge. Further work can greatly improve the system by providing more operations, preventing over-convergence, and overcoming the user fatigue problem through a more intuitive user interface, better visualization, and interpolation mechanisms.

A Semantic Aspect-Based Vector Space Model to Identify the Event Evolution Relationship within Topics

  • Xi, Yaoyi;Li, Bicheng;Liu, Yang
    • Journal of Computing Science and Engineering / v.9 no.2 / pp.73-82 / 2015
  • Understanding how a topic evolves is an important and challenging task. A topic usually consists of multiple related events, and the accurate identification of event evolution relationships plays an important role in topic evolution analysis. Existing research has used the traditional vector space model to represent events, which cannot accurately compute the semantic similarity between events. This has led to poor performance in identifying event evolution relationships. This paper suggests constructing a semantic aspect-based vector space model to represent events: First, use a hierarchical Dirichlet process to mine the semantic aspects. Then, construct a semantic aspect-based vector space model according to these aspects. Finally, represent each event as a point and measure the semantic relatedness between events in the space. According to our evaluation experiments, the performance of our proposed technique is promising and significantly outperforms the baseline methods.

A Process-Centered Knowledge Model for Analysis of Technology Innovation Procedures

  • Chun, Seungsu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.1442-1453 / 2016
  • Worldwide economic networks are expanding rapidly in the information society, and they require structural social change through technology innovation. This paper therefore formally defines a process-centered knowledge model for analyzing policy-making procedures on technology innovation. The eventual goal of the proposed knowledge model is to analyze a topic network built from composite keywords in natural-language documents written during technology innovation procedures. The knowledge model is converted into a topic network by combining keywords derived through text mining of natural-language documents. We also show how to analyze the knowledge model and automatically generate feature keywords and relation properties in topic networks.

Evaluation of Topic Modeling Performance for Overseas Construction Market Analysis Using LDA and BERTopic on News Articles (LDA 및 BERTopic 기반 해외건설시장 뉴스 기사 토픽모델링 성능평가)

  • Baik, Joonwoo;Chung, Sehwan;Chi, Seokho
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.6 / pp.811-819 / 2023
  • Understanding the local conditions is a crucial factor in enhancing the success potential of overseas construction projects. This can be achieved through the analysis of news articles of the target market using topic modeling techniques. In this study, the authors aimed to analyze news articles using two topic modeling methods, namely Latent Dirichlet Allocation (LDA) and BERTopic, in order to determine the optimal approach for market condition analysis. To evaluate the alignment between the generated topics and the actual themes of the news documents, the research collected 6,273 BBC news articles, created ground truth data for individual news article topics, and finally compared this ground truth with the results of the topic modeling. The F1 score for LDA was 0.011, while BERTopic achieved a score of 0.244. These results indicate that BERTopic more accurately reflected the actual topics of news articles, making it more effective for understanding the overseas construction market.
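The comparison against ground truth described in this abstract can be sketched as follows. The abstract does not state how the F1 score was averaged or how model topics were mapped to ground-truth themes, so the macro-averaged F1, the function name `macro_f1`, and the toy labels are assumptions for illustration only.

```python
def macro_f1(true_labels, predicted_labels):
    """Macro-averaged F1: the unweighted mean of the per-label F1 scores."""
    labels = sorted(set(true_labels) | set(predicted_labels))
    pairs = list(zip(true_labels, predicted_labels))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: ground-truth theme vs. theme assigned by a topic model.
truth = ["economy", "politics", "economy", "safety"]
predicted = ["economy", "economy", "economy", "safety"]
score = macro_f1(truth, predicted)
```

Macro averaging treats every theme equally regardless of how many articles it covers. A micro-averaged or weighted F1 would give different numbers on imbalanced data.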

Combining Ego-centric Network Analysis and Dynamic Citation Network Analysis to Topic Modeling for Characterizing Research Trends (자아 중심 네트워크 분석과 동적 인용 네트워크를 활용한 토픽모델링 기반 연구동향 분석에 관한 연구)

  • Yu, So-Young
    • Journal of the Korean Society for Information Management / v.32 no.1 / pp.153-169 / 2015
  • This study proposed and examined a combined approach that uses ego-centric network analysis and dynamic citation network analysis to refine the results of LDA-based topic modeling. Two datasets were constructed by collecting Web of Science bibliographic records on white LEDs, and topic modeling was performed with a different number of topics for each dataset. Top keywords assigned to multiple topics were re-assigned to one specific topic by applying an ego-centric network analysis algorithm. Topical cohesion was strongest, with the best values of internal clustering evaluation indices, when topic modeling was applied to the dataset extracted by SPLC network analysis with the number of topics set to the value yielding the lowest perplexity. Furthermore, the study demonstrates the possibility of developing the suggested approach into a method for multi-faceted research trend detection.
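Selecting the number of topics by lowest perplexity, as mentioned in this abstract, can be sketched in a few lines. The sketch assumes per-token held-out log-likelihoods (natural log) are already available for each candidate topic count. The function names and the numbers are hypothetical and do not come from the study.

```python
from math import exp


def perplexity(token_log_likelihoods):
    """Perplexity = exp(-mean per-token log-likelihood), natural log assumed."""
    n = len(token_log_likelihoods)
    return exp(-sum(token_log_likelihoods) / n)


def best_topic_count(log_likelihoods_by_k):
    """Return the candidate topic count whose held-out perplexity is lowest."""
    return min(
        log_likelihoods_by_k,
        key=lambda k: perplexity(log_likelihoods_by_k[k]),
    )


# Hypothetical held-out log-likelihoods per token for three candidate topic counts.
held_out = {
    5: [-2.0, -2.2, -1.8],
    10: [-1.5, -1.6, -1.4],
    20: [-1.9, -2.1, -2.0],
}
best_k = best_topic_count(held_out)
```

Lower perplexity means the model assigns higher probability to unseen text. It does not by itself guarantee that the topics are interpretable, which is presumably why the study also reports clustering evaluation indices.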

A Survey on Automatic Twitter Event Summarization

  • Rudrapal, Dwijen;Das, Amitava;Bhattacharya, Baby
    • Journal of Information Processing Systems / v.14 no.1 / pp.79-100 / 2018
  • Twitter is one of the most popular social platforms for online users to share trending information and views on any event. Twitter reports an event faster than any other medium and contains an enormous amount of information and views regarding an event. Consequently, Twitter topic summarization is one of the most convenient ways to get an instant gist of any event. However, the information shared on Twitter is often full of nonstandard abbreviations, acronyms, out-of-vocabulary (OOV) words, and grammatical mistakes, which make it challenging to find reliable and useful information related to any event. Twitter event summarization is therefore a challenging task on which traditional text summarization methods do not work well. In the last decade, various research works have introduced different approaches for automatic Twitter topic summarization. The main aim of this survey is to provide a broad overview of promising summarization approaches for a Twitter topic. We also focus on the automatic evaluation of summarization techniques by surveying recent evaluation methodologies. At the end of the survey, we emphasize both current and future research challenges in this domain through an in-depth analysis of the most recent summarization approaches.

Abnormal Behavior Recognition Based on Spatio-temporal Context

  • Yang, Yuanfeng;Li, Lin;Liu, Zhaobin;Liu, Gang
    • Journal of Information Processing Systems / v.16 no.3 / pp.612-628 / 2020
  • This paper presents a new approach for detecting abnormal behaviors in complex surveillance scenes where anomalies are subtle and difficult to distinguish due to the intricate correlations among multiple objects' behaviors. Specifically, a cascaded probabilistic topic model was put forward for learning the spatial context of local behavior and the temporal context of global behavior in two different stages. In the first stage of topic modeling, unlike the existing approaches using either optical flows or complete trajectories, spatio-temporal correlations between the trajectory fragments in video clips were modeled by the latent Dirichlet allocation (LDA) topic model based on Markov random fields to obtain the spatial context of local behavior in each video clip. The local behavior topic categories were then obtained by exploiting the spectral clustering algorithm. Based on a dictionary constructed through local behavior topic clustering, the second stage of the LDA topic model learns the correlations of global behaviors and their temporal context. An abnormal behavior recognition method was then developed based on the learned spatio-temporal context of behaviors. The identification method adopts a top-down strategy and consists of two stages: anomaly recognition of video clips and anomalous behavior recognition within each video clip. The evaluation covered the validity of spatio-temporal context learning for local behavior topics and abnormal behavior recognition. The proposed approach significantly improved abnormal behavior recognition performance in complex surveillance scenes.