• Title/Summary/Keyword: LDA algorithm

Search Result 157, Processing Time 0.024 seconds

Design of Optimized Radial Basis Function Neural Networks Classifier with the Aid of Principal Component Analysis and Linear Discriminant Analysis (주성분 분석법과 선형판별 분석법을 이용한 최적화된 방사형 기저 함수 신경회로망 분류기의 설계)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.22 no.6
    • /
    • pp.735-740
    • /
    • 2012
  • In this paper, we introduce design methodologies of polynomial radial basis function neural network classifier with the aid of Principal Component Analysis(PCA) and Linear Discriminant Analysis(LDA). By minimizing the information loss of given data, Feature data is obtained through preprocessing of PCA and LDA and then this data is used as input data of RBFNNs. The hidden layer of RBFNNs is built up by Fuzzy C-Mean(FCM) clustering algorithm instead of receptive fields and linear polynomial function is used as connection weights between hidden and output layer. In order to design optimized classifier, the structural and parametric values such as the number of eigenvectors of PCA and LDA, and fuzzification coefficient of FCM algorithm are optimized by Artificial Bee Colony(ABC) optimization algorithm. The proposed classifier is applied to some machine learning datasets and its result is compared with some other classifiers.

Study of Traffic Sign Auto-Recognition (교통 표지판 자동 인식에 관한 연구)

  • Kwon, Mann-Jun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.9
    • /
    • pp.5446-5451
    • /
    • 2014
  • Because there are some mistakes by hand in processing electronic maps using a navigation terminal, this paper proposes an automatic offline recognition for traffic signs, which are considered ingredient navigation information. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which have been used widely in the field of 2D face recognition as computer vision and pattern recognition applications, was used to recognize traffic signs. First, using PCA, a high-dimensional 2D image data was projected to a low-dimensional feature vector. The LDA maximized the between scatter matrix and minimized the within scatter matrix using the low-dimensional feature vector obtained from PCA. The extracted traffic signs under a real-world road environment were recognized successfully with a 92.3% recognition rate using the 40 feature vectors created by the proposed algorithm.

Parameter Considering Variance Property for Speech Recognition in Noisy Environment (잡음환경에서의 음성인식을 위한 변이특성을 고려한 파라메터)

  • Park, Jin-Young;Lee, Kwang-Seok;Koh, Si-Young;Hur, Kang-In
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • v.9 no.2
    • /
    • pp.469-472
    • /
    • 2005
  • This paper propose about effective speech feature parameter that have robust character in effect of noise in realizing speech recognition system. Established MFCC that is the basic parameter used to ASR(Automatic Speech Recognition) and DCTCs that use DCT in basic parameter. Also, proposed delta-Cepstrum and delta-delta-Cepstrum parameter that reconstruct Cepstrum to have information for variation of speech. And compared recognition performance in using HMM. For dimension reduction of each parameter LDA algorithm apply and compared recognition. Results are presented reduced dimension delta-delta-Cepstrum parameter in using LDA recognition performance that improve more than existent parameter in noise environment of various condition.

  • PDF

Evaluation of Depth Image of IR Range Sensor with Face Recognition Algorithms (적외선 거리 센서 깊이이미지를 이용한 얼굴 인식 알고리즘 평가)

  • Kwon, Ki-Hyeon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.13 no.8
    • /
    • pp.3666-3671
    • /
    • 2012
  • We evaluate the face detection and recognition of depth image that is obtained by infrared range sensor. and Face recognition was usually focused on accuracy aspect but it is not enough to evaluate the performance in testing for real world application. In this paper, we evaluate the overall performance like accuracy, training, test speed and memory use for the well known face recognition algorithm like PCA, LDA, ICA and SVM. This experiment evaluate the good results of depth and colored depth image compatible with the colored image although the file size of depth and colored depth image is 30%~40% less than the colored image. Whereas, LDA got the good accuracy performance next to the SVM and also shows the good performance in speed and the amount of memory.

Study on Potential Topics of the MyData and Data Transactions Using LDA Topic Modeling (국내 마이데이터 태동과 데이터 거래에 관한 잠재적 주제 분석)

  • Cho, Ji Yeon;Lee, Bong Gyou
    • Journal of Digital Convergence
    • /
    • v.20 no.3
    • /
    • pp.221-229
    • /
    • 2022
  • With the recent full-fledged MyData service, interest in the use of personal data is increasing. However, studies on MyData are still in the early stages, focusing on legal and institutional discussions, and studies from a comprehensive perspective are insufficient. Therefore, this study aimed at finding the potential topics formed by social discussions by analyzing news data from 2018 to the present. News data analysis using LDA topic modeling were conducted and 6 potential topics including digital transformation in finance, scope of Mydata business license, amendments and data-related laws, safe use of big data, data economy promotion policy and strategy of the financial industry were derived. This study has significance in that it comprehensively viewed the issues that emerged with the MyData and deriving gaps in previous discussion. Future research is expected to identify changes after the launch of MyData service and provide specific implications through research by specific industries.

Performance Improvement of Topic Modeling using BART based Document Summarization (BART 기반 문서 요약을 통한 토픽 모델링 성능 향상)

  • Eun Su Kim;Hyun Yoo;Kyungyong Chung
    • Journal of Internet Computing and Services
    • /
    • v.25 no.3
    • /
    • pp.27-33
    • /
    • 2024
  • The environment of academic research is continuously changing due to the increase of information, which raises the need for an effective way to analyze and organize large amounts of documents. In this paper, we propose Performance Improvement of Topic Modeling using BART(Bidirectional and Auto-Regressive Transformers) based Document Summarization. The proposed method uses BART-based document summary model to extract the core content and improve topic modeling performance using LDA(Latent Dirichlet Allocation) algorithm. We suggest an approach to improve the performance and efficiency of LDA topic modeling through document summarization and validate it through experiments. The experimental results show that the BART-based model for summarizing article data captures the important information of the original articles with F1-Scores of 0.5819, 0.4384, and 0.5038 in Rouge-1, Rouge-2, and Rouge-L performance evaluations, respectively. In addition, topic modeling using summarized documents performs about 8.08% better than topic modeling using full text in the performance comparison using the Perplexity metric. This contributes to the reduction of data throughput and improvement of efficiency in the topic modeling process.

Real-Time Face Recognition Based on Subspace and LVQ Classifier (부분공간과 LVQ 분류기에 기반한 실시간 얼굴 인식)

  • Kwon, Oh-Ryun;Min, Kyong-Pil;Chun, Jun-Chul
    • Journal of Internet Computing and Services
    • /
    • v.8 no.3
    • /
    • pp.19-32
    • /
    • 2007
  • This paper present a new face recognition method based on LVQ neural net to construct a real time face recognition system. The previous researches which used PCA, LDA combined neural net usually need much time in training neural net. The supervised LVQ neural net needs much less time in training and can maximize the separability between the classes. In this paper, the proposed method transforms the input face image by PCA and LDA sequentially into low-dimension feature vectors and recognizes the face through LVQ neural net. In order to make the system robust to external light variation, light compensation is performed on the detected face by max-min normalization method as preprocessing. PCA and LDA transformations are applied to the normalized face image to produce low-level feature vectors of the image. In order to determine the initial centers of LVQ and speed up the convergency of the LVQ neural net, the K-Means clustering algorithm is adopted. Subsequently, the class representative vectors can be produced by LVQ2 training using initial center vectors. The face recognition is achieved by using the euclidean distance measure between the center vector of classes and the feature vector of input image. From the experiments, we can prove that the proposed method is more effective in the recognition ratio for the cases of still images from ORL database and sequential images rather than using conventional PCA of a hybrid method with PCA and LDA.

  • PDF

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.155-174
    • /
    • 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) have been published. The rapid increase in the number of papers related to COVID-19 is putting time and technical constraints on healthcare professionals and policy makers to quickly find important research. Therefore, in this study, we propose a method of extracting useful information from text data of extensive literature using LDA and Word2vec algorithm. Papers related to keywords to be searched were extracted from papers related to COVID-19, and detailed topics were identified. The data used the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic, updated weekly. The research methods are divided into two main categories. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of publications related to COVID-19 by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. LDA and Word2vec algorithm were used to derive research topics related to COVID-19, and after analyzing related words, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, and a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment' were extracted. did For each collected paper, detailed topics were analyzed using LDA and Word2vec algorithms, and a clustering method through PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy point from the results of this study is that the topics that were not derived from the topics derived for all papers being researched in relation to COVID-19 (

    ) were the topic modeling results for each research topic (
    ) was found to be derived from For example, as a result of topic modeling for papers related to 'vaccine', a new topic titled Topic 05 'neutralizing antibodies' was extracted. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and is said to play an important role in the production of therapeutic agents and vaccine development. In addition, as a result of extracting topics from papers related to 'treatment', a new topic called Topic 05 'cytokine' was discovered. A cytokine storm is when the immune cells of our body do not defend against attacks, but attack normal cells. Hidden topics that could not be found for the entire thesis were classified according to keywords, and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using the Skip-gram method that predicts the similar words as the central word among the Word2vec models. The combination of the LDA model and the Word2vec model tried to show better performance by identifying the relationship between the document and the LDA subject and the relationship between the Word2vec document. In addition, as a clustering method through PCA dimension reduction, a method for intuitively classifying documents by using the t-SNE technique to classify documents with similar themes and forming groups into a structured organization of documents was presented. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of academic papers related to COVID-19, it will reduce the precious time and effort of healthcare professionals and policy makers, and rapidly gain new insights. We hope to help you get It is also expected to be used as basic data for researchers to explore new research directions.

  • Study on Using Teeth Images in Biometrics (생체 인식에서 치아 영상의 이용에 관한 연구)

    • Kim, Tae-Woo;Cho, Tae-Kyung;Lee, Min-Soo
      • Journal of the Korea Academia-Industrial cooperation Society
      • /
      • v.7 no.2
      • /
      • pp.200-205
      • /
      • 2006
    • Abstract This paper presents a personal identification method based on BMME and LDA for images acquired at anterior and posterior occlusion expression of teeth. The method consists of teeth region extraction, BMME, and pattern recognition forthe images acquired at the anterior and posterior occlusion state of teeth. Two occlusions can provide consistent teeth appearance in images and BMME can reduce matching error in pattern recognition. Using teeth images can be beneficial in recognition because teeth, rigid objects, cannot be deformed at the moment of image acquisition. In the experiments, the algorithm was successful in teeth recognition for personal identification for 20 people, which encouraged our method to be able to contribute to multi-modal authentication systems.

    • PDF

    A Study on the Research Trends in Int'l Trade Using Topic modeling (토픽모델링을 활용한 무역분야 연구동향 분석)

    • Jee-Hoon Lee;Jung-Suk Kim
      • Korea Trade Review
      • /
      • v.45 no.3
      • /
      • pp.55-69
      • /
      • 2020
    • This study examines the research trends and knowledge structure of international trade studies using topic modeling method, which is one of the main methodologies of text mining. We collected and analyzed English abstracts of 1,868 papers of three Korean major journals in the area of international trade from 2003 to 2019. We used the Latent Dirichlet Allocation(LDA), an unsupervised machine learning algorithm to extract the latent topics from the large quantity of research abstracts. 20 topics are identified without any prior human judgement. The topics reveal topographical maps of research in international trade and are representative and meaningful in the sense that most of them correspond to previously established sub-topics in trade studies. Then we conducted a regression analysis on the document-topic distributions generated by LDA to identify hot and cold topics. We discovered 2 hot topics(internationalization capacity and performance of export companies, economic effect of trade) and 2 cold topics(exchange rate and current account, trade finance). Trade studies are characterized as a interdisciplinary study of three agendas(i.e. international economy, International Business, trade practice), and 20 topics identified can be grouped into these 3 agendas. From the estimated results of the study, we find that the Korean government's active pursuit of FTA and consequent necessity of capacity building in Korean export firms lie behind the popularity of topic selection by the Korean researchers in the area of int'l trade.


    (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.