• Title/Summary/Keyword: LDA technique (LDA기법)

Search Result 212, Processing Time 0.026 seconds

News Article Analysis of the 4th Industrial Revolution and Advertising before and after COVID-19: Focusing on LDA and Word2vec (코로나 이전과 이후의 4차 산업혁명과 광고의 뉴스기사 분석 : LDA와 Word2vec을 중심으로)

  • Cha, Young-Ran
    • The Journal of the Korea Contents Association / v.21 no.9 / pp.149-163 / 2021
  • The 4th industrial revolution refers to the next-generation industrial revolution led by information and communication technologies such as artificial intelligence (AI), the Internet of Things (IoT), robot technology, drones, autonomous driving, and virtual reality (VR), and it has had a significant impact on the development of the advertising industry. However, the world is rapidly changing to a non-contact, non-face-to-face living environment to prevent the spread of COVID-19. Accordingly, the roles of the 4th industrial revolution and advertising are changing. Therefore, in this study, text analysis was performed using Big Kinds to examine the 4th industrial revolution and changes in advertising before and after COVID-19. Comparisons were made between 2019 (before COVID-19) and 2020 (after COVID-19). Main topics and documents were classified through LDA topic model analysis and Word2vec, a deep learning technique. The results showed that before COVID-19, topics such as policies, content, and AI appeared, but after COVID-19 the field gradually expanded to finance, advertising, and delivery services utilizing data. Education also appeared as an important issue. In addition, whereas the use of advertising related to 4th industrial revolution technology was mainstream before COVID-19, afterward keywords such as participation, cooperation, and daily necessities were used more actively in connection with education on advanced technology, and talent cultivation appeared prominently. These results are meaningful in that they suggest a multifaceted strategy that can be applied theoretically and practically, and indicate the future direction of advertising in the 4th industrial revolution after COVID-19.

Emotion Recognition of Korean and Japanese using Facial Images (얼굴영상을 이용한 한국인과 일본인의 감정 인식 비교)

  • Lee, Dae-Jong;Ahn, Ui-Sook;Park, Jang-Hwan;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.2 / pp.197-203 / 2005
  • In this paper, we propose an emotion recognition method using facial images to support effective human interface design. The facial database consists of six basic human emotions (happiness, sadness, anger, surprise, fear, and dislike), which are known to be common regardless of nation and culture. Emotion recognition is performed on the facial images after applying the discrete wavelet transform. The feature vectors are then extracted using PCA and LDA. Experimental results show that happiness, sadness, and anger are recognized better than surprise, fear, and dislike. In particular, recognition performance for Japanese subjects is lower for the dislike emotion. In general, recognition rates for Korean subjects are higher than those for Japanese subjects.

Sentiment Analysis of Foot-and-Mouth Disease Using Tweet Text-Mining Technique (트윗 텍스트 마이닝 기법을 이용한 구제역의 감성분석)

  • Chae, Heechan;Lee, Jonguk;Choi, Yoona;Park, Daihee;Chung, Yongwha
    • KIPS Transactions on Software and Data Engineering / v.7 no.11 / pp.419-426 / 2018
  • Foot-and-mouth disease (FMD) causes enormous damage to domestic animal husbandry and related industries every year. Although various academic studies on FMD are ongoing, engineering studies on its social effects are very limited. In this study, we propose a systematic methodology for analyzing the emotional responses of ordinary citizens to FMD using text mining techniques. The proposed system first collects FMD-related data from tweets posted on Twitter and then performs polarity classification using a deep-learning technique. Second, keywords are extracted from the tweets using LDA, a typical topic modeling technique, and a keyword network is constructed from the extracted keywords. Finally, the various social effects of FMD on ordinary citizens are analyzed through the keyword network. As a case study, we conducted a sentiment analysis experiment on ordinary citizens' reactions to FMD in Korea from July 2010 to December 2011.

Analyzing Female College Student's Recognition of Health Monitoring and Wearable Device Using Topic Modeling and Bi-gram Network Analysis (토픽 모델링 및 바이그램 네트워크 분석 기법을 통한 여대생의 건강관리 및 웨어러블 디바이스 인식에 관한 연구)

  • Jeong, Wookyoung;Shin, Donghee
    • Journal of the Korean Society for Information Management / v.38 no.4 / pp.129-152 / 2021
  • This study proposes a plan for developing wearable devices suited to female college students by analyzing their perceptions of and preferences for wearable devices and their health care needs, using topic modeling and network analysis techniques. To this end, 2,457 posts related to health care and wearable devices were collected from the online community used by students of S Women's University. After the collected posts and comments were preprocessed, LDA-based topic modeling was performed. Topic modeling was used to derive the major issues that female college students raise about health care and wearable devices, and bi-gram analysis and network analysis were performed on posts containing related keywords to understand their views on wearable devices.

Comparative Analysis of Dimensionality Reduction Techniques for Advanced Ransomware Detection with Machine Learning (기계학습 기반 랜섬웨어 공격 탐지를 위한 효과적인 특성 추출기법 비교분석)

  • Kim Han Seok;Lee Soo Jin
    • Convergence Security Journal / v.23 no.1 / pp.117-123 / 2023
  • To detect advanced ransomware attacks with machine learning-based models, the classification model must be trained on data with a high-dimensional feature space. In this case, the 'curse of dimensionality' is likely to occur. Therefore, the dimensionality of the features must first be reduced in order to increase the accuracy of the learning model and improve execution speed while avoiding the 'curse of dimensionality'. In this paper, we classified ransomware by applying three machine learning models and two feature extraction techniques to two datasets whose feature spaces differ greatly in dimension. The experiments showed that the dimensionality reduction techniques did not significantly improve performance in binary classification, and the same held in multi-class classification when the dimension of the feature space was small. However, when the dataset had a high-dimensional feature space, LDA (Linear Discriminant Analysis) showed excellent performance.

Analysis of Topic Changes in Metaverse Application Reviews Before and After the COVID-19 Pandemic Using Causal Impact Analysis Techniques (Causal Impact 분석 기법을 접목한 COVID-19 팬데믹 전·후 메타버스 애플리케이션 리뷰의 토픽 변화 분석)

  • Lee, Sowon;Mijin Noh;MuMoungCho Han;YangSok Kim
    • Smart Media Journal / v.13 no.1 / pp.36-44 / 2024
  • The metaverse is attracting attention owing to the development of virtual environment technology and the emergence of contact-free ("untact") culture during the COVID-19 pandemic. In this study, we analyzed user reviews of the "Zepeto" application, which has recently attracted attention as a metaverse service, to identify changes in the requirements for the metaverse after the COVID-19 pandemic. To this end, 109,662 reviews of the "Zepeto" application written on the Google Play Store from September 2018 to March 2023 were collected, topics were extracted using the LDA topic modeling technique, and the Causal Impact technique was used to examine how the topics changed before and after March 11, 2020, when the COVID-19 pandemic was declared. As a result of the analysis, five topics were extracted: application functional problems (topic 1), security problems (topic 2), complaints about cryptocurrency (Zem) in the application (topic 3), application performance (topic 4), and personal information-related problems (topic 5). Among them, security problems (topic 2) were confirmed to be the most affected by the COVID-19 pandemic.

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators describing the attributes or values of products, which enable users or manufacturers to measure and understand the products. When companies analyze their products or compare them with competitors' products, appropriate criteria must be selected for objective evaluation. The criteria should reflect the product features that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumers' opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract product features and topics and use them as evaluation criteria. However, these approaches still produce criteria that are irrelevant to the products, because extracted or improper words are not refined. To overcome this limitation, this research proposes an LDA-k-NN model, which extracts candidate criteria words from online reviews using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase consisting of six steps. First, review data are collected from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews using part-of-speech information from a morpheme analysis module. After preprocessing, the words for each topic in the reviews are obtained with LDA, and only the nouns among the topic words are chosen as potential criteria words. Then, the words are tagged according to whether they can serve as criteria for each middle-level category. Next, every tagged word is vectorized with a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with tags.
After the preparation phase is set up, the criteria extraction phase is conducted on low-level categories. This phase starts with crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Candidate criteria words are extracted by taking nouns from the data and are vectorized with the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the candidate criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. The review data came from the 'Electronics/Digital' section, one of the high-level categories in 11st. For performance evaluation, the suggested model was compared with three alternatives: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the three models above. The criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. A model scored a point when a chosen word had been extracted by that model. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. Paired t-tests on the scores of each model confirmed that the suggested model performs better in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy.
This research proposes an evaluation criteria extraction method that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. The study can contribute to improving review analysis for deriving business insights in the e-commerce market.
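The k-nearest neighbor refinement step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the labels "criterion" and "other", the example words, and the function names `cosine`, `knn_label`, and `refine` are all hypothetical stand-ins, and the use of cosine similarity is an assumption. In the paper, the vectors would come from a pre-trained word embedding model and the candidates from LDA topic nouns.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_label(vector, tagged, k=3):
    """Majority label among the k tagged vectors most similar to `vector`."""
    ranked = sorted(tagged, key=lambda item: cosine(vector, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def refine(candidates, tagged, k=3):
    """Keep only the candidate words whose neighbors are mostly tagged 'criterion'."""
    return [word for word, vec in candidates.items()
            if knn_label(vec, tagged, k) == "criterion"]


# Hypothetical tagged words from the preparation phase (toy 3-d "embeddings").
TAGGED = [
    ([0.9, 0.1, 0.0], "criterion"),
    ([0.8, 0.2, 0.1], "criterion"),
    ([0.7, 0.3, 0.0], "criterion"),
    ([0.1, 0.9, 0.2], "other"),
    ([0.0, 0.8, 0.3], "other"),
    ([0.2, 0.7, 0.1], "other"),
]

# Hypothetical candidate nouns taken from LDA topics of a low-level category.
CANDIDATES = {
    "screen": [0.85, 0.15, 0.05],
    "yesterday": [0.05, 0.90, 0.25],
}

criteria = refine(CANDIDATES, TAGGED)
print(criteria)
```

With these toy vectors, only "screen" lies near the words tagged as criteria, so it is the only candidate kept. The reference-proportion filter mentioned in the abstract is not modeled here.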

Analysis of Social Issues of the Newspaper Articles on Gyeongju Earthquakes (신문기사에 나타난 경주지진 사건의 사회적 이슈분석)

  • Lee, Soo-Sang
    • Journal of Korean Library and Information Science Society / v.48 no.2 / pp.53-72 / 2017
  • The purpose of this study is to analyze the types and features of social issues surrounding the 2016 Gyeongju earthquakes in South Korea. Specifically, it aims to identify the types of topics related to the Gyeongju earthquakes, changes in topics over time, and differences in topics across types of newspapers. Topic modeling extracted 55 topics. The results are as follows. First, the main topics changed over time. In September, various topics emerged, and urgent issues in particular were found in the two weeks after the first earthquake. After October, topics about social problems arising from the earthquakes received much attention. Topics related to the safety of nuclear plants were found steadily throughout the period. Second, topics varied depending on whether the newspaper was national or regional. Differences in topics were also found when the newspapers were divided into those considered conservative and those considered liberal.

Fault Diagnosis of Voltage-Fed Inverters Using Pattern Recognition Techniques for Induction Motor Drive (패턴인식 기법을 이용한 유도전동기 구동용 전압형 인버터의 고장진단)

  • Park, Jang-Hwan;Park, Sung-Moo;Lee, Dae-Jong;Kim, Dong-Hwa;Chun, Myung-Geun
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.19 no.3 / pp.75-84 / 2005
  • Since an unexpected fault in an induction motor drive system can cause serious trouble in many industrial applications, a technique is required to diagnose faults of voltage-fed PWM inverters for induction motor drives. The fault types considered are open-circuit faults in rectifier diodes, switching devices, and input terminals, and the signal for diagnosis is derived from the motor currents. The magnitude of the dq-current trajectory is used for fault feature extraction, and PCA and LDA are applied for diagnosis. We also report execution time results, because the diagnosis software could be embedded in the controllers of small and medium-sized induction motor drives for real-time diagnosis. Various simulations of inverter fault diagnosis verified the usefulness of the proposed algorithm.

Topic modeling for automatic classification of learner question and answer in teaching-learning support system (교수-학습지원시스템에서 학습자 질의응답 자동분류를 위한 토픽 모델링)

  • Kim, Kyungrog;Song, Hye jin;Moon, Nammee
    • Journal of Digital Contents Society / v.18 no.2 / pp.339-346 / 2017
  • There is increasing interest in text analysis based on unstructured data such as articles, comments, and questions and answers. This is because unstructured text data, which expresses people's opinions, can be used to identify, evaluate, predict, and recommend features. The same holds true for TEL, where MOOC services have evolved to automate debate and question-and-answer services based on the teaching-learning support system. Such automation requires generating question topics and automatically classifying the topics relevant to new questions from the question-and-answer data accumulated in the system. Therefore, in this study, we propose topic modeling using LDA to automatically classify the topics of new queries. The proposed method enables the generation of a dictionary of question topics and the automatic classification of topics relevant to new questions. Experiments showed high automatic classification performance of over 0.7 for some queries. The more the new queries were covered by the various topics, the better the automatic classification results.