• Title/Summary/Keyword: LDA model (LDA 모형)

Topic change monitoring study based on Blue House national petition using a control chart (관리도를 활용한 국민청원 토픽 모니터링 연구)

  • Lee, Heeyeon;Choi, Jieun;Lee, Sungim;Son, Won
    • The Korean Journal of Applied Statistics / v.34 no.5 / pp.795-806 / 2021
  • As the volume of text data generated through online channels has grown, so has interest in research that summarizes and analyzes it. One of the fundamental analyses of text data is the extraction of latent topics. A researcher could read all the data and summarize the contents one by one, but this is impractical for large amounts of data. Blei and Lafferty (2007) and Blei et al. (2003) proposed topic modeling methods for extracting topics using a statistical model. Since text data is generally collected over time, it is worthwhile to monitor how topics change. In this study, we propose a topic index based on the results of the topic model. In addition, a control chart, a representative tool for statistical process control, is applied to monitor the topic index over time. As a practical example, we use text data collected from the Blue House National Petition boards between March 5, 2018, and March 5, 2020.
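
The monitoring idea in this abstract can be illustrated with a small sketch. The abstract does not say how the topic index is defined or which control chart is used, so the code below assumes a simple Shewhart-style chart with three-sigma limits estimated from a baseline period. The function names and the weekly index values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from an in-control baseline sample."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return center - k * spread, center, center + k * spread

def out_of_control(series, limits):
    """Indices of points that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, x in enumerate(series) if x < lower or x > upper]

# Hypothetical weekly topic index, e.g. the share of petitions assigned to one topic.
baseline = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10]
monitored = [0.11, 0.10, 0.25, 0.12, 0.02]

limits = control_limits(baseline)
signals = out_of_control(monitored, limits)
print(limits, signals)
```

With these made-up values, the third and fifth monitored weeks fall outside the limits and would be flagged as possible topic shifts.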

A Comparative Study of Classification Methods Using Data with Label Noise (레이블 노이즈가 존재하는 자료의 판별분석 방법 비교연구)

  • Kwon, So Young;Kim, Kyoung Hee
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2853-2864 / 2018
  • Discriminant analysis predicts the class label of a new observation with an unknown label, using information from existing labeled data. Hence, observed labels play a critical role in the analysis, and we usually assume that these labels are correct. If an observed label contains an error, the data has label noise. Label noise occurs frequently in real data and can degrade classification performance. To examine this, a comparative study was carried out using simulated data with label noise. In particular, we considered four classification techniques: LDA (linear discriminant analysis), QDA (quadratic discriminant analysis), KNN (k-nearest neighbour), and SVM (support vector machine). We then evaluated each method by its average accuracy on data generated under various scenarios. The effect of label noise was investigated through its occurrence rate and type (noise location). We confirmed that label noise is a significant factor influencing classification performance.

Topic modeling and topic change trend analysis for advanced construction technologies (건설신기술에 대한 토픽 모델링 및 토픽 변화추이 분석)

  • Jeong, Seong Yun;Kim, Nam Gon
    • Smart Media Journal / v.10 no.4 / pp.102-110 / 2021
  • The advanced construction technology endorsement system is currently operated to promote the development of domestic construction technology. We examined the implicit meanings inherent in advanced construction technologies by analyzing the relationships between highly important emerging vocabularies associated with the technologies endorsed through this system. For this purpose, 918 cases of advanced construction technology information were collected. Based on the endorsement year and summary of each advanced construction technology, the importance of the emerging vocabularies was measured. Then, based on the LDA model, the degree of influence between related vocabularies was evaluated for each of the four topic areas. Topics were analyzed according to the technical application fields. The trend of changes in highly influential vocabularies from 1990 to 2021 was inferred for each topic. Future changes in the degree of influence of the topics of environment, machinery, facilities, and maintenance and reinforcement of structures, and of related technology fields, were predicted.

A Study on Big Data Analysis of Related Patents in Smart Factories Using Topic Models and ChatGPT (토픽 모형과 ChatGPT를 활용한 스마트팩토리 연관 특허 빅데이터 분석에 관한 연구)

  • Sang-Gook Kim;Minyoung Yun;Taehoon Kwon;Jung Sun Lim
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.15-31 / 2023
  • In this study, we propose a novel approach to analyzing big data related to patents in the field of smart factories, using the Latent Dirichlet Allocation (LDA) topic modeling method and the generative artificial intelligence technology ChatGPT. Our method extracts insights from a large dataset of associated patents, using LDA to identify latent topics and their corresponding patent documents. Additionally, we validate the suitability of the generated topics using generative AI technology and review the results with domain experts. We also employ the big data analysis tool KNIME to preprocess and visualize the patent data, facilitating a better understanding of the global patent landscape and enabling a comparative analysis with the domestic patent environment. To explore quantitative and qualitative comparative advantages, we selected six indicators for conducting a quantitative analysis. Consequently, our approach allows us to explore the distinctive characteristics and investment directions of individual countries in the context of research and development and commercialization, based on a global-scale patent analysis in the field of smart factories. We anticipate that our findings will serve as guidance for determining individual countries' directions in research and development investment. Furthermore, we propose a novel use of ChatGPT as a tool for validating the suitability of selected topics for policy makers who must choose topics across various scientific and technological domains.

Comparative Study of Information Literacy Education and Librarian Teacher Evaluation Index in Teachers' Competency Development Evaluation (정보활용교육 주요 토픽과 교원능력개발평가 사서교사 평가지표 비교 연구)

  • Lee, Min-Soo;Kim, Hea-Jin
    • Journal of Korean Library and Information Science Society / v.53 no.3 / pp.455-477 / 2022
  • This study aimed to compare the librarian teacher evaluation index used in the evaluation of teachers' competency development with the topics of information utilization education. To this end, LDA topic modeling was conducted on papers related to information utilization education published in four major journals in the field of literature and information from 1995 to May 2022. The topic modeling showed that information utilization education (T10) was the most actively discussed of the 20 topics at 12.0%, followed by library utilization classes (T2) at 10.4% and user service (T3) at 8.8%. On the other hand, reading discussion (T7), reading education (T19), manpower management (T13), and librarian teacher job satisfaction (T17) showed the lowest shares at 3.3%, 2.9%, 2.1%, and 2.1%, respectively. In addition, although librarian teachers' class model development (T1) and curriculum development (T20) are essential processes for collaborative classes and information utilization education, they were not reflected in the current teacher competency development evaluation index. Therefore, this study proposed that an 'instructional model and curriculum development' indicator be added to the 'training and support classes' factor in the Librarian Teacher Evaluation Index in Teachers' Competency Development Evaluation.

Antecedents of Customer Loyalty in the Context of Sharing Accommodation: Analysis of Structural Equation Modelling and Topic Modelling (공유숙박업에서 고객 충성도에 영향을 미치는 요인: 구조 방정식 모형과 토픽 모델링 분석)

  • Kim, Seon ju;Kim, Byoungsoo
    • Knowledge Management Research / v.22 no.3 / pp.55-73 / 2021
  • The sharing economy is regarded as a form of collaborative consumption that enables customers to share unused resources. This study investigated the key factors affecting customer loyalty in the context of sharing accommodation. Emotions, perceived value, and self-image consistency were posited as key antecedents of customer loyalty. Authentic experience, home amenities, and price fairness were also considered as Airbnb's selection attributes. Airbnb was selected as the survey target because it is the largest company in the shared accommodation market. The research model was analyzed for 294 Airbnb customers through structural equation modelling. Additionally, this paper examines Airbnb customers' experiences posted on Naver blogs using topic modelling. By clarifying the key factors affecting customer loyalty to sharing accommodation, the analysis results contribute to establishing effective marketing and operation strategies that enhance customer experience.

A Comparative Analysis of Social Commerce and Open Market Using User Reviews in Korean Mobile Commerce (사용자 리뷰를 통한 소셜커머스와 오픈마켓의 이용경험 비교분석)

  • Chae, Seung Hoon;Lim, Jay Ick;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.53-77 / 2015
  • Mobile commerce provides a convenient shopping experience in which users can buy products without the constraints of time and space. Mobile commerce has already set off a mega trend in Korea. The market size is estimated at approximately 15 trillion won (KRW) for 2015, thus far. In the Korean market, social commerce and open market are key components. Social commerce overwhelms the open market in terms of the number of users in the Korean mobile commerce market. From the industry's point of view, quick market entry and content curation are considered the major success factors, reflecting the rapid growth of social commerce in the market. However, empirical academic research and analysis of social commerce's success is still insufficient. Social commerce and the open market are expected to compete intensively in Korean mobile commerce, so it is important to conduct an empirical analysis of the differences in user experience between them. This paper is an exploratory study that presents a comparative analysis of social commerce and the open market regarding user experience, based on mobile users' reviews. First, approximately 10,000 user reviews of social commerce and open market apps listed on Google Play were collected. The reviews were classified into topics, such as perceived usefulness and perceived ease of use, through LDA topic modeling. Then, sentiment analysis and co-occurrence analysis were conducted on the topics of perceived usefulness and perceived ease of use. The results demonstrated that social commerce users have a more positive experience in terms of service usefulness and convenience than open market users in the mobile commerce market.
Social commerce has provided positive user experiences to mobile users in service areas such as 'delivery,' 'coupon,' and 'discount,' while the open market has faced user complaints about technical problems and inconveniences such as 'login error,' 'view details,' and 'stoppage.' This result shows that social commerce performs well in terms of user service experience, owing to aggressive marketing campaigns and investments in logistics infrastructure. However, the open market still has mobile optimization problems, since it has not resolved user complaints and inconveniences arising from technical problems. This study presents an exploratory research method for analyzing user experience through an empirical approach to user reviews. In contrast to previous studies, which conducted surveys to analyze user experience, this study used empirical analysis of user reviews, which reflect users' vivid and actual experiences. Specifically, by combining an LDA topic model with the technology acceptance model (TAM), this study presents a methodology that analyzes user reviews effectively by dividing them into service areas and technical areas. The methodology has not only demonstrated the differences in user experience between social commerce and open market, but has also provided a deep understanding of user experience in Korean mobile commerce. In addition, the results have important implications for social commerce and open market by showing that user insights can be utilized in establishing competitive and groundbreaking strategies in the market. The limitations and directions for follow-up studies are as follows. A follow-up study will need to design a more elaborate text analysis technique.
This study could not fully clean the user reviews, even though online reviews inherently contain typos and mistakes. This study has shown that user reviews are an invaluable source for analyzing user experience. The methodology of this study can be expected to further expand comparative research on services using user reviews. Even at this moment, users around the world are posting reviews about their service experiences after using mobile game, commerce, and messenger applications.

Unsupervised Motion Learning for Abnormal Behavior Detection in Visual Surveillance (영상감시시스템에서 움직임의 비교사학습을 통한 비정상행동탐지)

  • Jeong, Ha-Wook;Chang, Hyung-Jin;Choi, Jin-Young
    • Journal of the Institute of Electronics Engineers of Korea SC / v.48 no.5 / pp.45-51 / 2011
  • In this paper, we propose an unsupervised learning method for modeling motion trajectory patterns effectively. In our approach, observations of an object on a trajectory are treated as words in a document for the latent Dirichlet allocation (LDA) algorithm, which is used in natural language processing to cluster words by topic. This allows topics (e.g. go straight, turn left, turn right) to be clustered effectively in complex scenes, such as crossroads. After this procedure, we learn the patterns of word sequences in each cluster using the Baum-Welch algorithm, which finds the unknown parameters of a hidden Markov model. Abnormality is then evaluated with the forward algorithm by comparing the learned sequences with the input sequence. Experimental results show that the modeling of semantic regions is robust against noise in various scenes.
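
The scoring step mentioned in this abstract can be sketched as follows. This is only an illustration of the forward algorithm for a discrete hidden Markov model, not the authors' implementation. The model parameters are set by hand rather than learned with Baum-Welch, and the state count, symbol coding (0 = go straight, 1 = turn left, 2 = turn right), and all numbers are assumptions.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

# Hypothetical two-state model with hand-set (not learned) parameters.
start = [0.9, 0.1]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]

usual = [0, 0, 0, 0, 2, 2, 2, 2]    # straight, then a sustained right turn
unusual = [0, 2, 0, 2, 0, 2, 0, 2]  # erratic alternation

usual_score = forward_log_likelihood(usual, start, trans, emit)
unusual_score = forward_log_likelihood(unusual, start, trans, emit)
print(usual_score, unusual_score)
```

A lower log-likelihood means the input sequence fits the modeled pattern less well. In practice a threshold on this score (another assumption here) would decide whether a trajectory is flagged as abnormal.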

A Study on VaR Stability for Operational Risk Management (운영리스크 VaR 추정값의 안정성검증 방법 연구)

  • Kim, Hyun-Joong;Kim, Woo-Hwan;Lee, Sang-Cheol;Im, Jong-Ho;Cho, Sang-Hee;Kim, Ah-Hyoun
    • Communications for Statistical Applications and Methods / v.15 no.5 / pp.697-708 / 2008
  • Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes, people and systems, or external events. The advanced measurement approach proposed by the Basel Committee uses the loss distribution approach (LDA), which quantifies operational loss based on a bank's own historical data and measurement system. LDA fits two distributions (frequency and severity) and then generates the aggregate loss distribution by mathematical convolution. Objective validation of operational risk measurement is essential because the measurement allows flexibility and subjective judgement in calculating regulatory capital. However, a methodology for verifying the soundness of operational risk measurement has not been fully developed, because internal operational loss data have been extremely sparse and modeling the extreme tail is very difficult. In this paper, we propose a methodology for validating operational risk measurement based on bootstrap confidence intervals of operational VaR (value at risk). We derived two methods for generating confidence intervals of operational VaR.
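
The loss distribution approach described in this abstract can be sketched roughly as below. The abstract builds the aggregate loss by convolution and does not specify its two confidence interval methods, so this sketch swaps in Monte Carlo simulation for the convolution and a plain percentile bootstrap for the interval. The Poisson frequency, lognormal severity, and all parameter values are illustrative assumptions, not the paper's choices.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def annual_loss(rng, lam, mu, sigma):
    """One simulated year: Poisson number of losses, lognormal severities, summed."""
    return sum(rng.lognormvariate(mu, sigma) for _ in range(sample_poisson(rng, lam)))

def quantile(values, q):
    """Empirical quantile using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def bootstrap_var_interval(losses, q, n_boot, rng, level=0.95):
    """Percentile bootstrap interval for the q-quantile (VaR) of simulated losses."""
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(losses, k=len(losses))  # resample with replacement
        estimates.append(quantile(resample, q))
    tail = (1.0 - level) / 2.0
    return quantile(estimates, tail), quantile(estimates, 1.0 - tail)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
losses = [annual_loss(rng, lam=5.0, mu=10.0, sigma=1.0) for _ in range(2000)]
var_99 = quantile(losses, 0.99)
low, high = bootstrap_var_interval(losses, 0.99, n_boot=200, rng=rng)
print(var_99, (low, high))
```

A wide interval relative to the point estimate indicates an unstable VaR estimate, which is the kind of instability the abstract's validation methodology is meant to expose.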

AI speakers!, Speak with feelings - Focusing on Analysis of SNS Comments (AI 스피커!, 감정을 담아 말해봐 - SNS 댓글 분석을 중심으로)

  • Kim, Joon-Hwan;Lee, Namyeon
    • Journal of Digital Convergence / v.18 no.7 / pp.101-110 / 2020
  • AI speakers and related devices that add emotion-specific services or various functions are appearing. Against this background, this study performed topic modeling on post-purchase texts written by AI speaker users and compared the results with data collected via survey questionnaires. Furthermore, data on the emotional intelligence of AI speakers and relationship quality were collected from 600 users and analyzed using structural equation modeling. The findings of the study are as follows. First, the topic modeling results showed that most of the texts mainly mention the functional aspects of AI speakers. Second, the emotional intelligence of AI speakers as perceived by consumers affected relationship quality, and relationship quality had a positive effect on customer satisfaction. Therefore, this study expands the area of AI research by integrating the concepts of emotional intelligence and relationship quality to provide new theoretical and practical implications.