• Title/Summary/Keyword: LDA topic modeling (LDA 토픽 모델링)

Search Result 228, Processing Time 0.028 seconds

Topic modeling for automatic classification of learner question and answer in teaching-learning support system (교수-학습지원시스템에서 학습자 질의응답 자동분류를 위한 토픽 모델링)

  • Kim, Kyungrog;Song, Hye jin;Moon, Nammee
    • Journal of Digital Contents Society / v.18 no.2 / pp.339-346 / 2017
  • There is increasing interest in text analysis based on unstructured data such as articles, comments, and questions and answers, because such text expresses people's opinions and can be used to identify, evaluate, predict, and recommend features. The same holds true for TEL, where MOOC services have evolved to automate discussion and question-and-answer services on top of the teaching-learning support system. Doing so requires generating question topics and automatically classifying the topics relevant to new questions from the question-and-answer data accumulated in the system. Therefore, in this study, we propose topic modeling using LDA to automatically classify the topics of new queries. The proposed method enables the generation of a dictionary of question topics and the automatic classification of topics relevant to new questions. Experiments showed high automatic classification performance, with scores above 0.7 for some queries. The more the new queries were represented across the various topics, the better the automatic classification results.

Identify Dispute Types of Corporate Information Security Incidents; Focusing on Performance Evaluation of BERTopic, Top2Vec, and LDA-based Topic Modeling (기업 정보보안 사고의 분쟁 유형 도출; BERTopic, Top2Vec, LDA 기반 토픽모델링의 성능 평가를 중심으로)

  • Minjung Park;Young Jin Son;Sangmi Chai
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.531-533 / 2024
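  • With the recent growth of data-driven business models, including AI, corporate information security incidents such as data breaches occur frequently. These incidents often lead to legal disputes that cause enormous economic losses, and companies are increasingly interested in technical and managerial measures to guard against information security incidents proactively. Accordingly, this study performs topic modeling with BERTopic, Top2Vec, and LDA on court precedents related to corporate information security, whose number has surged recently, and classifies corporate information security incidents into types based on the resulting topics. Each precedent traditionally contains different legal elements and rulings, which makes comparison and analysis across similar cases difficult; to reflect this characteristic of precedent data, the study applies each of the three models separately. By comparing the performance of the models' results, the study identifies the types and trends of corporate information security cases and determines the most suitable model for analyzing precedent data.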

Big Data Analysis of Busan Civil Affairs Using the LDA Topic Modeling Technique (LDA 토픽모델링 기법을 활용한 부산시 민원 빅데이터 분석)

  • Park, Ju-Seop;Lee, Sae-Mi
    • Informatization Policy / v.27 no.2 / pp.66-83 / 2020
  • Local issues that occur in cities typically garner great attention from the public. While local governments strive to resolve these issues, it is often difficult to eliminate them all, which leads to complaints. In tackling these issues, it is imperative for local governments to use big data to identify the nature of complaints and proactively provide solutions. This study applies the LDA topic modeling technique to analyze trends and patterns in complaints filed online. To this end, 9,625 online complaints submitted to the city of Busan from 2015 to 2017 were analyzed, and 20 topics were identified. From these topics, key topics were singled out, and analysis of quarterly weighting trends highlighted four "hot" topics (Bus stops, Taxi drivers, Praises, and Administrative handling) and four "cold" topics (CCTV installation, Bus routes, Park facilities including parking, and Festivities issues). The study applied big data analysis to identify trends and patterns in civil affairs and contributes academically by encouraging follow-up research. Moreover, the text mining technique used for complaint analysis can be applied to other projects requiring big data processing.

Analysis of User Reviews of Running Applications Using Text Mining: Focusing on Nike Run Club and Runkeeper (텍스트마이닝을 활용한 러닝 어플리케이션 사용자 리뷰 분석: Nike Run Club과 Runkeeper를 중심으로)

  • Gimun Ryu;Ilgwang Kim
    • Journal of Industrial Convergence / v.22 no.4 / pp.11-19 / 2024
  • The purpose of this study was to analyze user reviews of running applications using text mining. User reviews of Nike Run Club and Runkeeper were collected from the Google Play Store with the selenium package for Python 3 and used as the analysis data. Morphemes were separated with the OKT analyzer, keeping only Korean nouns, and a rankNL dictionary was then created to remove stopwords. The data were analyzed using TF, TF-IDF, and LDA topic modeling. The results are as follows. First, 'record', 'app', and 'workout' were identified as the top keywords in the user reviews of both Nike Run Club and Runkeeper, although their rankings differed between TF and TF-IDF. Second, LDA topic modeling identified the topics 'basic items', 'additional features', 'errors', and 'location-based data' for Nike Run Club, and 'errors', 'voice function', 'running data', 'benefits', and 'motivation' for Runkeeper. Based on these results, it is recommended that errors be fixed and improvements be made to strengthen the competitiveness of the applications.
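
The difference between TF and TF-IDF rankings mentioned in this abstract can be illustrated with a minimal sketch. The tokenized reviews below are invented for illustration and are not the study's data, and the weighting `idf = log(N / df)` is one common variant that the paper may not have used; the function names `term_frequencies` and `tfidf_scores` are hypothetical.

```python
import math
from collections import Counter

def term_frequencies(docs):
    """Total count of each token across all documents (TF)."""
    counts = Counter()
    for tokens in docs:
        counts.update(tokens)
    return counts

def tfidf_scores(docs):
    """Sum over documents of tf(t, d) * idf(t), with idf(t) = log(N / df(t))."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = Counter()
    for tokens in docs:
        for term, tf in Counter(tokens).items():
            scores[term] += tf * math.log(n_docs / doc_freq[term])
    return scores

# Hypothetical, already-tokenized reviews (nouns only), invented for illustration.
reviews = [
    ["app", "record", "error"],
    ["app", "record", "workout"],
    ["app", "voice", "error"],
    ["app", "workout", "workout"],
]

tf_rank = [term for term, _ in term_frequencies(reviews).most_common()]
tfidf_rank = [term for term, _ in tfidf_scores(reviews).most_common()]
print("TF ranking:    ", tf_rank)
print("TF-IDF ranking:", tfidf_rank)
```

In this toy corpus a word that appears in every review ranks first by raw frequency but receives a TF-IDF score of zero, which is the kind of ranking difference the abstract reports.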

A Study on Technology Trend of Power Semiconductor Packaging using Topic model (토픽모델을 이용한 전력반도체 패키징 기술 동향 연구)

  • Park, Keunseo;Choi, Gyunghyun
    • Journal of the Microelectronics and Packaging Society / v.27 no.2 / pp.53-58 / 2020
  • This study analyzes power semiconductor packaging technology for electric vehicles. Valid patents were identified and collected, and topic modeling was performed on them using LDA. The patents were classified into 20 topics, and the technology in each topic was defined from the words extracted for that topic. To analyze trends in power semiconductor packaging technology, hot and cold topics were derived through regression analysis of each topic's frequency by year. Package structure technology according to withstand voltage, input/output-related control technology, and heat dissipation technology were identified as hot topics, and inductance reduction technology was identified as a cold topic.
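
As a rough illustration of deriving hot and cold topics by regressing yearly frequency on the year, the sketch below fits an ordinary least-squares slope per topic and labels it by sign. The yearly counts are invented, the helper names `ols_slope` and `label_topics` are hypothetical, and the sign-only rule is an assumption; studies of this kind often also require the slope to be statistically significant.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line fitted to the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def label_topics(years, topic_counts, threshold=0.0):
    """Label each topic 'hot' (rising), 'cold' (falling), or 'neutral'."""
    labels = {}
    for topic, counts in topic_counts.items():
        slope = ols_slope(years, counts)
        if slope > threshold:
            labels[topic] = "hot"
        elif slope < -threshold:
            labels[topic] = "cold"
        else:
            labels[topic] = "neutral"
    return labels

# Hypothetical yearly document counts per topic, invented for illustration.
years = [2015, 2016, 2017, 2018, 2019]
topic_counts = {
    "heat dissipation": [2, 4, 5, 7, 9],
    "inductance reduction": [9, 8, 6, 5, 3],
    "flat example": [4, 4, 4, 4, 4],
}

labels = label_topics(years, topic_counts)
print(labels)
```

Raising `threshold` above zero would leave topics with only a slight trend labeled as neutral.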

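A Topic Model Reflecting User Behavior and Time Analysis for Extracting Disaster Events from Social Data (소셜 데이터에서 재난 사건 추출을 위한 사용자 행동 및 시간 분석을 반영한 토픽 모델)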

  • ;Lee, Gyeong-Sun
    • Information and Communications Magazine / v.34 no.6 / pp.43-50 / 2017
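  • This article examines a topic modeling technique that incorporates user behavior analysis and temporal analysis on social networks as a way to extract, from social big data, disaster events that threaten public safety and become social issues. By analyzing user behavior, such as the number of posts, retweet responses, activity cycle, number of followers, and number of followings, active and reliable users are identified, so that spam and advertising tweets are excluded and tweets written about an issue by highly reliable users are given greater weight. In addition, to detect the emergence of new issues in Twitter data, changes in the distribution of key term frequencies over time are measured, and event terms for the key issues are extracted through sentiment expression analysis of the issue tweets. Because many tweets on multiple issues can be generated on the same date in social big data, tweets need to be grouped by topic. The article therefore examines a social event detection method that improves the widely used LDA topic modeling technique by incorporating temporally important event terms, derived from the analysis of temporal and user characteristics, and by giving greater weight to tweets written by users who are reliable on the issue.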

Analyzing Female College Student's Recognition of Health Monitoring and Wearable Device Using Topic Modeling and Bi-gram Network Analysis (토픽 모델링 및 바이그램 네트워크 분석 기법을 통한 여대생의 건강관리 및 웨어러블 디바이스 인식에 관한 연구)

  • Jeong, Wookyoung;Shin, Donghee
    • Journal of the Korean Society for Information Management / v.38 no.4 / pp.129-152 / 2021
  • This study proposes a plan for developing wearable devices suited to female college students by analyzing their perceptions of and preferences for wearable devices and their health care needs, using topic modeling and network analysis techniques. To this end, 2,457 posts related to health care and wearable devices were collected from an online community used by students of S Women's University. After the collected posts and comments were preprocessed, LDA-based topic modeling was performed. Topic modeling was used to derive the major issues of female college students related to health care and wearable devices, and bi-gram analysis and network analysis were performed on posts containing related keywords to understand the students' views on wearable devices.
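
A bi-gram network of the kind described here can be sketched by counting adjacent token pairs as weighted edges and summing edge weights per word. The tokenized posts below are invented and are not the study's data; `bigram_edges` and `node_strength` are hypothetical helper names, and the paper's actual preprocessing and network measures are not specified in the abstract.

```python
from collections import Counter

def bigram_edges(token_lists):
    """Count adjacent token pairs; each pair is a weighted, directed edge."""
    edges = Counter()
    for tokens in token_lists:
        edges.update(zip(tokens, tokens[1:]))
    return edges

def node_strength(edges):
    """Sum of edge weights touching each word (a weighted degree)."""
    strength = Counter()
    for (left, right), weight in edges.items():
        strength[left] += weight
        strength[right] += weight
    return strength

# Hypothetical, already-tokenized posts, invented for illustration.
posts = [
    ["smart", "watch", "sleep", "tracking"],
    ["smart", "watch", "heart", "rate"],
    ["sleep", "tracking", "app"],
]

edges = bigram_edges(posts)
strength = node_strength(edges)
print(edges.most_common(3))
print(strength.most_common(3))
```

Words with the highest strength sit at the center of the network and are natural starting points for reading how the surrounding terms are discussed.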

Analysis of Domestic and Foreign Financial Security Research Activities and Trends through Topic Modeling Analysis (토픽모델링 분석 기법을 활용한 국내외 금융보안 분야 연구동향 분석)

  • Chae, Ho-Geun;Lee, Gi-Hyun;Lee, Joo-Yeoun
    • Journal of Korea Society of Industrial Information Systems / v.26 no.1 / pp.83-95 / 2021
  • In this study, major research trends at home and abroad were compared and analyzed in order to derive key research fields in financial security and to suggest directions. To this end, 689 domestic and 20,736 foreign records were collected from domestic and international academic journal databases, and the major research fields related to financial security were extracted through LDA analysis. Hot and cold topics were then derived through time series linear regression analysis. The analysis identified studies related to government policy issues, personal information, and accredited certification as promising research fields in Korea. For foreign countries, studies on developing advanced security systems, such as cryptographic protocols and quantum security, were identified. The recent abolition of public certification has made it possible to apply a variety of security technologies in Korea. Because changes in the promising research fields are expected accordingly, the results of this study are expected to contribute to the establishment and development of a successful roadmap for domestic financial security.

Investigating the Trends of Research for the Small Business Owners (소상공인 연구 동향 분석)

  • Bang, Mi-Hyun;Lee, Young-Min
    • The Journal of the Korea Contents Association / v.22 no.7 / pp.73-80 / 2022
  • In this study, 280 prior studies on small business owners in Korea over the past two decades were comprehensively analyzed through keyword network and LDA topic modeling analysis, and overall views and trends in academia were examined. "Sales" and "protection," which conflict with each other but are both essential for stable and sustainable growth, were selected as core keywords, and 7 topics (Topic 1: start-up, Topic 2: digital, Topic 3: tax system, Topic 4: capability, Topic 5: coexistence, Topic 6: regulation, and Topic 7: funding) were derived. Based on the results, the study raises the need to improve digital maturity for the continued growth and development of small business owners. To address the economic damage facing small business owners, it suggests a pan-ministerial response and functions whose performance remains stable even after a new administration takes office. It also suggests renewed attention to the duration, speed, detail, and direction of government support, and a flexible, negative-regulation approach in which activities are permitted first and regulated afterward.

Keyword trends analysis related to the aviation industry during the Covid-19 period using text mining (텍스트마이닝을 활용한 Covid-19 기간 동안의 항공산업 관련 키워드 트렌드 분석)

  • Choi, Donghyun;Song, Bomi;Park, Dahyeon;Lee, Sungwoo
    • Journal of Korea Society of Industrial Information Systems / v.27 no.2 / pp.115-128 / 2022
  • The purpose of this study is to conduct keyword trend analysis using article data on the impact of Covid-19 on the aviation industry. Related articles were extracted around the keyword "Airline" for the six months before and after the outbreak of Covid-19, and topic modeling (LDA) was then performed. The main topics that emerge during an epidemic such as Covid-19 were extracted, and the results are expected to serve as primary data for predicting the impact on the aviation industry when an event like Covid-19 occurs.