• Title/Summary/Keyword: latent Dirichlet allocation model (잠재 디리클레 할당 모형)

Search results: 8

Falling Accidents Analysis in Construction Sites by Using Topic Modeling (토픽 모델링을 이용한 건설현장 추락재해 분석)

  • Ryu, Hanguk
    • Journal of the Korea Convergence Society / v.10 no.7 / pp.175-182 / 2019
  • We classify fall incidents at construction sites into topics using topic modeling, a machine learning technique, and analyze the causes of the accidents for each topic. To apply topic modeling based on latent Dirichlet allocation, the text data were preprocessed and the model was evaluated with the perplexity score to improve its reliability. Falling accidents were most common among daily workers at small construction sites. Most were caused by a lack of safety equipment, inadequate arrangement or wearing of the equipment, and poorly performing safety equipment. To prevent and reduce falling accidents, it is important to educate daily workers at small construction sites, arrange the workplace, and check that personal safety equipment and devices are worn.
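The abstract above evaluates an LDA model with a perplexity score. As a minimal sketch of what that score measures, the function below computes perplexity from already-fitted parameters. The names `perplexity`, `docs`, `theta`, and `phi` are illustrative assumptions and do not come from the paper, which does not describe its implementation.

```python
import math

def perplexity(docs, theta, phi):
    """Perplexity of a fitted topic model on tokenized documents.

    docs  : list of documents, each a list of integer word ids
    theta : theta[d][k] = probability of topic k in document d
    phi   : phi[k][w]   = probability of word w under topic k
    (All three names are hypothetical, chosen for this sketch.)
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            # Mixture probability of word w in document d.
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p)
            n_tokens += 1
    # Exponentiated negative mean log-likelihood per token.
    return math.exp(-log_likelihood / n_tokens)
```

Lower values indicate a better fit. A model that spreads probability uniformly over a vocabulary of V words has perplexity V, which gives a convenient baseline when comparing candidate numbers of topics.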

Aviation Safety Mandatory Report Topic Prediction Model using Latent Dirichlet Allocation (LDA) (잠재 디리클레 할당(LDA)을 이용한 항공안전 의무보고 토픽 예측 모형)

  • Jun Hwan Kim;Hyunjin Paek;Sungjin Jeon;Young Jae Choi
    • Journal of the Korean Society for Aviation and Aeronautics / v.31 no.3 / pp.42-49 / 2023
  • Safety data play a key role in improving safety performance, in aviation and in other industries alike. By analyzing safety data such as aviation safety reports (text data), hazards can be identified and removed before they lead to a tragic accident. However, raw (natural language) data collected from each site must first be pre-processed before it can be used in a proactive or predictive safety management system. As air traffic volume increases, the amount of accumulated data is also rising. Accordingly, there are clear limits to analyzing the data manually. In this paper, a topic prediction model for aviation safety mandatory reports is proposed. The prediction accuracy of the proposed model was verified using actual aviation safety mandatory report data. The model is meaningful in that it not only effectively supports the current analysis of aviation safety mandatory reports but can also be applied to various data produced in the aviation safety field in the future.

Unsupervised Motion Learning for Abnormal Behavior Detection in Visual Surveillance (영상감시시스템에서 움직임의 비교사학습을 통한 비정상행동탐지)

  • Jeong, Ha-Wook;Chang, Hyung-Jin;Choi, Jin-Young
    • Journal of the Institute of Electronics Engineers of Korea SC / v.48 no.5 / pp.45-51 / 2011
  • In this paper, we propose an unsupervised learning method for effectively modeling motion trajectory patterns. In our approach, observations of an object along a trajectory are treated as words in a document for the latent Dirichlet allocation algorithm, which is used in natural language processing to cluster words by topic. This allows topics (e.g., go straight, turn left, turn right) to be clustered effectively in complex scenes such as crossroads. We then learn the patterns of word sequences in each cluster using the Baum-Welch algorithm, which finds the unknown parameters of a hidden Markov model. Abnormality can be evaluated with the forward algorithm by comparing the learned sequences with the input sequence. Experimental results show that the modeling of semantic regions is robust against noise in various scenes.
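The abstract above scores an input sequence against a learned hidden Markov model with the forward algorithm. A minimal sketch of that scoring step follows, assuming discrete observations and parameters that have already been estimated (for example by Baum-Welch). The function and parameter names are illustrative and do not come from the paper.

```python
def forward_likelihood(obs, start, trans, emit):
    """Likelihood of an observation sequence under a discrete HMM.

    obs   : list of integer observation symbols
    start : start[s]     = probability of starting in state s
    trans : trans[r][s]  = probability of moving from state r to state s
    emit  : emit[s][o]   = probability of emitting symbol o in state s
    (Hypothetical names; unscaled, so suited to short sequences only.)
    """
    n_states = len(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)
```

A sequence whose likelihood falls below a chosen threshold could then be flagged as abnormal. The threshold and any length normalization are design choices that the abstract does not specify.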

Text Mining Analysis of News Articles Related to 'Space Hazard' ('우주 위험' 관련 뉴스 기사의 텍스트 마이닝 분석 연구)

  • Jo, Hoon;Sohn, Jungjoo
    • Journal of the Korean Earth Science Society / v.43 no.1 / pp.224-235 / 2022
  • This study aimed to examine the status of media reports on space hazards by applying topic modeling to media articles on space hazards from the past 12 years. To this end, Latent Dirichlet Allocation (LDA) analysis was performed on over 1,200 articles published between 2010 and 2021 on solar storms, artificial space objects, and natural space objects, collected from the BIGKinds news platform. The articles related to solar storms focused on three topics: the effect of solar explosions on satellites; the effect of solar explosions on radio communication in Korea, centered on the Korean Space Weather Center; and the relationship between aircrew and space radiation. The articles related to artificial space objects focused on three topics: the threat of space garbage to satellites and space stations and the transition of useful objects into space junk; the relationship between space garbage and humanity as shown in movies; and the efforts of developed countries to track, monitor, and dispose of space garbage. The articles related to natural space objects focused on two topics: International Space Agency's tracking and monitoring of near-Earth asteroids and countermeasures against collisions; and the evolution and extinction of dinosaurs and mammals, with a focus on collisions with asteroids or comets. Based on these eight themes, which span fields such as society and culture, this study confirmed that domestic media play a role in conveying the dangers of space hazards and drawing public attention to them, and derived educational methods and policies on space hazards.

Feature selection for text data via topic modeling (토픽 모형을 이용한 텍스트 데이터의 단어 선택)

  • Woosol, Jang;Ye Eun, Kim;Won, Son
    • The Korean Journal of Applied Statistics / v.35 no.6 / pp.739-754 / 2022
  • Text data usually consist of many variables, some of which are closely correlated. Such multicollinearity often results in inefficient or inaccurate statistical analysis. In supervised learning, one can select features by examining the relationship between target variables and explanatory variables. In unsupervised learning, however, target variables are absent, so this kind of feature selection procedure cannot be used. In this study, we propose a word selection procedure that employs topic models to find latent topics. We substitute topics for the target variables and select terms that show high relevance to each topic. Applying the procedure to real data, we found that it gives clear topic interpretations by removing high-frequency words that are prevalent across topics. In addition, we observed that, when the selected variables are used in classifiers such as naïve Bayes classifiers and support vector machines, the proposed feature selection procedure gives results comparable to those obtained by using class label information.
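The abstract above selects terms that are highly relevant to each latent topic but does not state the relevance measure. As an illustration only, the sketch below assumes a common choice from the topic-model visualization literature: a weighted sum of the log topic-word probability and the log lift (topic-word probability divided by the marginal word probability). The function name, the weight `lam`, and the input layout are assumptions, not the authors' procedure.

```python
import math

def select_terms(phi, topic_weights, lam=0.6, top_n=10):
    """Pick the top_n most relevant word ids per topic and return their union.

    phi           : phi[k][w] = probability of word w under topic k (all > 0)
    topic_weights : overall proportion of each topic in the corpus
    lam           : weight between log-probability and log-lift (assumed form)
    """
    n_topics, n_words = len(phi), len(phi[0])
    # Marginal word probability across topics.
    marginal = [
        sum(topic_weights[k] * phi[k][w] for k in range(n_topics))
        for w in range(n_words)
    ]
    selected = set()
    for k in range(n_topics):
        scores = []
        for w in range(n_words):
            lift = phi[k][w] / marginal[w]
            score = lam * math.log(phi[k][w]) + (1 - lam) * math.log(lift)
            scores.append((score, w))
        scores.sort(reverse=True)
        selected.update(w for _, w in scores[:top_n])
    return sorted(selected)
```

Because the lift term penalizes words that are probable under every topic, a smaller `lam` tends to drop high-frequency words shared across topics, which matches the behavior the abstract reports.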

Convergence Study on Research Topics for Thyroid Cancer in Korea (국내 갑상선암 논문 토픽에 대한 융합연구)

  • Yang, Ji-Yeon
    • Journal of the Korea Convergence Society / v.10 no.2 / pp.75-81 / 2019
  • The purpose of this study was to perform a convergence study investigating trends in research topics related to thyroid cancer in Korea. We collected related research papers from DBpia and employed an LDA-based topic model. As a result, we identified four research topics, concerning "Surgery", "Disease aggressiveness", "Survival analysis", and "Well-being of patients". With multinomial logistic regression, we found a significant time trend: the "Surgery"-related topic was popular before 2000, topics regarding "Disease aggressiveness" and "Survival analysis" were frequently addressed in the 2000s, and "Survival analysis" and especially "Well-being of patients" have been pursued since 2010. The findings can serve as a reference guide for research directions. Future work may examine whether the recent change in research topics is also observed for other diseases.

Analysis on Status and Trends of SIAM Journal Papers using Text Mining (텍스트마이닝 기법을 활용한 미국산업응용수학 학회지의 연구 현황 및 동향 분석)

  • Kim, Sung-Yeun
    • The Journal of the Korea Contents Association / v.20 no.7 / pp.212-222 / 2020
  • The purpose of this study is to understand the current status and trends of research published by the Society for Industrial and Applied Mathematics, a world leader in the field of industrial mathematics. To this end, titles and abstracts were collected from 6,255 research articles published between 2016 and 2019, and R was used to fit an LDA topic model and a regression model. The analyses showed, first, that the studies covered a variety of areas of industrial mathematics, such as algebra, discrete mathematics, geometry, topology, and probability and statistics. Second, the research subjects on the rise were fluid mechanics, graph theory, and stochastic differential equations, while those in decline were computational theory and classical geometry. By clarifying the overall flows and changes in the intellectual structure of industrial mathematics, the results are expected to give researchers in the field implications for future research directions and for building an industrial mathematics curriculum that reflects the zeitgeist in education.
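The abstract above combines LDA with a regression model to label research subjects as rising or declining, without stating the form of the regression. As one plausible reading, the sketch below assumes an ordinary least-squares fit of a topic's yearly mean proportion on publication year, with the sign of the slope giving the direction of the trend. The function name and inputs are hypothetical.

```python
def trend_slope(years, proportions):
    """Least-squares slope of a topic's yearly mean proportion on year.

    years       : list of publication years
    proportions : mean proportion of the topic in each of those years
    A positive slope suggests a rising topic, a negative slope a declining one.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(proportions) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, proportions))
    return sxy / sxx
```

In practice the slope would be paired with a significance test before a topic is called rising or declining; that step is omitted here.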

Topic Modeling on Fine Dust Issues Using LDA Analysis (LDA 기법을 이용한 미세먼지 이슈의 토픽모델링 분석)

  • Yoon, Soonuk;Kim, Minchul
    • Journal of Energy Engineering / v.29 no.2 / pp.23-29 / 2020
  • In this study, news data on fine dust from the last 10 years were collected and 80 topics were selected through LDA analysis. Weather-related information made up the main words of the topics, and fine dust was found to become a big issue at temperatures below 10 degrees Celsius. The frequency of media exposure and the maximum concentration of fine dust are positively correlated. The major topics were fine dust reduction measures and the government's comprehensive measures over the past decade, products related to fine dust such as air purifiers, policies protecting vulnerable people from fine dust, and fine dust reduction through R&D. As a social issue, measures against fine dust are closely related to government policy.