• Title/Summary/Keyword: bayesian network


A Sliding Window-based Multivariate Stream Data Classification (슬라이딩 윈도우 기반 다변량 스트림 데이타 분류 기법)

  • Seo, Sung-Bo;Kang, Jae-Woo;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.33 no.2 / pp.163-174 / 2006
  • In distributed wireless sensor networks, it is difficult to transmit and analyze the entire data stream because of limited network bandwidth, power, and processing capacity. It is therefore preferable to classify the continuous stream data first and then apply alternative stream data processing. We propose a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes a sliding window of multivariate stream data as input and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it uses a standard text classification algorithm to classify the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms. For supervised classification, we tested a Bayesian classifier and SVM, and for unsupervised classification, we tested Jaccard, TFIDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TFIDF outperformed the other classification methods. In particular, we observed that classification accuracy improves when the correlation among attributes is considered along with the n-gram tokens of symbols.
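The two-step idea in this abstract (discretize a sliding window into symbols, then compare windows as text) can be sketched roughly as follows. This is only an illustration under stated assumptions: the three-symbol alphabet (`U`, `D`, `S`), the threshold `eps`, and all function names are hypothetical, the sketch handles a single attribute rather than the multivariate case, and it is not the authors' implementation.

```python
from typing import List, Set


def sliding_windows(stream: List[float], size: int, step: int = 1) -> List[List[float]]:
    """Cut a stream into overlapping windows of a fixed size."""
    return [stream[i:i + size] for i in range(0, len(stream) - size + 1, step)]


def discretize(window: List[float], eps: float = 0.5) -> str:
    """Encode signal changes as symbols: U (up), D (down), S (steady).

    The alphabet and the threshold eps are assumptions for illustration.
    """
    symbols = []
    for prev, cur in zip(window, window[1:]):
        delta = cur - prev
        if delta > eps:
            symbols.append("U")
        elif delta < -eps:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def ngrams(text: str, n: int = 2) -> Set[str]:
    """Return the set of character n-grams of a symbol string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

A window could then be assigned the label of the most similar reference window, for example by comparing `ngrams(discretize(w))` against labeled examples with `jaccard`.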

Human Error Probability Assessment During Maintenance Activities of Marine Systems

  • Islam, Rabiul;Khan, Faisal;Abbassi, Rouzbeh;Garaniya, Vikram
    • Safety and Health at Work / v.9 no.1 / pp.42-52 / 2018
  • Background: Maintenance operations on board ships are highly demanding. They are intensive activities requiring high man-machine interaction in challenging and evolving conditions. The evolving conditions include weather, workplace temperature, ship motion, noise and vibration, and workload and stress. For example, extreme weather conditions affect seafarers' performance, increasing the chance of error, and can consequently cause injuries or fatalities to personnel. An effective human error probability model is required to better manage maintenance on board ships. Such a model would assist in developing and maintaining effective risk management protocols. Thus, the objective of this study is to develop a human error probability model that considers various internal and external factors affecting seafarers' performance. Methods: The human error probability model is developed using probability theory applied to a Bayesian network. The model is tested using data from a questionnaire survey of >200 experienced seafarers with >5 years of experience. The model is used to determine the reliability of human performance in particular maintenance activities. Results: The developed methodology is tested on the maintenance of a marine engine's cooling water pump for the engine department and an anchor windlass for the deck department. In the case studies considered, human error probabilities are estimated in various scenarios, and the results are compared between the scenarios and the different seafarer categories. The results of the case studies for both departments are also compared. Conclusion: The developed model is effective in assessing human error probabilities. These probabilities are dynamically updated as new information becomes available on changes in either internal factors (i.e., training, experience, and fatigue) or external factors (i.e., environmental and operational conditions such as weather conditions, workplace temperature, ship motion, noise and vibration, and workload and stress).
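The dynamic updating mentioned in the conclusion can be illustrated with a toy Bayesian network. The sketch below assumes a hypothetical three-node structure (fatigue and bad weather as parents of error) with made-up probabilities; none of the numbers or names come from the study.

```python
from itertools import product

# Hypothetical priors for one internal and one external factor.
P_FATIGUE = {True: 0.3, False: 0.7}
P_BAD_WEATHER = {True: 0.2, False: 0.8}

# Hypothetical conditional table: P(error | fatigue, bad_weather).
P_ERROR = {
    (True, True): 0.40,
    (True, False): 0.15,
    (False, True): 0.10,
    (False, False): 0.02,
}


def error_probability(evidence=None):
    """Return P(error | evidence) by enumerating the parent states.

    evidence may fix "fatigue" and/or "bad_weather" to True or False.
    """
    evidence = evidence or {}
    numerator = 0.0
    denominator = 0.0
    for fatigue, weather in product([True, False], repeat=2):
        # Skip parent states that contradict the observed evidence.
        if evidence.get("fatigue", fatigue) != fatigue:
            continue
        if evidence.get("bad_weather", weather) != weather:
            continue
        weight = P_FATIGUE[fatigue] * P_BAD_WEATHER[weather]
        numerator += weight * P_ERROR[(fatigue, weather)]
        denominator += weight
    return numerator / denominator
```

With these illustrative numbers, `error_probability()` gives the prior error probability, and `error_probability({"bad_weather": True})` gives the updated value once bad weather is observed.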

A Study on Detection of Small Size Malicious Code using Data Mining Method (데이터 마이닝 기법을 이용한 소규모 악성코드 탐지에 관한 연구)

  • Lee, Taek-Hyun;Kook, Kwang-Ho
    • Convergence Security Journal / v.19 no.1 / pp.11-17 / 2019
  • Recently, the abuse of Internet technology has caused economic and psychological harm to society as a whole. In particular, newly created or modified malicious code bypasses existing information protection systems and serves as a basic means for various application hacking and cyber security threats. However, research on small executable files, which account for a large portion of actual malicious code, is rather limited. In this paper, we propose a model that uses data mining techniques to analyze the characteristics of known small executable files and applies them to detecting unknown malicious code. Several data mining techniques were applied, including Naive Bayesian, SVM, decision tree, random forest, and artificial neural network, and their accuracy was compared according to the detection level reported by VirusTotal. As a result, a classification accuracy of more than 80% was verified on 34,646 analyzed files.

Two-Layer Approach Using FTA and BBN for Reliability Analysis of Combat Systems (전투 시스템의 신뢰성 분석을 위한 FTA와 BBN을 이용한 2계층 접근에 관한 연구)

  • Kang, Ji-Won;Lee, Jang-Se
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.3 / pp.333-340 / 2019
  • A combat system performs a given mission while enduring various threats. It is important to analyze the reliability of combat systems in order to increase their ability to perform a given mission. Most previous studies considered no threat or only one threat and did not analyze all the dependency relationships among the components. In this paper, we analyze the loss probability of the functions of a combat system and use it to analyze reliability. The proposed method is divided into two layers, a lower layer and an upper layer. In the lower layer, the failure probability of each component is derived using fault tree analysis (FTA) to account for various threats. In the upper layer, the loss probability of a function is analyzed using the component failure probabilities derived in the lower layer together with a Bayesian belief network (BBN), in order to account for the dependency relationships among the components. The proposed method makes it possible to analyze reliability while considering both various threats and the dependencies between components.
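A rough sketch of the two-layer idea, under strong simplifying assumptions: the lower layer combines independent threats with fault-tree OR/AND gates, and the upper layer conditions one component on another in a two-node dependency. The component names (power, radar), the dependency rule, and all probabilities are hypothetical and are not taken from the paper.

```python
def or_gate(probs):
    """Fault-tree OR gate: the event occurs if any independent input occurs."""
    survive = 1.0
    for p in probs:
        survive *= 1.0 - p
    return 1.0 - survive


def and_gate(probs):
    """Fault-tree AND gate: the event occurs only if all independent inputs occur."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def function_loss(p_power_fail, p_radar_own_fail):
    """Upper layer: loss probability of a function that needs the radar.

    Assumed dependency: the radar always fails when power fails, and
    otherwise fails with its own threat-driven probability.
    """
    p_radar_given_power_fail = 1.0
    p_radar_given_power_ok = p_radar_own_fail
    return (p_power_fail * p_radar_given_power_fail
            + (1.0 - p_power_fail) * p_radar_given_power_ok)


# Lower layer: each component is exposed to hypothetical independent threats.
p_power = or_gate([0.10, 0.20])
p_radar_own = or_gate([0.05])
p_loss = function_loss(p_power, p_radar_own)
```

The point of the second layer is visible here: treating the radar as independent of the power supply would give a loss probability of 0.05, whereas accounting for the dependency raises it substantially.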

Diabetes prediction mechanism using machine learning model based on patient IQR outlier and correlation coefficient (환자 IQR 이상치와 상관계수 기반의 머신러닝 모델을 이용한 당뇨병 예측 메커니즘)

  • Jung, Juho;Lee, Naeun;Kim, Sumin;Seo, Gaeun;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1296-1301 / 2021
  • With the recent worldwide increase in diabetes incidence, research has been conducted on predicting diabetes with various machine learning and deep learning technologies. In this work, we present a model for predicting diabetes using machine learning techniques with data from Frankfurt Hospital, Germany. We apply outlier handling using the interquartile range (IQR) technique and Pearson correlation, and compare diabetes prediction performance across models: Decision Tree, Random Forest, KNN (k-nearest neighbors), SVM (support vector machine), Bayesian Network, and the ensemble techniques XGBoost, Voting, and Stacking. The XGBoost technique showed the best performance across the various scenarios, with 97% accuracy. This study is therefore meaningful in that the model can be used to accurately predict and help prevent diabetes, which is prevalent in modern society.
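The two preprocessing steps named here, IQR-based outlier handling and Pearson correlation, can be sketched with the Python standard library. The fence multiplier `k = 1.5` is the common convention and an assumption here, as are the function names; the abstract does not state the exact rule the authors used.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)
```

In a tabular dataset one would typically drop or cap whole rows whose feature falls outside the fences, and use the correlation of each feature with the outcome to decide which features to keep.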

An Effective Feature Generation Method for Distributed Denial of Service Attack Detection using Entropy (엔트로피를 이용한 분산 서비스 거부 공격 탐지에 효과적인 특징 생성 방법 연구)

  • Kim, Tae-Hun;Seo, Ki-Taek;Lee, Young-Hoon;Lim, Jong-In;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.20 no.4 / pp.63-73 / 2010
  • Malicious bot programs, the source of distributed denial of service attacks, are widespread, and the number of PCs infected by them is increasing geometrically these days. Distributed denial of service attacks are launched continuously through these bot PCs, and some financial incidents have recently been reported. Research on responding to distributed denial of service attacks is therefore necessary, and we propose an effective feature generation method for distributed denial of service attack detection using entropy. In this paper, we apply our method to both the DARPA 2000 datasets and distributed denial of service attack datasets that we composed and generated ourselves in a general university environment. We then evaluate the usefulness of the proposed method through classification with a Bayesian network classifier.
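As a minimal sketch of what an entropy-based feature might look like, the code below computes the Shannon entropy of packet header fields within a time window. The choice of fields (`src_ip`, `dst_port`) and the dictionary-based packet representation are assumptions for illustration; the abstract does not specify which fields the authors use.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def window_features(packets):
    """Build an entropy feature vector for one window of packets.

    Each packet is a dict with hypothetical keys "src_ip" and "dst_port".
    """
    return {
        "src_ip_entropy": shannon_entropy([p["src_ip"] for p in packets]),
        "dst_port_entropy": shannon_entropy([p["dst_port"] for p in packets]),
    }
```

The intuition is that an attack with many spoofed sources aimed at one service tends to raise source-address entropy while lowering destination-port entropy, which gives a classifier a compact signal per window.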

Data-Driven Modeling of Freshwater Aquatic Systems: Status and Prospects (자료기반 물환경 모델의 현황 및 발전 방향)

  • Cha, YoonKyung;Shin, Jihoon;Kim, YoungWoo
    • Journal of Korean Society on Water Environment / v.36 no.6 / pp.611-620 / 2020
  • Although process-based models have been a preferred approach for modeling freshwater aquatic systems over extended time intervals, the growing utility of data-driven models in a big data environment has made them increasingly popular in recent decades. In this study, international peer-reviewed journals in the relevant fields were searched in the Web of Science Core Collection, and an extensive literature review covering a total of 2,984 articles published during the last two decades (2000-2020) was performed. The review indicated that, since 2010, the number of published studies using data-driven models has increased faster than the number using process-based models. The increase in the use of data-driven models is partly attributable to the growing availability of data from new sources, e.g., remotely sensed hyperspectral or multispectral data. Consistently throughout the past two decades, South Korea has been among the top ten countries by number of published studies using data-driven models. The major data-driven approaches, i.e., artificial neural networks, decision trees, and Bayesian models, are illustrated with case studies. Based on the review, this study aims to summarize the current state of knowledge on biogeochemical water quality and ecological models that use data-driven approaches, and to outline the remaining challenges and future prospects.

OGLE-2017-BLG-1049: ANOTHER GIANT PLANET MICROLENSING EVENT

  • Kim, Yun Hak;Chung, Sun-Ju;Udalski, A.;Bond, Ian A.;Jung, Youn Kil;Gould, Andrew;Albrow, Michael D.;Han, Cheongho;Hwang, Kyu-Ha;Ryu, Yoon-Hyun;Shin, In-Gu;Shvartzvald, Yossi;Yee, Jennifer C.;Zang, Weicheng;Cha, Sang-Mok;Kim, Dong-Jin;Kim, Hyoun-Woo;Kim, Seung-Lee;Lee, Chung-Uk;Lee, Dong-Joo
    • Journal of The Korean Astronomical Society / v.53 no.6 / pp.161-168 / 2020
  • We report the discovery of a giant exoplanet in the microlensing event OGLE-2017-BLG-1049, with a planet-host star mass ratio of $q = (9.53 \pm 0.39) \times 10^{-3}$ and a caustic crossing feature in Korea Microlensing Telescope Network (KMTNet) observations. The caustic crossing feature yields an angular Einstein radius of $\theta_{\rm E} = 0.52 \pm 0.11$ mas. However, the microlens parallax is not measured because the time scale of the event, $t_{\rm E} \simeq 29$ days, is too short. Thus, we perform a Bayesian analysis to estimate physical quantities of the lens system. We find that the lens system has a star with mass $M_{\rm h} = 0.55^{+0.36}_{-0.29}\,M_\odot$ hosting a giant planet with $M_{\rm p} = 5.53^{+3.62}_{-2.87}\,M_{\rm Jup}$, at a distance of $D_{\rm L} = 5.67^{+1.11}_{-1.52}$ kpc. The projected star-planet separation is $a_\perp = 3.92^{+1.10}_{-1.32}$ au. This means that the planet is located beyond the snow line of the host. The relative lens-source proper motion is $\mu_{\rm rel} \sim 7~{\rm mas~yr^{-1}}$, thus the lens and source will be separated from each other within 10 years. After this, it will be possible to measure the flux of the host star with 30 meter class telescopes and to determine its mass.
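The quoted numbers can be cross-checked with simple arithmetic. The sketch below is not the authors' Bayesian analysis; it only multiplies the reported median values. Two inputs are assumptions not stated in the abstract: the solar-to-Jupiter mass ratio (about 1047.57) and the commonly used snow-line scaling of 2.7 au times the host mass in solar masses.

```python
# Approximate solar-to-Jupiter mass ratio (assumed constant, not from the abstract).
MSUN_IN_MJUP = 1047.57


def planet_mass_mjup(q, host_mass_msun):
    """Planet mass in Jupiter masses from mass ratio q and host mass."""
    return q * host_mass_msun * MSUN_IN_MJUP


def snow_line_au(host_mass_msun, scale_au=2.7):
    """Snow-line distance under the assumed linear scaling with host mass."""
    return scale_au * host_mass_msun


def separation_mas(mu_rel_mas_per_yr, years):
    """Angular lens-source separation accumulated after a number of years."""
    return mu_rel_mas_per_yr * years


m_planet = planet_mass_mjup(9.53e-3, 0.55)   # about 5.5 Jupiter masses
a_snow = snow_line_au(0.55)                  # about 1.5 au, well inside 3.92 au
sep_10yr = separation_mas(7.0, 10)           # about 70 mas after 10 years
```

The product of the medians (about 5.5 Jupiter masses) is close to, but not identical with, the quoted 5.53, which is expected because the median of a product of posterior quantities need not equal the product of their medians.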

Sensitivity assessment of environmental drought based on Bayesian Network model in the Nakdong River basin (베이지안 네트워크 모형 기반의 환경적 가뭄의 민감도 평가: 낙동강 유역을 대상으로)

  • Yoo, Jiyoung;Kim, Tae-Woong
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.79-79 / 2021
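  • It is very important to assess how environmental drought, caused by precipitation deficits in the meteorological sense, affects the aquatic ecosystem (rivers), the lake environment (reservoirs), and the watershed environment (mid-sized basins). Even when precipitation deficits of the same magnitude occur, some watersheds suffer very large impacts on water quality and aquatic ecology, whereas others can maintain a degree of resilience. In other words, the environmental impact of drought is likely to differ across watershed environments, and continuous environmental drought monitoring is important for areas that are vulnerable to environmental drought. Assessing the impact of drought from an environmental perspective requires environmental drought monitoring that links various water quality indicators, together with an efficient decision-making tool for the various stakeholders concerned with drought. This study therefore proposes an environmental drought sensitivity assessment method that applies a Bayesian network model capable of providing information on a variety of scenarios. For the Nakdong River basin, where water quality problems are most severe, a Bayesian network model was used to identify the complex interdependencies among aquatic ecology and environmental variables (BOD, T-P, TOC) under meteorological drought. A model was also built to interpret the linked environmental impacts of meteorological drought between upstream and downstream areas. As a result, it was possible to distinguish mid-sized basins with high environmental sensitivity to meteorological drought (e.g., the Imha Dam basin) from those with the opposite behavior (e.g., the Byeongseongcheon basin). It was also confirmed that severe meteorological drought occurring upstream may have a lasting environmental impact on downstream areas. The proposed method is therefore expected to be useful for prioritizing areas vulnerable to environmental drought and for monitoring environmental drought between upstream and downstream areas.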


Real-time prediction on the slurry concentration of cutter suction dredgers using an ensemble learning algorithm

  • Han, Shuai;Li, Mingchao;Li, Heng;Tian, Huijing;Qin, Liang;Li, Jinfeng
    • International conference on construction engineering and project management / 2020.12a / pp.463-481 / 2020
  • Cutter suction dredgers (CSDs) are widely used in dredging projects such as channel excavation, wharf construction, and reef construction. During CSD operation, the main task is to control the swing speed of the cutter to keep the slurry concentration in a proper range. However, the slurry concentration cannot be monitored in real time, i.e., there is a "time-lag effect" in the slurry concentration log, which makes it difficult for operators to make optimal control decisions. To address this issue, this research proposes a scheme that uses indicators monitored in real time to predict the current slurry concentration. The characteristics of the CSD monitoring data are first studied, and a set of preprocessing methods is presented. We then put forward the concept of an "index class" to select the important indices. Finally, an ensemble learning algorithm is set up to fit the relationship between the slurry concentration and the indices of the index classes. In the experiment, log data covering seven days of an actual dredging project were collected. For comparison, a Deep Neural Network (DNN), Long Short-Term Memory (LSTM), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and the Bayesian Ridge algorithm were also tried. The results show that our method has the best performance, with an R² of 0.886 and a mean square error (MSE) of 5.538. This research provides an effective way to predict the slurry concentration of CSDs in real time and can help to improve the stationarity and production efficiency of dredging construction.
