• Title/Summary/Keyword: Classification algorithms


A Study on the Detection Model of Illegal Access to Large-scale Service Networks using Netflow (Netflow를 활용한 대규모 서비스망 불법 접속 추적 모델 연구)

  • Lee, Taek-Hyun;Park, WonHyung;Kook, Kwang-Ho
    • Convergence Security Journal / v.21 no.2 / pp.11-18 / 2021
  • To protect tangible and intangible assets, most companies monitor information security using various security equipment in their IT service networks. As the amount of equipment that needs to be protected grows while the service network is upgraded and expanded, it becomes difficult to monitor the entire service network for possible exposure to attack. As a countermeasure, various studies have been conducted to detect external attacks and illegal communication by equipment, but studies on effectively monitoring open service ports and building an illegal-communication monitoring system for large-scale service networks are insufficient. In this study, we propose a framework that can monitor information leakage and illegal communication attempts across a wide range of service networks without large-scale investment by analyzing the 'Netflow statistical information' of backbone network equipment, which is the gateway for the entire data flow of the IT service network. By applying machine learning algorithms to the Netflow data, we obtained a high classification accuracy of 94% in identifying whether the Telnet service port of operating equipment is open, and we could track the illegal communication of damaged equipment by using its illegal communication history.

Estimation of Road Surface Condition during Summer Season Using Machine Learning (기계학습을 통한 여름철 노면상태 추정 알고리즘 개발)

  • Yeo, Jiho;Lee, Jooyoung;Kim, Ganghwa;Jang, Kitae
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.17 no.6 / pp.121-132 / 2018
  • Weather is an important factor affecting roadway transportation in many respects, such as traffic flow, drivers' driving patterns, and crashes. This study focuses on the relationship between weather and road surface condition and develops a model to estimate the road surface condition using machine learning. A road surface sensor was attached to a probe vehicle to collect the road surface condition, classified into three categories: 'dry', 'moist', and 'wet'. Road geometry information (curvature, gradient), traffic information (link speed), and weather information (rainfall, humidity, temperature, wind speed) were used as variables to estimate the road surface condition. A variety of machine learning algorithms were examined for predicting the road surface condition, and a two-stage classification model was constructed based on 'Random forest', which had the highest accuracy. Fourteen days of data were used to train the model and 2 days of data were used to test its accuracy. As a result, a road surface condition prediction model with 81.74% accuracy was constructed. The results of this study show the possibility of estimating the road surface condition using existing weather and traffic information without installing new equipment or sensors.

Predicting Corporate Bankruptcy using Simulated Annealing-based Random Forest (시뮬레이티드 어니일링 기반의 랜덤 포레스트를 이용한 기업부도예측)

  • Park, Hoyeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.155-170 / 2018
  • Predicting a company's financial bankruptcy is traditionally one of the most crucial forecasting problems in business analytics. In previous studies, prediction models have been proposed by applying or combining statistical and machine learning-based techniques. In this paper, we propose a novel intelligent prediction model based on simulated annealing, a well-known optimization technique. Simulated annealing is known to have optimization performance comparable to that of genetic algorithms. Nevertheless, since there has been little research on applying simulated annealing to prediction and classification in business decision-making problems, it is meaningful to confirm the usefulness of the proposed model in business analytics. In this study, we use a combined model of simulated annealing and machine learning to select the input features of the bankruptcy prediction model. Typical ways of combining optimization and machine learning techniques are feature selection, feature weighting, and instance selection. This study proposes a combined model for feature selection, which has been studied the most. To confirm the superiority of the proposed model, we apply it to real-world financial data of Korean companies and analyze the results. The results show that the predictive accuracy of the proposed model is better than that of the naïve model. Notably, the performance is significantly improved compared with the traditional decision tree, random forests, artificial neural network, SVM, and logistic regression analysis.
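
The feature-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the neighbor move (flip one feature), the cooling schedule, and the `toy_score` function are all assumptions chosen for illustration. In practice `score` would be something like the validation accuracy of a classifier trained on the selected features.

```python
import math
import random

def simulated_annealing_select(n_features, score, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Search for a feature subset (boolean mask) that maximizes score(mask)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    temperature = t0
    for _ in range(steps):
        # Neighbor: flip one randomly chosen feature in or out of the subset.
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best, best_score

# Hypothetical stand-in for model accuracy: features 0, 2 and 5 help, the rest add noise.
USEFUL = {0, 2, 5}

def toy_score(mask):
    hits = sum(1 for i in USEFUL if mask[i])
    noise = sum(1 for i, on in enumerate(mask) if on and i not in USEFUL)
    return hits - 0.5 * noise

best_mask, best_score = simulated_annealing_select(8, toy_score)
selected = [i for i, on in enumerate(best_mask) if on]
print(selected, best_score)
```

Accepting some worse subsets early, while the temperature is high, is what lets the search escape poor local choices; as the temperature falls, the search behaves more and more like greedy hill climbing.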

A Method for Prediction of Quality Defects in Manufacturing Using Natural Language Processing and Machine Learning (자연어 처리 및 기계학습을 활용한 제조업 현장의 품질 불량 예측 방법론)

  • Roh, Jeong-Min;Kim, Yongsung
    • Journal of Platform Technology / v.9 no.3 / pp.52-62 / 2021
  • Quality control is critical at manufacturing sites and is key to predicting the risk of quality defects before manufacturing. However, the reliability of manual quality control methods is affected by human and physical limitations because manufacturing processes vary across industries. These limitations become particularly obvious in domains with numerous manufacturing processes, such as the manufacture of major nuclear equipment. This study proposes a novel method for predicting the risk of quality defects by using natural language processing and machine learning. Production data collected over 6 years at a factory that manufactures main equipment installed in nuclear power plants were used. In the preprocessing stage of the text data, a mapping method was applied to the word dictionary so that domain knowledge could be appropriately reflected, and a hybrid algorithm combining n-gram, Term Frequency-Inverse Document Frequency, and Singular Value Decomposition was constructed for sentence vectorization. Next, in the experiment to classify the risky processes resulting in poor quality, k-fold cross-validation was applied to cases ranging from Unigram to cumulative Trigram. Furthermore, to obtain objective experimental results, Naive Bayes and Support Vector Machine were used as classification algorithms, and a maximum accuracy and F1-score of 0.7685 and 0.8641, respectively, were achieved. The performance of the proposed method was also compared with the votes of field engineers, and the results revealed that the proposed method outperformed the field engineers. Thus, the method can be implemented for quality control at manufacturing sites.
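
As a rough illustration of the vectorization step, the sketch below builds cumulative n-grams and weights them with a plain TF-IDF formula (term frequency times log(N / document frequency)). The weighting variant, the whitespace tokenizer, and the example sentences are assumptions; the domain-dictionary mapping and the Singular Value Decomposition step described in the abstract are omitted.

```python
import math
from collections import Counter

def cumulative_ngrams(tokens, max_n):
    """Return all n-grams for n = 1..max_n (unigrams plus bigrams for max_n=2)."""
    grams = []
    for n in range(1, max_n + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams

def tfidf_vectors(documents, max_n=2):
    """Vectorize documents as TF-IDF weights over a cumulative n-gram vocabulary."""
    tokenized = [cumulative_ngrams(doc.lower().split(), max_n) for doc in documents]
    vocab = sorted({g for grams in tokenized for g in grams})
    n_docs = len(documents)
    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter(g for grams in tokenized for g in set(grams))
    vectors = []
    for grams in tokenized:
        counts = Counter(grams)
        total = len(grams)
        vectors.append([
            (counts[g] / total) * math.log(n_docs / doc_freq[g]) if total else 0.0
            for g in vocab
        ])
    return vocab, vectors

# Made-up process notes, used only to exercise the function.
docs = ["weld crack found", "weld passed inspection"]
vocab, vectors = tfidf_vectors(docs, max_n=2)
print(len(vocab), vectors[0][vocab.index("crack")])
```

With this weighting, an n-gram that appears in every document (here "weld") gets weight zero, so the vectors emphasize terms that distinguish one process record from another.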

A Proposal of New Breaker Index Formula Using Supervised Machine Learning (지도학습을 이용한 새로운 선형 쇄파지표식 개발)

  • Choi, Byung-Jong;Park, Chang-Wook;Cho, Yong-Hwan;Kim, Do-Sam;Lee, Kwang-Ho
    • Journal of Korean Society of Coastal and Ocean Engineers / v.32 no.6 / pp.384-395 / 2020
  • Breaking waves generated by wave shoaling in coastal areas are closely related to various physical phenomena in coastal regions, such as sediment transport, longshore currents, and shock wave pressure. Therefore, it is crucial to accurately predict breaker indices, such as breaking wave height and breaking depth, when designing coastal structures. Many researchers have made efforts to identify and predict the breaking phenomenon. Representative studies on wave breaking provide many empirical formulas for predicting the breaker index, mainly through hydraulic model experiments. However, the existing empirical formulas assume a specific form of equation and determine its coefficients through statistical analysis of data. In this paper, we applied representative linear supervised machine learning algorithms that show high predictive performance in various research fields involving regression or classification problems. Based on these machine learning methods, a model for predicting the breaker index is developed from previously published experimental data on breaking waves, and a new linear equation for predicting the breaker index is derived from the trained model. Although it is a simple linear equation, the newly proposed breaker index formula showed predictive performance similar to that of the existing empirical formulas.
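
Reading a linear equation off a trained linear model can be sketched with an ordinary least-squares fit of a single predictor. This is only an illustration under stated assumptions: the abstract does not say which input variables or which linear algorithms were used, and the data points below are synthetic, not experimental wave data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic points lying exactly on y = 0.5 x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 1.5, 2.0, 2.5]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.3f} * x + {intercept:.3f}")
```

The fitted coefficients are the formula: once the model is trained, the prediction equation can be written down directly from `slope` and `intercept`, which is what makes linear models attractive when an explicit design formula is the goal.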

Diagnosis and Visualization of Intracranial Hemorrhage on Computed Tomography Images Using EfficientNet-based Model (전산화 단층 촬영(Computed tomography, CT) 이미지에 대한 EfficientNet 기반 두개내출혈 진단 및 가시화 모델 개발)

  • Youn, Yebin;Kim, Mingeon;Kim, Jiho;Kang, Bongkeun;Kim, Ghootae
    • Journal of Biomedical Engineering Research / v.42 no.4 / pp.150-158 / 2021
  • Intracranial hemorrhage (ICH) refers to acute bleeding inside the intracranial vault. Not only does this devastating disease have a very high mortality rate, but it can also cause serious chronic impairment of sensory, motor, and cognitive functions. Therefore, a prompt and professional diagnosis of the disease is critical. Noninvasive brain imaging data are essential for clinicians to efficiently diagnose the locus of the brain lesion, the volume of bleeding, and subsequent cortical damage, and to carry out clinical interventions. In particular, computed tomography (CT) images are used most often for the diagnosis of ICH. Diagnosing ICH from CT images requires medical specialists with sufficient diagnostic experience, and even when this condition is met, there are many cases where bleeding cannot be successfully detected due to factors such as low signal ratio and artifacts in the image itself. In addition, discrepancies between interpretations, or even misinterpretations, may occur and cause critical clinical consequences. To resolve these clinical problems, we developed a diagnostic model that predicts intracranial bleeding and its subtypes (intraparenchymal, intraventricular, subarachnoid, subdural, and epidural) by applying deep learning algorithms to CT images. We also constructed a visualization tool that highlights the regions of a CT image that are important for predicting ICH. Specifically, 1) 27,758 CT brain images from RSNA were pre-processed to minimize the computational load. 2) Three different CNN-based models (ResNet, EfficientNet-B2, and EfficientNet-B7) were trained on a training image data set. 3) The diagnostic performance of each of the three models was evaluated on an independent test image data set; in the model comparison, EfficientNet-B7's performance (classification accuracy = 91%) was far greater than that of the other models. 4) Finally, based on the results of EfficientNet-B7, we visualized the lesions of internal bleeding using Grad-CAM. Our research suggests that artificial intelligence-based diagnostic systems can help diagnose and treat brain diseases, resolving various problems in clinical situations.

Domain Knowledge Incorporated Counterfactual Example-Based Explanation for Bankruptcy Prediction Model (부도예측모형에서 도메인 지식을 통합한 반사실적 예시 기반 설명력 증진 방법)

  • Cho, Soo Hyun;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.307-332 / 2022
  • One of the most intensively studied areas in business applications is the bankruptcy prediction model, a representative classification problem related to lending, investment decision making, and the profitability of financial institutions. Many studies have demonstrated outstanding performance for bankruptcy prediction models using artificial intelligence techniques. However, since most machine learning algorithms are "black boxes," providing users with explanations of AI has been identified as a prominent research topic. Although there are many different approaches to explanation, this study focuses on explaining a bankruptcy prediction model using counterfactual examples. A counterfactual-based explanation provides an alternative case that shows users how to obtain the desired output from the model. This study introduces a counterfactual generation technique based on a genetic algorithm (GA) that leverages both domain knowledge (i.e., causal feasibility) and feature importance from a black-box model, along with other critical counterfactual variables, including proximity, distribution, and sparsity. The proposed method was evaluated quantitatively and qualitatively to measure its quality and validity.

Comparison of Prediction Accuracy Between Classification and Convolution Algorithm in Fault Diagnosis of Rotatory Machines at Varying Speed (회전수가 변하는 기기의 고장진단에 있어서 특성 기반 분류와 합성곱 기반 알고리즘의 예측 정확도 비교)

  • Moon, Ki-Yeong;Kim, Hyung-Jin;Hwang, Se-Yun;Lee, Jang Hyun
    • Journal of Navigation and Port Research / v.46 no.3 / pp.280-288 / 2022
  • This study examined the diagnosis of abnormalities and faults in equipment whose rotational speed changes even during regular operation. The purpose of this study was to suggest a procedure for properly applying machine learning to time series data that become non-stationary as the rotational speed changes. Anomaly and fault diagnosis was performed using the machine learning methods k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Random Forest. To compare the diagnostic accuracy, an autoencoder was used for anomaly detection and a convolution-based Conv1D was additionally used for fault diagnosis. Feature vectors comprising statistical and frequency attributes were extracted, and normalization and dimensionality reduction were applied to the extracted feature vectors. Changes in the diagnostic accuracy of machine learning according to feature selection, normalization, and dimensionality reduction are explained. The hyperparameter optimization process and the layered structure are also described for each algorithm. Finally, the results show that machine learning can accurately diagnose the failure of a variable-rotation machine given appropriate feature treatment, although convolution algorithms have been widely applied to the problem considered.

Quantitative Evaluations of Deep Learning Models for Rapid Building Damage Detection in Disaster Areas (재난지역에서의 신속한 건물 피해 정도 감지를 위한 딥러닝 모델의 정량 평가)

  • Ser, Junho;Yang, Byungyun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.5 / pp.381-391 / 2022
  • This paper aims to identify, among prevailing deep learning models (a type of artificial intelligence, AI), one that helps rapidly detect damaged buildings in areas where disasters occur. The models selected are SSD-512, RetinaNet, and YOLOv3, which have been widely used for object detection in recent years. These models are based on one-stage detector networks that are suitable for rapid object detection. They are often used for object detection because of their structural advantages and high speed, but not for damaged building detection in disaster management. In this study, we first trained each of the algorithms on the xBD dataset, which provides post-disaster imagery with damage classification labels. Next, the three models were quantitatively evaluated using mAP (mean Average Precision) and FPS (Frames Per Second). YOLOv3 recorded an mAP of 34.39% and reached 46 FPS. RetinaNet recorded an mAP of 36.06%, which is 1.67 percentage points higher than YOLOv3, but its FPS was one-third that of YOLOv3. SSD-512 scored significantly lower than YOLOv3 on both quantitative indicators. In a disaster situation, a rapid and precise investigation of damaged buildings is essential for effective disaster response. Accordingly, the results of this study are expected to be useful for rapid response in disaster management.
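
The accuracy and speed trade-off reported above can be checked with a few lines of arithmetic. The mAP and FPS values are taken from the abstract; the RetinaNet FPS is only an estimate derived from the "one-third" statement, since the abstract gives no exact figure.

```python
# Values reported in the abstract.
yolo_map, retina_map = 34.39, 36.06   # mAP in percent
yolo_fps = 46

# Absolute gap in mAP (percentage points) versus relative improvement (percent).
map_gap = round(retina_map - yolo_map, 2)
relative_gain = (retina_map - yolo_map) / yolo_map * 100

# Estimate only: "one-third of YOLOv3" applied to 46 FPS.
retina_fps_estimate = yolo_fps / 3

print(f"mAP gap: {map_gap} points, relative gain: {relative_gain:.2f}%")
print(f"Estimated RetinaNet speed: {retina_fps_estimate:.1f} FPS")
```

The gap between the two mAP values is a difference in percentage points; expressed relative to YOLOv3's mAP, the gain is a little under 5%, obtained at roughly a threefold cost in frame rate.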

A study on fault diagnosis of marine engine using a neural network with dimension-reduced vibration signals (차원 축소 진동 신호를 이용한 신경망 기반 선박 엔진 고장진단에 관한 연구)

  • Sim, Kichan;Lee, Kangsu;Byun, Sung-Hoon
    • The Journal of the Acoustical Society of Korea / v.41 no.5 / pp.492-499 / 2022
  • This study experimentally investigates the effect of dimensionality reduction of vibration signals on fault diagnosis of a marine engine. Using principal component analysis, a vibration signal of dimension 513 is converted into a low-dimensional signal of dimension 1 to 15, and the variation in fault diagnosis accuracy with the number of dimensions is observed. The vibration signal measured from a full-scale marine generator diesel engine is used, and the contribution of the dimension-reduced signal is quantitatively evaluated using two variable importance analysis algorithms: the integrated gradients method and the feature permutation method. The experimental data analysis shows that the accuracy of fault diagnosis improves as the number of dimensions used increases, and when the dimension approaches 10, near-perfect fault classification accuracy is achieved. This shows that the dimension of the vibration signal can be considerably reduced without degrading fault diagnosis accuracy. In the variable importance analysis, the dimension-reduced principal components show a higher contribution than conventional statistical features, which supports the effectiveness of dimension-reduced signals for fault diagnosis.
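
The dimensionality-reduction step can be sketched in plain Python as follows. This is an illustrative implementation of principal component analysis via power iteration with deflation, not the authors' code; the function name, iteration count, and the tiny 3-dimensional example are assumptions (the study reduces 513-dimensional vibration signals, for which a numerical library would be the practical choice).

```python
import random

def pca_reduce(rows, n_components, iterations=200, seed=0):
    """Project rows onto their top principal components (power iteration with deflation)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.random() for _ in range(d)]
        for _ in range(iterations):
            # Repeatedly multiply by the covariance matrix and renormalize.
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflate: remove the found direction so the next pass finds the next component.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)] for a in range(d)]
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]

# Toy data: 3-dimensional points that all lie along one direction.
signals = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
reduced = pca_reduce(signals, n_components=1)
print(reduced)
```

In this toy case a single component captures all of the variation, which mirrors the abstract's finding in miniature: when the signal's variance is concentrated in a few directions, a classifier can work on far fewer dimensions without losing information. The sketch assumes `n_components` does not exceed the rank of the data.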