• Title/Summary/Keyword: asymmetric error cost (비대칭 오류비용)

Search Result 8, Processing Time 0.02 seconds

A Recidivism Prediction Model Based on XGBoost Considering Asymmetric Error Costs (비대칭 오류 비용을 고려한 XGBoost 기반 재범 예측 모델)

  • Won, Ha-Ram;Shim, Jae-Seung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.127-137 / 2019
  • Recidivism prediction has been a subject of constant research by experts since the early 1970s, and it has become more important as crimes committed by recidivists steadily increase. In particular, research on recidivism prediction became more active in the 1990s, after the US and Canada adopted the 'Recidivism Risk Assessment Report' as a decisive criterion in trial and parole screening. In the same period, empirical studies on 'Recidivism Factors' also began in Korea. Although most recidivism prediction studies have so far focused on the factors of recidivism or on prediction accuracy, it is important to minimize the misclassification cost, because recidivism prediction has an asymmetric error cost structure. In general, the cost of misclassifying a person who will not reoffend as a likely recidivist is lower than the cost of misclassifying a person who will reoffend as a non-recidivist, because the former only adds monitoring costs, while the latter increases social and economic costs. Therefore, in this paper, we propose an XGBoost (eXtreme Gradient Boosting; XGB) based recidivism prediction model that considers asymmetric error costs. In the first step of the model, XGB, which is recognized as a high-performance ensemble method in the field of data mining, was applied, and its results were compared with those of various prediction models such as LOGIT (logistic regression analysis), DT (decision trees), ANN (artificial neural networks), and SVM (support vector machines). In the next step, the classification threshold is optimized to minimize the total misclassification cost, which is the weighted average of FNE (False Negative Error) and FPE (False Positive Error). To verify the usefulness of the model, it was applied to a real recidivism prediction dataset.
As a result, it was confirmed that the XGB model not only showed better prediction accuracy than the other prediction models but also reduced the misclassification cost most effectively.
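
The threshold-optimization step described in this abstract can be illustrated with a minimal sketch. It assumes the classifier already outputs a probability per case, and it uses a weighted sum of error counts as the cost; the function names, the 5:1 cost ratio, and the toy scores are illustrative assumptions, not values from the paper.

```python
def total_cost(probs, labels, threshold, fn_cost, fp_cost):
    """Weighted misclassification cost at one threshold (label 1 = positive)."""
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return fn_cost * fn + fp_cost * fp


def best_threshold(probs, labels, fn_cost, fp_cost, steps=100):
    """Grid-search the threshold in [0, 1] with the lowest total cost.

    Ties are resolved in favor of the lowest threshold.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates,
               key=lambda t: total_cost(probs, labels, t, fn_cost, fp_cost))


# Toy scores standing in for model output (hypothetical, not the paper's data).
probs = [0.10, 0.30, 0.35, 0.45, 0.60, 0.70, 0.90]
labels = [0, 1, 0, 0, 1, 1, 1]

t_sym = best_threshold(probs, labels, fn_cost=1.0, fp_cost=1.0)
t_asym = best_threshold(probs, labels, fn_cost=5.0, fp_cost=1.0)
print("symmetric costs:", t_sym, "asymmetric costs:", t_asym)
```

When false negatives are priced higher, the selected threshold moves down, so more cases are flagged as positive in exchange for extra false positives.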

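Application Strategy of Data Mining Using the Asymmetric Error Costs of Intrusion Detection Systems (침입탐지시스템의 비대칭 오류비용을 이용한 데이터마이닝의 적용전략)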

  • Hong, Tae-Ho;Kim, Jin-Wan
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.11a / pp.251-257 / 2005
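  • Network intrusion detection systems have recently come to be regarded as very important in information systems security. Studies that apply data mining techniques to network intrusion detection systems have been actively conducted. However, simply applying data mining techniques cannot maximize the effectiveness of an intrusion detection system. A characteristic of intrusion detection systems is that the impact on the organization differs greatly depending on the type of error. This study therefore presents an approach that applies different data mining techniques according to the characteristics of the errors of intrusion detection systems. In addition, we collected data on intrusion attacks made through actual networks used in Korea and, by applying neural networks, inductive learning, and rough sets, presented a network intrusion detection model that reflects the characteristics of domestic data.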

Intelligent Intrusion Detection Systems Using the Asymmetric costs of Errors in Data Mining (데이터 마이닝의 비대칭 오류비용을 이용한 지능형 침입탐지시스템 개발)

  • Hong, Tae-Ho;Kim, Jin-Wan
    • The Journal of Information Systems / v.15 no.4 / pp.211-224 / 2006
  • This study investigates the application of data mining techniques such as artificial neural networks, rough sets, and induction learning to intrusion detection systems. To maximize the effectiveness of data mining for intrusion detection systems, we introduce asymmetric costs for false positive errors and false negative errors, and we present a method by which intrusion detection systems can use these asymmetric error costs in data mining. The results of our empirical experiment show that our intrusion detection model provides high accuracy in intrusion detection. In addition, the approach that uses the asymmetric costs of errors in rough sets and neural networks is effective as the threshold value changes. We found that the threshold plays the most important role in the intrusion detection model in reducing the costs that result from false negative errors.

An Integrated Model based on Genetic Algorithms for Implementing Cost-Effective Intelligent Intrusion Detection Systems (비용효율적 지능형 침입탐지시스템 구현을 위한 유전자 알고리즘 기반 통합 모형)

  • Lee, Hyeon-Uk;Kim, Ji-Hun;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.125-141 / 2012
  • Malicious attacks and hacks on networked systems are increasing dramatically, and their patterns are changing rapidly. Consequently, it has become more important to handle these attacks appropriately, and there is considerable interest in and demand for effective network security systems such as intrusion detection systems. Intrusion detection systems are network security systems for detecting, identifying, and responding appropriately to unauthorized or abnormal activities. Conventional intrusion detection systems have generally been designed using experts' implicit knowledge of network intrusions or of hackers' abnormal behaviors. However, although they perform very well under normal situations, they cannot handle new or unknown patterns of network attacks. As a result, recent studies on intrusion detection systems use artificial intelligence techniques, which can respond proactively to unknown threats. For a long time, researchers have adopted and tested various artificial intelligence techniques such as artificial neural networks, decision trees, and support vector machines to detect intrusions on the network. However, most of them have applied these techniques individually, even though combining the techniques may lead to better detection. For this reason, we propose a new integrated model for intrusion detection. Our model is designed to combine the prediction results of four different binary classification models that may be complementary to each other: logistic regression (LOGIT), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM). Genetic algorithms (GA) are used as the tool for finding the optimal combining weights. Our proposed model is built in two steps. In the first step, the optimal integration model, the one with the lowest prediction error (i.e., misclassification rate), is generated.
In the second step, the model explores the optimal classification threshold for determining intrusions, the one that minimizes the total misclassification cost. To calculate the total misclassification cost of an intrusion detection system, we need to understand its asymmetric error cost scheme. Generally, there are two common forms of errors in intrusion detection. The first error type is the False-Positive Error (FPE). In the case of FPE, the wrong judgment may result in unnecessary corrective action. The second error type is the False-Negative Error (FNE), which mainly misjudges a malicious program as normal. Compared to FPE, FNE is more fatal. Thus, the total misclassification cost is affected more by FNE than by FPE. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 10,000 samples from them by random sampling. We also compared the results from our model with the results from the single techniques to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, and ANN was run using Neuroshell R4.0. For SVM, LIBSVM v2.90, a freeware tool for training SVM classifiers, was used. Empirical results showed that our proposed GA-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that the proposed model outperformed all the other comparative models in terms of total misclassification cost. Consequently, our study is expected to contribute to building cost-effective intelligent intrusion detection systems.
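
The first step above, searching for combining weights with a genetic algorithm, could look roughly like the following sketch. The abstract does not state the GA settings, so the population size, elitism, uniform crossover, Gaussian mutation, and the toy probabilities are all assumptions made for illustration; `evolve_weights` and the other names are hypothetical.

```python
import random


def combine(weights, model_probs):
    """Weighted average of the per-model probabilities for each sample."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, row)) / total
            for row in model_probs]


def error_rate(weights, model_probs, labels, threshold=0.5):
    """Share of samples misclassified by the combined model."""
    combined = combine(weights, model_probs)
    wrong = sum(1 for p, y in zip(combined, labels)
                if (p >= threshold) != (y == 1))
    return wrong / len(labels)


def evolve_weights(model_probs, labels, pop_size=20, generations=30, seed=0):
    """Tiny GA with elitism, uniform crossover, and Gaussian mutation."""
    rng = random.Random(seed)
    n_models = len(model_probs[0])

    def fitness(w):
        return error_rate(w, model_probs, labels)

    # Start from each single model (one-hot weights) plus random mixtures,
    # so the result is never worse than the best single model on this data.
    population = [[1.0 if i == j else 0.0 for j in range(n_models)]
                  for i in range(n_models)]
    while len(population) < pop_size:
        population.append([rng.random() + 1e-6 for _ in range(n_models)])

    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # elitism: keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # crossover
            child = [max(0.0, w + rng.gauss(0, 0.1)) for w in child]  # mutation
            if sum(child) == 0:
                child = [1.0] * n_models  # avoid an all-zero weight vector
            children.append(child)
        population = survivors + children

    best = min(population, key=fitness)
    total = sum(best)
    return [w / total for w in best]


# Columns: hypothetical LOGIT, DT, ANN, SVM probabilities for six log records.
model_probs = [
    [0.2, 0.6, 0.3, 0.1],
    [0.7, 0.4, 0.8, 0.6],
    [0.4, 0.7, 0.6, 0.3],
    [0.6, 0.3, 0.4, 0.7],
    [0.1, 0.2, 0.6, 0.2],
    [0.9, 0.8, 0.4, 0.7],
]
labels = [0, 1, 0, 1, 0, 1]
weights = evolve_weights(model_probs, labels)
print([round(w, 3) for w in weights], error_rate(weights, model_probs, labels))
```

The second step, choosing the classification threshold that minimizes the total misclassification cost, would then be applied to the combined probabilities.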

A Study on the Effect of Fair Value Hierarchy upon Cost of Capital Through the Convergence of Market Risk Management and Audit Quality (시장위험관리와 감사품질의 융합을 통한 공정가치 서열체계의 자본비용에 미치는 영향에 대한 연구)

  • Oh, Hyun-Taek
    • Journal of the Korea Convergence Society / v.6 no.5 / pp.1-8 / 2015
  • Fair value hierarchy data are expected to contain different degrees of measurement error, information asymmetry, and information risk depending on the level of the hierarchy. This study therefore examines how the fair value hierarchy differentially influences companies' cost of capital. A regression analysis of corporations listed from 2011 to 2014 shows that the regression coefficients of the Level 1 and Level 2 fair value variables vary in rank by type of cost of capital, while Level 3 has the highest regression coefficient for every cost of capital variable. A further analysis of how the relationship between cost of capital and the fair value hierarchy is affected by the level of market risk management and by audit quality finds no consistent results. However, an analysis of the joint interaction effect, through the convergence of market risk management and audit quality, shows that the mitigating effect on the cost of capital of Level 3 is highest when both audit quality and the level of market risk management are high. In conclusion, fair value hierarchy data appear to affect the cost of capital differentially according to the information risk involved, and this information risk can decrease with the level of market risk management and audit quality.

An Intelligent Intrusion Detection Model Based on Support Vector Machines and the Classification Threshold Optimization for Considering the Asymmetric Error Cost (비대칭 오류비용을 고려한 분류기준값 최적화와 SVM에 기반한 지능형 침입탐지모형)

  • Lee, Hyeon-Uk;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.157-173 / 2011
  • As Internet use has exploded recently, malicious attacks and hacking against networked systems occur frequently. This means that such intrusions can cause fatal damage to government agencies, public offices, and companies that operate various systems. For these reasons, there is growing interest in and demand for intrusion detection systems (IDS), the security systems for detecting, identifying, and responding appropriately to unauthorized or abnormal activities. The intrusion detection models applied in conventional IDS are generally designed by modeling experts' implicit knowledge of network intrusions or of hackers' abnormal behaviors. These kinds of intrusion detection models perform well under normal situations. However, they perform poorly when they meet a new or unknown pattern of network attack. For this reason, several recent studies have tried to adopt various artificial intelligence techniques, which can respond proactively to unknown threats. In particular, artificial neural networks (ANNs) have been widely applied in prior studies because of their superior prediction accuracy. However, ANNs have some intrinsic limitations, such as the risk of overfitting, the requirement for a large sample size, and the lack of understanding of the prediction process (i.e., black box theory). As a result, the most recent studies on IDS have started to adopt the support vector machine (SVM), a classification technique that is more stable and powerful than ANNs. SVM is known for its relatively high predictive power and generalization capability. Against this background, this study proposes a novel intelligent intrusion detection model that uses SVM as the classification model in order to improve the predictive ability of IDS. Our model is also designed to consider the asymmetric error cost by optimizing the classification threshold. Generally, there are two common forms of errors in intrusion detection.
The first error type is the False-Positive Error (FPE). In the case of FPE, the wrong judgment may result in unnecessary corrective action. The second error type is the False-Negative Error (FNE), which mainly misjudges a malicious program as normal. Compared to FPE, FNE is more fatal. Thus, when considering the total cost of misclassification in IDS, it is more reasonable to assign heavier weights to FNE than to FPE. Therefore, we designed our proposed intrusion detection model to optimize the classification threshold in order to minimize the total misclassification cost. In this case, conventional SVM cannot be applied because it is designed to generate discrete output (i.e., a class). To resolve this problem, we used the revised SVM technique proposed by Platt (2000), which is able to generate a probability estimate. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 1,000 samples from them by random sampling. In addition, the SVM model was compared with logistic regression (LOGIT), decision trees (DT), and ANN to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, and ANN was run using Neuroshell 4.0. For SVM, LIBSVM v2.90, a freeware tool for training SVM classifiers, was used. Empirical results showed that our proposed SVM-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that our model reduced the total misclassification cost compared to the ANN-based intrusion detection model.
As a result, it is expected that the intrusion detection model proposed in this paper will not only enhance the performance of IDS but also lead to better management of FNE.
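
The probability-output step attributed to Platt can be sketched as fitting a sigmoid to the raw SVM decision scores. This is a simplified illustration: it uses plain gradient descent and raw 0/1 targets instead of the optimizer and smoothed targets of Platt's original procedure, and the scores, learning rate, and function names are assumptions, not details from the paper or from LIBSVM.

```python
import math


def sigmoid_prob(score, a, b):
    """Platt's form: P(y=1 | score) = 1 / (1 + exp(a * score + b))."""
    z = a * score + b
    # Numerically stable evaluation for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit a and b by gradient descent on the mean negative log-likelihood."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid_prob(s, a, b)
            # d(NLL)/dz = y - p, because p = sigmoid(-z) with z = a*s + b.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical SVM decision values (signed distance to the hyperplane).
scores = [-2.0, -1.2, -0.4, 0.3, 1.1, 1.9]
labels = [0, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
calibrated = [sigmoid_prob(s, a, b) for s in scores]
print([round(p, 3) for p in calibrated])
```

Once the scores are mapped to probabilities, a cost-minimizing threshold can be chosen on them instead of using the sign of the decision value.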

Dispute Settlement in Construction Contracts Under FIDIC (FIDIC에 의한 건설계약 분쟁 해결방안에 관한 연구)

  • Kim, Seong-Chirl;Jung, Byeong-Hwa
    • Journal of the Korea Institute of Building Construction / v.10 no.4 / pp.21-29 / 2010
  • International construction contractors are often faced with the situation of working in an unfamiliar construction environment. Under FIDIC rules, the contractor has the right to make a claim requesting the consulting engineer for an adjustment to the contract price or the time for completion when a part or parts of the works have changed, or in the event of unforeseeable conditions. Contractors generally have more access to the costs and time implications of such a change or unforeseeable conditions than the consulting engineer or outside neutrals. Due to such an asymmetry of information, the contractor may be motivated to dispute frivolous claims of less merit, expecting erroneous judgments by the consulting engineer or the neutrals. In this paper, a claiming behavior model is presented by using game theory and experience data to study the manner in which frivolous claims develop into disputes. The model also analyzes the impacts of DAB/DRB upon the frivolous claims.

Efficient IoT data processing techniques based on deep learning for Edge Network Environments (에지 네트워크 환경을 위한 딥 러닝 기반의 효율적인 IoT 데이터 처리 기법)

  • Jeong, Yoon-Su
    • Journal of Digital Convergence / v.20 no.3 / pp.325-331 / 2022
  • As IoT devices are used in various ways in edge network environments, multiple studies are being conducted that utilize the information collected from IoT devices in various applications. However, it is not easy to apply accurate IoT data immediately, because IoT data are frequently lost or corrupted depending on the network environment (interference, etc.). To minimize errors in IoT data collected in an edge network environment, this paper proposes a management technique that ensures the reliability of IoT data by randomly generating signature values for the IoT data and allocating only Security Information (SI) values to the IoT data in bit form. The proposed technique binds IoT data into a blockchain by applying multiple hash chains to asymmetrically link and process the data collected from IoT devices. The blockchain-linked IoT data then use a probability function weighted according to a correlation index based on deep learning. In addition, the proposed technique can expand grouped IoT data into an n-layer structure and operate on it to lower the integrity and processing cost of IoT data.
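
The idea of linking readings through a hash chain can be shown with a generic single-chain sketch. It does not reproduce the paper's multiple chains, signature values, SI bits, or deep-learning weights; the function names, the SHA-256 choice, and the sample payloads are assumptions for illustration only.

```python
import hashlib


def link_hash(prev_hash, payload):
    """Hash of the previous link concatenated with the current payload."""
    return hashlib.sha256(prev_hash + payload).digest()


def build_chain(readings, seed=b"genesis"):
    """Return (payload, link) pairs where each link depends on all earlier data."""
    chain = []
    prev = hashlib.sha256(seed).digest()
    for payload in readings:
        prev = link_hash(prev, payload)
        chain.append((payload, prev))
    return chain


def verify_chain(chain, seed=b"genesis"):
    """Recompute every link; any altered payload breaks the chain from there on."""
    prev = hashlib.sha256(seed).digest()
    for payload, stored in chain:
        prev = link_hash(prev, payload)
        if prev != stored:
            return False
    return True


# Hypothetical sensor payloads.
readings = [b"temp=21.5", b"temp=21.7", b"temp=35.0"]
chain = build_chain(readings)
print(verify_chain(chain))

# Tamper with the second reading while keeping its stored link.
tampered = list(chain)
tampered[1] = (b"temp=99.9", chain[1][1])
print(verify_chain(tampered))
```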