• Title/Summary/Keyword: 혼동행렬 (confusion matrix)

Search Result 33, Processing Time 0.027 seconds

Alternative Optimal Threshold Criteria: MFR (대안적인 분류기준: 오분류율곱)

  • Hong, Chong Sun;Kim, Hyomin Alex;Kim, Dong Kyu
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.5
    • /
    • pp.773-786
    • /
    • 2014
  • We propose the multiplication of false rates (MFR), a classification accuracy criterion that corresponds to the area of a rectangle derived from the ROC curve. The optimal threshold obtained using MFR is compared with those from other criteria in terms of classification performance. Optimal thresholds for various distribution functions are also found; consequently, some properties and advantages of MFR are discussed by comparing the FNR and FPR corresponding to the optimal thresholds. Based on a general cost function, cost ratios of optimal thresholds are computed using various classification criteria. The cost ratios for cost curves are observed so that the advantages of MFR are explored. Furthermore, the definition of MFR is extended to multi-dimensional ROC analysis, and the relations among classification criteria are also discussed.

Partial AUC and optimal thresholds (부분 AUC와 최적분류점들)

  • Hong, Chong Sun;Cho, Hyun Su
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.2
    • /
    • pp.187-198
    • /
    • 2019
  • Extensive literature exists on how to estimate optimal thresholds based on various accuracy measures using receiver operating characteristic (ROC) and cumulative accuracy profile (CAP) curves. This paper proposes an alternative measure to represent a specific partial area under the ROC and CAP curves. The relationship between the ROC and CAP functions is examined using differential equations of the newly defined partial areas under the curves. In addition, the relationship with the optimal thresholds under various accuracy measures for the ROC and CAP functions is derived. Before finding the optimal thresholds, we assume that the mixed distribution is composed of two distribution functions, taken to be various normal distributions. The corresponding type 1 and type 2 errors are also explored and discussed under various conditions for the accuracy measures.
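
The ROC and CAP quantities mentioned in this abstract can be illustrated with a minimal sketch. It assumes two normal score distributions and the standard textbook definitions of the ROC curve, the CAP x-coordinate (the share of all cases flagged), and a partial area under the ROC curve. The function names, parameter values, and prevalence are hypothetical, and the paper's own alternative partial-area measure is not reproduced here.

```python
from statistics import NormalDist


def roc_cap_point(t, neg, pos, prevalence):
    """Return (FPR, TPR, CAP x) when scores above t are classified positive.

    neg and pos are the assumed score distributions of the two classes;
    prevalence is the assumed share of positives in the mixed population.
    """
    fpr = 1.0 - neg.cdf(t)
    tpr = 1.0 - pos.cdf(t)
    # CAP x-axis: proportion of the whole population that is flagged.
    cap_x = prevalence * tpr + (1.0 - prevalence) * fpr
    return fpr, tpr, cap_x


def partial_auc(neg, pos, fpr_max=1.0, steps=4000):
    """Trapezoidal area under the ROC curve for FPR in [0, fpr_max]."""
    area = 0.0
    prev_fpr, prev_tpr = 0.0, 0.0
    for i in range(1, steps + 1):
        fpr = fpr_max * i / steps
        if fpr >= 1.0:
            tpr = 1.0  # the ROC curve ends at (1, 1)
        else:
            # Invert FPR = 1 - F_neg(t) to recover the threshold.
            t = neg.inv_cdf(1.0 - fpr)
            tpr = 1.0 - pos.cdf(t)
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


# Hypothetical example: negatives ~ N(0, 1), positives ~ N(1, 1).
neg = NormalDist(0.0, 1.0)
pos = NormalDist(1.0, 1.0)
fpr, tpr, cap_x = roc_cap_point(0.5, neg, pos, prevalence=0.2)
full_auc = partial_auc(neg, pos)
pauc_20 = partial_auc(neg, pos, fpr_max=0.2)
```

With equal variances the full area can be checked against the closed form Φ((μ₁ − μ₀)/(σ√2)), which is a convenient sanity check for the numerical integration.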

Odds curve and optimal threshold (오즈 곡선과 최적분류점)

  • Hong, Chong Sun;Oh, Tae Gyu;Oh, Se Hyeon
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.5
    • /
    • pp.807-822
    • /
    • 2021
  • Various accuracy measures that can be explained on the odds curve are discussed, and an alternative accuracy measure, the maximum square, is proposed based on the characteristics of the odds curve. Thresholds corresponding to these accuracy measures are obtained by considering various probability distribution functions and an illustrative example. Their characteristics are discussed while comparing many kinds of statistics measuring thresholds. Therefore, we can conclude that optimal thresholds could be explored from the odds curve, similar to the ROC curve, and that the maximum square measure can be used as a good accuracy measure that can improve the performance of the binary classification model.

Optimal Polarization Combination Analysis for SAR Image-Based Hydrographic Detection (SAR 영상 기반 수체탐지를 위한 최적 편파 조합 분석)

  • Sungwoo Lee;Wanyub Kim;Seongkeun Cho;Minha Choi
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.359-359
    • /
    • 2023
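  • As natural disasters such as floods and droughts increase with recent climate change, there is a growing need for solutions that can detect and prevent them proactively. Continuous monitoring of available water resources such as rivers and reservoirs is essential for preventing such water-related disasters. SAR satellite imagery allows continuous water body detection regardless of time of day or weather conditions. SAR-based water body detection generally uses co-polarized imagery, in which the transmit and receive polarizations are the same. However, co-polarized imagery is sensitive to wind and rainfall, so water bodies may go undetected. Cross-polarized imagery, in which the transmit and receive polarizations differ, is insensitive to rainfall and wind but sensitive to vegetation, which leads to a high false detection rate for water bodies. Because water body detection accuracy thus varies with the polarization characteristics of SAR imagery, constructing an optimal combination of polarization images is important. In this study, water body detection was performed using VV, VH, and VV+VH polarization images from the Sentinel-1 SAR satellite together with SVM (support vector machine), a machine learning algorithm. Evaluation indices based on the confusion matrix were used to validate the detection results for each polarization combination. By comparing and analyzing the detection results, the optimal band combination for SAR-based water body detection was derived. Based on these results, if SAR imagery with high spatiotemporal resolution becomes available in the future, the efficiency of water-related disaster and water resources management is expected to improve.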


Time-Invariant Stock Movement Prediction After Golden Cross Using LSTM

  • Sumin Nam;Jieun Kim;ZoonKy Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.8
    • /
    • pp.59-66
    • /
    • 2023
  • The Golden Cross is commonly seen as a buy signal in financial markets, but its reliability for predicting stock price movements is limited due to market volatility. This paper introduces a time-invariant approach that considers the Golden Cross as a singular event. Utilizing LSTM neural networks, we forecast significant stock price changes following a Golden Cross occurrence. By comparing our approach with traditional time series analysis and using a confusion matrix for classification, we demonstrate its effectiveness in predicting post-event stock price trends. To conclude, this study proposes a model with a precision of 83%. By utilizing the model, investors can alleviate potential losses, rather than making buy decisions under all circumstances following a Golden Cross event.

Analysis of Feature Importance of Ship's Berthing Velocity Using Classification Algorithms of Machine Learning (머신러닝 분류 알고리즘을 활용한 선박 접안속도 영향요소의 중요도 분석)

  • Lee, Hyeong-Tak;Lee, Sang-Won;Cho, Jang-Won;Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.26 no.2
    • /
    • pp.139-148
    • /
    • 2020
  • The most important factor affecting the berthing energy generated when a ship berths is the berthing velocity. Thus, an accident may occur if the berthing velocity is extremely high. Several ship features influence the determination of the berthing velocity. However, previous studies have mostly focused on the size of the vessel. Therefore, the aim of this study is to analyze various features that influence berthing velocity and determine their respective importance. The data used in the analysis was based on the berthing velocity of a ship on a jetty in Korea. Using the collected data, machine learning classification algorithms were compared and analyzed, such as decision tree, random forest, logistic regression, and perceptron. As an algorithm evaluation method, indexes according to the confusion matrix were used. Consequently, perceptron demonstrated the best performance, and the feature importance was in the following order: DWT, jetty number, and state. Hence, when berthing a ship, the berthing velocity should be determined in consideration of various features, such as the size of the ship, position of the jetty, and loading condition of the cargo.
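
The abstract does not say which confusion-matrix indexes were used, so the following is only a minimal sketch of the common ones (accuracy, precision, recall, F1) for a binary case. The labels below are made-up toy data, not the study's berthing data, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, FN, TN for a binary problem from paired label lists."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def summary_indexes(tp, fp, fn, tn):
    """Derive common evaluation indexes from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = "high berthing velocity", 0 = otherwise.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
indexes = summary_indexes(*counts)
```

Comparing several classifiers then amounts to computing these indexes for each model's predictions on the same held-out labels.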

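Relationships among Gravitational Acceleration, Acceleration due to Gravity, and Accelerometer Measurements (중력장 가속도, 중력 가속도, 그리고 가속도계 측정값 사이의 관계)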

  • Lee, Hyeong-Geun
    • ICROS
    • /
    • v.16 no.3
    • /
    • pp.40-45
    • /
    • 2010
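  • When a user without background knowledge of inertial sensors tries to use an accelerometer to measure the motion of an object, the confusion caused by the sensor's name leads them to expect that the object's motion acceleration can be obtained easily. However, what an accelerometer actually measures is the acceleration due to specific force, so without appropriate processing the object's motion acceleration cannot be obtained as expected. To extract motion acceleration from accelerometer measurements, it is necessary to clearly distinguish and understand the relationships among gravitational acceleration, acceleration due to gravity, acceleration due to specific force, and motion acceleration. This article extends the concepts covered in previous articles, such as (bar) vectors, coordinate values, coordinate frames, coordinate transformation matrices, and the Coriolis effect, to distinguish and explain the various notions of acceleration.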

Undecided inference using logistic regression for credit evaluation (신용평가에서 로지스틱 회귀를 이용한 미결정자 추론)

  • Hong, Chong-Sun;Jung, Min-Sub
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.2
    • /
    • pp.149-157
    • /
    • 2011
  • Undecided inference can be regarded as a missing data problem under mechanisms such as MAR and MNAR. Under the assumption of MAR, undecided inference uses a logistic regression model. The probability of default for the undecided group is obtained using the regression coefficient vectors for the decided group and compared with the probability of default for the decided group. Under the assumption of MNAR, undecided inference uses a logistic regression model with an additional feature random vector. Simulation results based on two kinds of real data are obtained and compared. It is found that the misclassification rates are not much different from the rate for the raw data under the assumption of MAR. However, the misclassification rates under the assumption of MNAR are lower than those under the assumption of MAR, and as the ratio of the undecided group increases, the misclassification rates decrease.

Predicting Early Retirees Using Personality Data (인성 데이터를 활용한 조기 퇴사자 예측)

  • Kim, Young Park;Kim, Hyoung Joong
    • Journal of Digital Contents Society
    • /
    • v.19 no.1
    • /
    • pp.141-147
    • /
    • 2018
  • This study analyzed employees who retired early, staying with the company no longer than 3 years, based on personality evaluation result data from a certain company. The prediction model was analyzed separately for two groups: the manufacture group and the R&D group. Independent variables were selected using the stepwise method. A logistic regression model was selected as the prediction model among various supervised learning methods, and it was trained with cross-validation to prevent over-fitting or under-fitting. The accuracy for the two groups was confirmed using the confusion matrix. The most influential factor for early retirement was "immersion" in the manufacture group and "antisocial" in the R&D group. In the past, studies concentrated on collecting data by questionnaire and identifying factors highly related to retirement, but this study suggests a sustainable early retirement prediction model for the future by analyzing the tangible outcome of the recruitment process.

ROC evaluation for MLP ANN drought forecasting model (MLP ANN 가뭄 예측 모형에 대한 ROC 평가)

  • Jeong, Min-Su;Kim, Jong-Suk;Jang, Ho-Won;Lee, Joo-Heon
    • Journal of Korea Water Resources Association
    • /
    • v.49 no.10
    • /
    • pp.877-885
    • /
    • 2016
  • In this study, the Standardized Precipitation Index (SPI), a meteorological drought index, was used to assess drought forecasting results temporally and spatially across Korea. For the drought forecasting, the Multi Layer Perceptron-Artificial Neural Network (MLP-ANN) was selected, and forecasting was performed at different lead times for SPI (3) and SPI (6). Precipitation data observed at 59 gauging stations of the Korea Meteorological Administration (KMA) from 1976 to 2015 were used. For the performance evaluation of the drought forecasting, a binary classification confusion matrix was constructed by evaluating the status of drought occurrence against a threshold. The Receiver Operating Characteristics (ROC) score and F score according to conditional probability were then computed. The ROC analysis shows that the MLP-ANN model delivers satisfactory drought forecasting performance. Consequently, two-month and five-month lead forecasts were possible for SPI (3) and SPI (6), respectively.
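
The threshold-based construction of the confusion matrix described in this abstract can be sketched roughly as follows. The drought threshold of SPI ≤ -1.0, the sample values, and the function names are assumptions for illustration only; the abstract does not state the threshold or the scores' exact definitions.

```python
def drought_events(spi_values, threshold=-1.0):
    """Mark each time step as a drought event when SPI is at or below threshold.

    The default threshold of -1.0 is an assumed value, not taken from the paper.
    """
    return [v <= threshold for v in spi_values]


def hit_and_false_alarm_rates(observed_spi, forecast_spi, threshold=-1.0):
    """Build the 2x2 confusion matrix and return (hit rate, false alarm rate)."""
    obs = drought_events(observed_spi, threshold)
    fc = drought_events(forecast_spi, threshold)
    hits = sum(o and f for o, f in zip(obs, fc))
    misses = sum(o and not f for o, f in zip(obs, fc))
    false_alarms = sum(f and not o for o, f in zip(obs, fc))
    correct_negatives = sum(not o and not f for o, f in zip(obs, fc))
    observed_yes = hits + misses
    observed_no = false_alarms + correct_negatives
    hit_rate = hits / observed_yes if observed_yes else 0.0
    false_alarm_rate = false_alarms / observed_no if observed_no else 0.0
    return hit_rate, false_alarm_rate


# Made-up SPI series for illustration only.
observed = [-1.5, -0.2, 0.4, -1.1, -2.0, 0.8]
forecast = [-1.2, -1.3, 0.1, -0.6, -1.8, 0.5]
hit_rate, false_alarm_rate = hit_and_false_alarm_rates(observed, forecast)
```

Each threshold yields one (false alarm rate, hit rate) pair, so sweeping the threshold traces out points in ROC space.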