• Title/Abstract/Keywords: logistic classification

383 search results (processing time: 0.026 seconds)

Prediction of Academic Performance of College Students with Bipolar Disorder using different Deep learning and Machine learning algorithms

  • Peerbasha, S.;Surputheen, M. Mohamed
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 7
    • /
    • pp.350-358
    • /
    • 2021
  • In recent years, analysing student performance has posed many difficulties, and it remains an important problem for all academic institutions. The main idea of this paper is to analyze and evaluate the academic performance of college students with bipolar disorder by applying data mining classification algorithms using Jupyter Notebook, a Python tool. This tool has been widely used as a decision-making tool for the academic performance of students. The classifiers used are logistic regression, random forest classifier (gini), random forest classifier (entropy), decision tree classifier, K-Neighbours classifier, Ada Boost classifier, Extra Tree Classifier, GaussianNB, and BernoulliNB. The resulting classification models are evaluated with 13 measures such as Accuracy, Precision, Recall, F1 Measure, Sensitivity, Specificity, R Squared, Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, TPR, TNR, FPR and FNR. The study concludes that the Decision Tree Classifier performs better than the other algorithms.
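Several of the measures named in this abstract (Accuracy, Precision, Recall, F1, Specificity, TPR, TNR, FPR, FNR) follow directly from the four cells of a binary confusion matrix. The sketch below shows those relationships only; the function names, the toy labels, and the choice to return 0.0 for an empty denominator are illustrative assumptions and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Assumption: report 0.0 when the denominator is empty instead of raising.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Derive the rate-based measures from the confusion-matrix cells."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = _ratio(tp, tp + fp)
    tpr = _ratio(tp, tp + fn)  # recall, sensitivity
    tnr = _ratio(tn, tn + fp)  # specificity
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": tpr,
        "f1": _ratio(2 * precision * tpr, precision + tpr),
        "tpr": tpr,
        "tnr": tnr,
        "fpr": _ratio(fp, fp + tn),
        "fnr": _ratio(fn, fn + tp),
    }


# Hypothetical labels, used only to exercise the functions.
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 0, 1, 0, 1, 0, 0, 1]
example_metrics = binary_metrics(example_true, example_pred)
```

Note that Recall, Sensitivity, and TPR are the same quantity, as are Specificity and TNR, so the listed measures are not all independent.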

A Study on Measures to Improve Satisfaction with Vocational Competency Development Training

  • 김태복;김광수
    • 대한안전경영과학회지
    • /
    • Vol. 25, No. 2
    • /
    • pp.167-174
    • /
    • 2023
  • The budget for vocational competency development training has recently been expanded, but the number of participants has decreased. As the budget for the Vocational Competency Development Project increases, the participation of a large number of people becomes necessary. This study selects factors related to respondent characteristics, training institutions, training types, and job performance, derives the factors that affect satisfaction with vocational competency development training, and studies ways to improve that satisfaction. Data were collected through focus group interviews (FGI), and logistic regression analysis was conducted after a feasibility review and reliability analysis. For the resulting model, the Hosmer and Lemeshow test showed that agreement between the actually measured cases and the cases predicted by the model was low, but the classification accuracy table gave an overall classification accuracy of 96.0%. As for the influence of the factors, the application of knowledge technology, training institution facility equipment, Business Collaboration, long-term work plan, and satisfaction with work performed were found to have an influence, in that order.

Classification of nuclear activity types for neighboring countries of South Korea using machine learning techniques with xenon isotopic activity ratios

  • Sang-Kyung Lee;Ser Gi Hong
    • Nuclear Engineering and Technology
    • /
    • Vol. 56, No. 4
    • /
    • pp.1372-1384
    • /
    • 2024
  • Discriminating the source of a xenon gas release can provide an important clue for detecting nuclear activities in neighboring countries. In this paper, three machine learning techniques, namely logistic regression, support vector machine (SVM), and k-nearest neighbors (KNN), were applied to develop predictive models for discriminating the source of a xenon gas release based on xenon isotopic activity ratio data, which were generated for the probable sources using the depletion codes ORIGEN in SCALE 6.2 and Serpent. The considered sources for the neighboring countries of South Korea include PWRs, CANDUs, IRT-2000, Yongbyun 5 MWe reactor, and nuclear tests with plutonium and uranium. The results of the analysis showed that the overall prediction accuracies of the SVM and KNN models using six inputs all exceeded 90%. In particular, the models based on SVM and KNN that used six or three xenon isotope activity ratios with three classification categories, namely reactor, plutonium bomb, and uranium bomb, had accuracy levels greater than 88%. The prediction performances demonstrate the applicability of machine learning algorithms to predicting nuclear threats using ratios of xenon isotopic activity.

Seismic risk priority classification of reinforced concrete buildings based on a predictive model

  • Isil Sanri Karapinar;Ayse E. Ozsoy Ozbay;Emin Ciftci
    • Structural Engineering and Mechanics
    • /
    • Vol. 91, No. 3
    • /
    • pp.279-289
    • /
    • 2024
  • The purpose of this study is to present a useful alternative for the preliminary seismic vulnerability assessment of existing reinforced concrete buildings by introducing a statistical approach employing the binary logistic regression technique. Two different predictive statistical models, namely full and reduced models, were generated utilizing building characteristics obtained from the damage database compiled after the 1999 Düzce earthquake. Among the inspected building parameters, number of stories, overhang ratio, priority index, soft story index, normalized redundancy ratio and normalized lateral stiffness index were specifically selected as the predictor variables for vulnerability classification. As a result, normalized redundancy ratio and soft story index were identified as the most significant predictors affecting seismic vulnerability in terms of life safety performance level. In conclusion, both models are capable of classifying the set of buildings that were severely damaged or collapsed with a balanced accuracy of 73%; hence, both are able to filter out high-priority buildings for life safety performance assessment. Thus, having the same accuracy as the full model, the reduced model using fewer predictors is proposed as a simple and viable classifier for determining life safety levels of reinforced concrete buildings in the preliminary seismic risk assessment.
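The abstract reports a balanced accuracy, which is usually defined as the mean of the per-class recalls and therefore does not reward a classifier for favouring the majority class. The sketch below illustrates that definition with hypothetical labels; it is not the authors' code, and the function names are assumptions.

```python
def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = {}
    hits = {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == pred:
            hits[truth] = hits.get(truth, 0) + 1
    return {label: hits.get(label, 0) / count for label, count in totals.items()}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def plain_accuracy(y_true, y_pred):
    """Share of all cases that were classified correctly."""
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


# Hypothetical labels: 1 = severely damaged or collapsed, 0 = otherwise.
damage_true = [1, 1, 1, 1, 0, 0]
damage_pred = [1, 1, 1, 0, 0, 1]
```

With these toy labels the per-class recalls are 3/4 and 1/2, so the balanced accuracy (0.625) is lower than the plain accuracy (4/6), which shows how the two measures diverge on imbalanced data.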

Parameter estimation of linear function using VUS and HUM maximization

  • 홍종선;원치환;정동길
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 26, No. 6
    • /
    • pp.1305-1315
    • /
    • 2015
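  • This study extends the classification model built on the single score variable that forms the ROC curve by considering a risk score defined as a function of a linear score, and uses AUC maximization to estimate the coefficients of the linear score. Even in general situations where the data do not satisfy the logistic assumption, the coefficient estimators obtained with this AUC approach give better estimates than the maximum likelihood estimators of the linear-score parameters under the logistic model. Considering realistic discrimination and prediction settings with multiple categories, this study proposes the VUS and HUM approaches, which extend the AUC approach. Simulations were carried out for three link functions, namely the logit, the complementary log-log, and a modified logit, and for various distributions of the classification thresholds. For multi-category discrimination, the VUS and HUM approaches, like the AUC approach, were confirmed to give parameter estimates equal to or better than those of the logistic model estimation method across the various link functions.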
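To make the quantities in this abstract concrete, the sketch below computes the empirical AUC (two classes) and the empirical VUS (three ordered classes) of a linear score, and picks the coefficient direction that maximizes the VUS. The grid search over unit vectors in two dimensions, the function names, and the toy data are all illustrative assumptions; the paper's actual optimization procedure is not described in the abstract.

```python
import math
from itertools import product


def linear_score(weights, features):
    """Linear score w . x for one observation."""
    return sum(w * x for w, x in zip(weights, features))


def empirical_auc(neg_scores, pos_scores):
    """Share of (negative, positive) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for neg, pos in product(neg_scores, pos_scores):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(neg_scores) * len(pos_scores))


def empirical_vus(scores_1, scores_2, scores_3):
    """Share of triples, one per class, that are strictly ordered s1 < s2 < s3."""
    ordered = sum(1 for a, b, c in product(scores_1, scores_2, scores_3) if a < b < c)
    return ordered / (len(scores_1) * len(scores_2) * len(scores_3))


def best_direction_by_vus(groups, n_angles=180):
    """Grid-search unit vectors in 2-D for the direction with the highest VUS.

    `groups` holds three lists of 2-D feature vectors, ordered by class.
    """
    best_weights, best_vus = None, -1.0
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        weights = (math.cos(theta), math.sin(theta))
        scores = [[linear_score(weights, x) for x in group] for group in groups]
        vus = empirical_vus(*scores)
        if vus > best_vus:
            best_weights, best_vus = weights, vus
    return best_weights, best_vus


# Hypothetical data: the three classes separate along the first feature only.
toy_groups = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(1.0, 0.0), (1.0, 1.0)],
    [(2.0, 0.0), (2.0, 1.0)],
]
toy_weights, toy_vus = best_direction_by_vus(toy_groups)
```

Because AUC and VUS depend only on the ordering of the scores, the coefficient vector is identifiable only up to a positive scale, which is why the sketch restricts the search to unit vectors.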

Study on Detection Technique for Cochlodinium polykrikoides Red tide using Logistic Regression Model under Imbalanced Data

  • 박수호;김흥민;김범규;황도현;엥흐자리갈 운자야;윤홍주
    • 한국전자통신학회논문지
    • /
    • Vol. 13, No. 6
    • /
    • pp.1353-1364
    • /
    • 2018
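  • This study proposes a method for detecting Cochlodinium polykrikoides red tide pixels in satellite images under imbalanced data, using a logistic regression model, which is one branch of machine learning. Water-leaving radiance spectral profiles extracted from red tide, clear water, and turbid water areas were used as training data. 70% of the full dataset was sampled for model training, and the remaining 30% was used to evaluate the classification accuracy of the model. The imbalanced data problem was addressed by oversampling the red tide spectral profiles, which were relatively few compared with those of clear and turbid water, through the addition of white noise. In the accuracy evaluation, the proposed algorithm showed a classification accuracy of about 94%.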
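The two data-handling steps mentioned in this abstract, a 70/30 split and oversampling of the minority class with white noise, can be sketched as follows. The noise level `sigma`, the target count, the toy spectra, and the choice to augment only the training portion are assumptions made for illustration; the abstract does not give these details.

```python
import random


def split_rows(rows, train_fraction, rng):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def oversample_with_noise(samples, target_count, sigma, rng):
    """Grow the minority class by adding Gaussian white noise to random originals."""
    augmented = [list(sample) for sample in samples]
    while len(augmented) < target_count:
        base = rng.choice(samples)
        augmented.append([value + rng.gauss(0.0, sigma) for value in base])
    return augmented


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical spectral profiles (one list of band values per pixel).
red_tide_profiles = [[0.10, 0.20, 0.15], [0.12, 0.22, 0.14]]

# Assumption: augment only the training portion so noisy copies never reach the test set.
augmented_red_tide = oversample_with_noise(red_tide_profiles, 5, 0.01, rng)

all_rows = list(range(10))  # stand-in for the full dataset
train_rows, test_rows = split_rows(all_rows, 0.7, rng)
```

The original profiles are kept unchanged at the front of the augmented list, and each synthetic profile is a noisy copy of one of them.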

Machine learning in survival analysis

  • 백재욱
    • 산업진흥연구
    • /
    • Vol. 7, No. 1
    • /
    • pp.1-8
    • /
    • 2022
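  • This paper examines machine learning methods that can be applied to survival data containing censored observations. First, exploratory data analysis revealed the distribution of each feature, the relationships among features, and their importance ranking. Next, the relationship between the features serving as independent variables and the feature serving as the dependent variable (death or survival) was treated as a classification problem, and machine learning methods such as logistic regression and K nearest neighbor were applied; although the dataset was small, random forest outperformed logistic regression, as is typical in machine learning results. However, methods recently regarded as high-performing, such as artificial neural network and gradient boost, did not perform markedly better, which is judged to be because the given data are not big data. Finally, conventional survival analysis methods such as Kaplan-Meier and the Cox proportional hazards model were applied to examine which independent variables have a decisive influence on the dependent variable (ti, δi), and random forest, a machine learning method, was also applied to the survival data containing censored observations to evaluate its performance.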

Evaluation on Performance of Accuracy for Analysis and Classification of Data Related to Industrial Accidents

  • 임영문;유창현
    • 대한안전경영과학회:학술대회논문집
    • /
    • 대한안전경영과학회 2006 Spring Joint Academic Conference
    • /
    • pp.51-56
    • /
    • 2006
  • Recently, data mining techniques have been used for the analysis and classification of data related to industrial accidents. The main objective of this study is to compare the performance of algorithms for the data analysis of industrial accidents. This paper provides a comparative analysis of five algorithms, CHAID, CART, C4.5, LR (Logistic Regression) and NN (Neural Network), using the ROC chart, lift chart and response threshold. In this study, data on 67,278 accidents were analyzed to create risk groups for a number of complications, including the risk of disease and accident. The sample for this work was chosen from data related to manufacturing industries over three years (2002~2004) in Korea. According to the analysis of the results, NN shows excellent performance for the data analysis and classification of industrial accidents.

Performance Comparison of Decision Trees of J48 and Reduced-Error Pruning

  • Jin, Hoon;Jung, Yong Gyu
    • International journal of advanced smart convergence
    • /
    • Vol. 5, No. 1
    • /
    • pp.30-33
    • /
    • 2016
  • With the advent of big data, data mining is increasingly utilized in various decision-making fields to extract hidden and meaningful information from large amounts of data. As the demand for uncovering the hidden meaning behind data grows exponentially, it becomes more and more important to decide which data mining algorithm to select and how to use it. Several data mining algorithms are mainly used in biology and clinical settings: logistic regression, neural networks, support vector machines, and a variety of statistical techniques. This paper compares the classification performance of two exemplary ML algorithms, J48 and REPTree. The performance comparison results confirm which classification algorithm is more accurate, and more accurate prediction is possible with that algorithm for the goal of the experiment. Based on this, detailed visual classification and distinction are expected to be relatively difficult.

Predicting the Performance of Forecasting Strategies for Naval Spare Parts Demand: A Machine Learning Approach

  • Moon, Seongmin
    • Management Science and Financial Engineering
    • /
    • Vol. 19, No. 1
    • /
    • pp.1-10
    • /
    • 2013
  • Hierarchical forecasting strategy does not always outperform direct forecasting strategy. The performance generally depends on demand features. This research guides the use of the alternative forecasting strategies according to demand features. This paper developed and evaluated various classification models such as logistic regression (LR), artificial neural networks (ANN), decision trees (DT), boosted trees (BT), and random forests (RF) for predicting the relative performance of the alternative forecasting strategies for the South Korean navy's spare parts demand which has non-normal characteristics. ANN minimized classification errors and inventory costs, whereas LR minimized the Brier scores and the sum of forecasting errors.
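The abstract contrasts classification error with the Brier score, and the two can rank models differently because the Brier score looks at the predicted probabilities rather than the thresholded labels. The sketch below illustrates both measures for a binary outcome; the 0.5 threshold, the function names, and the toy probabilities are assumptions, not values from the paper.

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    squared = [(p - y) ** 2 for y, p in zip(outcomes, probabilities)]
    return sum(squared) / len(squared)


def classification_error(outcomes, probabilities, threshold=0.5):
    """Share of cases misclassified after thresholding the probabilities."""
    wrong = sum(
        1
        for y, p in zip(outcomes, probabilities)
        if (1 if p >= threshold else 0) != y
    )
    return wrong / len(outcomes)


# Hypothetical outcomes: 1 = hierarchical strategy was better, 0 = direct was better.
toy_outcomes = [1, 0, 1]
confident_model = [0.9, 0.2, 0.6]
hesitant_model = [0.55, 0.45, 0.55]
```

Both toy models classify every case correctly at the 0.5 threshold, yet the hesitant model has the worse Brier score, which is one way a model can win on one criterion and lose on the other.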