• Title/Summary/Keyword: logistic regression analysis technique (로지스틱회귀분석기법)

Search Result 155, Processing Time 0.028 seconds

Evaluation and Analysis of Gwangwon-do Landslide Susceptibility Using Logistic Regression (로지스틱 회귀분석 기법을 이용한 강원도 산사태 취약성 평가 및 분석)

  • Yeon, Young-Kwang
    • Journal of the Korean Association of Geographic Information Studies / v.14 no.4 / pp.116-127 / 2011
  • This study analyzed landslide susceptibility using logistic regression. A prediction model should be evaluated on two aspects: goodness of fit and prediction accuracy. To obtain more objective results, the applied model was therefore evaluated on both. The study area lies between Inje-eup and Buk-myeon in central Gangwon-do, where landslides were triggered by heavy rain in 2006. Landslide causal factors were extracted from topographic, forest, and soil maps. The model was assessed using the area under the curve of the cumulative gain chart. The experiments yielded 87.9% for goodness of fit and 84.8% for cross-validation, indicating good prediction accuracy and only a small difference between the two evaluation methods. These results can be attributed to the use of environmental factors that are closely related to landslide occurrence and to the accuracy of the prediction model.
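The evaluation measure named in this abstract, the area under the cumulative gain chart, can be sketched as follows. This is a minimal illustration and not the author's implementation: the function name, the toy scores, and the labels are all hypothetical, and tied scores are not given any special treatment.

```python
def gain_chart_auc(scores, labels):
    """Area under the cumulative gain chart.

    x-axis: fraction of cells examined, ranked from highest to lowest score.
    y-axis: fraction of all observed events captured so far.
    """
    total_events = sum(labels)
    if total_events == 0:
        raise ValueError("need at least one positive label")
    # Rank cells from most to least susceptible.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    area = 0.0
    captured_prev = 0.0
    hits = 0
    for _, label in ranked:
        hits += label
        captured = hits / total_events
        # Trapezoid of width 1/n between consecutive points on the curve.
        area += (captured_prev + captured) / (2 * n)
        captured_prev = captured
    return area


# Hypothetical toy data: 1 = landslide cell, 0 = stable cell.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(round(gain_chart_auc(scores, labels), 3))
```

Computing this area once on the training cells and once on held-out cells would correspond to the goodness-of-fit and cross-validation figures, respectively.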

Machine-Learning Evaluation of Factors Influencing Landslides (머신러닝기법을 이용한 산사태 발생인자의 영향도 분석)

  • Park, Seong-Yong;Moon, Seong-Woo;Choi, Jaewan;Seo, Yong-Seok
    • The Journal of Engineering Geology / v.31 no.4 / pp.701-718 / 2021
  • Geological field surveys and a series of laboratory tests were conducted to obtain landslide-related data in Sancheok-myeon, Chungju-si, Chungcheongbuk-do, South Korea, where many landslides occurred in the summer of 2020. The influence of various factors on landslide occurrence was evaluated using logistic regression analysis and an artificial neural network. During the field surveys, undisturbed specimens were sampled according to landslide occurrence, and soil-layer depth was measured by dynamic cone penetration testing. Laboratory tests followed ASTM International standards. To address multicollinearity, the variance inflation factor was calculated for all landslide-related factors, and nine factors (soil depth, shear strength, lithology, saturated water content, specific gravity, hydraulic conductivity, USCS, slope angle, and elevation) were selected as influential factors for the machine learning techniques. Min-max normalization was applied so that the factors could be compared directly with each other. Logistic regression analysis identified soil depth, slope angle, saturated water content, and shear strength as having the greatest influence (in that order) on landslide occurrence. Artificial neural network analysis ranked the factors in the order of slope angle, soil depth, saturated water content, and shear strength. The arithmetic mean of the effectiveness from both analyses gave slope angle, soil depth, saturated water content, and shear strength as the top four factors, with a combined effectiveness of about 70%.
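The min-max normalization step mentioned in this abstract can be sketched as below. It is only an illustration of the general technique: the factor values are made up, and the handling of a constant column (mapping it to zeros) is an assumption, not something the abstract specifies.

```python
def min_max_normalize(values):
    """Rescale values linearly to [0, 1] so factors with different units are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant factor carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measurements for two factors on very different scales.
slope_angle_deg = [12.0, 25.0, 31.0, 40.0]
soil_depth_m = [0.4, 0.9, 1.2, 2.0]

slope_scaled = min_max_normalize(slope_angle_deg)
depth_scaled = min_max_normalize(soil_depth_m)
print(slope_scaled)
print(depth_scaled)
```

After rescaling, fitted coefficients on different factors refer to the same 0-to-1 range, which is what makes a direct comparison of their magnitudes meaningful.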

Analyzing Customer Management Data by Datamining (Focused on Apartment Customer Classification) (데이터마이닝을 통한 고객관리데이터의 분석 (아파트고객 세분화를 중심으로))

  • Baek, Shin Jung
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.69-72 / 2004
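  • As competition among companies intensifies and awareness of the importance of information grows, CRM data mining, which extracts valuable data from large volumes of data, has become a major concern. Among the many applications of data mining, this study compares three techniques that have recently been widely used for customer segmentation (logistic regression, decision trees, and neural networks) and validates them on real apartment customer data. The aim is to present criteria for selecting a technique, and for comparing and evaluating techniques, when performing data mining for apartment customer segmentation.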

Principal Components Logistic Regression based on Robust Estimation (로버스트추정에 바탕을 둔 주성분로지스틱회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Jang, Hea-Won
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.531-539 / 2009
  • Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. We therefore propose robust principal components logistic regression to deal with both the multicollinearity and the outlier problems. A procedure based on the condition index is suggested for selecting principal components: when a condition index is larger than the cutoff value obtained from a model constructed on the basis of conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ a robust estimation algorithm that dampens the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by the V-mask type criterion. Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.
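The condition-index rule for dropping principal components can be illustrated for the simplest case of two standardized regressors, whose correlation matrix has eigenvalues 1 + |r| and 1 - |r|. The sketch below uses the usual definition of the condition index, the square root of the largest eigenvalue divided by each eigenvalue. The data and the cutoff are hypothetical placeholders; the paper derives its cutoff from a conjoint-analysis model, which is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def condition_indices(x, y):
    """Condition indices of the 2x2 correlation matrix of two regressors."""
    r = abs(pearson_r(x, y))
    eigenvalues = [1.0 + r, 1.0 - r]  # largest first
    largest = eigenvalues[0]
    return [math.sqrt(largest / lam) if lam > 0 else math.inf
            for lam in eigenvalues]


# Hypothetical, nearly collinear regressors.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]

HYPOTHETICAL_CUTOFF = 10.0  # placeholder, not the paper's cutoff
indices = condition_indices(x, y)
# Keep only components whose condition index does not exceed the cutoff.
kept_components = [i for i, ci in enumerate(indices) if ci <= HYPOTHETICAL_CUTOFF]
print(indices, kept_components)
```

With more than two regressors the same rule applies, but the eigenvalues would come from a numerical eigendecomposition of the full correlation matrix.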

Prediction Modeling through Quantification for Qualitative Variables (질적변수에 대한 계량화를 통한 사면붕괴 예측모형)

  • Na, Jong-Hwa;Yu, Hye-Kyung;Nam, Eun-Mi;Cho, Wan-Sup
    • Journal of Korea Society of Industrial Information Systems / v.14 no.5 / pp.281-288 / 2009
  • The purpose of this paper is to provide statistical models for landslide prediction using quantification and AHP methods. Quantification is a statistical method that assigns numerical values to qualitative variables by analyzing observed data. We propose a quantification process based on the results of canonical correlation analysis. In contrast to quantification, which is based on observed data, the AHP (Analytic Hierarchy Process) technique is based on questionnaire data, usually collected from experts. We analyze both real data (provided by KIGAM) and questionnaire data collected from experts in various related fields. We developed two kinds of evaluation tables that provide landslide possibility scores, as well as a logistic model that provides the probability of landslide occurrence. Finally, we compare the performance and evaluate the stability of the two proposed models.

Assessment of Freeway Crash Risk using Probe Vehicle Accelerometer (프로브차량 가속도센서를 이용한 고속도로 교통사고 위험도 평가기법)

  • Park, Jae-Hong;Oh, Cheol;Kang, Kyeong-Pyo
    • International Journal of Highway Engineering / v.13 no.2 / pp.49-56 / 2011
  • Understanding the various causal factors that affect the occurrence of freeway traffic crashes is the backbone of deriving effective countermeasures. The first step toward understanding such factors is to identify crash risks on freeways. Unlike existing studies, this study focused on unsafe vehicle maneuvering that can be detected by in-vehicle sensors. Recent advances in sensor technology make it possible to gather and analyze detailed microscopic events leading to a crash, such as abrupt changes in acceleration. This study used an accelerometer to capture unsafe events. A set of candidate variables representing unsafe events was derived by analyzing the acceleration data obtained from the accelerometer. The crash risk was then modeled using binary logistic regression, so that the proposed model provides a probabilistic estimate of crash risk. An application of the methodology for assessing crash risk is presented, and further research needed for successful field implementation is discussed.

Development of a Logistic Regression Model for Analyzing Site Characteristics of Tombs Surrounding Expressway in Aerial Photographs (항공사진에 나타난 고속국도 주변 묘지의 입지 분석을 위한 로지스틱 회귀모형의 개발)

  • Han, Hee;Seol, A-Ra;Chung, JooSang
    • Journal of the Korean Association of Geographic Information Studies / v.11 no.4 / pp.193-202 / 2008
  • The objectives of this study are to analyze the spatial site characteristics of existing tombs and the change in their spatial distribution over time. The spatial distribution of tombs located in Honam province along the Honam expressway was investigated by interpreting digital aerial photographs taken at two points in time, 1990 and 2000. The tombs newly observed in the 2000 photographs were located closer to roads and villages than those found in the 1990 photographs, indicating that accessibility has become a more important consideration in choosing tomb sites. Gentle slopes with a southern aspect were also found to be favored as tomb sites. Based on the data sets of tomb locations and their topographic site characteristics, a probability function for tomb appearance in the study area was derived using logistic regression analysis. The logistic regression model classified tomb sites with 74.7% accuracy. All six input factors (elevation, slope, aspect, and distance from roads, the town, and the stream) significantly affected the probability of tomb appearance.

Landslide susceptibility mapping using Logistic Regression and Fuzzy Set model at the Boeun Area, Korea (로지스틱 회귀분석과 퍼지 기법을 이용한 산사태 취약성 지도작성: 보은군을 대상으로)

  • Al-Mamun, Al-Mamun;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.23 no.2 / pp.109-125 / 2016
  • This study aims to identify the landslide-susceptible zones of the Boeun area and to provide reliable landslide susceptibility maps by applying different modeling methods. Aerial photographs and a field survey of the Boeun area were used to build a landslide inventory map consisting of 388 landslide locations. A total of seven landslide causative factors (elevation, slope angle, slope aspect, geology, soil, forest, and land use) were extracted from the database and converted into raster format. The spatial relationship between each factor and landslide occurrence was investigated using a fuzzy set model and a logistic regression model. Fuzzy membership values and logistic regression coefficients were used to determine each factor's rating for landslide susceptibility mapping. The landslide susceptibility maps were then compared and validated by cross-validation. In the cross-validation process, 50% of the observed landslides were selected randomly using Excel, and two success rate curves (SRC) were generated for each landslide susceptibility map. The results show accuracy ratios of 84.34% for the logistic regression model and 83.29% for the fuzzy set model, indicating that both models are reliable and reasonable methods for landslide susceptibility analysis.

Statistical Analysis for Risk Factors and Prediction of Hypertension based on Health Behavior Information (건강행위정보기반 고혈압 위험인자 및 예측을 위한 통계분석)

  • Heo, Byeong Mun;Kim, Sang Yeob;Ryu, Keun Ho
    • Journal of Digital Contents Society / v.19 no.4 / pp.685-692 / 2018
  • The purpose of this study is to develop a prediction model of hypertension in middle-aged adults using statistical analysis. The statistical analysis and prediction models were developed using the National Health and Nutrition Survey (2013-2016). Binary logistic regression analysis identified statistically significant risk factors for hypertension, and a predictive model was developed using logistic regression and the Naive Bayes algorithm with a wrapper approach. In the statistical analysis, WHtR (p<0.0001, OR = 2.0242) in men and AGE (p<0.0001, OR = 3.9185) in women were the factors most strongly related to hypertension. In the performance evaluation of the prediction models, the logistic regression model showed the best predictive power in both men (AUC = 0.782) and women (AUC = 0.858). Our findings provide important information for developing large-scale screening tools for hypertension and can be used as a basis for hypertension research.
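The odds ratios quoted in this abstract relate to logistic regression coefficients through OR = exp(beta). The sketch below shows that conversion and how a fitted model turns features into a probability. Only the two OR values come from the abstract; the function names are illustrative, and the coding of the predictors behind those ORs is not stated in the abstract.

```python
import math


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a predictor with coefficient beta."""
    return math.exp(beta)


def coefficient_from_odds_ratio(ratio):
    """Logistic regression coefficient implied by a reported odds ratio."""
    return math.log(ratio)


def predicted_probability(intercept, coefficients, features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Odds ratios reported in the abstract.
beta_whtr_men = coefficient_from_odds_ratio(2.0242)
beta_age_women = coefficient_from_odds_ratio(3.9185)
print(round(beta_whtr_men, 4), round(beta_age_women, 4))
```

An odds ratio above 1 corresponds to a positive coefficient, so both reported factors raise the modeled odds of hypertension as they increase.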