• Title/Abstract/Keywords: Logistic models

Data Mining for Knowledge Management in a Health Insurance Domain

  • Chae, Young-Moon;Ho, Seung-Hee;Cho, Kyoung-Won;Lee, Dong-Ha;Ji, Sun-Ha
    • 지능정보연구 / Vol. 6, No. 1 / pp. 73-82 / 2000
  • This study examined the characteristics of knowledge discovery and data mining algorithms to demonstrate how they can be used to predict health outcomes and provide policy information for hypertension management, using the Korea Medical Insurance Corporation database. Specifically, the study validated the predictive power of data mining algorithms by comparing the performance of logistic regression with that of two decision tree algorithms, CHAID (Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5), because logistic regression has assumed a major position in the healthcare field as a method for predicting or classifying health outcomes based on the specific characteristics of each individual case. The comparison was performed using a test set of 4,588 beneficiaries and the training set of 13,689 beneficiaries that was used to develop the models. Contrary to the previous study, the CHAID algorithm performed better than logistic regression in predicting hypertension, whereas C5.0 had the lowest predictive power. In addition, the CHAID algorithm and association rules provided segment characteristics for the risk factors, which may be used in developing hypertension management programs. These results show that the data mining approach can be a useful analytic tool for predicting and classifying health outcomes data.
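The kind of comparison described in this abstract (fit two classifiers on a training set, then compare them on a held-out test set) can be sketched as follows. This is a minimal illustration, not the study's method: the data are synthetic, the two features are hypothetical, and a one-split decision stump stands in for CHAID and C5.0, which are considerably more elaborate. The resulting accuracies say nothing about the study's findings.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict_logistic(w, xi):
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if z >= 0 else 0


def fit_stump(X, y):
    """One-split tree: choose the feature/threshold with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({xi[j] for xi in X}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if xi[j] <= t else 1 - left_label) != yi
                    for xi, yi in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left_label)
    return best[1:]


def predict_stump(stump, xi):
    j, t, left_label = stump
    return left_label if xi[j] <= t else 1 - left_label


def accuracy(predict, model, X, y):
    return sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(y)


def make_data(n, rng):
    """Synthetic cases: two hypothetical standardized risk factors, label 1 if their sum > 0."""
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    y = [1 if x1 + x2 > 0 else 0 for x1, x2 in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(300, rng)  # used only to fit the models
X_test, y_test = make_data(100, rng)    # held out for the comparison

logit_model = fit_logistic(X_train, y_train)
stump_model = fit_stump(X_train, y_train)
acc_logit = accuracy(predict_logistic, logit_model, X_test, y_test)
acc_stump = accuracy(predict_stump, stump_model, X_test, y_test)
print(f"logistic regression: {acc_logit:.3f}, decision stump: {acc_stump:.3f}")
```

Keeping the test set out of model fitting is what makes the comparison a statement about predictive power rather than about goodness of fit.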

공간통합 모델을 적용한 암괴류 및 애추 지형 분포가능지 추출 (Extraction of Potential Area for Block Stream and Talus Using Spatial Integration Model)

  • 이성호;장동호
    • 한국지형학회지 / Vol. 26, No. 2 / pp. 1-14 / 2019
  • This study analyzed the relationship between block stream and talus distributions by employing a likelihood ratio approach. Possible distribution sites for each debris slope landform were extracted by applying a spatial integration model that combined a fuzzy set model, a Bayesian predictive model, and a logistic regression model. To verify model performance, a success rate curve was prepared by cross-validation. The results showed that elevation, slope, curvature, topographic wetness index, geology, soil drainage, and soil depth were closely related to the debris slope landform sites. In addition, all spatial integration models displayed an accuracy of over 90%. The accuracy of the distribution potential area map of the block stream was highest in the logistic regression model (93.79%). Likewise, the accuracy of the distribution potential area map of the talus was highest in the logistic regression model (97.02%). We expect that the present results will provide essential data and methodologies for more efficient and systematic micro-landform studies. Moreover, our research may help to enhance field research and topographic resource management.

효율적인 신용평가를 위한 데이터마이닝 모형의 비교.분석에 관한 연구 (Study on the Comparison and Analysis of Data Mining Models for the Efficient Customer Credit Evaluation)

  • 김갑식
    • Journal of Information Technology Applications and Management / Vol. 11, No. 1 / pp. 161-174 / 2004
  • This study is intended to suggest an optimized data mining model for efficient customer credit evaluation in the capital finance industry. To accomplish the research objective, various data mining models for customer credit evaluation are compared and analyzed. Existing models such as Multi-Layered Perceptrons, Multivariate Discriminant Analysis, Radial Basis Function, Decision Tree, and Logistic Regression are employed to analyze customer information in the capital finance market and the detailed data of capital financing transactions. Finally, the results from an integrated model utilizing a genetic algorithm are compared with those of each individual model mentioned above. The results reveal that the integrated model is superior to the other existing models.

Height Growth Models for Pinus thunbergii in Jeju Island

  • Park, Gildong;Lee, Daesung;Seo, Yeongwan;Choi, Jungkee
    • Journal of Forest and Environmental Science / Vol. 31, No. 4 / pp. 255-260 / 2015
  • Height growth models for Pinus thunbergii in Jeju Island were developed in this study using four widely used nonlinear growth models: Exponential, Modified Logistic, Chapman-Richards, and Weibull. All functions were found to be significant at the 1% level. The Chapman-Richards model for height-DBH allometry and the Weibull model for height-age allometry were chosen as the best models across all validations. All the model curves showed a similar pattern. Additionally, no abnormal pattern was found in comparison with previous studies. Therefore, these models are expected to be useful for estimating tree height from DBH or age for Pinus thunbergii, especially in Jeju Island.
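For readers unfamiliar with the two selected growth functions, the sketch below writes them in a common textbook three-parameter form. The abstract does not give the paper's exact parameterization (for example, whether a breast-height offset is added), so the forms, the parameter values, and the observations here are all assumptions for illustration only.

```python
import math


def chapman_richards(x, a, b, c):
    """Chapman-Richards curve a * (1 - exp(-b*x))**c, with asymptote a."""
    return a * (1.0 - math.exp(-b * x)) ** c


def weibull(x, a, b, c):
    """Weibull-type growth curve a * (1 - exp(-b * x**c)), with asymptote a."""
    return a * (1.0 - math.exp(-b * x ** c))


def rmse(model, params, xs, ys):
    """Root mean squared error of a curve against observed (x, y) pairs."""
    sq = sum((model(x, *params) - y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))


# Hypothetical parameters and observations, not values from the paper.
cr_params = (20.0, 0.05, 1.2)
ages = [10, 20, 30, 40, 50]
cr_heights = [chapman_richards(t, *cr_params) for t in ages]
observed = [5.0, 9.5, 12.8, 15.0, 16.6]  # made-up heights in metres

for t, h in zip(ages, cr_heights):
    print(f"age {t:2d}: predicted height {h:5.2f} m")
print(f"RMSE against the made-up observations: {rmse(chapman_richards, cr_params, ages, observed):.2f} m")
```

In practice the parameters would be estimated by nonlinear least squares, and candidate curves compared on validation statistics such as the RMSE above.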

로지스틱 회귀, 랜덤포레스트, LSTM 기법을 활용한 서리예측모형 평가 (Comparative assessment of frost event prediction models using logistic regression, random forest, and LSTM networks)

  • 전종안;이현주;임슬희;김대하;백상수
    • 한국수자원학회논문집 / Vol. 54, No. 9 / pp. 667-680 / 2021
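  • The purpose of this study is to analyze the characteristics of frost days and the frost-free period, and to develop and evaluate frost prediction models using logistic regression, random forest, and Long Short-Term Memory (LSTM) networks. Meteorological variables for developing spring and autumn frost prediction models were collected at the Suwon, Cheongju, and Gwangju stations for the period from 1973 to 2019. The frost prediction models were evaluated using precision, recall, and F1 score, together with graphical evaluation methods such as AUC and the reliability diagram. The number of frost days showed a decreasing trend in both spring and autumn (significance level: 0.01). Despite high AUC values of 0.9 or more, the reliability was not consistent. The models appear to need improvement so that they accurately predict not only frost days but also the first and last frost dates, and further studies applying the same methods to more stations in other regions appear to be needed.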
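The evaluation metrics named in this abstract can be computed as in the following minimal sketch. The labels and predicted probabilities are made up for illustration (1 = frost day), the 0.5 decision threshold is an assumption, and AUC is computed in its pairwise-ranking form rather than by tracing the ROC curve.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = event, 0 = no event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up observations and predicted frost probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc_value = auc(y_true, scores)
print(precision, recall, f1, auc_value)  # 0.75 0.75 0.75 0.9375
```

AUC depends only on how the scores rank the cases, which is why a model can reach a high AUC while its predicted probabilities remain poorly calibrated, as the abstract reports.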

사고위치별 로지스틱 회귀 교통사고 모형 - 청주시 4지 신호교차로를 중심으로 - (Logistic Regression Accident Models by Location in the Case of Cheong-ju 4-Legged Signalized Intersections)

  • 박병호;양정모;김준용
    • 한국도로학회논문집 / Vol. 11, No. 2 / pp. 17-25 / 2009
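  • The purpose of this study is to develop logistic regression traffic accident models by accident location (approach, exit, within the intersection, and crosswalk). Geometric and environmental factors related to traffic accidents were analyzed on the basis of 2004 to 2005 accident data from the Chungbuk Provincial Police Agency and field survey data. The developed models have chi-square p-values of 0.000 and Nagelkerke $R^2$ values of 0.363 to 0.819, and all are found to be statistically significant. The accident factors common to the models are traffic volume, crossing distance, and exclusive left-turn lanes, while the model-specific variables are minor-road traffic volume in the within-intersection model and major-road U-turns in the crosswalk model. In the Hosmer & Lemeshow test, all models except the approach model have p-values greater than 0.05 and are therefore judged to fit the data adequately. In addition, the correct classification rate is 73.9% or higher for every model, indicating high predictive power.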
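Two of the fit statistics reported in this abstract, Nagelkerke $R^2$ and the correct classification rate, can be computed from a model's predicted probabilities as in the sketch below. The outcomes and probabilities are made up for illustration and are not an actual fitted model; the 0.5 classification cutoff is an assumption.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of binary outcomes y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi) for yi, pi in zip(y, p))


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox-Snell R^2 rescaled by its maximum attainable value so the range is [0, 1]."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


def correct_classification_rate(y, p, cutoff=0.5):
    """Share of cases whose predicted class (p >= cutoff) matches the observed class."""
    return sum((pi >= cutoff) == (yi == 1) for yi, pi in zip(y, p)) / len(y)


# Made-up outcomes (1 = accident) and hypothetical model probabilities.
y = [1, 1, 1, 0, 0, 0, 1, 0]
p_model = [0.9, 0.8, 0.6, 0.3, 0.2, 0.4, 0.4, 0.1]

base_rate = sum(y) / len(y)  # the intercept-only model predicts this for every case
ll_null = log_likelihood(y, [base_rate] * len(y))
ll_model = log_likelihood(y, p_model)

r2 = nagelkerke_r2(ll_null, ll_model, len(y))
ccr = correct_classification_rate(y, p_model)
print(f"Nagelkerke R^2 = {r2:.3f}, correct classification rate = {ccr:.3f}")
```

The statistic is 0 when the model does no better than the intercept-only model and 1 when it predicts every case with certainty.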

서울 경마 경기 우승마 예측 모형 연구 (Analysis of Horse Races: Prediction of Winning Horses in Horse Races Using Statistical Models)

  • 최혜민;황나영;황찬경;송종우
    • 응용통계연구 / Vol. 28, No. 6 / pp. 1133-1146 / 2015
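  • The horse racing industry accounts for most of the legal gambling industry in Korea. However, because it is perceived as speculative gambling, it has received less statistical analysis than other sports industries. The purpose of this study is to develop models that predict winning horses using various data mining techniques. The data used for model fitting are based on data provided by the Korea Racing Authority and include race results, racehorse information, jockey information, and trainer information. Two kinds of prediction models were fitted, one based on finishing rank and one based on race records, using linear regression, random forest, and logistic regression. The results showed that basic horse information, the horse's past winning record, and the jockey's past winning record have a large influence on rank prediction. When the most recent month of data, which was not used for model fitting, was used to place win, quinella, and trio bets, there was little difference between the models, and all of them yielded positive returns.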

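IP기반 유선인터넷전화 가입요인 도출을 위한 분석적 연구: 통신상품결합서비스의 영향 (An Analytical Study to Identify Subscription Factors for IP-Based Wired Internet Telephony: The Effect of Bundled Telecommunication Services)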

  • 하성호;양정원
    • 한국데이타베이스학회:학술대회논문집 / 한국데이타베이스학회 2010년도 춘계국제학술대회 / pp. 187-199 / 2010
  • Recently, Internet Telephony has become increasingly popular in the telecommunication industry. However, previous research on Internet Telephony has focused on analyzing specific Internet Telephony solutions or on identifying the Internet Telephony movement itself. Research on prediction models for Internet Telephony adoption has been minimal. The main purpose of this study is to develop models for predicting the intention to switch from traditional telephones to Internet Telephony. To do so, this study uses data mining methods to analyze demand in the IT communications market and to provide management strategies for Internet Telephony providers. Specifically, this study uses discriminant analysis, logistic regression, classification trees, and neural nets to develop the prediction models for Internet Telephony adoption. The models are compared with each other and a superior model is chosen.

외국환 거래의 자금세탁 혐의도 점수모형 개발에 관한 연구 (Scoring models to detect foreign exchange money laundering)

  • 홍성익;문태희;손소영
    • 산업공학 / Vol. 18, No. 3 / pp. 268-276 / 2005
  • In recent years, money laundering crimes committed by means of foreign exchange transactions have been increasing. Our study proposes four scoring models to provide early warning of laundering in foreign exchange transactions for both inward and outward remittances: a logistic regression model, a decision tree, a neural network, and an ensemble model that combines the three. In terms of accuracy on test data, the decision tree model is selected for inward remittances and the ensemble model for outward remittances. According to our results, the accumulated number of transactions turns out to be the most important predictor variable. The proposed scoring models operate at the transaction level and are expected to help bank tellers detect laundering-related transactions at an early stage.
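The abstract does not say how the ensemble combines the three models. One simple possibility, shown here purely as an assumption, is to average the three models' suspicion scores and flag a transaction when the average reaches a threshold. The scores and the 0.5 threshold below are hypothetical.

```python
def ensemble_score(scores):
    """Unweighted mean of the individual models' suspicion scores (each in [0, 1])."""
    return sum(scores) / len(scores)


def flag_transaction(scores, threshold=0.5):
    """Flag a transaction when the averaged score reaches the threshold."""
    return ensemble_score(scores) >= threshold


# Hypothetical scores for one transaction from the three base models.
base_scores = {
    "logistic_regression": 0.2,
    "decision_tree": 0.9,
    "neural_network": 0.7,
}
combined = ensemble_score(list(base_scores.values()))
flagged = flag_transaction(list(base_scores.values()))
print(f"ensemble score = {combined:.2f}, flagged = {flagged}")
```

Other combination rules, such as majority voting or weighted averaging, follow the same pattern and differ only in `ensemble_score`.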

인터넷전화(VoIP)의 신규고객 유치를 지원하는 데이터마이닝 모델 (A Data-Mining Model to Support new Customer Acquisition for Internet Telephony(VoIP))

  • 하성호;양정원;송영미
    • Journal of Information Technology Applications and Management / Vol. 17, No. 2 / pp. 133-154 / 2010
  • Recently, Internet Telephony has become increasingly popular in the telecommunication industry. However, previous research on Internet Telephony has focused on analyzing specific Internet Telephony solutions or on identifying the Internet Telephony movement itself. Research on prediction models for Internet Telephony adoption has been minimal. The main purpose of this study is to develop models for predicting the intention to switch from traditional telephones to Internet Telephony. To do so, this study uses data mining methods to analyze demand in the IT communications market and to provide management strategies for Internet Telephony providers. Specifically, this study uses discriminant analysis, logistic regression, classification trees, and neural nets to develop prediction models for Internet Telephony adoption. The models are compared with each other and a superior model is chosen.
