• Title/Summary/Keyword: Bayesian logistic regression

Identification of major risk factors association with respiratory diseases by data mining (데이터마이닝 모형을 활용한 호흡기질환의 주요인 선별)

  • Lee, Jea-Young;Kim, Hyun-Ji
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.373-384 / 2014
  • Data mining clarifies patterns or correlations in large volumes of complexly structured data and predicts diverse outcomes. The technique is used in finance, telecommunications, circulation, medicine, and other fields. In this paper, we selected risk factors for respiratory diseases in the field of medicine. The data, taken from the Gyeongsangbuk-do database of the Community Health Survey conducted in 2012, were divided into a respiratory disease group and a healthy group. To select the major risk factors, we applied data mining techniques such as neural networks, logistic regression, Bayesian networks, C5.0, and CART. We divided the data into training and testing sets and applied the models built on the training data to the testing data. Based on a comparison of prediction accuracy, CART was identified as the best model. Depression, smoking, and stress were identified as the major risk factors for respiratory disease.

Development of Pedestrian Fatality Model using Bayesian-Based Neural Network (베이지안 신경망을 이용한 보행자 사망확률모형 개발)

  • O, Cheol;Gang, Yeon-Su;Kim, Beom-Il
    • Journal of Korean Society of Transportation / v.24 no.2 s.88 / pp.139-145 / 2006
  • This paper develops pedestrian fatality models capable of producing the probability of pedestrian fatality in collisions between vehicles and pedestrians. A probabilistic neural network (PNN) and binary logistic regression (BLR) are employed in modeling pedestrian fatality. Pedestrian age, vehicle type, and collision speed, obtained by reconstructing collected accidents, are used as independent variables in the fatality models. One notable feature of this study is that an iterative sampling technique is used to construct various training and test datasets for better performance comparison. A statistical comparison that considers the variation in model performance is conducted. The results show that the PNN-based fatality model outperforms the BLR-based model. The models developed in this study, which allow us to predict pedestrian fatality, would be useful tools for supporting the derivation of various safety policies and technologies to enhance pedestrian safety.
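
The abstract does not specify the iterative sampling procedure, so the following is only a minimal sketch of one common reading: repeated random train/test splits, with the spread of test accuracy recorded for each model. The function names, the 30% test fraction, and the placeholder majority-class model are all assumptions for illustration, not the authors' method.

```python
import random
from statistics import mean, stdev


def repeated_splits(n_samples, n_repeats=30, test_fraction=0.3, seed=0):
    """Yield (train_idx, test_idx) index lists from independent random shuffles."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


def accuracy_over_splits(fit, X, y, **split_kwargs):
    """Fit on each training split and return the list of test accuracies.

    `fit(X_train, y_train)` must return a function mapping one row to a label.
    """
    scores = []
    for train, test in repeated_splits(len(y), **split_kwargs):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return scores


def fit_majority(X_train, y_train):
    """Hypothetical placeholder model: always predict the most frequent label."""
    label = max(set(y_train), key=y_train.count)
    return lambda row: label


if __name__ == "__main__":
    # Synthetic data, for illustration only.
    X = [[v] for v in range(20)]
    y = [0] * 12 + [1] * 8
    scores = accuracy_over_splits(fit_majority, X, y, n_repeats=10)
    print(f"mean={mean(scores):.3f} sd={stdev(scores):.3f}")
```

Running `accuracy_over_splits` once per model with the same `seed` evaluates both models on identical splits, so the two accuracy lists can be compared pairwise.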

Facial profile parameters and their relative influence on bilabial prominence and the perceptions of facial profile attractiveness: A novel approach

  • Denize, Erin Stewart;McDonald, Fraser;Sherriff, Martyn;Naini, Farhad B.
    • The Korean Journal of Orthodontics / v.44 no.4 / pp.184-194 / 2014
  • Objective: To evaluate the relative importance of bilabial prominence in relation to other facial profile parameters in a normal population. Methods: Profile stimulus images of 38 individuals (28 female and 10 male; ages 19-25 years) were shown to an unrelated group of first-year students (n = 42; ages 18-24 years). The images were individually viewed on a 17-inch monitor. The observers received standardized instructions before viewing. A six-question questionnaire was completed using a Likert-type scale. The responses were analyzed by ordered logistic regression to identify associations between profile characteristics and observer preferences. The Bayesian Information Criterion was used to select variables that explained observer preferences most accurately. Results: Nasal, bilabial, and chin prominences; the nasofrontal angle; and lip curls had the greatest effect on overall profile attractiveness perceptions. The lip-chin-throat angle and upper lip curl had the greatest effect on forehead prominence perceptions. The bilabial prominence, nasolabial angle (particularly the lower component), and mentolabial angle had the greatest effect on nasal prominence perceptions. The bilabial prominence, nasolabial angle, chin prominence, and submental length had the greatest effect on lip prominence perceptions. The bilabial prominence, nasolabial angle, mentolabial angle, and submental length had the greatest effect on chin prominence perceptions. Conclusions: More prominent lips, within normal limits, may be considered more attractive in the profile view. Profile parameters have a greater influence on their neighboring aesthetic units but indirectly influence related profile parameters, endorsing the importance of achieving an aesthetic balance between relative prominences of all aesthetic units of the facial profile.

Can Housing Prices Be an Alternative to a Census-based Deprivation Index? An Evaluation Based on Multilevel Modeling (주택가격이 센서스에 기반한 박탈지수의 대안이 될 수 있는가?: 다수준 모델에 기반한 평가)

  • Sohn, Chul;Nakaya, Tomoki
    • Journal of Cadastre & Land InformatiX / v.48 no.2 / pp.197-211 / 2018
  • We conducted this research to examine how well regional housing prices can serve as an alternative to conventional census-based regional deprivation indices in health and medical geography studies. To examine the relative performance of mean regional housing prices compared to conventional census-based regional deprivation indices, we compared several multilevel logistic regression models, where the first level was individuals and the second was health districts in the Seoul Metropolitan Area (SMA) in Korea, in order to adjust for the regional clustering tendency of unknown factors. In these models, we predicted two dichotomous variables that represented individuals' after-lunch tooth brushing behavior and use of dental floss from individual characteristics and regional indices. Then, we compared the relative predictive performance of the models using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). The estimation results showed that mean regional housing prices and census-based deprivation indices were both statistically correlated with the two types of dental health behavior. The results also revealed that the model with mean regional housing prices showed smaller AIC and BIC than the other models with conventional census-based deprivation indices. These results imply that housing prices summarized over areal units can be used as an alternative to conventional census-based deprivation indices when the census variables employed cannot properly reflect the characteristics of the areal units.
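
As a brief illustration of the model comparison described above, the standard definitions are AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood; smaller values indicate the preferred model. The sketch below assumes these textbook forms. The helper `rank_models` is hypothetical, and how parameters and observations are counted in a multilevel model is a choice the abstract does not state.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def rank_models(models, n_obs):
    """Return model names sorted by BIC, smallest (preferred) first.

    `models` maps a name to a (log_likelihood, n_params) tuple.
    """
    return sorted(models, key=lambda name: bic(*models[name], n_obs))
```

Because BIC multiplies k by ln(n) instead of 2, it penalizes extra parameters more heavily than AIC whenever n exceeds about 7 observations, so agreement between the two criteria is a stronger signal than either alone.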

Forecasting of Customer's Purchasing Intention Using Support Vector Machine (Support Vector Machine 기법을 이용한 고객의 구매의도 예측)

  • Kim, Jin-Hwa;Nam, Ki-Chan;Lee, Sang-Jong
    • Information Systems Review / v.10 no.2 / pp.137-158 / 2008
  • The rapid development of various information technologies creates new opportunities in online and offline markets. In this changing market environment, customers have various demands for new products and services, so their power and influence on the markets grow stronger each year. Companies have paid great attention to customer relationship management (CRM). In particular, personalized product recommendation systems, which recommend products and services based on a customer's private information or in-store purchasing behavior, are an important asset to most companies. CRM is one of the important business processes in which reliable information is mined from customer databases. Data mining techniques such as artificial intelligence are popular tools for extracting useful information and knowledge from these customer databases. In this research, we propose a recommendation system that predicts a customer's purchase intention. The customer's purchase intention for a specific product is predicted by applying data mining techniques to a receipt data set. The performance of the suggested method is compared with that of other data mining technologies.

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.243-264 / 2018
  • Although the worldwide battery market has recently been spurring the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are consumed in a wide range of industries. However, lead-acid batteries have a serious problem: once even one of the several cells packed in a battery begins to degrade, the deterioration of the whole battery progresses quickly. To overcome this problem, previous studies have attempted to identify the mechanism of battery deterioration in many ways. However, most previous studies analyzed this mechanism using data obtained in a laboratory rather than data obtained in the real world. Using real data can increase the feasibility and applicability of research findings. Therefore, this study aims to develop a model that predicts battery deterioration using data obtained in the real world. To this end, we collected data showing changes in battery state by attaching sensors that monitor battery condition in real time to dozens of golf carts operated on an actual golf course. As a result, a total of 16,883 samples were obtained. We then developed a model that predicts a precursor phenomenon indicating battery deterioration by analyzing the sensor data with machine learning techniques.
As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cells 1 to 6, 8) lowest voltage of battery cells 1 to 6, 9) highest voltage of battery cells 1 to 6, 10) voltage of battery cells 1 to 6 at the beginning of operation, 11) voltage of battery cells 1 to 6 at the end of charge, 12) used amount of battery cells 1 to 6 during operation, 13) used amount of battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Because the values of the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation of battery cells 1 to 6) are similar across the battery cells, we conducted principal component analysis with varimax orthogonal rotation to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables that clustered together and used them as the final independent variables in place of the original variables, thereby reducing the dimensionality. We used decision tree, logistic regression, and Bayesian network algorithms to build prediction models. We also built prediction models using bagging of each of them, boosting of each of them, and RandomForest. Experimental results show that the prediction model using bagging of decision trees yields the best accuracy of 89.3923%. This study has some limitations in that additional variables that affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research.
However, the battery deterioration prediction model proposed in the present study is expected to enable effective and efficient management of batteries used in the real field and to dramatically reduce the cost caused by not detecting battery deterioration.
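
To make the best-performing approach concrete, here is a minimal sketch of bagging (bootstrap aggregating) with majority voting. It is an illustration under stated assumptions, not the authors' implementation: the base learner is a depth-1 decision tree (a "stump") rather than a full decision tree, labels are assumed to be binary 0/1, and the names `fit_stump` and `fit_bagging` are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a depth-1 decision tree: one feature, one threshold, binary labels.

    Tries every feature, every observed value as threshold, and both label
    orientations, and keeps the first rule with the fewest training errors.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[j] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda row: left if row[j] <= t else right


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample and predict by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(model(row) for model in models)
        return votes.most_common(1)[0][0]

    return predict
```

An odd `n_estimators` avoids tied votes for binary labels. Averaging many learners trained on resampled data mainly reduces variance, which is one plausible reason bagging helps high-variance learners such as decision trees.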