• Title/Summary/Keyword: principal component regression

Characterizing CO2 Supersaturation and Net Atmospheric Flux in the Middle and Lower Nakdong River (낙동강 중하류에서 이산화탄소 과포화 및 순배출 특성 분석)

  • Lee, Eun Ju;Chung, Se Woong;Park, Hyung Seok
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.416-416 / 2019
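  • Inland freshwaters are drawing attention as an important source of carbon dioxide ($CO_2$) emissions to the atmosphere. $CO_2$ emitted from streams and rivers to the atmosphere is a key component of the global carbon cycle, and most streams and rivers are supersaturated with $CO_2$. Globally, $CO_2$ emissions from streams and rivers are reported to be about five times larger than those from lakes and reservoirs, but such studies are rare in Korea. The purpose of this study is therefore to analyze the dynamic variability of the net atmospheric flux (NAF) of $CO_2$ at Gangjeong-Goryeong Weir (GGW), Dalseong Weir (DSW), Hapcheon-Changnyeong Weir (HCW), and Changnyeong-Haman Weir (CHW), located in the middle and lower Nakdong River, and to apply data-mining techniques to develop simple predictive models that estimate $CO_2$ NAF from easily collected physical and water-quality variables. $CO_2$ NAF was calculated by multiplying the difference in $CO_2$ partial pressure ($pCO_2$) across the air-water interface by the gas transfer velocity, for which the equation proposed by Cole and Caraco (1998) was used. Aqueous $pCO_2$ was computed with the $CO_2$SYS program, which accounts for the thermodynamic chemical equilibria of the carbonate system in both freshwater and seawater, and $CO_2$ NAF was computed using Henry's law and Fick's first law of diffusion. To evaluate the environmental factors that affect the temporal variability of $CO_2$ NAF, correlation analysis, principal component analysis (PCA), step-wise multiple linear regression (SMLR), and random forest (RF) methods were used. The SMLR model was fitted with the R package olsrr, and the RF model with the R packages caret and randomForest. The results showed that the river reaches upstream of the four weirs behaved as heterotrophic systems that emit $CO_2$ to the atmosphere during most of the study period, except for some periods of active algal growth. The median $CO_2$ NAF ranged from a minimum of $391.5mg-CO_2/m^2day$ at HCW to a maximum of $1472.7mg-CO_2/m^2day$ at DSW. At all weirs, NAF showed a strong negative correlation with pH, and $pCO_2$ and Chl-a were also negatively correlated. This is because algae consume $CO_2$ in the water and raise the pH. In the PCA, NAF and $pCO_2$ showed high covariance, while pH and Chl-a clustered in the opposite direction, which agrees with the correlation analysis. The Adj. $R^2$ values of the SMLR and RF models developed in this study were 0.77 or higher at all weirs, and the models are considered usable for estimating river $CO_2$ NAF even when $pCO_2$ measurements are unavailable.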
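
The flux calculation described in this abstract (a $pCO_2$ difference across the air-water interface, converted to a concentration difference with Henry's law and multiplied by a gas transfer velocity) can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function names, the units, and the default Henry's constant of 0.034 mol/L/atm (roughly the value for CO2 in freshwater near 25 °C) are assumptions, and the Cole and Caraco (1998) wind relation is used directly as k600 without a Schmidt-number temperature correction.

```python
# Illustrative sketch of a CO2 net atmospheric flux (NAF) calculation.
# Assumptions not taken from the abstract: units, the default Henry's
# constant, and the use of k600 without a Schmidt-number correction.

CO2_MOLAR_MASS_G_MOL = 44.01


def k600_cole_caraco(u10_m_s: float) -> float:
    """Gas transfer velocity k600 (cm/h) from wind speed at 10 m (m/s),
    using the wind relation of Cole and Caraco (1998)."""
    if u10_m_s < 0:
        raise ValueError("wind speed must be non-negative")
    return 2.07 + 0.215 * u10_m_s ** 1.7


def co2_naf_mg_m2_day(pco2_water_uatm: float,
                      pco2_air_uatm: float,
                      u10_m_s: float,
                      kh_mol_l_atm: float = 0.034) -> float:
    """Net atmospheric flux in mg CO2 per m^2 per day.

    Positive values mean the water emits CO2 to the atmosphere.
    """
    # Partial pressure difference, converted from microatmospheres to atm.
    delta_atm = (pco2_water_uatm - pco2_air_uatm) * 1e-6
    # Henry's law: concentration difference in mol/m^3 (1 m^3 = 1000 L).
    delta_conc_mol_m3 = kh_mol_l_atm * delta_atm * 1000.0
    # Gas transfer velocity converted from cm/h to m/day.
    k_m_day = k600_cole_caraco(u10_m_s) / 100.0 * 24.0
    # Fick's first law at the interface: flux = k * concentration difference.
    flux_mol_m2_day = k_m_day * delta_conc_mol_m3
    return flux_mol_m2_day * CO2_MOLAR_MASS_G_MOL * 1000.0  # mol -> mg


if __name__ == "__main__":
    # Hypothetical inputs: supersaturated water under a light wind.
    print(co2_naf_mg_m2_day(1000.0, 400.0, 2.0))
```

The sign convention makes supersaturated water ($pCO_2$ in water above the atmospheric value) yield a positive flux, which matches the abstract's description of the reaches as net emitters.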

Bhumipol Dam Operation Improvement via smart system for the Thor Tong Daeng Irrigation Project, Ping River Basin, Thailand

  • Koontanakulvong, Sucharit;Long, Tran Thanh;Van, Tuan Pham
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.164-175 / 2019
  • The Tor Tong Daeng Irrigation Project, with an irrigation area of 61,400 hectares, is located in the Ping Basin of the Upper Central Plain of Thailand, where farmers depend on both surface water and groundwater. In drought years, water storage in the Bhumipol Dam is inadequate to allocate water for agriculture, which causes water deficits in many irrigation projects. Farmers then need to find supplementary sources of water, such as farm ponds or groundwater. The operation of the Bhumipol Dam and the estimation of irrigation demand are vital for allocating irrigation water and for helping to solve the water shortage in the irrigation project. This study aims to define a smart dam operation system that mitigates water shortage in this irrigation project by introducing machine learning to improve dam operation and by estimating irrigation demand from soil moisture derived from satellite images. Using an ANN, inflows to the dam are generated from the upstream rain gauge stations with the past 10 years of daily rainfall data. The input vectors for the ANN model are identified based on regression and principal component analysis. The structure of the ANN (length of training data, type of activation functions, number of hidden nodes, and training method) is determined from the statistical performance of the ANN outputs against measurements. Irrigation demand, on the other hand, is estimated using LANDSAT satellite images. The Enhanced Vegetation Index (EVI) and Temperature Vegetation Dryness Index (TVDI) values are estimated from the plant growth stage and soil moisture. The values are calibrated and verified with field plant growth stage and soil moisture data from 2017-2018. The irrigation demand in the irrigation project is then estimated from the plant growth stage and soil moisture in the area. 
With the estimated dam inflow and irrigation demand, dam operation can manage water release better than the past operational data show. The results show how the smart system concept was applied and how it improves dam operation, compared with past operation data, by combining ANN-based inflow estimation with satellite-based irrigation demand estimation. This is an initial step toward developing a smart dam operation system in Thailand.

The association of dietary patterns with insulin resistance in Korean adults: based on the 2015 Korea National Health and Nutrition Examination Survey (한국 성인의 식사 패턴과 인슐린 저항성 간의 상관성: 2015년도 국민건강영양조사를 이용하여)

  • Kim, I Seul;Yang, Yoon Jung
    • Journal of Nutrition and Health / v.54 no.3 / pp.247-261 / 2021
  • Purpose: This study was conducted to identify the association between insulin resistance and the major dietary patterns of Korean adults. Methods: This study used data from the 2015 Korea National Health and Nutrition Examination Survey. The subjects were 2,276 adults aged 19 to 64 years. Based on the food frequency questionnaire data, 112 food items were reclassified into 30 food groups. Principal component analysis was applied to identify major dietary patterns. We used the homeostatic model assessment of insulin resistance (HOMA-IR) and the quantitative insulin sensitivity check index (QUICKI) as indicators of insulin resistance. The association between major dietary patterns and insulin resistance was investigated using logistic regression analysis. Results: Three major dietary patterns were identified and assigned descriptive names based on the food items with high loadings: 'healthy Korean meal pattern', 'western meal pattern', and 'white rice, alcohol, meat pattern'. As the 'white rice, alcohol, meat pattern' score increased, significant increasing trends for fasting glucose concentration and HOMA-IR and a significant decreasing trend for QUICKI were observed after adjusting for age and sex. The odds ratios of insulin resistance for the 'healthy Korean meal pattern' and the 'western meal pattern' were not statistically significant. The 'white rice, alcohol, meat pattern' showed a significant positive association with the risk of insulin resistance after adjusting for covariates. Conclusion: These results suggest that the 'white rice, alcohol, meat pattern' is positively associated with the risk of insulin resistance. This pattern was related to high consumption of alcohol together with rice or meat. It was also associated with a high intake of sodium and low intakes of vitamin C, calcium, potassium, and dietary fiber. To confirm the association, further longitudinal studies are required.
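
The abstract names HOMA-IR and QUICKI as its insulin resistance indicators but does not state their formulas. The sketch below uses the commonly cited definitions, which is an assumption about what the study applied: fasting glucose in mg/dL and fasting insulin in µU/mL, with hypothetical function and parameter names.

```python
import math

# Commonly cited definitions; the abstract does not state which exact
# formulas or units the study used, so treat these as assumptions.


def homa_ir(glucose_mg_dl: float, insulin_uU_ml: float) -> float:
    """HOMA-IR = fasting glucose (mg/dL) x fasting insulin (uU/mL) / 405."""
    return glucose_mg_dl * insulin_uU_ml / 405.0


def quicki(glucose_mg_dl: float, insulin_uU_ml: float) -> float:
    """QUICKI = 1 / (log10(fasting insulin) + log10(fasting glucose))."""
    return 1.0 / (math.log10(insulin_uU_ml) + math.log10(glucose_mg_dl))


if __name__ == "__main__":
    # Hypothetical fasting values for one subject.
    print(homa_ir(95.0, 8.0), quicki(95.0, 8.0))
```

Because the two indices move in opposite directions as glucose and insulin rise, the increasing HOMA-IR trend and decreasing QUICKI trend reported above point the same way.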

Discrimination of Internally Browned Apples Utilizing Near-Infrared Non-Destructive Fruit Sorting System (근적외선 비파괴 과일 선별 시스템을 활용한 내부 갈변 사과의 판별)

  • Kim, Bal Geum;Lim, Jong Guk
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.1 / pp.208-213 / 2021
  • There is a lack of studies comparing the internal quality of fruit with its external quality. However, issues of internal quality of fruit such as internal browning are important. We propose a method of classifying normal apples and internally browned apples using a near-infrared (NIR) non-destructive system. Specifically, we found the optimal wavelength and characteristics of the spectra for determining the internal browning of Fuji apples. The NIR spectra of apples were obtained in the wavelength range of 470-1150 nm. A group of normal apples and a group of internally browned apples were identified using principal component analysis (PCA), and a partial least squares regression (PLSR) analysis was performed to develop and evaluate the discriminant model. The PCA analysis revealed a clear difference between the normal and internally browned apples. From the PLSR, the correlation coefficient of the predictive model without pretreatment was determined to be 0.902 with an RMSE value of 0.157. The correlation coefficient of the predictive model with pretreatment was 0.906 with an RMSE value of 0.154. The results show that this model is suitable for classifying normal and internally browned apples and that it can be applied for the sorting and evaluation of agricultural products for internal and external defects.

Simultaneous estimation of fatty acids contents from soybean seeds using fourier transform infrared spectroscopy and gas chromatography by multivariate analysis (적외선 분광스펙트럼 및 기체크로마토그라피 분석 데이터의 다변량 통계분석을 이용한 대두 종자 지방산 함량예측)

  • Ahn, Myung Suk;Ji, Eun Yee;Song, Seung Yeob;Ahn, Joon Woo;Jeong, Won Joong;Min, Sung Ran;Kim, Suk Weon
    • Journal of Plant Biotechnology / v.42 no.1 / pp.60-70 / 2015
  • The aim of this study was to investigate whether Fourier transform infrared (FT-IR) spectroscopy can be applied to the simultaneous determination of fatty acid contents in different soybean cultivars. A total of 153 lines of soybean (Glycine max Merrill) were examined by FT-IR spectroscopy. Quantification of fatty acids from the soybean lines was confirmed by quantitative gas chromatography (GC) analysis. Quantitative spectral variation among the soybean lines was observed in the amide bond region ($1,700{\sim}1,500cm^{-1}$), the phosphodiester group region ($1,500{\sim}1,300cm^{-1}$), and the sugar region ($1,200{\sim}1,000cm^{-1}$) of the FT-IR spectra. Quantitative prediction models for the contents of 5 individual fatty acids (palmitic acid, stearic acid, oleic acid, linoleic acid, linolenic acid) were established from the FT-IR spectra using the partial least squares (PLS) regression algorithm. In cross validation, there were high correlations ($R^2{\geq}0.97$) between the contents of the 5 fatty acids predicted by PLS regression from FT-IR spectra and the contents measured by GC. In external validation, palmitic acid ($R^2=0.8002$), oleic acid ($R^2=0.8909$), and linoleic acid ($R^2=0.815$) were predicted with good accuracy, while predictions for stearic acid ($R^2=0.4598$) and linolenic acid ($R^2=0.6868$) were less accurate. These results show that FT-IR spectra combined with multivariate analysis can be used to predict fatty acid contents in soybean lines. We therefore suggest that the PLS prediction system for fatty acid contents using FT-IR analysis could be applied as a rapid, high-throughput screening tool in breeding for modified fatty acid composition in soybean and could help accelerate conventional breeding.

A Study on Startups' Dependence on Business Incubation Centers (창업보육서비스에 따른 입주기업의 창업보육센터 의존도에 관한 연구)

  • Park, JaeSung;Lee, Chul;Kim, JaeJon
    • Korean small business review / v.31 no.2 / pp.103-120 / 2009
  • As business incubation centers (BICs) have been operating for more than 10 years in Korea, many early-stage startups tend to use the services provided by the incubation centers. BICs in Korea have accumulated knowledge and experience over the past ten years, and their services have improved considerably. The business incubating service has three facets: (1) business infrastructure service, (2) direct service, and (3) indirect service. The mission of BICs is to provide early-stage entrepreneurs with incubating services for a limited period, to help them grow strong enough to survive fierce competition after graduating from incubation. However, the incubating services sometimes fail to foster the independence of new startup companies and instead raise the dependence of many companies on BICs. Thus, dependence on BICs is a very important factor in understanding the survival of incubated startup companies after graduation from BICs. The purpose of this study is to identify the main factors that influence a firm's dependence on BICs and to characterize the relationships among the identified factors. The business incubating service is a core construct of this study. It includes various activities and resources, such as offering physical facilities, providing legal services, and connecting firms with outside organizations. These services are extensive and take various forms. They are provided by BICs directly or indirectly. Past studies have identified various incubating services and classified them in different ways. Based on the past studies, we classify the business incubating service into three categories, as mentioned above: (1) business infrastructure support, (2) direct support, and (3) networking support. Business infrastructure support provides the essential resources for starting a business, such as physical facilities. 
Direct support offers the business resources available within the BICs, such as human, technical, and administrative resources. Finally, indirect service supports access to resources outside the business incubation center. Dependence is generally defined as the degree to which a client firm needs the resources provided by the service provider in order to achieve its goals. Dependence is generated when a firm recognizes the benefits of interacting with its counterpart. Hence, the more positive outcomes a firm derives from its relationship with the partner, the more dependent on the partner the firm inevitably becomes. In business incubation, the longer a resident firm is incubated, the stronger its dependence on BICs can be expected to be. To foster the independence of the incubated firms, BICs have to be able to adjust the provision of their services to control the firms' dependence on BICs. Based on the above discussion, a research model for the relationships between dependence and the factors affecting it was developed. We surveyed companies residing in BICs to test our research model. The instrument of our study was modified, in part, on the basis of previous relevant studies. To test reliability and validity, a preliminary test was conducted with firms that were residing in and incubated by BICs in the Gwangju and Jeonnam region. The questionnaire was modified in accordance with the pre-test feedback. We mailed the questionnaire to all of the firms that had been incubated by the BICs, with the help of the business incubating managers of each BIC. The survey was conducted over a three-week period. Gifts (of approximately ₩10,000 value) were offered to all actively participating respondents. The incubating period was reported by the business incubating managers and was transformed using natural logarithms. A total of 180 firms participated in the survey. 
However, we excluded 4 cases in which the reversed items revealed inconsistent answers, and 176 cases were used for the analysis. We acknowledge that 176 samples may not be sufficient to conduct regression analyses with the 5 research variables in our study. Each variable was measured through multiple items. We conducted an exploratory factor analysis to assess their unidimensionality. To test the construct validity of the instruments, a principal component factor analysis was conducted with Varimax rotation. The items correspond well to each singular factor, demonstrating a high degree of convergent validity. As the factor loadings for a variable (or factor) are higher than the factor loadings for the other variables, the instrument's discriminant validity is shown to be clear. Each factor was extracted as expected; the factors explained 70.97, 66.321, and 52.97 percent, respectively, of the total variance, each with eigenvalues greater than 1.000. The internal consistency reliability of the variables was evaluated by computing Cronbach's alphas. The Cronbach's alpha values of the variables ranged from 0.717 to 0.950, all above 0.700, which is satisfactory. The reliability and validity of the research variables are all, therefore, considered acceptable. The effects on dependence were assessed using a regression analysis. Pearson correlations were calculated for the variables measured on interval or ratio scales. Potential multicollinearity among the antecedents was evaluated prior to the multiple regression analysis, as some of the variables were significantly correlated with others (e.g., direct service and indirect service). Although several variables show evidence of significant correlations, their tolerance values range between 0.334 and 0.613, indicating that multicollinearity is not a likely threat to the parameter estimates. 
After checking some basic assumptions for the regression analyses, we decided to conduct multiple regression analyses and moderated regression analyses to test the given hypotheses. The results of the regression analyses indicate that the regression model is significant at p < 0.001 (F = 44.260), and that the predictors of the research model explain 42.6 percent of the total variance. Hypotheses 1, 2, and 3 address the relationships between the dependence of the incubated firms and the business incubating services. Business infrastructure service, direct service, and indirect service are all significantly related to dependence (β = 0.300, p < 0.001; β = 0.230, p < 0.001; β = 0.226, p < 0.001), thus supporting Hypotheses 1, 2, and 3. When the incubating period is the moderator and dependence is the dependent variable, adding the interaction terms with the antecedents to the regression equation yielded a significant increase in R2 (F change = 2.789, p < 0.05). In particular, direct service and indirect service exert different effects on dependence. Hence, the results support Hypotheses 5 and 6. This study provides several strategies and specific calls to action for BICs, based on our empirical findings. Business infrastructure service has more effect on the firm's dependence than the other two services. Introducing an additional, higher charge rate for a firm that has graduated but is allowed to stay in the BIC is a basic and legitimate condition for the BIC to control the firm's dependence. We detected differential effects of direct and indirect services on the firm's dependence. Firms with a long incubating period respond more positively to indirect service and more negatively to direct service when assessing their levels of dependence. This implies that BICs must develop a strategy on the basis of a firm's incubating period. 
Last but not least, it would be valuable for future studies to discover other important variables that influence the firm's dependence. Future studies that explain the independence of startup companies in BICs would also be valuable.
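
The internal consistency check mentioned in this abstract (Cronbach's alpha above 0.700 for each multi-item variable) can be illustrated with a short sketch. The function name and the toy responses are hypothetical and are not the study's data. The formula is the standard one: alpha = k/(k-1) × (1 − sum of item variances / variance of the total score), with sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of rows, one per respondent, each row holding the
    scores for the k items of the scale (k >= 2, at least 2 respondents).
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical 5-point Likert answers from four respondents, three items.
    toy = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 3, 2]]
    print(cronbach_alpha(toy))
```

Under the 0.700 threshold used in the abstract, a scale would be retained when the value returned here exceeds 0.7.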

Agency Costs of Clothing Companies with Famous Brand (유명 의류 상호 기업의 대리인 비용에 관한 연구)

  • Gong, Kyung-Tae
    • Management & Information Systems Review / v.36 no.4 / pp.21-32 / 2017
  • Motivated by recent cases of neglected social responsibility by foreign luxury fashion brands in Korea, this study investigates whether agency costs depend on the sustainability of different types of corporate governance. Agency costs refer either to vertical costs arising from the relationship between stockholders and managers, or to horizontal costs associated with potential conflicts between majority and minority stockholders. Firms with a luxury fashion brand could spend large sums on maintaining a magnificent brand image, thereby increasing agency costs. Conversely, such firms may hold down wasteful spending in order to report impressive financial results, which mitigates agency costs. Agency costs are measured by the value of the principal component. First, three ratios are constructed: asset turnover, operating expense to sales, and earnings before interest, tax, and depreciation (EBITD). Then, the scores of each of these ratios for individual firms in the sample are differenced from the ratios for the benchmark firm, S-OIL, which CGS designated as the best corporate governance firm for 2013. We regress each agency cost index on a luxury fashion brand dummy and a set of control variables. The regression results indicate that the agency costs of firms with a luxury fashion brand exceed those of the control group in the fashion industry in terms of operating expenses but fall short of those of the control group in terms of EBITD, so the aggregate agency costs do not differ from those of the control group. In a sensitivity test using PSM (propensity score matching), the results are the same: the agency costs of these firms are higher than those of the matched control group. These results are corroborated by an additional analysis comparing the group of companies with the best brands with the control group. 
The results raise doubts about the effectiveness of management at firms with luxury fashion brands. A limitation of this study is that it covers only 2013, and the paper suggests that there is room for improvement in the current research methodology.

Effect of Firm's Activities on Their Performances (혁신활동이 기업의 경영성과에 미치는 영향)

  • Kim, Kwang-Doo;Hong, Woon-Sun
    • Journal of Korea Technology Innovation Society / v.14 no.2 / pp.373-404 / 2011
  • The purpose of this research is to reveal the effect of innovation on firms' economic performance. Studies of this kind began in the 1960s and have progressed actively since. Theory concludes that innovation has a positive effect on performance, whereas empirical analyses have produced mixed results. There are several reasons why empirical results differ from theoretical ones, but the major factors are imperfect statistics and inappropriate analysis methods. This study used population data (1990~2008) provided by the Korean Intellectual Property Office (KIPO) for patents and population data (1990~2008) provided by Korea Investors Service (KIS) for research and development. The contribution of this study lies in its extensive statistical analysis. It used principal component analysis to build an innovativeness index, and it sought to minimize error by applying quantile regression to both the panel analysis and the analysis of rapidly growing companies. The final results are divided into two parts, growth and profit. The effect of technological innovation on firm growth is not significant in the panel analysis but is highly significant for the top 10% of high-growth firms. When large companies and small and medium enterprises are analyzed separately, the effect is significant for the top 10% of high-growth firms among large companies and generally significant among small and medium enterprises. However, firms in the bottom 10% and the bottom 25% by growth are negatively affected, while firms growing faster than the median are positively affected, with the top 10% of high-growth firms affected the most. The effect is stronger on profitability than on growth. 
The effect on profit is not significant across all enterprises, but it is significant for enterprises above the bottom 25% and is strongest for the top 10% of high-profit enterprises. For large companies, the effect was significant and positive for the top 10% of high-profit enterprises and for the bottom 25%, but negative for low-profit enterprises. For small and medium enterprises, the effect is negative for both the bottom 10% and the bottom 25%, but positive and significant for enterprises ranked above the median, especially high-growth firms. Identifying significance by quantile is meaningful, but the more important implication is that the effect is greater for small and medium enterprises than for large companies.

Environmental Studies in the Lower Part of the Han River VI. The Statistical Analysis of Eutrophication Factors (한강 하류의 환경학적 연구 VI. 부영양 요인의 통계적 해석)

  • Jung, Seung-Won;Hue, Hoi-Kwon;Lee, Jin-Hwan
    • Korean Journal of Ecology and Environment / v.37 no.1 s.106 / pp.78-86 / 2004
  • To reveal the relationship between chlorophyll-a concentration and the environmental factors affecting eutrophication, this study was conducted biweekly at G stations in the lower part of the Han River from Feb. 24, 2001 to Feb. 9, 2002. Water temperature ranged from $0.5^{\circ}C$ to $26.4^{\circ}C$, pH was 5.77${\sim}$8.99, DO 3.15${\sim}$14.36 mg $L^{-1}$, BOD 0.90${\sim}$7.45 mg $L^{-1}$, and COD 1.16${\sim}$9.13 mg $L^{-1}$. TN and TP ranged from 1.68${\sim}$20.96 mg $L^{-1}$ and 0.02${\sim}$1.17 mg $L^{-1}$, respectively. $NH_4\;^+$-N, $NO_3\;^-$-N, and $PO_4\;^{3-}$-P ranged from 0.56${\sim}$3.60 mg $L^{-1}$, 0.03${\sim}$7.29 mg $L^{-1}$, and 0.002${\sim}$0.754 mg $L^{-1}$, respectively. Chlorophyll-a varied widely, from 2.29 ${\mu}g\;L^{-1}$ to 136.28 ${\mu}g\;L^{-1}$, by month and station. The nutrient results indicated eutrophic conditions in this area, and water quality was progressively worse at the lower stations than at the upper stations during the study period. Pearson correlation analysis between chlorophyll-a concentration and the environmental factors indicated that BOD, COD, pH, $NH_4\;^+$-N, TP, TN, conductivity, and $PO_4\;^{3-}$-P were positively correlated with chlorophyll-a, whereas $NO_3\;^-$-N was negatively correlated. The environmental factors examined with the principal component method could be divided into three groups. The first factor group included conductivity, BOD, COD, TN, TP, $NH_4\;^+$-N, $PO_4\;^{3-}$-P, and SS, the second WT and DO, and the third pH and $NO_3\;^-$-N. Stepwise regression analysis showed that chlorophyll-a was influenced by conductivity, $PO_4\;^{3-}$-P, $NO_3\;^-$-N, and $NH_4\;^+$-N: Chlorophyll-a = 0.3661 ${\times}$ (Conductivity) - 0.3592 ${\times}$ ($PO_4\;^{3-}$-P) - 0.3449 ${\times}$ ($NO_3\;^-$-N) + 0.4362 ${\times}$ ($NH_4\;^+$-N).

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.243-264 / 2018
  • Although the worldwide battery market has recently spurred the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are used across a wide range of industries. However, lead-acid batteries have a serious problem: once even one of the several cells packed in a battery begins to degrade, deterioration of the whole battery progresses quickly. To overcome this problem, previous studies have attempted to identify the mechanism of battery deterioration in many ways. However, most of them analyzed data obtained in a laboratory rather than data obtained in the real world. Using real data can increase the feasibility and applicability of research findings. Therefore, this study aims to develop a model that predicts battery deterioration using data obtained in the real world. To this end, we collected data on changes in battery state by attaching sensors that monitor battery condition in real time to dozens of golf carts operated on a real golf course. As a result, a total of 16,883 samples were obtained. We then developed a model that predicts a precursor phenomenon indicating battery deterioration by analyzing the sensor data with machine learning techniques. 
As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cells 1 to 6, 8) lowest voltage of battery cells 1 to 6, 9) highest voltage of battery cells 1 to 6, 10) voltage of battery cells 1 to 6 at the beginning of operation, 11) voltage of battery cells 1 to 6 at the end of charge, 12) used amount of battery cells 1 to 6 during operation, 13) used amount of battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Because the values of the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation, each for battery cells 1 to 6) are similar across the cells, we conducted principal component analysis with varimax orthogonal rotation to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables that clustered together and used them as the final independent variables instead of the original variables, thereby reducing the dimension. We used decision tree, logistic regression, and Bayesian network algorithms to build prediction models. We also built prediction models using bagging of each of them, boosting of each of them, and RandomForest. Experimental results show that the prediction model using bagging of decision trees yields the best accuracy, 89.3923%. This study has a limitation in that additional variables that affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research. 
However, the battery deterioration prediction model proposed in the present study is expected to enable effective and efficient management of batteries used in the real field and, accordingly, to dramatically reduce the costs caused by undetected battery deterioration.