• Title/Summary/Keyword: Predictive Power

A study on the development of severity-adjusted mortality prediction model for discharged patient with acute stroke using machine learning (머신러닝을 이용한 급성 뇌졸중 퇴원 환자의 중증도 보정 사망 예측 모형 개발에 관한 연구)

  • Baek, Seol-Kyung;Park, Jong-Ho;Kang, Sung-Hong;Park, Hye-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.11 / pp.126-136 / 2018
  • The purpose of this study was to develop a severity-adjusted model for predicting mortality in acute stroke patients using machine learning. From the Korean National Hospital Discharge In-depth Injury Survey for 2006 to 2015, patients with disease codes I60-I63 (KCD 7) were extracted for analysis. Three tools were used to adjust for comorbidity severity: the Charlson Comorbidity Index (CCI), the Elixhauser Comorbidity Index (ECI), and the Clinical Classifications Software (CCS). Severity-adjusted mortality prediction models for patients with acute stroke were developed using logistic regression, decision tree, neural network, and support vector machine methods. The most common comorbidity in stroke patients was uncomplicated hypertension (43.8%) in the ECI and essential hypertension (43.9%) in the CCS. Among the CCI, ECI, and CCS, the CCS had the highest AUC value and was therefore confirmed as the best severity-adjustment tool. For the model using the CCS together with main diagnosis, gender, age, hospitalization route, and presence of surgery, the AUC values were 0.808 for logistic regression, 0.785 for the decision tree, 0.809 for the neural network, and 0.830 for the support vector machine. The support vector machine therefore achieved the best predictive power. The results of this study can be used in establishing future health policy.
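
As an illustration of the AUC metric used to compare the models in this abstract, the following is a minimal sketch in Python. It computes AUC as the share of (deceased, survivor) pairs in which the deceased patient received the higher predicted risk, with ties counted as one half. The function name `auc_from_scores` and the toy data are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted risks.
died = [1, 0, 1, 0, 0, 1]
risk = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auc_from_scores(died, risk))
```

Under this reading, an AUC of 0.5 corresponds to a model that ranks patients no better than chance, and 1.0 to a model that ranks every deceased patient above every survivor.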

Time series clustering for AMI data in household smart grid (스마트그리드 환경하의 가정용 AMI 자료를 위한 시계열 군집분석 연구)

  • Lee, Jin-Young;Kim, Sahm
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.791-804 / 2020
  • With the development of ICT in the smart grid environment, residential electricity consumption can be predicted more accurately by using the real-time household consumption data collected by AMI. This paper studied models that predict residential power load with ARIMA, TBATS, and NNAR, based on hourly household electricity consumption data. Rather than forecasting the consumption of all households at once, it grouped households with similar load profiles into clusters, fitted a model to each cluster, and computed the expected total consumption by aggregating the cluster-level predictions. Because electricity consumption is typical time series data, clustering methods suited to time series were chosen: Dynamic Time Warping (DTW) and a periodogram-based method. Forecasting residential electricity consumption after clustering similar households performed better than forecasting all households at once; the NNAR model performed best in summer and the TBATS model in winter. Finally, clustering improved forecasting performance most when the DTW method, which brought out the differences between the patterns of each cluster, was used.
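
To show why Dynamic Time Warping suits load profiles whose peaks are shifted in time, here is a minimal Python sketch of the textbook DTW distance with an absolute-difference cost and no window constraint. The function name `dtw_distance` and the toy hourly profiles are hypothetical; the paper's own implementation and settings are not described in the abstract.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two sequences,
    using |a_i - b_j| as the local cost and no warping window."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical hourly load profiles (kWh).
early_peak = [1, 3, 1, 1]
late_peak = [1, 1, 3, 1]
flat = [1, 1, 1, 1]

print(dtw_distance(early_peak, late_peak))  # same shape, shifted peak
print(dtw_distance(early_peak, flat))       # peak versus no peak
```

In this toy case the two peaked households end up closer to each other than either is to the flat household, even though a point-by-point comparison would treat the shifted peak as a large difference.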

Analysis of Intrinsic Patterns of Time Series Based on Chaos Theory: Focusing on Roulette and KOSPI200 Index Future (카오스 이론 기반 시계열의 내재적 패턴분석: 룰렛과 KOSPI200 지수선물 데이터 대상)

  • Lee, HeeChul;Kim, HongGon;Kim, Hee-Woong
    • Knowledge Management Research / v.22 no.4 / pp.119-133 / 2021
  • As large amounts of data are produced in every industry, many studies on time series pattern prediction are being conducted to support quick business decisions. However, the uncertainty inherent in nonlinear time series data limits the prediction of specific patterns, which makes strategic decisions in corporate management difficult. In recent decades, various studies have tried to predict time series that follow irregular random walk models, using data suited to industrial purposes such as demand/supply and financial market data, but it remains difficult to predict specific rules and achieve sustainable corporate objectives. In this study, prediction results for roulette data and financial market data were compared and analyzed using chaos analysis, and meaningful results were derived. The study confirmed that chaos analysis is useful as a new approach to analyzing time series data. By comparing the characteristics of roulette games with the time series of Korean stock index futures, it found that predictive power can be improved if a trend is confirmed, and that the approach is meaningful for determining whether highly uncertain nonlinear time series data have a specific pattern.

Major environmental factors and traits of invasive alien plants determining their spatial distribution

  • Oh, Minwoo;Heo, Yoonjeong;Lee, Eun Ju;Lee, Hyohyemi
    • Journal of Ecology and Environment / v.45 no.4 / pp.277-286 / 2021
  • Background: As trade increases, the influx of alien species and their spread to new regions have become prevalent and are no longer an unusual problem. Anthropogenic activities and climate change have made it common for alien species to be distributed outside their native range. As a result, alien species can be found almost anywhere, differing only in intensity. The prevalent distribution of alien species adversely affects ecosystems, and a strategic management plan must be established to control them effectively. To this end, hot spots and cold spots were analyzed according to the degree of distribution of invasive alien plants, and the major environmental factors related to hot spots were identified. We analyzed 10,287 distribution points of 126 alien plant species, collected through the national survey of alien species, using the Hierarchical Modelling of Species Communities (HMSC) framework. Results: The explanatory power and the fourfold cross-validation predictive power of the model were 0.91 and 0.75 in AUC, respectively. Hot spots of invasive plants were found in the Seoul metropolitan area, Daegu metropolitan city, Chungcheongbuk-do Province, the southwest shore, and Jeju island. In general, hot spots were found where the maximum summer temperature, winter precipitation, and road density were higher, whereas temperature seasonality, annual temperature range, summer precipitation, and distance to rivers and the sea were negatively related to hot spots. According to the model, functional traits accounted for 55% of the variance explained by the environmental factors. Species with higher specific leaf area were found more often where temperature seasonality was low. Taller species preferred a larger annual temperature range. Heavier seed mass was preferred only where the maximum summer temperature exceeded 29 ℃.
Conclusions: In this study, hot spots were places where, on average, 2.1 times more alien plant species were distributed than in non-hot spots (33.5 vs 15.7 species). Hot spots of invasive plants were expected to appear under less stressful climate conditions, such as low fluctuation of temperature and precipitation. Disturbance by anthropogenic factors or water flow also had a positive influence on hot spots. These results are consistent with previous reports that invasive plants follow ruderal or competitive strategies rather than a stress-tolerant strategy. Functional traits are closely related to the ecological strategies of plants because they shape the response of species to various environmental filters, and our results confirmed this. Therefore, to control alien plants effectively, the occurrence of disturbed sites where alien plants can grow in large quantities should be minimized, and management of riverside waterfronts is required.
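
The gap between explanatory power (0.91) and fourfold cross-validation predictive power (0.75) reflects the difference between scoring a model on the data it was fitted to and scoring it on held-out data. The Python sketch below shows one simple way to form four folds. The function name `k_fold_indices` and the interleaved fold assignment are assumptions for illustration; the abstract does not say how the study assigned its folds.

```python
def k_fold_indices(n_samples, k=4):
    """Yield (train, held_out) index lists for k-fold cross-validation.
    Samples are assigned to folds in an interleaved pattern (0, k, 2k, ...),
    which is an illustrative choice, not the study's procedure."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        yield train, held_out


# Each sample is held out exactly once; a model would be fitted on `train`
# and scored on `held_out`, and the four scores would then be combined.
for train, held_out in k_fold_indices(10, k=4):
    print(len(train), held_out)
```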

Predicting Habitat Suitability of Carnivorous Alert Alien Freshwater Fish (포식성 유입주의 어류에 대한 서식처 적합도 평가)

  • Taeyong, Shim;Zhonghyun, Kim;Jinho, Jung
    • Ecology and Resilient Infrastructure / v.10 no.1 / pp.11-19 / 2023
  • Alien species are known to threaten regional biodiversity worldwide, which has increased global interest in the introduction of alien species. The Ministry of Environment of Korea has designated potentially threatening species that have not yet been introduced into the country as alert alien species, to prevent damage to the ecosystem. In this study, the potential habitats of Esox lucius and Maccullochella peelii, predatory fish designated as alert alien species, were predicted on a national scale. Habitat suitability was evaluated using the EHSM (Ecological Habitat Suitability Model), with water temperature data as input to calculate Physiological Habitat Suitability (PHS). The predictions showed that the PHS of the two fishes was mainly controlled by heat or cold stress, which resulted in a biased habitat distribution. E. lucius was predicted to prefer basins at high latitudes (Han and Geum Rivers), while M. peelii preferred metropolitan areas. These differences suggest that the invasion pattern of each alien fish may differ according to its thermal preference. Further studies are required to enhance the model's predictive power, and predictions under climate change scenarios are needed to help establish sustainable management plans.

Employee's Business Outlook Disclosed Through Social Media And Employment Growth : The Case of Jobplanet (소셜미디어를 통한 직원의 기업전망 평가와 고용증가와의 상관성 : 잡플래닛 기업전망을 대상으로)

  • Byeongsoo, Kim;Ju Young, Kang
    • Smart Media Journal / v.11 no.10 / pp.9-21 / 2022
  • The recent expansion of social media use has given users the opportunity to express opinions in real time in fields such as society, the economy, politics, and culture, and has brought about many platforms that provide information about companies. Among them, Glassdoor.com, which started in the US in 2008, provides evaluations of companies by their current and former employees, as well as outlooks for each company's growth. Such a platform is useful for people who want to find or change jobs. In addition, various studies have shown that the company information provided through these platforms is useful for investors as well. This study tested whether employees' corporate growth prospects provided by Jobplanet, a Korean platform similar in function to Glassdoor.com, have the power to predict actual corporate growth. The outlooks provided by Jobplanet and company financial indicator data from FnGuide were collected, assembled into panel data, and analyzed using fixed-effects regression. The results showed that companies with positive outlooks had higher employment growth than companies with negative outlooks. Companies with a neutral outlook also had a higher employment growth rate than companies with a negative outlook.
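
For readers unfamiliar with fixed-effects regression on panel data, the following Python sketch shows the within transformation for a single regressor: each firm's own mean is subtracted from both variables before the slope is estimated, so that time-invariant firm differences drop out. The function name `within_slope`, the variable names, and the toy panel are hypothetical; the study's actual specification and variables are not given in the abstract.

```python
from collections import defaultdict


def within_slope(firms, x, y):
    """Fixed-effects (within) slope of y on x with one regressor:
    demean x and y inside each firm, then run OLS through the origin."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # sum of x, sum of y, count
    for firm, xi, yi in zip(firms, x, y):
        t = totals[firm]
        t[0] += xi
        t[1] += yi
        t[2] += 1
    num = den = 0.0
    for firm, xi, yi in zip(firms, x, y):
        sum_x, sum_y, count = totals[firm]
        x_dev = xi - sum_x / count
        y_dev = yi - sum_y / count
        num += x_dev * y_dev
        den += x_dev * x_dev
    if den == 0.0:
        raise ValueError("x does not vary within any firm")
    return num / den


# Hypothetical panel: two firms with different baselines but the same
# within-firm relationship (y rises by 2 for each unit of x).
firms = ["A", "A", "A", "B", "B", "B"]
outlook = [0, 1, 2, 1, 2, 3]
growth = [10, 12, 14, -3, -1, 1]
print(within_slope(firms, outlook, growth))
```

Because firm B has a much lower baseline than firm A, a pooled regression on these toy data would be distorted by the difference between firms, while the within estimator recovers the common within-firm slope.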

Apartment Price Prediction Using Deep Learning and Machine Learning (딥러닝과 머신러닝을 이용한 아파트 실거래가 예측)

  • Hakhyun Kim;Hwankyu Yoo;Hayoung Oh
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.59-76 / 2023
  • Since the COVID-19 era, the rise in apartment prices has been unconventional. In such an uncertain real estate market, price prediction research is very important. In this paper, a model is built to predict future actual transaction prices of apartments, after constructing a large data set of 870,000 records from 2015 to 2020 through data collection and crawling of various real estate sites and gathering as many variables as possible. The study first addressed multicollinearity by removing and combining variables. Then, five variable selection methods were used to extract meaningful independent variables: Forward Selection, Backward Elimination, Stepwise Selection, L1 Regularization, and Principal Component Analysis (PCA). Four machine learning and deep learning algorithms, a deep neural network (DNN), XGBoost, CatBoost, and Linear Regression, were then trained after hyperparameter optimization, and their predictive power was compared. In an additional experiment, the number of nodes and layers of the DNN was varied to find the most appropriate configuration. Finally, the best-performing model was used to predict actual apartment transaction prices in 2021, and the predictions were compared with the actual 2021 data. The authors are confident that machine learning and deep learning will help investors make the right decisions when purchasing homes in various economic situations.

A Study on Estimating the Crossing Speed of Mobility Handicapped for the Activation of the Smart Crossing System (스마트횡단시스템 활성화를 위한 교통약자의 횡단속도 추정)

  • Hyung Kyu Kim;Sang Cheal Byun;Yeo Hwan Yoon;Jae Seok Kim
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.6 / pp.87-96 / 2022
  • Vulnerable road users, including elderly pedestrians, have relatively low walking speeds and slow cognitive response times due to reduced physical ability. Although a smart crossing system has been developed and operated to address this problem, it is difficult to operate a signal that reflects the appropriate walking speed for each pedestrian. In this study, crossing speed estimation models based on a neural network and on multiple regression were developed using image information collected in an area with a high proportion of vulnerable road users, to support the provision of optimal walking signals for vulnerable road users in real time. Actual traffic data collected from the urban traffic network of Paju-si, Gyeonggi-do were used. The performance of the models was evaluated with seven selected indicators, including the correlation coefficient and mean absolute error. The multiple linear regression model had a correlation coefficient of 0.652 and 0.182; the neural network model had a correlation coefficient of 0.823 and 0.105. The neural network model showed higher predictive power.
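
Two of the indicators named in this abstract, the correlation coefficient and the mean absolute error, can be computed as in the minimal Python sketch below. The function names and the toy crossing speeds are hypothetical and do not come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and estimated crossing speeds (m/s).
observed = [0.8, 1.0, 1.2, 1.4]
estimated = [0.9, 1.0, 1.1, 1.5]
print(pearson_r(observed, estimated))
print(mean_absolute_error(observed, estimated))
```

A higher correlation coefficient together with a lower mean absolute error indicates estimates that both track the observed speeds and stay close to them.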

A Study on the Effect of Investor Sentiment and Liquidity on Momentum and Stock Returns (투자자 심리와 유동성이 모멘텀과 주식수익률에 미치는 영향 연구)

  • In-Su, Kim
    • Journal of Industrial Convergence / v.20 no.11 / pp.75-83 / 2022
  • This study analyzes whether investor sentiment and liquidity explain the momentum phenomenon in the Korean stock market and whether they are risk factors for the asset pricing model. The empirical analysis used the monthly returns of non-financial companies listed on the stock market over the period 2000-2021. First, a momentum effect was found in Korea. This agrees with previous studies, and since 2000 the momentum effect has been accepted as a general phenomenon in the Korean stock market. Second, portfolios formed on investor sentiment show that sentiment influences momentum. In particular, when investor sentiment is negative, the return on the winner portfolio is high. Third, in the analysis based on liquidity, the momentum effect disappears and a reversal effect appears. Fourth, investor sentiment and liquidity were found to influence the momentum effect. This follows from the strong momentum effect in the illiquid stock group with negative investor sentiment. Fifth, the analysis of each factor's effect on stock returns showed that both the investor sentiment and liquidity factors have a significant impact on returns. The estimates provide evidence that including these two factors in the Carhart four-factor model significantly increases the model's predictive power. Therefore, investor sentiment and liquidity factors can be said to be important determinants of stock returns.

Predicting Site Quality by Partial Least Squares Regression Using Site and Soil Attributes in Quercus mongolica Stands (신갈나무 임분의 입지 및 토양 속성을 이용한 부분최소제곱 회귀의 지위추정 모형)

  • Choonsig Kim;Gyeongwon Baek;Sang Hoon Chung;Jaehong Hwang;Sang Tae Lee
    • Journal of Korean Society of Forest Science / v.112 no.1 / pp.23-31 / 2023
  • Predicting forest productivity is essential for evaluating sustainable forest management and enhancing forest ecosystem services. Ordinary least squares (OLS) and partial least squares (PLS) regression models were used to develop predictive models of forest productivity (site index) from the site characteristics and soil profile, along with the soil physical and chemical properties, of 112 Quercus mongolica stands. The adjusted coefficients of determination (adjusted R²) of the regression models were higher for the site characteristics and soil profile of the B horizon (R²=0.32) and the A horizon (R²=0.29) than for the soil physical and chemical properties of the B horizon (R²=0.21) and the A horizon (R²=0.09). The PLS models (R²=0.20-0.32) were better predictors of site index than the OLS models (R²=0.09-0.31). These results suggest that the regression models for Q. mongolica can be applied to predict forest productivity, but new variables may need to be developed to enhance the explanatory power of the regression models.
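
The adjusted coefficient of determination reported in this abstract penalizes the ordinary R² for the number of predictors in the model. A minimal Python sketch of the standard formula follows. The sample size of 112 stands comes from the abstract, while the function name, the unadjusted R² of 0.35, and the count of five predictors are hypothetical values chosen only for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# 112 stands as in the abstract; R^2 = 0.35 and 5 predictors are hypothetical.
print(adjusted_r2(0.35, n_samples=112, n_predictors=5))
```

For a fixed R² and sample size, adding predictors lowers the adjusted value, which is why it is the usual basis for comparing regression models with different numbers of variables.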