• Title/Summary/Keyword: data value prediction


Optimization of Support Vector Machines for Financial Forecasting (재무예측을 위한 Support Vector Machine의 최적화)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.241-254
    • /
    • 2011
  • Financial time-series forecasting is one of the most important issues in finance because it is essential for the risk management of financial institutions. Researchers have therefore tried to forecast financial time series using various data mining techniques such as regression, artificial neural networks, decision trees, and k-nearest neighbor. Recently, support vector machines (SVMs) have become popular in this research area because they do not require large amounts of training data and have a low risk of overfitting. However, a user must determine several design factors heuristically in order to use an SVM. For example, the selection of an appropriate kernel function and its parameters and the selection of a proper feature subset are major design factors. Beyond these factors, proper selection of an instance subset may also improve the forecasting performance of an SVM by eliminating irrelevant and distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection tries to choose proper instance subsets from the original training data. It may be considered a method of knowledge refinement, and it maintains the instance base. This study proposes a novel instance selection algorithm for SVMs. The proposed technique uses a genetic algorithm (GA) to optimize the instance selection process and the kernel parameters simultaneously. We call the model ISVM (SVM with instance selection). Experiments on stock market data are carried out using ISVM. The GA searches for optimal or near-optimal values of the kernel parameters and the relevant instances for the SVM. Each GA chromosome therefore carries two sets of codes: one for the kernel parameters and one for instance selection. For the controlling parameters of the GA search, the population size is set to 50 organisms, the crossover rate to 0.7, and the mutation rate to 0.1. As the stopping condition, 50 generations are permitted. The application data consist of technical indicators and the direction of change in the daily Korea Composite Stock Price Index (KOSPI). The total number of samples is 2,218 trading days. We split the data into three subsets (training, test, and hold-out) containing 1,056, 581, and 581 samples, respectively. This study compares ISVM with several other models: logistic regression (Logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), conventional SVM (SVM), and SVM with optimized parameters (PSVM). In particular, PSVM uses kernel parameters optimized by the genetic algorithm. The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data. For ISVM, only 556 of the 1,056 original training instances are used to produce the result. In addition, the two-sample test for proportions is used to examine whether ISVM significantly outperforms the other models. The results indicate that ISVM outperforms ANN and 1-NN at the 1% statistical significance level, and that it performs better than Logit, SVM, and PSVM at the 5% statistical significance level.
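The two-sample test for proportions mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard pooled z-test, not the authors' code; the function name and the hit counts in the usage lines are hypothetical, since the abstract reports only the hold-out size (581) and relative improvements.

```python
import math

def two_proportion_z_test(hits_a, hits_b, n_a, n_b):
    """Pooled two-sample z-test for H0: both models have the same hit ratio.

    Returns (z, two-sided p-value).
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis of equal hit ratios
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical hit counts on a 581-sample hold-out set (not from the paper)
z, p = two_proportion_z_test(hits_a=380, hits_b=340, n_a=581, n_b=581)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A p-value below 0.05 or 0.01 would correspond to the 5% and 1% significance levels quoted in the abstract.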

Development of a Traffic Accident Prediction Model and Determination of the Risk Level at Signalized Intersection (신호교차로에서의 사고예측모형개발 및 위험수준결정 연구)

  • Hong, Jeong-Yeol;Do, Cheol-Ung
    • Journal of Korean Society of Transportation
    • /
    • v.20 no.7
    • /
    • pp.155-166
    • /
    • 2002
  • Since the 1990s, the number of traffic accidents at intersections has been increasing, which calls for more urgent measures to ensure intersection safety. This study set out to analyze the road conditions, traffic conditions, and traffic operation conditions at signalized intersections, to identify the elements that impair safety, and to develop a traffic accident prediction model that evaluates the safety of an intersection using the correlation between those elements and accidents. In addition, in developing the model for signalized intersections, the focus was on suggesting appropriate traffic safety policies by dealing with the danger elements in advance and on enhancing intersection safety. The data for the study were collected at an intersection located in Wonju city from January to December 2001. They consisted of the number of accidents, the road conditions, the traffic conditions, and the traffic operation conditions at the intersection. The collected data were first analyzed statistically to identify the elements closely correlated with accidents. These included the area pattern, the land use, the bus stopping activities, the parking and stopping activities on the road, the total volume, the turning volume, the number of lanes, the width of the road, the intersection area, the cycle, the sight distance, and the turning radius. These elements were used in a second correlation analysis. All of them were significant at the 95% level or higher, and there were few correlations between the independent variables. The variables that affected the accident rate were the number of lanes, the turning radius, the sight distance, and the cycle, which were used to develop a traffic accident prediction model formula that takes their distribution into account. The accuracy of the model formula was compared with that of a general linear regression model. In addition, domestic accident statistics were investigated to analyze the distribution of accidents and to classify intersections according to risk level. Finally, the Spearman rank correlation coefficient was applied to the results to check whether the model was appropriate. As a result, the coefficient of determination was highly significant, with a value of 0.985, and the ranking of the intersections by risk level was also appropriate. The actual and predicted numbers of accidents were compared in terms of risk level, and they fell into about the same risk level for 80% of the intersections.
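The rank-based check described in the abstract can be illustrated with a small Spearman rank correlation routine. This is a generic sketch (Pearson correlation computed on average ranks, which handles ties); the function names and the accident counts below are hypothetical and are not taken from the paper.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical actual vs. predicted accident counts at six intersections
actual = [12, 7, 3, 9, 15, 5]
predicted = [11.2, 6.5, 3.9, 9.8, 13.7, 4.1]
print(f"Spearman rho = {spearman_rho(actual, predicted):.3f}")
```

A value close to 1 means the model orders the intersections by risk in nearly the same way as the observed accident counts do.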

A Study of Accumulated Ecosystem Carbon in Mt. Deogyusan, Korea (덕유산의 생태계 탄소축적량 산정에 관한 연구)

  • Jeong, Seok-hee;Eom, Ji-young;Jang, Ji-hye;Lee, Jae-ho;Cho, Koo-hyun;Lee, Jae-seok
    • Korean Journal of Environmental Biology
    • /
    • v.33 no.4
    • /
    • pp.459-467
    • /
    • 2015
  • Understanding carbon storage in regional-scale ecosystems provides important data for predicting changes in the global carbon cycle. Field data collected in various ecosystems are therefore very useful for improving the accuracy of model predictions. We estimated the total accumulated ecosystem carbon in Deogyusan National Park (DNP), which has a naturally well-preserved ecosystem. In DNP, the vegetation was classified into four main communities: the Quercus mongolica community (12,636.9 ha, 54.8%), the Quercus variabilis community (2,987.0 ha, 13.0%), the Pinus densiflora community (5,758.0 ha, 25.0%), and the Quercus serrata community (402.9 ha, 1.7%). Biomass and soil carbon were estimated from biomass allometric equations based on DBH and from the carbon contents of the litter and soil (0 to 30 cm) layers collected in 3 plots ($30\,\mathrm{cm} \times 30\,\mathrm{cm}$) in each community. Biomass and soil carbon were as high as 1,759,000 tC and 7,776,000 tC, respectively, in the Quercus mongolica community in the DNP area. The accumulated ecosystem carbon in the Quercus mongolica, Quercus variabilis, Quercus serrata, and Pinus densiflora communities was 9,536,000 tC, 1,405,000 tC, 147,000 tC, and 346,000 tC, respectively. The total ecosystem carbon in DNP was estimated at 11,434,000 tC.
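As a quick arithmetic check, the per-community figures quoted in the abstract can be summed and compared with the reported park-wide total. The sketch below uses only the numbers given above; the variable names are illustrative.

```python
# Accumulated ecosystem carbon per community (tC), as quoted in the abstract
community_carbon_tC = {
    "Quercus mongolica": 9_536_000,
    "Quercus variabilis": 1_405_000,
    "Quercus serrata": 147_000,
    "Pinus densiflora": 346_000,
}

total_tC = sum(community_carbon_tC.values())
# Share of the park-wide total held by each community
shares = {name: c / total_tC for name, c in community_carbon_tC.items()}

print(f"Total: {total_tC:,} tC")  # agrees with the reported 11,434,000 tC
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```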

Analysis of Building Vulnerabilities to Typhoon Disaster Based on Damage Loss Data (태풍 재해에 대한 건물 취약성의 피해손실 데이터 기반 분석)

  • Ahn, Sung-Jin;Kim, Tae-Hui;Son, Ki-Young;Kim, Ji-Myong
    • Journal of the Korea Institute of Building Construction
    • /
    • v.19 no.6
    • /
    • pp.529-538
    • /
    • 2019
  • Typhoons can cause significant financial damage worldwide. For this reason, states, local governments, and insurance companies attempt to quantify and mitigate the financial risks of these natural disasters by developing typhoon risk assessment models. The importance of such models is increasing, and reflecting local vulnerabilities is important for enabling sophisticated assessments. Although practical studies of economic losses associated with natural disasters have identified essential risk indicators, comprehensive studies covering the correlation between vulnerability and economic loss are still needed. The purpose of this study is to identify typhoon damage indicators and to develop evaluation indicators for typhoon damage prediction functions, using the losses from Typhoon Maemi as data. This study analyzes actual loss records of Typhoon Maemi provided by local insurance companies to prepare for a scenario of maximum losses. To create a vulnerability function, the authors used wind speed, distance from the coast, total property value, construction type, number of floors, and number of underground floors as indicators. The results and metrics of this study provide practical guidelines for government agencies and insurance companies in developing vulnerability functions that reflect the actual financial losses and regional vulnerabilities of buildings.

Development and Application of the Mode Choice Models According to Zone Sizes (분석대상 규모에 따른 수단분담모형의 추정과 적용에 관한 연구)

  • Kim, Ju-Yeong;Lee, Seung-Jae;Kim, Do-Gyeong;Jeon, Jang-U
    • Journal of Korean Society of Transportation
    • /
    • v.29 no.6
    • /
    • pp.97-106
    • /
    • 2011
  • The mode choice model is an essential element for estimating the demand for new means of transportation in the planning stage as well as in the establishment phase. In general, the demand analysis models currently used for mode choice analysis apply common utility function parameters to every region, which causes inaccuracy in forecasting mode choice behavior. A critical problem with common parameters is that a single parameter set cannot reflect the different distributions of the travel time and travel cost coefficients across populations. Consequently, the resulting model fails to accurately explain policy variables such as travel time and travel cost. In particular, the nonlinear logit model applied to aggregate data is vulnerable to aggregation error. The purpose of this paper is to account for regional characteristics by adopting parameters fitted to each area, so as to reduce prediction errors and improve the accuracy of the resulting mode choice model. To estimate the parameters of each area, this study used the Household Travel Survey Data of the Metropolitan Transportation Authority. To verify the model, the value of time is evaluated as a marginal rate of substitution, and statistical tests are carried out on the resulting coefficients. To cross-check the applicability and reliability of the model, changes in mode choice are analyzed for the opening of Seoul subway line 9, and the results are compared with those from the existing model developed without considering regional characteristics.
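The relationship between a logit mode choice model and the value of time can be sketched as follows. This is a generic multinomial logit illustration, not the paper's estimated model: the function names, the linear utility form, and all coefficient and trip values are hypothetical assumptions chosen only to show the mechanics.

```python
import math

def utility(asc, beta_time, beta_cost, time_min, cost_krw):
    """Linear-in-parameters utility for one mode (assumed functional form)."""
    return asc + beta_time * time_min + beta_cost * cost_krw

def mode_shares(utilities):
    """Multinomial logit choice probabilities from a dict of mode utilities."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {mode: math.exp(u - m) for mode, u in utilities.items()}
    total = sum(exp_u.values())
    return {mode: e / total for mode, e in exp_u.items()}

def value_of_time(beta_time, beta_cost):
    """Marginal rate of substitution between time and cost (cost units per minute)."""
    return beta_time / beta_cost

# Hypothetical region-specific coefficients (per minute, per KRW)
beta_time, beta_cost = -0.05, -0.0005
print(f"Value of time: about {value_of_time(beta_time, beta_cost):.0f} KRW/min")

# Hypothetical trip attributes before and after a new subway line shortens travel time
before = mode_shares({
    "car": utility(0.5, beta_time, beta_cost, 40, 3000),
    "bus": utility(0.0, beta_time, beta_cost, 55, 1200),
    "subway": utility(0.2, beta_time, beta_cost, 50, 1300),
})
after = mode_shares({
    "car": utility(0.5, beta_time, beta_cost, 40, 3000),
    "bus": utility(0.0, beta_time, beta_cost, 55, 1200),
    "subway": utility(0.2, beta_time, beta_cost, 35, 1300),
})
print(before)
print(after)
```

Fitting a separate `beta_time` and `beta_cost` per region, as the abstract proposes, would give each region its own value of time rather than one shared ratio.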

Physiological responses involved in reactive oxygen species (ROS) of rice plant under alone or multi artificial stress conditions

  • Kim, Yoonha;Waqas, Muhammad;Khan, Abdul Latif;Mun, Bong-Gyu;Yun, Byung-Wook;Lee, In-Jung
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2017.06a
    • /
    • pp.203-203
    • /
    • 2017
  • The Earth's climate is changing rapidly because of the increasing carbon dioxide content of the atmosphere, and climate prediction models anticipate that the Earth's surface temperature will rise by 3 to $5\,^{\circ}\mathrm{C}$ in the next 50 to 100 years. The frequency of unexpected weather events such as drought, salinity, low or high temperature, and flooding will therefore increase worldwide. Furthermore, increased atmospheric temperature can also influence the spread of pests and pathogens. To prevent enormous grain losses from unexpected weather conditions, studies of combined stress conditions, such as abiotic plus biotic stress, are needed. Our research therefore focused on physiological responses of rice plants under combined abiotic and biotic stress. To induce uniform stress conditions, we used NaCl (100 mM) and salicylic acid (SA; 0.5 and 1.0 mM) as the respective stress stimulators. Each artificial abiotic and biotic stress inducer was applied to hydroponically grown rice seedlings, alone or together, for four days. The data were collected in a time-dependent manner [1, 2, 3, and 4 day(s) after treatment (DAT)] and matched our expectation that shoot length and shoot fresh weight would decrease under single and combined abiotic and biotic stress. The lipid peroxidation content increased significantly ($1.5 \pm 0.2$ to $2.7 \pm 0.1$ mg of MDA $\mathrm{g}^{-1}$ FW) in the first two days in plants exposed to either stress, and showed the opposite trend ($0.5 \pm 0.01$ to $0.1 \pm 0.001$ mg of MDA $\mathrm{g}^{-1}$ FW) in the last two days under the multi-stress condition. Superoxide dismutase (SOD) activity did not differ from the control under biotic stress alone (0.5 and 1.0 mM SA); however, it increased significantly under the multi-stress condition and under abiotic stress alone. In contrast, catalase (CAT) and ascorbate peroxidase (APX) activities decreased significantly under biotic stress alone and under combined abiotic and biotic stress. In particular, the activities of both enzymes decreased more under the multi-stress condition than under biotic stress alone. The relative mRNA expression levels of the CAT and APX enzymes agreed with the spectrophotometric results. Correlation values between each stress condition and the phenotypic data showed that biotic stress was highly correlated with CAT and APX activity, whereas abiotic stress was significantly correlated with SOD activity.


A study on applicability of volumetric water content to predict shallow failure (표층붕괴 예측을 위한 체적함수비 적용성 연구)

  • Suk, Jae-Wook;Song, Hyo-Sung;Kang, Hyo-Sub;Kim, Ho-Jong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.20 no.12
    • /
    • pp.737-746
    • /
    • 2019
  • Most landslides in the country are shallow failures triggered by intense rainfall. Many researchers have shown that shallow failure may be predicted from the volumetric water content (VWC). This study examined how to identify shallow failure using the gradient characteristics of the volumetric water content. For this purpose, flume experiments were conducted using weathered granite soil. To confirm the saturation state of the surface layer under rainfall intensities of 30 and 50 mm/hr, VWC sensors were installed at depths of 10 and 20 cm on the upper, middle, and lower slope. The test results showed that shallow failure determination using VWC has limited applicability, depending on the slope angle. In addition, the effective cumulative rainfall, which depends on the rainfall infiltration velocity, is considered the main factor governing the failure time. Failure prediction using the gradient of the VWC depends on the installation location and depth of the sensor. According to the experimental data, the value measured at a depth of 20 cm on the lower slope was the most effective. The VWC analysis method and the method of selecting the installation location confirmed in this study can therefore provide important data for establishing VWC-based measurement criteria in the future.

Selection of Optimal Vegetation Indices for Estimation of Barley & Wheat Growth based on Remote Sensing - An Application of Unmanned Aerial Vehicle and Field Investigation Data - (원격탐사 기반 맥류 작황 추정을 위한 최적 식생지수 선정 - UAV와 현장 측정자료를 활용하여 -)

  • Na, Sang-il;Park, Chan-won;Cheong, Young-kuen;Kang, Chon-sik;Choi, In-bae;Lee, Kyung-do
    • Korean Journal of Remote Sensing
    • /
    • v.32 no.5
    • /
    • pp.483-497
    • /
    • 2016
  • Unmanned Aerial Vehicle (UAV) imagery is being assessed for analyzing within-field spatial variability in precision agricultural management, because it can be acquired quickly during critical periods of rapid crop growth. This study derives barley and wheat growth prediction equations using UAV-derived vegetation indices. UAV imagery of the test plots was taken six times from late February to late June during the barley and wheat growing season. The field spectral reflectance of five varieties (Keunal-bori, Huinchalssal-bori, Saechalssal-bori, Keumkang, and Jopum) was measured during the growing period using a ground spectroradiometer, and three growth parameters (plant height, shoot dry weight, and number of tillers) were recorded in each ground survey. Among the six vegetation indices (VIs), RVI, NDVI, NGRDI, and GLI showed a strong relationship between field-measured and image-derived values, as indicated by their coefficients of determination. Using the field investigation data, regression curves for the vegetation indices were derived, and the growth parameters were compared with the VI values.
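The four indices named in the abstract have commonly used definitions in terms of band reflectances, sketched below. The abstract does not state which exact formulations or sensor bands the study used, so the formulas and the sample reflectance values here are assumptions based on the standard definitions.

```python
def rvi(nir, red):
    """Ratio Vegetation Index: NIR / Red."""
    return nir / red

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)

def ngrdi(green, red):
    """Normalized Green-Red Difference Index: (Green - Red) / (Green + Red)."""
    return (green - red) / (green + red)

def gli(red, green, blue):
    """Green Leaf Index: (2*Green - Red - Blue) / (2*Green + Red + Blue)."""
    return (2 * green - red - blue) / (2 * green + red + blue)

# Hypothetical reflectances for a healthy canopy pixel (fractions of 1)
blue, green, red, nir = 0.04, 0.10, 0.05, 0.50
print(f"RVI   = {rvi(nir, red):.2f}")
print(f"NDVI  = {ndvi(nir, red):.3f}")
print(f"NGRDI = {ngrdi(green, red):.3f}")
print(f"GLI   = {gli(red, green, blue):.3f}")
```

RVI and NDVI need a near-infrared band, while NGRDI and GLI can be computed from an ordinary RGB camera, which is one reason the latter are often paired with UAV imagery.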

Estimation of Countermeasures and Efficient Use of Volume of Artificial Reefs Deployed in Fishing Grounds (어초어장으로 시설된 사각형어초의 수량 산정 및 유효공용적 평가)

  • Kim, Ho-Sang;Lee, Jeong-Woo;Kim, Jong-Ryeol;Yoon, Han-Sam
    • Journal of the Korean Society for Marine Environment & Energy
    • /
    • v.12 no.3
    • /
    • pp.181-187
    • /
    • 2009
  • To estimate the status and volume of artificial reefs (ARs) deployed on the sea bottom in fishing grounds, this study assessed the initial volume of ARs, the cubic volume of AR groups, and the porosity of each AR using image data collected during a survey with a multi-beam echo sounder (MBES) and a side scan sonar (SSS). These results were compared with data collected during diver surveys and used to develop a new method and prediction formulas for countermeasures, facility volume, and efficient use of volume for deployed ARs (cubic concrete). From the field survey results for nine ARs deployed in the Busan Sea region, the average value of coefficient k (the efficient use of volume ratio) among the ARs was 0.753, and the relationship between coefficient k and the year (Yr) of deployment was calculated as k = 0.0023Yr + 0.725. The relationship between these two factors was poor. In the years following deployment, coefficient k and year of deployment were not correlated, in spite of ground hardening due to subsidence and the reduced distance between ARs. Consequently, it is reasonable to suppose that coefficient k was determined by the bottom surface conditions and the initial deployment conditions.


Construction of High-Resolution Topographical Map of Macro-tidal Malipo beach through Integration of Terrestrial LiDAR Measurement and MBES Survey at inter-tidal zone (대조차 만리포 해안의 지상 LiDAR와 MBES를 이용한 정밀 지형/수심 측량 및 조간대 접합을 통한 정밀 지형도 작성)

  • Shim, Jae-Seol;Kim, Jin-Ah;Kim, Seon-Jeong;Kim, Sang-Ik
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.22 no.1
    • /
    • pp.58-66
    • /
    • 2010
  • In this paper, we constructed a high-resolution topographical map of the macro-tidal Malipo beach by integrating terrestrial LiDAR measurements and MBES survey data in the inter-tidal zone. To acquire sufficient information on the inter-tidal zone, we carried out terrestrial LiDAR measurement at ebb tide, with the scanner mounted on the roof of a vehicle equipped with DGPS and operated by the go-stop-scan method, and MBES depth surveying at high tide, together with tide gauge and eye staff measurements for tide correction and MSL calculation. To integrate the two kinds of data, we unified the vertical coordinate reference to Incheon MSL. The mean error in the overlapping inter-tidal zone is about 2 to 6 cm. To verify the accuracy of the terrestrial LiDAR, RTK-DGPS measurement was carried out simultaneously, and the RMSE of the difference in Z values is about 4 to 7 cm. The resolution of the Malipo topographical map is 50 cm, and it was built as a GIS-based DEM (Digital Elevation Model). It is now used as input topography for storm-surge inundation prediction models. It will also be possible to monitor beach processes through long-term periodic measurement and GIS-based 3D spatial analysis that calculates erosion and deposition, taking into account artificial beach transition and coastal environmental parameters.