• Title/Summary/Keyword: ensemble prediction (앙상블 예측)

Data Fusion, Ensemble and Clustering for the Severity Classification of Road Traffic Accident in Korea (데이터융합, 앙상블과 클러스터링을 이용한 교통사고 심각도 분류분석)

  • 손소영;이성호
    • Proceedings of the Korean Operations and Management Science Society Conference / 2000.04a / pp.597-600 / 2000
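  • Steadily increasing traffic volume has led not only to environmental problems but also to substantial casualties and property damage from traffic accidents. This paper proposes a method for classifying traffic accident severity using data fusion, ensemble, and clustering methods, with the aim of contributing to accident prevention. To this end, we used classical data fusion techniques (Dempster-Shafer, Bayesian, and logistic fusion) to fuse the probabilities of property damage and bodily injury obtained from neural networks and decision trees. To improve classification accuracy, we also applied ensemble techniques (arcing and bagging), which select the majority class among multiple classification predictions obtained through bootstrap resampling. In addition, we propose a clustering method and show that it improves classification accuracy compared with the existing fusion and ensemble techniques.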
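
As a rough illustration of the Dempster-Shafer fusion mentioned in this abstract, the sketch below combines two belief assignments over the two severity classes with Dempster's rule of combination. The mass values, the class labels, and the function name `dempster_combine` are hypothetical; the abstract does not say how the authors derived masses from the neural network and decision tree outputs.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass that the two sources assign to contradictory hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


PD = frozenset({"property_damage"})
BI = frozenset({"bodily_injury"})
EITHER = PD | BI  # mass left on the whole frame expresses ignorance

# Hypothetical masses standing in for the two classifiers' outputs.
nn_mass = {PD: 0.6, BI: 0.3, EITHER: 0.1}
tree_mass = {PD: 0.5, BI: 0.4, EITHER: 0.1}

fused = dempster_combine(nn_mass, tree_mass)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

With these inputs the two sources agree more often on property damage than on bodily injury, so the fused mass on property damage rises above either individual source.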

Dynamic Web Information Predictive System Using Ensemble Support Vector Machine (앙상블 SVM을 이용한 동적 웹 정보 예측 시스템)

  • Park, Chang-Hee;Yoon, Kyung-Bae
    • The KIPS Transactions: Part B / v.11B no.4 / pp.465-470 / 2004
  • Web Information Predictive Systems are restricted in that they need user profiles and visible feedback information to obtain the necessary information. To overcome this restriction, this study designed and implemented a Dynamic Web Information Predictive System using an Ensemble Support Vector Machine. The system predicts web information and provides the information each user needs most from click-stream data and user feedback information, both of which carry clues about those needs. A performance test of the proposed system against an existing Web Information Predictive System showed that the proposed method is an excellent solution.

Vacant Technology Forecasting using Ensemble Model (앙상블모형을 이용한 공백기술예측)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.3 / pp.341-346 / 2011
  • Vacant technology forecasting is an important issue in the management of technology, because forecasting vacant technology contributes to the growth of nations and companies. To predict vacant technology, we need the results of technology development to date, and patents are an objective record of the results of research and development. In this paper, we study a method for quantitatively forecasting vacant technology using patent data. Because no single model can be guaranteed to be optimal, we propose an ensemble model that votes among several clustering criteria. The model combines statistical analysis methods with machine learning algorithms, with the aim of an objective and accurate forecast of vacant technology. To evaluate its performance objectively, we run experiments on patent documents from diverse technology fields.

Development of Artificial Neural Network Model for Prediction of Water Quality Parameters in Large Rivers with Tributary Inflow (지천유입이 있는 대하천에서 수질예측을 위한 인공신경망모델의 개발)

  • Seo, Il Won;Yun, Se Hun
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.141-141 / 2017
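  • In this study, we developed an artificial neural network (ANN) model that can predict eight water quality parameters of large rivers (water temperature, dissolved oxygen, pH, electrical conductivity, total nitrogen, total phosphorus, turbidity, and chlorophyll-a). The ANN is a data-driven model that can cope effectively with the uncertainty, non-stationarity, and complex interdependence of water quality data. For a data-driven model, constructing high-quality input data is the most important factor in raising prediction accuracy. We therefore used meteorological factors as well as the individual water quality parameters as inputs, constructed the input data by applying factor analysis and stratified sampling, and further improved prediction accuracy with an ensemble technique. When the developed model was used to predict water quality in the Bukhan River, which has tributary inflow, seven of the eight parameters (all except turbidity) showed explanatory power of 0.85 or higher, and a comparison of observed and forecast values gave an average error below 10%. Adding related parameters identified by factor analysis as inputs improved the results, and applying the ensemble technique greatly improved accuracy.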

Future projections of extreme precipitation by using CMIP6 database at finer scales over South Korea (CMIP6 기후변화 자료를 이용한 국내 미래 극한강우의 예측)

  • Kim, Jongho;Van Doi, Manh
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.368-368 / 2021
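  • Predicting changes in the magnitude and frequency of extreme events due to climate change is one of the main concerns in the design of hydraulic infrastructure. Information on the intensity, frequency, and duration of extreme events is usually required, and it is generally extracted from IDF (Intensity-Duration-Frequency) curves. Phase 6 of CMIP (Coupled Model Intercomparison Project) recently released projected climate time series based on new carbon dioxide emission scenarios and updated climate models, so IDF curves need to be re-estimated under the future climate change scenarios and their changes over future periods evaluated. In this study, daily data for 40 locations in Korea were downscaled to an hourly scale, and a stochastic weather generator was then used to generate 100-member ensembles of 30-year hourly time series. Annual maximum precipitation series were reconstructed from the generated time series and fitted to the GEV and Gumbel distributions. Goodness-of-fit tests (the Anderson-Darling (AD) and Kolmogorov-Smirnov (KS) tests) were performed, and the results were validated against IDF curves generated from historical data. Results obtained with CMIP5 climate change data were compared with those from CMIP6, and the main findings are as follows. (1) Rainfall intensity will increase in the future, and the increase will be most pronounced in the late period. (2) Future changes in hourly rainfall intensity are larger than those in daily rainfall intensity. (3) Ensembles should be used to quantify the uncertainty in rainfall intensity. (4) A spatial trend in the future changes in rainfall intensity is identified. Information on IDF curves estimated from ensembles of hourly time series will help in assessing the impacts of climate change and developing appropriate adaptation and response strategies.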
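
The step from an ensemble of annual-maximum series to a design rainfall value with an uncertainty band can be sketched as follows. This is only an illustration under stated assumptions: the Gumbel parameters are fitted by the method of moments (the abstract does not say which estimator the authors used), the 100 x 30 ensemble is synthetic, and the location and scale values (50 and 12) are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates of the Gumbel location and scale."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def return_level(loc, scale, return_period_years):
    """Value exceeded on average once every `return_period_years` years."""
    p_non_exceed = 1.0 - 1.0 / return_period_years
    return loc - scale * math.log(-math.log(p_non_exceed))


def sample_gumbel(rng, loc, scale):
    """Draw one Gumbel variate by inverting the CDF."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return loc - scale * math.log(-math.log(u))


# Synthetic stand-in for 100 ensemble members of 30 annual maxima each.
rng = random.Random(0)
ensemble = [[sample_gumbel(rng, 50.0, 12.0) for _ in range(30)] for _ in range(100)]

# One 100-year return level per member; the spread across members is the uncertainty.
levels_100yr = sorted(return_level(*fit_gumbel_moments(member), 100) for member in ensemble)
median_level = statistics.median(levels_100yr)
low, high = levels_100yr[4], levels_100yr[94]  # roughly the 5th and 95th percentiles
print(f"100-year level: median {median_level:.1f}, range {low:.1f} to {high:.1f}")
```

Reporting the range across members, rather than a single fitted value, is one simple way to express finding (3) of the abstract.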

Analysis of ensemble streamflow prediction effect on deriving dam releases for water supply (용수공급을 위한 댐 방류량 결정에서의 앙상블 유량 예측 효과 분석)

  • Kim, Yeonju;Kim, Gi Joo;Kim, Young-Oh
    • Journal of Korea Water Resources Association / v.56 no.12 / pp.969-980 / 2023
  • Since the 2000s, ensemble streamflow prediction (ESP) has been actively used in South Korea, primarily for hydrological forecasting. Despite its notable success in forecasting, its original objective of improving water resources system management has been relatively overlooked. This study therefore aims to demonstrate the utility of ESP in water resources management by creating a simple hypothetical exercise for dam operators and applying it to actual multi-purpose dams in South Korea. The hypothetical exercise showed that even when the means of ESP are identical, different standard deviations can result in different costs. Subsequently, using sampling stochastic dynamic programming (SSDP) and considering the capacity-inflow ratio (CIR), optimal release patterns were derived for Soyang Dam (CIR = 1.345) and Chungju Dam (CIR = 0.563) for ESP types W and P. For this analysis, Type W was defined as having a standard deviation equal to the mean inflow, and Type P as having a standard deviation ten times the mean inflow. Simulated operations were conducted from 2020 to 2022 using the derived optimal releases. The results indicate that for Chungju Dam, more aggressive optimal release patterns were derived under the type with the smaller standard deviation, and the simulated operations produced satisfactory outcomes. Soyang Dam showed similar results in terms of optimal release, but there was no significant difference between types W and P in the simulation because of its large CIR. Ultimately, this study highlights that even with the same mean values, the standard deviation of ESP affects optimal release patterns and simulation outcomes. It also underscores that systems with smaller CIRs are more sensitive to such uncertainties. Based on these findings, there is potential to improve South Korea's current operational practices, which rely solely on single representative values for water resources management.
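
The claim that two ensembles with the same mean can lead to different costs and releases can be reproduced with a toy calculation. The sketch below is not the authors' exercise and is not SSDP: it is a one-step grid search over releases, with a made-up quadratic cost, storage, demand, and inflow values, all of which are assumptions chosen only for illustration.

```python
import statistics


def expected_cost(release, storage, inflows, demand=100.0, min_storage=50.0):
    """Supply-deficit cost now plus the expected storage-shortfall cost next period."""
    supply_deficit = max(0.0, demand - release) ** 2
    shortfalls = [max(0.0, min_storage - (storage + q - release)) ** 2 for q in inflows]
    return supply_deficit + statistics.fmean(shortfalls)


def best_release(storage, inflows, candidates):
    """Release that minimizes the expected cost over the ensemble members."""
    return min(candidates, key=lambda r: expected_cost(r, storage, inflows))


# Two hypothetical ensembles with the same mean inflow and different spread.
deviations = [-2, -1, 0, 1, 2]
mean_inflow = 60.0
narrow = [mean_inflow + 5.0 * z for z in deviations]
wide = [mean_inflow + 25.0 * z for z in deviations]

storage = 80.0
candidates = range(0, 101)
for name, ensemble in (("narrow", narrow), ("wide", wide)):
    release = best_release(storage, ensemble, candidates)
    cost = expected_cost(release, storage, ensemble)
    print(f"{name}: mean={statistics.fmean(ensemble):.0f} release={release} cost={cost:.1f}")
```

In this toy setting the narrow ensemble supports a larger release at a lower expected cost, while the wide ensemble pushes the operator toward holding water back, which matches the direction of the result described in the abstract.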

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.121-139 / 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is of great importance to financial institutions. Many researchers have studied bankruptcy prediction over the past three decades. The current research uses ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the resulting bootstrap samples. Instance selection keeps critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining, but few studies have dealt with their integration. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. It applies the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation. Solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select the optimal instance subset, which serves as the input data of the bagging model. The chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations to 150. The crossover rate and mutation rate were set to 0.7 and 0.1, respectively. The prediction accuracy of the model was used as the fitness function of GA: an SVM model is trained on the training data set using the selected instance subset, and its prediction accuracy on the test data set is used as the fitness value in order to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase was used as the input data of the bagging model, with SVM as the base classifier of the bagging ensemble and majority voting as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through a literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole data set was separated into three subsets: training, test, and validation. We compared the proposed model with several comparative models, including a simple individual SVM model, a simple bagging model, and an instance-selection-based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
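
The bagging step with majority voting described above can be sketched in a few lines. To keep the example self-contained, a one-feature threshold classifier (a decision stump) is swapped in for the SVM base classifier used in the paper, the GA instance-selection phase is omitted, and the eight training points and the single "ratio" feature are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a threshold t on one feature; the stump predicts 1 when x > t."""
    best_errors, best_threshold = None, None
    for t in sorted({x for x, _ in samples}):
        errors = sum((x > t) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_errors, best_threshold = errors, t
    return best_threshold


def bagging_fit(samples, n_estimators, rng):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    models = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Combine the base classifiers by majority vote."""
    votes = Counter(int(x > t) for t in models)
    return votes.most_common(1)[0][0]


# Hypothetical data: (financial ratio, label), where label 1 means bankrupt.
train = [(0.2, 0), (0.3, 0), (0.35, 0), (0.4, 0), (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
models = bagging_fit(train, n_estimators=11, rng=random.Random(42))
print(bagging_predict(models, 0.1), bagging_predict(models, 0.95))
```

An odd number of base classifiers avoids tied votes in a two-class problem. In the paper's setup, `train` would be replaced by the instance subset chosen in the GA phase.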

Enhancing of Red Tide Blooms Prediction using Ensemble Train (앙상블 학습을 이용한 적조 발생 예측의 성능향상)

  • Park, Sun;Jeong, Min-A;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.1 / pp.41-48 / 2012
  • Red tide is a natural phenomenon in which a temporary bloom of harmful algae changes the color of the sea from normal to red and causes fish and shellfish to die en masse. It also harms the coastal environment and the marine ecosystem. Red tides damage sea farming every year, and preventing red tide disasters is costly. Both the damage and the prevention cost can be minimized by predicting red tide blooms. In this paper, we propose a red tide bloom prediction method using ensemble training. The proposed method uses bagging and boosting ensemble training methods to enhance red tide prediction and forecasting. The experimental results demonstrate that the proposed method achieves better red tide prediction performance than single classifiers.

Path Loss Prediction Using an Ensemble Learning Approach

  • Beom Kwon;Eonsu Noh
    • Journal of the Korea Society of Computer and Information / v.29 no.2 / pp.1-12 / 2024
  • Path loss prediction is an important factor in wireless network design, for example in selecting the installation location of base stations in cellular networks. In the past, path loss values were measured through numerous field tests to determine the optimal installation location of a base station, which has the disadvantage of being time-consuming. To solve this problem, we propose a path loss prediction method based on machine learning (ML). In particular, an ensemble learning approach is applied to improve path loss prediction performance. Bootstrap datasets were used to obtain models with different hyperparameter configurations, and the final model was built by ensembling these models. We evaluated the proposed ensemble-based path loss prediction method and compared it with various ML-based methods using publicly available path loss datasets. The experimental results show that the proposed method outperforms the existing methods and can predict path loss values accurately.
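
One way to read "models with different hyperparameter configurations obtained from bootstrap datasets" is sketched below. It is only an assumption about the general shape of such an ensemble: the base learner (a one-dimensional k-nearest-neighbour regressor), the values of k, and the distance and path loss numbers are all hypothetical, and the abstract does not name the models the authors actually combined.

```python
import random
import statistics


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return statistics.fmean([y for _, y in nearest])


def fit_ensemble(train, ks, rng):
    """Pair each hyperparameter value k with its own bootstrap sample."""
    return [([rng.choice(train) for _ in train], k) for k in ks]


def ensemble_predict(members, x):
    """Final prediction: the mean of the members' predictions."""
    return statistics.fmean([knn_predict(sample, x, k) for sample, k in members])


# Hypothetical measurements: (distance in km, path loss in dB).
train = [(0.1, 82.0), (0.2, 90.5), (0.5, 101.0), (1.0, 111.5),
         (2.0, 120.0), (3.0, 126.5), (5.0, 133.0), (8.0, 139.5)]

members = fit_ensemble(train, ks=[1, 2, 3], rng=random.Random(7))
prediction = ensemble_predict(members, 1.5)
print(f"predicted path loss at 1.5 km: {prediction:.1f} dB")
```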

Appraisal of spatial characteristics and applicability of the predicted ensemble rainfall data (강우앙상블 예측자료의 공간적 특성 및 적용성 평가)

  • Lee, Sang-Hyeop;Seong, Yeon-Jeong;Kim, Gyeong-Tak;Jeong, Yeong-Hun
    • Journal of Korea Water Resources Association / v.53 no.11 / pp.1025-1037 / 2020
  • This study evaluated the spatial characteristics and applicability of the predicted ensemble rainfall data used for heavy rain warnings. The Limited area ENsemble prediction System (LENS) has 13 rainfall ensemble members, which makes it possible to use a probabilistic method when issuing heavy rain warnings. However, LENS data are difficult to access, so studies on the applicability of its rainfall predictions are scarce. In this study, evaluation indices were calculated by comparing both a single point value and the area-averaged value with observations, following the heavy rain warning system applied to each administrative district. In addition, the accuracy of each ensemble member was evaluated by LENS issuance time. Individual members showed uncertainty in the form of over- or under-prediction. Area-based prediction showed higher predictability than point-based prediction. The LENS data, which predict rainfall for the upcoming 72 hours, also showed good predictive performance for rainfall events that may lead to water-related disasters. In the future, the predicted rainfall data from LENS are expected to be used as basic data to prepare for floods in administrative districts or watersheds.
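
A probabilistic warning from a 13-member ensemble can be as simple as the fraction of members that exceed a threshold. The sketch below shows that calculation; the member values and the 60 mm threshold are hypothetical and are not the official warning criteria or actual LENS output.

```python
def exceedance_probability(member_forecasts_mm, threshold_mm):
    """Fraction of ensemble members forecasting at least the threshold rainfall."""
    exceeding = sum(1 for rain in member_forecasts_mm if rain >= threshold_mm)
    return exceeding / len(member_forecasts_mm)


# Hypothetical area-averaged rainfall forecasts (mm) from 13 ensemble members.
members_mm = [12, 35, 60, 64, 71, 80, 55, 90, 110, 42, 66, 75, 30]
probability = exceedance_probability(members_mm, threshold_mm=60)
print(f"{probability:.0%} of members reach the threshold")
```

The same function can be applied to point values or to area averages, which is the comparison the study carries out.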