• Title/Summary/Keyword: Curse of dimensionality

Comparison of the Performance of Clustering Analysis using Data Reduction Techniques to Identify Energy Use Patterns

  • Song, Kwonsik;Park, Moonseo;Lee, Hyun-Soo;Ahn, Joseph
    • International conference on construction engineering and project management / 2015.10a / pp.559-563 / 2015
  • Identifying energy use patterns in buildings offers a great opportunity for energy saving. To find what energy use patterns exist, clustering methods such as K-means and hierarchical clustering have commonly been used. For high-dimensional data such as energy use time series, data reduction should be considered to avoid the curse of dimensionality. Principal Component Analysis, Autocorrelation Function, Discrete Fourier Transform, and Discrete Wavelet Transform have been widely used to map the original data into lower-dimensional spaces. However, an open issue remains, since the performance of clustering analysis depends on data type, purpose, and application. Therefore, we need to understand which data reduction techniques are suitable for energy use management. This research aims to find the best clustering method using energy use data obtained from the Seoul National University campus. The results show that most experiments with data reduction techniques perform better. The results also help facility managers optimally control energy systems such as HVAC to reduce energy use in buildings.
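As a rough illustration of the data reduction step this abstract describes, the sketch below maps a time series to the magnitudes of its first few Discrete Fourier Transform coefficients. The function name `dft_features`, the 24-point profile, and the choice of four coefficients are assumptions made for the example; they are not taken from the paper.

```python
import cmath
import math

def dft_features(series, k):
    """Return the scaled magnitudes of the first k DFT coefficients."""
    n = len(series)
    feats = []
    for f in range(k):
        # Direct O(n) evaluation of the f-th DFT coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n)
                    for t, x in enumerate(series))
        feats.append(abs(coeff) / n)
    return feats

# Hypothetical daily load profile: a base load of 10 plus one daily cycle.
profile = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(24)]

# 24 dimensions reduced to 4; a clustering method would then run on these.
features = dft_features(profile, 4)
```

For this synthetic profile, the first feature recovers the base load and the second captures the daily cycle, while the remaining two are close to zero.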

A cross-entropy algorithm based on Quasi-Monte Carlo estimation and its application in hull form optimization

  • Liu, Xin;Zhang, Heng;Liu, Qiang;Dong, Suzhen;Xiao, Changshi
    • International Journal of Naval Architecture and Ocean Engineering / v.13 no.1 / pp.115-125 / 2021
  • Simulation-based hull form optimization is a typical HEB (high-dimensional, computationally expensive, black-box) problem. Conventional optimization algorithms easily fall prey to the "curse of dimensionality" when dealing with HEB problems. The recently proposed Cross-Entropy (CE) optimization algorithm is an advanced stochastic optimization algorithm based on a probability model, which has the potential to deal with high-dimensional optimization problems. Currently, the CE algorithm is still in the theoretical research stage and is rarely applied to actual engineering optimization. One reason is that the Monte Carlo (MC) method is used to estimate the high-dimensional integrals in the parameter update, leading to a large sample size. This paper proposes an improved CE algorithm based on quasi-Monte Carlo (QMC) estimation using a high-dimensional truncated Sobol subsequence, referred to as the QMC-CE algorithm. The optimization performance of the proposed algorithm is better than that of the original CE algorithm. With a set of identical control parameters, tests on six standard test functions and a hull form optimization problem show that the proposed algorithm not only converges faster but can also be applied to complex simulation optimization problems.
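The contrast between MC and QMC estimation of an integral can be sketched as follows. Note the substitution: this example uses a Halton sequence, which is simple to write with the standard library, and not the truncated Sobol subsequence the paper uses. The integrand, the three dimensions, and the sample size are likewise assumptions chosen so that the exact value is known.

```python
import random

def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i in the given base."""
    inv, frac = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * frac
        i //= base
        frac /= base
    return inv

PRIMES = [2, 3, 5]  # one coprime base per dimension

def halton_point(i):
    """i-th point of the Halton sequence in the unit cube."""
    return [radical_inverse(i, b) for b in PRIMES]

def integrand(x):
    """Product of coordinates; its integral over the unit cube is 0.5**d."""
    p = 1.0
    for v in x:
        p *= v
    return p

def qmc_estimate(n):
    # Start at index 1 to skip the all-zero point.
    return sum(integrand(halton_point(i)) for i in range(1, n + 1)) / n

def mc_estimate(n, seed=0):
    rng = random.Random(seed)
    return sum(integrand([rng.random() for _ in PRIMES]) for _ in range(n)) / n

exact = 0.5 ** len(PRIMES)
qmc_error = abs(qmc_estimate(4096) - exact)
mc_error = abs(mc_estimate(4096) - exact)
```

Comparing `qmc_error` and `mc_error` for growing sample sizes gives a feel for why a low-discrepancy sequence can reduce the sample size needed in the parameter update.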

Polyclass in Data Mining (데이터 마이닝에서의 폴리클라스)

  • 구자용;박헌진;최대우
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.489-503 / 2000
  • Data mining means data analysis and model selection using various types of data in order to explore useful information and knowledge for making decisions. Examples of data mining include scoring for credit analysis of a new customer and scoring for churn management, where customers with high scores are given special attention. In this paper, scoring is interpreted as a process of modeling the conditional probability, and the polyclass scoring method is described. German credit data, data from a PC communication company, and data from a mobile communication company are used to compare the performance of the polyclass scoring method with that of a scoring method based on a tree model.

Indexing and Matching Scheme for Content-based Image Retrieval based on Extendible Hash (효과적인 이미지 검색을 위한 연장 해쉬(Extendible hash) 기반 인덱싱 및 검색 기법)

  • Tak, Yoon-Sik;Hwang, Een-Jun
    • Journal of IKEEE / v.14 no.4 / pp.339-345 / 2010
  • So far, much research has been done on indexing high-dimensional feature values for fast content-based image retrieval. Still, many existing indexing schemes suffer from performance degradation due to the curse of dimensionality. As an alternative, heuristic algorithms have been proposed that compute the result with 'high probability' at the cost of accuracy. In this paper, we propose a new extendible hash-based indexing scheme for high-dimensional feature values. Our indexing scheme provides several advantages over traditional high-dimensional index structures in terms of search performance and accuracy preservation. Through extensive experiments, we show that our proposed indexing scheme achieves outstanding performance.

Oil Price Forecasting Based on Machine Learning Techniques (기계학습기법에 기반한 국제 유가 예측 모델)

  • Park, Kang-Hee;Hou, Tianya;Shin, Hyun-Jung
    • Journal of Korean Institute of Industrial Engineers / v.37 no.1 / pp.64-73 / 2011
  • Oil price prediction is an important issue for government regulators and related industries. When time series techniques are employed for prediction, however, it becomes difficult and challenging, since the behavior of the oil price series is dominated by quantitatively unexplained irregular external factors, e.g., supply- or demand-side shocks, political conflicts specific to events in the Middle East, and direct or indirect influences from other global economic indices. Identifying and quantifying the relationship between oil price and those external factors may provide more relevant prediction than attempting to uncover the underlying structure of the series itself. Technically, this implies that the prediction is to be based on vector data on the degrees of the relationship rather than on the series data. This paper proposes a novel method for time series prediction using Semi-Supervised Learning, which was originally designed only for vector-type data. First, several time series of oil prices and other economic indices are transformed into multidimensional vectors by various types of technical indicators and diverse combinations of the indicator-specific hyper-parameters. Then, to avoid the curse of dimensionality and redundancy among the dimensions, the well-known feature extraction techniques PCA and NLPCA are employed. With the extracted features, a timepoint-specific similarity matrix of oil prices and other economic indices is built, and finally, Semi-Supervised Learning generates a one-timepoint-ahead prediction. The series of West Texas Intermediate (WTI) crude oil prices was used to verify the proposed method, and the experiments showed promising results: an average AUC of 0.86.

Feature Selection of Fuzzy Pattern Classifier by using Fuzzy Mapping (퍼지 매핑을 이용한 퍼지 패턴 분류기의 Feature Selection)

  • Roh, Seok-Beom;Kim, Yong Soo;Ahn, Tae-Chon
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.646-650 / 2014
  • In this paper, we propose a new feature selection method to avoid the deterioration of pattern classification performance that results from the curse of dimensionality. The proposed method is based on the Fuzzy C-Means clustering algorithm, which analyzes the data points to divide them into several clusters, and on the concept of a function with fuzzy numbers. For a function whose independent variables are fuzzy numbers and whose dependent variable is a class label, each fuzzy number should be related to only one class label. Therefore, a good feature is an independent variable of a function with fuzzy numbers. Under this assumption, we calculate the goodness of each feature for the pattern classification problem. Finally, to evaluate the classification ability of the proposed pattern classifier, machine learning data sets are used.
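For readers unfamiliar with Fuzzy C-Means, the sketch below shows only the standard membership update that assigns a point a degree of membership in each cluster. It does not reproduce the paper's feature selection criterion. The function name, the fuzzifier default `m=2.0`, and the two example centers are assumptions.

```python
def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-Means memberships of one point in each cluster.

    m > 1 is the fuzzifier; larger values give softer memberships.
    """
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists)
            for d_i in dists]

# Hypothetical two-cluster example in two dimensions.
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships((1.0, 0.0), centers)
```

The memberships of a point always sum to one, and the point at distance 1 from the first center and 3 from the second is assigned mostly to the first.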

Control of pH Neutralization Process using Simulation Based Dynamic Programming (ICCAS 2003)

  • Kim, Dong-Kyu;Yang, Dae-Ryook
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2003.10a / pp.2617-2622 / 2003
  • The pH neutralization process has long been taken as a representative benchmark problem of nonlinear chemical process control due to its nonlinearity and time-varying nature. General nonlinear processes are difficult to control with a linear model-based control method, so nonlinear controls must be considered. Among the numerous approaches suggested, the most rigorous is dynamic optimization. However, as the size of the problem grows, the dynamic programming approach suffers from the curse of dimensionality. To avoid this problem, the Neuro-Dynamic Programming (NDP) approach was proposed by Bertsekas and Tsitsiklis (1996). The NDP approach utilizes all the data collected to generate an approximation of the optimal cost-to-go function, which is used to find the optimal input movement in real-time control. The approximation could be any type of function, such as polynomials or neural networks. In this study, an algorithm using the NDP approach was applied to a pH neutralization process to investigate the feasibility of the NDP algorithm and to deepen the understanding of its basic characteristics. As the global approximator, a neural network, which requires training, and the k-nearest neighbor method, which requires querying instead of training, are investigated. The global approximator requires an optimal control strategy. If the optimal control strategy is not available, a suboptimal control strategy can be used, although laborious Bellman iterations are then necessary. For the pH neutralization process, it is rather easy to devise an optimal control strategy. Thus, we used an optimal control strategy and did not perform the Bellman iteration. The effects of constraints on control moves are also studied. In the simulations, the NDP method outperforms conventional PID control.
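The idea of querying a k-nearest neighbor approximator for the cost-to-go and then choosing the input that minimizes stage cost plus approximate cost-to-go can be sketched as below. Everything concrete here is hypothetical: the scalar state, the toy linear dynamics (not a pH model), the quadratic costs, and the stored samples are assumptions for illustration only.

```python
def knn_cost_to_go(state, samples, k=3):
    """Average the stored cost-to-go of the k samples nearest to `state`.

    `samples` is a list of (state, cost_to_go) pairs; no training is needed,
    only a query over the stored data.
    """
    nearest = sorted(samples, key=lambda s: abs(s[0] - state))[:k]
    return sum(cost for _, cost in nearest) / len(nearest)

def greedy_input(state, candidate_inputs, model, stage_cost, samples, k=3):
    """Pick the input minimizing stage cost plus approximate cost-to-go."""
    def total(u):
        nxt = model(state, u)
        return stage_cost(nxt, u) + knn_cost_to_go(nxt, samples, k)
    return min(candidate_inputs, key=total)

# Hypothetical data: states on a grid with a quadratic cost-to-go.
samples = [(x / 2.0, (x / 2.0) ** 2) for x in range(-8, 9)]

def model(x, u):
    """Toy linear dynamics standing in for a process model."""
    return 0.5 * x + u

def stage_cost(x, u):
    """Penalize deviation from the setpoint and control effort."""
    return x * x + 0.1 * u * u

best = greedy_input(2.0, [-1.0, -0.5, 0.0, 0.5, 1.0],
                    model, stage_cost, samples)
```

Restricting `candidate_inputs` is one simple way to express constraints on control moves in such a sketch.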

MCMC Particle Filter based Multiple Preceeding Vehicle Tracking System for Intelligent Vehicle (MCMC 기반 파티클 필터를 이용한 지능형 자동차의 다수 전방 차량 추적 시스템)

  • Choi, Baehoon;An, Jhonghyun;Cho, Minho;Kim, Euntai
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.2 / pp.186-190 / 2015
  • An intelligent vehicle plans its motion and navigates based on perception of the surrounding environment. Hence, precise environment recognition is an essential part of a self-driving vehicle. Many vulnerable road users (e.g., vehicles, pedestrians) exist in the driving environment, and the vehicle must perceive all the dynamic obstacles accurately for safety. In this paper, we propose a multiple vehicle tracking algorithm using microwave radar. Our proposed system includes several special features. First, an exceptional radar measurement model for vehicles, concentrated on the corner, is described by a mixture density network (MDN) and applied to particle filter weighting. Also, to overcome the curse of dimensionality of the particle filter and estimate the time-varying number of multi-target states, reversible jump Markov chain Monte Carlo (RJMCMC) is used in the sampling step of the proposed algorithm. The robustness of the proposed algorithm is demonstrated through several computer simulations.

Reinforcement Learning-based Dynamic Weapon Assignment to Multi-Caliber Long-Range Artillery Attacks (다종 장사정포 공격에 대한 강화학습 기반의 동적 무기할당)

  • Hyeonho Kim;Jung Hun Kim;Joohoe Kong;Ji Hoon Kyung
    • Journal of Korean Society of Industrial and Systems Engineering / v.45 no.4 / pp.42-52 / 2022
  • North Korea continues to upgrade and display its long-range rocket launchers to emphasize its military strength. Recently, the Republic of Korea kicked off the development of an anti-artillery interception system similar to Israel's "Iron Dome", designed to protect against North Korea's arsenal of long-range rockets. The system may not work smoothly without a function that assigns interceptors to incoming artillery rockets of various calibers. We view the assignment task as a dynamic weapon target assignment (DWTA) problem. DWTA is a multistage decision process in which the decision in one stage affects the decision processes and their results in subsequent stages. We represent the DWTA problem as a Markov decision process (MDP). The distance from Seoul to North Korea's multiple rocket launchers positioned near the border limits the processing time of the model solver to only a few seconds. It is impossible to compute the exact optimal solution within the allowed time interval due to the curse of dimensionality inherent in the MDP model of a practical DWTA problem. We apply two reinforcement learning-based algorithms to obtain an approximate solution of the MDP model within the time limit. To check the quality of the approximate solution, we adopt the Shoot-Shoot-Look (SSL) policy as a baseline. Simulation results showed that both algorithms provide better solutions than the baseline strategy.

Convergence performance comparison using combination of ML-SVM, PCA, VBM and GMM for detection of AD (알츠하이머 병의 검출을 위한 ML-SVM, PCA, VBM, GMM을 결합한 융합적 성능 비교)

  • Alam, Saurar;Kwon, Goo-Rak
    • Journal of the Korea Convergence Society / v.7 no.4 / pp.1-7 / 2016
  • Structural MRI (sMRI) is used to extract morphometric features after Grey Matter (GM), White Matter (WM), and Cerebro-spinal Fluid (CSF) segmentation for several univariate and multivariate methods. A new approach is applied for the diagnosis of very mild to mild AD. We propose a method for classifying Alzheimer's disease patients from normal controls by combining morphometric features and Gaussian Mixture Model parameters along with the MMSE (Mini Mental State Examination) score. The combined features are fed into a Multi-kernel SVM classifier after the curse of dimensionality is mitigated using principal component analysis. The experimental results of the proposed diagnosis method yield up to 96% stratification accuracy with the Multi-kernel SVM, along with high sensitivity and specificity above 90%.
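Principal component analysis, used here and in several of the abstracts above to reduce dimensionality before classification or clustering, can be sketched with the standard library alone. The example finds only the first principal component by power iteration on the sample covariance matrix; the function names, iteration count, and five sample rows are assumptions, and a real pipeline would normally rely on a numerical library.

```python
def first_principal_component(rows, iters=200):
    """Return (feature means, unit vector of the first principal component)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the dominant eigenvector, provided the
    # start vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def project(row, means, v):
    """Score of one row along the component (a 1-D reduced feature)."""
    return sum((x - m) * c for x, m, c in zip(row, means, v))

# Hypothetical two-feature data with strong positive correlation.
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
means, pc1 = first_principal_component(rows)
scores = [project(r, means, pc1) for r in rows]
```

Each row is reduced from two features to a single score; a classifier would then be trained on the scores instead of the original features.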