• Title/Summary/Keyword: Vector data model


Statistical Radial Basis Function Model for Pattern Classification (패턴분류를 위한 통계적 RBF 모델)

  • Choi Jun-Hyeog;Rim Kee-Wook;Lee Jung-Hyun
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.1 / pp.1-8 / 2004
  • With the growth of the Internet and the spread of databases, it has become difficult to find the needed information in huge amounts of data. To analyze large amounts of data efficiently, this paper proposes a pattern classification method based on an effective dimension-reduction strategy that narrows the whole data set down to what users want to search for. To analyze data effectively, the paper proposes Radial Basis Function Networks based on the VC-dimension of the Support Vector Machine, a statistical learning model. Radial Basis Function Network models currently in use perform the preprocessing of the Perceptron model, whereas the proposed model performs an independent analysis on the VC-dimension and classifies each datum by assigning it a precise label. A comparison and evaluation of various models on Machine Learning Data shows that the proposed model is more efficient than the various algorithms previously used.

Forest Vertical Structure Mapping from Bi-Seasonal Sentinel-2 Images and UAV-Derived DSM Using Random Forest, Support Vector Machine, and XGBoost

  • Young-Woong Yoon;Hyung-Sup Jung
    • Korean Journal of Remote Sensing / v.40 no.2 / pp.123-139 / 2024
  • Forest vertical structure is vital for comprehending ecosystems and biodiversity, in addition to fundamental forest information. Currently, the forest vertical structure is predominantly assessed via an in-situ method, which is not only difficult to apply to inaccessible locations or large areas but also costly and requires substantial human resources. Therefore, mapping systems based on remote sensing data have been actively explored. Recently, research on analyzing and classifying images using machine learning techniques has been actively conducted and applied to map the vertical structure of forests accurately. In this study, Sentinel-2 and digital surface model images were obtained on two different dates separated by approximately one month, and the spectral index and tree height maps were generated separately. Furthermore, according to the acquisition time, the input data were separated into Cases 1 and 2, which were then combined to generate Case 3. Using these data, forest vertical structure mapping models based on random forest, support vector machine, and extreme gradient boost (XGBoost) were generated. Consequently, nine models were generated, with the XGBoost model in Case 3 performing the best, with an average precision of 0.99 and an F1 score of 0.91. We confirmed that generating a forest vertical structure mapping model utilizing bi-seasonal data and an appropriate model can result in an accuracy of 90% or higher.

A Study on Customer Segmentation Prediction Model using Support Vector Machine (Support Vector Machine을 이용한 고객이탈 예측모형에 관한 연구)

  • Seo Kwang Kyu
    • Journal of the Korea Safety Management & Science / v.7 no.1 / pp.199-210 / 2005
  • Customer segmentation prediction has attracted considerable research interest, and recent studies have shown that artificial neural network (ANN) methods achieve better performance than traditional statistical ones. However, ANN approaches have suffered from difficulties with generalization, producing models that can overfit the data. This paper applies a relatively new machine learning technique, support vector machines (SVM), to the customer segmentation prediction problem in an attempt to provide a model with better explanatory power. To evaluate the prediction accuracy of SVM, we compare its performance with logistic regression analysis and ANN. Experimental results on real data from an insurance company show that SVM outperforms both.

A Study on Word Vector Models for Representing Korean Semantic Information

  • Yang, Hejung;Lee, Young-In;Lee, Hyun-jung;Cho, Sook Whan;Koo, Myoung-Wan
    • Phonetics and Speech Sciences / v.7 no.4 / pp.41-47 / 2015
  • This paper examines whether the Global Vector model is applicable to Korean data as a universal learning algorithm. The main purpose of this study is to compare the global vector model (GloVe) with word2vec models, namely the continuous bag-of-words (CBOW) model and the skip-gram (SG) model. For this purpose, we conducted an experiment using an evaluation corpus consisting of 70 target words and 819 pairs of Korean words for word similarities and analogies, respectively. In the word similarity task, the Pearson correlation coefficients with human judgement were 0.3133 for GloVe, 0.2637 for CBOW, and 0.2177 for SG. In the word analogy task, the overall accuracy rate on semantic and syntactic relations was 67% for GloVe, 66% for CBOW, and 57% for SG.
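
The word similarity evaluation described in this abstract can be illustrated with a minimal sketch: score each word pair by the cosine similarity of its two vectors, then compute the Pearson correlation between those scores and human ratings. The embeddings, word pairs, and ratings below are made up for illustration and are not the paper's evaluation corpus; the helper names `cosine` and `pearson` are likewise assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical 3-dimensional embeddings (real models use hundreds of dimensions).
embeddings = {
    "king": [0.9, 0.1, 0.3],
    "queen": [0.8, 0.2, 0.4],
    "apple": [0.1, 0.9, 0.2],
    "pear": [0.2, 0.8, 0.1],
}

# Hypothetical human similarity ratings for word pairs (higher = more similar).
human_ratings = [
    ("king", "queen", 9.0),
    ("apple", "pear", 8.5),
    ("king", "apple", 1.5),
    ("queen", "pear", 2.0),
]

model_scores = [cosine(embeddings[a], embeddings[b]) for a, b, _ in human_ratings]
human_scores = [score for _, _, score in human_ratings]

r = pearson(model_scores, human_scores)
print(f"Pearson r between model and human scores: {r:.4f}")
```

A higher coefficient means the model's similarity ranking agrees more closely with human judgement, which is how the three figures reported in the abstract can be compared.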

A Study on Predicting Construction Cost of Educational Building Project at early stage Using Support Vector Machine Technique (서포트벡터머신을 이용한 교육시설 초기 공사비 예측에 관한 연구)

  • Shin, Jae-Min;Kim, Gwang-Hee
    • The Journal of Sustainable Design and Educational Environment Research / v.11 no.3 / pp.46-54 / 2012
  • Accurate cost estimation at an early stage of a school building project is one of the critical factors for successful completion, so various techniques have been developed to predict construction cost accurately and quickly. Among these techniques, the Support Vector Machine (SVM) has excellent generalization performance. The purpose of this study is therefore to construct a prediction model for the construction cost of educational building projects using the support vector machine technique and to verify its accuracy. The data used in this study are the costs of 217 school building projects completed from 2004 to 2007 in Gyeonggi-Do, Korea. The results show that the average error rate of the SVM prediction model was 7.48%. Using an SVM model to predict the construction cost of educational building projects is therefore a considerably effective approach at the early project stage.

Type I projection sum of squares by weighted least squares (가중최소제곱법에 의한 제1종 사영제곱합)

  • Choi, Jaesung
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.423-429 / 2014
  • This paper discusses a method for obtaining Type I sums of squares by projections under a two-way fixed-effects model when the error variances are not equal. The method of weighted least squares is used to estimate the parameters of the assumed model. The model is fitted to the data sequentially by using the model comparison technique. The vector space generated by the model matrix can be composed of orthogonal vector subspaces spanned by submatrices consisting of column vectors related to the parameters. The paper discusses how to obtain the Type I sums of squares by using projections onto these orthogonal vector subspaces.
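
As a rough illustration of the idea in this abstract, the sketch below computes sequential (Type I) sums of squares by orthogonalizing the model-matrix columns block by block under a weighted inner product, then projecting the response onto each block's new orthogonal directions. It is not the paper's procedure: the Gram-Schmidt approach, the function `type1_ss`, the toy data, and the choice of weights (taken here as inverse error variances) are all assumptions made for illustration.

```python
import math

def weighted_dot(u, v, w):
    """Inner product <u, v>_W = sum_i w_i * u_i * v_i."""
    return sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))

def type1_ss(y, blocks, w, tol=1e-10):
    """Sequential (Type I) sums of squares under weighted least squares.

    blocks: list of (name, columns) in the order the terms enter the model.
    w: positive weights, assumed here to be 1 / error variance.
    Each block's SS is the squared length of the projection of y onto the
    part of that block's column space orthogonal to all earlier blocks.
    """
    basis = []   # vectors orthonormal with respect to the weighted inner product
    result = {}
    for name, columns in blocks:
        ss = 0.0
        for col in columns:
            r = list(col)
            for q in basis:  # modified Gram-Schmidt step
                c = weighted_dot(r, q, w)
                r = [ri - c * qi for ri, qi in zip(r, q)]
            norm2 = weighted_dot(r, r, w)
            if norm2 <= tol * weighted_dot(col, col, w):
                continue  # column already lies in the span of earlier columns
            q = [ri / math.sqrt(norm2) for ri in r]
            basis.append(q)
            ss += weighted_dot(y, q, w) ** 2
        result[name] = ss
    return result

def indicator(levels, k):
    """Dummy column: 1.0 where the factor level equals k, else 0.0."""
    return [1.0 if level == k else 0.0 for level in levels]

# Toy unbalanced two-way layout (made-up numbers).
y = [3.0, 5.0, 4.0, 8.0, 6.0, 9.0]
a = [0, 0, 0, 1, 1, 1]            # levels of factor A
b = [0, 1, 1, 0, 0, 1]            # levels of factor B
w = [1.0, 1.0, 0.5, 0.5, 2.0, 2.0]  # hypothetical weights

blocks = [
    ("mean", [[1.0] * len(y)]),
    ("A", [indicator(a, 0), indicator(a, 1)]),
    ("B|A", [indicator(b, 0), indicator(b, 1)]),
]

ss = type1_ss(y, blocks, w)
for name, value in ss.items():
    print(f"SS({name}) = {value:.4f}")
```

Because the blocks are orthogonalized in order, the result depends on the fitting sequence: entering B before A would generally give different sums of squares for an unbalanced layout like this one.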

A Causality Analysis of the Hairtail Price by Distribution Channel Using a Vector Autoregressive Model (VAR 모형을 이용한 유통단계별 갈치가격의 인과성 분석)

  • Kim, Cheol-Hyun;Nam, Jong-Oh
    • The Journal of Fisheries Business Administration / v.46 no.1 / pp.93-107 / 2015
  • This study aims to analyze causalities among Hairtail prices by distribution channel using a vector autoregressive model. It applies unit-root tests to check the stationarity of the data, uses the Granger causality test to identify interactions among Hairtail prices by distribution channel, and employs the vector autoregressive model to estimate the statistical impacts among the t-2 period variables used in the model. The results are as follows. First, ADF, PP, and KPSS tests show that the log-differenced change rate of the Hairtail price by distribution channel is stationary. Second, the Granger causality test shows that the producer price of Hairtail leads the wholesale price, and the wholesale price in turn leads the consumer price. Third, the vector autoregressive model suggests that the t-2 period change rate of the Hairtail producer price has a statistically significant impact on the current-period change rates of the producer, wholesale, and consumer prices. Fourth, the impulse response analysis indicates that responses to structural shocks in each distribution channel's Hairtail price are relatively stronger within that channel than in other channels. Fifth, the forecast error variance decomposition of the Hairtail prices indicates that each price's own shocks have relatively more influence than those of other prices.

Comparison of the forecasting models with real estate price index (주택가격지수 모형의 비교연구)

  • Lim, Seong Sik
    • Journal of the Korean Data and Information Science Society / v.27 no.6 / pp.1573-1583 / 2016
  • Because housing prices are influenced by many internal and external economic variables, it is necessary to check the mutual correlations between related variables. In this paper, we use the Granger causality test to validate the interrelationships between the variables. In addition, the cointegration test indicates cointegrating relationships between the variables. We therefore carry out an analysis using a vector error correction model, which includes an error correction term. An empirical comparison of forecasting performance with ARIMA and VAR models confirms that the vector error correction model forecasts better than the other two models.

Design of Low Complexity Human Anxiety Classification Model based on Machine Learning (기계학습 기반 저 복잡도 긴장 상태 분류 모델)

  • Hong, Eunjae;Park, Hyunggon
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.9 / pp.1402-1408 / 2017
  • Recently, services for personal biometric data analysis based on real-time monitoring systems have been increasing, and many of them focus on the recognition of emotions. In this paper, we propose a model that classifies the emotion of anxiety using biometric data collected from people. The classification model is built with a support vector machine. To improve classification accuracy, we propose two data pre-processing procedures: normalization and data deletion. The proposed algorithms are implemented on the Real-time Traffic Flow Measurement structure, which consists of a data collection module, a data preprocessing module, and a classification model creation module. Our experimental results show that the proposed classification model can infer people's anxiety with an accuracy of 65.18%. With the proposed pre-processing techniques, the accuracy improves to 78.77%. We therefore conclude that the proposed classification model combined with the pre-processing procedures improves classification accuracy at lower computational complexity.

Partially linear support vector orthogonal quantile regression with measurement errors

  • Hwang, Changha
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.209-216 / 2015
  • Quantile regression models with covariate measurement errors have received a great deal of attention in both the theoretical and the applied statistical literature. Much effort has been devoted to developing effective estimation methods for such quantile regression models. In this paper we propose the partially linear support vector orthogonal quantile regression model in the presence of covariate measurement errors. We also provide a generalized approximate cross-validation method for choosing the hyperparameters and the ratios of the error variances, which affect the performance of the proposed model. The proposed model is evaluated through simulations.