• Title/Summary/Keyword: sample selection model


A Study on the Optimal Probe Path Generation for Sculptured Surface Inspection Using the Coordinate Measuring Machine (3차원 측정기를 이용한 자유곡면 측정시 최적의 경로 결정에 관한 연구)

  • Cho, Myung-Wo; Yi, Seung-Jong; Kim, Moon-Ki
    • Journal of the Korean Society for Precision Engineering, v.12 no.10, pp.121-129, 1995
  • The objective of this research is to develop an effective inspection planning strategy for sculptured surfaces using a 3-dimensional Coordinate Measuring Machine (CMM). First, the CAD/CAM database is generated using the Bezier surface patch method and a variable cutter step size approach for the design and machining of the workpiece model. Then, optimum measuring point locations are determined based on mean curvature analysis to obtain more effective inspection results for the given number of samples. An optimal probe sequence generation method is proposed by implementing the Traveling Salesperson Problem (TSP) algorithm, and new guide point selection methods are suggested based on the concept of a variable distance between the first and second guide points. Finally, a simulation study and experimental work show the effectiveness of the proposed strategy.
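
The probe sequencing step in this abstract maps onto a Traveling Salesperson formulation. The sketch below is only an illustration of that idea, not the paper's algorithm: it orders a handful of hypothetical measuring point coordinates with a greedy nearest-neighbour heuristic in Python.

```python
# A minimal nearest-neighbour TSP heuristic over measuring point locations.
# The point coordinates and the start index are illustrative assumptions.
import numpy as np

def probe_sequence(points, start=0):
    """Return a greedy nearest-neighbour visiting order over the points."""
    points = np.asarray(points, dtype=float)
    unvisited = set(range(len(points)))
    order = [start]
    unvisited.remove(start)
    while unvisited:
        last = points[order[-1]]
        # pick the closest remaining measuring point
        nxt = min(unvisited, key=lambda i: np.linalg.norm(points[i] - last))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

# Example: a few measuring points sampled on a surface patch
pts = [(0, 0, 1.0), (2, 1, 1.2), (0.5, 2, 0.9), (3, 3, 1.5)]
print(probe_sequence(pts))  # -> [0, 2, 1, 3]
```

A full TSP solver or a 2-opt refinement could replace the greedy pass; this only shows the sequencing idea.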


Automatic Augmentation Technique of an Autoencoder-based Numerical Training Data (오토인코더 기반 수치형 학습데이터의 자동 증강 기법)

  • Jeong, Ju-Eun; Kim, Han-Joon; Chun, Jong-Hoon
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.22 no.5, pp.75-86, 2022
  • This study aims to solve the problem of class imbalance in numerical data by using a deep learning-based Variational AutoEncoder and to improve the performance of the learning model by augmenting the training data. We propose 'D-VAE' to artificially increase the number of records in a given table of data. The main features of the proposed technique are discretization and feature selection in the preprocessing stage to optimize the data. In the discretization step, K-means clustering is applied to group the values, which are then converted into one-hot vectors by one-hot encoding. Subsequently, for memory efficiency, sample data are generated with the Variational AutoEncoder using only the features that RFECV, among the feature selection techniques, identifies as helpful for prediction. To verify the performance of the proposed model, we demonstrate its validity by conducting experiments across data augmentation ratios.
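
The preprocessing chain described above (K-means discretization, one-hot encoding, RFECV feature selection) can be sketched with scikit-learn as follows. The toy table, bin count, and RFECV estimator are assumptions for illustration, and the D-VAE generator itself is not reproduced here.

```python
# Sketch of discretization -> one-hot encoding -> RFECV feature selection.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import OneHotEncoder
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

def discretize_columns(X, n_bins=5, random_state=0):
    """Replace each numeric column by its K-means cluster label."""
    return np.column_stack([
        KMeans(n_clusters=n_bins, n_init=10, random_state=random_state)
        .fit_predict(X[:, [j]])
        for j in range(X.shape[1])
    ])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))               # toy numeric table
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # toy binary target

X_disc = discretize_columns(X)                              # K-means bins
X_onehot = OneHotEncoder().fit_transform(X_disc).toarray()  # one-hot vectors

# RFECV keeps only the columns that help prediction before training the VAE
selector = RFECV(LogisticRegression(max_iter=1000), cv=5)
X_selected = selector.fit_transform(X_onehot, y)
print(X_selected.shape)
```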

Application of GIS to Select Viewpoints for Landscape Analysis (경관분석 조망점 선정을 위한 GIS의 적용방안)

  • Kang, Tae-Hyun; Leem, Youn-Taik; Lee, Sang-Ho
    • Journal of the Korean Association of Geographic Information Studies, v.16 no.2, pp.101-113, 2013
  • Growing concern about environmental quality has made landscape analysis more important than ever before. For landscape analysis, the selection of viewpoints is one of the most important stages. Because of its subjectivity, the conventional viewpoint selection method often misses some important viewpoints. The purpose of this study is to develop a viewpoint selection method for landscape analysis using GIS data and techniques. During the viewpoint selection process, spatial and attribute data from several GIS systems were used. Query and overlay methods were mainly adopted in the analysis to find meaningful viewpoints. A 3D simulation analysis on a DEM (Digital Elevation Model) was applied to every selected viewpoint to examine whether the view target is screened out or not. An application study at a sample site showed that some good, unscreened viewpoints had previously been omitted. It also demonstrated the possibility of reducing the time and cost of the viewpoint selection process in landscape analysis. To improve applicability, the GIS data analysis process should be refined and further modules, such as an automatic screening analysis system for selected viewpoints, should be developed.
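
The screening test on the DEM can be sketched as a simple line-of-sight check: sample the terrain along the sight line from a candidate viewpoint to the view target and see whether any cell rises above that line. The grid, observer height, and sampling density below are illustrative assumptions, not the study's GIS workflow.

```python
# Minimal line-of-sight check on a DEM stored as a 2-D elevation array.
import numpy as np

def is_target_visible(dem, viewpoint, target, observer_height=1.6, samples=100):
    """True if no DEM cell along the sight line rises above the line."""
    (r0, c0), (r1, c1) = viewpoint, target
    z0 = dem[r0, c0] + observer_height
    z1 = dem[r1, c1]
    for t in np.linspace(0.0, 1.0, samples)[1:-1]:
        r = int(round(r0 + t * (r1 - r0)))
        c = int(round(c0 + t * (c1 - c0)))
        line_z = z0 + t * (z1 - z0)       # elevation of the sight line here
        if dem[r, c] > line_z:            # terrain screens out the target
            return False
    return True

dem = np.zeros((50, 50))
dem[20:25, 20:25] = 30.0                  # a hill between the two points
print(is_target_visible(dem, (0, 0), (49, 49)))   # False: view screened out
print(is_target_visible(dem, (0, 49), (49, 49)))  # True: clear sight line
```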

Bivariate long range dependent time series forecasting using deep learning (딥러닝을 이용한 이변량 장기종속시계열 예측)

  • Kim, Jiyoung; Baek, Changryong
    • The Korean Journal of Applied Statistics, v.32 no.1, pp.69-81, 2019
  • We consider bivariate long range dependent (LRD) time series forecasting using a deep learning method. A long short-term memory (LSTM) network, well suited to time series data, is applied to forecast the bivariate time series; in addition, we compare the forecasting performance with bivariate fractional autoregressive integrated moving average (FARIMA) models. Out-of-sample forecasting errors are compared with various performance measures for functional MRI (fMRI) data and daily realized volatility data. The results show only a subtle difference between the predicted values of the FIVARMA and VARFIMA models. LSTM is computationally demanding due to hyper-parameter selection, but it is more stable, and its forecasting performance is competitive with that of the parametric long range dependent time series models.
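
A one-step-ahead bivariate forecast with an LSTM can be sketched as below, assuming TensorFlow/Keras; the synthetic series, window length, and layer sizes are illustrative choices rather than the paper's settings.

```python
# Windowed one-step-ahead forecasting of a bivariate series with an LSTM.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, window=20):
    """Slice a (T, 2) series into (window, 2) inputs and next-step targets."""
    X, y = [], []
    for t in range(len(series) - window):
        X.append(series[t:t + window])
        y.append(series[t + window])
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=(1000, 2)), axis=0)  # toy persistent series
X, y = make_windows(series)

model = Sequential([
    LSTM(32, input_shape=(X.shape[1], 2)),
    Dense(2),                              # predict both components jointly
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:800], y[:800], epochs=5, verbose=0)
print(model.evaluate(X[800:], y[800:], verbose=0))  # out-of-sample MSE
```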

Optimization of Support Vector Machines for Financial Forecasting (재무예측을 위한 Support Vector Machine의 최적화)

  • Kim, Kyoung-Jae; Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems, v.17 no.4, pp.241-254, 2011
  • Financial time-series forecasting is one of the most important issues because it is essential for the risk management of financial institutions. Therefore, researchers have tried to forecast financial time series using various data mining techniques such as regression, artificial neural networks, decision trees, k-nearest neighbor, etc. Recently, support vector machines (SVMs) have been popularly applied in this research area because they do not require huge training data and have a low possibility of overfitting. However, a user must determine several design factors by heuristics in order to use an SVM. For example, the selection of an appropriate kernel function and its parameters, as well as proper feature subset selection, are major design factors of the SVM. Beyond these factors, proper selection of an instance subset may also improve the forecasting performance of the SVM by eliminating irrelevant and distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection tries to choose proper instance subsets from the original training data. It may be considered a method of knowledge refinement that maintains the instance base. This study proposes a novel instance selection algorithm for SVMs. The proposed technique uses a genetic algorithm (GA) to optimize the instance selection process and the parameters simultaneously. We call this model ISVM (SVM with instance selection). Experiments on stock market data are implemented using ISVM. In this study, the GA searches for optimal or near-optimal values of the kernel parameters and the relevant instances for the SVM. The GA chromosome therefore encodes two sets of parameters: the codes for the kernel parameters and for instance selection. For the controlling parameters of the GA search, the population size is set at 50 organisms, the crossover rate at 0.7, and the mutation rate at 0.1. As the stopping condition, 50 generations are permitted. The application data used in this study consist of technical indicators and the direction of change in the daily Korea stock price index (KOSPI). The total number of samples is 2,218 trading days. We separate the whole data set into three subsets: training, test, and hold-out data sets, containing 1,056, 581, and 581 observations, respectively. This study compares ISVM with several comparative models including logistic regression (Logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), conventional SVM (SVM), and SVM with optimized parameters (PSVM). In particular, PSVM uses kernel parameters optimized by the genetic algorithm. The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data. For ISVM, only 556 of the 1,056 original training instances are used to produce this result. In addition, the two-sample test for proportions is used to examine whether ISVM significantly outperforms the other comparative models. The results indicate that ISVM outperforms ANN and 1-NN at the 1% statistical significance level, and performs better than Logit, SVM, and PSVM at the 5% statistical significance level.
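
The core idea of ISVM, evolving kernel parameters together with a 0/1 mask over training instances, can be sketched roughly as follows. This simplified GA uses only truncation selection and mutation (no crossover), scores each chromosome by validation accuracy, and runs on toy data; it is not the paper's configuration.

```python
# Rough GA sketch: chromosome = (log10 C, log10 gamma, instance mask).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

def random_chromosome():
    return (rng.uniform(-1, 3), rng.uniform(-4, 1), rng.random(len(X_tr)) > 0.5)

def fitness(chrom):
    logC, logg, mask = chrom
    if mask.sum() < 10:                    # degenerate instance selection
        return 0.0
    clf = SVC(C=10 ** logC, gamma=10 ** logg)
    clf.fit(X_tr[mask], y_tr[mask])        # train only on selected instances
    return clf.score(X_val, y_val)

def mutate(chrom, rate=0.1):
    logC, logg, mask = chrom
    mask = mask.copy()
    flips = rng.random(len(mask)) < rate   # flip ~10% of the instance bits
    mask[flips] = ~mask[flips]
    return (logC + rng.normal(0, 0.2), logg + rng.normal(0, 0.2), mask)

population = [random_chromosome() for _ in range(20)]
for generation in range(15):
    parents = sorted(population, key=fitness, reverse=True)[:10]
    population = parents + [mutate(p) for p in parents]

best = max(population, key=fitness)
print("validation accuracy:", fitness(best), "instances kept:", int(best[2].sum()))
```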

The Characteristics and Biomass Distribution in Crown of Larix olgensis in Northeastern China

  • Chen, Dongsheng; Li, Fengri
    • Journal of Korean Society of Forest Science, v.99 no.2, pp.204-212, 2010
  • This study was performed in 22 unthinned Larix olgensis plantations in northeastern China. Data were collected on 95 sample trees from different canopy positions, with diameters at breast height ($d_{1.3}$) ranging from 5.7 cm to 40.2 cm. Individual tree models for predicting the vertical distribution of live crown, branch, and needle biomass were built. Our study showed that the crown, branch, and needle biomass distributions were greatest at about 60% of the crown length, results that parallel previous crown studies. The cumulative relative biomass of live crown, branch, and needle was fitted by a sigmoid curve, and the fits were quite good. Meanwhile, we developed crown ratio and crown width models. Tree height was the most important predictor in the crown ratio model. The competition factors ccf and bas, which reflect the effect of suppression on a tree, entered negatively and reduced the crown ratio estimates. The height-diameter ratio was also a significant predictor: the higher the height-diameter ratio, the higher the crown ratio. Diameter at breast height is the strongest predictor in the crown width model. The models can be used for the planning of harvesting operations, for the selection of feasible harvesting methods, and for the estimation of nutrient removals under different harvesting practices.
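
The cumulative relative biomass curve described above can be fitted with a sigmoid in a few lines; the logistic form and the sample data points below are illustrative assumptions, not the paper's measurements.

```python
# Fit a logistic curve to cumulative relative biomass vs. relative crown depth.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, a, b):
    """Cumulative relative biomass at relative crown depth x."""
    return 1.0 / (1.0 + np.exp(-a * (x - b)))

# Hypothetical observations on [0, 1]: depth into the crown vs. cumulative
# share of branch biomass.
depth = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
cum_biomass = np.array([0.02, 0.06, 0.15, 0.30, 0.48, 0.66, 0.80, 0.90, 0.96, 0.99])

params, _ = curve_fit(logistic, depth, cum_biomass, p0=(8.0, 0.5))
print("slope a = %.2f, inflection b = %.2f" % tuple(params))
```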

Using the corrected Akaike's information criterion for model selection (모형 선택에서의 수정된 AIC 사용에 대하여)

  • Song, Eunjung; Won, Sungho; Lee, Woojoo
    • The Korean Journal of Applied Statistics, v.30 no.1, pp.119-133, 2017
  • The corrected Akaike's information criterion (AICc) is known to have better finite sample properties. However, Akaike's information criterion (AIC) is still widely used to select an optimal prediction model among several candidate models, due to a lack of research on the benefits obtained by using AICc. In this paper, we compare the performance of AIC and AICc through numerical simulations and confirm the advantage of using AICc. In addition, we also consider the performance of the quasi Akaike's information criterion (QAIC) and the corrected quasi Akaike's information criterion (QAICc) for binomial and Poisson data under overdispersion.
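
The small-sample correction that distinguishes AICc from AIC is short enough to show directly. The sketch below uses the usual Gaussian linear model convention AIC = n log(RSS/n) + 2k and AICc = AIC + 2k(k+1)/(n-k-1); the simulated data and parameter counting are illustrative assumptions.

```python
# Compare AIC and AICc for nested Gaussian linear models on a small sample.
import numpy as np

def aic_aicc(rss, n, k):
    """AIC and AICc for a Gaussian model with k parameters and residual sum rss."""
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return aic, aicc

rng = np.random.default_rng(0)
n = 30                                        # small sample: the correction matters
x = rng.normal(size=(n, 5))
y = 1.5 * x[:, 0] + rng.normal(size=n)

for p in range(1, 6):                         # candidate models with p regressors
    X = np.column_stack([np.ones(n), x[:, :p]])
    beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(res[0]) if res.size else float(np.sum((y - X @ beta) ** 2))
    aic, aicc = aic_aicc(rss, n, k=p + 2)     # regressors + intercept + error variance
    print(f"p={p}: AIC={aic:.2f}, AICc={aicc:.2f}")
```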

Visual Tracking Using Improved Multiple Instance Learning with Co-training Framework for Moving Robot

  • Zhou, Zhiyu; Wang, Junjie; Wang, Yaming; Zhu, Zefei; Du, Jiayou; Liu, Xiangqi; Quan, Jiaxin
    • KSII Transactions on Internet and Information Systems (TIIS), v.12 no.11, pp.5496-5521, 2018
  • Object detection and tracking is a basic capability that mobile robots need in order to achieve natural human-robot interaction. In this paper, an object tracking system for a mobile robot is designed and validated using an improved multiple instance learning algorithm. First, the improved multiple instance learning algorithm significantly reduces model drift. Secondly, in order to improve the capability of the classifiers, an active sample selection strategy is proposed that optimizes a bag Fisher information function instead of the bag likelihood function, dynamically choosing the most discriminative samples for classifier training. Furthermore, we integrate a co-training criterion into the algorithm to update the appearance model accurately and avoid error accumulation. Finally, we evaluate our system on challenging sequences and in an indoor laboratory environment. The experimental results demonstrate that the proposed methods can stably and robustly track a moving object.
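
As a loose illustration of Fisher-information-driven sample selection (simplified here from the paper's bag-level criterion to single instances): for a logistic classifier the per-sample Fisher information scales with p(1 - p), so candidates nearest the decision boundary are the most informative. The data and classifier below are toy assumptions.

```python
# Pick the most informative unlabeled samples by a Fisher-information proxy.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(40, 2)) + 1.0
X_labeled[20:] -= 2.0                        # shift the second class
y_labeled = np.r_[np.ones(20), np.zeros(20)]
X_candidates = rng.normal(size=(200, 2))     # unlabeled candidate samples

clf = LogisticRegression().fit(X_labeled, y_labeled)
p = clf.predict_proba(X_candidates)[:, 1]
fisher_score = p * (1 - p)                   # largest near the decision boundary

top_k = np.argsort(fisher_score)[-10:]       # most discriminative candidates
print("selected candidate indices:", top_k)
```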

Selection of Optimal Models for Predicting the Distribution of Invasive Alien Plants Species (IAPS) in Forest Genetic Resource Reserves (산림생태계 보호구역에서 외래식물 분포 예측을 위한 최적 모형의 선발)

  • Lim, Chi-hong; Jung, Song-hie; Jung, Su-young; Kim, Nam-shin; Cho, Yong-chan
    • Korean Journal of Environment and Ecology, v.34 no.6, pp.589-600, 2020
  • Effective conservation and management of protected areas require monitoring the settlement of invasive alien species and reducing their dispersion capacity. We simulated the potential distribution of invasive alien plant species (IAPS) using three representative species distribution models (Bioclim, GLM, and MaxEnt) based on the IAPS distribution in the forest genetic resource reserve (2,274 ha) in Uljin-gun, Korea. We then selected the realistic and suitable species distribution model that reflects the local region and its ecological management characteristics based on the simulation results. The simulations predicted that IAPS tend to be distributed along linear landscape elements, such as roads, and in some forest harvested areas. A statistical comparison of the predictions and accuracy of each model tested in this study showed that the GLM and MaxEnt models generally had higher performance and accuracy than the Bioclim model. The Bioclim model calculated the largest potential distribution area, followed by GLM and MaxEnt in that order. A phenomenological review of the simulation results showed that sample size affected the GLM and Bioclim models more strongly, while the MaxEnt model was the most consistent regardless of sample size. Overall, the optimal model for predicting the distribution of IAPS among the three was the MaxEnt model. The model selection approach based on detailed flora distribution data presented in this study is expected to be useful for efficiently managing conservation areas and for identifying a realistic and precise species distribution model that reflects local characteristics.
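
Of the three models compared above, the GLM variant is the simplest to sketch: a logistic regression of presence/absence on environmental predictors whose fitted probabilities act as a habitat suitability map. The predictors and the synthetic presence records below are illustrative assumptions.

```python
# Logistic-regression species distribution sketch on synthetic predictors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
env = np.column_stack([
    rng.uniform(0, 1, n),        # e.g. scaled distance to the nearest road
    rng.uniform(0, 1500, n),     # e.g. elevation in metres
])
# Toy truth: invasive plants favour sites near roads and at low elevation
p_true = 1 / (1 + np.exp(-(3 - 6 * env[:, 0] - 0.002 * env[:, 1])))
presence = (rng.random(n) < p_true).astype(int)

glm = LogisticRegression(max_iter=1000).fit(env, presence)
suitability = glm.predict_proba(env)[:, 1]        # habitat suitability score
print(f"share of sites predicted suitable: {(suitability > 0.5).mean():.2f}")
```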

Exploratory Case Study for Key Successful Factors of Product Service System (Product-Service System(PSS) 성공과 실패요인에 관한 탐색적 사례 연구)

  • Park, A-Rum; Jin, Dong-Su; Lee, Kyoung-Jun
    • Journal of Intelligence and Information Systems, v.17 no.4, pp.255-277, 2011
  • A Product Service System (PSS), which is an integrated combination of product and service, provides new value to customers and makes companies sustainable as well. The objective of this paper is to derive the Critical Success Factors (CSF) of PSS through a multiple case study. First, we review the various concepts and types in the PSS and platform business literature currently available on this topic. Second, after investigating various cases with the characteristics of PSS and platform business, we select four cases: Apple's iPod, Amazon's Kindle, Microsoft's Zune, and Sony's e-book reader. The four cases are then categorized as successful or failed according to the criteria of case selection and PSS classification. We consider two methodologies for case selection: the 'Strategies for the Selection of Samples and Cases' proposed by Bent (2006) and the seven case selection procedures proposed by Jason and John (2008). For case selection, 'stratified sample and paradigmatic cases' is adopted as one of several options for sampling. We then consider the seven case selection procedures of 'typical', 'diverse', 'extreme', 'deviant', 'influential', 'most-similar', and 'most-different', of which only three, 'diverse', 'most-similar', and 'most-different', are applied for the case selection. For PSS classification, the eight PSS types suggested by Tukker (2004), namely 'product related', 'advice and consultancy', 'product lease', 'product renting/sharing', 'product pooling', 'activity management', 'pay per service unit', and 'functional result', are utilized. We categorize the four selected cases as a product-oriented group because the cases not only sell a product but also offer the service needed during the use phase of the product. We then analyze the four cases using the cross-case pattern that Eisenhardt (1991) suggested. Eisenhardt (1991) argued that three processes are required to avoid reaching premature or even false conclusions. The first step includes selecting categories of dimensions and finding within-group similarities coupled with intergroup differences. In the second process, pairs of cases are selected and listed; this step forces researchers to find the subtle similarities and differences between cases. The third process is to divide the data by data source. The result of the cross-case pattern analysis indicates that the similarities of the iPod and Kindle as successful cases are a convenient user interface, a successful platform strategy, and rich contents. The difference between the successful cases is that, whereas the iPod has been recognized as a culture code, the Kindle has implemented a low price as its main strategy. Meanwhile, the similarity of the Zune and the PRS series as failed cases is a lack of sufficient applications and contents. The difference between the failed cases is that, whereas the Zune adopted an undifferentiated strategy, the PRS series pursued a high-price strategy. From the analysis of the cases, we generate three hypotheses. The first hypothesis assumes that a successful PSS requires a convenient user interface. The second hypothesis assumes that a successful PSS requires a reciprocal (win/win) business model. The third hypothesis assumes that a successful PSS requires sufficient quantities of applications and contents. To verify the hypotheses, we use a cross-matching (or pattern matching) methodology. The methodology matches the three key words of the hypotheses (user interface, reciprocal business model, contents) to previous papers related to PSS, digital contents, and Information Systems (IS). Finally, this paper suggests three implications from the analyzed results. A successful PSS needs to provide differentiated value for customers, such as a convenient user interface, e.g., the simple design of iTunes (iPod) and the provision of connection to the Kindle Store without any charge. A successful PSS also requires a mutually beneficial business model, as Apple and Amazon implement policies that provide reasonable profit sharing for third parties. Finally, a successful PSS requires sufficient quantities of applications and contents.