Search | Korea Science

Variable Selection with Regression Trees

Chang, Young-Jae
- The Korean Journal of Applied Statistics / v.23 no.2 / pp.357-366 / 2010
Many tree algorithms have been developed for regression problems. Although they are regarded as good algorithms, most of them lose prediction accuracy when there are many noise variables. To handle this problem, we propose the multi-step GUIDE, a regression tree algorithm with a variable selection process. The multi-step GUIDE performs better than some well-known algorithms such as Random Forest and MARS. The results of a simulation study show that the multi-step GUIDE outperforms other algorithms in terms of variable selection and prediction accuracy. It generally selects the important variables correctly while including relatively few noise variables, and consequently gives good prediction accuracy.
https://doi.org/10.5351/KJAS.2010.23.2.357

An Integrated Neural Network Model for Bankruptcy Prediction Using Link Weight Analysis

Lee Woongkyu;Lim Young Ha
- Proceedings of the Korea Association of Information Systems Conference / 2002.11a / pp.289-312 / 2002
This study proposes a link weight analysis approach for choosing input variables and an integrated model for more accurate bankruptcy prediction. The link weight analysis approach chooses input variables by analyzing each input node's link weight, which is the absolute value of the link weight between an input node and the hidden layer. The approach comprises two methods: the weak-linked neurons elimination method and the strong-linked neurons selection method. The integrated model is a combined model that adapts the Bagging method, using the average value of four models: the optimal weak-linked neurons elimination method, the optimal strong-linked neurons selection method, a decision tree model, and MDA. The methods suggested in this study (the optimal strong-linked neurons selection method, the optimal weak-linked neurons elimination method, and the integrated model) show much higher accuracy than MDA and the decision tree model. In particular, the integrated model shows much higher accuracy than MDA and the decision tree model, and slightly higher accuracy than the optimal weak-linked neurons elimination method and the optimal strong-linked neurons selection method.
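The link weight idea in the abstract above can be illustrated with a minimal sketch. It assumes that an input node's strength is the sum of the absolute weights on its links to the hidden layer, and that strong-linked selection keeps the top-ranked inputs; both are assumptions made for illustration, and the function names and weights are hypothetical rather than taken from the paper.

```python
# Hypothetical sketch: rank input nodes by the summed absolute weight of
# their links to the hidden layer, then keep the strongest ones.
# The aggregation rule (sum of absolute values) is an assumption.

def link_strengths(weights):
    """weights[i][j] is the weight from input node i to hidden node j."""
    return [sum(abs(w) for w in row) for row in weights]


def select_strong_inputs(weights, keep):
    """Return the sorted indices of the `keep` inputs with the largest strength."""
    strengths = link_strengths(weights)
    ranked = sorted(range(len(strengths)), key=lambda i: strengths[i], reverse=True)
    return sorted(ranked[:keep])


# Made-up weights for 4 inputs and 3 hidden nodes (illustration only).
W = [
    [0.9, -1.2, 0.4],     # input 0: strongly linked
    [0.05, 0.02, -0.01],  # input 1: weakly linked
    [-0.7, 0.3, 0.8],     # input 2: strongly linked
    [0.1, -0.1, 0.05],    # input 3: weakly linked
]
strengths = link_strengths(W)
selected = select_strong_inputs(W, keep=2)
```

Eliminating weak-linked inputs would be the mirror image: drop the lowest-ranked indices instead of keeping the highest.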

A Study on the Verification of Significance of Assessment Items for Selecting Start-ups: Focusing on Project Fostering Start-ups through Leading Universities

Jung, Kyung Hee;Sung, Chang So
- Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.13 no.4 / pp.13-22 / 2018
In this study, we examined the accuracy of the assessment items used to select start-ups for a start-up support project and verified whether they are valid as selection criteria. Results for 973 start-ups that applied to the project for fostering start-ups through leading universities were collected, and logistic regression was performed using SPSS 18.0. The results are summarized as follows. First, differences in the characteristics of start-ups were identified in terms of selection. Second, the assessment items affecting selection were gender in 2015, capability of the founder, business establishment in 2016, performance and potential in the global market, and business startup in 2017. Third, the overall selection accuracy analysis for the last three years confirmed that the accuracy of selection decreased each year and that the accuracy of predicting selection was lower than that of predicting non-selection. This means that the current assessment items are inaccurate for selecting start-ups, and that changes to the items in response to the changing start-up environment each year have lowered selection accuracy. This study is meaningful in that it highlights the importance of the assessment items and the need to improve them so that good start-ups can be screened and the efficiency of start-up support policies enhanced.

Optimal Variable Selection in a Thermal Error Model for Real Time Error Compensation

Hwang, Seok-Hyun;Lee, Jin-Hyeon;Yang, Seung-Han
- Journal of the Korean Society for Precision Engineering / v.16 no.3 s.96 / pp.215-221 / 1999
The objective of a thermal error compensation system in machine tools is to improve the accuracy of the machine tool through real-time error compensation. The accuracy of the machine tool depends entirely on the accuracy of the thermal error model. A thermal error model can be obtained from an appropriate combination of temperature variables. The proposed method for optimal variable selection in the thermal error model is based on correlation grouping and successive regression analysis. Collinearity is reduced by the correlation grouping, and a judgment function that minimizes the residual mean square is used. The linear model is more robust against measurement noise than an engineering judgment model that includes higher-order terms of the variables. The proposed method is more effective for real-time error compensation because of its reduced computation time, sufficient model accuracy, and robustness.
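The abstract above does not spell out how correlation grouping works, so the following is only a rough sketch of one plausible reading: temperature variables whose Pearson correlation with a group's first member reaches a threshold are placed in the same group, so that a later regression step can pick one representative per group. The greedy rule, the threshold of 0.9, the sensor names, and the readings are all hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_groups(series, threshold):
    """Greedily group variables that correlate strongly with a group's first member."""
    groups = []
    for name in series:
        for group in groups:
            if abs(pearson(series[name], series[group[0]])) >= threshold:
                group.append(name)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([name])
    return groups


# Made-up temperature readings (illustration only).
temps = {
    "spindle": [20.0, 22.0, 25.0, 29.0, 30.0],
    "bearing": [21.0, 23.1, 26.0, 30.2, 31.0],
    "ambient": [20.0, 19.5, 20.5, 19.8, 20.2],
}
groups = correlation_groups(temps, threshold=0.9)
```

Keeping a single variable from each group is one way to limit collinearity before fitting the regression model.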

A Hybrid Selection Method of Helpful Unlabeled Data Applicable for Semi-Supervised Learning Algorithm

Le, Thanh-Binh;Kim, Sang-Woon
- IEIE Transactions on Smart Processing and Computing / v.3 no.4 / pp.234-239 / 2014
This paper presents an empirical study on selecting a small amount of useful unlabeled data to improve the classification accuracy of semi-supervised learning algorithms. In particular, a hybrid method of unifying the simply recycled selection method and the incrementally-reinforced selection method was considered and evaluated empirically. The experimental results, which were obtained from well-known benchmark data sets using semi-supervised support vector machines, demonstrated that the hybrid method works better than the traditional ones in terms of the classification accuracy.
https://doi.org/10.5573/IEIESPC.2014.3.4.234

Comparison of different digital shade selection methodologies in terms of accuracy

Nursen Sahin;Cagri Ural
- The Journal of Advanced Prosthodontics / v.16 no.1 / pp.38-47 / 2024
PURPOSE. This study aims to evaluate the accuracy of different shade selection techniques and determine the matching success of crown restorations fabricated using digital shade selection techniques. MATERIALS AND METHODS. Teeth numbers 11 and 21 were prepared on a typodont model. For the #11 tooth, six different crowns were fabricated with randomly selected colors and set as the target crowns. The following four test groups were established: Group C, where the visual shade selection was performed using the Vita 3D Master Shade Guide and the group served as the control; Group Ph, where the shade selection was performed under the guidance of dental photography; Group S, where the shade selection was performed by measuring the target tooth color using a spectrophotometer; and Group I, where the shade selection was performed by scanning the test specimens and target crowns using an intraoral scanner. Based on the test groups, 24 crowns were fabricated using different shade selection techniques. The ΔE values were calculated according to the CIEDE2000 (2:1:1) formula. The collected data were analyzed by means of a one-way analysis of variance. RESULTS. For the four test groups (Groups C, Ph, S, and I), the following mean ΔE values were obtained: 2.74, 3.62, 2.13, and 3.5, respectively. No significant differences were found among the test groups. CONCLUSION. Although there was no statistically significant difference among the shade selection techniques, Group S had relatively lower ΔE values. Moreover, according to the test results, the spectrophotometer shade selection technique may provide more successful clinical results.
https://doi.org/10.4047/jap.2024.16.1.38

Robustness of Selection Indices in Murrah Buffaloes

Gandhi, R.S.;Joshi, B.K.
- Asian-Australasian Journal of Animal Sciences / v.17 no.2 / pp.159-163 / 2004
Data pertaining to first lactation records of 316 Murrah buffaloes, progeny of 47 sires, maintained at NDRI Farm for a period of 18 years were analysed to construct selection indices and to examine their robustness by changing the relative economic values of different economic traits. A total of 120 selection indices were constructed for three sets of relative economic values (40 for each set) considering different combinations of seven first lactation traits, viz. age at first calving (AFC), first lactation 305 day or less milk yield (FLMY), first lactation length (FLL), first calving interval (FCI), milk yield per day of first lactation length (MY/FLL), milk yield per day of first calving interval (MY/FCI) and milk yield per day of age at second calving (MY/ASC). The three sets of relative economic values were based on economic values of different traits, 1% standard deviation of different traits and regression of different traits on FLMY. The 'optimum' indices for the first two sets had five traits each, namely AFC, FLMY, FLL, FCI and MY/ASC, giving improvement in aggregate genotype of Rs. 269.11 and Rs. 174.88, respectively. The accuracy of selection from the two indices was 70.79 and 69.39%, respectively. The 'best' selection index from the third set of data again had five traits (AFC, FLMY, FLL, FCI and MY/FLL), giving genetic gain of Rs. 124.16 and accuracy of selection of 71.81%. The critical levels or break-even points for FLMY for varying levels of AFC and FCI estimated from the 'optimum' index suggested the need to raise the present production level of the herd or to reduce AFC or FCI. It was concluded that economic values of various first lactation traits were the most appropriate basis for constructing selection indices, compared with other criteria for assigning relative economic weights in Murrah buffaloes.
https://doi.org/10.5713/ajas.2004.159

Evaluation of Attribute Selection Methods and Prior Discretization in Supervised Learning

Cha, Woon Ock;Huh, Moon Yul
- Communications for Statistical Applications and Methods / v.10 no.3 / pp.879-894 / 2003
We evaluated the efficiency of applying attribute selection methods and prior discretization to supervised learning, modelled by C4.5 and Naive Bayes. Three databases were obtained from the UCI data archive, each consisting of continuous attributes except for one decision attribute. Four methods were used for attribute selection: MDI, ReliefF, Gain Ratio and the Consistency-based method. MDI and ReliefF can be used for both continuous and discrete attributes, but the other two methods can be used only for discrete attributes. Discretization was performed using the Fayyad and Irani method. To investigate the effect of noise in the database, noise was introduced into the data sets at levels of 10 or 20%, and then the data, both with and without noise, were processed through the steps of attribute selection, discretization and classification. The results indicate that classifying the data based on selected attributes yields higher accuracy than classifying the full data set, and that prior discretization does not lower the accuracy.
https://doi.org/10.5351/CKSS.2003.10.3.879
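Of the four attribute selection methods named in the abstract above, Gain Ratio is simple enough to sketch. The following minimal example uses the standard definition (information gain divided by the split information of the attribute) on discrete attributes; the function names and toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(attribute, labels):
    """Information gain of `attribute` about `labels`, divided by split information."""
    n = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    # Entropy of the labels that remains after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(attribute)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data (illustration only): one attribute mirrors the class, one is unrelated.
labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]
noise = ["a", "b", "a", "b"]
ratio_informative = gain_ratio(informative, labels)
ratio_noise = gain_ratio(noise, labels)
```

Because the measure is defined on discrete values, a continuous attribute would need to be discretized first, which matches the abstract's remark that Gain Ratio applies only to discrete attributes.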

ModifiedFAST: A New Optimal Feature Subset Selection Algorithm

Nagpal, Arpita;Gaur, Deepti
- Journal of information and communication convergence engineering / v.13 no.2 / pp.113-122 / 2015
Feature subset selection is a pre-processing step in learning algorithms. In this paper, we propose an efficient algorithm, ModifiedFAST, for feature subset selection. This algorithm is suitable for text datasets, and uses the concept of information gain to remove irrelevant and redundant features. A new optimal value of the threshold for symmetric uncertainty, used to identify relevant features, is found. The thresholds used by previous feature selection algorithms such as FAST, Relief, and CFS were not optimal. It has been proven that the threshold value greatly affects the percentage of selected features and the classification accuracy. A new unified performance metric that combines accuracy and the number of features selected has been proposed and applied in the proposed algorithm. It was experimentally shown that the percentage of selected features obtained by the proposed algorithm was lower than that obtained using existing algorithms in most of the datasets. The effectiveness of our algorithm with the optimal threshold was statistically validated against other algorithms.
https://doi.org/10.6109/jicce.2015.13.2.113
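The relevance test mentioned in the abstract above can be sketched with the standard definition of symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where I is the mutual information and H the entropy. This is not the ModifiedFAST algorithm itself: the threshold of 0.5, the function names, and the toy data are hypothetical, and the paper's optimal threshold is not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), which lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    # Mutual information from the joint entropy: I = H(X) + H(Y) - H(X, Y).
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def relevant_features(features, target, threshold):
    """Keep the names of features whose SU with the target exceeds the threshold."""
    return [name for name, column in features.items()
            if symmetric_uncertainty(column, target) > threshold]


# Toy data (illustration only).
target = [1, 1, 0, 0]
features = {"f_copy": [1, 1, 0, 0], "f_noise": [0, 1, 0, 1]}
kept = relevant_features(features, target, threshold=0.5)
```

Raising the threshold keeps fewer features, which is consistent with the abstract's point that the threshold strongly affects the percentage of selected features.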

Effective Multi-label Feature Selection based on Large Offspring Set created by Enhanced Evolutionary Search Process

Lim, Hyunki;Seo, Wangduk;Lee, Jaesung
- Journal of the Korea Society of Computer and Information / v.23 no.9 / pp.7-13 / 2018
Recent advances in data gathering techniques have improved the capability of information collection, enabling learning between gathered data patterns and application sub-tasks. A pattern can be associated with multiple labels, which demands multi-label learning capability; as a result, multi-label feature selection has drawn significant attention because it can improve multi-label learning accuracy. However, existing evolutionary multi-label feature selection methods suffer from an ineffective search process. In this study, we propose an evolutionary search process for multi-label feature selection. The proposed method creates a large set of offspring, or new feature subsets, and then retains the most promising feature subset. Experimental results demonstrate that the proposed method can identify feature subsets giving good multi-label classification accuracy much faster than conventional methods.
https://doi.org/10.9708/jksci.2018.23.09.007

Search Result 1,156, Processing Time 0.033 seconds
