• Title/Summary/Keyword: neighbor selection


Calculating Attribute Weights in K-Nearest Neighbor Algorithms using Information Theory (정보이론을 이용한 K-최근접 이웃 알고리즘에서의 속성 가중치 계산)

  • Lee Chang-Hwan
    • Journal of KIISE:Software and Applications / v.32 no.9 / pp.920-926 / 2005
  • Nearest neighbor algorithms classify an unseen input instance by selecting similar cases and using their discovered class membership to predict the unknown features of the input instance. The usefulness of nearest neighbor algorithms has been demonstrated sufficiently in many real-world domains. Assigning proper weights to the attributes is an important issue in these algorithms. In this paper, we therefore propose a new method that automatically assigns each attribute a weight reflecting its importance with respect to the target attribute. The method has been implemented as a computer program, and its effectiveness has been tested on a number of publicly available machine learning databases.
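The abstract above does not state the weighting formula, so the following is only a minimal sketch under an assumption: each attribute's weight is taken to be its mutual information with the class label, and the weights feed a mismatch-based distance over discrete attributes. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def attribute_weights(rows, labels):
    """One weight per attribute column: its mutual information with the label."""
    return [mutual_information(col, labels) for col in zip(*rows)]

def weighted_knn_predict(rows, labels, weights, query, k=3):
    """Majority vote among the k rows with the smallest weighted mismatch distance."""
    def dist(row):
        # Each mismatching attribute contributes its weight to the distance.
        return sum(w for w, a, b in zip(weights, row, query) if a != b)
    nearest = sorted(range(len(rows)), key=lambda i: dist(rows[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Toy data (hypothetical): the first attribute determines the label,
# the second attribute is independent of it.
rows = [("sunny", "a"), ("sunny", "b"), ("rainy", "a"), ("rainy", "b")]
labels = ["yes", "yes", "no", "no"]
weights = attribute_weights(rows, labels)
prediction = weighted_knn_predict(rows, labels, weights, ("sunny", "b"), k=1)
```

In this toy case the informative attribute receives a weight of one bit and the uninformative one a weight of zero, so only the first attribute influences which neighbor is chosen.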

A Neighbor Selection Technique for Improving Efficiency of Local Search in Load Balancing Problems (부하평준화 문제에서 국지적 탐색의 효율향상을 위한 이웃해 선정 기법)

  • 강병호;조민숙;류광렬
    • Journal of KIISE:Software and Applications / v.31 no.2 / pp.164-172 / 2004
  • For a local search algorithm to find a better-quality solution, it must generate and evaluate a sufficiently large number of candidate solutions as neighbors at each iteration, which demands a considerable amount of CPU time. This paper presents a method that selectively generates only promising candidate neighbors, so that the number of neighbors can be kept low and the search becomes more efficient. In our method, a newly generated candidate solution is probabilistically selected to become a neighbor based on a quality estimate determined heuristically by a very simple evaluation of the candidate. Experimental results on the problem of load balancing for production scheduling show that our candidate selection method outperforms other random or greedy selection methods in terms of solution quality given the same amount of CPU time.
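The idea of filtering candidates with a cheap estimate before the full evaluation can be sketched as follows. This is a hedged illustration only: the abstract specifies neither the heuristic nor the selection probability, so the logistic `accept_probability`, the load-gap estimate, and the toy job sizes are all assumptions made for this example.

```python
import math
import random

def imbalance(assign, sizes, n_machines):
    """Full evaluation: largest machine load minus smallest machine load."""
    loads = [0] * n_machines
    for job, m in enumerate(assign):
        loads[m] += sizes[job]
    return max(loads) - min(loads)

def accept_probability(estimate, temperature=1.0):
    """Map a cheap quality estimate (higher is better) to a probability."""
    z = max(-50.0, min(50.0, estimate / temperature))  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-z))

def local_search(sizes, n_machines, iterations=200, candidates=30, seed=0):
    rng = random.Random(seed)
    assign = [rng.randrange(n_machines) for _ in sizes]
    best = imbalance(assign, sizes, n_machines)
    for _ in range(iterations):
        loads = [0] * n_machines
        for job, m in enumerate(assign):
            loads[m] += sizes[job]
        neighbors = []
        for _ in range(candidates):
            job, dst = rng.randrange(len(sizes)), rng.randrange(n_machines)
            src = assign[job]
            if dst == src:
                continue
            # Cheap estimate (assumed): positive when moving the job narrows
            # the load gap between the source and destination machines.
            estimate = (loads[src] - loads[dst]) - sizes[job]
            if rng.random() < accept_probability(estimate):
                neighbors.append((job, dst))
        # Only the retained neighbors receive the full evaluation.
        for job, dst in neighbors:
            old = assign[job]
            assign[job] = dst
            cost = imbalance(assign, sizes, n_machines)
            if cost < best:
                best = cost
                break  # keep the first improving move
            assign[job] = old  # otherwise undo the move
    return assign, best

sizes = [5, 3, 8, 2, 7, 4, 6, 1]  # hypothetical job sizes
assign, best = local_search(sizes, n_machines=3)
```

Raising `temperature` makes the filter more permissive, which moves the behavior toward plain random neighbor generation.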

Study on failure mode prediction of reinforced concrete columns based on class imbalanced dataset

  • Mingyi Cai;Guangjun Sun;Bo Chen
    • Earthquakes and Structures / v.27 no.3 / pp.177-189 / 2024
  • Accurately predicting the failure modes of reinforced concrete (RC) columns is essential for structural design and assessment. In this study, the challenges of imbalanced datasets and complex feature selection in machine learning (ML) methods were addressed through an optimized ML approach. By combining feature selection and oversampling techniques, the prediction of seismic failure modes in rectangular RC columns was improved. Two feature selection methods were used to identify six input parameters. To tackle class imbalance, the Borderline-SMOTE1 algorithm was employed, enhancing the learning capabilities of the models for minority classes. Eight ML algorithms were trained and fine-tuned using k-fold shuffle split cross-validation and grid search. The results showed that the artificial neural network model achieved 96.77% accuracy, while k-nearest neighbor, support vector machine, and random forest models each achieved 95.16% accuracy. The balanced dataset led to significant improvements, particularly in predicting the flexure-shear failure mode, with accuracy increasing by 6%, recall by 8%, and F1 scores by 7%. The use of the Borderline-SMOTE1 algorithm significantly improved the recognition of samples at failure mode boundaries, enhancing the classification performance of models like k-nearest neighbor and decision tree, which are highly sensitive to data distribution and decision boundaries. This method effectively addressed class imbalance and selected relevant features without requiring complex simulations like traditional methods, proving applicable for discerning failure modes in various concrete members under seismic action.
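For readers unfamiliar with Borderline-SMOTE1, the sketch below shows its two steps as commonly described: find the minority samples in the "danger" zone (at least half, but not all, of their m nearest neighbors belong to the majority class), then interpolate new samples between those points and their minority-class neighbors. It is a simplified illustration and not the study's pipeline; the parameters `m` and `k`, the function names, and the toy points are assumptions.

```python
import math
import random

def danger_points(minority, majority, m=3):
    """Indices of minority points with m/2 <= (majority neighbors) < m."""
    danger = []
    for i, p in enumerate(minority):
        others = [(math.dist(p, q), False) for j, q in enumerate(minority) if j != i]
        others += [(math.dist(p, q), True) for q in majority]
        n_major = sum(is_major for _, is_major in sorted(others)[:m])
        if m / 2 <= n_major < m:
            danger.append(i)
    return danger

def borderline_smote1(minority, majority, n_new, m=3, k=2, seed=0):
    """Create n_new synthetic minority points near the class boundary."""
    rng = random.Random(seed)
    danger = danger_points(minority, majority, m)
    synthetic = []
    while danger and len(synthetic) < n_new:
        i = rng.choice(danger)
        p = minority[i]
        # k nearest minority-class neighbors of the danger point
        nn = sorted((j for j in range(len(minority)) if j != i),
                    key=lambda j: math.dist(p, minority[j]))[:k]
        q = minority[rng.choice(nn)]
        gap = rng.random()
        # Interpolate between the danger point and the chosen neighbor.
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic

# Hypothetical 2-D toy data: one minority point sits next to the majority class.
minority = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0)]
majority = [(1.2, 0.0), (1.1, 0.1), (3.0, 3.0)]
danger = danger_points(minority, majority)
synthetic = borderline_smote1(minority, majority, n_new=4)
```

Because new points are generated only around danger points, the oversampling concentrates on the decision boundary rather than on the interior of the minority class.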

Capacity aware Scalable Video Coding in P2P on Demand Streaming Systems

  • Xing, Changyou;Chen, Ming;Hu, Chao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.9 / pp.2268-2283 / 2013
  • Scalable video coding can handle the peer heterogeneity of P2P streaming applications, but comprehensive studies on how to use it to improve video playback quality are still lacking. In this paper, we propose a capacity-aware scalable video coding mechanism for P2P on-demand streaming systems. The proposed mechanism includes capacity-based neighbor selection, adaptive data scheduling, and streaming layer adjustment. It enables each peer to select appropriate streaming layers, acquire streaming chunks in the proper sequence, and choose specific peers to provide them. Simulation results show that the mechanism decreases the system's startup and playback delay and increases video playback quality and playback continuity, thus providing a better quality of experience for users.

Nearest Neighbor Based Prototype Classification Preserving Class Regions

  • Hwang, Doosung;Kim, Daewon
    • Journal of Information Processing Systems / v.13 no.5 / pp.1345-1357 / 2017
  • A prototype selection method chooses a small set of training points from a whole set of class data. As the data size increases, the selected prototypes play a significant role in covering class regions and learning a discriminant rule. This paper discusses methods for selecting prototypes in a classification framework. We formulate the prototype selection problem as a set covering optimization problem in which the sets are composed using a distance metric and predefined classes. This formulation lets us focus only on the prototypes of each class, without considering the points of other classes. Whether a training point becomes a prototype is decided by checking its number of neighbors and whether it has been preselected. In this setting, we propose a greedy algorithm that chooses the most relevant points for preserving the class-dominant regions. The proposed method is simple to implement, has no parameters to adapt, and achieves better or comparable results on both artificial and real-world problems.

A Refined Neighbor Selection Algorithm for Clustering-Based Collaborative Filtering (클러스터링기반 협동적필터링을 위한 정제된 이웃 선정 알고리즘)

  • Kim, Taek-Hun;Yang, Sung-Bong
    • The KIPS Transactions:PartD / v.14D no.3 s.113 / pp.347-354 / 2007
  • It is not easy for customers to find valuable information about goods among the countless items available on the Internet. To save the time and effort customers spend searching for the goods they want, it is very important for a recommender system to predict customers' preferences accurately. In this paper, we present a refined neighbor selection algorithm for clustering-based collaborative filtering in recommender systems. The algorithm uses a graph approach to search more efficiently for the set of influential customers with respect to a given customer; the search relies on the concepts of weighted similarity and ranked clustering. The experimental results show that recommender systems using the proposed method find proper neighbors and give good prediction quality.

Image Feature Point Selection Method Using Nearest Neighbor Distance Ratio Matching (최인접 거리 비율 정합을 이용한 영상 특징점 선택 방법)

  • Lee, Jun-Woo;Jeong, Jea-Hyup;Kang, Jong-Wook;Na, Sang-Il;Jeong, Dong-Seok
    • Journal of the Institute of Electronics and Information Engineers / v.49 no.12 / pp.124-130 / 2012
  • In this paper, we propose a feature point selection method for MPEG CDVS CE-7, which is in the process of international standardization. Among the large number of extracted feature points, the more important ones used in image matching should be selected to keep the image descriptor compact. The proposed method removes, in the extraction phase, the feature points that would be filtered out by nearest neighbor distance ratio matching in the matching phase. This avoids wasting feature points and allows additional feature points to be employed. The experimental results show that the proposed method improves the true positive rate by about 2.3% in the pair-wise matching test compared with the Test Model.
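The nearest neighbor distance ratio test that the abstract refers to can be sketched as below. This shows only the standard ratio test on descriptor vectors. How the paper applies it during extraction is not described in the abstract, so `select_feature_points`, the 0.8 threshold, and the toy descriptors are assumptions made for illustration.

```python
import math

def nndr_matches(query_desc, ref_desc, ratio=0.8):
    """Return (query_index, ref_index) pairs that pass the distance ratio test."""
    matches = []
    for qi, q in enumerate(query_desc):
        dists = sorted((math.dist(q, r), ri) for ri, r in enumerate(ref_desc))
        if len(dists) < 2:
            continue  # the ratio needs a first and a second nearest neighbor
        (d1, best), (d2, _) = dists[0], dists[1]
        # Accept only if the best match is clearly closer than the runner-up.
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches

def select_feature_points(query_desc, ref_desc, ratio=0.8):
    """Keep only the query points whose nearest match is unambiguous."""
    kept = {qi for qi, _ in nndr_matches(query_desc, ref_desc, ratio)}
    return [qi for qi in range(len(query_desc)) if qi in kept]

# Hypothetical 2-D descriptors: the second query point is equally close
# to two reference points, so the ratio test rejects it.
query_desc = [(0.0, 0.0), (5.0, 5.0)]
ref_desc = [(0.0, 0.1), (4.0, 4.0), (6.0, 6.0)]
matches = nndr_matches(query_desc, ref_desc)
selected = select_feature_points(query_desc, ref_desc)
```

Points that fail the test contribute nothing at matching time, which is why discarding them earlier frees room in a size-limited descriptor.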

A Design of HPPS(Hybrid Preference Prediction System) for Customer-Tailored Service (고객 맞춤 서비스를 위한 HPPS(Hybrid Preference Prediction System) 설계)

  • Jeong, Eun-Hee;Lee, Byung-Kwan
    • Journal of Korea Multimedia Society / v.14 no.11 / pp.1467-1477 / 2011
  • This paper proposes the design of HPPS (Hybrid Preference Prediction System), which analyzes user profiles and the similarity among users to predict preferences precisely for customer-tailored service. Unlike the existing NBCFA (Neighborhood Based Collaborative Filtering Algorithm), the design follows these rules. First, if no neighbor's commodity rating value is available for the preference prediction formula, the formula uses the average rating of the commodity. Second, the formula reflects a weighting value obtained by analyzing the user's characteristics. Finally, when the nearest neighbors are selected, the similarity, the commodity rating, and the rating frequency are considered. As a result, the first and second preference prediction rules improve the precision of HPPS by 97.24%, and the nearest neighbor selection method improves it by 75%, compared with the existing NBCFA.
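The first rule above, falling back to the commodity's average rating when no neighbor has rated it, can be illustrated with a small neighborhood-based predictor. This is a sketch under assumptions, not the HPPS formula: it uses a plain similarity-weighted average, omits the user-characteristic weighting and rating frequency, and the names `predict_rating`, `ratings`, and `similarity` are hypothetical.

```python
def predict_rating(user, item, ratings, similarity, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    item_ratings = [r[item] for u, r in ratings.items() if item in r and u != user]
    neighbors = sorted(
        (u for u in ratings
         if u != user and item in ratings[u]
         and similarity.get((user, u), 0.0) > 0.0),
        key=lambda u: similarity[(user, u)],
        reverse=True,
    )[:k]
    if not neighbors:
        # Fallback rule: no usable neighbor rating, so use the item's average.
        return sum(item_ratings) / len(item_ratings) if item_ratings else None
    total = sum(similarity[(user, u)] * ratings[u][item] for u in neighbors)
    return total / sum(similarity[(user, u)] for u in neighbors)

# Hypothetical ratings and precomputed user-user similarities.
ratings = {
    "ann": {"book": 4},
    "bob": {"book": 5, "pen": 2},
    "cat": {"book": 3, "pen": 4},
    "dan": {"pen": 3},
}
similarity = {("ann", "bob"): 0.9, ("ann", "cat"): 0.1}

weighted = predict_rating("ann", "pen", ratings, similarity)   # uses bob and cat
fallback = predict_rating("eve", "pen", ratings, similarity)   # no neighbors known
```

The fallback keeps the predictor defined for users with no similar raters, a case in which a purely neighbor-based formula would otherwise divide by zero.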

Centroid and Nearest Neighbor based Class Imbalance Reduction with Relevant Feature Selection using Ant Colony Optimization for Software Defect Prediction

  • B., Kiran Kumar;Gyani, Jayadev;Y., Bhavani;P., Ganesh Reddy;T, Nagasai Anjani Kumar
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.1-10 / 2022
  • Software defect prediction (SDP) is currently among the most active research topics in software engineering. Early detection of defects lowers the cost of the software and also improves reliability. Machine learning techniques are widely used to create SDP models based on programming measures. The majority of defect prediction models in the literature have problems with class imbalance and high dimensionality. In this paper, we propose the Centroid and Nearest Neighbor based Class Imbalance Reduction (CNNCIR) technique, which considers dataset distribution characteristics to generate symmetry between defective and non-defective records in imbalanced datasets. The proposed approach is compared with SMOTE (Synthetic Minority Oversampling Technique). The high-dimensionality problem is addressed by choosing relevant features with the Ant Colony Optimization (ACO) technique. We used nine different classifiers to analyze six open-source software defect datasets from the PROMISE repository and seven performance measures to evaluate them. The results show that the proposed CNNCIR method with ACO-based feature selection outperforms SMOTE in the majority of cases.

Prototype based Classification by Generating Multidimensional Spheres per Class Area (클래스 영역의 다차원 구 생성에 의한 프로토타입 기반 분류)

  • Shim, Seyong;Hwang, Doosung
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.21-28 / 2015
  • In this paper, we propose prototype-based classification learning that uses the nearest-neighbor rule. The nearest-neighbor rule is applied to segment the class area of all the training data into spheres, each containing data from a single class. Prototypes are the centers of the spheres, and each radius is computed as the midpoint of two distances: the distance to the farthest point of the same class and the distance to the nearest point of another class. We then transform the prototype selection problem into a set covering problem in order to determine the smallest set of prototypes that covers all the training data. The proposed prototype selection method is based on a greedy algorithm that is applied to the training data of each class. The proposed method has low complexity and is well suited to parallel implementation. The prototype-based classifier takes the set of prototypes and predicts the class of test data by the nearest neighbor rule. In experiments, the generalization performance of our prototype classifier is superior to that of the nearest neighbor classifier, the Bayes classifier, and another prototype classifier.
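The sphere construction and greedy covering described above might look like the following sketch. It rests on one interpretation of the abstract: the "farthest same-class point" is taken to be the farthest one that is still closer than the nearest point of another class. It also assumes at least two classes and no identical points with different labels. All function names and the toy data are hypothetical.

```python
import math

def build_spheres(points, labels):
    """For each point: (center index, radius, indices of covered same-class points)."""
    spheres = []
    for i, (p, y) in enumerate(zip(points, labels)):
        # Distance to the nearest point of another class.
        d_other = min(math.dist(p, q) for q, z in zip(points, labels) if z != y)
        # Same-class points strictly closer than that (always includes i itself).
        inside = [j for j, (q, z) in enumerate(zip(points, labels))
                  if z == y and math.dist(p, q) < d_other]
        d_same = max(math.dist(p, points[j]) for j in inside)
        spheres.append((i, (d_same + d_other) / 2.0, set(inside)))
    return spheres

def greedy_prototypes(points, labels):
    """Greedy set cover: repeatedly take the sphere covering most uncovered points."""
    spheres = build_spheres(points, labels)
    uncovered = set(range(len(points)))
    chosen = []
    while uncovered:
        center, radius, covered = max(spheres, key=lambda s: len(s[2] & uncovered))
        chosen.append((center, radius))
        uncovered -= covered
    return chosen

def predict(query, prototypes, points, labels):
    """Nearest neighbor rule over the prototype centers."""
    center, _ = min(prototypes, key=lambda pr: math.dist(query, points[pr[0]]))
    return labels[center]

# Hypothetical toy data: two well-separated classes.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b", "b"]
prototypes = greedy_prototypes(points, labels)
guess = predict((0.5, 0.5), prototypes, points, labels)
```

Each sphere depends only on distances from its own center, so the sphere-building loop could be run independently per point or per class, consistent with the abstract's remark about parallel implementation.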