• Title/Summary/Keyword: Nearest neighbor algorithm

Customer Relationship Management in Telecom Market using an Optimized Case-based Reasoning (최적화 사례기반추론을 이용한 통신시장 고객관계관리)

  • An, Hyeon-Cheol;Kim, Gyeong-Jae
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2006.11a / pp.285-288 / 2006
  • Most previous studies on improving the effectiveness of CBR have focused on the similarity function or on optimizing case features and their weights. However, some prior research indicates that finding the optimal k parameter for the k-nearest neighbor (k-NN) method is also crucial for improving the performance of a CBR system. Nonetheless, there have been few attempts to optimize the number of neighbors, especially using artificial intelligence (AI) techniques. In this study, we introduce a genetic algorithm (GA) to optimize the number of neighbors to combine as well as the weight of each feature. The new model is applied to a real-world case from a major telecommunication company in Korea to build a prediction model for customer profitability level. Experimental results show that our GA-optimized CBR approach outperforms other AI techniques on this multiclass classification problem.
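
A minimal sketch of the general idea, not the authors' implementation: a genetic algorithm searches over the number of neighbors k and one weight per feature, with fitness defined as the validation accuracy of a weighted k-NN classifier. The data is a random stand-in and all GA settings (population size, mutation rate) are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 600 cases, 8 features, 3 profitability levels.
X = rng.normal(size=(600, 8))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int) + (X[:, 5] > 1).astype(int)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

def fitness(chrom):
    """Chromosome = [k, w_1, ..., w_d]; fitness = validation accuracy."""
    k, w = int(chrom[0]), chrom[1:]
    # Scaling features by w is equivalent to a weighted Euclidean distance.
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr * w, y_tr)
    return clf.score(X_va * w, y_va)

def random_chrom():
    return np.concatenate(([rng.integers(1, 31)], rng.uniform(0, 1, X.shape[1])))

pop = [random_chrom() for _ in range(20)]
for gen in range(30):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                                   # elitist selection
    children = []
    for _ in range(10):
        a, b = rng.choice(10, size=2, replace=False)
        cut = rng.integers(1, len(parents[0]))           # one-point crossover
        child = np.concatenate((parents[a][:cut], parents[b][cut:]))
        if rng.random() < 0.2:                           # mutate k occasionally
            child[0] = rng.integers(1, 31)
        child[1:] = np.clip(child[1:] + rng.normal(0, 0.05, size=child[1:].shape), 0, 1)
        children.append(child)
    pop = parents + children

best = max(pop, key=fitness)
print("best k:", int(best[0]), "validation accuracy:", round(fitness(best), 3))
```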

A study on Web-based Video Panoramic Virtual Reality for Hoseo Cyber Shell Museum (비디오 파노라마 가상현실을 기반으로 하는 호서 사이버 패류 박물관의 연구)

  • Hong, Sung-Soo;Khan, Irfan;Kim, Chang-ki
    • Proceedings of the Korea Information Processing Society Conference / 2012.11a / pp.1468-1471 / 2012
  • It has always been a dream to recreate the experience of a particular place. Panoramic virtual reality is a technology for creating virtual environments in which the viewer can change the viewing angle and select the path of view in a dynamic scene. In this paper we examine efficient algorithms for registering and stitching images captured from a video stream. Two approaches are studied. In the first, dynamic programming is used to locate suitable key points and match them so that adjacent images can be merged, after which image blending is used for smooth color transitions. In the second, FAST and SURF detection are used to find distinctive features in the images, a nearest neighbor algorithm is used to match corresponding features, and a homography is estimated from the matched key points using RANSAC. The paper also covers automatically selecting (recognizing and comparing) the images to be stitched.
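
The second approach lends itself to a short sketch. The following is an illustrative pipeline, not the authors' code: it uses ORB as a freely licensed stand-in for the FAST/SURF combination, matches descriptors with a nearest-neighbor search plus Lowe's ratio test, and estimates the homography with RANSAC; the file names are placeholders.

```python
import cv2
import numpy as np

img1 = cv2.imread("frame_a.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file names
img2 = cv2.imread("frame_b.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)          # stand-in detector/descriptor
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Nearest-neighbor matching with Lowe's ratio test to keep distinctive matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des1, des2, k=2)
good = [p[0] for p in matches if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# RANSAC rejects outlier correspondences while fitting the homography.
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)

# Warp one frame into the other's coordinate frame as a basic two-image stitch.
h, w = img2.shape
canvas = cv2.warpPerspective(img1, H, (2 * w, h))
canvas[0:h, 0:w] = img2
cv2.imwrite("stitched.jpg", canvas)
```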

An Improvement of Finding Neighbors in Flocking Behaviors by Using a Simple Heuristic (단순한 휴리스틱을 사용하여 무리 짓기에서 이웃 에이전트 탐색방법의 성능 개선)

  • Jiang, Zi Shun;Lee, Jae-Moon
    • Journal of Korea Game Society / v.11 no.5 / pp.23-30 / 2011
  • Flocking behaviors are frequently used in games and computer graphics for realistic simulation of massive crowds. Since simulating massive crowds in real time is computationally intensive, there has been much research on efficient algorithms. In this paper, we show experimentally that a previous efficient flocking algorithm performs unnecessary computations, and we propose a novel algorithm that overcomes this weakness with a simple heuristic. A number of experiments were conducted to evaluate the proposed algorithm; the results show that it outperforms the previous efficient algorithm by about 21% on average.
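
The abstract does not spell out the heuristic itself, so the sketch below only illustrates the underlying problem: finding each agent's neighbors within an interaction radius, accelerated with a uniform grid instead of an O(n²) all-pairs scan.

```python
import numpy as np
from collections import defaultdict

def grid_neighbors(positions, radius):
    """Return, for each agent, the indices of agents within `radius`."""
    cell = radius  # cell size equal to the interaction radius
    buckets = defaultdict(list)
    for i, p in enumerate(positions):
        buckets[(int(p[0] // cell), int(p[1] // cell))].append(i)

    neighbors = [[] for _ in positions]
    for i, p in enumerate(positions):
        cx, cy = int(p[0] // cell), int(p[1] // cell)
        for dx in (-1, 0, 1):              # only the 3x3 surrounding cells
            for dy in (-1, 0, 1):
                for j in buckets.get((cx + dx, cy + dy), []):
                    if j != i and np.linalg.norm(positions[j] - p) <= radius:
                        neighbors[i].append(j)
    return neighbors

positions = np.random.default_rng(1).uniform(0, 100, size=(500, 2))
print(len(grid_neighbors(positions, radius=5.0)[0]), "neighbors found for agent 0")
```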

Medical Diagnosis Problem Solving Based on the Combination of Genetic Algorithms and Local Adaptive Operations (유전자 알고리즘 및 국소 적응 오퍼레이션 기반의 의료 진단 문제 자동화 기법 연구)

  • Lee, Ki-Kwang;Han, Chang-Hee
    • Journal of Intelligence and Information Systems / v.14 no.2 / pp.193-206 / 2008
  • Medical diagnosis can be considered a classification task that assigns disease types based on a patient's condition data, represented by a set of pre-defined attributes. This study proposes a hybrid genetic algorithm-based classification method for developing classifiers for multidimensional pattern classification problems related to medical decision making. The classification problem is solved by identifying separation boundaries that distinguish the various classes in the data pattern. The proposed method fits a finite number of regional agents to the data pattern by combining genetic algorithms and local adaptive operations. The local adaptive operations of an agent include expansion, avoidance, and relocation, one of which is performed according to the agent's fitness value. The classifier system has been tested on well-known medical data sets from the UCI machine learning database, showing performance superior to other methods such as nearest neighbor, decision tree, and neural network classifiers.
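
The exact rules for the three local operations are not given in the abstract, so the following is only a toy reading of the idea, not the authors' method: a regional agent is modeled as a hypersphere whose fitness is the class purity of the training points it covers, and the operation applied (expansion, avoidance, or relocation) depends on that fitness.

```python
import numpy as np

class RegionalAgent:
    def __init__(self, center, radius, label):
        self.center, self.radius, self.label = center, radius, label

    def fitness(self, X, y):
        """Fraction of covered points that belong to the agent's class."""
        inside = np.linalg.norm(X - self.center, axis=1) <= self.radius
        if not inside.any():
            return 0.0
        return float((y[inside] == self.label).mean())

    def adapt(self, X, y, rng):
        f = self.fitness(X, y)
        if f > 0.9:                 # expansion: grow inside a pure region
            self.radius *= 1.1
        elif f > 0.5:               # avoidance: shrink away from mixed data
            self.radius *= 0.9
        else:                       # relocation: jump to a point of its own class
            self.center = X[rng.choice(np.flatnonzero(y == self.label))]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)
agent = RegionalAgent(center=np.zeros(2), radius=1.0, label=1)
for _ in range(20):
    agent.adapt(X, y, rng)
print("final fitness:", round(agent.fitness(X, y), 2))
```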

An Algorithm for Tournament-based Big Data Analysis (토너먼트 기반의 빅데이터 분석 알고리즘)

  • Lee, Hyunjin
    • Journal of Digital Contents Society / v.16 no.4 / pp.545-553 / 2015
  • While all data has value in itself, most data collected in the real world is random and unstructured. To extract useful information from such data, transformation and analysis algorithms are needed; data mining is used for this purpose. Today there is a need not only for a variety of data mining techniques to analyze the data, but also for the computational resources and rapid analysis times that huge volumes of data demand. A common way to store huge volumes of data is Hadoop, and a common way to analyze data in Hadoop is the MapReduce framework. In this paper, we develop a tournament-based MapReduce method for efficiently porting algorithms developed for a single machine to the MapReduce framework. The proposed method can be applied to many analysis algorithms, and we show its usefulness by applying it to the frequently used data mining algorithms k-means and k-nearest neighbor classification.
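
A single-machine sketch of the tournament idea applied to k-NN, not the paper's MapReduce implementation: each data partition (standing in for a mapper) returns its local k best candidates for a query, and candidate lists are merged pairwise, keeping only the k closest at every round, until one list remains.

```python
import heapq
import numpy as np

def local_topk(partition, query, k):
    """Local "map" step: the k nearest candidates within one partition."""
    dists = np.linalg.norm(partition - query, axis=1)
    idx = np.argsort(dists)[:k]
    return [(dists[i], tuple(partition[i])) for i in idx]

def tournament_merge(candidate_lists, k):
    """Merge candidate lists pairwise, keeping the k closest per match."""
    while len(candidate_lists) > 1:
        merged = []
        for a, b in zip(candidate_lists[::2], candidate_lists[1::2]):
            merged.append(heapq.nsmallest(k, a + b))   # one tournament match
        if len(candidate_lists) % 2:                   # an odd list advances unchanged
            merged.append(candidate_lists[-1])
        candidate_lists = merged
    return candidate_lists[0]

rng = np.random.default_rng(0)
partitions = [rng.normal(size=(1000, 4)) for _ in range(8)]   # 8 simulated "mappers"
query, k = np.zeros(4), 5
local_results = [local_topk(p, query, k) for p in partitions]
for dist, point in tournament_merge(local_results, k):
    print(round(dist, 3), point)
```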

Design of an Efficient Parallel High-Dimensional Index Structure (효율적인 병렬 고차원 색인구조 설계)

  • Park, Chun-Seo;Song, Seok-Il;Sin, Jae-Ryong;Yu, Jae-Su
    • Journal of KIISE: Databases / v.29 no.1 / pp.58-71 / 2002
  • Multi-dimensional data such as images and spatial data generally require a large amount of storage space, and there is a limit to how much of such data can be stored and managed on a single workstation. Managing the data in a parallel computing environment, an area of active research, can yield greatly improved performance. In this paper, we propose a parallel high-dimensional index structure that exploits the parallelism of such an environment. The proposed index structure has an nP(processor)-n×mD(disk) architecture, a hybrid of the nP-nD and 1P-nD types. Its node structure increases fan-out and reduces the height of the index tree. A range search algorithm that maximizes I/O parallelism is also devised and applied to k-nearest neighbor queries. Various experiments show that the proposed method outperforms other parallel index structures.
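
The index structure itself is not reproduced here; the sketch below only illustrates, under simplifying assumptions, the two ideas named in the abstract: declustering data across several (simulated) disks so a range search can scan them in parallel, and answering a k-nearest neighbor query by enlarging the search range until k results are found.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

rng = np.random.default_rng(0)
data = rng.uniform(size=(20000, 16))            # high-dimensional vectors
m = 4                                           # number of simulated disks
disks = [data[i::m] for i in range(m)]          # round-robin declustering

def range_search(query, radius):
    def scan(partition):                        # one scan per disk, run concurrently
        d = np.linalg.norm(partition - query, axis=1)
        return partition[d <= radius], d[d <= radius]
    with ThreadPoolExecutor(max_workers=m) as pool:
        parts = list(pool.map(scan, disks))
    vecs = np.concatenate([p for p, _ in parts])
    dists = np.concatenate([d for _, d in parts])
    return vecs, dists

def knn(query, k, radius=0.5):
    while True:                                 # grow the radius until k hits exist
        vecs, dists = range_search(query, radius)
        if len(dists) >= k:
            order = np.argsort(dists)[:k]
            return vecs[order], dists[order]
        radius *= 2

vecs, dists = knn(rng.uniform(size=16), k=10)
print("10-NN distances:", np.round(dists, 3))
```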

Biometrics Based on Multi-View Features of Teeth Using Principal Component Analysis (주성분분석을 이용한 치아의 다면 특징 기반 생체식별)

  • Chang, Chan-Wuk;Kim, Myung-Su;Shin, Young-Suk
    • Korean Journal of Cognitive Science / v.18 no.4 / pp.445-455 / 2007
  • We present a new biometric identification system based on multi-view features of teeth using principal component analysis (PCA). The multi-view features consist of the frontal view, the left side view, and the right side view. In this paper, we try to lay the foundations of dental biometrics for secure access in real-life environments. We took pictures of the three views of teeth in a specially designed experimental environment and derived 42 principal components as the features for individual identification. Classification for individual identification is based on the nearest neighbor (NN) algorithm, using the distance between the multi-view teeth and the rotated multi-view teeth. After rotating the test data by two degrees, the average identification performance is 95.2% for the left side view and 91.3% for the right side view.
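
A minimal sketch of the recognition pipeline described above, with random stand-in images: project the data onto 42 principal components and identify a probe by its nearest neighbor in the gallery. The subject counts and image size are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_subjects, imgs_per_subject, img_dim = 20, 6, 64 * 64
X = rng.normal(size=(n_subjects * imgs_per_subject, img_dim))   # stand-in tooth images
y = np.repeat(np.arange(n_subjects), imgs_per_subject)          # subject labels

pca = PCA(n_components=42)                  # 42 principal components, as in the paper
features = pca.fit_transform(X)

gallery_X, gallery_y = features[::2], y[::2]    # enroll every other image
probe_X, probe_y = features[1::2], y[1::2]      # identify the rest

nn = KNeighborsClassifier(n_neighbors=1).fit(gallery_X, gallery_y)
print("identification rate:", nn.score(probe_X, probe_y))
```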

Enhancing Classification Performance of Temporal Keyword Data by Using Moving Average-based Dynamic Time Warping Method (이동 평균 기반 동적 시간 와핑 기법을 이용한 시계열 키워드 데이터의 분류 성능 개선 방안)

  • Jeong, Do-Heon
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.83-105 / 2019
  • This study aims to suggest an effective method for automatically classifying keywords with similar patterns by calculating the pattern similarity of temporal data. For this, large-scale news articles on the Web were collected, and time series data composed of 120 time segments were built. To create a training data set for the performance test of the proposed model, 440 representative keywords were manually classified according to 8 types of trend. This study introduces the Dynamic Time Warping (DTW) method, which has been commonly used in the field of time series analytics, and proposes an application model, MA-DTW, based on a Moving Average (MA) method that captures the tendency of a trend curve well. In automatic classification with a k-Nearest Neighbor (kNN) algorithm, Euclidean Distance (ED) and DTW achieved maximum micro-averaged F1 scores of 48.2% and 66.6% respectively, whereas the proposed model achieved a best micro-averaged F1 score of 74.3%. In all respects of the comprehensive experiments, the suggested model outperformed the ED and DTW methods.
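
A sketch of the MA-DTW idea under stated assumptions, not the paper's implementation: each keyword's time series is smoothed with a moving average before the standard DTW distance is computed, and classification is done with a 1-NN rule. The window size and the toy series are placeholders.

```python
import numpy as np

def moving_average(series, window=3):
    """Smooth a series with a simple moving average."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")

def dtw_distance(a, b):
    """Standard dynamic-programming DTW recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify_1nn(query, train_series, train_labels, window=3):
    q = moving_average(query, window)
    dists = [dtw_distance(q, moving_average(s, window)) for s in train_series]
    return train_labels[int(np.argmin(dists))]

rng = np.random.default_rng(0)
rising = [np.linspace(0, 1, 120) + rng.normal(0, 0.1, 120) for _ in range(5)]
flat = [rng.normal(0.5, 0.1, 120) for _ in range(5)]
train, labels = rising + flat, ["rising"] * 5 + ["flat"] * 5
print(classify_1nn(np.linspace(0, 1, 120), train, labels))
```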

A Fast Motion Estimation Scheme using Spatial and Temporal Characteristics (시공간 특성을 이용한 고속 움직임 백터 예측 방법)

  • 노대영;장호연;오승준;석민수
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.4 / pp.237-247 / 2003
  • Motion estimation (ME) is an important part of video encoding systems, since it can significantly reduce the bitrate while keeping the output quality of the encoded sequence. Unfortunately, this process may dominate the encoding time when a straightforward full search (FS) algorithm is used. Many fast algorithms reduce the computational complexity by limiting the number of search locations, at the expense of some accuracy in the motion estimation. In this paper, we introduce a new fast motion estimation method based on the spatio-temporal correlation of adjacent blocks. A reliable predicted motion vector (RPMV) is defined, and its reliability is shown on the basis of motion vectors obtained by FS. Both the magnitude and the direction of the RPMV are used in the proposed scheme. Experimental results show that the proposed method is about 11~14% faster than the nearest neighbor method, a well-known conventional fast scheme.
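
The derivation of the RPMV is not reproduced here; the sketch below only illustrates prediction-guided block matching: instead of a full search, the matcher starts from a motion vector predicted from neighboring blocks (a median predictor is assumed as a stand-in) and examines a small window around it.

```python
import numpy as np

def sad(block, ref, x, y):
    """Sum of absolute differences between a block and a reference patch."""
    h, w = block.shape
    return np.abs(ref[y:y + h, x:x + w].astype(int) - block.astype(int)).sum()

def predicted_search(block, ref, x0, y0, pred_mv, radius=2):
    """Search a (2*radius+1)^2 window centred on the predicted vector."""
    best, best_mv = None, (0, 0)
    px, py = pred_mv
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = x0 + px + dx, y0 + py + dy
            if 0 <= x <= ref.shape[1] - block.shape[1] and 0 <= y <= ref.shape[0] - block.shape[0]:
                cost = sad(block, ref, x, y)
                if best is None or cost < best:
                    best, best_mv = cost, (px + dx, py + dy)
    return best_mv, best

def median_predictor(left_mv, top_mv, topright_mv):
    """Median of neighbor motion vectors, a common prediction choice."""
    return tuple(np.median(np.array([left_mv, top_mv, topright_mv]), axis=0).astype(int))

ref = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
cur_block = ref[18:34, 21:37]          # a 16x16 block displaced by (dx=5, dy=2) from (16, 16)
mv, cost = predicted_search(cur_block, ref, 16, 16, median_predictor((4, 2), (6, 1), (5, 3)))
print("estimated MV:", mv, "SAD:", cost)
```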

A Scalable Clustering Method for Categorical Sequences (범주형 시퀀스들에 대한 확장성 있는 클러스터링 방법)

  • Oh, Seung-Joon;Kim, Jae-Yearn
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.2 / pp.136-141 / 2004
  • There has been enormous growth in the amount of commercial and scientific data, such as retail transactions, protein sequences, and web logs. Such datasets consist of sequence data that have an inherent sequential nature, yet few clustering algorithms take this sequentiality into account. In this paper, we study how to cluster sequence datasets. We propose a new similarity measure for computing the similarity between two sequences, present an efficient method for determining this measure, and develop a clustering algorithm. Because hierarchical clustering algorithms have too high a computational complexity for clustering large datasets, a new clustering method is required; we therefore propose a scalable clustering method using sampling and a k-nearest-neighbor method. Using a real dataset and a synthetic dataset, we show that the quality of the clusters generated by our approach is better than that of clusters produced by traditional algorithms.
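
The paper's similarity measure is not reproduced here; the sketch below only illustrates the scalability idea of clustering a small sample of sequences and then assigning every remaining sequence to its nearest sampled representative, with a normalized longest-common-subsequence score standing in for the proposed measure.

```python
import random

random.seed(0)

def lcs_similarity(a, b):
    """Longest common subsequence length, normalized to [0, 1]."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1] / max(len(a), len(b))

def cluster_with_sampling(sequences, n_samples=50, threshold=0.7):
    sample = random.sample(sequences, min(n_samples, len(sequences)))
    # Greedy single-pass clustering of the sample (a stand-in for clustering
    # the sampled sequences with a conventional algorithm).
    centers = []
    for s in sample:
        if not centers or max(lcs_similarity(s, c) for c in centers) < threshold:
            centers.append(s)
    # Nearest-neighbor assignment of every sequence to a sampled representative.
    labels = [max(range(len(centers)), key=lambda i: lcs_similarity(s, centers[i]))
              for s in sequences]
    return centers, labels

sequences = [random.choices("ABCDE", k=20) for _ in range(300)]
centers, labels = cluster_with_sampling(sequences)
print(len(centers), "clusters found for", len(sequences), "sequences")
```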