• Title/Summary/Keyword: K-nearest neighbor algorithm

Text-independent Speaker Identification Using Soft Bag-of-Words Feature Representation

  • Jiang, Shuangshuang;Frigui, Hichem;Calhoun, Aaron W.
    • International Journal of Fuzzy Logic and Intelligent Systems / v.14 no.4 / pp.240-248 / 2014
  • We present a robust speaker identification algorithm that uses novel features based on a soft bag-of-words representation and a simple Naive Bayes classifier. The bag-of-words (BoW) histogram feature descriptor is typically constructed by summarizing and identifying representative prototypes from low-level spectral features extracted from training data. In this paper, we define a generalization of the standard BoW. In particular, we define three types of BoW that are based on crisp voting, fuzzy memberships, and possibilistic memberships. We analyze our mapping with three common classifiers: the Naive Bayes classifier (NB), the K-nearest neighbor classifier (KNN), and support vector machines (SVM). The proposed algorithms are evaluated using large datasets that simulate medical crises. We show that the proposed soft bag-of-words feature representation achieves a significant improvement over state-of-the-art methods.
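
The three voting schemes named in the abstract above can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means and possibilistic c-means membership formulas; the abstract does not give the paper's exact definitions, so the function names, the fuzzifier `m`, and the scale `eta` are hypothetical choices made only for illustration.

```python
def sq_dist(x, p):
    """Squared Euclidean distance between a feature vector and a prototype."""
    return sum((a - b) ** 2 for a, b in zip(x, p))

def crisp_bow(features, prototypes):
    """Crisp voting: each feature vector votes only for its nearest prototype."""
    hist = [0.0] * len(prototypes)
    for x in features:
        d = [sq_dist(x, p) for p in prototypes]
        hist[d.index(min(d))] += 1.0
    return [h / len(features) for h in hist]

def fuzzy_bow(features, prototypes, m=2.0, eps=1e-12):
    """Fuzzy voting (assumed FCM-style): memberships over prototypes sum to 1."""
    hist = [0.0] * len(prototypes)
    for x in features:
        d = [sq_dist(x, p) + eps for p in prototypes]  # eps avoids division by zero
        inv = [dj ** (-1.0 / (m - 1.0)) for dj in d]
        total = sum(inv)
        for j, v in enumerate(inv):
            hist[j] += v / total
    return [h / len(features) for h in hist]

def possibilistic_bow(features, prototypes, eta=1.0, m=2.0):
    """Possibilistic voting (assumed PCM-style): each membership is independent."""
    hist = [0.0] * len(prototypes)
    for x in features:
        for j, p in enumerate(prototypes):
            hist[j] += 1.0 / (1.0 + (sq_dist(x, p) / eta) ** (1.0 / (m - 1.0)))
    return [h / len(features) for h in hist]

features = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
prototypes = [(0.0, 0.0), (10.0, 10.0)]
crisp = crisp_bow(features, prototypes)
fuzzy = fuzzy_bow(features, prototypes)
poss = possibilistic_bow(features, prototypes)
```

Any of the three histograms could then be passed to an NB, KNN, or SVM classifier as the feature vector for an utterance.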

Design of k-Nearest Neighbor Query Processing Algorithm Based on Order-Preserving Encryption (순서 유지 암호화 기반의 k-최근접 질의처리 알고리즘 설계)

  • Kim, Yong-Ki;Choi, KiSeok
    • Proceedings of the Korea Information Processing Society Conference / 2012.11a / pp.1410-1411 / 2012
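  • Recently, research on protecting the location information of objects and users in outsourced databases has been actively pursued so that mobile users can use location-based services safely. However, existing approaches request unnecessary object information and therefore suffer from high query processing time. To solve this problem, this paper proposes a k-nearest neighbor query processing algorithm that protects user and object information by using the directional information of objects and transformed distances relative to a reference POI.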

Assessment of Forest Biomass using k-Neighbor Techniques - A Case Study in the Research Forest at Kangwon National University - (k-NN기법을 이용한 산림바이오매스 자원량 평가 - 강원대학교 학술림을 대상으로 -)

  • Seo, Hwanseok;Park, Donghwan;Yim, Jongsu;Lee, Jungsoo
    • Journal of Korean Society of Forest Science / v.101 no.4 / pp.547-557 / 2012
  • This study aimed to estimate forest biomass using the k-Nearest Neighbor (k-NN) algorithm. Multiple data sources were used for the analysis, including a forest type map, field survey data, and Landsat TM data. The accuracy of the biomass estimates was evaluated with respect to forest stratification, horizontal reference area (HRA), and spatial filtering. Forests were divided into 3 types: conifer, broadleaved, and Korean pine (Pinus koraiensis) forests. The applied HRA radii were 4 km, 5 km, and 10 km. The estimated biomass and mean bias for conifer forests were 222 t/ha and 1.8 t/ha, respectively, with k=8, an HRA radius of 4 km, and a $5\times5$ modal filter. The estimated biomass of Korean pine forests was 245 t/ha with k=8 and an HRA radius of 4 km. The estimated mean biomass and mean bias for broadleaved forests were 251 t/ha and -1.6 t/ha, respectively, with k=6 and an HRA radius of 10 km. The total forest biomass estimated by the k-NN method was 799,000 t (237 t/ha). The mean biomass estimated by the k-NN method was about 1 t/ha higher than that of the field survey data.
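
As a rough illustration of how k and the HRA radius interact in a k-NN estimate like the one described above, the sketch below restricts reference plots to those within the HRA and averages the biomass of the k spectrally nearest ones. The inverse-distance weighting, the function name `knn_estimate`, and the plot record layout are assumptions; the abstract does not state which weighting or distance metric the study used.

```python
import math

def knn_estimate(target_spectral, target_xy, plots, k=8, hra_km=4.0):
    """Estimate biomass (t/ha) for one target pixel.

    plots: list of dicts with hypothetical keys
      'spectral' (band values), 'xy' (position in km), 'biomass' (t/ha).
    """
    # Keep only reference plots inside the horizontal reference area.
    candidates = [p for p in plots if math.dist(p["xy"], target_xy) <= hra_km]
    if not candidates:
        raise ValueError("no reference plots inside the HRA")
    # Pick the k plots closest to the target in spectral feature space.
    nearest = sorted(
        candidates, key=lambda p: math.dist(p["spectral"], target_spectral)
    )[:k]
    # Assumed inverse-distance weights; the small constant avoids division by zero.
    weights = [
        1.0 / (math.dist(p["spectral"], target_spectral) + 1e-9) for p in nearest
    ]
    return sum(w * p["biomass"] for w, p in zip(weights, nearest)) / sum(weights)

plots = [
    {"spectral": (0.0,), "xy": (0.0, 0.0), "biomass": 100.0},
    {"spectral": (2.0,), "xy": (1.0, 0.0), "biomass": 200.0},
    {"spectral": (1.0,), "xy": (50.0, 0.0), "biomass": 999.0},  # outside the HRA
]
estimate = knn_estimate((1.0,), (0.0, 0.0), plots, k=2, hra_km=4.0)
```

A larger HRA admits more candidate plots at the cost of including stands that may differ ecologically from the target.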

Clustering Techniques for XML Data Using Data Mining

  • Kim, Chun-Sik
    • Proceedings of the CALSEC Conference / 2005.03a / pp.189-194 / 2005
  • Many studies have been conducted on classifying documents and extracting useful information from them. However, most search engines use a keyword-based method, which does not search and classify documents effectively. This paper identifies the structure of XML documents, exploiting the fact that XML documents are structured, using the set-theoretic approach suggested by Broder, and tests the clustering of XML documents with a k-nearest neighbor algorithm. In addition, this study investigates the effectiveness of the clustering technique on large-scale data, compared with the existing bitmap method, through a test that measures the difference between documents on a clause basis rather than with vectors, in order to compare similarity measurement against existing methods.
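
One way to picture a set-based comparison of XML structure is the sketch below, which represents each document as the set of its root-to-element tag paths and ranks neighbours by Broder-style resemblance (intersection size over union size). The tag-path representation and the helper names are assumptions for illustration; the abstract does not specify how the paper derives its sets.

```python
import xml.etree.ElementTree as ET

def tag_paths(xml_text):
    """Return the set of root-to-element tag paths of an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths

def resemblance(a, b):
    """Set resemblance |A & B| / |A | B| (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def k_most_similar(query_xml, corpus, k=2):
    """Names of the k corpus documents structurally closest to the query."""
    q = tag_paths(query_xml)
    scored = [(resemblance(q, tag_paths(doc)), name) for name, doc in corpus.items()]
    scored.sort(key=lambda t: (-t[0], t[1]))  # highest resemblance first
    return [name for _, name in scored[:k]]

corpus = {
    "book1": "<book><title/><author/></book>",
    "book2": "<book><title/><isbn/></book>",
    "order": "<order><item/><price/></order>",
}
neighbours = k_most_similar("<book><title/><author/><year/></book>", corpus, k=2)
```

Documents whose nearest neighbours overlap heavily would be natural candidates for the same cluster.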

Gesture Recognition Using Higher Correlation Feature Information and PCA

  • Kim, Jong-Min;Lee, Kee-Jun
    • Journal of Integrative Natural Science / v.5 no.2 / pp.120-126 / 2012
  • This paper describes an algorithm that lowers the dimensionality, maintains gesture recognition performance, and significantly reduces the eigenspace configuration time by combining higher correlation feature information with Principal Component Analysis (PCA). Because the suggested method requires less computation than methods that use geometric information or stereo images, experiments show that it is well suited to building a real-time system. In addition, because the existing point-to-point method, a simple distance calculation, produces many errors, this paper reduces recognition errors and improves the recognition rate by using several successive input images as the unit of recognition with K-Nearest Neighbor, an improved class-to-class method.

An Efficient Distributed Nearest Neighbor Heuristic for the Traveling Salesman Problem (외판원 문제를 위한 효율적인 분산 최근접 휴리스틱 알고리즘)

  • Kim, Jung-Sook;Lee, Hee-Young
    • Proceedings of the Korea Information Processing Society Conference / 2000.10b / pp.1373-1376 / 2000
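  • The Traveling Salesman Problem is, given n cities and the distance costs between them, the problem of finding the minimum-cost tour that starts from an initial city, visits every city exactly once, and returns to the starting city; finding the optimal value is a typical NP-complete problem [2,4,5,8]. Consequently, much research aims to reduce the running time. In this paper, we use the nearest neighbor heuristic to find the optimal solution of the Traveling Salesman Problem. To reduce the running time, a near-optimal solution obtained by a Genetic Algorithm, which performs well on optimization problems, is used as the initial branching function, and the work is distributed across multiple processors in a distributed processing environment based on a Local Area Network to exploit parallelism.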
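
For reference, the basic sequential nearest neighbor heuristic for the Traveling Salesman Problem can be sketched as follows. This is only the textbook greedy tour construction under the assumption of Euclidean city coordinates; it does not reproduce the paper's distributed scheme or its use of a genetic-algorithm solution, and the function names are illustrative.

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Greedy tour: repeatedly move to the closest unvisited city."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        # Ties are broken by the lower city index to keep the result deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, cities[i]), i))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(cities, tour):
    """Length of the closed tour, including the return to the start city."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

cities = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
tour = nearest_neighbor_tour(cities, start=0)
length = tour_length(cities, tour)
```

The heuristic runs in O(n^2) time and gives no optimality guarantee, which is why it is usually paired with a bounding or improvement step.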

Systematic Approach for Detecting Text in Images Using Supervised Learning

  • Nguyen, Minh Hieu;Lee, GueeSang
    • International Journal of Contents / v.9 no.2 / pp.8-13 / 2013
  • Locating text in images automatically is a challenging task. In this approach, we build a three-stage system for text detection. The system uses tensor voting and the Completed Local Binary Pattern (CLBP) to classify text and non-text regions. Tensor voting generates text line information, which is very useful for localizing candidate text regions, while a Nearest Neighbor classifier trained on discriminative features obtained by the CLBP-based operator refines the results. The whole algorithm is implemented in MATLAB and applied to all images of the ICDAR 2011 Robust Reading Competition data set. Experiments show the promising performance of this method.

An Implementation of the Olfactory Recognition Contents for Ubiquitous (유비쿼터스를 위한 후각 인식 컨텐츠 구현)

  • Lee, Hyeon Gu;Rho, Yong Wan
    • Journal of Korea Society of Digital Industry and Information Management / v.4 no.3 / pp.85-90 / 2008
  • Recently, with advances in sensor technology, research on electronic nose systems that imitate the olfactory organ has been actively pursued. However, a typical electronic nose system measures an aroma in a laboratory space shielded from the external environment and analyzes only part of the measured data. In this paper, we propose a system that can measure and recognize an aroma in a natural environment. We propose an entropy algorithm that detects the sensor reaction section during continuous aroma detection, and we implement an aroma recognition system that applies PCA (Principal Component Analysis) and K-NN (K-Nearest Neighbor) to the detected aroma. To evaluate performance, we measured the aroma patterns of 9 aroma oils 50 times each and ran aroma detection and recognition experiments on them. Aroma detection had an error of 0.2 s, and aroma recognition achieved a recognition rate of 84.3%.

Developing Web Site for Setting a Price of Accommodation (숙소의 적정 가격 결정을 위한 Web Site 개발)

  • Cho, Kyu Cheol;Roh, Hyun Jin;Song, Woo Hyeon
    • Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.247-248 / 2020
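  • When hosts set a price for their accommodation, consulting the optimized price offered by existing lodging platforms is inconvenient because it requires many steps, such as specifying the accommodation type and whether amenities are provided. This paper develops a "web site for determining an appropriate accommodation price" that lets hosts find the optimized price for their accommodation more conveniently. Through this site, hosts can more easily learn an appropriate price for their accommodation and consult it when setting prices.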

DGR-Tree : An Efficient Index Structure for POI Search in Ubiquitous Location Based Services (DGR-Tree : u-LBS에서 POI의 검색을 위한 효율적인 인덱스 구조)

  • Lee, Deuk-Woo;Kang, Hong-Koo;Lee, Ki-Young;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.11 no.3 / pp.55-62 / 2009
  • Location Based Services in the ubiquitous computing environment, namely u-LBS, use very large and skewed sets of spatial objects that are closely related to locational information. Fast search for the POI (Point of Interest) related to a user's location is especially essential. This paper examines how to search large and skewed POI data efficiently in the u-LBS environment. We propose the Dynamic-level Grid based R-Tree (DGR-Tree), an index for point data that can reduce the cost of stationary POI search. DGR-Tree uses an R-Tree as the primary index and a Dynamic-level Grid as the secondary index. DGR-Tree is optimized for point data and solves the overlapping problem among leaf nodes. The Dynamic-level Grid of DGR-Tree is created dynamically according to the density of POI. Each cell in the Dynamic-level Grid holds a leaf node pointer for direct access to the corresponding leaf node of the primary index. Index access performance therefore improves greatly, because leaf nodes are reached directly through the Dynamic-level Grid. We also propose a K-Nearest Neighbor (KNN) algorithm for DGR-Tree, which uses the Dynamic-level Grid for fast access to candidate cells. The KNN algorithm for DGR-Tree provides a mechanism that directly accesses the cell enclosing a given query point and its adjacent cells without tree traversal. The KNN algorithm minimizes the cost of sorting candidate lists by minimum distance and provides the NEB (Non Extensible Boundary), which removes the need to consider extending candidate nodes during KNN search.
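
The idea of answering a KNN query by starting at the cell that encloses the query point and expanding to adjacent cells can be sketched with a plain uniform grid. This is a simplified stand-in, not the DGR-Tree: it has no R-Tree, no dynamic cell levels, and no NEB, and the function names and fixed `cell` size are assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict

def build_grid(points, cell):
    """Bucket 2-D points (tuples) into square cells of side `cell`."""
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // cell), int(p[1] // cell))].append(p)
    return grid

def grid_knn(grid, cell, query, k):
    """Return the k points nearest to `query`, searching ring by ring."""
    if not grid or k <= 0:
        return []
    qx, qy = int(query[0] // cell), int(query[1] // cell)
    max_ring = max(max(abs(cx - qx), abs(cy - qy)) for cx, cy in grid)
    best = []  # max-heap of (-distance, point) holding the current k best
    for ring in range(max_ring + 1):
        # Every point in ring r lies at least (r - 1) * cell from the query,
        # so once that bound exceeds the k-th best distance we can stop.
        if len(best) == k and (ring - 1) * cell > -best[0][0]:
            break
        for cx in range(qx - ring, qx + ring + 1):
            for cy in range(qy - ring, qy + ring + 1):
                if max(abs(cx - qx), abs(cy - qy)) != ring:
                    continue  # only visit cells on the ring's border
                for p in grid.get((cx, cy), ()):
                    d = math.dist(p, query)
                    if len(best) < k:
                        heapq.heappush(best, (-d, p))
                    elif d < -best[0][0]:
                        heapq.heapreplace(best, (-d, p))
    return [p for _, p in sorted((-nd, p) for nd, p in best)]

points = [(0.5, 0.5), (1.5, 0.5), (5.5, 5.5), (0.6, 0.4), (9.0, 9.0)]
grid = build_grid(points, cell=1.0)
query = (0.52, 0.5)
nearest_two = grid_knn(grid, 1.0, query, k=2)
nearest_three = grid_knn(grid, 1.0, query, k=3)
```

With skewed data a uniform grid leaves many cells empty or overfull, which is the situation a density-adaptive grid is meant to address.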
