• Title/Summary/Keyword: k-Nearest neighbors

An Efficient Adaptive Bitmap-based Selective Tuning Scheme for Spatial Queries in Broadcast Environments

  • Song, Doo-Hee;Park, Kwang-Jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.10
    • /
    • pp.1862-1878
    • /
    • 2011
  • With the advances in wireless communication technology and the advent of smartphones, research on location-based services (LBSs) is being actively carried out. In particular, several spatial index methods have been proposed to provide efficient LBSs. However, finding an indexing method that balances query performance and index size remains a challenge in wireless environments, which have limited channel bandwidth and device resources (computational power, memory, and battery power). Thus, mechanisms that make existing spatial indexing techniques more efficient and more applicable in resource-limited environments should be studied. Bitmap-based Spatial Indexing (BSI) has been designed to support LBSs, especially in wireless broadcast environments. However, the access latency in BSI is very high because the bitmap is large, and this may increase the search time. In this paper, we introduce a Selective Bitmap-based Spatial Indexing (SBSI) technique. Then, we propose an Adaptive Bitmap-based Spatial Indexing (ABSI) technique to improve the tuning time of the proposed SBSI scheme. The ABSI is applied to the distribution of geographical objects in a grid by using the Hilbert curve (HC). With the information in the ABSI, grid cells that contain no objects (i.e., 0 bits in the spatial bitmap index) are not tuned during a search. This improves the tuning time on the client side. We have carried out a performance evaluation and demonstrated that our SBSI and ABSI techniques outperform the existing bitmap-based DSI (B DSI) technique.
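
The idea of skipping empty grid cells can be illustrated with a minimal Python sketch. It is not the paper's ABSI scheme: it only orders the cells of a small grid along a Hilbert curve, builds a bitmap with one bit per cell, and lists the cells a client would still need to tune to. The names `hilbert_index`, `build_bitmap`, and `cells_to_tune` are hypothetical, and the grid side is assumed to be a power of two.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve of an n x n grid.

    Assumes n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def build_bitmap(n, object_cells):
    """One bit per cell in Hilbert order: 1 if the cell holds any object."""
    bitmap = [0] * (n * n)
    for x, y in object_cells:
        bitmap[hilbert_index(n, x, y)] = 1
    return bitmap


def cells_to_tune(bitmap):
    """Hilbert positions a client must listen to; 0-bit cells are skipped."""
    return [pos for pos, bit in enumerate(bitmap) if bit]


# Hypothetical 4 x 4 grid with objects in three cells.
bitmap = build_bitmap(4, [(0, 0), (1, 1), (3, 0)])
print(cells_to_tune(bitmap))
```

In this toy setting the client tunes to three of sixteen cells; how the bitmap itself is broadcast and adapted is the subject of the paper and is not modeled here.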

A Kinematic Approach to Answering Similarity Queries on Complex Human Motion Data (운동학적 접근 방법을 사용한 복잡한 인간 동작 질의 시스템)

  • Han, Hyuck;Kim, Shin-Gyu;Jung, Hyung-Soo;Yeom, Heon-Y.
    • Journal of Internet Computing and Services
    • /
    • v.10 no.4
    • /
    • pp.1-11
    • /
    • 2009
  • Data retrieval from large motion databases has recently become a concern in both the database and graphics communities, because the high dimensionality of motion data implies high costs. Finding an effective distance measure and an efficient query processing method for such data is therefore a challenging problem. This paper presents a motion query processing system, SMoFinder (Similar Motion Finder), which incorporates a novel kinematic distance measure and an efficient indexing strategy based on adaptive frame segmentation. To this end, we regard human motions as multi-linkage kinematics and propose the weighted Minkowski distance metric. For efficient indexing, we devise a new adaptive segmentation method that chooses representative frames among similar frames and stores only the chosen frames instead of all frames. For efficient search, we propose a new search method that processes k-nearest neighbors queries over only the representative frames. Our experimental results show that the size of motion databases is reduced greatly (to $1/25$ of the original size) while the search capability of SMoFinder is equal to or superior to that of other systems.
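
A minimal Python sketch of a k-nearest neighbors query under a weighted Minkowski distance follows. It assumes the common form (sum_i w_i * |a_i - b_i|^p)^(1/p); the paper's kinematic weights and frame segmentation are not reproduced, and the names `weighted_minkowski` and `knn` as well as the sample frames are hypothetical.

```python
import heapq


def weighted_minkowski(a, b, weights, p=2.0):
    """Weighted Minkowski distance: (sum_i w_i * |a_i - b_i|**p) ** (1/p)."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    total = sum(w * abs(x - y) ** p for x, y, w in zip(a, b, weights))
    return total ** (1.0 / p)


def knn(query, frames, weights, k=3, p=2.0):
    """Indices of the k frames closest to the query, nearest first."""
    scored = (
        (weighted_minkowski(query, frame, weights, p), i)
        for i, frame in enumerate(frames)
    )
    return [i for _, i in heapq.nsmallest(k, scored)]


# Hypothetical representative frames, each a short feature vector.
frames = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]]
print(knn([0.0, 0.0], frames, weights=[1.0, 1.0], k=2))
```

Searching only a reduced set of representative frames, as the abstract describes, would amount to passing that smaller list as `frames`.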

Classifying Indian Medicinal Leaf Species Using LCFN-BRNN Model

  • Kiruba, Raji I;Thyagharajan, K.K;Vignesh, T;Kalaiarasi, G
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.10
    • /
    • pp.3708-3728
    • /
    • 2021
  • Indian herbal plants are used in agriculture and in the food, cosmetics, and pharmaceutical industries. Laboratory-based tests are routinely used to identify and classify similar herb species by analyzing their internal cell structures. In this paper, we apply computer vision techniques to do the same. The original leaf image was preprocessed using the Chan-Vese active contour segmentation algorithm to remove the background from the image, setting the contraction bias (v) to -1 and the smoothing factor (µ) to 0.5, and bringing the initial contour close to the image boundary. The segmented grayscale image was then fed to a leaky capacitance fired neuron (LCFN) model, which differentiates between similar herbs by combining different groups of pixels in the leaf image. The LCFN's decay constant (f), decay constant (g), and threshold (h) parameters were empirically set to 0.7, 0.6, and 18, respectively, to generate the 1D feature vector. The LCFN time sequence identified the internal leaf structure at different iterations. Our proposed framework was tested on newly collected natural images of herbal species and on geometrically variant images that differ in size, orientation, and position. The 1D sequence and shape features of aloe, betel, Indian borage, bittergourd, grape, insulin herb, guava, mango, nilavembu, nithiyakalyani, sweet basil, and pomegranate were fed into the 5-fold Bayesian regularization neural network (BRNN), K-nearest neighbors (KNN), support vector machine (SVM), and ensemble classifiers, obtaining a highest classification accuracy of 91.19%.

Prediction Model of CNC Processing Defects Using Machine Learning (머신러닝을 이용한 CNC 가공 불량 발생 예측 모델)

  • Han, Yong Hee
    • Journal of the Korea Convergence Society
    • /
    • v.13 no.2
    • /
    • pp.249-255
    • /
    • 2022
  • This study proposed an analysis framework for real-time prediction of CNC processing defects using machine learning-based models, which have recently attracted attention as processing defect prediction methods, and applied it to CNC machines. The analysis shows that the XGBoost, CatBoost, and LightGBM models share the same best accuracy, precision, recall, F1 score, and AUC, and that the LightGBM model had the shortest execution time among them. This short run time has practical advantages such as reducing actual system deployment costs, reducing the probability of CNC machine damage through rapid prediction of defects, and increasing overall CNC machine utilization, confirming that the LightGBM model is the most effective machine learning model for CNC machines with only basic sensors installed. In addition, it was confirmed that classification performance was maximized when an ensemble model consisting of LightGBM, ExtraTrees, k-Nearest Neighbors, and logistic regression models was applied in situations where there are no restrictions on execution time and computing power.

Molecular Dynamics Simulation Studies of Benzene, Toluene, and p-Xylene in NpT Ensemble: Thermodynamic, Structural, and Dynamic Properties

  • Kim, Ja-Hun;Lee, Song-Hi
    • Bulletin of the Korean Chemical Society
    • /
    • v.23 no.3
    • /
    • pp.447-453
    • /
    • 2002
  • In this paper we present the thermodynamic, structural, and dynamic properties of model systems for liquid benzene, toluene, and p-xylene in an isobaric-isothermal (NpT) ensemble at 283.15, 303.15, 323.15, and 343.15 K using molecular dynamics (MD) simulation. This work was initiated to compensate for our previous canonical (NVT) ensemble MD simulations [Bull. Kor. Chem. Soc. 2001, 23, 441] of the same systems, in which the calculated pressures were too low. The calculated pressures in the NpT ensemble MD simulations are close to 1 atm, and the volume of each system increases with increasing temperature. The first and second peaks in the center-of-mass g(r) diminish gradually and the minima increase, as usual, for the three liquids as the temperature increases. The three peaks of the site-site gC-C(r) at 283.15 K support the perpendicular structure of nearest neighbors in liquid benzene. The two self-diffusion coefficients of liquid benzene, obtained via the Einstein equation and via the Green-Kubo relation, are in excellent agreement with the experimental measurements. The self-diffusion coefficients of liquid toluene and p-xylene are in accord with the trend that the self-diffusion coefficient decreases with an increasing number of methyl groups. The friction constants, calculated from the force auto-correlation (FAC) function under the assumption that the fast random force correlation ends at the time at which the FAC first becomes negative, show the correct qualitative trends: they decrease with increasing temperature and increase with the number of methyl groups. The friction constants calculated from the FACs are always less than those obtained from the friction-diffusion relation, which reflects that the random FAC decays more slowly than the total FAC, as described by Kubo [Rep. Prog. Phys. 1966, 29, 255].

An Analysis Scheme Design of Customer Spending Pattern using Text Mining (텍스트 마이닝을 이용한 소비자 소비패턴 분석 기법 설계)

  • Jeong, Eun-Hee;Lee, Byung-Kwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.11 no.2
    • /
    • pp.181-188
    • /
    • 2018
  • In this paper, we propose a scheme for analyzing customer spending patterns using text mining. In the proposed consumption pattern analysis scheme, we first analyze the similarity of users' ratings using Pearson correlation, then analyze the similarity of users' reviews using TF-IDF cosine similarity, and finally analyze the consistency between ratings and reviews using SentiWordNet. We then select the nearest neighbors using the rating similarity and the review similarity, and provide a recommendation list suited to the consumption pattern. The precision of the recommendation list is 0.79 for the Pearson correlation, 0.73 for TF-IDF, and 0.82 for the proposed consumption pattern scheme. That is, the proposed consumption pattern analysis scheme can analyze consumption patterns more accurately because it uses both the quantitative ratings and the qualitative reviews of consumers.
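
The two similarity measures named above can be sketched in a few lines of Python. This is an illustration only: the abstract does not say how the rating and review similarities are combined, so the weighted average controlled by `alpha` is an assumption, the review vectors are taken as already-computed sparse TF-IDF dictionaries, and the names `pearson`, `cosine`, and `nearest_users` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length rating lists (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cosine(u, v):
    """Cosine similarity of two sparse term-weight dicts (e.g. TF-IDF vectors)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest_users(target, others, k=2, alpha=0.5):
    """Rank users by alpha * rating similarity + (1 - alpha) * review similarity.

    The linear combination is an assumption made for this sketch.
    """
    scored = []
    for name, (ratings, review_vec) in others.items():
        score = alpha * pearson(target[0], ratings)
        score += (1 - alpha) * cosine(target[1], review_vec)
        scored.append((score, name))
    return [name for _, name in sorted(scored, reverse=True)[:k]]


# Hypothetical users: (ratings on the same items, TF-IDF weights of review terms).
target = ([1, 2, 3], {"good": 1.0})
others = {
    "u1": ([1, 2, 3], {"good": 1.0}),
    "u2": ([3, 2, 1], {"bad": 1.0}),
}
print(nearest_users(target, others, k=1))
```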

Supervised Rank Normalization for Support Vector Machines (SVM을 위한 교사 랭크 정규화)

  • Lee, Soojong;Heo, Gyeongyong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.18 no.11
    • /
    • pp.31-38
    • /
    • 2013
  • Feature normalization has been widely used as a pre-processing step in classification problems to reduce the effect of different scales across feature dimensions and the resulting error. Most existing methods, however, assume some distribution function for the features. Even worse, existing methods do not use the labels of data points and, as a result, do not guarantee that the normalization results are optimal for classification. In this paper, we propose a supervised rank normalization that combines rank normalization with a supervised learning technique. Like rank normalization, the proposed method does not assume any feature distribution, and it uses the class labels of nearest neighbors to reduce classification error. Because SVM, in particular, tries to draw a decision boundary in the middle of the class overlapping zone, reducing the data density in that area helps SVM find a decision boundary that reduces the generalization error. These claims are verified through experimental results.
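
For context, plain (unsupervised) rank normalization of a single feature can be sketched as below. The supervised variant proposed in the paper, which also uses the class labels of nearest neighbors, is not specified in the abstract and is not reproduced here. The function name `rank_normalize` and the tie-handling choice (tied values share their average rank) are assumptions of this sketch.

```python
def rank_normalize(values):
    """Map one feature's values to [0, 1] by rank; ties share their average rank."""
    n = len(values)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return [r / (n - 1) for r in ranks]


# The outlier 1000 no longer dominates the scale after normalization.
print(rank_normalize([10, 1000, 20]))
```

Because only the ordering of values matters, no assumption about the feature's distribution is needed, which is the property the abstract attributes to rank normalization.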

Performance Comparison of Machine Learning in the Various Kind of Prediction (다양한 종류의 예측에서 머신러닝 성능 비교)

  • Park, Gwi-Man;Bae, Young-Chul
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.1
    • /
    • pp.169-178
    • /
    • 2019
  • Nowadays, we can perform various predictions by applying machine learning, which is a field of artificial intelligence; however, finding the best algorithm for a given problem is always difficult. This paper predicts the monthly power trading amount, the monthly monetary value of power trading, the monthly index of production extension, the final consumption of energy, and automotive diesel using supervised machine learning algorithms. We then find the best-fitting algorithm for each case. To do this, we show the probability of predicting the value for each of these quantities and then average the predicted values. Finally, we confirm which algorithm performs best among them.

Variational Bayesian multinomial probit model with Gaussian process classification on mice protein expression level data (가우시안 과정 분류에 대한 변분 베이지안 다항 프로빗 모형: 쥐 단백질 발현 데이터에의 적용)

  • Donghyun Son;Beom Seuk Hwang
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.2
    • /
    • pp.115-127
    • /
    • 2023
  • The multinomial probit model is a popular model for multiclass classification and choice modeling. The Markov chain Monte Carlo (MCMC) method is widely used for estimating the multinomial probit model, but its computational cost is high. It is well known, however, that variational Bayesian approximation is more computationally efficient than MCMC, because it uses subsets of samples. In this study, we describe the multinomial probit model with Gaussian process classification and how to apply variational Bayesian approximation to the model. This study also compares the results of the variational Bayesian multinomial probit model with those of naive Bayes, K-nearest neighbors, and support vector machine on the UCI mice protein expression level data.

Personalized Size Recommender System for Online Apparel Shopping: A Collaborative Filtering Approach

  • Dongwon Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.8
    • /
    • pp.39-48
    • /
    • 2023
  • This study was conducted to address the problem of sizing errors in online purchases caused by discrepancies and non-standardization in clothing sizes. This paper discusses an implementation approach for a machine learning-based recommender system capable of providing personalized sizes to online consumers. We trained multiple validated collaborative filtering algorithms, including Non-Negative Matrix Factorization (NMF), Singular Value Decomposition (SVD), k-Nearest Neighbors (KNN), and Co-Clustering, on purchasing data derived from online commerce and compared their performance. The results confirmed that the NMF algorithm outperformed the other algorithms. Although the purchase data include multiple buyers using the same account, the proposed model demonstrated sufficient accuracy. The findings of this study are expected to contribute to reducing the return rate due to sizing errors and improving the customer experience on e-commerce platforms.