• Title/Summary/Keyword: k-nearest neighbor method

A Nonparametric Procedure for Bioassay by using Conditional Quantile Processes

  • Kim, Ho
    • Communications for Statistical Applications and Methods / v.3 no.3 / pp.179-186 / 1996
  • Bioequivalence models typically arise in bioassays when new preparations are compared against standard ones by means of responses on some biological organisms. Relative potency measures provide natural interpretations of such bioequivalence, and their estimation is the prime interest of such studies. A conditional quantile process based on the k-nearest neighbor method is proposed for this purpose. An alternative procedure based on a Kolmogorov-Smirnov type estimator is also considered. ARIC ultrasound data are analyzed as examples.
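
As a rough illustration of what a k-nearest-neighbor conditional quantile can look like (not the paper's estimator, whose details the abstract does not give), the Python sketch below takes the empirical quantile of the responses whose covariate values are the k closest to a query point. The function name, the one-dimensional covariate, and the inverse-empirical-CDF convention are assumptions made for illustration only.

```python
import math


def knn_conditional_quantile(xs, ys, x0, k, q):
    """Empirical q-quantile of y among the k observations whose x is nearest to x0.

    Hypothetical helper: assumes a scalar covariate and 0 < q <= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if not 1 <= k <= len(xs):
        raise ValueError("k must be between 1 and the sample size")
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")

    # Keep the k pairs whose covariate is closest to the query point.
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    responses = sorted(y for _, y in nearest)

    # Inverse of the local empirical CDF: smallest y with F_k(y) >= q.
    index = math.ceil(q * k) - 1
    return responses[index]
```

For instance, `knn_conditional_quantile(xs, ys, x0=1.5, k=4, q=0.5)` would return a local median of the response around `x0 = 1.5`.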

Forecasting of Motorway Path Travel Time by Using DSRC and TCS Information (DSRC와 TCS 정보를 이용한 고속도로 경로통행시간 예측)

  • Chang, Hyun-ho;Yoon, Byoung-jo
    • KSCE Journal of Civil and Environmental Engineering Research / v.37 no.6 / pp.1033-1041 / 2017
  • Path travel time based on departure time (PTTDP) is key information in advanced traveler information systems (ATIS). Despite this need, forecasting PTTDP remains one of the open challenges in the forecasting area of intelligent transportation systems (ITS). To address this problem, this paper proposes a methodology to dynamically predict PTTDP between motorway interchanges. The method is based on the relationships between traffic demands at motorway tollgates (TGs) and PTTDPs between TGs in the motorway network. Two kinds of data were used as model input: traffic demand data collected by the toll collection system (TCS) and path travel time data collected by dedicated short range communication (DSRC). The proposed model was built on k-nearest neighbor, a data mining technique, with a view to real applications in motorway information systems. In a feasibility test with real-world data, the proposed method performed effectively in terms of prediction reliability and computational running time, at a level suitable for real application in current ATIS.

Performance Improvement of Automatic Basal Cell Carcinoma Detection Using Half Hanning Window (Half Hanning 윈도우 전처리를 통한 기저 세포암 자동 검출 성능 개선)

  • Park, Aa-Ron;Baek, Seong-Joong;Min, So-Hee;You, Hong-Yoen;Kim, Jin-Young;Hong, Sung-Hoon
    • The Journal of the Korea Contents Association / v.6 no.12 / pp.105-112 / 2006
  • In this study, we propose a simple preprocessing method for classification of basal cell carcinoma (BCC), which is one of the most common skin cancers. The preprocessing step consists of data clipping with a half Hanning window and dimension reduction with principal components analysis (PCA). The application of the half Hanning window deemphasizes the peak near $1650\,\mathrm{cm}^{-1}$ and improves classification performance by lowering the false negative ratio. Classification results with various classifiers are presented to show the effectiveness of the proposed method. The classifiers include maximum a posteriori probability (MAP), k-nearest neighbor (KNN), probabilistic neural network (PNN), multilayer perceptron (MLP), support vector machine (SVM), and minimum squared error (MSE) classification. Classification with KNN on 216 spectra preprocessed with the proposed method gave 97.3% sensitivity, a very promising result for automatic BCC detection.
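
The clipping step can be pictured with the short Python sketch below, which cuts a segment out of a spectrum and multiplies it by one half of a Hann (Hanning) window so that one edge of the segment is tapered toward zero. The function names, the index-based clipping, and the choice of which edge is tapered are assumptions for illustration; the abstract does not specify the clipping range or the window orientation.

```python
import math


def half_hann(length, rising=True):
    """Half of a Hann window: `length` samples going smoothly from 0 to 1.

    With rising=False the window is reversed and goes from 1 to 0.
    """
    if length < 2:
        raise ValueError("length must be at least 2")
    window = [0.5 * (1.0 - math.cos(math.pi * n / (length - 1)))
              for n in range(length)]
    return window if rising else window[::-1]


def clip_and_taper(spectrum, start, stop, rising=True):
    """Clip spectrum[start:stop] and weight it with a half Hann window.

    Hypothetical helper: samples near the tapered edge are deemphasized,
    samples near the other edge are kept almost unchanged.
    """
    segment = spectrum[start:stop]
    window = half_hann(len(segment), rising)
    return [value * weight for value, weight in zip(segment, window)]
```

In a pipeline like the one described, the tapered segment would then be passed to a dimension reduction step such as PCA before classification.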

Onion yield estimation using spatial panel regression model (공간 패널 회귀모형을 이용한 양파 생산량 추정)

  • Choi, Sungchun;Baek, Jangsun
    • The Korean Journal of Applied Statistics / v.29 no.5 / pp.873-885 / 2016
  • Onions are grown in a few specific regions of Korea that depend on the climate and the regional characteristics of the production area. Therefore, when onion yields are to be estimated, it is reasonable to use a statistical model in which both the climate and the region are considered simultaneously. In this paper, using a spatial panel regression model, we predicted onion yields from the different weather conditions of the regions. We used the spatial autoregressive (SAR) model, which reflects the spatial lag, and panel data of several climate variables for 13 main onion production areas from 2006 to 2015. The spatial weight matrix for the model was specified in two ways, by the threshold value method and by the nearest neighbor method. Autocorrelation was found to be significant in the best-fitting model, which used the nearest neighbor method. The random effects model was chosen by the Hausman test, and the significant climate variables of the model were the cumulative duration of sunshine (January), the average relative humidity (April), the average minimum temperature (June), and the cumulative precipitation (November).
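
One common way to build a nearest-neighbor spatial weight matrix is sketched below in Python: each region is linked to its k closest regions and each row is scaled to sum to one. The value of k, the use of planar coordinates, and the row standardization are assumptions made for illustration; the abstract does not state which variant the authors used.

```python
import math


def knn_weight_matrix(coords, k):
    """Row-standardized k-nearest-neighbor spatial weight matrix.

    coords: list of (x, y) region centroids (hypothetical input format).
    Returns an n x n list of lists W with W[i][j] = 1/k when region j is
    one of the k nearest neighbors of region i, and 0 otherwise.
    """
    n = len(coords)
    if not 1 <= k < n:
        raise ValueError("k must be between 1 and n - 1")

    weights = [[0.0] * n for _ in range(n)]
    for i, point in enumerate(coords):
        # Rank all other regions by Euclidean distance from region i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(point, coords[j]),
        )
        for j in others[:k]:
            weights[i][j] = 1.0 / k
    return weights
```

Note that a matrix built this way is generally not symmetric: region j can be a nearest neighbor of region i without the reverse being true.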

Short-term Traffic States Prediction Using k-Nearest Neighbor Algorithm: Focused on Urban Expressway in Seoul (k-NN 알고리즘을 활용한 단기 교통상황 예측: 서울시 도시고속도로 사례)

  • KIM, Hyungjoo;PARK, Shin Hyoung;JANG, Kitae
    • Journal of Korean Society of Transportation / v.34 no.2 / pp.158-167 / 2016
  • This study evaluates potential sources of error in the k-NN (k-nearest neighbor) algorithm, such as procedures, variables, and input data. Previous research was thoroughly reviewed to understand the fundamentals of the k-NN algorithm, which has been widely used for short-term traffic state prediction. The framework of this algorithm commonly includes historical data smoothing, a pattern database, a similarity measure, the k-value, and the prediction horizon. The outcomes of this study suggest that: i) historical data smoothing is recommended to reduce random noise in measured traffic data; ii) the historical database should contain traffic state information on both normal and event conditions; and iii) a trial-and-error method can improve prediction accuracy by better searching for the optimum input time series and k-value. The results also demonstrate that prediction error increases with the length of the prediction horizon and with rapidly changing traffic states.
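
The framework elements listed in the abstract (pattern database, similarity measure, k-value, prediction horizon) can be made concrete with a minimal Python sketch. It assumes a single univariate series, Euclidean distance as the similarity measure, and an unweighted mean of the k matched futures; these choices and all names are illustrative, not the configuration evaluated in the study.

```python
import math


def knn_forecast(history, window, k, horizon):
    """Forecast the value `horizon` steps after the end of `history`.

    The last `window` observations form the query pattern. Every earlier
    pattern of the same length that has an observed value `horizon` steps
    after its end is a candidate in the pattern database.
    """
    query = history[-window:]
    last_start = len(history) - window - horizon

    candidates = []
    for start in range(last_start + 1):
        pattern = history[start:start + window]
        future = history[start + window + horizon - 1]
        candidates.append((math.dist(pattern, query), future))

    if len(candidates) < k:
        raise ValueError("not enough historical patterns for this k")

    # Average the observed futures of the k most similar past patterns.
    candidates.sort(key=lambda item: item[0])
    return sum(future for _, future in candidates[:k]) / k
```

Smoothing of the historical data, which the study recommends, would be applied to `history` before calling a function like this; the trial-and-error search corresponds to trying several `window` and `k` values.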

Pattern Recognition System Combining KNN rules and New Feature Weighting algorithm (KNN 규칙과 새로운 특징 가중치 알고리즘을 결합한 패턴 인식 시스템)

  • Lee Hee-Sung;Kim Euntai;Kim Dongyeon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.42 no.4 s.304 / pp.43-50 / 2005
  • This paper proposes a new pattern recognition system combining a new adaptive feature weighting based on the genetic algorithm (GA) and modified KNN (K Nearest-Neighbor) rules. The feature weighting proposed herein avoids overfitting and finds the proper feature weighting value by determining the middle value of weights using GA. New GA operators are introduced to obtain high system performance. Moreover, a class-dependent feature weighting strategy is employed. Whilst the classical methods use the same feature space for all classes, the proposed method uses a different feature space for each class. The KNN rule is modified to estimate the class of a test pattern using the adaptive feature space. Experiments were performed with the unconstrained handwritten numeral database of Concordia University in Canada to show the performance of the proposed method.
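
A class-dependent feature weighting can be pictured as a KNN rule in which the distance to each training sample is computed with the weight vector of that sample's class. The Python sketch below shows one such rule under that assumption; the weights are simply given as input (the GA search is not reproduced), and the function names and the majority vote are illustrative rather than the authors' modified rule.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def class_weighted_knn(train, class_weights, query, k):
    """Majority vote among the k nearest samples under per-class weights.

    train: list of (features, label) pairs.
    class_weights: dict mapping each label to its feature weight vector
    (hypothetical input; in the paper the weights come from a GA).
    """
    scored = sorted(
        ((weighted_distance(features, query, class_weights[label]), label)
         for features, label in train),
        key=lambda item: item[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

Because each class measures distance in its own weighted feature space, changing the weights of one class can change the prediction even when the training samples stay the same.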

Adaptive Scene Classification based on Semantic Concepts and Edge Detection (시멘틱개념과 에지탐지 기반의 적응형 이미지 분류기법)

  • Jamil, Nuraini;Ahmed, Shohel;Kim, Kang-Seok;Kang, Sang-Jil
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.1-13 / 2009
  • Scene classification and concept-based procedures have been of great interest for image categorization applications on large databases. Knowing the category to which a scene belongs, we can filter out images of no interest when searching a database for a specific scene category such as beach, mountain, forest, or field. In this paper, we propose an adaptive segmentation method for real-world natural scene classification based on semantic modeling. Semantic modeling stands for the classification of sub-regions into semantic concepts such as grass, water, and sky. Our adaptive segmentation method uses edge detection to split an image into sub-regions. The frequency of occurrence of these semantic concepts represents the information of the image and is used to classify it into the scene categories. The K-Nearest Neighbor (k-NN) algorithm is also applied as a classifier. The empirical results demonstrate that the proposed adaptive segmentation method outperforms Vogel and Schiele's method in terms of accuracy.

A Hybrid Under-sampling Approach for Better Bankruptcy Prediction (부도예측 개선을 위한 하이브리드 언더샘플링 접근법)

  • Kim, Taehoon;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.173-190 / 2015
  • The purpose of this study is to improve bankruptcy prediction models by using a novel hybrid under-sampling approach. Most prior studies have tried to enhance the accuracy of bankruptcy prediction models by improving the classification methods involved. In contrast, we focus on appropriate data preprocessing as a means of enhancing accuracy. In particular, we aim to develop an effective sampling approach for bankruptcy prediction, since most prediction models suffer from class imbalance problems. The approach proposed in this study is a hybrid under-sampling method that combines the k-Reverse Nearest Neighbor (k-RNN) and one-class support vector machine (OCSVM) approaches. k-RNN can effectively eliminate outliers, while OCSVM contributes to the selection of informative training samples from majority class data. To validate our proposed approach, we have applied it to data from H Bank's non-external auditing companies in Korea, and compared the performances of the classifiers with the proposed under-sampling and random sampling data. The empirical results show that the proposed under-sampling approach generally improves the accuracy of classifiers, such as logistic regression, discriminant analysis, decision tree, and support vector machines. They also show that the proposed under-sampling approach reduces the risk of false negative errors, which lead to higher misclassification costs.
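
The reverse-nearest-neighbor idea behind the outlier elimination step can be sketched as follows: a sample's k-RNN count is the number of other samples that include it among their own k nearest neighbors, and samples with very low counts are candidates for removal. The Python code below is a minimal illustration under that reading; the function name and any removal threshold (for example, dropping samples with a count of zero) are assumptions, and the OCSVM stage is not shown.

```python
import math


def reverse_neighbor_counts(points, k):
    """For each point, count how many other points list it among their
    k nearest neighbors (its k-reverse-nearest-neighbor count)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must be between 1 and n - 1")

    counts = [0] * n
    for i in range(n):
        # k nearest neighbors of point i, excluding i itself.
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        for j in nearest:
            counts[j] += 1
    return counts
```

A point that sits far from the rest of the majority class is rarely anyone's neighbor, so its count stays near zero, which is what makes the count usable as an outlier score.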

Relation Based Bayesian Network for NBNN

  • Sun, Mingyang;Lee, YoonSeok;Yoon, Sung-eui
    • Journal of Computing Science and Engineering / v.9 no.4 / pp.204-213 / 2015
  • Under the conditional independence assumption among local features, the Naive Bayes Nearest Neighbor (NBNN) classifier has been recently proposed and performs classification without any training or quantization phases. While the original NBNN shows high classification accuracy without an explicit training phase, the conditional independence among local features is at odds with the compositionality of objects, which indicates that different but related parts of an object appear together. As a result, the conditional independence assumption weakens the accuracy of classification techniques based on NBNN. In this work, we look into this issue and propose a novel Bayesian network for NBNN-based classification that considers the conditional dependence among features. To achieve our goal, we extract a high-level feature and its corresponding multiple low-level features for each image patch. We then represent them with a simple, two-level layered Bayesian network and design a classification function based on this network. To achieve a low memory requirement and fast query-time performance, we further optimize our representation and classification function, named relation-based Bayesian network, by representing the relationship between a high-level feature and its low-level features as a compact relation vector whose dimensionality is the same as the number of low-level features, e.g., four elements in our tests. We have demonstrated the benefits of our method over the original NBNN and its recent improvement, local NBNN, on two different benchmarks. Our method improves accuracy by up to 27% over the tested methods. This high accuracy is mainly due to consideration of the conditional dependences between a high-level feature and its corresponding low-level features.
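
For context, the baseline NBNN decision rule that this work builds on can be sketched in a few lines of Python: each local descriptor of the query image is matched to its nearest descriptor in every class, and the class with the smallest sum of squared distances wins. This is a sketch of the original NBNN rule only, with brute-force search and hypothetical names; it does not implement the relation-based Bayesian network proposed in the paper.

```python
import math


def nbnn_classify(query_descriptors, class_descriptors):
    """Baseline NBNN: pick the class minimizing the summed squared distance
    from each query descriptor to its nearest descriptor in that class.

    class_descriptors: dict mapping a class label to the pooled list of
    local descriptors from all training images of that class.
    """
    totals = {}
    for label, pool in class_descriptors.items():
        totals[label] = sum(
            min(math.dist(descriptor, candidate) ** 2 for candidate in pool)
            for descriptor in query_descriptors
        )
    return min(totals, key=totals.get)
```

Summing independent per-descriptor terms is exactly where the conditional independence assumption enters, which is the point the abstract criticizes.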

Evaluation of Rain Gauge Distribution Characteristics by Altitude using Optimization Technique (최적화 기법을 통한 강우관측소의 고도별 분포특성 검토)

  • Lee, Ji Ho;Kim, Jong Geun;Joo, Hong Jun;Jun, Hwan Don
    • Journal of Wetlands Research / v.19 no.1 / pp.103-111 / 2017
  • In this study, we estimate the NNI (Nearest Neighbor Index), extended to consider the altitude of the rain gauge network, as a method for evaluating the appropriateness of its spatial distribution, and evaluate the current rain gauge network. The altitude is divided by equal-area ratio, and the optimal NNI under the given basin conditions is estimated using the harmony search method to account for geographical conditions that vary from altitude to altitude. After calculating the current and optimal NNI for each altitude, the distribution of the rain gauge network is evaluated based on the difference between the two NNIs. The results show that the density of the rain gauge network becomes relatively sparse as the altitude increases. Furthermore, it will be possible to construct an efficient rain gauge network if the characteristics of different altitudes are considered when a new rain gauge network is constructed.
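
In its standard planar form, the Nearest Neighbor Index is the observed mean nearest-neighbor distance divided by the mean distance expected for a random pattern of the same density, $0.5/\sqrt{n/A}$. The Python sketch below computes that standard index; it assumes planar coordinates and a known basin area, and it does not reproduce the altitude banding or the harmony search optimization described in the abstract.

```python
import math


def nearest_neighbor_index(points, area):
    """Standard planar Nearest Neighbor Index.

    points: list of (x, y) gauge locations; area: area of the study region
    in the squared units of the coordinates (both hypothetical inputs).
    Values below 1 suggest clustering, values above 1 suggest dispersion.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")

    # Observed mean distance from each point to its nearest neighbor.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n

    # Expected mean nearest-neighbor distance under complete spatial randomness.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected
```

Applied per altitude band, an index like this could be compared against a target value for that band, which is the kind of difference the study evaluates.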