• Title/Summary/Keyword: Nearest L-Neighbor Method

Nearest L-Neighbor Method with De-crossing in Vehicle Routing Problem

  • Kim, Hwan-Seong;Tran-Ngoc, Hoang-Son
    • Journal of Navigation and Port Research
    • /
    • v.33 no.2
    • /
    • pp.143-151
    • /
    • 2009
  • The field of vehicle routing is growing rapidly because of its many practical applications in truckload and less-than-truckload trucking, courier services, door-to-door services, and many other problems that hinder the optimization of transportation costs in a logistics network. The rapidly increasing number of customers in such networks makes it difficult to reach a globally optimal solution in an acceptable time. Fast algorithms are therefore needed to find sufficiently good solutions within a limited time so that they can be used for real-time scheduling. In this paper, the nearest L-neighbor method (NLNM) is proposed to obtain a vehicle routing solution. String neighbors of different lengths were chosen, tested, and compared. A de-crossing procedure is applied to the routes produced by NLNM, giving a better solution and a shorter computation time than NLNM with long string neighbors.
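
The abstract does not spell out the NLNM steps, so the following is only a minimal sketch of the general idea it builds on: a plain nearest-neighbor tour construction followed by a de-crossing pass. The de-crossing here is assumed to be a standard 2-opt segment reversal, which may differ from the authors' procedure, and all function names are hypothetical.

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour (it returns to the starting point)."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))

def nearest_neighbor_tour(points, start=0):
    """Greedy construction: always visit the closest unvisited customer next."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def de_cross(points, tour):
    """Assumed de-crossing step (2-opt): reverse a segment whenever doing so
    shortens the tour. In the Euclidean plane this removes crossing edges."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a node
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Illustrative coordinates only (not data from the paper).
customers = [(0, 0), (2, 3), (5, 1), (1, 4), (4, 4), (3, 0)]
route = de_cross(customers, nearest_neighbor_tour(customers))
print(route, round(tour_length(customers, route), 3))
```

The greedy construction is fast but can leave crossing edges near the end of the tour, which is the situation a de-crossing pass is meant to repair.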

A Pattern Classification Method using Closest Decision Method in k Nearest Neighbor Prototypes (k 근방 원형상에서 최근접 결정법을 이용한 패턴식별법)

  • Kim, Eung-Kyeu;Lee, Soo-Jong
    • Proceedings of the IEEK Conference
    • /
    • 2008.06a
    • /
    • pp.833-834
    • /
    • 2008
  • In this paper, we present a pattern classification method that uses a closest decision method based on the mean of the norm over the closest prototype to an input pattern and its k nearest neighbor prototypes, in order to classify arbitrarily distributed patterns accurately when the number of patterns is very low. The method can classify input patterns precisely even with so few patterns because it weights the decision by the difference in variance among the prototypes around the discrimination boundary.

A Fast Motion Estimation Scheme using Spatial and Temporal Characteristics (시공간 특성을 이용한 고속 움직임 백터 예측 방법)

  • 노대영;장호연;오승준;석민수
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.40 no.4
    • /
    • pp.237-247
    • /
    • 2003
  • Motion estimation (ME) is an important part of a video encoding system because it can significantly reduce the bitrate while keeping the output quality of the encoded sequence. Unfortunately, this process may dominate the encoding time when the straightforward full search (FS) algorithm is used. Many fast algorithms proposed so far reduce the computational complexity by limiting the number of search locations, at the expense of motion estimation accuracy. In this paper, we introduce a new fast motion estimation method based on the spatio-temporal correlation of adjacent blocks. A reliable predicted motion vector (RPMV) is defined, and its reliability is shown on the basis of motion vectors obtained by FS. The magnitude and the direction of the RPMV are used in our proposed scheme. The experimental results show that the proposed method is about 11~14% faster than the nearest neighbor method, which is a well-known conventional fast scheme.
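
To illustrate why limiting the search locations saves time, here is a minimal block-matching sketch. It assumes the sum of absolute differences (SAD) as the matching cost and frames stored as lists of rows; the abstract does not define how the RPMV is computed, so the predicted vector below is simply passed in as a hypothetical input.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def search_window(cur, ref, bx, by, size, center, radius):
    """Evaluate every displacement within `radius` of `center`.
    Returns (best_motion_vector, best_cost, candidates_evaluated)."""
    h, w = len(ref), len(ref[0])
    cx, cy = center
    best_mv, best_cost, evaluated = None, None, 0
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= w - size and 0 <= by + dy <= h - size):
                continue
            evaluated += 1
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, evaluated

def full_search(cur, ref, bx, by, size, search_range):
    """Full search (FS): the window is centered on the zero vector."""
    return search_window(cur, ref, bx, by, size, (0, 0), search_range)

def predicted_search(cur, ref, bx, by, size, predicted_mv, radius=1):
    """Fast variant: search a small window around a predicted vector
    (assumed to come from spatially or temporally adjacent blocks)."""
    return search_window(cur, ref, bx, by, size, predicted_mv, radius)
```

With a search range of R, full search evaluates up to (2R+1)^2 candidates per block, while the predicted search evaluates at most (2r+1)^2 for a small radius r, so the saving depends entirely on how reliable the predictor is.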

Design of an Efficient Parallel High-Dimensional Index Structure (효율적인 병렬 고차원 색인구조 설계)

  • Park, Chun-Seo;Song, Seok-Il;Sin, Jae-Ryong;Yu, Jae-Su
    • Journal of KIISE:Databases
    • /
    • v.29 no.1
    • /
    • pp.58-71
    • /
    • 2002
  • Multi-dimensional data such as image and spatial data generally require a large amount of storage space, and a single workstation is limited in its ability to store and manage such large amounts of data. Managing the data in a parallel computing environment, which is being actively researched these days, can greatly improve performance. In this paper, we propose a parallel high-dimensional index structure that exploits the parallelism of the parallel computing environment. The proposed index structure is an nP(processor)-$n\times m$D(disk) architecture, which is a hybrid of the nP-nD and 1P-nD types. Its node structure increases the fan-out and reduces the height of an index tree. Also, a range search algorithm that maximizes I/O parallelism is devised, and it is applied to K-nearest neighbor queries. Through various experiments, it is shown that the proposed method outperforms other parallel index structures.
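
One common way to answer a K-nearest neighbor query with a range search is to grow the search radius until at least K points fall inside it. The sketch below shows only that idea; a linear scan stands in for the parallel index, and the function names and the radius-doubling policy are assumptions, not the paper's algorithm.

```python
import math

def range_search(points, query, radius):
    """Linear scan standing in for an index range query."""
    return [p for p in points if math.dist(p, query) <= radius]

def knn_by_range_search(points, query, k, radius=1.0):
    """Double the radius until at least k points are inside the ball,
    then keep the k closest of them."""
    if k > len(points):
        raise ValueError("k exceeds the number of indexed points")
    if radius <= 0:
        raise ValueError("radius must be positive")
    hits = range_search(points, query, radius)
    while len(hits) < k:
        radius *= 2
        hits = range_search(points, query, radius)
    return sorted(hits, key=lambda p: math.dist(p, query))[:k]
```

The result is exact because every point outside the final ball is farther from the query than every point inside it, so the k closest hits are the k nearest neighbors overall.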

$L_1$ Shortest Paths with Isothetic Roads (축에 평행한 도로들이 놓여 있을 때의 $L_1$ 최단 경로)

  • Bae Sang Won;Kim Jae-Hoon;Chwa Kyung-Yong
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.11a
    • /
    • pp.976-978
    • /
    • 2005
  • We present a nearly optimal algorithm ($O(\nu \min(\nu, n)\, n \log n)$ time and $O(n)$ space) that constructs a shortest path map with $n$ isothetic roads of speed $\nu$ under the $L_1$ metric. The algorithm uses the continuous Dijkstra method, and its efficiency is based on a new geometric insight: the minimum in-degree of any nearest neighbor graph for points with roads of speed $\nu$ is $\Theta(\nu \min(\nu, n))$, which is first shown in this paper. This algorithm also extends naturally to the multi-source case, so that the Voronoi diagram for $m$ sites can be computed in $O(\nu \min(\nu, n)(n+m)\log(n+m))$ time and $O(n+m)$ space, which is also nearly optimal.

Analyzing Key Variables in Network Attack Classification on NSL-KDD Dataset using SHAP (SHAP 기반 NSL-KDD 네트워크 공격 분류의 주요 변수 분석)

  • Sang-duk Lee;Dae-gyu Kim;Chang Soo Kim
    • Journal of the Society of Disaster Information
    • /
    • v.19 no.4
    • /
    • pp.924-935
    • /
    • 2023
  • Purpose: The central aim of this study is to leverage machine learning techniques for the classification of Intrusion Detection System (IDS) data, with a specific focus on identifying the variables responsible for enhancing overall performance. Method: First, we classified 'R2L (Remote to Local)' and 'U2R (User to Root)' attacks in the NSL-KDD dataset, which are difficult to detect due to class imbalance, using seven machine learning models, including Logistic Regression (LR) and K-Nearest Neighbor (KNN). Next, we applied SHapley Additive exPlanations (SHAP) to the two classification models that showed high performance, Random Forest (RF) and Light Gradient-Boosting Machine (LGBM), to check the importance of the variables that affect classification in each model. Result: The 'service' variable for RF and the 'dst_host_srv_count' variable for LGBM were confirmed to be the most important variables. These pivotal variables serve as key factors capable of enhancing classification performance for each respective model. Conclusion: This paper identifies the optimal models, RF and LGBM, for classifying 'R2L' and 'U2R' attacks, while elucidating the crucial variables associated with each selected model.

Effect of missing values in detecting differentially expressed genes in a cDNA microarray experiment

  • Kim, Byung-Soo;Rha, Sun-Young
    • Bioinformatics and Biosystems
    • /
    • v.1 no.1
    • /
    • pp.67-72
    • /
    • 2006
  • The aim of this paper is to discuss the effect of missing values in detecting differentially expressed genes in a cDNA microarray experiment in the context of a one-sample problem. We conducted a cDNA microarray experiment to detect differentially expressed genes for the metastasis of colorectal cancer based on twenty patients who underwent liver resection due to liver metastasis from colorectal cancer. Total RNAs from metastatic liver tumor and adjacent normal liver tissue from a single patient were labeled with Cy5 and Cy3, respectively, and competitively hybridized to a cDNA microarray with 7775 human genes. We used $M=\log_2(R/G)$ for the signal evaluation, where R and G denoted the fluorescent intensities of the Cy5 and Cy3 dyes, respectively. The statistical problem comprises a one-sample test of E(M)=0 for each gene and involves multiple tests. The twenty cDNA microarray data would comprise a matrix of dimension 7775 by 20 if there were no missing values. However, missing values occur for various reasons. For each gene, the non-missing proportion (NMP) was defined to be the proportion of non-missing values out of twenty. In detecting differentially expressed (DE) genes, we used the genes whose NMP is greater than or equal to 0.4 and then sequentially increased the NMP by 0.1 to investigate its effect on the detection of DE genes. For each fixed NMP, we imputed the missing values with the K-nearest neighbor method (K=10) and applied the nonparametric t-test of Dudoit et al. (2002), SAM by Tusher et al. (2001), and the empirical Bayes procedure by Lönnstedt and Speed (2002) to find out the effect of missing values on the final outcome. These three procedures yielded substantially concordant results in detecting DE genes. Of these three procedures, we used SAM for exploring the acceptable NMP level. The result showed that the optimum NMP found in this data set turned out to be 80%. It is more desirable to find the optimum level of NMP for each data set by applying the method described in this note, when the plot of (NMP, number of overlapping genes) shows a turning point.
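
The preprocessing described above (log ratio, NMP filter, K-nearest neighbor imputation) can be sketched as follows. This is a generic illustration, not the authors' implementation: missing values are assumed to be stored as `None`, gene-to-gene distance is assumed to be the root-mean-square difference over arrays observed in both genes, and all function names are hypothetical.

```python
import math

def log_ratio(r, g):
    """Signal M = log2(R/G) for Cy5 intensity R and Cy3 intensity G."""
    return math.log2(r / g)

def nmp(row):
    """Non-missing proportion: fraction of observed entries for one gene."""
    return sum(v is not None for v in row) / len(row)

def filter_by_nmp(matrix, threshold):
    """Keep only the genes (rows) whose NMP is at least `threshold`."""
    return [row for row in matrix if nmp(row) >= threshold]

def distance(a, b):
    """Root-mean-square difference over columns observed in both rows."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

def knn_impute(matrix, k=10):
    """Fill each missing entry with the mean of that column over the k
    nearest genes that have the column observed."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(matrix)
                      if m != i and other[j] is not None]
            # Drop genes that share no observed column with this gene.
            donors = [d for d in donors if d[0] < math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled
```

Raising the NMP threshold before imputation leaves fewer genes but makes each imputed value depend on more observed data, which is the trade-off the abstract explores.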
