• Title/Summary/Keyword: k nearest neighbor


CS-Tree : Cell-based Signature Index Structure for Similarity Search in High-Dimensional Data (CS-트리 : 고차원 데이터의 유사성 검색을 위한 셀-기반 시그니쳐 색인 구조)

  • Song, Gwang-Taek;Jang, Jae-U
    • The KIPS Transactions:PartD / v.8D no.4 / pp.305-312 / 2001
  • Recently, high-dimensional index structures have been required for similarity search in database applications such as multimedia databases and data warehousing. In this paper, we propose a new cell-based signature tree, called the CS-tree, which supports efficient storage and retrieval of high-dimensional feature vectors. The proposed CS-tree partitions a high-dimensional feature space into a group of cells and represents a feature vector by its corresponding cell signature. By using cell signatures rather than real feature vectors, it is possible to reduce the height of the CS-tree, leading to efficient retrieval performance. In addition, we present a similarity search algorithm that efficiently prunes the search space based on cells. Finally, we compare the performance of the CS-tree with that of the X-tree, which is considered an efficient high-dimensional index structure, in terms of insertion time, retrieval time for a k-nearest neighbor query, and storage overhead. The experimental results show that the CS-tree achieves better retrieval performance than the X-tree.
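The cell-signature idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' CS-tree: it assumes feature values normalized to [0, 1), a uniform grid with a hypothetical `bits_per_dim` parameter, and Euclidean distance. It shows only how a vector maps to a cell signature and how a lower bound on the distance to a cell lets a search skip cells that cannot contain a near neighbor.

```python
import math

def cell_signature(vector, bits_per_dim=2):
    """Map a vector in [0, 1)^d to a tuple of per-dimension cell indices."""
    cells = 1 << bits_per_dim  # number of cells along each dimension
    return tuple(min(int(x * cells), cells - 1) for x in vector)

def min_dist_to_cell(query, signature, bits_per_dim=2):
    """Lower bound on the Euclidean distance from query to any point in the cell."""
    cells = 1 << bits_per_dim
    total = 0.0
    for q, c in zip(query, signature):
        lo, hi = c / cells, (c + 1) / cells  # cell bounds in this dimension
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return math.sqrt(total)

# A cell whose lower bound exceeds the current k-th best distance can be pruned.
print(cell_signature([0.1, 0.6, 0.99]))
print(min_dist_to_cell([0.0, 0.0], (2, 0)))
```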


Noncontact Sleep Efficiency and Stage Estimation for Sleep Apnea Patients Using an Ultra-Wideband Radar (UWB 레이더를 사용한 수면무호흡환자에 대한 비접촉방식 수면효율 및 수면 단계 추정)

  • Park, Sang-Bae;Kim, Jung-Ha
    • Journal of the Korean Society of Industry Convergence / v.23 no.3 / pp.433-444 / 2020
  • This study proposes a method to improve sleep stage and sleep efficiency estimation for sleep apnea patients using a UWB (Ultra-Wideband) radar. Motion and respiration extracted from the radar signal were used. Disturbances of the respiratory signal caused by motion artifacts and by the irregular respiration patterns of sleep apnea patients are compensated for in the preprocessing stage. Preprocessing calculates the standard deviation of the respiration signal over a sliding window of 15 seconds to estimate compensation thresholds and applies them to the breathing signal. The method for estimating the sleep stage is based on the difference in amplitude between two smoothed respiration signals, with smoothing window sizes of 10 seconds and 34 seconds, respectively. The estimated feature was processed by a k-nearest neighbor classifier and a feature filtering model to discriminate between rapid eye movement (REM) and non-rapid eye movement (NREM) sleep periods. The feature filtering model reflects the characteristics of the REM sleep that occur continuously and the characteristics that mainly occur in the latter part of this stage. Sleep efficiency is estimated from the sleep onset time and motion events. The sleep onset time is derived from features estimated from gradient changes of the breathing signal. Motion events were detected based on the estimated energy change in the UWB signal. Sleep efficiency and sleep stage accuracy were assessed against polysomnography. The average sleep efficiency and sleep stage accuracy were estimated to be about 96.3% and 88.8%, respectively, in 18 sleep apnea subjects.
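The two signal-processing steps named in the abstract above, a windowed standard deviation and the difference between two smoothed respiration signals, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the sampling rate `fs`, the centered moving average, and the pointwise difference are all assumptions; only the window lengths of 15, 10, and 34 seconds come from the abstract.

```python
from statistics import pstdev

def moving_average(signal, window):
    """Centered moving average; the window shrinks at the signal edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        segment = signal[max(0, i - half): i + half + 1]
        out.append(sum(segment) / len(segment))
    return out

def sliding_std(signal, window):
    """Population standard deviation over each full window of `window` samples."""
    return [pstdev(signal[i:i + window]) for i in range(len(signal) - window + 1)]

def smoothing_difference(signal, fs=1, short_s=10, long_s=34):
    """Pointwise difference between a short-window and a long-window smoothing.

    fs is an assumed sampling rate in samples per second.
    """
    short = moving_average(signal, short_s * fs)
    long_ = moving_average(signal, long_s * fs)
    return [a - b for a, b in zip(short, long_)]
```

With an assumed 1 Hz signal, `sliding_std(signal, 15)` gives the 15-second standard deviation series, and `smoothing_difference(signal)` gives a candidate feature series that a classifier could consume.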

Combining Multiple Classifiers for Automatic Classification of Email Documents (전자우편 문서의 자동분류를 위한 다중 분류기 결합)

  • Lee, Jae-Haeng;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.192-201 / 2002
  • Automated text classification is considered an important method for managing and processing the huge and continuously growing volume of documents in digital form. Recently, text classification has been addressed with machine learning techniques such as k-nearest neighbor, decision trees, support vector machines, and neural networks. However, few investigations of text classification have studied real problems rather than well-organized text corpora, so their practical usefulness has not been shown. This paper proposes and analyzes text classification methods for a real application, the email document classification task. First, we propose a method of combining multiple neural networks that improves performance through combinations with maximum and neural networks. Second, we present another strategy that combines multiple machine learning classifiers. Voting, Borda count, and neural networks improve the overall classification performance. Experimental results show the usefulness of the proposed methods for a real application domain, yielding precision rates of more than 90%.
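Two of the combination schemes named in the abstract above, voting and Borda count, can be sketched as follows. The class labels and the alphabetical tie-breaking rule are hypothetical choices made for this illustration and are not taken from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def borda_count(rankings):
    """Combine ranked label lists, one per classifier, best label first.

    A label in position p of a ranking of n labels earns n - 1 - p points.
    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    scores = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    return max(sorted(scores), key=scores.get)

rankings = [
    ["work", "spam", "personal"],
    ["spam", "work", "personal"],
    ["work", "personal", "spam"],
]
print(borda_count(rankings))
```

Borda count uses the full ranking from each classifier, so a label that is consistently ranked second can beat one that is ranked first by a minority and last by everyone else.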

A Study of Outer-ring Galaxies within z<0.05 (적색편이 z<0.05의 외부고리 은하에 대한 연구)

  • Chang, Hunhwi;Sohn, Jungjoo;Ahn, Hongbae
    • Journal of the Korean earth science society / v.41 no.3 / pp.211-221 / 2020
  • This study classified outer-ring galaxies using 25,308 galaxies within z=0.05 from the SDSS DR7 that have Rpet > 6 arcsec and a minor-to-major axis ratio (b/a) < 0.6. We selected 531 galaxies with ring-like structures by visual inspection of the color images of the 25,308 galaxies; these served as a primary sample from which we selected 90 outer-ring galaxy candidates. The final sample of 69 outer-ring galaxies was selected by examining the photometric properties of the candidate galaxies. Their properties were determined by conducting surface photometry on their u, g, r, i, and z images. The frequency of outer-ring galaxies was found to be 0.3% of the local galaxies. We examined the environment of the outer-ring galaxies using two measures, namely, the projected distance to the nearest-neighbor galaxy and the local background density. We did not observe any notable difference between the environments of outer-ring galaxies and those of other galaxies.

Development of Traffic Prediction and Optimal Traffic Control System for Highway based on Cell Transmission Model in Cloud Environment (Cell Transmission Model 시뮬레이션을 기반으로 한 클라우드 환경 아래에서의 고속도로 교통 예측 및 최적 제어 시스템 개발)

  • Tak, Se-hyun;Yeo, Hwasoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.4 / pp.68-80 / 2016
  • This study proposes a traffic prediction and optimal traffic control system based on the cell transmission model and a genetic algorithm in a cloud environment. The proposed prediction and control system consists of four parts. 1) The data preprocessing module detects and imputes corrupted and missing data points. 2) The data-driven traffic prediction module predicts the future traffic state using the Multi-level K-Nearest Neighbor (MK-NN) algorithm with historical data stored in an SQL database. 3) The online traffic simulation module simulates the future traffic state in various situations, including accidents, road work, and extreme weather conditions, using the traffic data predicted by MK-NN. 4) The optimal road control module produces a control strategy for a large road network with the cell transmission model and a genetic algorithm. The results show that the proposed system can effectively reduce Vehicle Hours Traveled by up to 60%.
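The data-driven prediction step in part 2 above can be illustrated with a plain, single-level k-NN forecast. This is not the MK-NN algorithm of the paper, whose multi-level structure the abstract does not describe; the function name, the Euclidean pattern distance, and the simple averaging of successors are assumptions made for this sketch.

```python
import math

def knn_forecast(history, recent, k=3):
    """Predict the next value from the k historical windows most similar to `recent`.

    history: past observations (for example, flow or speed per time step).
    recent:  the latest observations, used as the query pattern.
    """
    m = len(recent)
    candidates = []
    for start in range(len(history) - m):
        window = history[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, recent)))
        # Store the value that followed this historical window.
        candidates.append((dist, history[start + m]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)

print(knn_forecast([1, 2, 3, 1, 2, 3, 1, 2, 3], [1, 2], k=2))
```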

Vantage Point Metric Index Improvement for Multimedia Databases

  • Chanpisey, Uch;Lee, Sang-Kon Samuel;Lee, In-Hong
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.112-114 / 2011
  • In multimedia databases, indexing methods for multidimensional data spaces are used to provide fast access. However, because these methods presuppose the Euclidean distance as the distance measure, they lack flexibility. Metric indexing methods, on the other hand, require only that the distance axioms be satisfied. Since they can also be applied to distance measures other than the Euclidean distance, they are highly flexible. This paper proposes an improvement to the VP-tree, which is one of the metric indexing methods. During a search, the VP-tree follows the nodes that match the search range, starting from the root node. The distances between the query and all objects linked from the leaf node that is finally reached are then computed to check whether each object lies within the search range. However, the search becomes slow if the number of distance calculations in a leaf node increases. We therefore focused on a candidate selection method that uses the triangle inequality in a leaf node. As the improvement, we propose using the nearest-neighbor object of the query as the reference point for the triangle inequality. This makes it possible to shrink the search range and to reduce the number of distance calculations. Evaluation experiments using 10,000 images showed that the proposed method reduces search time by 5% to 12% compared with the traditional method.
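The triangle-inequality filtering in a leaf node described above can be sketched as follows. This is a generic pivot filter, not the paper's improved method: the function names, the Euclidean default, and the assumption that distances from a reference object (`pivot`) to every leaf object are stored in advance are all illustrative. An object o can be skipped without computing d(q, o) whenever |d(q, p) - d(p, o)| exceeds the search radius, because the triangle inequality guarantees d(q, o) is at least that large.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def leaf_range_search(query, radius, pivot, objects, pivot_dists, dist=euclidean):
    """Range search inside one leaf, pruning with the triangle inequality.

    pivot_dists[i] is the precomputed distance from pivot to objects[i].
    Returns the matching objects and the number of distance computations made.
    """
    d_qp = dist(query, pivot)
    hits, computed = [], 1  # one computation for the query-to-pivot distance
    for obj, d_po in zip(objects, pivot_dists):
        if abs(d_qp - d_po) > radius:
            continue  # lower bound already exceeds the radius
        computed += 1
        if dist(query, obj) <= radius:
            hits.append(obj)
    return hits, computed

objects = [(1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
print(leaf_range_search((0.5, 0.0), 1.0, (0.0, 0.0), objects, [1.0, 5.0, 10.0]))
```

The closer the reference object is to the query, the tighter the lower bound becomes, which is consistent with the abstract's choice of the query's nearest neighbor as the reference point.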

A Design and Implementation Red Tide Prediction Monitoring System using Case Based Reasoning (사례 기반 추론을 이용한 적조 예측 모니터링 시스템 구현 및 설계)

  • Song, Byoung-Ho;Jung, Min-A;Lee, Sung-Ro
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.12B / pp.1219-1226 / 2010
  • Systems for discriminating and predicting red tide remain underdeveloped, and red tide research has focused on investigating chemical and biological causes, so a system that incorporates an intelligent decision-making algorithm is needed. In this paper, we designed an inference system using case-based reasoning and implemented a knowledge base of red tide cases. We used the k-nearest neighbor algorithm to recommend the most similar case and entered 375 cases into the red tide case base. We conducted 10-fold cross-validation to minimize the influence of the training data and to establish confidence in the results, obtaining an average accuracy of about 84.2% for red tide cases, with the best performance at k = 5 neighbors. We also implemented a red tide monitoring system that uses the inference results.
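The case retrieval and evaluation steps described above can be sketched as a k-NN vote over stored cases plus a 10-fold cross-validation loop. The case representation (a feature tuple with a label), the Euclidean similarity, the interleaved fold assignment, and the labels in the usage example are all hypothetical; only k = 5 and the 10 folds come from the abstract.

```python
import math
from collections import Counter

def knn_classify(cases, query, k=5):
    """cases: list of (feature_vector, label). Majority label among the k nearest."""
    def distance(case):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(case[0], query)))
    ranked = sorted(cases, key=distance)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cross_validate(cases, k=5, folds=10):
    """Accuracy over `folds` interleaved folds; each case is tested exactly once."""
    correct = 0
    for f in range(folds):
        test = cases[f::folds]
        train = [c for i, c in enumerate(cases) if i % folds != f]
        correct += sum(knn_classify(train, x, k) == y for x, y in test)
    return correct / len(cases)

# Hypothetical, well-separated toy cases; real cases would hold measured features.
cases = [((i * 0.01, 0.0), "bloom") for i in range(20)]
cases += [((10 + i * 0.01, 0.0), "clear") for i in range(20)]
print(cross_validate(cases))
```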

Designing Hypothesis of 2-Substituted-N-[4-(1-methyl-4,5-diphenyl-1H-imidazole-2-yl)phenyl] Acetamide Analogs as Anticancer Agents: QSAR Approach

  • Bedadurge, Ajay B.;Shaikh, Anwar R.
    • Journal of the Korean Chemical Society / v.57 no.6 / pp.744-754 / 2013
  • Quantitative structure-activity relationship (QSAR) analysis of recently synthesized imidazole-(benz)azole and imidazole-piperazine derivatives was carried out for their anticancer activities against breast (MCF-7) cell lines. The statistically significant 2D-QSAR models ($r^2=0.8901$; $q^2=0.8130$; F test = 36.4635; $r^2$ se = 0.1696; $q^2$ se = 0.12212; pred_$r^2=0.4229$; pred_$r^2$ se = 0.4606 and $r^2=0.8763$; $q^2=0.7617$; F test = 31.8737; $r^2$ se = 0.1951; $q^2$ se = 0.2708; pred_$r^2=0.4386$; pred_$r^2$ se = 0.3950) were developed using the molecular design suite VLifeMDS 4.2. The study was performed with 18 compounds (data set), with random selection and manual selection methods used to divide the data set into training and test sets. Multiple linear regression (MLR) with the stepwise (SW) forward-backward variable selection method was used to build the QSAR models. The results of the 2D-QSAR models were further compared with 3D-QSAR models generated by kNN-MFA (k-Nearest Neighbor Molecular Field Analysis), investigating the substitutional requirements for favorable anticancer activity. The results may be useful for designing novel imidazole-(benz)azole and imidazole-piperazine derivatives against breast (MCF-7) cell lines prior to synthesis.
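The $r^2$ and $q^2$ statistics quoted above can be illustrated with a minimal sketch. It assumes a single descriptor and ordinary least squares rather than the stepwise MLR of the paper, and it takes $q^2$ to be the leave-one-out cross-validated value $1 - \mathrm{PRESS}/SS_{tot}$, a common convention that the abstract does not spell out. The descriptor and activity values in the usage lines are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """Coefficient of determination of the fit on all data."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Leave-one-out q^2: refit without each compound, then predict it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1 - press / sum((y - mean_y) ** 2 for y in ys)

# Made-up descriptor and activity values for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1]
print(r_squared(xs, ys), q_squared_loo(xs, ys))
```

Because each left-out compound is predicted by a model that never saw it, $q^2$ cannot exceed $r^2$ for a least-squares fit, which matches the ordering of the values reported in the abstract.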

Improved CycleGAN for underwater ship engine audio translation (수중 선박엔진 음향 변환을 위한 향상된 CycleGAN 알고리즘)

  • Ashraf, Hina;Jeong, Yoon-Sang;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.39 no.4 / pp.292-302 / 2020
  • Machine learning algorithms have made immense contributions in various fields, including sonar and radar applications. The recently developed Cycle-Consistency Generative Adversarial Network (CycleGAN), a variant of the GAN, has been successfully used for unpaired image-to-image translation. We present a modified CycleGAN for translating underwater ship engine sounds with high perceptual quality. The proposed network is composed of an improved generator model trained to translate underwater audio from one vessel type to another, an improved discriminator that identifies data as real or fake, and a modified cycle-consistency loss function. Quantitative and qualitative analyses of the proposed CycleGAN are performed on the publicly available underwater dataset ShipsEar by evaluating Mel-cepstral distortion, pitch contour matching, nearest neighbor comparison, and mean opinion score and comparing them with those of existing algorithms. The analysis results demonstrate the effectiveness of the proposed network.
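Of the metrics listed above, Mel-cepstral distortion has a compact closed form that can be sketched directly. The version below uses the common definition, $(10/\ln 10)\sqrt{2\sum_d (c_d - \hat{c}_d)^2}$ averaged over frames with the 0th (energy) coefficient excluded; whether the paper uses exactly this variant is an assumption, as is the requirement that the two frame sequences are already time-aligned and of equal length.

```python
import math

def mel_cepstral_distortion(ref_frames, gen_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Each frame is a sequence of coefficients; index 0 (energy) is skipped.
    """
    const = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, gen in zip(ref_frames, gen_frames):
        squared = sum((r - g) ** 2 for r, g in zip(ref[1:], gen[1:]))
        total += const * math.sqrt(squared)
    return total / len(ref_frames)

print(mel_cepstral_distortion([[0.0, 1.0, 0.0]], [[0.0, 0.0, 0.0]]))
```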

An Analysis of Accuracy for Submarine Topographic Information by Interpolation Method (보간기법에 따른 해저지형의 정확도 분석)

  • Kim, Ga-Ya;Moon, Doo-Youl;Seo, Dong-Ju
    • Journal of Ocean Engineering and Technology / v.20 no.3 s.70 / pp.67-76 / 2006
  • Three-dimensional information on submarine topography has been acquired by combining DGPS and an echo sounder, an approach mainly used in marine surveys. However, the features of the submarine topography derived from the instrument data have been confirmed by eye. Because the dredging capacity computed from submarine survey data influences harbor public affairs, the analysis and processing of survey data is a key factor in construction costs. In this study, information on submarine topography is acquired by combining DGPS and an echo sounder. The dredging capacity for harbor public affairs is then analyzed with five interpolation methods: inverse distance to a power, kriging, minimum curvature, nearest neighbor, and radial basis function. The usefulness of the DGPS and echo sounder method for calculating the dredging capacity was confirmed by comparing the capacity computed with each interpolation method against the actual capacity. In this comparison, radial basis function interpolation and kriging showed error rates of 3.94% and 4.61%, respectively. If a study on the appropriate interpolation method for the characteristics of the submarine topography is carried out before calculating the dredging capacity for harbor public affairs, faster and more accurate calculation of the dredging capacity is expected.
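Two of the interpolation methods compared above, nearest neighbor and inverse distance to a power, are simple enough to sketch. The sounding format `(x, y, depth)`, the default power of 2, and the sample values are assumptions for this illustration; kriging, minimum curvature, and radial basis functions need more machinery and are not shown.

```python
import math

def nearest_neighbor(points, x, y):
    """Depth of the sounding closest to (x, y). points: list of (x, y, depth)."""
    return min(points, key=lambda p: math.hypot(p[0] - x, p[1] - y))[2]

def inverse_distance(points, x, y, power=2):
    """Weighted mean of depths with weights 1 / distance**power."""
    numerator = denominator = 0.0
    for px, py, depth in points:
        d = math.hypot(px - x, py - y)
        if d == 0:
            return depth  # query coincides with a sounding
        weight = 1.0 / d ** power
        numerator += weight * depth
        denominator += weight
    return numerator / denominator

# Made-up soundings for illustration.
soundings = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(nearest_neighbor(soundings, 0.4, 0.0), inverse_distance(soundings, 1.0, 0.0))
```

Evaluating either function on a regular grid yields a depth surface, and differences between such surfaces are what drive the differences in computed dredging capacity reported in the abstract.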