• Title/Abstract/Keywords: K-nearest neighbor algorithm

265 search results; processing time 0.026 seconds

Flood Frequency Analysis with the consideration of the heterogeneous impacts from TC and non-TC rainfalls: application to daily flows in the Nam River Basin, South Korea

  • Alcantara, Angelika;Ahn, Kuk-Hyun
    • 한국수자원학회:학술대회논문집
    • /
    • 한국수자원학회 2020년도 학술발표회
    • /
    • pp.121-121
    • /
    • 2020
  • Varying dominant processes, including Tropical Cyclone (TC) and non-TC rainfall events, are known to drive precipitation in South Korea. As anthropogenic activities alter the Earth's climate, nonstationarity, that is, changes in the magnitude and frequency of these dominant processes, has been observed separately for each process over the past decades and is expected to continue in the coming years. These changes often cause unprecedented hydrologic events, such as extreme flooding, that pose a greater risk to society. This study aims to account for a more reliable future climate condition with two dominant processes. Diverse statistical models, including a hidden Markov chain, the K-nearest neighbor algorithm, and quantile mapping, are used to simulate future rainfall events based on recorded historical data while considering the varying effects of TC and non-TC events. The generated data are then fed into a hydrologic model to conduct a flood frequency analysis. The results emphasize the need to consider the nonstationarity of design rainfalls to fully grasp the degree of future flooding events when designing urban water infrastructure.

A personalized exercise recommendation system using dimension reduction algorithms

  • Lee, Ha-Young;Jeong, Ok-Ran
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 26, No. 6
    • /
    • pp.19-28
    • /
    • 2021
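  • With interest in health management growing because of COVID-19, and with gyms and other shared facilities becoming harder to use, more people are turning to home training. This study proposes a personalized exercise recommendation algorithm that uses personal propensity information to give home-training users more accurate and meaningful exercise recommendations. To this end, personal propensity information that characterizes an individual, such as eating habits and physical condition, is used with the k-nearest neighbor algorithm to classify the data according to obesity criteria. In addition, the exercise data set is graded by exercise level, and, based on the neighbor information of each data set, personalized exercise recommendations are provided through singular value decomposition (SVD), a dimension reduction model among model-based collaborative filtering methods. The approach can therefore address the data sparsity and scalability problems of memory-based collaborative filtering, and experiments verify the accuracy and performance of the proposed algorithm.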

Evaluation of Machine Learning Algorithm Utilization for Lung Cancer Classification Based on Gene Expression Levels

  • Podolsky, Maxim D;Barchuk, Anton A;Kuznetcov, Vladimir I;Gusarova, Natalia F;Gaidukov, Vadim S;Tarakanov, Segrey A
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 17, No. 2
    • /
    • pp.835-838
    • /
    • 2016
  • Background: Lung cancer remains one of the most common cancers in the world, both in terms of new cases (about 13% of the total per year) and deaths (nearly one cancer death in five), because of its high case fatality. Errors in determining lung cancer type or malignant growth degrade treatment efficacy, because anticancer strategy depends on tumor morphology. Materials and Methods: We evaluated the effectiveness of machine learning algorithms for lung cancer classification based on gene expression levels. We processed four publicly available data sets. The Dana-Farber Cancer Institute data set contains 203 samples, and the task was to classify four cancer types and sound tissue samples. With the University of Michigan data set of 96 samples, the task was a binary classification of adenocarcinoma and non-neoplastic tissues. The University of Toronto data set contains 39 samples, and the task was to detect recurrence, while with the Brigham and Women's Hospital data set of 181 samples the task was a binary classification of malignant pleural mesothelioma and adenocarcinoma. We used the k-nearest neighbor algorithm (k=1, k=5, k=10), a naive Bayes classifier under two assumptions (normally distributed attributes and histogram-estimated distributions), a support vector machine, and a C4.5 decision tree. The effectiveness of the algorithms was evaluated with the Matthews correlation coefficient. Results: The support vector machine showed the best results on the Dana-Farber Cancer Institute and Brigham and Women's Hospital data sets. All algorithms except the C4.5 decision tree showed maximum potential effectiveness on the University of Michigan data set. However, the C4.5 decision tree showed the best results on the University of Toronto data set. Conclusions: Machine learning algorithms can be used for lung cancer morphology classification and similar tasks based on gene expression level evaluation.
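
The abstract above scores classifiers with the Matthews correlation coefficient (MCC). As a minimal sketch of that metric for the binary case only, assuming labels encoded as 0/1 (the function name and the zero-denominator convention are illustrative choices, not taken from the paper):

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Binary MCC from two equal-length sequences of 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it less sensitive to class imbalance than plain accuracy on small data sets such as those listed.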

EAR: Enhanced Augmented Reality System for Sports Entertainment Applications

  • Mahmood, Zahid;Ali, Tauseef;Muhammad, Nazeer;Bibi, Nargis;Shahzad, Imran;Azmat, Shoaib
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 12
    • /
    • pp.6069-6091
    • /
    • 2017
  • Augmented Reality (AR) overlays virtual information on real-world data, such as displaying useful information on videos or images of a scene. This paper presents an Enhanced AR (EAR) system that displays useful player statistics on captured images of a sports game. We focus on the situation where the input image is degraded by strong sunlight. The proposed EAR system includes an image enhancement technique that improves the accuracy of subsequent player and face detection. Image enhancement is followed by player and face detection, face recognition, and display of the players' statistics. First, an algorithm based on multi-scale retinex is proposed for image enhancement. Then, to detect players and faces, we use adaptive boosting and Haar features for feature extraction and classification. The player face recognition algorithm uses boosted linear discriminant analysis to select features and a nearest neighbor classifier for classification. The system can be adjusted to work with different types of sports where the input is an image and the desired output is a display of information near the recognized players. Simulations are carried out on 2096 different images that contain players in diverse conditions. The proposed EAR system demonstrates the great potential of computer-vision-based approaches for developing AR applications.

An effective automated ontology construction based on the agriculture domain

  • Deepa, Rajendran;Vigneshwari, Srinivasan
    • ETRI Journal
    • /
    • Vol. 44, No. 4
    • /
    • pp.573-587
    • /
    • 2022
  • The agricultural sector differs from other sectors because it relies entirely on natural and climatic factors. Climate change affects land and agriculture in many ways, including reduced annual rainfall, pests, heat waves, sea-level change, and fluctuations in global ozone and atmospheric CO2. Climate change also affects the environment. Based on these factors, farmers choose their crops to increase productivity in their fields. Many existing agricultural ontologies are either domain-specific or have been created with minimal vocabulary, and no proper evaluation framework has been implemented. A new agricultural ontology focused on subdomains is designed to assist farmers using the Jaccard relative extractor (JRE) and the Naïve Bayes algorithm. The JRE is used to find the similarity between two sentences and words in the agricultural documents, and the relationship between two terms is identified via the Naïve Bayes algorithm. In the proposed method, the data are preprocessed with natural language processing techniques, and the dimension-reduced tags are subjected to rule-based formal concept analysis and mapping. The subdomain ontologies of weather, pest, and soil are built separately, and the overall agricultural ontology is built around them. The gold standard for the lexical layer is used to evaluate the proposed technique, and its performance is analyzed by comparing it with different state-of-the-art systems. Precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic curve area, and precision-recall curve area are the metrics used to analyze performance. The proposed methodology achieves a precision of 94.40% for agricultural ontology construction, compared with 83.94% for the decision tree and 86.89% for the K-nearest neighbor algorithm.
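
The abstract above measures sentence similarity with a Jaccard-based extractor. The JRE itself is not specified here, so the following is only a sketch of the plain Jaccard index on word sets, assuming whitespace tokenization and case folding (both are illustrative choices):

```python
def jaccard_similarity(sentence_a, sentence_b):
    """Jaccard index of the word sets of two sentences (0.0 to 1.0)."""
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty sentences: treat as no similarity (assumed convention).
        return 0.0
    return len(tokens_a & tokens_b) / len(union)
```

For example, "rice needs water" and "wheat needs water" share two of four distinct words, giving a similarity of 0.5.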

A Research on Enhancement of Text Categorization Performance by using Okapi BM25 Word Weight Method

  • 이용훈;이상범
    • 한국산학기술학회논문지
    • /
    • Vol. 11, No. 12
    • /
    • pp.5089-5096
    • /
    • 2010
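  • Text categorization, one of the important functions of an information retrieval system, groups documents according to some criterion. The usual approach extracts important words from the target documents, assigns weights to them, and then classifies the documents with a classification algorithm. Performance and accuracy are therefore determined by the classification algorithm, so the efficiency of the algorithm matters. This paper presents an improvement in text classification performance obtained by improving the word weighting method. The Okapi BM25 word weighting method is used in general information retrieval, where it yields good search results. We applied it to test whether it also performs well in text categorization. It was compared with TF-IDF, the most common weighting method; TF-ICF, a weighting method optimized for text classification; and TF-ISF, which is widely used in document summarization, and results were measured for all four weighting methods. The experiments used the Reuters-21578 documents, with Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) as the classifier algorithms. Among the weighting methods tested, Okapi BM25 showed the best performance.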
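
The abstract above applies Okapi BM25 as a term weight. A minimal sketch of one common BM25 formulation follows; the smoothed IDF variant and the defaults k1 = 1.2 and b = 0.75 are widely used values assumed here, not settings reported by the paper:

```python
import math

def bm25_weight(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """BM25 weight of one term in one document.

    tf: term frequency in the document; df: number of documents
    containing the term; n_docs: collection size.
    """
    # Smoothed IDF: rarer terms get larger weights, never negative.
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # Length normalization: long documents need more occurrences.
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / norm
```

Unlike raw TF-IDF, the term-frequency part saturates: each extra occurrence adds less weight, and the weight never exceeds idf * (k1 + 1).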

A Design and Implementation Red Tide Prediction Monitoring System using Case Based Reasoning

  • 송병호;정민아;이성로
    • 한국통신학회논문지
    • /
    • Vol. 35, No. 12B
    • /
    • pp.1219-1226
    • /
    • 2010
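  • Systems for identifying and predicting red tide are still very underdeveloped, and current research on the causes of red tide focuses on identifying chemical and biological causes, so a system with an intelligent decision-making algorithm is needed. In this paper, we designed a system that uses case-based reasoning to build a knowledge base of red tide cases and reason over it. The KNN algorithm was used to recommend the most similar cases, and 375 data records were used as input to build the red tide case base for the experiments. To minimize the influence of the training data and ensure reliability, 10-fold cross-validation was performed; the average accuracy on red tide cases was about 84.2%, and the best results were obtained when the number of neighbors k used for similarity classification was 5. A red tide monitoring system was also implemented using the inferred results.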
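
The abstract above combines KNN case retrieval with 10-fold cross-validation. The following sketch shows that evaluation loop under stated assumptions: numeric feature tuples, Euclidean distance, majority voting, and an interleaved fold split are all illustrative choices, since the paper's case representation and similarity measure are not given here.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train: list of (features, label). Majority label of the k nearest cases."""
    nearest = sorted(train, key=lambda case: math.dist(case[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(data, k, folds=10):
    """Mean accuracy over `folds` folds; fold i holds out cases i, i+folds, ..."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [case for j, case in enumerate(data) if j % folds != i]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds
```

Calling `cross_val_accuracy(cases, k)` for several values of k and keeping the best is one simple way to arrive at a choice such as k = 5; the data set needs at least `folds` cases so that no fold is empty.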

Real-Time Feature Point Matching Using Local Descriptor Derived by Zernike Moments

  • 황선규;김회율
    • 대한전자공학회논문지SP
    • /
    • Vol. 46, No. 4
    • /
    • pp.116-123
    • /
    • 2009
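  • Feature point matching, which matches identical points in two images taken from different viewpoints, is widely used in many image processing fields, and demand for real-time feature point matching has recently been growing. This paper proposes a method for matching feature points in real time using a local descriptor based on Zernike moments. Feature points are extracted from the input image with a fast corner detection method, and a Zernike-moment-based local descriptor is generated at each feature point. This descriptor efficiently represents the image patch around a feature point as a low-dimensional feature vector and is robust to image rotation and brightness changes. To compute Zernike moments in real time, Zernike basis functions of a fixed size are precomputed and stored in a lookup table. In the matching stage, an approximate nearest neighbor (ANN) method is used to obtain initial matches, and incorrect matches are then removed with the RANSAC algorithm to obtain the final matching result. Experiments confirmed that the proposed method performs feature point matching in real time on images with various transformations.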

The Relationship between Smartphone Use and Oral Health in Adolescents

  • Ahn, Eunsuk;Han, Ji-Hyoung
    • 치위생과학회지
    • /
    • Vol. 20, No. 1
    • /
    • pp.44-50
    • /
    • 2020
  • Background: Smartphones are a modern necessity. While they are convenient to use, smartphones also have side effects such as addiction. This study assessed the relationship between smartphone use, a part of everyday life in modern society, and oral health. Methods: An analysis was conducted using 2017 Korea Youth Risk Behavior Web-based Survey data. Propensity scores were estimated with logistic regression, and 1:1 matching was performed with nearest-neighbor matching. After matching, 15,032 participants were divided into two groups of 7,516 teenagers each: those who used smartphones and those who did not. Results: Comparison of oral health behaviors according to smartphone use revealed statistically significant differences in the frequency of tooth brushing per day, use of oral hygiene products, intake of foods harmful to oral health, and experience of oral health education (p<0.05). The factors affecting the oral pain experience of adolescents were examined. Compared to male participants, female participants had an odds ratio of 1.627 for oral pain (p<0.05). By household income level, the lower-income group reported more oral pain than the higher-income group (p<0.05). Oral pain experience was 1.601 times more frequent among teenagers using smartphones (p<0.05). Conclusion: The results of this study indicate that smartphone use by adolescents affects their oral health. These findings indicate the need for improved oral health management through effective school oral health programs and individual counseling by oral health professionals, promotion of information dissemination through public media, and development of prevention strategies.
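
The abstract above pairs participants by 1:1 nearest-neighbor matching on propensity scores. The sketch below assumes the scores have already been estimated and uses greedy matching without replacement; the abstract does not say whether the study matched greedily, used a caliper, or matched with replacement, so these are illustrative choices.

```python
def match_one_to_one(treated, control):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dict mapping participant id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet matched
    pairs = []
    # Process treated units in ascending score order for determinism.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break  # more treated units than controls
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]  # each control is used at most once
    return pairs
```

Because each control is consumed once, the two matched groups end up the same size, as with the two groups of 7,516 in the abstract.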

Indoor Path Recognition Based on Wi-Fi Fingerprints

  • Donggyu Lee;Jaehyun Yoo
    • Journal of Positioning, Navigation, and Timing
    • /
    • Vol. 12, No. 2
    • /
    • pp.91-100
    • /
    • 2023
  • Existing indoor localization methods based on Wi-Fi fingerprinting have a high collection cost and relatively low accuracy, and therefore require correction through convergence with other technologies. This paper proposes a new method that significantly reduces collection costs compared with existing Wi-Fi fingerprinting methods. Furthermore, it does not require data to be labeled at collection time and can estimate pedestrian travel paths even in large indoor spaces. The proposed path estimation process is as follows. Feature areas are set up near intersections of the indoor space, and unlabeled data are collected while moving through these feature areas. The collected data are processed using Kernel Linear Discriminant Analysis (KLDA), and the valley points of the Euclidean distance between two data points are found within the feature space. We build the training data by labeling the data at each valley point, together with some nearby data, with the feature area number, and by labeling the data between two valley points as path data between the corresponding feature areas. Finally, for testing, data are collected randomly throughout the indoor space, KLDA is applied in the same way as before to build the test data, the K-Nearest Neighbor (K-NN) algorithm is applied, and the movement path is estimated with a correction algorithm that admits only routes reachable from the most recently estimated location. Accuracy was verified by comparing the true paths in the indoor space with those estimated by the proposed method, which achieved approximately 90.8% and 81.4% accuracy in the two experimental spaces, respectively.
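
The abstract above applies K-NN and then corrects the estimate so that only locations reachable from the previous one are accepted. The following sketch illustrates that idea under assumptions of my own: the `reachable` adjacency map, the vote-ranking rule, and the fallback of staying in place when no candidate is reachable are hypothetical and are not the authors' correction algorithm.

```python
import math
from collections import Counter

def knn_ranked_labels(train, query, k):
    """Labels of the k nearest training points, most frequent first."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return [label for label, _ in votes.most_common()]

def estimate_path(train, queries, reachable, k=3):
    """Label each query in order, keeping only reachable transitions.

    reachable: dict mapping a label to the set of labels allowed next.
    """
    path = []
    for query in queries:
        ranked = knn_ranked_labels(train, query, k)
        if path:
            allowed = reachable[path[-1]]
            # Assumed fallback: stay put if no candidate is reachable.
            ranked = [lbl for lbl in ranked if lbl in allowed] or [path[-1]]
        path.append(ranked[0])
    return path
```

The constraint lets a lower-ranked but physically plausible label override the top K-NN vote, which is the kind of correction that can suppress impossible jumps between distant areas.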