Integrated Search | Korea Science

Band Selection Using Forward Feature Selection Algorithm for Citrus Huanglongbing Disease Detection

Katti, Anurag R.;Lee, W.S.;Ehsani, R.;Yang, C.
- Journal of Biosystems Engineering / Vol. 40, No. 4 / pp. 417-427 / 2015
Purpose: This study investigated band selection methods for classifying spectrally similar data, obtained from aerial images of healthy citrus canopies and canopies infected with citrus greening disease (Huanglongbing, or HLB), using small differences without unmixing endmember components and therefore without the need for an endmember library. However, the large number of hyperspectral bands carries high redundancy, which had to be reduced through band selection. The objective was therefore to first select the best set of bands and then use those bands to detect HLB-infected canopies in aerial hyperspectral images. Methods: The forward feature selection algorithm (FFSA) was chosen for band selection. The selected bands were used to identify HLB-infected pixels with several classifiers: K nearest neighbor (KNN), support vector machine (SVM), naïve Bayesian classifier (NBC), and generalized local discriminant bases (LDB). All bands were also used, for comparison. Results: A few well-chosen bands yielded much better results than using all bands, and brought the classification results on par with standard hyperspectral classification techniques such as spectral angle mapper (SAM) and mixture tuned matched filtering (MTMF). Median detection accuracies ranged from 66% to 80%, showing great potential for rapid detection of the disease. Conclusions: Among the methods investigated, a support vector machine classifier combined with the forward feature selection algorithm yielded the best results.
https://doi.org/10.5307/JBE.2015.40.4.417
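
The abstract above names forward feature selection as the band selection step. A minimal sketch of the general greedy idea follows; it is not the paper's implementation. The function `forward_select`, the stopping rule, and the toy scoring function are all assumptions made for illustration. In practice the score would be something like a classifier's cross-validated accuracy on the chosen bands.

```python
from typing import Callable, List, Sequence


def forward_select(n_features: int,
                   score: Callable[[Sequence[int]], float],
                   max_features: int) -> List[int]:
    """Greedy forward selection: repeatedly add the single feature
    (band) that raises the score the most, and stop when nothing helps."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Score every subset formed by adding one more candidate feature.
        scored = [(score(selected + [f]), f) for f in candidates]
        top_score, top_feature = max(scored)
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        best_score = top_score
    return selected


# Hypothetical toy score: bands 2 and 5 are informative, and every
# selected band carries a small cost of 0.1.
def toy_score(subset: Sequence[int]) -> float:
    return sum(1.0 for f in subset if f in (2, 5)) - 0.1 * len(subset)


chosen = forward_select(n_features=8, score=toy_score, max_features=4)
print(sorted(chosen))
```

With the toy score, the search keeps the two informative bands and stops, because adding any third band lowers the score.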

A Neighbor Selection Technique for Improving Efficiency of Local Search in Load Balancing Problems

강병호;조민숙;류광렬
- 한국정보과학회논문지:소프트웨어및응용 / Vol. 31, No. 2 / pp. 164-172 / 2004
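In local search, the chance of finding the optimal solution generally rises as more neighbor solutions are generated and more iterations are run, but at the cost of long search times. To find the optimal solution efficiently within a limited time, it is therefore necessary to generate a moderate number of neighbors while selectively generating those that can raise the quality of the search. This paper proposes a neighbor selection technique that improves search efficiency when local search is applied to load balancing problems, and verifies its performance on real-world data. The proposed technique is based on probabilistic selection: neighbors are selected and generated according to probabilities assigned from an estimate of how likely each is to improve the quality of the search. In experiments applying tabu search and simulated annealing to the target problem, it outperformed methods based on random or greedy selection.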
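
The probabilistic selection described above can be sketched as follows. This is an illustration only: the abstract does not say how the improvement estimates are computed or normalized, so the proportional weighting, the function names, and the example moves are all assumptions.

```python
import random
from typing import List, Sequence


def selection_probabilities(estimates: Sequence[float]) -> List[float]:
    """Turn non-negative improvement estimates into selection probabilities."""
    total = sum(estimates)
    if total <= 0:
        # No useful estimate: fall back to uniform (purely random) selection.
        return [1.0 / len(estimates)] * len(estimates)
    return [e / total for e in estimates]


def pick_neighbors(moves: Sequence[str], estimates: Sequence[float],
                   k: int, rng: random.Random) -> List[str]:
    """Draw k candidate moves, favoring those with higher estimates.
    Sampling is with replacement, so a move may be drawn more than once."""
    probs = selection_probabilities(estimates)
    return rng.choices(moves, weights=probs, k=k)


# Hypothetical candidate moves with estimated chances of improving the load balance.
moves = ["move job 1", "move job 2", "move job 3"]
estimates = [3.0, 1.0, 0.0]
picks = pick_neighbors(moves, estimates, k=5, rng=random.Random(0))
```

Greedy selection would always take the move with the highest estimate and random selection would ignore the estimates; weighting by estimate sits between the two.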

Centroid and Nearest Neighbor based Class Imbalance Reduction with Relevant Feature Selection using Ant Colony Optimization for Software Defect Prediction

B., Kiran Kumar;Gyani, Jayadev;Y., Bhavani;P., Ganesh Reddy;T, Nagasai Anjani Kumar
- International Journal of Computer Science & Network Security / Vol. 22, No. 10 / pp. 1-10 / 2022
Software defect prediction (SDP) is currently one of the most active research areas in software engineering. Early detection of defects lowers the cost of software and improves its reliability. Machine learning techniques are widely used to build SDP models from programming measures. Most defect prediction models in the literature suffer from class imbalance and high dimensionality. In this paper, we propose the Centroid and Nearest Neighbor based Class Imbalance Reduction (CNNCIR) technique, which considers dataset distribution characteristics to generate symmetry between defective and non-defective records in imbalanced datasets. The proposed approach is compared with SMOTE (Synthetic Minority Oversampling Technique). The high-dimensionality problem is addressed by choosing relevant features with the Ant Colony Optimization (ACO) technique. We used nine classifiers to analyze six open-source software defect datasets from the PROMISE repository and evaluated them with seven performance measures. The results show that the proposed CNNCIR method with ACO-based feature selection outperforms SMOTE in the majority of cases.
https://doi.org/10.22937/IJCSNS.2022.22.10.1

Optimal k-Nearest Neighborhood Classifier Using Genetic Algorithm

박종선;허균
- Communications for Statistical Applications and Methods / Vol. 17, No. 1 / pp. 17-27 / 2010
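We present an algorithm that applies a genetic algorithm to the k-nearest neighbor classifier used in classification analysis, simultaneously selecting the meaningful variables, their weights, and an appropriate k. A cross-validation comparison with several existing methods on a variety of real data sets showed the approach to be effective.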
https://doi.org/10.5351/CKSS.2010.17.1.017
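
To make concrete what is being tuned in the abstract above, here is a sketch of a k-nearest neighbor classifier with per-variable weights. It does not include the genetic algorithm itself, and the weighted Euclidean distance and all names are assumptions, not the authors' formulation. A search procedure such as a genetic algorithm would propose values for `weights` and `k` and keep those with the best cross-validated accuracy.

```python
import math
from collections import Counter
from typing import List, Sequence


def weighted_knn_predict(train_X: List[Sequence[float]], train_y: List[str],
                         query: Sequence[float], weights: Sequence[float],
                         k: int) -> str:
    """Classify `query` by majority vote among its k nearest training points,
    using a feature-weighted Euclidean distance. A weight of 0 drops a variable."""
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-variable data: which variable is weighted decides the class.
train_X = [(0.0, 0.0), (1.0, 10.0)]
train_y = ["a", "b"]
query = (0.9, 1.0)

by_first = weighted_knn_predict(train_X, train_y, query, weights=(1.0, 0.0), k=1)
by_second = weighted_knn_predict(train_X, train_y, query, weights=(0.0, 1.0), k=1)
```

The same query is assigned to different classes depending on the weights, which is why variable selection, weighting, and the choice of k interact and are worth optimizing together.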

Impact of Instance Selection on kNN-Based Text Categorization

Barigou, Fatiha
- Journal of Information Processing Systems / Vol. 14, No. 2 / pp. 418-434 / 2018
With the increasing use of the Internet and electronic documents, automatic text categorization has become imperative. Several machine learning algorithms have been proposed for text categorization. The k-nearest neighbor algorithm (kNN) is known to be one of the best state-of-the-art classifiers for text categorization. However, kNN has limitations, such as the high computational cost of classifying new instances. Instance selection techniques have emerged as highly competitive methods for improving kNN through data reduction. However, previous works have evaluated those approaches only on structured datasets, and their performance has not been examined in the text categorization domain, where the dimensionality and size of the datasets are very high. Motivated by these observations, this paper investigates and analyzes the impact of instance selection on kNN-based text categorization in terms of classification accuracy, classification efficiency, and data reduction.
https://doi.org/10.3745/JIPS.02.0080
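
The abstract does not say which instance selection methods the paper evaluates. As one classic example of the idea, the sketch below implements Hart's condensed nearest neighbor rule, chosen here purely for illustration. It keeps only the training instances that the current reduced set would misclassify under 1-NN, so that kNN has fewer instances to compare against at prediction time.

```python
from typing import List, Sequence


def condense(X: List[Sequence[float]], y: List[str]) -> List[int]:
    """Hart's condensed nearest neighbor: return indices of a reduced
    training set that still classifies every training instance correctly
    with 1-NN."""
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    store = [0]  # indices kept so far, seeded with the first instance
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in store:
                continue
            nearest = min(store, key=lambda j: sqdist(X[i], X[j]))
            if y[nearest] != y[i]:
                # The store misclassifies instance i, so it must be kept.
                store.append(i)
                changed = True
    return store


# Hypothetical one-dimensional data with two well-separated classes.
X = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
y = ["a", "a", "a", "b", "b", "b"]
kept = condense(X, y)
print(kept)
```

On this toy data one instance per class is enough, so six instances reduce to two. Real text data is far higher dimensional, which is the setting the paper examines.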

Resume Classification System using Natural Language Processing & Machine Learning Techniques

Irfan Ali;Nimra;Ghulam Mujtaba;Zahid Hussain Khand;Zafar Ali;Sajid Khan
- International Journal of Computer Science & Network Security / Vol. 24, No. 7 / pp. 108-117 / 2024
Selecting and recommending suitable job applicants from a pool of thousands of applications is often a daunting job for an employer, and it significantly increases the workload of the department concerned. A Resume Classification System built on Natural Language Processing (NLP) and Machine Learning (ML) techniques could automate this tedious process and ease the employer's job. Automation can also significantly expedite the applicant selection process and make it more transparent, with minimal human involvement. Various Machine Learning approaches have already been proposed for developing Resume Classification Systems. This study presents an automated NLP and ML-based system that classifies resumes according to job categories with performance guarantees. It employs various ML algorithms and NLP techniques to measure the accuracy of Resume Classification Systems and proposes a solution with better accuracy and reliability in different settings. To demonstrate the significance of NLP and ML techniques for processing and classifying resumes, the extracted features were tested on nine machine learning models: Support Vector Machine (SVM; Linear, SGD, SVC, and NuSVC), Naïve Bayes (Bernoulli, Multinomial, and Gaussian), K-Nearest Neighbor (KNN), and Logistic Regression (LR). The Term Frequency-Inverse Document Frequency (TF-IDF) feature representation scheme proved suitable for the resume classification task. The developed models were evaluated using F-Score_M, Recall_M, Precision_M, and overall accuracy. The experimental results indicate that, with the One-vs-Rest classification strategy for this multi-class task, the SVM class of Machine Learning algorithms performed better on the study dataset, with over 96% overall accuracy. The promising results suggest that the NLP and ML techniques employed in this study could be used for the resume classification task.
https://doi.org/10.22937/IJCSNS.2024.24.7.13
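
The TF-IDF representation mentioned above can be sketched in a few lines. This is one textbook variant (relative term frequency times the unsmoothed log inverse document frequency), assumed here for illustration. The abstract does not state which variant or tokenizer the authors used, and libraries often apply smoothing and normalization that this sketch omits.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[str]) -> List[Dict[str, float]]:
    """Compute a sparse TF-IDF vector per document (assumes non-empty docs)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors


# Hypothetical two-resume corpus.
docs = ["python developer python", "java developer"]
vectors = tfidf(docs)
```

A term that occurs in every document, such as "developer" here, gets weight 0, while terms specific to one resume get positive weights. That is what makes the representation useful as classifier input.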

Optimal Band Selection Techniques for Hyperspectral Image Pixel Classification using Pooling Operations & PSNR

장두혁;정병현;허준영
- 한국인터넷방송통신학회논문지 / Vol. 21, No. 5 / pp. 141-147 / 2021
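To reduce the heavy computation of neural network inputs on embedded systems through dimensionality reduction, and thereby improve the utilization of feature information in large hyperspectral data, this study clusters all bands into subsets according to the difference between each band's maximum and minimum values and applies a band selection algorithm within each subset. Of the two approaches, feature extraction and feature selection, it uses feature selection, aiming to find the optimal number of bands for a dataset regardless of wavelength range and to improve on the running time and performance of existing algorithms. In the experiments, running time was cut to roughly 1/3 to 1/9 of that of existing band selection techniques, while performance with a K-nearest neighbor classifier improved by about 4% or more. The method is still difficult to use for real-time hyperspectral data analysis, but the results confirm improved feasibility.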
https://doi.org/10.7236/JIIBC.2021.21.5.141

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation

김유섭;장정호
- 정보처리학회논문지B / Vol. 11B, No. 6 / pp. 749-758 / 2004
This paper proposes a method that uses untagged corpus data to resolve the ambiguity that arises in target word selection in an English-Korean machine translation system. To this end it applies latent semantic analysis (LSA) and probabilistic latent semantic analysis (PLSA). Given contextual information such as a text passage, both techniques can represent the complex semantic structure that the context implies. We use them to build linguistic semantic knowledge, which is then used to estimate the semantic similarity between words in order to resolve target word selection ambiguity in English-Korean machine translation. Target word selection must also draw on grammatical relations stored in a dictionary beforehand. To ease the data sparseness problem that arises in this selection, we use the k-nearest neighbor learning algorithm, and we use the two models above to estimate the distance between examples that k-nearest neighbor learning requires. In the experiments, TREC data (AP news) was used to construct the latent semantic space for both techniques, and the Wall Street Journal corpus was used to evaluate target word selection accuracy. With latent semantic analysis, selection accuracy improved by about 10% over default sense selection, and PLSA performed slightly better than LSA. We also examined how the reduced vector dimensionality of the latent space and the value of k in k-nearest neighbor learning affect target word selection accuracy by computing their correlation with that accuracy.
https://doi.org/10.3745/KIPSTB.2004.11B.6.749

Category Variable Selection Method for Efficient Clustering

Heo, Jun;Kim, Chae Yun;Jung, Yong-Gyu
- International journal of advanced smart convergence / Vol. 2, No. 2 / pp. 40-42 / 2013
As society ages and national health insurance is applied more widely, the medical industry, including the pharmaceutical market, has grown considerably alongside state-of-the-art research and development. Expanded support for the nation's health care industry, and research and development that improves quality of life, will be needed. Basic medical supplies and drugs also need systematic management. So does a system that analyzes pharmaceutical ingredients, automatically classifies new medicines and ingredients from purchase data, analyzes drug purchase statistics, and predicts future patients. This study analyzes the drugs given to patients by component and, through comparative experiments with k-means clustering and k-NN (nearest neighbor), examines which technique offers the more efficient way to proceed. By analyzing statistics on the effects of drugs by component, over time and by number of patients, and by predicting future patients, a better medical industry can be built.
https://doi.org/10.7236/IJASC2013.2.2.9

Classification of Fall Direction Before Impact Using Machine Learning Based on IMU Raw Signals

이현빈;이창준;이정근
- 센서학회지 / Vol. 31, No. 2 / pp. 96-101 / 2022
As the elderly population gradually increases, so does the risk of fatal fall accidents among the elderly. One way to cope with a fall accident is to determine the fall direction before impact using a wearable inertial measurement unit (IMU). In this context, a previous study proposed a method of classifying fall directions using a support vector machine with sensor velocity, acceleration, and tilt angle as input parameters. In that method, however, the IMU signals pass through several processing steps, including a Kalman filter and the integration of acceleration, which involve a large amount of computation and introduce error factors. This paper therefore proposes a machine learning-based method that classifies the fall direction before impact using raw IMU signals rather than processed data. We investigated the effects of two factors on classification performance: (1) the use of processed versus raw signals and (2) the choice of machine learning technique. First, in the comparison of processed and raw signals, the difference in sensitivity between the two was within 5%, indicating an equivalent level of classification performance. Second, in the comparison of six machine learning techniques, K-nearest neighbor and naive Bayes exhibited excellent performance, with sensitivities of 86.0% and 84.1%, respectively.
https://doi.org/10.46670/JSST.2022.31.2.96
