• Title/Abstract/Keywords: feature vector selection

178 search results (processing time: 0.024 s)

A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 18, No. 6
    • /
    • pp.1540-1561
    • /
    • 2024
  • This research proposes a novel approach for Tamil Handwritten Character Recognition (THCR) that combines feature selection and ensemble learning techniques. The Tamil script is complex and highly variable, requiring a robust and accurate recognition system. Feature selection is used to reduce dimensionality while preserving discriminative features, improving classification performance and reducing computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The proposed approach is evaluated on the HP Labs Dataset, achieving 95.56% accuracy using an ensemble learning framework based on support vector machines. The dataset consists of 82,928 samples with 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing advanced recognition systems for other complex scripts. Future investigations could explore the integration of deep learning techniques and the extension of the proposed approach to other Indic scripts and languages, advancing the field of handwritten character recognition.

New Feature Selection Method for Text Categorization

  • Wang, Xingfeng;Kim, Hee-Cheol
    • Journal of information and communication convergence engineering
    • /
    • Vol. 15, No. 1
    • /
    • pp.53-61
    • /
    • 2017
  • The preferred feature selection methods for text classification are filter-based. In a common filter-based feature selection scheme, unique scores are assigned to features; then, these features are sorted according to their scores. The last step is to add the top-N features to the feature set. In this paper, we propose an improved global feature selection scheme wherein its last step is modified to obtain a more representative feature set. The proposed method aims to improve the classification performance of global feature selection methods by creating a feature set representing all classes almost equally. For this purpose, a local feature selection method is used in the proposed method to label features according to their discriminative power on classes; these labels are used while producing the feature sets. Experimental results obtained using the well-known 20 Newsgroups and Reuters-21578 datasets with the k-nearest neighbor algorithm and a support vector machine indicate that the proposed method improves the classification performance in terms of a widely known metric ($F_1$).
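The final selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the round-robin rule in `top_n_balanced`, the function names, and the toy scores and class labels are all assumptions made for the example. It contrasts the common top-N step with one possible way of producing a feature set that represents the classes almost equally.

```python
from collections import defaultdict


def top_n_global(scores, n):
    """Common filter step: sort features by score and keep the top n."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


def top_n_balanced(scores, labels, n):
    """Assumed balanced variant: pick features round-robin across classes,
    always taking the best remaining feature labeled with each class."""
    by_class = defaultdict(list)
    for feat in sorted(scores, key=scores.get, reverse=True):
        by_class[labels[feat]].append(feat)

    classes = sorted(by_class)
    selected = []
    rank = 0
    # Stop when n features are chosen or every class list is exhausted.
    while len(selected) < n and any(rank < len(by_class[c]) for c in classes):
        for c in classes:
            if rank < len(by_class[c]) and len(selected) < n:
                selected.append(by_class[c][rank])
        rank += 1
    return selected


# Hypothetical filter scores and per-feature class labels (toy data).
scores = {"goal": 0.9, "team": 0.8, "match": 0.7, "stock": 0.4, "bank": 0.3}
labels = {"goal": "sports", "team": "sports", "match": "sports",
          "stock": "finance", "bank": "finance"}

global_set = top_n_global(scores, 3)
balanced_set = top_n_balanced(scores, labels, 4)
print(global_set)
print(balanced_set)
```

On this toy data the plain top-3 set contains only features labeled "sports", while the balanced set draws two features from each class.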

Context Aware Feature Selection Model for Salient Feature Detection from Mobile Video Devices

  • 이재호;신현경
    • 인터넷정보학회논문지
    • /
    • Vol. 15, No. 6
    • /
    • pp.117-124
    • /
    • 2014
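  • In real-time video processing on mobile devices, the main difficulty in salient object detection and tracking is separating the foreground from a complex background. This paper presents a context-aware model for the problem of selecting feature vectors for machine learning and uses it to implement a machine-learning-based classifier for noise removal. For the context-aware feature vector selection algorithm based on nearest neighbors, which is known to be NP-hard, the paper discusses in detail an approximate method that reduces the number of operations. It also gives a rigorous analysis, using principal component analysis, of the improved separability in the feature space obtained through context-aware feature vector selection. To measure the overall performance improvement, comparative results are presented for various machine learning methods, including multilayer neural networks, support vector machines, naive Bayes, and regression analysis. The conclusion describes the performance and computational resource usage of the proposed method.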

Feature Selection for Multi-Class Support Vector Machines Using an Impurity Measure of Classification Trees: An Application to the Credit Rating of S&P 500 Companies

  • Hong, Tae-Ho;Park, Ji-Young
    • Asia pacific journal of information systems
    • /
    • Vol. 21, No. 2
    • /
    • pp.43-58
    • /
    • 2011
  • Support vector machines (SVMs), a machine learning technique, have been applied not only to binary classification problems such as bankruptcy prediction but also to multi-class problems such as corporate credit rating. However, depending on the choice of predictors, SVMs can easily perform worse than the best alternative model, even though they have classified and predicted successfully in many binary and multi-class problems. To overcome this weakness, this study proposes an approach for selecting features for multi-class SVMs that uses the impurity measures of classification trees. To select the input features, we employed the C4.5 and CART algorithms, as well as the stepwise method of discriminant analysis, a well-known feature selection method. We built a multi-class SVM model for credit rating using these methods and present experimental results on data from S&P 500 companies.
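As a rough illustration of ranking predictors by a classification-tree impurity measure, the sketch below scores each feature by its best single-split decrease in Gini impurity, the measure used by CART. The abstract does not give the paper's exact procedure, so the ranking rule, the function names, and the toy rating data are assumptions; C4.5 uses an entropy-based gain ratio, which is not shown here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_gain(values, labels):
    """Largest impurity decrease over all binary threshold splits on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Try every distinct value except the largest as a split threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        children = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - children)
    return best


def rank_features(columns, labels):
    """Order feature names by decreasing best-split impurity decrease."""
    gains = {name: best_gini_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Hypothetical predictors for six firms with three rating classes (toy data).
ratings = ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC"]
columns = {
    "leverage": [0.1, 0.2, 0.5, 0.6, 0.9, 1.0],
    "noise": [3, 1, 2, 3, 1, 2],
}
ranking = rank_features(columns, ratings)
print(ranking)
```

The top-ranked features would then be passed to the multi-class SVM as inputs; how many to keep is a separate tuning decision.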

Improving the Performance of a Fast Text Classifier with Document-side Feature Selection

  • 이재윤
    • 정보관리연구
    • /
    • Vol. 36, No. 4
    • /
    • pp.51-69
    • /
    • 2005
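  • Improving classification speed has become an important research topic in text classification. The recently developed feature value voting technique is very fast at automatic text classification, but its classification accuracy is unsatisfactory. This paper proposes a new feature selection technique, document-side feature selection, and applies it to feature value voting. Unlike conventional feature selection for classification, document-side feature selection chooses a subset of the features of the document being classified, rather than of the training set, and uses only that subset for classification. In experiments with document-side feature selection, the simple and fast feature value voting classifier performed as well as an SVM classifier.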

Morphological Feature Extraction of Microorganisms Using Image Processing

  • Kim Hak-Kyeong;Jeong Nam-Su;Kim Sang-Bong;Lee Myung-Suk
    • Fisheries and Aquatic Sciences
    • /
    • Vol. 4, No. 1
    • /
    • pp.1-9
    • /
    • 2001
  • This paper describes a procedure for extracting the feature vector of a target cell more precisely when identifying a specified cell. Object classification is based on a feature vector comprising the area, complexity, centroid, rotation angle, effective diameter, perimeter, width, and height of the object, so the feature vector plays a very important role in classifying objects. Because the feature vector is affected by noise and holes, noise must be removed from the original image before the feature vector can be extracted accurately. We propose the following method. First, using Otsu's optimal threshold selection method and morphological filters such as cleaning, filling, and opening filters, we separate objects from the background and remove isolated particles. After labeling with a 4-adjacent neighborhood, the labeled image is filtered by an area filter. From this area-filtered image, the feature vector is extracted: area, complexity, centroid, rotation angle, effective diameter, perimeter based on the chain code, and width and height based on the rotation matrix. To demonstrate its effectiveness, the proposed method is applied to the yeast Zygosaccharomyces rouxii. The experimental results show that the proposed method measures feature vectors more efficiently than Otsu's optimal threshold detection method alone.
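The first step of this pipeline, Otsu's threshold selection, can be sketched in plain Python as below. It is a minimal version that assumes 8-bit gray levels given as a flat list of integers; the morphological filters, labeling, and area filter from the abstract are not shown, and the toy pixel values are made up for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance,
    treating pixels <= t as background and pixels > t as foreground."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0      # number of background pixels so far
    sum_bg = 0         # sum of gray levels of background pixels so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue   # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break      # no foreground pixels left
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy image: a dark background and a bright object (hypothetical values).
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 40
t = otsu_threshold(pixels)
area = sum(1 for p in pixels if p > t)  # object area in pixels
print(t, area)
```

The `area` value here stands in for the simplest entry of the feature vector; the remaining features would be computed on the labeled, filtered image.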

Specific Material Detection with Similar Colors using Feature Selection and Band Ratio in Hyperspectral Image

  • 심민섭;김성호
    • 제어로봇시스템학회논문지
    • /
    • Vol. 19, No. 12
    • /
    • pp.1081-1088
    • /
    • 2013
  • Hyperspectral cameras acquire reflectance values at many different wavelength bands. The data dimensionality tends to be high because spectral information is stored for each pixel. Several attempts have been made to reduce it, such as feature selection using AdaBoost and dimension reduction using simulated annealing. We propose a novel material detection method that consists of four steps: feature band selection, feature extraction, SVM (Support Vector Machine) learning, and target and specific region detection. It combines the band ratio method with a simulated annealing algorithm driven by the detection rate. The experimental results validate the effectiveness of the proposed feature selection and band ratio method.
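The band ratio idea can be illustrated with a small sketch. It assumes the band pair has already been chosen (the abstract selects bands with simulated annealing, which is not shown) and replaces the SVM stage with a simple ratio interval; the function names, band indices, thresholds, and reflectance values are all hypothetical.

```python
def band_ratio(spectrum, num_band, den_band, eps=1e-9):
    """Ratio of reflectance in two bands for one pixel spectrum."""
    return spectrum[num_band] / (spectrum[den_band] + eps)


def detect(cube, num_band, den_band, low, high):
    """Binary mask: True where the band ratio lies in [low, high]."""
    return [
        [low <= band_ratio(px, num_band, den_band) <= high for px in row]
        for row in cube
    ]


# Toy 2x2 image with 3-band spectra (hypothetical reflectance values).
target = [0.2, 0.4, 0.8]      # ratio of band 2 to band 0 is about 4
background = [0.3, 0.3, 0.3]  # ratio of band 2 to band 0 is about 1
cube = [[target, background],
        [background, target]]

mask = detect(cube, num_band=2, den_band=0, low=3.0, high=5.0)
print(mask)
```

A ratio of two bands is insensitive to a common brightness scale, which is one reason it can help separate materials whose visible colors look alike.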

Improved Feature Selection Techniques for Image Retrieval based on Metaheuristic Optimization

  • Johari, Punit Kumar;Gupta, Rajendra Kumar
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 1
    • /
    • pp.40-48
    • /
    • 2021
  • In a Content-Based Image Retrieval (CBIR) system, retrieving the images that match the user's perception from a huge database is a challenging task. Images are represented by combining low-level features of their visual content into a feature vector. To reduce the search time when retrieving images from a large database, a novel image retrieval technique based on feature dimensionality reduction is proposed, using metaheuristic optimization techniques: Genetic Algorithm (GA), Extended Binary Cuckoo Search (EBCS), and Whale Optimization Algorithm (WOA). Each image in the database is indexed using a feature vector comprising a fuzzified color histogram descriptor for color and a median binary pattern, derived in the HSI color space, for texture. The results are compared in terms of precision, recall, F-measure, accuracy, and error rate with benchmark classification algorithms (Linear discriminant analysis, CatBoost, Extra Trees, Random Forest, Naive Bayes, light gradient boosting, Extreme gradient boosting, k-NN, and Ridge) to validate the efficiency of the proposed approach. Finally, the techniques are ranked using TOPSIS to choose the best feature selection technique based on different model parameters.

The Important Frequency Band Selection and Feature Vector Extraction System by an Evolutional Method

  • Yazama, Yuuki;Mitsukura, Yasue;Fukumi, Minoru;Akamatsu, Norio
    • 제어로봇시스템학회:학술대회논문집
    • /
    • Institute of Control, Robotics and Systems, 2003 ICCAS
    • /
    • pp.2209-2212
    • /
    • 2003
  • In this paper, we propose a method for extracting the important frequency bands from the EMG signal and generating a feature vector from those bands. The EMG signal is measured with four sensors and recorded as four-channel time series data. The same frequency bands are selected as the important bands across the frequency components of all four channels. The feature vector is calculated by a function formed from combinations of the selected important frequency bands. EMG signals acquired from seven types of wrist motion are recognized after being converted into this feature vector. The extraction and generation are performed using a combination of the genetic algorithm (GA) and the neural network (NN). Finally, computer simulations are carried out to illustrate the effectiveness of the proposed method.

Discriminative Feature Vector Selection for Emotion Classification Based on Speech

  • 최하나;변성우;이석필
    • 전기학회논문지
    • /
    • Vol. 64, No. 9
    • /
    • pp.1363-1368
    • /
    • 2015
  • Recently, computers have become smaller thanks to advances in computing technology, and many wearable devices have appeared. As a result, computer recognition of human emotion has become an important consideration, and research on analyzing emotional states is increasing. The human voice carries much information about human emotion. This paper proposes a discriminative feature vector selection for emotion classification based on speech. To this end, we extract feature vectors such as pitch, MFCC, LPC, and LPCC from voice signals divided into four emotion classes (happy, normal, sad, and angry) and compare the separability of the extracted feature vectors using the Bhattacharyya distance. On this basis, the more effective feature vectors are recommended for emotion classification.
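The separability comparison can be sketched with the closed-form Bhattacharyya distance between two univariate Gaussians. Modeling each feature per emotion class as a one-dimensional Gaussian is an assumption made here for brevity (the abstract does not state the distribution model), and the feature values below are invented for illustration.

```python
import math
from statistics import mean, variance


def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance between two samples, each modeled as a
    univariate Gaussian with its sample mean and sample variance."""
    mu_a, mu_b = mean(a), mean(b)
    var_a, var_b = variance(a), variance(b)
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


# Hypothetical per-utterance feature values for two emotion classes.
pitch_happy = [220.0, 235.0, 228.0, 242.0, 231.0]
pitch_sad = [150.0, 162.0, 158.0, 149.0, 155.0]
lpc_happy = [0.51, 0.48, 0.55, 0.50, 0.47]
lpc_sad = [0.49, 0.52, 0.46, 0.53, 0.50]

d_pitch = bhattacharyya_gaussian(pitch_happy, pitch_sad)
d_lpc = bhattacharyya_gaussian(lpc_happy, lpc_sad)
print(d_pitch, d_lpc)
```

A larger distance means the two classes overlap less on that feature, so in this toy data the pitch feature would be preferred over the LPC coefficient for separating happy from sad speech.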