• Title/Abstract/Keywords: selection approach

Search results: 1,665 items (processing time: 0.055 s)

A Novel Statistical Feature Selection Approach for Text Categorization

  • Fattah, Mohamed Abdel
    • Journal of Information Processing Systems
    • /
    • Vol. 13, No. 5
    • /
    • pp.1397-1409
    • /
    • 2017
  • In text categorization, selecting distinctive text features is important because the feature space is high-dimensional. Reducing the feature space dimension is important for decreasing processing time and increasing accuracy. In this study, we introduce a novel statistical feature selection approach for text categorization. The approach measures the term distribution across all documents in the collection, the term distribution within a given category, and the term distribution in a given class relative to other classes. The results show that the proposed method outperforms traditional feature selection methods.

Comparing Bayesian model selection with a frequentist approach using iterative method of smoothing residuals

  • Koo, Hanwool;Shafieloo, Arman;Keeley, Ryan E.;L'Huillier, Benjamin
    • 천문학회보
    • /
    • Vol. 46, No. 1
    • /
    • pp.48.2-48.2
    • /
    • 2021
  • We have developed a frequentist approach for model selection which determines the consistency of a cosmological model with the data using the distribution of likelihoods from the iterative smoothing method. Using this approach, we have shown how confidently we can distinguish different models without comparing them with one another. In this work, we compare our approach with the conventional Bayesian approach to model selection, which is based on estimating the Bayesian evidence using nested sampling. We use simulated future Roman (formerly WFIRST)-like type Ia supernovae data in our analysis. We discuss the limits of the Bayesian approach for model selection and show how our proposed frequentist approach, if implemented appropriately, can perform better in falsifying individual models.

A Scheduling Approach with Component Selection

  • Harashima, Katsumi;Satoh, Hisashi;Hiro, Daisuke;Kutsuwa, Toshiro
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2000년도 ITC-CSCC -1
    • /
    • pp.399-402
    • /
    • 2000
  • Reducing chip area and delay is an important goal of scheduling in high-level synthesis. This paper presents a scheduling approach with component selection. After obtaining an initial schedule that uses only single-functional units, the component selection step attempts to reduce chip area and/or delay by selecting more suitable components from a component library using simulated annealing.

Speech Feature Selection of Normal and Autistic children using Filter and Wrapper Approach

  • Akhtar, Muhammed Ali;Ali, Syed Abbas;Siddiqui, Maria Andleeb
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 5
    • /
    • pp.129-132
    • /
    • 2021
  • This study analyzes two feature selection approaches. The first is the filter approach, which uses a correlation technique and provides two reduced feature sets based on positive and negative correlation. The second is the wrapper approach, which uses the Sequential Forward Selection technique. The reduced feature set obtained from the positive correlation results comprises Rate of Acceleration, Intensity and Formant. The reduced feature set obtained from the positive correlation results comprises Rasta PLP, Log energy, Log power and Zero Crossing Rate. Sequential Forward Selection yields the reduced feature set Pitch, Rate of Acceleration, Log Power, MFCC and LPCC.
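
The two families of methods named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the correlation `threshold`, and the caller-supplied `score` function are all assumptions made for illustration, and the abstract does not say how features were scored in the wrapper step.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # assumes neither sequence is constant


def filter_by_correlation(features, target, threshold):
    """Filter approach: split feature names into positively and
    negatively correlated sets (threshold is a hypothetical cut-off)."""
    positive, negative = [], []
    for name, values in features.items():
        r = pearson(values, target)
        if r >= threshold:
            positive.append(name)
        elif r <= -threshold:
            negative.append(name)
    return positive, negative


def sequential_forward_selection(names, score, k):
    """Wrapper approach: greedily add the feature that gives the highest
    score(subset) until k features are selected. `score` is supplied by
    the caller, e.g. cross-validated classifier accuracy."""
    selected = []
    while len(selected) < k:
        remaining = [n for n in names if n not in selected]
        if not remaining:
            break
        best = max(remaining, key=lambda n: score(selected + [n]))
        selected.append(best)
    return selected
```

The filter step looks only at each feature's correlation with the target, while the wrapper step re-evaluates a scoring function for every candidate subset, which is why wrappers usually cost more to run.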

A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 18, No. 6
    • /
    • pp.1540-1561
    • /
    • 2024
  • This research proposes a novel approach for Tamil Handwritten Character Recognition (THCR) that combines feature selection and ensemble learning techniques. The Tamil script is complex and highly variable, requiring a robust and accurate recognition system. Feature selection is used to reduce dimensionality while preserving discriminative features, improving classification performance and reducing computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The proposed approach is evaluated on the HP Labs Dataset, achieving 95.56% accuracy using an ensemble learning framework based on support vector machines. The dataset consists of 82,928 samples with 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing advanced recognition systems for other complex scripts. Future investigations could explore the integration of deep learning techniques and the extension of the proposed approach to other Indic scripts and languages, advancing the field of handwritten character recognition.

Biological Feature Selection and Disease Gene Identification using New Stepwise Random Forests

  • Hwang, Wook-Yeon
    • Industrial Engineering and Management Systems
    • /
    • Vol. 16, No. 1
    • /
    • pp.64-79
    • /
    • 2017
  • Identifying disease genes from the human genome is a critical task in biomedical research. The important biological features that distinguish disease genes from non-disease genes have mainly been selected using traditional feature selection approaches. However, these approaches unnecessarily consider many unimportant biological features. As a result, although some existing classification techniques have been applied to disease gene identification, their prediction performance has not been satisfactory. A small set of the most important biological features can enhance the accuracy of disease gene identification, as well as provide potentially useful knowledge for biologists or clinicians, who can further investigate the selected biological features as well as the potential disease genes. In this paper, we propose a new stepwise random forests (SRF) approach for biological feature selection and disease gene identification. The SRF approach consists of two stages. In the first stage, only important biological features are iteratively selected in a forward selection manner based on one-dimensional random forest regression, where the updated residual vector is treated as the current response vector. We can then determine a small set of important biological features. In the second stage, random forests classification on the selected biological features is applied to identify disease genes. Our extensive experiments show that the proposed SRF approach outperforms existing feature selection and classification techniques in terms of biological feature selection and disease gene identification.

Hybrid Optimization for Distribution Channel Management: A Case of Retail Location Selection

  • NONG, Nhu-Mai Thi;HA, Duc-Son
    • 유통과학연구
    • /
    • Vol. 19, No. 12
    • /
    • pp.45-56
    • /
    • 2021
  • Purpose: This study introduces a hybrid MCDM model to support the selection of retail store locations. Research design, data, and methodology: A hybrid approach combining ANP and TOPSIS was used to address the location selection problem. The ANP technique was employed to compute the weights of the selection criteria, whilst TOPSIS was used to rank the alternatives. The proposed approach was then applied to a fashion company in Vietnam to select the best alternatives for the retail store. Results: The results showed that Candidate 1 - Hai Ba Trung street is the most appropriate selection for locating retail stores. Conclusions: The proposed approach provides decision makers with more useful methods than traditional ones. Therefore, the model can be applied to location selection in all industries. In terms of academic contribution, the selection criteria proposed in the research contribute to the literature on location selection and on distribution channels. Additionally, the research provides insight and guidelines for firms deciding on retail store locations with limited resources, helping them avoid wasting funds. However, the results apply only to the context of Vietnam, a developing country. Thus, future research may be extended to developed countries, which have better conditions.
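
The ranking step described in this abstract can be sketched with the standard TOPSIS procedure. This is a minimal illustration, not the authors' model: the criterion weights are assumed to be already available (in the paper they come from ANP, which is not reproduced here), and the function name, argument names, and any input data are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (higher is better).

    matrix  : one row per alternative, one column per criterion
    weights : criterion weights, assumed given (e.g. from ANP)
    benefit : True for criteria to maximise, False for cost criteria
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    # Assumes no column is all zeros.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal and anti-ideal points depend on the direction of each criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)
        d_anti = math.dist(row, anti)
        # Assumes the alternatives are not all identical (non-zero denominator).
        scores.append(d_anti / (d_ideal + d_anti))
    return scores
```

Sorting the alternatives by descending score gives the ranking. An alternative that is best on every criterion scores 1, and one that is worst on every criterion scores 0.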

Variable Selection and Outlier Detection for Automated K-means Clustering

  • Kim, Sung-Soo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 22, No. 1
    • /
    • pp.55-67
    • /
    • 2015
  • An important problem in cluster analysis is selecting the variables that define the cluster structure while eliminating noisy variables that mask it; in addition, outlier detection is a fundamental task in cluster analysis. Here we provide an automated K-means clustering process combined with variable selection and outlier identification. The automated K-means clustering procedure consists of three processes: (i) automatically calculating the number of clusters and the initial cluster centers whenever a new variable is added, (ii) identifying outliers for each cluster depending on the variables used, and (iii) selecting the variables that define the cluster structure in a forward manner. To select variables, we applied the VS-KM (variable-selection heuristic for K-means clustering) procedure (Brusco and Cradit, 2001). To identify outliers, we used a hybrid approach combining a clustering-based approach and a distance-based approach. Simulation results indicate that the proposed automated K-means clustering procedure is effective at selecting variables and identifying outliers. The implemented R program can be obtained at http://www.knou.ac.kr/~sskim/SVOKmeans.r.

Quality Driven Approach for Product Line Architecture Customization in Patient Navigation Program Software Product Line

  • Ashari, Afifah M.;Abd Halim, Shahliza;Jawawi, Dayang N.A.;Suvelayutnan, Ushananthiny;Isa, Mohd Adham
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 7
    • /
    • pp.2455-2475
    • /
    • 2021
  • The Patient Navigation Program (PNP) is considered an important implementation of health care systems that can assist in patients' treatment. Given the feasibility of PNP implementation, systematic reuse is needed for wide adoption of computerized PNP systems. Software Product Line (SPL) engineering is one of the promising systematic reuse approaches for creating a reusable architecture that enables reuse across several similar PNP applications, each of which has its own variations. However, stakeholder decision making, which is affected by the imprecise, uncertain, and subjective nature of architecture selection based on quality attributes (QA), further hinders the development of the product line architecture. Therefore, this study proposes a quality-driven approach using Multi-Criteria Decision Analysis (MCDA) techniques for Software Product Line Architecture (SPLA) to enable objective selection based on the QA of stakeholders in the PNP domain. The approach has two steps. First, a clear representation of quality is proposed by extending the feature model (FM) with QA features to determine the QA in the early phase of architecture selection. Second, MCDA techniques are applied for architecture selection based on objective preference for certain QA in the PNP domain. The result of the proposed approach is an implementation of the PNP system with an SPLA selected using MCDA techniques. The approach is evaluated by checking its applicability in a case study and through stakeholder validation. Evaluation of the ease of use and usefulness of the approach with selected stakeholders showed positive responses. The evaluation results show that the proposed approach assisted in the implementation of PNP systems.