• Title/Abstract/Keywords: Method Selection

Search results: 6,590 items (processing time: 0.032 s)

A Novel Feature Selection Method in the Categorization of Imbalanced Textual Data

  • Pouramini, Jafar;Minaei-Bidgoli, Behrouze;Esmaeili, Mahdi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 12, No. 8
    • /
    • pp.3725-3748
    • /
    • 2018
  • Text data distribution is often imbalanced. Imbalanced data is one of the challenges in text classification, as it degrades classifier performance. Many studies have addressed this problem. The proposed solutions fall into several general categories, including sampling-based and algorithm-based methods. In recent studies, feature selection has also been considered as one of the solutions for the imbalance problem. In this paper, a novel one-sided feature selection method, known as probabilistic feature selection (PFS), is presented for imbalanced text classification. PFS is a probabilistic method that is calculated using the feature distribution. Compared to similar methods, PFS has more parameters. To evaluate the proposed method, the feature selection methods Gini, MI, FAST, and DFS were also implemented, and classifiers such as the C4.5 decision tree and Naive Bayes were used. Test results on Reuters-21578 and WebKB, measured by F-measure, suggest that the proposed feature selection significantly improves classifier performance.

Improvement of cluster head selection method in L-SEP

  • Jin, Seung Yeon;Jung, Kye-Dong;Lee, Jong-Yong
    • International Journal of Internet, Broadcasting and Communication
    • /
    • Vol. 9, No. 4
    • /
    • pp.51-58
    • /
    • 2017
  • This paper deals with improving the cluster head selection method in L-SEP, a hierarchical routing protocol for wireless sensor networks with heterogeneous nodes. Wireless sensor networks are classified into homogeneous and heterogeneous networks. In heterogeneous networks, SEP and L-SEP are mainly used because the cluster head selection probability differs by node type. However, because SEP-based protocols assign different cluster head selection probabilities by node type, clusters that transmit data inefficiently can be formed. To improve this, it is necessary to select the cluster head that minimizes the transmission distance between member nodes and the cluster head. Therefore, we propose a protocol that improves the cluster head selection method.
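The distance criterion described in this abstract can be illustrated with a minimal sketch. It assumes node positions are known 2-D coordinates and picks the node with the smallest total distance to the nodes of its cluster. The function name `pick_cluster_head` is hypothetical, and the sketch leaves out the probabilistic election and energy weighting of SEP and L-SEP, so it is not the protocol proposed in the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def pick_cluster_head(nodes):
    """Return the index of the node with the smallest total distance to all nodes.

    nodes: list of (x, y) positions belonging to one cluster (assumed known).
    """
    def total_distance(i):
        # Distance from candidate i to every node; its distance to itself is 0.
        return sum(dist(nodes[i], p) for p in nodes)

    return min(range(len(nodes)), key=total_distance)


cluster = [(0, 0), (10, 0), (5, 0), (5, 1)]
head = pick_cluster_head(cluster)  # index of the most central node
```

A real protocol would likely combine this distance term with residual energy and node type. Here it is isolated to show the effect of the distance criterion alone.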

An Exploration on the Use of Data Envelopment Analysis for Product Line Selection

  • Lin, Chun-Yu;Okudan, Gul E.
    • Industrial Engineering and Management Systems
    • /
    • Vol. 8, No. 1
    • /
    • pp.47-53
    • /
    • 2009
  • We define the product line (or mix) selection problem as selecting a subset of potential product variants that can simultaneously minimize product proliferation and maintain market coverage. Selecting the most efficient product mix is a complex problem that requires analysis of multiple criteria. This paper proposes a method based on Data Envelopment Analysis (DEA) for product line selection. DEA is a linear-programming-based technique commonly used for measuring the relative performance of a group of decision-making units with multiple inputs and outputs. Although DEA has proved to be an effective evaluation tool in many fields, it has not been applied to the product line selection problem. In this study, we construct a five-step method that systematically adopts DEA to solve a product line selection problem. We then apply the proposed method to an existing line of staplers to provide quantitative evidence that helps managers make decisions that maximize company profits while also fulfilling market demands.

Incremental Antenna Selection Based on Lattice-Reduction for Spatial Multiplexing MIMO Systems

  • Kim, Sangchoon
    • Journal of the Korean Institute of Information Technology, English edition (한국정보기술학회 영문논문지)
    • /
    • Vol. 10, No. 1
    • /
    • pp.1-14
    • /
    • 2020
  • Antenna selection is a method to enhance the performance of spatial multiplexing multiple-input multiple-output (MIMO) systems, which can achieve the diversity order of the full MIMO systems. Although various selection criteria have been studied in the literature, they should be adjusted to the detection operation implemented at the receiver. In this paper, antenna selection methods that optimize the post-processing signal-to-noise ratio (SNR) and eigenvalue are considered for the lattice reduction (LR)-based receiver. To develop a complexity-efficient antenna selection algorithm, an incremental selection strategy is adopted. Moreover, to improve performance, an additional iterative selection method is presented in combination with the incremental strategy.

A Study on Unbiased Methods in Constructing Classification Trees

  • Lee, Yoon-Mo;Song, Moon Sup
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 9, No. 3
    • /
    • pp.809-824
    • /
    • 2002
  • We propose two methods that separate the variable selection step from the split-point selection step. We call these two algorithms the CHITES method and the F&CHITES method. They adopt some of the best characteristics of CART, CHAID, and QUEST. In the first step, the variable that is most significant for predicting the target class values is selected. In the second step, an exhaustive search is applied to find the split point for the variable selected in the first step. We compared the proposed methods with CART and QUEST in terms of variable selection bias and power, error rates, and training times. The proposed methods are not only unbiased in the null case, but also powerful in selecting the correct variables in non-null cases.

On an Optimal Bayesian Variable Selection Method for Generalized Logit Model

  • Kim, Hea-Jung;Lee, Ae Kuoung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 7, No. 2
    • /
    • pp.617-631
    • /
    • 2000
  • This paper suggests a Bayesian method for variable selection in the generalized logit model. It is based on the Laplace-Metropolis algorithm, which provides a simple method for estimating the marginal likelihood of the model. The algorithm then leads to a criterion for the selection of variables. The criterion is to find the subset of variables that maximizes the marginal likelihood of the model, and it is shown to be a Bayes rule in the sense that it minimizes the risk of the variable selection under the 0-1 loss function. The suggested method is illustrated with two examples and compared with existing frequentist methods.

Exploring an Optimal Feature Selection Method for Effective Opinion Mining Tasks

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)
    • /
    • Vol. 24, No. 2
    • /
    • pp.171-177
    • /
    • 2019
  • This paper aims to find the most effective feature selection method for opinion mining tasks. Opinion mining belongs to sentiment analysis, which categorizes the opinions in online texts as positive or negative from a text mining point of view. Using a dataset of five product groups (apparel, books, DVDs, electronics, and kitchen), TF-IDF and Bag-of-Words (BOW) are calculated to form the product review feature sets. Next, we applied the feature selection methods to see which method yields the most robust results. The results show that the stacking classifier based on the features chosen by the Information Gain feature selection method yields the best result.
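Information Gain feature selection, named in this abstract, can be sketched as follows. This is a minimal illustration that assumes each review is reduced to a set of terms with one sentiment label per review. The function names and the toy reviews are hypothetical and do not come from the paper, and the stacking classifier is not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs in a document.

    docs: list of sets of terms; labels: parallel list of class labels.
    """
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (present, absent) if part)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                  reverse=True)[:k]


# Toy data (made up for illustration only).
reviews = [{"good", "book"}, {"good", "dvd"}, {"bad", "book"}, {"bad", "dvd"}]
sentiment = ["pos", "pos", "neg", "neg"]
selected = top_k_terms(reviews, sentiment, 2)
```

In this toy data the sentiment words separate the classes perfectly and the product words carry no information, so the sentiment words are the ones selected.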

Automatic Multithreshold Selection Method

  • 이한;박래홍
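    • Korean Institute of Electrical Engineers (대한전기학회): Conference Proceedings
    • /
    • Proceedings of the 1987 KIEE Electrical and Electronics Engineering Conference (II)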
    • /
    • pp.1371-1374
    • /
    • 1987
  • This paper presents a new automatic multithreshold selection method based on the threshold selection method proposed by Otsu. The method can overcome some of the limitations of Otsu's method. An optimal threshold is selected by the new criterion so as to maximize the separability in all subregions. To get multiple thresholds, the procedure may be applied recursively to the resultant classes, which are determined by the proposed evaluation measure.
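For orientation, the baseline this abstract builds on can be sketched as below: Otsu's single threshold, chosen to maximize between-class variance, followed by a naive recursive split. The abstract does not specify the paper's new criterion or its evaluation measure, so the recursion here simply splits every class to a fixed depth. That rule, and the name `multi_thresholds`, are assumptions made for illustration.

```python
def otsu_threshold(hist):
    """Return the level t that maximizes between-class variance.

    hist[i] is the pixel count at gray level i; class 0 holds levels <= t.
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count in class 0
    sum0 = 0  # sum of gray levels in class 0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no valid split here
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def multi_thresholds(hist, lo, hi, depth):
    """Recursively split gray levels lo..hi (inclusive) up to `depth` times.

    Assumption: every class is split; the paper instead picks classes
    using its own evaluation measure.
    """
    sub = hist[lo:hi + 1]
    if depth == 0 or sum(1 for h in sub if h > 0) < 2:
        return []
    t = lo + otsu_threshold(sub)
    return (multi_thresholds(hist, lo, t, depth - 1) + [t]
            + multi_thresholds(hist, t + 1, hi, depth - 1))


histogram = [5, 5, 0, 0, 0, 0, 5, 5]  # toy bimodal histogram
thresholds = multi_thresholds(histogram, 0, 7, 1)
```

With `depth=1` this reduces to plain Otsu thresholding. Larger depths yield more thresholds, but without a stopping rule tied to separability they can over-segment.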

Classifying Cancer Using Partially Correlated Genes Selected by Forward Selection Method

  • 유시호;조성배
    • Journal of the Institute of Electronics Engineers of Korea SP (대한전자공학회논문지SP)
    • /
    • Vol. 41, No. 3
    • /
    • pp.83-92
    • /
    • 2004
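  • Gene expression data are microarray measurements of samples taken from specific tissues of an organism, in which the expression levels of genes are recorded as numerical values. Because the expression levels of relevant genes generally differ between normal and abnormal tissue, cancer can be classified from gene expression data. However, not all genes are involved in the classification, so efficient cancer classification requires feature selection, the task of picking out a small number of relevant genes. In this paper, we propose a method that selects genes using the forward selection method, one of the variable selection methods of regression analysis, and then classifies with them. The method minimizes redundant information among the selected genes and thereby selects genes more effectively for cancer classification. The colon cancer dataset was used as experimental data, and k-nearest neighbors (KNN) was used as the classifier. A comparison with feature selection methods based on correlation coefficients, namely the Pearson and Spearman correlation coefficients, confirmed that feature selection by forward selection selects genes more effectively for cancer classification. The experiments showed a high recognition rate of 90.3%. Additional experiments on lymphoma data confirmed the usefulness of forward selection.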

Link Adaptation and Selection Method for OFDM Based Wireless Relay Networks

  • Can, Basak;Yomo, Hiroyuki;Carvalho, Elisabeth De
    • Journal of Communications and Networks
    • /
    • Vol. 9, No. 2
    • /
    • pp.118-127
    • /
    • 2007
  • We propose a link adaptation and selection method for the links constituting an orthogonal frequency division multiplexing (OFDM) based wireless relay network. The proposed link adaptation and selection method selects the forwarding, modulation, and channel coding schemes providing the highest end-to-end throughput and decides whether to use the relay or not. The link adaptation and selection is done for each sub-channel based on instantaneous signal to interference plus noise ratio (SINR) conditions in the source-to-destination, source-to-relay and relay-to-destination links. The considered forwarding schemes are amplify and forward (AF) and simple adaptive decode and forward (DF). Efficient adaptive modulation and coding decision rules are provided for various relaying schemes. The proposed end-to-end link adaptation and selection method ensures that the end-to-end throughput is always larger than or equal to that of transmissions without relay and non-adaptive relayed transmissions. Our evaluations show that over the region where relaying improves the end-to-end throughput, the DF scheme provides significant throughput gain over the AF scheme provided that the error propagation is avoided via error detection techniques. We provide a frame structure to enable the proposed link adaptation and selection method for orthogonal frequency division multiple access (OFDMA)-time division duplex relay networks based on the IEEE 802.16e standard.
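The selection rule in this abstract, which picks per sub-channel whichever option gives the highest end-to-end throughput and keeps direct transmission as a candidate, can be sketched as follows. The throughput numbers, the scheme labels used as dictionary keys, and the function name `select_per_subchannel` are hypothetical. The paper derives throughput from SINR and its modulation and coding rules, which are not reproduced here.

```python
def select_per_subchannel(direct, relayed):
    """Choose the best transmission option for each sub-channel.

    direct:  list of estimated throughputs without relay, one per sub-channel.
    relayed: list of dicts {scheme_name: estimated end-to-end throughput},
             one per sub-channel (for example keys "AF" and "DF").
    Returns a list of (chosen_option, throughput) tuples.
    """
    decisions = []
    for d, options in zip(direct, relayed):
        # Direct transmission is always a candidate, so the chosen
        # throughput can never fall below the no-relay throughput.
        candidates = {"direct": d, **options}
        best = max(candidates, key=candidates.get)
        decisions.append((best, candidates[best]))
    return decisions


# Made-up throughput estimates for two sub-channels.
choices = select_per_subchannel(
    direct=[2.0, 1.0],
    relayed=[{"AF": 1.5, "DF": 1.8}, {"AF": 1.2, "DF": 1.6}],
)
```

Because `"direct"` is inserted first, ties resolve in favor of not using the relay, which avoids spending relay resources for no gain.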