• Title/Summary/Keyword: Feature Evaluation and Selection

Electricity Demand Forecasting based on Support Vector Regression (Support Vector Regression에 기반한 전력 수요 예측)

  • Lee, Hyoung-Ro;Shin, Hyun-Jung
    • IE interfaces
    • /
    • v.24 no.4
    • /
    • pp.351-361
    • /
    • 2011
  • Electricity demand forecasting has difficulty adapting to abrupt weather changes that accompany radical shifts in major regional and global climates. This has led to increasing attention to research on immediate and accurate forecasting models. Technically, this implies that a model should require only a few input variables, all of which are easily obtainable, while its predictive performance remains comparable with that of competing models. To this end, this paper presents an energy demand forecasting model that uses the variable selection or extraction methods of data mining to select only relevant input variables, and employs support vector regression for accurate prediction. It also proposes a novel performance measure for time-series prediction, the shift index, and describes the preprocessing procedure. The proposed method was compared with other representative data mining models, namely an auto-regression model, an artificial neural network model, and an ordinary support vector regression model, in forecasting monthly electricity demand from 2000 to 2008 based on data provided by the Korea Energy Economics Institute. Among the models tested, the proposed method showed more promising results than the others.

Extraction and classification of characteristic information of malicious code for an intelligent detection model (지능적 탐지 모델을 위한 악의적인 코드의 특징 정보 추출 및 분류)

  • Hwang, Yoon-Cheol
    • Journal of Industrial Convergence
    • /
    • v.20 no.5
    • /
    • pp.61-68
    • /
    • 2022
  • In recent years, malicious code has been produced using ever-developing information and communication technology, and existing detection systems are insufficient to detect it. To detect and respond to such intelligent malicious code accurately and efficiently, an intelligent detection model is required, and to maximize detection performance it is important to train the model on the main characteristic information set of the malicious code. In this paper, we propose a technique for designing an intelligent detection model and for generating the data required for model training as a set of key feature information through transformation, dimensionality reduction, and feature selection steps. Based on this, the main characteristic information was classified by malicious code. In addition, from the classified characteristic information we derived common characteristic information that can be used to analyze and detect modified or newly emerging malicious code. Because the proposed detection model detects malicious code by learning from a limited amount of characteristic information, detection and response are fast, so damage can be greatly reduced. Although the performance evaluation results differ slightly depending on the learning algorithm, the evaluation showed that most malicious code can be detected.

Using Genetic Algorithms in Wireless Mesh Network Routing Protocol Design (유전 알고리즘을 이용한 무선 메쉬 네트워크에서의 라우팅 프로토콜 설계)

  • Yoon, Chang-Pyo;Ryou, Hwang-Bin
    • The KIPS Transactions:PartC
    • /
    • v.18C no.3
    • /
    • pp.179-186
    • /
    • 2011
  • Wireless mesh network technology establishes a wireless network whose transmission speed is similar to that of a wired system, and provides greater flexibility in building the network than existing wired networks. In addition, it is characterized by low mobility and few energy constraints. However, many issues must be considered, such as system overhead when setting up or selecting multiple paths. Accordingly, attention has focused on network design and optimization that reflect these characteristics, and on path establishment technology. This paper proposes a method for configuring paths in wireless mesh network routing that applies a node service evaluation value, using the data loss rate, the bandwidth and hop count of links, and the traffic status of nodes, and that considers link performance and load in the fitness evaluation function, in order to handle multi-path configuration effectively.

Evaluation of the Feature Selection function of Latent Semantic Indexing(LSI) Using a kNN Classifier (잠재의미색인(LSI) 기법을 이용한 kNN 분류기의 자질 선정에 관한 연구)

  • Park, Boo-Young;Chung, Young-Mee
    • Proceedings of the Korean Society for Information Management Conference
    • /
    • 2004.08a
    • /
    • pp.163-166
    • /
    • 2004
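  • Feature selection methods that have been used frequently and have shown good performance in previous text categorization research include document frequency and the chi-square statistic. However, they have the drawback that they cannot remove the ambiguity inherent in words themselves. In this study, in categorization experiments using a kNN classifier, we propose latent semantic indexing as a feature selection method, which analyzes the concepts of words rather than the words themselves by automatically deriving the interrelationships among words.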
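
The two baseline criteria named in this abstract, document frequency and the chi-square statistic, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, documents are assumed to be sets of tokens, and the chi-square form is the common 2x2 contingency-table version for one term and one category.

```python
def document_frequency(docs, term):
    """Number of documents (each a set of tokens) that contain the term."""
    return sum(1 for doc in docs if term in doc)


def chi_square(a, b, c, d):
    """Chi-square score of a term for a category from a 2x2 table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table carries no evidence of association.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


# Hypothetical toy corpus: each document is a set of tokens.
docs = [{"svm", "kernel"}, {"svm", "text"}, {"knn", "text"}]
df_svm = document_frequency(docs, "svm")
score_perfect = chi_square(10, 0, 0, 10)  # term occurs only inside the category
score_none = chi_square(5, 5, 5, 5)       # term is independent of the category
```

A higher chi-square score suggests a stronger association between term and category; terms would typically be ranked by score and the top ones kept.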

A Method of Feature Selection and Evaluation for Resolving Estrangement of Welfare (복지사각지대 예측을 위한 속성 선택 및 평가 기법)

  • Lee, Keonsoo;Park, Junho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.10a
    • /
    • pp.671-673
    • /
    • 2019
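  • Welfare blind spot discovery is a government system that finds and supports citizens who need welfare services but are not receiving benefits. To increase the utility of this welfare blind spot discovery and management system, this study proposes a method for selecting and evaluating the attributes used in the discovery process, as a way to improve the accuracy of the discovery prediction model. The proposed method consists of three steps: verifying the validity of each attribute, selecting a set of verified attributes, and determining the prediction criteria. The method is expected to be applicable to improving existing welfare blind spot prediction models.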

Integration rough set theory and case-base reasoning for the corporate credit evaluation (러프집합이론과 사례기반추론을 결합한 기업신용평가 모형)

  • Roh, Tae-Hyup;Yoo Myung-Hwan;Han In-Goo
    • The Journal of Information Systems
    • /
    • v.14 no.1
    • /
    • pp.41-65
    • /
    • 2005
  • Credit rating is a significant area of financial management which is of major interest to practitioners and to financial and credit analysts. The components of credit rating are identified and decision models are developed to assess credit ratings and the corresponding creditworthiness of firms as accurately as possible. Although many early studies sought to demonstrate a priori which of these techniques would be most effective for a specific classification problem, a number of recent studies have demonstrated that hybrid models integrating artificial intelligence approaches with feature selection algorithms can be alternative methodologies for business classification problems. In this article, we propose a hybrid approach using rough set theory as an alternative methodology to select appropriate attributes for case-based reasoning. This model uses rough set theory to extract knowledge that can guide effective retrieval of useful cases. Our specific interest lies in the stable combination of rough set theory and case-based reasoning in the problem of corporate credit rating. In addition, we summarize the background of applying integrated models in the field of corporate credit rating, with a brief description of various credit rating methodologies.

Performance Improvement of Fast Speaker Adaptation Based on Dimensional Eigenvoice and Adaptation Mode Selection (차원별 Eigenvoice와 화자적응 모드 선택에 기반한 고속화자적응 성능 향상)

  • 송화전;이윤근;김형순
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.1
    • /
    • pp.48-53
    • /
    • 2003
  • The eigenvoice method is known to be adequate for fast speaker adaptation, but it shows hardly any additional improvement as the amount of adaptation data increases. To deal with this problem, we propose a modified method that estimates the weights of eigenvoices in each feature vector dimension. We also propose an adaptation mode selection scheme in which the method with the highest performance among several adaptation methods is selected according to the amount of adaptation data. We used POW DB to construct the speaker-independent model and eigenvoices; utterances (ranging from 1 to 50) from PBW 452 DB were used for adaptation and the remaining 400 utterances for evaluation. As the amount of adaptation data increased, the proposed dimensional eigenvoice method showed higher performance than both the conventional eigenvoice method and MLLR. Adaptation mode selection between the eigenvoice and dimensional eigenvoice methods reduced the word error rate by up to 26% compared with the conventional eigenvoice method.

Fuzzy discretization with spatial distribution of data and Its application to feature selection (데이터의 공간적 분포를 고려한 퍼지 이산화와 특징선택에의 응용)

  • Son, Chang-Sik;Shin, A-Mi;Lee, In-Hee;Park, Hee-Joon;Park, Hyoung-Seob;Kim, Yoon-Nyun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.20 no.2
    • /
    • pp.165-172
    • /
    • 2010
  • In clinical data mining, choosing the optimal subset of features is important, not only to reduce computational complexity but also to improve the usefulness of the model constructed from the given data. Moreover, the threshold values (i.e., cut-off points) of selected features are used in experts' clinical decision criteria for the differential diagnosis of diseases. In this paper, we propose a fuzzy discretization approach, which is evaluated by measuring the degree of separation of redundant attribute values in the overlapping region, based on the spatial distribution of data with continuous attributes. The weighted average of the redundant attribute values is then used to determine the threshold value for each feature, and rough set theory is used to select a subset of relevant features from the overall feature set. To verify the validity of the proposed method, we compared experimental results on a classification problem using data from 668 patients with a chief complaint of dyspnea, obtained with three discretization methods (equal-width, equal-frequency, and entropy-based) and with the proposed discretization method. The experimental results confirm that discretization methods with fuzzy partitions give better results than those with hard partitions on two evaluation measures, average classification accuracy and G-mean.
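
Two of the hard-partition baselines named in this abstract (equal-width and equal-frequency discretization) and the G-mean measure can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function names are hypothetical, and G-mean is taken in its common binary form, the square root of sensitivity times specificity, which the abstract does not spell out.

```python
import math


def equal_width_bins(values, k):
    """Assign each value to one of k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]
    # min(..., k - 1) keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign values to k bins that hold (nearly) the same number of points.

    Ties are split by position, so equal values may fall into different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for 0/1 labels.

    Assumes both classes occur in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))


# Hypothetical skewed attribute: the two methods partition it very differently.
values = [1, 2, 3, 4, 100, 200]
width_bins = equal_width_bins(values, 2)
freq_bins = equal_frequency_bins(values, 2)
score = g_mean([1, 1, 0, 0], [1, 0, 0, 0])
```

On skewed data, equal-width binning can leave most points in a single bin while equal-frequency binning balances the counts, which is one reason the choice of cut-off points matters.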

Methodology for Prioritizing Sidewalk Construction among 100 Candidate Sites on Rural National Highways (지방부 국도에서의 보도설치 우선순위 결정을 위한 방법론 개발 (일반국도 적용사례 중심으로))

  • Jeon, Woo Hoon;Yang, Choong Heon;Yoon, Jung Eun;Yang, Inchul
    • International Journal of Highway Engineering
    • /
    • v.17 no.4
    • /
    • pp.127-133
    • /
    • 2015
  • PURPOSES: The purpose of this study is to develop a methodology to prioritize sidewalk construction on rural national highways. METHODS: To determine an appropriate prioritization for sidewalk construction, we developed a specific methodology. The proposed methodology includes three main steps: 1) Analytic Hierarchy Process (AHP) methods, 2) subjective evaluation by relevant road agencies of the candidate sidewalks along rural national highways, and 3) a field study. Each step has four phases. The primary feature of this methodology is the addition of expert consultation and survey data, as well as a field study. In addition, the method allows flexibility in the selection of evaluation criteria. As a result, the proposed methodology could be used as a general procedure for other roadway classifications when considering sidewalk construction. RESULTS: To demonstrate the reasonableness of the proposed methodology, a case study was performed for 100 candidate sites for sidewalk construction on rural national highways. All required evaluation scores were produced for each candidate site. Decision-makers can thus determine the priority for sidewalk construction at these sites by reviewing both quantitative and qualitative data. CONCLUSIONS: The results of the case study can be applied to a long-term fundamental plan for sidewalk construction on rural national highways. Furthermore, this methodology could be employed to prioritize small-scale SOC projects (e.g., bicycle or pedestrian roads).

A Study of Evaluation of the Feature from Cooccurrence Matrix and Appropriate Applicable Resolution (공기행렬의 질감특성치들에 대한 평가와 적정 적용해상도에 관한 연구)

  • Kwon, Oh-Hyoung;Kim, Yong-Il;Eo, Yang-Dam
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.8 no.1 s.15
    • /
    • pp.105-110
    • /
    • 2000
  • Since the advent of high-resolution satellite imagery, the possibilities of applying various human interpretation mechanisms to these images have increased, and many studies of these possibilities have been carried out in fields such as computer vision, pattern recognition, artificial intelligence, and remote sensing. In these studies, texture is defined as a quantity related to the spatial distribution of brightness and tone, and it plays an important role in image interpretation. In particular, methods of obtaining texture with statistical models have been studied intensively. Among these, the texture measurement method based on the cooccurrence matrix is highly regarded because texture features are easy to calculate compared with other methods. In addition, it yields high classification accuracy when applied to satellite images and aerial photographs. However, in existing studies using the cooccurrence matrix, features have been chosen arbitrarily without considering feature variation, and there have not been enough studies on selecting an appropriate resolution at which the cooccurrence matrix can extract texture. Therefore, this study reviews the concept of the cooccurrence matrix as a texture measurement method, evaluates the usefulness of several features obtained from the cooccurrence matrix, and proposes an appropriate resolution by investigating the variance trend of several features.
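
The cooccurrence matrix that this abstract builds on can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the one-pixel horizontal offset, and the choice of contrast and energy (angular second moment) as example features are assumptions, since the abstract does not say which features or offsets were evaluated.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count gray-level pairs (image[r][c], image[r + dy][c + dx])."""
    glcm = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                glcm[image[r][c]][image[r2][c2]] += 1
    return glcm


def normalize(glcm):
    """Turn pair counts into joint probabilities that sum to 1."""
    total = sum(sum(row) for row in glcm)
    return [[v / total for v in row] for row in glcm]


def contrast(p):
    """Sum of p[i][j] * (i - j)^2; large when neighbors differ strongly."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Angular second moment: sum of squared probabilities."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4x4 image quantized to 4 gray levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence_matrix(image, levels=4)
p = normalize(glcm)
```

Resampling the image to a coarser resolution before building the matrix changes these feature values, which is the kind of variation across resolutions the study examines.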