Search | Korea Science

Bayesian Parameter Estimation and Variable Selection in Random Effects Generalised Linear Models for Count Data

Oh, Man-Suk;Park, Tae-Sung
- Journal of the Korean Statistical Society / v.31 no.1 / pp.93-107 / 2002
Random effects generalised linear models are useful for analysing clustered count data in which responses are usually correlated. We propose a Bayesian approach to parameter estimation and variable selection in random effects generalised linear models for count data. A simple Gibbs sampling algorithm for parameter estimation is presented and a simple and efficient variable selection is done by using the Gibbs outputs. An illustrative example is provided.

A Feature Selection Method Based on Fuzzy Cluster Analysis (퍼지 클러스터 분석 기반 특징 선택 방법)

Rhee, Hyun-Sook
- The KIPS Transactions:PartB / v.14B no.2 / pp.135-140 / 2007
Feature selection is a preprocessing technique commonly used on high-dimensional data. It studies how to select a subset or ranked list of attributes used to construct models that describe the data. Feature selection methods attempt to explore the intrinsic properties of data by employing statistics or information theory. Recent developments include correlation-based methods, dimensionality reduction, and mutual information techniques. Feature selection has become the focus of much research in application areas with massive and complex data sets. In this paper, we provide a feature selection method that considers data characteristics and generalization capability. It offers a computational approach to feature selection based on fuzzy cluster analysis of attribute values and their performance measures. We apply it to a system for classifying computer viruses and compare it with a heuristic method that uses the contrast concept. Experimental results show that the proposed approach can produce a feature ranking, select features, and improve system performance.
https://doi.org/10.3745/KIPSTB.2007.14-B.2.135

The Game Selection Model for the Payoff Strategy Optimization of Mobile CrowdSensing Task

Zhao, Guosheng;Liu, Dongmei;Wang, Jian
- KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.4 / pp.1426-1447 / 2021
The payoff game between task publishers and users in mobile crowdsensing environments is an active research topic. An optimal payoff selection model based on a stochastic evolutionary game is proposed. First, the process of payoff optimization selection is modeled as a task publisher-user stochastic evolutionary game. Second, low-quality data is identified by a data quality evaluation algorithm, which improves the fitness of matching perceptual tasks to target users, so that task publishers and users can obtain the optimal payoff at the current moment. Finally, by solving for the stable strategy and analyzing the stability of the model, the optimal payoff strategy is obtained under different intensities of random interference and different initial states. Simulation results show that, for data quality evaluation, the proposed model improves the accuracy of anomalous data detection by 8.1% and 0.5% compared with the BP and SVM detection methods, respectively, and improves the accuracy of data classification by 59.2% and 32.2%, respectively. For optimal payoff strategy selection, the results verify that the proposed model can reasonably select the payoff strategy.
https://doi.org/10.3837/tiis.2021.04.013

A Study on Classifications of Remote Sensed Multispectral Image Data using Soft Computing Technique - Stressed on Rough Sets - (소프트 컴퓨팅기술을 이용한 원격탐사 다중 분광 이미지 데이터의 분류에 관한 연구 -Rough 집합을 중심으로-)

Won Sung-Hyun
- Management & Information Systems Review / v.3 / pp.15-45 / 1999
Computer-based processing of remotely sensed image data is recognized as essential in many fields, such as environmental observation, land cultivation, resource investigation, military trend analysis, and agricultural yield estimation. In particular, accurate classification and analysis of remotely sensed image data largely determine the reliability of remote sensing image processing systems, and much research has been carried out to improve this accuracy. Traditionally, such systems have processed 2 or 3 bands selected from the multiple available bands, with statistical separability or wavelength properties as the selection criteria. However, as sensing environments shift from multispectral to hyperspectral, a need has arisen for band selection based on data distribution characteristics rather than on wavelength properties or statistical separability. In this paper, a band feature extraction method using rough set theory is proposed for efficient data classification in a multispectral band environment. First, we build a lookup table from training data and analyze the properties of the experimental multispectral image data; then we select the efficient bands from the analysis results using the indiscernibility relation of rough set theory. The proposed method is applied to LANDSAT TM data from 2 June 1992. The results show clustering trends similar to those of traditional band selection by wavelength properties, which verifies that the proposed data-centered method can be used to select efficient bands even as sensing environments change to hyperspectral bands.
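
The indiscernibility relation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's procedure: the lookup table, the band names (`band1` to `band3`), the class labels, and the function names are all hypothetical, and the band values are assumed to be already discretized. Two pixels are indiscernible on a band subset when they share the same values on every band in it; a band subset is useful when each indiscernibility class contains a single land-cover class.

```python
from collections import defaultdict

def indiscernibility_classes(table, attributes):
    """Group row indices that share identical values on the chosen attributes."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in attributes)].append(idx)
    return list(groups.values())

def dependency_degree(table, attributes, decision):
    """Fraction of rows lying in indiscernibility classes with one decision value."""
    consistent = 0
    for block in indiscernibility_classes(table, attributes):
        if len({table[i][decision] for i in block}) == 1:
            consistent += len(block)
    return consistent / len(table)

# Hypothetical lookup table: discretized band values per training pixel.
table = [
    {"band1": 0, "band2": 1, "band3": 0, "cls": "water"},
    {"band1": 0, "band2": 1, "band3": 1, "cls": "water"},
    {"band1": 1, "band2": 1, "band3": 0, "cls": "forest"},
    {"band1": 1, "band2": 1, "band3": 1, "cls": "forest"},
    {"band1": 1, "band2": 0, "band3": 1, "cls": "urban"},
]

for bands in (["band1"], ["band2"], ["band3"], ["band1", "band2"]):
    print(bands, dependency_degree(table, bands, "cls"))
```

In this toy table, `band1` and `band2` together separate all classes, while `band3` alone separates none, so it would be a candidate for removal.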

Improving an Ensemble Model Using Instance Selection Method (사례 선택 기법을 활용한 앙상블 모형의 성능 개선)

Min, Sung-Hwan
- Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.105-115 / 2016
Ensemble classification involves combining individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it involves randomly drawing some of the features for each classifier in the ensemble. The instance selection technique involves selecting critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating the two. Therefore, this study proposed a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models to investigate performance in terms of classification accuracy, levels of diversity, and average classification rates of base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed other models, including the single model, the instance selection model, and the original random subspace ensemble model.
https://doi.org/10.11627/jkise.2016.39.1.105

FAFS: A Fuzzy Association Feature Selection Method for Network Malicious Traffic Detection

Feng, Yongxin;Kang, Yingyun;Zhang, Hao;Zhang, Wenbo
- KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.1 / pp.240-259 / 2020
Analyzing network traffic is the basis for dealing with network security issues. Most network security systems depend on feature selection over network traffic data, and the ability to detect malicious traffic can be improved by choosing the correct feature selection method. This paper proposes FAFS, a Fuzzy Association Feature Selection method, for network malicious traffic detection. Association rules, which reflect the relationships among different characteristic attributes of network traffic data, are mined by association analysis. The membership values of the association rules are obtained by fuzzy reasoning. The data features with the highest correlation intensity in the network data sets are identified by comparing the membership values of the association rules. The FAFS method reduces the dimension of the data features and improves the detection ability of malicious traffic detection algorithms. To verify the effect of FAFS on malicious traffic feature selection, the method is used to select data features from different datasets. The K-Nearest Neighbor, C4.5 Decision Tree, and Naïve Bayes algorithms are then tested on these datasets. FAFS is also compared with classical feature selection methods. The analysis of the experimental results shows that FAFS can significantly improve the precision and recall of malicious traffic detection, which provides a valuable reference for building network security systems.
https://doi.org/10.3837/tiis.2020.01.014

Discretization Method Based on Quantiles for Variable Selection Using Mutual Information

Cha, Woon-Ock;Huh, Moon-Yul
- Communications for Statistical Applications and Methods / v.12 no.3 / pp.659-672 / 2005
This paper evaluates the discretization of continuous variables for selecting relevant variables in supervised learning using mutual information. Three discretization methods are considered: MDL, Histogram, and 4-Intervals. The process of discretization and variable subset selection is evaluated by classification accuracy on 6 real data sets from the UCI databases. Results show that the 4-Interval discretization method, which is based on quantiles, is robust and efficient for the variable selection process. We also visually evaluate the appropriateness of the selected subset of variables.
https://doi.org/10.5351/CKSS.2005.12.3.659
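
As a rough illustration of the idea in the abstract above, the sketch below cuts each continuous variable into four intervals at its quartiles and ranks variables by their mutual information with the class label. It is a minimal sketch, not the paper's implementation: the function names, the toy data, and the use of the standard library's default quantile rule are all assumptions.

```python
from bisect import bisect_left
from collections import Counter
from math import log
from statistics import quantiles

def four_interval_bins(values):
    """Map each value to an interval index 0..3 using the sample quartiles as cut points."""
    cuts = quantiles(values, n=4)  # three cut points
    return [bisect_left(cuts, v) for v in values]

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def rank_variables(columns, labels):
    """Return (name, MI) pairs sorted from most to least informative."""
    scores = {name: mutual_information(four_interval_bins(col), labels)
              for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one variable tracks the label, the other does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "informative": [1, 2, 3, 4, 5, 6, 7, 8],
    "noise": [5, 1, 7, 3, 6, 2, 8, 4],
}
ranking = rank_variables(columns, labels)
print(ranking)
```

A variable subset could then be chosen by keeping the top-ranked variables; where to cut the ranking is a separate decision that this sketch does not address.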

H-likelihood approach for variable selection in gamma frailty models

Ha, Il-Do;Cho, Geon-Ho
- Journal of the Korean Data and Information Science Society / v.23 no.1 / pp.199-207 / 2012
Recently, variable selection methods using penalized likelihood with a shrinkage penalty function have been widely studied in various statistical models, including generalized linear models and survival models. In particular, they select important variables and estimate the coefficients of covariates simultaneously. In this paper, we develop a penalized h-likelihood method for variable selection in gamma frailty models. For this we use the smoothly clipped absolute deviation (SCAD) penalty function, which has good properties for variable selection. The proposed method is illustrated using a simulation study and a practical data set.
https://doi.org/10.7465/jkdi.2012.23.1.199
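
For reference, the SCAD penalty named in the abstract above is usually written as follows. This is the standard form from the penalized-likelihood literature, not a formula quoted from this paper; $\lambda > 0$ is the tuning parameter and $a > 2$ is a shape constant, for which $a = 3.7$ is a common choice.

```latex
p_\lambda(|\beta|) =
\begin{cases}
  \lambda |\beta|, & |\beta| \le \lambda, \\[4pt]
  \dfrac{2 a \lambda |\beta| - \beta^2 - \lambda^2}{2(a - 1)}, & \lambda < |\beta| \le a\lambda, \\[8pt]
  \dfrac{(a + 1)\lambda^2}{2}, & |\beta| > a\lambda.
\end{cases}
```

The penalty behaves like the LASSO near zero, so small coefficients are shrunk to exactly zero, and it is constant for large $|\beta|$, so large coefficients are left nearly unpenalized.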

Variable selection in Poisson HGLMs using h-likelihood

Ha, Il Do;Cho, Geon-Ho
- Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1513-1521 / 2015
Selecting relevant variables for a statistical model is very important in regression analysis. Recently, variable selection methods using a penalized likelihood have been widely studied in various regression models. The main advantage of these methods is that they select important variables and estimate the regression coefficients of the covariates, simultaneously. In this paper, we propose a simple procedure based on a penalized h-likelihood (HL) for variable selection in Poisson hierarchical generalized linear models (HGLMs) for correlated count data. For this we consider three penalty functions (LASSO, SCAD and HL), and derive the corresponding variable-selection procedures. The proposed method is illustrated using a practical example.
https://doi.org/10.7465/jkdi.2015.26.6.1513

Genomic Selection for Adjacent Genetic Markers of Yorkshire Pigs Using Regularized Regression Approaches

Park, Minsu;Kim, Tae-Hun;Cho, Eun-Seok;Kim, Heebal;Oh, Hee-Seok
- Asian-Australasian Journal of Animal Sciences / v.27 no.12 / pp.1678-1683 / 2014
This study considers a problem of genomic selection (GS) for adjacent genetic markers of Yorkshire pigs which are typically correlated. The GS has been widely used to efficiently estimate target variables such as molecular breeding values using markers across the entire genome. Recently, GS has been applied to animals as well as plants, especially to pigs. For efficient selection of variables with specific traits in pig breeding, it is required that any such variable selection retains some properties: i) it produces a simple model by identifying insignificant variables; ii) it improves the accuracy of the prediction of future data; and iii) it is feasible to handle high-dimensional data in which the number of variables is larger than the number of observations. In this paper, we applied several variable selection methods including least absolute shrinkage and selection operator (LASSO), fused LASSO and elastic net to data with 47K single nucleotide polymorphisms and litter size for 519 observed sows. Based on experiments, we observed that the fused LASSO outperforms other approaches.
https://doi.org/10.5713/ajas.2014.14236
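
The three regularized regressions compared in the abstract above differ only in their penalty terms. The forms below are the textbook least-squares versions, given here as an assumption about the general setup rather than the paper's exact formulation; $y$ is the response vector, $X$ the marker matrix with $p$ columns, and $\lambda, \lambda_1, \lambda_2 \ge 0$ are tuning parameters.

```latex
\begin{aligned}
\text{LASSO:} \quad
  & \min_{\beta} \; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2
    + \lambda \sum_{j=1}^{p} |\beta_j| \\
\text{Elastic net:} \quad
  & \min_{\beta} \; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2
    + \lambda_1 \sum_{j=1}^{p} |\beta_j|
    + \lambda_2 \sum_{j=1}^{p} \beta_j^2 \\
\text{Fused LASSO:} \quad
  & \min_{\beta} \; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2
    + \lambda_1 \sum_{j=1}^{p} |\beta_j|
    + \lambda_2 \sum_{j=2}^{p} |\beta_j - \beta_{j-1}|
\end{aligned}
```

The fused LASSO's second term penalizes differences between neighboring coefficients, which suits markers that are ordered along the genome and correlated with their neighbors.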
