Integrated Search | Korea Science

Biological Feature Selection and Disease Gene Identification using New Stepwise Random Forests

Hwang, Wook-Yeon
- Industrial Engineering and Management Systems
- Vol. 16, No. 1
- pp.64-79
- 2017
Identifying disease genes in the human genome is a critical task in biomedical research. The biological features that distinguish disease genes from non-disease genes have mainly been selected with traditional feature selection approaches, which unnecessarily retain many unimportant features. As a result, although several existing classification techniques have been applied to disease gene identification, their prediction performance has not been satisfactory. A small set of the most important biological features can improve the accuracy of disease gene identification and give biologists and clinicians potentially useful knowledge, since they can further investigate both the selected features and the candidate disease genes. In this paper, we propose a new stepwise random forests (SRF) approach for biological feature selection and disease gene identification. The SRF approach consists of two stages. In the first stage, important biological features are selected iteratively in a forward selection manner using one-dimensional random forest regression, where the updated residual vector serves as the current response vector; this yields a small set of important biological features. In the second stage, random forests classification on the selected features is applied to identify disease genes. Our extensive experiments show that the proposed SRF approach outperforms existing feature selection and classification techniques in both biological feature selection and disease gene identification.
https://doi.org/10.7232/iems.2017.16.1.064
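The first stage described in the abstract above (forward selection on an updated residual vector) can be sketched roughly as follows. This is only an illustration under stated assumptions: a simple binned-mean regressor stands in for the one-dimensional random forest regression, the function names and the synthetic data are hypothetical, and the second-stage random forests classifier is not shown.

```python
import random


def fit_binned_mean(x, r, n_bins=4):
    """Stand-in 1-D regressor (NOT a random forest): mean of r in equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant features

    def bin_of(v):
        return min(int((v - lo) / width), n_bins - 1)

    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, ri in zip(x, r):
        b = bin_of(xi)
        sums[b] += ri
        counts[b] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [means[bin_of(xi)] for xi in x]


def stepwise_select(columns, y, n_select):
    """Forward selection: each step fits every unused feature to the current residual."""
    residual = list(y)
    selected = []
    for _ in range(min(n_select, len(columns))):
        best = None
        for j, col in enumerate(columns):
            if j in selected:
                continue
            fitted = fit_binned_mean(col, residual)
            sse = sum((ri - fi) ** 2 for ri, fi in zip(residual, fitted))
            if best is None or sse < best[0]:
                best = (sse, j, fitted)
        _, j, fitted = best
        selected.append(j)
        # The updated residual becomes the response vector for the next step.
        residual = [ri - fi for ri, fi in zip(residual, fitted)]
    return selected


# Hypothetical data: 5 features, with the 0/1 label driven by feature 0 only.
rng = random.Random(0)
columns = [[rng.random() for _ in range(200)] for _ in range(5)]
y = [1.0 if v > 0.5 else 0.0 for v in columns[0]]
print(stepwise_select(columns, y, n_select=2))
```

In a fuller version, the features returned by `stepwise_select` would be passed to a random forests classifier for the second stage.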

On the Comparison of Particle Swarm Optimization Algorithm Performance using Beta Probability Distribution

이병석;이준화;허문범
- Journal of Institute of Control, Robotics and Systems
- Vol. 20, No. 8
- pp.854-867
- 2014
This paper compares the performance of particle swarm optimization (PSO) algorithms, which are inspired by simulations of the behavior patterns of organisms. The PSO algorithm finds the optimal solution (fitness value) of an objective function through a stochastic process. Generally, the stochastic process, a random function, appears in the velocity expression of the PSO algorithm, and a normal (Gaussian) or uniform distribution is mainly used as that random function. In this paper, however, the beta probability distribution is used instead: because its two shape parameters allow it to express a wide variety of distributions, it serves as a random function with a high degree of freedom. For the performance comparison, three functions (Rastrigin, Rosenbrock, Schwefel) were selected from the benchmark set, and convergence to the optimal solution was compared and analyzed using PSO-FIW.
https://doi.org/10.5302/J.ICROS.2014.13.0019 KSCI
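As a rough illustration of the idea in the abstract above, the sketch below runs a basic PSO with a fixed inertia weight on the Rastrigin function and draws the random factors in the velocity update from a beta distribution. All parameter values (`w`, `c1`, `c2`, the shape parameters `a` and `b`, swarm size, bounds) are assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def rastrigin(x):
    """Rastrigin benchmark function; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def pso_beta(f, dim=2, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5,
             a=2.0, b=2.0, bound=5.12, seed=0):
    """PSO with fixed inertia weight w; r1 and r2 are sampled from Beta(a, b)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]  # best fitness found so far, per iteration
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                # Beta-distributed random factors replace the usual uniform ones.
                r1 = rng.betavariate(a, b)
                r2 = rng.betavariate(a, b)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


best, best_val, history = pso_beta(rastrigin)
print(best, best_val)
```

Changing `a` and `b` reshapes the distribution of the random factors (for example, `a = b = 1` reduces to the uniform case), which is the degree of freedom the abstract refers to.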

Information Potential and Blind Algorithms Using a Biased Distribution of Random-Order Symbols

김남용
- The Journal of Korean Institute of Communications and Information Sciences
- Vol. 38A, No. 1
- pp.26-32
- 2013
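Blind algorithms based on the information potential between output samples and symbols generated in random order at the receiver suffer performance degradation when biased impulsive noise is added to the channel, because the cost function built on the information potential contains no variable for handling the biased signal. Aiming at robustness against such biased impulsive noise, this paper proposes a modified information potential and, based on it, derives a new blind algorithm that uses an augmented filter structure and random symbols. Simulation results for blind equalization of multipath channels show that the blind algorithm based on the proposed information potential achieves superior convergence performance in environments with strong biased impulsive noise.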
https://doi.org/10.7840/kics.2013.38A.1.26 KSCI
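To make the motivation in the abstract above concrete, the following sketch computes a standard Gaussian-kernel cross information potential between output samples and receiver-side symbols, and shows that a constant bias on the outputs lowers it. This is the conventional kernel estimator, not the modified information potential the paper proposes; the kernel size `sigma`, the 4-level symbol set, and the bias value are assumptions for illustration.

```python
import math


def gaussian_kernel(u, sigma):
    """Zero-mean Gaussian kernel with standard deviation sigma."""
    return math.exp(-u * u / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def cross_information_potential(outputs, symbols, sigma=0.5):
    """Average kernel value over all (output sample, symbol) pairs."""
    total = sum(gaussian_kernel(d - y, sigma) for y in outputs for d in symbols)
    return total / (len(outputs) * len(symbols))


symbols = [-3.0, -1.0, 1.0, 3.0]          # hypothetical 4-level symbol set
clean = list(symbols)                      # ideal equalizer outputs
biased = [y + 0.5 for y in clean]          # same outputs shifted by a constant bias

print(cross_information_potential(clean, symbols))
print(cross_information_potential(biased, symbols))
```

Because the estimator has no term that absorbs the offset, the biased outputs score a lower potential even though they are otherwise perfect, which is the kind of degradation the abstract attributes to biased noise.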

A Novel Random Scheduling Algorithm based on Subregions Coverage for SET K-Cover Problem in Wireless Sensor Networks

Muhammad, Zahid;Roy, Abhishek;Ahn, Chang Wook;Sachan, Ruchi;Saxena, Navrati
- KSII Transactions on Internet and Information Systems (TIIS)
- Vol. 12, No. 6
- pp.2658-2679
- 2018
This paper proposes a novel Random Scheduling Algorithm based on Subregion Coverage (RSASC) to solve the SET K-cover problem, which is NP-complete. The SET K-cover problem partitions the set of sensors into the maximum number of mutually exclusive subsets (MESSs) such that each subset can be scheduled in turn to extend the lifetime of a wireless sensor network (WSN). Sensor coverage divides the target region into subregions. RSASC first sorts the subregions in ascending order of their sensor coverage. It then groups subregions with similar sensor coverage. Finally, RSASC ensures the K-coverage of each subregion in every group by randomly scheduling the sensors. We consider the target-coverage and area-coverage applications of WSNs to analyze the usefulness of the proposed RSASC algorithm. The distinguishing quality of RSASC is that, compared with three existing algorithms, it uses 33% fewer deployed sensors to form the optimum number of MESSs and saves more than 93% of the computation time.
https://doi.org/10.3837/tiis.2018.06.012 KSCI

Use of a Machine Learning Algorithm to Predict Individuals with Suicide Ideation in the General Population

Ryu, Seunghyong;Lee, Hyeongrae;Lee, Dong-Kyun;Park, Kyeongwoo
- Psychiatry investigation
- Vol. 15, No. 11
- pp.1030-1036
- 2018
Objective: In this study, we aimed to develop a model predicting individuals with suicide ideation within a general population using a machine learning algorithm. Methods: Among 35,116 individuals aged over 19 years from the Korea National Health & Nutrition Examination Survey, we selected 11,628 individuals via random down-sampling. This included 5,814 suicide ideators and the same number of non-suicide ideators. We randomly assigned the subjects to a training set (n=10,466) and a test set (n=1,162). In the training set, a random forest model was trained with 15 features selected with recursive feature elimination via 10-fold cross validation. Subsequently, the fitted model was used to predict suicide ideators in the test set and among the total of 35,116 subjects. All analyses were conducted in R. Results: The prediction model achieved a good performance [area under receiver operating characteristic curve (AUC)=0.85] in the test set and predicted suicide ideators among the total samples with an accuracy of 0.821, sensitivity of 0.836, and specificity of 0.807. Conclusion: This study shows the possibility that a machine learning approach can enable screening for suicide risk in the general population. Further work is warranted to increase the accuracy of prediction.
https://doi.org/10.30773/pi.2018.08.27 KSCI
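The sampling step in the Methods above can be sketched as follows. The study itself was run in R; this Python version is only an illustration. It assumes that all ideators were kept and only the non-ideators were down-sampled, and the roughly 10% test fraction is inferred from the reported set sizes rather than stated in the abstract. The function name is hypothetical.

```python
import random


def downsample_and_split(labels, test_fraction=0.1, seed=0):
    """Balance classes by down-sampling the majority (label 0), then split train/test.

    Returns (train_indices, test_indices) into the original label list.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Draw as many negatives as there are positives, without replacement.
    neg_sample = rng.sample(neg, len(pos))
    idx = pos + neg_sample
    rng.shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    return idx[n_test:], idx[:n_test]


# Class counts taken from the abstract: 5,814 ideators among 35,116 individuals.
labels = [1] * 5814 + [0] * (35116 - 5814)
train, test = downsample_and_split(labels)
print(len(train), len(test))
```

With these counts the balanced sample has 11,628 individuals, and truncating 10% of that gives the reported test-set size.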

Random Pattern Testability of AND/XOR Circuits

Lee, Gueesang
- Journal of Electrical Engineering and information Science
- Vol. 3, No. 1
- pp.8-13
- 1998
ESOP (Exclusive Sum of Products) expressions often provide more compact representations of logic functions, and the implemented circuits are known to be highly testable. Motivated by the merits of using XOR (Exclusive-OR) gates in circuit design, ESOP expressions are considered as the input to logic synthesis for random pattern testability. The question addressed in this paper is whether ESOP expressions provide better random testability than the corresponding SOP expressions of a given function. Since XOR gates are used to collect the product terms of an ESOP expression, fault propagation is not affected by any other product terms in the expression. Therefore, the test set for a fault in an ESOP expression is larger than that in an SOP expression, which provides better random testability. Experimental results show that in many cases ESOP expressions require far fewer random patterns than SOP expressions.

Joint Modeling of Death Times and Counts Using a Random Effects Model

Park, Hee-Chang;Klein, John P.
- Journal of the Korean Data and Information Science Society
- Vol. 16, No. 4
- pp.1017-1026
- 2005
We consider the problem of modeling count data where the observation period is determined by the survival time of the individual under study. We assume a random effects, or frailty, model to allow for a possible association between the death times and the counts. We assume that, given a random effect, the death times follow a Weibull distribution with a rate that depends on some covariates. For the counts, given the random effect, a Poisson process is assumed with an intensity depending on time and the covariates. A gamma model is assumed for the random effect. Maximum likelihood estimators of the model parameters are obtained. The model is applied to a data set of patients with breast cancer who received a bone marrow transplant. A model for the time to death and the number of supportive transfusions a patient received is constructed, and the consequences of the model are examined.
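A simulation sketch may help to show how a shared random effect links the two outcomes in a model of this kind. Everything below is an assumption for illustration: the gamma frailty is given mean 1 and variance `theta`, it multiplies both the Weibull rate and the Poisson intensity, the intensity is held constant in time, and covariates are omitted. The abstract does not specify these details.

```python
import math
import random


def simulate_joint(n_subjects, theta=0.5, lam=0.1, alpha=1.5, mu=2.0, seed=0):
    """Simulate (death time, event count) pairs that share a gamma frailty."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        # Gamma frailty with shape 1/theta and scale theta: mean 1, variance theta.
        u = rng.gammavariate(1.0 / theta, theta)
        # Weibull death time with survival S(t | u) = exp(-u * lam * t**alpha),
        # sampled by inverting the survival function.
        v = 1.0 - rng.random()  # uniform on (0, 1]
        t_death = (-math.log(v) / (u * lam)) ** (1.0 / alpha)
        # Homogeneous Poisson process with intensity u * mu, observed until death.
        count = 0
        clock = rng.expovariate(u * mu)
        while clock < t_death:
            count += 1
            clock += rng.expovariate(u * mu)
        data.append((t_death, count))
    return data


sample = simulate_joint(5)
for t_death, count in sample:
    print(round(t_death, 2), count)
```

In this setup a larger frailty shortens survival and raises the event rate at the same time, so the observation period and the counts are dependent unless the frailty variance is zero.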

On a Stopping Rule for the Random Walks with Time Stationary Random Distribution Function

Hong, Dug-Hun;Oh, Kwang-Sik;Park, Hee-Joo
- Journal of the Korean Statistical Society
- Vol. 24, No. 2
- pp.293-301
- 1995
Sums of independent random variables $S_n = X_1 + \cdots + X_n$ are considered, where the $X_n$ are chosen according to a stationary process of distributions. For $c > 0$, let $t_c$ be the smallest positive integer $n$ such that $|S_n| > cn^{1/2}$. In this setup, we are concerned with the finiteness of the expectation of $t_c$, and we give some results for sign-invariant processes as applications.
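The stopping rule defined above can be written as a short function. The sketch below uses independent standard normal steps as the simplest special case, which is an assumption for illustration and not the stationary process of distributions studied in the paper; the walk is truncated at a maximum length, so `None` means the boundary was not crossed within that horizon.

```python
import math
import random


def stopping_time(steps, c):
    """Return the smallest n >= 1 with |S_n| > c * sqrt(n), or None if never reached."""
    s = 0.0
    for n, x in enumerate(steps, start=1):
        s += x
        if abs(s) > c * math.sqrt(n):
            return n
    return None


# Monte Carlo look at t_c for c = 0.8 with Gaussian steps (illustrative values).
rng = random.Random(1)
times = [stopping_time((rng.gauss(0.0, 1.0) for _ in range(10_000)), c=0.8)
         for _ in range(1_000)]
stopped = [t for t in times if t is not None]
print(len(stopped), sum(stopped) / len(stopped))
```

The sample mean printed here only estimates the expectation of the truncated stopping time; whether the untruncated expectation is finite is exactly the question the paper addresses.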

A HGLM framework for Meta-Analysis of Clinical Trials with Binary Outcomes

Ha, Il-Do
- Journal of the Korean Data and Information Science Society
- Vol. 19, No. 4
- pp.1429-1440
- 2008
In a meta-analysis combining the results from different clinical trials, it is important to consider the possible heterogeneity in outcomes between trials. Such variations can be regarded as random effects. Thus, random-effect models such as HGLMs (hierarchical generalized linear models) are very useful. In this paper, we propose a HGLM framework for analyzing binomial response data that may have variations in the odds ratios between clinical trials. We also present prediction intervals for the random effects, which are useful in practice for investigating the heterogeneity of the trial effects. The proposed method is illustrated with a real data set on 22 trials about respiratory tract infections. We further demonstrate that an appropriate HGLM can be confirmed via model-selection criteria.

Reliability-based fragility analysis of nonlinear structures under the actions of random earthquake loads

Salimi, Mohammad-Rashid;Yazdani, Azad
- Structural Engineering and Mechanics
- Vol. 66, No. 1
- pp.75-84
- 2018
This study presents a reliability-based analysis of nonlinear structures excited by random earthquake loads, using analytical fragility curves. The stochastic method of ground motion simulation is combined with random vibration theory to compute the structural failure probability. The formulation of structural failure probability using random vibration theory, based only on the frequency information of the excitation, provides an important basis for structural analysis in places where there is a lack of sufficient recorded ground motions. The importance of the frequency content of ground motions for the probability of structural failure is studied for different levels of nonlinear structural behavior. The set of simulated ground motions for this study is based on the results of probabilistic seismic hazard analysis. It is demonstrated that the scenario events identified by the seismic risk differ from those obtained by the disaggregation of seismic hazard. The validity of the presented procedure is evaluated by Monte Carlo simulation.
https://doi.org/10.12989/sem.2018.66.1.075 KSCI
