Search | Korea Science

A Clustering Approach for Feature Selection in Microarray Data Classification Using Random Forest

Aydadenta, Husna;Adiwijaya, Adiwijaya
Journal of Information Processing Systems, v.14 no.5, pp.1167-1175, 2018
Microarray data play an essential role in diagnosing and detecting cancer. Microarray analysis measures gene expression levels in specific cell samples, so thousands of genes can be analyzed simultaneously. However, microarray data sets contain very few samples relative to their high dimensionality. Classifying microarray data therefore requires a dimensionality reduction step. Dimensionality reduction eliminates redundant data so that only features highly correlated with their class are used in classification. There are two types of dimensionality reduction: feature selection and feature extraction. In this paper, we use the k-means algorithm as a clustering approach to feature selection. The approach groups features with similar characteristics into one cluster, so that redundancy in the microarray data is removed. The features in each cluster are then ranked with the Relief algorithm to obtain the best-scoring element of each cluster. The best elements of all clusters are selected and used as features for classification with the Random Forest algorithm. In simulation, the proposed approach achieved accuracies of 85.87%, 98.9%, and 89% on the Colon, Lung Cancer, and Prostate Tumor datasets, respectively, which is higher than Random Forest without clustering.
https://doi.org/10.3745/JIPS.04.0087
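The pipeline described in the abstract above (cluster the features, rank them, keep the best one per cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the function names, the deterministic centroid seeding, and the basic single-neighbour Relief variant are all chosen for this example, and the final Random Forest step is omitted.

```python
import math

def transpose(rows):
    """Turn a samples-by-features matrix into a list of feature vectors."""
    return [list(col) for col in zip(*rows)]

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def relief_scores(X, y):
    """Basic Relief: a feature gains weight when it differs on the nearest
    sample of another class (miss) and agrees on the nearest sample of the
    same class (hit). Assumes every class has at least two samples."""
    weights = [0.0] * len(X[0])
    for i, xi in enumerate(X):
        hits = [x for j, x in enumerate(X) if j != i and y[j] == y[i]]
        misses = [x for j, x in enumerate(X) if y[j] != y[i]]
        hit = min(hits, key=lambda x: math.dist(x, xi))
        miss = min(misses, key=lambda x: math.dist(x, xi))
        for f in range(len(weights)):
            weights[f] += abs(xi[f] - miss[f]) - abs(xi[f] - hit[f])
    return [w / len(X) for w in weights]

def select_features(X, y, k):
    """Cluster the features, then keep the best-scoring feature of each cluster."""
    labels = kmeans(transpose(X), k)
    scores = relief_scores(X, y)
    best = {}
    for f, c in enumerate(labels):
        if c not in best or scores[f] > scores[best[c]]:
            best[c] = f
    return sorted(best.values())

# Toy data (hypothetical): features 0 and 2 track the class, 1 and 3 do not.
X = [[0.0, 0.5, 0.1, 0.6],
     [0.1, 0.9, 0.0, 0.8],
     [1.0, 0.4, 0.8, 0.5],
     [0.9, 1.0, 1.0, 0.9]]
y = [0, 0, 1, 1]
selected = select_features(X, y, k=2)
```

The selected column indices would then be used to slice the data before training a classifier; clustering first means that two near-duplicate informative features do not both survive the selection.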

Efficient Sampling of Graph Signals with Reduced Complexity (저 복잡도를 갖는 효율적인 그래프 신호의 샘플링 알고리즘)

Kim, Yoon Hak
The Journal of the Korea institute of electronic communication sciences, v.17 no.2, pp.367-374, 2022
A sampling set selection algorithm is proposed to reconstruct original graph signals from the sampled signals generated on the nodes in the sampling set. Instead of directly minimizing the reconstruction error, we minimize an upper bound on the reconstruction error to reduce the algorithm's complexity. The metric is rewritten using QR factorization to produce an upper triangular matrix, and an analytic result is presented that enables greedy selection of the next node at each iteration from the diagonal entries of that matrix, leading to an efficient sampling process with reduced complexity. Experiments on various graphs demonstrate that the proposed algorithm achieves competitive reconstruction performance while running about 3.5 times faster than one of the previous selection methods.
https://doi.org/10.13067/JKIECS.2022.17.2.367

Hepatitis C Stage Classification with hybridization of GA and Chi2 Feature Selection

Umar, Rukayya;Adeshina, Steve;Boukar, Moussa Mahamat
International Journal of Computer Science & Network Security, v.22 no.1, pp.167-174, 2022
In metaheuristic algorithms such as the Genetic Algorithm (GA), the initial population has a significant impact because it affects the time the algorithm takes to obtain an optimal solution to the given problem. It may also influence the quality of the solution obtained. In machine learning, feature selection is an important step toward a well-performing model, and GA has been used for this purpose. However, a characteristic of GA, namely random generation of the initial population from a vector of feature elements, may influence both the solution and the execution time. In this paper, a statistical algorithm (Chi2) is introduced for feature relevance checks, in which p-values of conditional independence are considered. Features with low p-values were discarded, and the relevant subset of features was passed to the Genetic Algorithm. This provides a level of certainty about the fitness of the randomly selected features. An ensemble-based learning model has been developed for Hepatitis C stage classification. 1385 samples were used from the Egyptian dataset obtained from the UCI repository. The comparative evaluation confirms a decrease in execution time and an increase in model accuracy from 56% to 63%.
https://doi.org/10.22937/IJCSNS.2022.22.1.23
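The Chi2 relevance check mentioned in the abstract above can be illustrated with a small sketch of Pearson's chi-square test on a feature-versus-class contingency table. The function names and the example counts are assumptions made for illustration. The p-value helper covers only 2x2 tables (one degree of freedom), and the thresholding rule and the hand-off to the Genetic Algorithm are not shown.

```python
import math

def chi2_statistic(table):
    """Pearson chi-square statistic for a contingency table of observed counts.
    Rows are feature values, columns are classes."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_2x2(table):
    """p-value for a 2x2 table. With one degree of freedom the chi-square
    survival function reduces to erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(chi2_statistic(table) / 2))

# Hypothetical counts: a binary feature against a binary class label.
dependent = [[10, 20], [20, 10]]     # feature value and class are associated
independent = [[10, 10], [10, 10]]   # no association at all

stat = chi2_statistic(dependent)     # every expected count is 15, so stat = 20/3
p_dep = p_value_2x2(dependent)
p_ind = p_value_2x2(independent)
```

Computing one such p-value per feature gives the ranking that a filter stage can threshold before any GA population is generated.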

A Study on the File Allocation in Distributed Computer Systems (분산 컴퓨터 시스템에서 파일 할당에 관한 연구)

홍진표;임재택
Journal of the Korean Institute of Telematics and Electronics, v.27 no.4, pp.571-579, 1990
A dynamic relocation algorithm for non-deterministic process graphs in distributed computer systems is proposed. A method is presented for determining the optimal policy for processing a process tree. A general database query request is modelled by a process tree, which represents a set of subprocesses together with their precedence relationships. The process allocation model is based on operating cost, which is a function of the selection of the site for each processing operation, the data reduction function, and the file size. By using expected values of the parameters for a non-deterministic process tree, the process graph and the optimal policy that yield minimum operating cost are determined. As processes are relocated according to a threshold value and new parameter information after the execution of low-level processes of the non-deterministic process graph, an assignment that approximates the optimal solution is obtained. The proposed algorithm is heuristic. By running the algorithm on sample problems, it is shown that the proposed algorithm performs well in obtaining the optimal solution.

Adaptation for Object-based MPEG-4 Content with Multiple Streams (다중 스트림을 이용한 객체기반 MPEG-4 컨텐트의 적응 기법)

Cha Kyung-Ae
Journal of Korea Society of Industrial Information Systems, v.11 no.3, pp.69-81, 2006
In this paper, an adaptive algorithm is proposed for streaming MPEG-4 content under fluctuating resources such as network throughput. In adaptive streaming, much research has addressed how to represent an encoded media (such as video) bitstream in a scalable way. By contrast, MPEG-4 supports object-based multimedia content composed of various types of media streams such as audio, video, images, and other graphical elements. Thus, it can be more effective to provide individual media streams in a scalable way when streaming object-based content to heterogeneous environments. The proposed method provides multiple media streams of different quality and bit rate for each object in order to support object-based scalability of MPEG-4 content. In addition, an optimal selection among the multiple streams of each object under a given constraint is proposed. The selection process is formulated as a multiple-choice knapsack problem with multi-step selection for MPEG-4 objects with different scalability levels. The proposed algorithm steers the selection process to maintain the perceptual quality of the more important objects on a best-effort basis. The experimental results show that the selected set of media streams meets the current transmission condition with higher perceptual quality.
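The multiple-choice knapsack formulation named in the abstract above can be sketched with a textbook dynamic program: exactly one stream is chosen per object so that total quality is maximal within a bit-rate budget. The function name, the integer bit rates, and the quality scores are hypothetical, and the multi-step selection and object-importance handling described in the abstract are not modelled here.

```python
def select_streams(objects, budget):
    """Multiple-choice knapsack by dynamic programming.

    objects: one list per object of (bitrate, quality) options, integer bitrates.
    budget:  maximum total bitrate.
    Returns (total_quality, chosen_option_index_per_object), or None if even
    the cheapest combination exceeds the budget.
    """
    NEG = float("-inf")
    best = [NEG] * (budget + 1)       # best[b]: max quality at total bitrate b
    best[0] = 0
    picks = [None] * (budget + 1)     # picks[b]: option indices reaching best[b]
    picks[0] = []
    for options in objects:
        new_best = [NEG] * (budget + 1)
        new_picks = [None] * (budget + 1)
        for used in range(budget + 1):
            if best[used] == NEG:
                continue              # this bitrate is unreachable so far
            for idx, (rate, quality) in enumerate(options):
                total = used + rate
                if total <= budget and best[used] + quality > new_best[total]:
                    new_best[total] = best[used] + quality
                    new_picks[total] = picks[used] + [idx]
        best, picks = new_best, new_picks
    top = max(range(budget + 1), key=lambda b: best[b])
    if best[top] == NEG:
        return None
    return best[top], picks[top]

# Hypothetical scene: each object offers streams as (bitrate, quality score).
scene = [
    [(2, 30), (4, 50), (6, 60)],   # video object
    [(1, 10), (2, 25)],            # audio object
    [(1, 5), (3, 22)],             # image object
]
result = select_streams(scene, budget=8)
too_tight = select_streams(scene, budget=3)
```

With a budget of 8 the program picks the middle video stream, the cheaper audio stream, and the richer image stream; a budget of 3 cannot fit even the cheapest stream of every object, so the function returns None.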

A Dynamic Dispatching Algorithm for Operation of Automated Guided Vehicles and Machines in CIM Systems (CIM 시스템에서 기계가공과 AGV 의 운영을 위한 동적 작업배정 알고리듬)

Kim, Jung-Wook;Rhee, Jong-Tae
Journal of Korean Institute of Industrial Engineers, v.21 no.1, pp.85-101, 1995
Automated guided vehicles (AGVs) are widely used in computer integrated manufacturing (CIM) systems for material handling. Although AGVs provide higher levels of flexibility and computer integrability, installations are limited in number, and one of the critical reasons is the complexity of their operation. The main objective of this research is to alleviate this problem by proposing efficient integrated operational control methods for AGV-based CIM systems. In particular, this research is concerned with the combined problem of dispatching AGVs and scheduling machine operations. The proposed dynamic heuristic algorithm uses various priority schemes and relevant information about the load of the system, the status of queues, and the positions of AGVs in the scheduling process. The scheduling decision process is hierarchical in the sense that different decision criteria are applied sequentially to identify the most appropriate part to be served. The algorithm consists of two sections: part selection by AGVs for the next service whenever an AGV completes its current assignment, and part selection by machines for the next service whenever a machine completes its current operation. The proposed algorithm has been compared with other scheduling schemes using mean flow-time and mean tardiness as performance measures. Simulation results indicate that the proposed algorithm can reduce mean flow-time and mean tardiness significantly.
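The hierarchical decision process described in the abstract above, where criteria are applied one after another to narrow down the candidate parts, can be sketched as follows. The abstract does not list the actual criteria, so the three used here (due date, downstream queue length, travel distance) and all field names are hypothetical placeholders.

```python
def pick_next_part(parts, criteria):
    """Apply the criteria in order. Each criterion keeps only the candidates
    with the smallest key value; later criteria only break remaining ties."""
    candidates = list(parts)
    for key in criteria:
        best = min(key(p) for p in candidates)
        candidates = [p for p in candidates if key(p) == best]
        if len(candidates) == 1:
            break
    return candidates[0]

# Hypothetical waiting parts seen by an AGV that has just become idle.
waiting = [
    {"id": "A", "due": 10, "queue": 3, "distance": 5},
    {"id": "B", "due": 8, "queue": 2, "distance": 9},
    {"id": "C", "due": 8, "queue": 2, "distance": 4},
    {"id": "D", "due": 8, "queue": 4, "distance": 1},
]

# Assumed priority order: earliest due date, then shortest queue at the
# destination machine, then shortest travel distance for the AGV.
criteria = [
    lambda p: p["due"],
    lambda p: p["queue"],
    lambda p: p["distance"],
]

chosen = pick_next_part(waiting, criteria)
```

Part D is the closest to the AGV but is eliminated at the second level, which shows how the ordering of the criteria, not any single measure, determines the dispatch decision.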

Multi-Objective Genetic Algorithm for Machine Selection in Dynamic Process Planning (동적 공정계획에서의 기계선정을 위한 다목적 유전자 알고리즘)

Choi, Hoe-Ryeon;Kim, Jae-Kwan;Lee, Hong-Chul;Rho, Hyung-Min
Journal of the Korean Society for Precision Engineering, v.24 no.4 s.193, pp.84-92, 2007
Dynamic process planning requires not only more flexible capabilities of a CAPP system but also higher utility of the generated process plans. To meet these requirements, this paper develops an algorithm that selects machines for the machining operations by calculating the machine loads. The algorithm is based on a multi-objective genetic algorithm that yields a set of optimal solutions (generally known as Pareto-optimal solutions). The objectives are to minimize the number of part movements and to maximize machine utilization. The algorithm is characterized by a new and efficient method for nondominated sorting through the K-means algorithm, which reduces the running time, and by a two-stage method for genetic operations, which maintains a diverse set of solutions. The performance of the algorithm is evaluated by comparing it with another multi-objective genetic algorithm, NSGA-II, and with a branch and bound algorithm.
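The notion of Pareto-optimal (nondominated) solutions used in the abstract above can be made concrete with a brute-force filter over the two stated objectives. This is a plain pairwise check for illustration, not the K-means-based nondominated sorting the paper proposes; the plan values are hypothetical.

```python
def dominates(a, b):
    """a and b are (part_movements, machine_utilization) pairs.
    a dominates b if it is no worse in both objectives (fewer or equal
    movements, higher or equal utilization) and strictly better in one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(plans):
    """Keep every plan that no other plan dominates (O(n^2) comparisons)."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

# Hypothetical candidate process plans: (part movements, machine utilization).
plans = [(3, 0.70), (5, 0.90), (4, 0.65), (6, 0.85), (3, 0.60)]
front = pareto_front(plans)
```

Neither surviving plan is better than the other in both objectives: one moves parts less, the other uses the machines more, which is why a multi-objective method returns a set of solutions rather than a single optimum.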

Fast Sampling Set Selection Algorithm for Arbitrary Graph Signals (임의의 그래프신호를 위한 고속 샘플링 집합 선택 알고리즘)

Kim, Yoon-Hak
The Journal of the Korea institute of electronic communication sciences, v.15 no.6, pp.1023-1030, 2020
We address the sampling set selection problem for arbitrary graph signals, in which the original graph signal is reconstructed from the signal values on the nodes in the sampling set. We introduce a variation difference as a new indirect metric that measures the error in signal variations caused by the sampling process without resorting to eigen-decomposition, which is computationally very expensive. Instead of directly minimizing the reconstruction error, we propose a simple and fast greedy selection algorithm that minimizes the variation differences at each iteration, and we justify the reasoning by showing that the principle used in the proposed process is similar to that in the previous novel technique. Experiments on various graphs show that the proposed method yields competitive reconstruction performance with substantially reduced complexity compared with the previous selection methods.
https://doi.org/10.13067/JKIECS.2020.15.6.1023

Generation of Cutting Layers and Tool Selection for 3D Pocket Machining (3차원 포켓가공을 위한 절삭층 형성 및 공구선정)

경영민;조규갑
Journal of the Korean Society for Precision Engineering, v.15 no.9, pp.101-110, 1998
In process planning for 3D pocket machining, the critical issues for optimal process planning are the generation of cutting layers and the tool selection for each cutting layer, along with other factors such as the determination of machining types, tool paths, etc. This paper describes the optimal tool selection on a single cutting layer for 2D pocket machining, the generation of cutting layers for 3D pocket machining, the determination of the thickness of each cutting layer, the determination of the tool combinations for each cutting layer, and the development of an algorithm for determining a machining sequence that reduces the number of tool exchanges, all of which are based on the backward approach. The branch and bound method is applied to select the optimal tools for each cutting layer, and an algorithmic procedure is developed to determine the machining sequence consisting of the pairs of cutting layers and cutting tools to be used in the same operation.

An Efficient Channel Selection and Power Allocation Scheme for TVWS based on Interference Analysis in Smart Metering Infrastructure

Huynh, Chuyen Khoa;Lee, Won Cheol
Journal of Communications and Networks, v.18 no.1, pp.50-64, 2016
Smart meter (SM) technology is now widely and effectively used, and power allocation (PA) and channel selection (CS) are problems for which many approaches have been proposed. In this paper, we suggest a specific scenario for an SM configuration system and show, via simulation, how to solve the optimization problem for transmission between SMs and the data concentrator unit (DCU), the center that collects the data from several SMs. An efficient CS with PA scheme is proposed for the TV white space system, which uses the TV band spectrum. On the basis of the optimal configuration requirements, SMs can obtain a transmission schedule and channel selection that make optimal use of spectrum resources when transmitting data to the DCU. The optimization goals discussed in this paper are the maximum capacity or maximum channel efficiency and the maximum allowable power of the SMs needed to satisfy the quality of service without harming other wireless systems. In addition, minimizing the interference to the digital television system and to other SMs is also important and needs to be considered when solving the coexistence scenario. Further, we propose a process that performs interference analysis using the spectrum engineering advanced Monte Carlo analysis tool (SEAMCAT), an integrated software tool based on a Monte Carlo simulation method. Briefly, the process is as follows: an optimization process implemented with a genetic evolution optimization engine, i.e., a genetic algorithm, calculates the best configuration for the SM system on the basis of the interference limitation that SEAMCAT determines for each SM in a specific configuration, reaching the solution that best satisfies the defined optimization goal.
https://doi.org/10.1109/JCN.2016.000008
