Integrated Search | Korea Science

A Clustering Approach for Feature Selection in Microarray Data Classification Using Random Forest

Aydadenta, Husna;Adiwijaya, Adiwijaya
- Journal of Information Processing Systems, Vol. 14, No. 5, pp. 1167-1175, 2018
Microarray data play an essential role in diagnosing and detecting cancer. Microarray analysis allows the examination of gene expression levels in specific cell samples, where thousands of genes can be analyzed simultaneously. However, microarray data have very few samples and very high dimensionality. Therefore, to classify microarray data, a dimensionality reduction process is required. Dimensionality reduction can eliminate redundancy in the data, so that only features that are highly correlated with their class are used in classification. There are two types of dimensionality reduction, namely feature selection and feature extraction. In this paper, we used the k-means algorithm as the clustering approach for feature selection. The proposed approach groups features that have the same characteristics into one cluster, so that redundancy in the microarray data is removed. The features in each cluster are ranked using the Relief algorithm, such that the best-scoring element of each cluster is obtained. The best elements of all clusters are selected and used as features in the classification process. Next, the Random Forest algorithm is used. Based on the simulation, the proposed approach achieved accuracies of 85.87%, 98.9%, and 89% on the Colon, Lung Cancer, and Prostate Tumor datasets, respectively. The accuracy of the proposed approach is therefore higher than that of the approach using Random Forest without clustering.
https://doi.org/10.3745/JIPS.04.0087
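The selection pipeline described in this abstract (cluster the features with k-means, score them with Relief, keep the best-scoring feature of each cluster) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the deterministic Relief variant that visits every sample instead of drawing a random subset, and the toy data are all assumptions, and the final Random Forest classification step is omitted.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def relief(X, y):
    """Simplified Relief (assumption: every sample is visited once).

    Needs at least two classes and at least two samples per class.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: sq_dist(X[i], X[j]))
        miss = min(misses, key=lambda j: sq_dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest miss
            # and agree on the nearest hit.
            weights[f] += (abs(X[i][f] - X[miss][f])
                           - abs(X[i][f] - X[hit][f])) / n
    return weights

def select_features(X, y, k, seed=0):
    """Cluster the feature columns, then keep the top Relief feature per cluster."""
    columns = [list(col) for col in zip(*X)]  # one point per feature
    clusters = kmeans(columns, k, seed=seed)
    scores = relief(X, y)
    best = {}
    for f, c in enumerate(clusters):
        if c not in best or scores[f] > scores[best[c]]:
            best[c] = f
    return sorted(best.values())

# Toy data (hypothetical): features 0 and 1 are near-duplicates that track
# the class; features 2 and 3 are near-duplicates that do not.
X = [[0, 0.1, 5, 5.1], [0, 0.0, 3, 3.0], [0, 0.1, 4, 4.1],
     [1, 0.9, 3, 2.9], [1, 1.0, 5, 5.0], [1, 1.1, 4, 4.0]]
y = [0, 0, 0, 1, 1, 1]
selected = select_features(X, y, k=2)
```

In this toy case one feature is kept from each redundant pair, which is the redundancy removal the abstract describes. The selected columns would then be passed to a classifier such as Random Forest.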

Neural and MTS Algorithms for Feature Selection

Su, Chao-Ton;Li, Te-Sheng
- International Journal of Quality Innovation, Vol. 3, No. 2, pp. 113-131, 2002
The relationships among multi-dimensional data (such as medical examination data) with ambiguity and variation are difficult to explore. The traditional approach to building a data classification system requires the formulation of rules by which the input data can be analyzed. Formulating such rules is very difficult with large sets of input data. This paper first describes two classification approaches, a back-propagation (BP) neural network and a Mahalanobis distance (MD) classifier, and then proposes two approaches for multi-dimensional feature selection. The first is a feature selection procedure based on the trained BP neural network. The basic idea of this procedure is to compare the products of the weights between the input and hidden layers and between the hidden and output layers. To simplify the structure, only the weight products with large absolute values are used. The second approach is the Mahalanobis-Taguchi system (MTS), originally suggested by Dr. Taguchi. The MTS applies Taguchi's fractional factorial design with the Mahalanobis distance as the performance metric. We combine automatic thresholding with MD so that it can deal with a reduced model, which is the focus of this paper. Two case studies are used as examples to compare and discuss the complete and reduced models employing the BP neural network and the MD classifier. The implementation results show that the proposed approaches are effective and powerful for classification.
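For reference, the Mahalanobis distance that underlies both the MD classifier and the MTS is usually written as below. The symbols are not taken from the abstract and are assumptions for illustration: x is an observation with k features, \mu and \Sigma are the mean vector and covariance matrix of the reference group, z is the standardized observation, and R is the correlation matrix of the reference group. The scaled form divided by k is the one commonly used in MTS.

```latex
D^2(x) = (x - \mu)^{\top} \Sigma^{-1} (x - \mu),
\qquad
\mathrm{MD}(z) = \frac{1}{k}\, z^{\top} R^{-1} z,
\quad
z_i = \frac{x_i - \mu_i}{\sigma_i}
```

With this scaling, observations from the reference group have an average MD close to 1, so a threshold on MD separates typical from atypical observations.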

OptiNeural System for Optical Pattern Classification

Kim, Myung-Soo
- Journal of Electrical Engineering and Information Science, Vol. 3, No. 3, pp. 342-347, 1998
An OptiNeural system is developed for optical pattern classification. It is a novel hybrid system consisting of an optical processor and a multilayer neural network. It takes advantage of the two-dimensional processing capability of an optical processor and the nonlinear mapping capability of a neural network. The optical processor, with a binary phase-only filter, is used as a preprocessor for feature extraction, and the neural network is used as a decision system through mapping. The OptiNeural system is trained for optical pattern classification using a simulated annealing algorithm. Its classification performance for grey-tone texture patterns is excellent, while a conventional optical system shows poor classification performance.

Classification and Multidimensional Analysis of Plant Communities in Mt. Moak Provincial Park, Korea

Kim, Jeong-Un;Yim, Yang-Jai
- The Korean Journal of Ecology, Vol. 16, No. 1, pp. 1-15, 1993
Ordination and classification techniques were used to analyze the forest communities and to examine the integration problem of community-to-ecological species group in Mt. Moak Provincial Park, Korea. Phytosociological classification based on floristic composition produced seven communities, including Zelkova serrata and Carpinus densiflora. These seven communities were well discriminated in the two-dimensional analyses of soil moisture, soil organic matter content, and temperature (elevation), reciprocally, and also in the three-dimensional space of the three environmental factors. They corresponded to seven ecological groups derived from the distribution pattern analysis of species populations on this mountain.

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections

김무성;김남규
- 지능정보연구 (Journal of Intelligence and Information Systems), Vol. 27, No. 3, pp. 175-197, 2021
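Recently, with advances in deep learning, research applying deep learning to the analysis of text contained in various documents such as news articles and blogs has been actively conducted. Among the many applications of text analysis, text classification is the technique most widely used in both academia and industry. Text classification tasks include binary and multi-class classification, in which each instance has a single correct label, and multi-label classification, in which each instance has several correct labels. Because multiple correct labels exist, multi-label classification requires a training method different from that of ordinary classification. Moreover, multi-label classification is regarded as a hard problem in data science because prediction becomes more difficult as the number of labels and classes grows. To address this, label embedding is widely used: a large set of labels is compressed, the compressed labels are predicted, and the predicted compressed labels are restored to the original labels. Label embedding based on the autoencoder, a deep learning model, is a representative technique for this purpose, but it suffers from substantial information loss when compressing a high-dimensional label space with a very large number of classes into a low-dimensional latent label space. This study therefore proposes a label embedding method that adds skip connections to both the encoder and the decoder of the autoencoder in order to minimize information loss during compression of the high-dimensional label space. In an experiment predicting the multiple keywords of each paper from its abstract, using 4,675 academic papers collected from the academic research information service 'RISS', the proposed method outperformed the conventional autoencoder-based label embedding technique on all measures, including accuracy, precision, recall, and F1 score.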
https://doi.org/10.13088/jiis.2021.27.3.175
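To make the idea of skip connections in the encoder and decoder concrete, the forward pass of such a label autoencoder might look like the sketch below. The abstract does not give the architecture, so everything here is an assumption for illustration: the residual form x + relu(Wx + b), the single block per side, the parameter names, and the sigmoid output. Training is not shown.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def dense(W, b, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def skip_block(W, b, x):
    """Same-width layer with a skip connection: x + relu(W x + b).

    The input is added back to the layer output, so information in x
    can pass through even if the layer itself discards it.
    """
    h = relu(dense(W, b, x))
    return [xi + hi for xi, hi in zip(x, h)]

def encode(labels, params):
    """Encoder (hypothetical layout): skip block, then projection to latent size."""
    h = skip_block(params["W_skip_enc"], params["b_skip_enc"], labels)
    return dense(params["W_down"], params["b_down"], h)

def decode(latent, params):
    """Decoder (hypothetical layout): projection to label size, then skip block."""
    h = dense(params["W_up"], params["b_up"], latent)
    h = skip_block(params["W_skip_dec"], params["b_skip_dec"], h)
    return sigmoid(h)  # per-label probabilities

# Hand-set example parameters: 3 labels compressed to a 2-dimensional latent.
params = {
    "W_skip_enc": [[0.0] * 3 for _ in range(3)], "b_skip_enc": [0.0] * 3,
    "W_down": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]], "b_down": [0.0, 0.0],
    "W_up": [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]], "b_up": [0.0] * 3,
    "W_skip_dec": [[0.0] * 3 for _ in range(3)], "b_skip_dec": [0.0] * 3,
}
latent = encode([1.0, 0.0, 1.0], params)
probabilities = decode(latent, params)
```

In a real system the weights would be learned, and a separate model would predict the latent vector from the document text before decoding it back into labels.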

Improved Face Recognition based on 2D-LDA using Weighted Covariance Scatter

이석진;오치민;이칠우
- 한국멀티미디어학회논문지 (Journal of Korea Multimedia Society), Vol. 17, No. 12, pp. 1446-1452, 2014
Existing LDA uses a transform matrix that maximizes the distance between classes, so each image has to be converted into a one-dimensional training vector. In 2D-LDA, however, the two-dimensional image itself can be used directly as a training matrix, so classification performance can be improved by about 20% compared with LDA, since the training matrix preserves the spatial information of the two-dimensional image. Nevertheless, 2D-LDA uses the same calculation scheme for the transformation matrix, and therefore both LDA and 2D-LDA have the heteroscedastic problem: classification cannot benefit from the spatial distances between class clusters, because LDA uses only the correlation-based covariance matrix of the training data without any reference to the distances between classes. In this paper, we propose a new method that applies the WPS-LDA idea to the training matrix of 2D-LDA: the reciprocal of the distance between classes is calculated and applied as a weight to the between-class scatter matrix. The experimental results show that the discriminating power of the proposed 2D-LDA with weighted between-class scatter is improved by up to 2% over the original 2D-LDA. The method performs well especially when the distance between two classes is very small and the dimension of the projection axis is low.
https://doi.org/10.9717/kmms.2014.17.12.1446

SLOW VISCOUS FLOW PAST A CAVITY WITH INFINITE DEPTH

Kim, D.W;Kim, S.B;Chu, J.H
- Journal of Applied Mathematics & Informatics, Vol. 7, No. 3, pp. 801-812, 2000
Two-dimensional slow viscous flow over an infinite half-plane past a perpendicular infinite cavity is considered on the basis of the Stokes approximation. Using the complex representation of two-dimensional Stokes flow, the problem is reduced to solving a set of Fredholm integral equations of the second kind. The streamlines and the pressure and vorticity distributions on the wall are determined numerically.

Hybrid-Feature Extraction for the Facial Emotion Recognition

Byun, Kwang-Sub;Park, Chang-Hyun;Sim, Kwee-Bo;Jeong, In-Cheol;Ham, Ho-Sang
- Institute of Control, Robotics and Systems (제어로봇시스템학회): Conference Proceedings, ICCAS 2004, pp. 1281-1285, 2004
There are numerous emotions in the human world. Humans express and recognize emotions through various channels, for example the eyes, nose, and mouth. In particular, when recognizing emotion from facial expressions, humans achieve very flexible and robust recognition because they use several channels. The hybrid-feature extraction algorithm is based on this human process. It uses geometrical feature extraction and a color distribution histogram. The input emotion is then classified through independent, parallel learning of neural networks. In addition, for natural classification of emotion, an advancing two-dimensional emotion space is introduced and used in this paper. The advancing two-dimensional emotion space enables flexible and smooth classification of emotion.

Image Processing and Cryo-Transmission Electron Microscopy; Example of Human Proteasome

Choi, Hyosun;Jeon, Hyunbum;Noh, Seulgi;Kwon, Ohkyung;Mun, Ji Young
- Applied Microscopy, Vol. 48, No. 1, pp. 1-5, 2018
Cryo-transmission electron microscopy (cryo-TEM) allows us to perform structural analyses of large protein complexes, which are difficult to analyze using X-ray crystallography or nuclear magnetic resonance. The most common examples of proteins used are ribosomes and proteasomes. In this paper, we briefly describe the advantages of cryo-TEM and the process of two-dimensional classification, using a human proteasome as an example.
https://doi.org/10.9729/AM.2018.48.1.1

Motion classification using distributional features of 3D skeleton data

Woohyun Kim;Daeun Kim;Kyoung Shin Park;Sungim Lee
- Communications for Statistical Applications and Methods, Vol. 30, No. 6, pp. 551-560, 2023
Recently, there has been significant research into the recognition of human activities using three-dimensional sequential skeleton data captured by the Kinect depth sensor. Many of these studies employ deep learning models. This study introduces a novel feature selection method for such data and analyzes it using machine learning models. Because the original Kinect data are high-dimensional, effective feature extraction methods are required to address the classification challenge. In this research, we propose using the first four moments as predictors to represent the distribution of joint sequences and evaluate their effectiveness using two datasets: the exergame dataset, consisting of three activities, and the MSR daily activity dataset, composed of ten activities. The results show that, on average across different classifiers, the accuracy of our approach exceeds that of existing methods.
https://doi.org/10.29220/CSAM.2023.30.6.551
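The idea of summarizing each joint sequence by its first four moments can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: it assumes the four predictors are the mean, the population variance, the standardized skewness, and the standardized kurtosis, and the function names and data layout (one list of coordinates per frame) are hypothetical.

```python
def four_moments(seq):
    """Mean, variance, skewness, and kurtosis of one coordinate sequence."""
    n = len(seq)
    mean = sum(seq) / n
    m2 = sum((x - mean) ** 2 for x in seq) / n  # population variance
    m3 = sum((x - mean) ** 3 for x in seq) / n
    m4 = sum((x - mean) ** 4 for x in seq) / n
    if m2 == 0:
        # Constant sequence: shape statistics are undefined, report zeros.
        return (mean, 0.0, 0.0, 0.0)
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # not excess kurtosis; a normal variable gives 3
    return (mean, m2, skewness, kurtosis)

def skeleton_features(frames):
    """Turn a variable-length recording into a fixed-length feature vector.

    frames: list of frames, each a flat list of joint coordinates.
    Returns four moments per coordinate, concatenated.
    """
    features = []
    for coordinate_over_time in zip(*frames):
        features.extend(four_moments(coordinate_over_time))
    return features

# Hypothetical recording: 5 frames, 2 coordinates per frame.
frames = [[1.0, 0.2], [2.0, 0.2], [3.0, 0.2], [4.0, 0.2], [5.0, 0.2]]
vector = skeleton_features(frames)  # 8 values: 4 moments x 2 coordinates
```

Because the feature vector has the same length regardless of how many frames were recorded, it can be fed directly to standard classifiers.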

259 search results; processing time 0.026 s
