• Title/Summary/Keyword: classifier

Class prediction of an independent sample using a set of gene modules consisting of gene-pairs which were condition(Tumor, Normal) specific (조건(암, 정상)에 따라 특이적 관계를 나타내는 유전자 쌍으로 구성된 유전자 모듈을 이용한 독립샘플의 클래스예측)

  • Jeong, Hyeon-Iee;Yoon, Young-Mi
    • Journal of the Korea Society of Computer and Information / v.15 no.12 / pp.197-207 / 2010
  • By applying data-mining methods to high-throughput cDNA microarray data, gene expression levels in two different tissues can be compared, and differentially expressed genes (DEGs) between normal and tumor cells can be detected. These genes can support diagnosis, and the treatment strategy can be determined according to the cancer stage. Existing machine-learning methods for cancer classification select marker genes that are differentially expressed between normal and tumor samples and build a classifier from those marker genes. However, in addition to differences in expression level, differences in gene-gene correlation between the two conditions could also serve as good markers for disease diagnosis. In this study, we identify gene pairs whose correlation differs strongly between the two sets of samples and build classification gene modules from these pairs. The resulting cancer classification method achieves higher accuracy than current methods. Because the number of genes in a classification module is small, implementation as a clinical kit can be considered. In future work, the authors plan to identify novel cancer-related genes through functional analysis of the genes in a classification module, validated by GO (Gene Ontology) enrichment, and to extend the classification module into gene regulatory networks.
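The gene-pair selection step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the correlation measure or the scoring rule, so Pearson correlation and the absolute difference between conditions are assumed here, and the names `pearson` and `differential_pairs` and the toy data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def differential_pairs(tumor, normal, top_k=2):
    """Rank gene pairs by |r_tumor - r_normal| (assumed scoring rule).

    tumor, normal: dict mapping gene name -> list of expression values,
    with the same gene names in both conditions.
    Returns the top_k tuples (score, gene1, gene2), highest score first.
    """
    scored = []
    for g1, g2 in combinations(sorted(tumor), 2):
        r_tumor = pearson(tumor[g1], tumor[g2])
        r_normal = pearson(normal[g1], normal[g2])
        scored.append((abs(r_tumor - r_normal), g1, g2))
    scored.sort(reverse=True)
    return scored[:top_k]


# Toy data: A and B are positively correlated in tumor, negatively in normal.
tumor = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
normal = {"A": [1, 2, 3, 4], "B": [8, 6, 4, 2], "C": [4, 1, 3, 2]}
for score, g1, g2 in differential_pairs(tumor, normal):
    print(g1, g2, round(score, 3))
```

Pairs ranked this way would then be grouped into classification modules; how the paper forms and applies those modules is not specified in the abstract and is not shown here.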

Human-Computer Interface using sEMG according to the Number of Electrodes (전극 개수에 따른 근전도 기반 휴먼-컴퓨터 인터페이스의 정확도에 대한 연구)

  • Lee, Seulbi;Chee, Youngjoon
    • Journal of the HCI Society of Korea / v.10 no.2 / pp.21-26 / 2015
  • An NUI (Natural User Interface) system conveys the user's natural movements or signals from the human body to the machine. sEMG (surface electromyogram) can be observed whenever a muscle exerts effort, even without actual movement, which is impossible with camera- and accelerometer-based NUI systems. In sEMG-based movement recognition, a minimal number of electrodes is preferred to reduce inconvenience. We analyzed how recognition accuracy decreases as the number of electrodes is reduced. For four kinds of movement intention without movement, extension (up), flexion (down), abduction (right), and adduction (left), a multilayer perceptron classifier was used with RMS (Root Mean Square) features of the sEMG. The classification accuracy was 91.9% with four channels, 87.0% with three channels, and 78.9% with two channels. To increase the accuracy with two channels of sEMG, RMS values from a previous time epoch (50-200 ms) were added as features. With the RMS values from 150 ms, the accuracy increased from 78.9% to 83.6%. The loss of accuracy with a minimal number of electrodes could thus be partly compensated by using additional features from previous RMS values.
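The feature construction described in this abstract, the current RMS per channel plus the RMS from an earlier epoch, can be sketched as follows. The abstract gives neither the sampling rate nor the epoch length, so the lag is expressed here in whole epochs; `rms`, `rms_features`, `epoch_len`, and `lag_epochs` are hypothetical names, and the classifier itself is not shown.

```python
import math


def rms(window):
    """Root mean square of one window of samples."""
    return math.sqrt(sum(v * v for v in window) / len(window))


def rms_features(channels, epoch_len, lag_epochs=0):
    """Build one feature vector per epoch.

    channels: list of equally long sample lists, one per electrode.
    epoch_len: samples per epoch (assumed; depends on the sampling rate).
    lag_epochs: if > 0, append the RMS values from that many epochs earlier.
    """
    n_epochs = len(channels[0]) // epoch_len
    per_epoch = [
        [rms(ch[i * epoch_len:(i + 1) * epoch_len]) for ch in channels]
        for i in range(n_epochs)
    ]
    if lag_epochs == 0:
        return per_epoch
    # Current RMS values followed by the lagged RMS values of every channel.
    return [
        per_epoch[i] + per_epoch[i - lag_epochs]
        for i in range(lag_epochs, n_epochs)
    ]


# Two channels, two epochs of four samples each.
two_channels = [
    [2, -2, 2, -2, 0, 0, 0, 0],
    [1, 1, 1, 1, 3, -3, 3, -3],
]
print(rms_features(two_channels, epoch_len=4))
print(rms_features(two_channels, epoch_len=4, lag_epochs=1))
```

With two channels and one lag, each feature vector has four entries instead of two, which is how the extra history enlarges the classifier input without adding electrodes.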

Improvement of Endoscopic Image using De-Interlacing Technique (De-Interlace 기법을 이용한 내시경 영상의 화질 개선)

  • 신동익;조민수;허수진
    • Journal of Biomedical Engineering Research / v.19 no.5 / pp.469-476 / 1998
  • When medical images such as ultrasonography and endoscopy are acquired and displayed on the VGA monitor of a PC system, tear-drop image degradation appears through scan conversion. In this study, we compare several methods that can resolve this degradation and implement a hardware system that resolves it in real time on a PC. With a dedicated de-interlacing device and a PCI bridge, our hardware system provides high-quality image display together with real-time processing and acquisition, and image quality is improved remarkably. Because it is implemented as a PC-based system, acquiring and saving images, annotating them with text comments, and PACS networking can be easily implemented.
  • [...] metabolism. All images were spatially normalized to the MNI standard PET template and smoothed with a 16 mm FWHM Gaussian kernel using SPM96. The mean count in the cerebral region was normalized. The VOIs for 34 cerebral regions were previously defined on the standard template, and 17 different counts of regions mirrored across the hemispheric midline were extracted from the spatially normalized images. A three-layer feed-forward error back-propagation neural network classifier with 7 input nodes and 3 output nodes was used. The network was trained to interpret metabolic patterns and produce the same diagnoses as expert viewers. The performance of the neural network was optimized by testing 5 to 40 nodes in the hidden layer. Forty randomly selected images from each group were used to train the network, and the remainder were used to test the trained network. The optimized neural network gave a maximum agreement rate of 80.3% with expert viewers; it used 20 hidden nodes and was trained for 1508 epochs. The neural network also gave agreement rates of 75 to 80% with 10 or 30 nodes in the hidden layer. We conclude that the artificial neural network performed as well as human experts and could be potentially useful as a clinical decision support tool for the localization of epileptogenic zones.

Illegal Cash Accommodation Detection Modeling Using Ensemble Size Reduction (신용카드 불법현금융통 적발을 위한 축소된 앙상블 모형)

  • Lee, Hwa-Kyung;Han, Sang-Bum;Jhee, Won-Chul
    • Journal of Intelligence and Information Systems / v.16 no.1 / pp.93-116 / 2010
  • An ensemble approach is applied to detection modeling of illegal cash accommodation (ICA), a well-known type of fraudulent credit card use in Far East nations that has not been addressed in the academic literature. The performance of a fraud detection model (FDM) suffers from the imbalanced data problem, which can be remedied to some extent by an ensemble of many classifiers. It is generally accepted that an ensemble of classifiers is more accurate than a single classifier, provided there is diversity in the ensemble. Furthermore, recent research shows that it may be better to combine a selected subset of classifiers than all the classifiers at hand. For effective detection of ICA, we adopt an ensemble size reduction technique that prunes the ensemble of all classifiers using accuracy and diversity measures. Diversity in an ensemble manifests itself as disagreement or ambiguity among its members. The data imbalance intrinsic to FDM affects our approach to ICA detection in two ways. First, we suggest a training procedure with over-sampling methods to obtain diverse training data sets. Second, we use variants of the accuracy and diversity measures that focus on the fraud class. We also calculate the diversity measure dynamically, using Forward Addition and Backward Elimination. In our experiments, Neural Networks, Decision Trees, and Logit Regressions serve as the base models of the ensemble, and the performance of homogeneous ensembles is compared with that of heterogeneous ensembles. The experimental results show that the reduced-size ensemble is, on average over the data sets tested, as accurate as the non-pruned version, which brings benefits in application efficiency and reduced ensemble complexity.
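A forward-addition pruning loop of the kind this abstract mentions can be sketched as follows. It is not the authors' procedure: plain accuracy and pairwise disagreement stand in for the fraud-class-focused measures, which the abstract does not define, and the weighting `alpha`, the function names, and the toy predictions are all assumptions made for illustration.

```python
def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def disagreement(a, b):
    """Fraction of cases on which two classifiers predict differently."""
    return sum(p != q for p, q in zip(a, b)) / len(a)


def forward_addition(preds, truth, size, alpha=0.5):
    """Greedily pick `size` ensemble members.

    preds: dict mapping classifier name -> list of validation predictions.
    Starts from the most accurate member, then repeatedly adds the candidate
    with the best alpha * accuracy + (1 - alpha) * mean disagreement with
    the members chosen so far (diversity is recomputed at every step).
    """
    remaining = sorted(preds)
    first = max(remaining, key=lambda n: accuracy(preds[n], truth))
    chosen = [first]
    remaining.remove(first)
    while len(chosen) < size and remaining:
        def score(name):
            acc = accuracy(preds[name], truth)
            div = sum(disagreement(preds[name], preds[c]) for c in chosen)
            return alpha * acc + (1 - alpha) * div / len(chosen)
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy validation set: 1 = fraud, 0 = normal.
truth = [1, 1, 0, 0, 1, 0, 1, 0]
preds = {
    "nn":    [1, 1, 0, 0, 1, 0, 1, 1],
    "tree":  [1, 1, 0, 0, 1, 0, 0, 1],
    "logit": [0, 0, 0, 0, 1, 0, 1, 0],
}
print(forward_addition(preds, truth, size=2))
```

In this toy case the second member picked is the one that errs on different cases than the first, even though another candidate is equally accurate, which is the intuition behind pruning on diversity as well as accuracy.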

An Algorithm for Efficient use of Label Space over MPLS Network with Multiple Disconnect Timers (MPLS 망에서 복수 연결해제 타이머를 이용한 레이블 공간의 효율적 사용방법)

  • Lee, Sun-Woo;Byun, Tae-Young;Han, Ki-Jun;Jeong, Youn-Kwae
    • Journal of KIISE:Information Networking / v.29 no.1 / pp.24-30 / 2002
  • Label switching technology is emerging as a solution to the rapid growth of Internet traffic demand. Multiprotocol Label Switching (MPLS) is a standard developed by the Internet Engineering Task Force (IETF) to enhance speed, scalability, and interoperability between label switching technologies. In MPLS, utilization of the label space is a very important factor in network performance because labels are the basic unit of packet switching. We propose an algorithm that uses the label space effectively by means of multiple disconnect timers at the label switching router. The algorithm applies multiple connection release timers over an MPLS network with multiple domains: with the aid of a packet classifier, a relatively longer timeout interval is assigned to traffic of a higher class. This reduces the delay of setting up a new connection and also reduces the number of packets routed at layer 3. Simulation results show a reduction in the number of labels required in the MPLS network, which indicates that our algorithm performs better than existing ones in terms of label space utilization.
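The idea of class-dependent release timers can be sketched as a small label table. This is a toy model rather than the paper's algorithm: the class names, the timeout values, and the `LabelTable` interface are hypothetical, and label reuse and multi-domain behavior are left out.

```python
class LabelTable:
    """Toy label table with one idle timeout per traffic class."""

    def __init__(self, timeouts):
        self.timeouts = timeouts  # traffic class -> idle timeout in seconds
        self.entries = {}         # flow id -> (label, traffic class, last seen)
        self.next_label = 16      # MPLS label values 0-15 are reserved

    def packet(self, flow, traffic_class, now):
        """Record a packet; return (label, is_new_binding)."""
        entry = self.entries.get(flow)
        is_new = entry is None
        if is_new:
            label = self.next_label
            self.next_label += 1
        else:
            label = entry[0]
        self.entries[flow] = (label, traffic_class, now)
        return label, is_new

    def expire(self, now):
        """Release bindings idle longer than their class timeout."""
        stale = [
            flow
            for flow, (_, cls, seen) in self.entries.items()
            if now - seen > self.timeouts[cls]
        ]
        for flow in stale:
            del self.entries[flow]
        return stale


# Higher-class traffic keeps its label longer (values are illustrative).
table = LabelTable({"high": 30.0, "low": 5.0})
table.packet("flow-a", "high", now=0.0)
table.packet("flow-b", "low", now=0.0)
print(table.expire(now=10.0))                    # low-class binding released
print(table.packet("flow-a", "high", now=12.0))  # existing label reused
```

The longer timeout lets the high-class flow resume without a new label setup, while the short timeout returns the low-class label to the pool early; this is the trade-off the abstract describes.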

Localizing Head and Shoulder Line Using Statistical Learning (통계학적 학습을 이용한 머리와 어깨선의 위치 찾기)

  • Kwon, Mu-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.2C / pp.141-149 / 2007
  • Associating the shoulder line with the head location of the human body is useful for verifying, localizing, and tracking persons in an image. Since the head line and the shoulder line, which together we call the $\Omega$-shape, move together consistently within a limited range of deformation, we can build a statistical shape model using the Active Shape Model (ASM). However, when the conventional ASM is applied to $\Omega$-shape fitting, it is very sensitive to background edges and clutter because it relies only on local edges or gradients. Although appearance is a good alternative feature for matching the target object to the image, it is difficult to learn the appearance of the $\Omega$-shape because skin, hair, and clothes differ significantly between people and because appearance does not remain the same throughout a video. Therefore, instead of learning appearance or updating it as it changes, we model a discriminative appearance in which each pixel is classified into head, torso, or background, and we update the classifier to obtain the appropriate discriminative appearance in the current frame. Accordingly, we use two features in fitting the $\Omega$-shape: edge gradient, which is used for localization, and discriminative appearance, which contributes to the stability of the tracker. Simulation results show that the proposed method is very robust to pose change, occlusion, and illumination change when tracking the head and shoulder line of people. Another advantage is that the proposed method operates in real time.

Performance Improvement of Collaborative Filtering System Using Associative User's Clustering Analysis for the Recalculation of Preference and Representative Attribute-Neighborhood (선호도 재계산을 위한 연관 사용자 군집 분석과 Representative Attribute-Neighborhood를 이용한 협력적 필터링 시스템의 성능향상)

  • Jung, Kyung-Yong;Kim, Jin-Su;Kim, Tae-Yong;Lee, Jung-Hyun
    • The KIPS Transactions:PartB / v.10B no.3 / pp.287-296 / 2003
  • Much research has focused on collaborative filtering in recommender systems. However, these approaches suffer from the First-Rater Problem and the Sparsity Problem, and the main purpose of this paper is to solve them. We propose a method that predicts user preference using Bayesian estimated values together with associative user clustering for the recalculation of preference. In addition, to compensate for the fact that collaborative filtering disregards item attributes, we use the Representative Attribute-Neighborhood method, which extracts the representative attribute that most affects preference and uses it to find similar neighbors for prediction. We improved efficiency by applying associative user clustering analysis to the collaborative filtering algorithm in order to calculate the preference for a specific item within the cluster item vector. To address the Sparsity and First-Rater problems, associative users are clustered by genre using the Association Rule Hypergraph Partitioning algorithm, and new users are classified into one of these genres by a Naive Bayes classifier. Furthermore, to obtain the similarity between users in the classified genre and new users, the method assigns different estimated values, obtained through Naive Bayes learning, to the items that a user has rated. Applying the preferences weighted by these estimated values to the Pearson correlation coefficient yields higher accuracy because errors caused by missing values are reduced. We evaluate our method on a large collaborative filtering database of user ratings, and it significantly outperforms previously proposed methods.

Research on Text Classification of Research Reports using Korea National Science and Technology Standards Classification Codes (국가 과학기술 표준분류 체계 기반 연구보고서 문서의 자동 분류 연구)

  • Choi, Jong-Yun;Hahn, Hyuk;Jung, Yuchul
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.1 / pp.169-177 / 2020
  • In South Korea, the results of R&D in science and technology are submitted to the National Science and Technology Information Service (NTIS) in reports that carry Korea national science and technology standard classification codes (K-NSCC). However, because there are more than 2000 sub-categories, it is non-trivial to choose the correct classification codes without a clear understanding of the K-NSCC. In addition, there are few cases of automatic document classification research based on the K-NSCC, and there are no training data in the public domain. To the best of our knowledge, this study is the first attempt to build a high-performing K-NSCC classification system based on NTIS report meta-information from the last five years (2013-2017). To this end, about 210 mid-level categories were selected, and we conducted preprocessing that takes the characteristics of research report metadata into account. More specifically, we propose a convolutional neural network (CNN) technique that uses only task names and keywords, which are the most influential fields. The proposed model is compared with several machine learning methods that show good performance in text classification (e.g., the linear support vector classifier, CNN, gated recurrent unit, etc.), and it has a performance advantage of 1% to 7% based on a top-three F1 score.

Committee Learning Classifier based on Attribute Value Frequency (속성 값 빈도 기반의 전문가 다수결 분류기)

  • Lee, Chang-Hwan;Jung, In-Chul;Kwon, Young-S.
    • Journal of KIISE:Databases / v.37 no.4 / pp.177-184 / 2010
  • These days, many kinds of data, including sensor, delivery, credit, and stock data, are generated continuously in massive quantities. Learning from these data is difficult because they are large in volume and their concepts change quickly. To handle these problems, learning methods based on sliding windows over time have been used. However, these approaches must rebuild the model every time new data arrive, which requires considerable time and cost. Therefore, very simple incremental learning methods are needed. The Bayesian method is one example, but it has the disadvantage that it requires prior knowledge (probability) of the data. In this study, we propose a learning method based on attribute values that can be applied even when the prior knowledge (probability) of the data is unknown. The main concept is that each attribute value is regarded as an expert learner and that summing up the expert learners leads to better results. Experimental results show that our method learns from data very quickly and performs well compared with current learning methods (decision tree and Bayesian).
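The notion of each attribute value acting as an expert whose votes are summed can be sketched as an incremental counter-based classifier. The abstract does not give the voting formula, so the class-frequency share used as the vote weight is an assumption, and the class name `AttributeValueVoter` and the toy data are hypothetical.

```python
from collections import defaultdict


class AttributeValueVoter:
    """Incremental classifier: every (attribute, value) pair is one voter."""

    def __init__(self):
        # (attribute index, value) -> {class label: count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, row, label):
        """Update the counts with one labeled example; no model rebuild needed."""
        for index, value in enumerate(row):
            self.counts[(index, value)][label] += 1

    def predict(self, row):
        """Sum each voter's class shares and return the top class (None if unseen)."""
        votes = defaultdict(float)
        for index, value in enumerate(row):
            seen = self.counts.get((index, value))
            if not seen:
                continue  # this attribute value has never been observed
            total = sum(seen.values())
            for label, count in seen.items():
                votes[label] += count / total
        if not votes:
            return None
        return max(sorted(votes), key=votes.get)


model = AttributeValueVoter()
for row, label in [
    (["sunny", "hot"], "no"),
    (["sunny", "mild"], "no"),
    (["rainy", "mild"], "yes"),
    (["rainy", "cool"], "yes"),
]:
    model.learn(row, label)

print(model.predict(["rainy", "mild"]))
print(model.predict(["sunny", "mild"]))
```

Because learning only increments counters, each new example costs time proportional to the number of attributes, which matches the abstract's motivation for avoiding model rebuilds on streaming data.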

Color Laser Printer Identification through Discrete Wavelet Transform and Gray Level Co-occurrence Matrix (이산 웨이블릿 변환과 명암도 동시발생 행렬을 이용한 컬러 레이저프린터 판별 알고리즘)

  • Baek, Ji-Yeoun;Lee, Heung-Su;Kong, Seung-Gyu;Choi, Jung-Ho;Yang, Yeon-Mo;Lee, Hae-Yeoun
    • The KIPS Transactions:PartB / v.17B no.3 / pp.197-206 / 2010
  • High-quality, low-priced digital printing devices are now abused to print or forge official documents and bills. Identifying color laser printers is therefore a step toward media forensics. This paper presents a new method for identifying color laser printers from printed color images. Since different printer manufacturers use different manufacturing systems, documents printed by different printers show slight visual differences. By analyzing these artifacts, we can identify the color laser printer. First, high-frequency components are extracted from the original images with the discrete wavelet transform. After calculating the gray-level co-occurrence matrix of these components, we extract statistical features. These features are then used to train a support vector machine, which classifies the images to identify the color laser printer. In the experiment, a total of 2,597 images from 7 printers (HP, Canon, Xerox DCC400, Xerox DCC450, Xerox DCC5560, Xerox DCC6540, Konica) were tested. The results show that the presented identification method performs well, with 96.9% accuracy.