• Title/Summary/Keyword: redundant methods

Fuzzy discretization with spatial distribution of data and Its application to feature selection (데이터의 공간적 분포를 고려한 퍼지 이산화와 특징선택에의 응용)

  • Son, Chang-Sik;Shin, A-Mi;Lee, In-Hee;Park, Hee-Joon;Park, Hyoung-Seob;Kim, Yoon-Nyun
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.165-172 / 2010
  • In clinical data mining, choosing the optimal subset of features is important, not only to reduce computational complexity but also to improve the usefulness of the model constructed from the given data. Moreover, the threshold values (i.e., cut-off points) of the selected features are used in the clinical decision criteria of experts for the differential diagnosis of diseases. In this paper, we propose a fuzzy discretization approach, which is evaluated by measuring the degree of separation of redundant attribute values in the overlapping region, based on the spatial distribution of data with continuous attributes. The weighted average of the redundant attribute values is then used to determine the threshold value for each feature, and rough set theory is utilized to select a subset of relevant features from the overall features. To verify the validity of the proposed method, we compared experimental results on a classification problem using 668 patients with a chief complaint of dyspnea, obtained with three discretization methods (i.e., equal-width, equal-frequency, and entropy-based) and the proposed discretization method. From the experimental results, we confirm that the discretization methods with fuzzy partition give better results in two evaluation measures, average classification accuracy and G-mean, than those with hard partition.
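
For readers unfamiliar with the baselines named in this abstract, the following is a minimal sketch of equal-width and equal-frequency discretization and of the G-mean measure. It is not the authors' fuzzy method. The function names are hypothetical, and G-mean is assumed here to be the geometric mean of sensitivity and specificity, which is a common definition that the abstract itself does not spell out.

```python
import math


def equal_width_edges(values, k):
    """Return k-1 interior cut points splitting [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_edges(values, k):
    """Return k-1 interior cut points so each bin holds roughly the same number of samples."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition; 1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 10, 20, 30, 100]  # made-up values, skewed on purpose
    print(equal_width_edges(sample, 3))
    print(equal_frequency_edges(sample, 4))
    print(g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

On skewed data such as the made-up sample, equal-width cut points leave most values in the first bin, while equal-frequency cut points follow the data density. Both are hard partitions of the kind the abstract compares against fuzzy ones.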

CNN-Based Novelty Detection with Effectively Incorporating Document-Level Information (효과적인 문서 수준의 정보를 이용한 합성곱 신경망 기반의 신규성 탐지)

  • Jo, Seongung;Oh, Heung-Seon;Im, Sanghun;Kim, Seonho
    • KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.231-238 / 2020
  • With a large number of documents appearing on the web, document-level novelty detection has become important since it can reduce the effort of finding novel documents by discarding documents that share redundant information already seen. A recent work proposed a convolutional neural network (CNN)-based novelty detection model with significant performance improvements. We observed that this model is restricted in its use of document-level information when determining novelty, whereas we assume that document-level information is more important. As a solution, this paper proposes two methods of effectively incorporating document-level information using a CNN-based novelty detection model. Our methods focus on constructing a feature vector of a target document to be classified by extracting relative information between the target document and the source documents given as evidence. A series of experiments showed the superiority of our methods on a standard benchmark collection, TAP-DLND 1.0.

Prediction of Diabetic Nephropathy from Diabetes Dataset Using Feature Selection Methods and SVM Learning (특징점 선택방법과 SVM 학습법을 이용한 당뇨병 데이터에서의 당뇨병성 신장합병증의 예측)

  • Cho, Baek-Hwan;Lee, Jong-Shill;Chee, Young-Joan;Kim, Kwang-Won;Kim, In-Young;Kim, Sun-I.
    • Journal of Biomedical Engineering Research / v.28 no.3 / pp.355-362 / 2007
  • Diabetes mellitus can cause devastating complications, which often result in disability and death, and diabetic nephropathy is a leading cause of death in people with diabetes. In this study, we tried to predict the onset of diabetic nephropathy from an irregular and unbalanced diabetic dataset. We collected clinical data from 292 patients with type 2 diabetes and performed preprocessing to extract 184 features to resolve the irregularity of the dataset. We compared several feature selection methods, such as ReliefF and sensitivity analysis, to remove redundant features and improve the classification performance. We also compared learning methods for the support vector machine, such as equal-cost learning and cost-sensitive learning, to tackle the imbalance problem in the dataset. The best classifier, with the 39 selected features, achieved an area under the curve of 0.969 in receiver operating characteristic analysis, which indicates that our method can predict diabetic nephropathy with high generalization performance from an irregular and unbalanced dataset, and physicians can benefit from it when predicting diabetic nephropathy.
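
As a rough illustration of two ideas mentioned in this abstract, the sketch below computes the area under the ROC curve from scores and derives per-class weights of the kind often used for cost-sensitive learning on unbalanced data. The function names and the inverse-frequency weighting rule are assumptions made for illustration. The abstract does not state which misclassification costs the authors actually used.

```python
from collections import Counter


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def inverse_frequency_weights(labels):
    """Assumed heuristic: weight_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


if __name__ == "__main__":
    y = [1, 1, 0, 0]                 # made-up labels
    s = [0.9, 0.4, 0.5, 0.1]         # made-up classifier scores
    print(roc_auc(y, s))
    print(inverse_frequency_weights([1, 0, 0, 0]))
```

Under this heuristic, errors on the rare class are penalized more heavily, which is the general intent of cost-sensitive learning. Equal-cost learning corresponds to giving every class a weight of 1.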

A Study on the Network Adjustment Analysis for Planimetric Positioning (수평위치 결정을 위한 망조정 해석에 관한 연구)

  • 유복모;조기성;이현직;곽동옥
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.9 no.2 / pp.37-48 / 1991
  • In this study, conventional network adjustment and combined network adjustment methods for the single network and the centric combination network were compared by analyzing the root mean square error and the standard error ellipse of observed points. It can be concluded from this study that for conventional surveying methods, the accuracy is in the order of trilateration, traverse and triangulation, and for the combined surveying methods, the accuracy is in the order of multilateration surveying, combined traverse and combined triangulation-trilateration surveying. When establishing new control points, the accuracy can be improved by increasing the redundant observations of the centric combination network instead of using the single network. Also, in the case of combined traverse surveying, the accuracy level of trilateration could be achieved by adding observable laterals. It was found that traverse is effective for large areas where sighting is easy, and combined traverse surveying is effective for urban areas where sighting is difficult.

Fast Implementations of Projector-Backprojector Pairs for Iterative Tomographic Reconstruction (반복법을 사용한 단층영상 재구성을 위한 투사기 및 역투사기의 고속 구현)

  • 김수미;이수진;김용호
    • Journal of Biomedical Engineering Research / v.24 no.5 / pp.473-480 / 2003
  • Iterative reconstruction methods have played a prominent role in emission computed tomography due to their remarkable advantages over the conventional filtered backprojection method. However, since iterative reconstructions typically consist of repeatedly projecting and backprojecting the data, the computational load required for reconstructing an image depends highly on the performance of the projector-backprojector pair used in the algorithm. In this work, we compare the quantitative performance of representative methods for implementing projector-backprojector pairs. To reduce the overall cost of the projection-backprojection operations for each method, we investigate how previously computed results can be reused so that the number of redundant calculations can be minimized. Our experimental results demonstrate that the ray tracing method not only outperforms other methods in computation time, but also provides improved reconstructions with good accuracy.

A Synthesis of Combinational Logic with TANT Networks (조합논리함수의 TANT회로에 의한 합성)

  • 고경식
    • Journal of the Korean Institute of Telematics and Electronics / v.5 no.4 / pp.1-8 / 1968
  • A TANT network is a three-level network composed solely of NAND gates having only true (i.e., uncomplemented) inputs. The paper presents a technique for finding a least-cost TANT network for any given Boolean function. The first step of the technique is to determine the essential prime implicants (EPI) by the Quine-McCluskey procedure or other methods, and to generate prime implicants (PI) having the same head as any one of the EPI by the consensus operation. The second step is to select common factors among the usable tail factors. The selection phase is analogous to the use of a C-C table. The last step is to minimize inputs by deleting the redundant PI. The technique permits hand solution of typical five- and six-variable problems.

A Study on Data Pre-filtering Methods for Fault Diagnosis (시스템 결함원인분석을 위한 데이터 로그 전처리 기법 연구)

  • Lee, Yang-Ji;Kim, Duck-Young;Hwang, Min-Soon;Cheong, Young-Soo
    • Korean Journal of Computational Design and Engineering / v.17 no.2 / pp.97-110 / 2012
  • High performance sensors and modern data logging technology with real-time telemetry facilitate system fault diagnosis in a very precise manner. Fault detection, isolation and identification in fault diagnosis systems are typical steps to analyze the root cause of failures. This systematic failure analysis provides not only useful clues to rectify the abnormal behaviors of a system, but also key information to redesign the current system for retrofit. The main barriers to effective failure analysis are: (i) the gathered data (event) logs are too large in general, and further (ii) they usually contain noise and redundant data that make precise analysis difficult. This paper therefore applies suitable pre-processing techniques to data reduction and feature extraction, and then converts the reduced data log into a new format of event sequence information. Finally the event sequence information is decoded to investigate the correlation between specific event patterns and various system faults. The efficiency of the developed pre-filtering procedure is examined with a terminal box data log of a marine diesel engine.

Feature Selection for Classification of Mass Spectrometric Proteomic Data Using Random Forest (단백체 스펙트럼 데이터의 분류를 위한 랜덤 포리스트 기반 특성 선택 알고리즘)

  • Ohn, Syng-Yup;Chi, Seung-Do;Han, Mi-Young
    • Journal of the Korea Society for Simulation / v.22 no.4 / pp.139-147 / 2013
  • This paper proposes a novel method for feature selection for mass spectrometric proteomic data based on Random Forest. The method includes an effective preprocessing step to filter out a large number of redundant features with high correlation, and applies a tournament strategy to obtain an optimal feature subset. Experiments on three public datasets, Ovarian 4-3-02, Ovarian 7-8-02 and Prostate, show that the new method achieves high performance compared with widely used methods, with a balanced rate of specificity and sensitivity.
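
The preprocessing idea of filtering out highly correlated, redundant features can be sketched as below. This is a generic greedy correlation filter, not the authors' procedure. The abstract does not describe how their filter works, and the function names, the 0.95 threshold, and the toy data are all assumptions. The Random Forest tournament step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def drop_correlated(columns, threshold=0.95):
    """Greedily keep a feature only if |r| with every already-kept feature is below threshold.

    `columns` maps a feature name to its list of values; insertion order decides priority.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Made-up intensities: "b" is almost a scaled copy of "a", "c" is not.
    features = {
        "a": [1, 2, 3, 4],
        "b": [2, 4, 6, 8.1],
        "c": [4, 1, 3, 2],
    }
    print(drop_correlated(features))
```

Because the filter is greedy, the result depends on feature order. A practical variant might first rank features by some relevance score so that the more informative member of each correlated group is kept.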

Effects of Anxiety on Health Related Quality of Life of the Elderly: Multiple Mediating Effects of Self-esteem and Social Support (노인의 불안이 건강 관련 삶의 질에 미치는 영향: 자아존중감과 사회적 지지의 복수매개 효과)

  • Park, Min-Jeong;Chung, Mi Young
    • Research in Community and Public Health Nursing / v.31 no.1 / pp.24-33 / 2020
  • Purpose: The purpose of this study was to examine the mediating effect of self-esteem and social support on the relationship between anxiety and health-related quality of life (HRQoL) in the elderly. Methods: The Korea adult psycho-social anxiety survey data were collected from August to September 2015 by the Korea Institute for Health. The subjects were 1,035 elderly people who were aged 65 or older at the time of the survey. The data were analyzed by t-test, chi-square test, Pearson correlation coefficient, and a parallel redundant mediated model with the PROCESS macro using SPSS 23.0. Results: They scored an average of 37.93±7.58 for anxiety, 28.59±3.45 for self-esteem, 17.25±4.11 for social support, and 0.88±0.11 for HRQoL. The direct effect of anxiety on HRQoL and the indirect effect of anxiety on HRQoL mediated by self-esteem and social support were statistically significant. Conclusion: These results indicate that in order to increase the HRQoL of the elderly, it is necessary to develop an intervention program that focuses not only on reducing anxiety but also on improving self-esteem and social support.

A comparison study of classification method based of SVM and data depth in microarray data (마이크로어레이 자료에서 서포트벡터머신과 데이터 뎁스를 이용한 분류방법의 비교연구)

  • Hwang, Jin-Soo;Kim, Jee-Yun
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.311-319 / 2009
  • A robust L1 data depth was used in clustering and classification, in the so-called DDclus and DDclass methods of Jornsten (2004). SVM-based classification works well in most situations but shows some weakness in the presence of outliers. Proper gene selection is important in classification since there are so many redundant genes. Either selecting appropriate genes or combining gene clustering with a classification method enhances the overall performance of classification. The performance of the depth-based methods is compared with that of several SVM-based classification methods.
