• Title/Abstract/Keywords: Non-linear Classification

Search results: 108 items, processing time 0.022 seconds

The extension of the largest generalized-eigenvalue based distance metric $D_{ij}(\gamma_1)$ in arbitrary feature spaces to classify composite data points

  • Daoud, Mosaab
    • Genomics & Informatics
    • /
    • Vol. 17, No. 4
    • /
    • pp.39.1-39.20
    • /
    • 2019
  • Analyzing patterns in data points embedded in linear and non-linear feature spaces is a common research problem across different research areas, for example data mining, machine learning, pattern recognition, and multivariate analysis. In this paper, data points are heterogeneous sets of biosequences (composite data points). A composite data point is a set of ordinary data points (e.g., a set of feature vectors). We theoretically extend the derivation of the largest generalized eigenvalue-based distance metric $D_{ij}(\gamma_1)$ to any linear and non-linear feature spaces. We prove that $D_{ij}(\gamma_1)$ is a metric under any linear and non-linear feature transformation function. We show the sufficiency and efficiency of using the decision rule $\bar{\delta}_{\Xi i}$ (i.e., the mean of $D_{ij}(\gamma_1)$) in the classification of heterogeneous sets of biosequences compared with the decision rules $\min_{\Xi i}$ and $\mathrm{median}_{\Xi i}$. We analyze the impact of linear and non-linear transformation functions on classifying/clustering collections of heterogeneous sets of biosequences. We empirically show how the length of a sequence in a heterogeneous sequence-set generated by simulation affects the classification and clustering results in linear and non-linear feature spaces. We propose a new concept: the limiting dispersion map of the existing clusters in heterogeneous sets of biosequences embedded in linear and non-linear feature spaces, which is based on the limiting distribution of nucleotide compositions estimated from real data sets. Finally, empirical conclusions and scientific evidence are deduced from the experiments to support the theoretical results stated in this paper.

Design of High-Performance Unified Circuit for Linear and Non-Linear SVM Classifications

  • Kim, Soo-Jin;Lee, Seon-Young;Cho, Kyeong-Soon
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • Vol. 12, No. 2
    • /
    • pp.162-167
    • /
    • 2012
  • This paper describes the design of a high-performance unified SVM classifier circuit. The proposed circuit supports both linear and non-linear SVM classifications. In order to ensure efficient classification, a 48x96 or 64x64 sliding window with 20 window strides is used. We reduced the circuit size by sharing most of the resources required for both types of classification. We described the proposed unified SVM classifier circuit in Verilog HDL and synthesized the gate-level circuit using a 65 nm standard cell library. The synthesized circuit consists of 661,261 gates, operates at a maximum frequency of 152 MHz, and processes up to 33.8 image frames of 640x480 pixels per second.
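
The abstract above does not say which resources the circuit shares, so the following is only a minimal software sketch of one way linear and non-linear SVM scoring can reuse the same computation. It assumes an RBF kernel for the non-linear case; the function names, the `gamma` parameter, and the sample data are hypothetical and are not taken from the paper.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def svm_decision(x, support_vectors, dual_coefs, bias, kernel="linear", gamma=0.5):
    """Return sum_i coef_i * K(s_i, x) + bias for a linear or RBF kernel.

    Assumption: each coef_i already folds in the class label (alpha_i * y_i).
    """
    score = bias
    x_sq = dot(x, x)
    for s, coef in zip(support_vectors, dual_coefs):
        d = dot(s, x)  # computed once and used by both kernel types
        if kernel == "linear":
            k = d
        elif kernel == "rbf":
            # ||x - s||^2 = x.x - 2 x.s + s.s, so the same dot product is reused
            k = math.exp(-gamma * (x_sq - 2.0 * d + dot(s, s)))
        else:
            raise ValueError("kernel must be 'linear' or 'rbf'")
        score += coef * k
    return score

# Hypothetical two-support-vector model, for illustration only.
svs = [(1.0, 0.0), (0.0, 1.0)]
coefs = [1.0, -1.0]
linear_score = svm_decision((2.0, 0.0), svs, coefs, 0.0, kernel="linear")
rbf_score = svm_decision((2.0, 0.0), svs, coefs, 0.0, kernel="rbf")
```

The sign of the returned score gives the predicted class. Both kernel paths pass through the same dot product, which is the kind of overlap a unified design could exploit.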

계층 구조 클러스터링 알고리즘 설계 및 그 응용 (Design of Hierarchically Structured Clustering Algorithm and its Application)

  • 방영근;박하용;이철희
    • 산업기술연구
    • /
    • Vol. 29, No. B
    • /
    • pp.17-23
    • /
    • 2009
  • Clustering algorithms are often used to extract and discover useful information from non-linear data, and they strongly affect the performance of systems that deal with such data. This paper therefore presents a new approach, a hierarchically structured clustering algorithm, and applies it to a prediction system for non-linear time series data. The proposed algorithm (called HCKA: Hierarchical Cross-correlation and K-means clustering Algorithms) combines cross-correlation and k-means clustering, so it can take into account the correlation of non-linear time series as well as their statistical characteristics. First, the optimal differences of the data are generated, which can suitably reveal the characteristics of non-linear time series. Second, the generated differences are classified into upper clusters for their predictors by the cross-correlation clustering algorithm, and each classified difference is then classified again into lower fuzzy sets by the k-means clustering algorithm. As a result, the proposed method can give an efficient classification and improve performance. Finally, we demonstrate the effectiveness of the proposed HCKA on typical time series examples.

A Comparative Study on Classification Methods of Sleep Stages by Using EEG

  • Kim, Jinwoo
    • 한국멀티미디어학회논문지
    • /
    • Vol. 17, No. 2
    • /
    • pp.113-123
    • /
    • 2014
  • Electrophysiological recordings are considered a reliable method of assessing a person's alertness. Sleep medicine is asked to offer objective methods to measure daytime alertness, tiredness, and sleepiness. Because EEG signals are non-stationary, the conventional method of frequency analysis is not highly successful in recognizing the alertness level. In this paper, EEG signals are analyzed using the wavelet transform as well as the discrete wavelet transform, and classified using statistical classifiers such as Euclidean and Mahalanobis distance classifiers and a promising method, SVM (Support Vector Machine). In simulation, the average accuracies for the Linear Discriminant Analysis (LDA)-Quadratic, k-Nearest Neighbors (k-NN)-Euclidean, and Linear SVM classifiers were 48%, 34.2%, and 86%, respectively. The experimental results show that the SVM classification method offers better performance for reliable classification of the EEG signal than the other classification methods.
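
To make the difference between the Euclidean and Mahalanobis distance classifiers named above concrete, here is a minimal sketch of a minimum-distance classifier restricted to two-dimensional features. The class labels, means, and covariances are invented for illustration and are not EEG data from the paper.

```python
def euclidean_sq(x, mean):
    """Squared Euclidean distance between a point and a class mean."""
    return sum((a - m) ** 2 for a, m in zip(x, mean))

def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance for 2-D features; cov is ((a, b), (b, c))."""
    (a, b), (_, c) = cov
    det = a * c - b * b
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [b, c]] is (1 / det) * [[c, -b], [-b, a]].
    return (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det

def nearest_class(x, classes, metric="euclidean"):
    """classes maps label -> (mean, cov); return the label at the smallest distance."""
    def dist(item):
        mean, cov = item[1]
        if metric == "euclidean":
            return euclidean_sq(x, mean)
        return mahalanobis_sq_2d(x, mean, cov)
    return min(classes.items(), key=dist)[0]

# Hypothetical classes: "alert" is widely spread along the first feature.
classes = {
    "alert": ((0.0, 0.0), ((9.0, 0.0), (0.0, 1.0))),
    "drowsy": ((4.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
}
sample = (2.5, 0.0)
by_euclidean = nearest_class(sample, classes, metric="euclidean")
by_mahalanobis = nearest_class(sample, classes, metric="mahalanobis")
```

With these made-up numbers the two metrics disagree: the sample is closer to the "drowsy" mean in raw distance, but once the large spread of "alert" is taken into account it is closer to "alert".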

웨이블릿에 기반한 시그널 형태를 지닌 대형 자료의 feature 추출 방법 (A Wavelet based Feature Selection Method to Improve Classification of Large Signal-type Data)

  • 장우성;장우진
    • 대한산업공학회지
    • /
    • Vol. 32, No. 2
    • /
    • pp.133-140
    • /
    • 2006
  • Large signal-type data sets are difficult to classify, especially if they are non-stationary. In this paper, large, non-stationary signal-type data sets are wavelet transformed so that distinct features of the data are extracted in the wavelet domain rather than the time domain. For classification, a few wavelet coefficients representing class properties are passed to statistical classification methods such as Linear Discriminant Analysis, Quadratic Discriminant Analysis, and Neural Networks. Applying our wavelet-based feature selection method to a mass spectrometry data set for ovarian cancer diagnosis resulted in 100% classification accuracy.
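
As a rough illustration of extracting a few wavelet-domain features from a signal, the sketch below computes a full Haar decomposition and keeps the largest-magnitude coefficients. Both choices are assumptions: the abstract names neither the wavelet family nor the selection criterion, and magnitude ranking is used here only as a simple stand-in for "coefficients representing class properties".

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal):
    """Full Haar decomposition of a power-of-two-length signal, coarsest first."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    coeffs = []
    approx = list(signal)
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        coeffs = detail + coeffs  # coarser detail levels go in front
    return approx + coeffs

def top_k_indices(coeffs, k):
    """Indices of the k largest-magnitude coefficients, in ascending index order."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return sorted(ranked[:k])

# Hypothetical step-shaped signal, for illustration only.
signal = [4.0, 4.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0]
coeffs = haar_dwt(signal)
selected = top_k_indices(coeffs, 2)
features = [coeffs[i] for i in selected]
```

The resulting short feature vector could then be handed to any of the classifiers listed in the abstract. For a step-shaped signal like this one, almost all of the energy lands in the two coarsest coefficients, which is why a small number of wavelet features can be enough.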

부분 최소 자승법과 잔차 보상기를 이용한 비선형 데이터 분류 (Non-linear Data Classification Using Partial Least Square and Residual Compensator)

  • 김경훈;김태영;최원호
    • 제어로봇시스템학회논문지
    • /
    • Vol. 10, No. 2
    • /
    • pp.185-191
    • /
    • 2004
  • Partial least squares (PLS) is a multivariate statistical process method that has been developed into various algorithms with the characteristics of principal component analysis, dimensionality reduction, and analysis of the relationship between input and output variables. However, it has been somewhat limited by its dependency on linear mathematics. This paper proposes an algorithm that classifies non-linear data using PLS and a residual compensator (RC) based on a radial basis function network (RBFN). The RBFN-based RC compensates for the error on the non-linear data. Experimental results are given to verify its efficiency compared with previous works.

Evaluation of User Profile Construction Method by Fuzzy Inference

  • Kim, Byeong-Man;Rho, Sun-Ok;Oh, Sang-Yeop;Lee, Hyun-Ah;Kim, Jong-Wan
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 8, No. 3
    • /
    • pp.175-184
    • /
    • 2008
  • To construct user profiles automatically, a method for extracting representative keywords from a set of documents is needed. In our previous works, we suggested such a method and showed its usefulness. Here, we apply it to the classification problem and observe how much it contributes to performance improvement. The method can be used as a linear document classifier with few modifications, so we first evaluate its performance in that case. The method is also applicable to some non-linear classification methods such as GIS (Generalized Instance Set). In the GIS algorithm, generalized instances are built from training documents by a generalization function and then the K-NN algorithm is applied to them; our method can serve as the generalization function. For comparison, two well-known linear classification methods, the Rocchio and Widrow-Hoff algorithms, are also used. Experimental results show that our method is better than the others when only positive documents are considered, but not when negative documents are considered as well.

Using Support Vector Machine to Predict Political Affiliations on Twitter: Machine Learning approach

  • Muhammad Javed;Kiran Hanif;Arslan Ali Raza;Syeda Maryum Batool;Syed Muhammad Ali Haider
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 24, No. 5
    • /
    • pp.217-223
    • /
    • 2024
  • The current study aimed to evaluate the effectiveness of using Support Vector Machine (SVM) for political affiliation classification. The system was designed to analyze political tweets collected from Twitter and classify them as positive, negative, or neutral. The performance analysis of the SVM classifier was based on metrics such as accuracy, precision, recall, and f1-score. The results showed that the classifier had high accuracy and f1-score, indicating its effectiveness in classifying the political tweets. The implementation of SVM in this study is based on the principle of Structural Risk Minimization (SRM), which endeavors to identify the maximum margin hyperplane between two classes of data. The results indicate that SVM can be a reliable classification approach for the analysis of political affiliations, capable of accurately categorizing both linear and non-linear information using linear, polynomial, or radial basis kernels. This paper provides a comprehensive overview of using SVM for political affiliation analysis and highlights the importance of using accurate classification methods in the field of political analysis.
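
The metrics named in the abstract above can be written out directly. The sketch below computes accuracy together with one-vs-rest precision, recall, and f1-score for a chosen class; the label strings and the toy predictions are hypothetical and are not results from the paper.

```python
def evaluate(y_true, y_pred, target):
    """Accuracy plus precision, recall, and f1 for one class treated as positive."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # f1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels for six tweets, for illustration only.
y_true = ["positive", "positive", "negative", "neutral", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "neutral", "positive", "positive"]
report = evaluate(y_true, y_pred, target="positive")
```

For a three-class task such as positive/negative/neutral, this would be repeated per class and the per-class values averaged. Which averaging scheme the study used is not stated in the abstract.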

Ultimate Longitudinal Strength Assessment of Ships' Hull Girders

  • Lee, Hun-Gon;Lee, Joo-Sung
    • Journal of Ship and Ocean Technology
    • /
    • Vol. 12, No. 1
    • /
    • pp.45-56
    • /
    • 2008
  • Recently, the International Association of Classification Societies (IACS) adopted the Common Structural Rules (CSR) for Bulk Carriers and Tankers, which specify the requirement associated with the ultimate strength of the hull girder structure. The theoretical background and the results of the verification study are neither well summarized nor released. Furthermore, the requirement takes the form not of a deterministic formula but of a program whose source code is not disclosed. The reliability of the non-linear structural analysis program is verified through comparison with the results of analysis and model tests. Then, the reliability of the ultimate strength requirement in CSR is checked by comparing it with the results of rigorous non-linear analysis.

서비스 속성과 고객만족과의 비대칭적, 비선형적 관계에 근거한 서비스 속성 분류와 전략적 고객서비스 경영 (Classification of Service Attributes and Strategic Customer Service Management based on the Asymmetric and Non-linear Relationship between Service Attributes and Customer Satisfaction)

  • 박정영;이계희
    • 한국식생활문화학회지
    • /
    • Vol. 23, No. 5
    • /
    • pp.605-615
    • /
    • 2008
  • The principal objective of this study was to categorize service attributes on the basis of the asymmetric and non-linear relationship existing between service attributes and customer satisfaction. Researchers generally assume that the relationship between service attribute performance and customer satisfaction is both symmetrical and linear. That is to say, improvements in attribute performance will inevitably result in increased customer satisfaction. However, this is not always the case. Certain attributes have been shown not to create satisfaction even when improved, and others do not create dissatisfaction even when their performance ratings become negative. Understanding this relationship is crucial not only to researchers, but also to service managers. Service managers can arrange their priorities with regard to which attributes must be improved or promoted first, in an environment of limited technical, financial, and human resources. Many studies into this asymmetric and non-linear relationship have recently been conducted, beginning with Herzberg's motivation-hygiene theory (1976) and the disconfirmation theory, which was eventually developed into Kano's model (1984). This study attempted to determine the impact level of service attributes on incidents of satisfaction or dissatisfaction. It used 30 service attributes generated by Park (2008) in the CIT research into family restaurants. The data were collected from 600 participants, 300 incidents of satisfaction and 300 incidents of dissatisfaction, via an online survey. The t-test was used to confirm the difference between the satisfaction group's and dissatisfaction group's attributes. Eleven attributes were found to be significant at a level of p<0.05. This indicates that the 11 attributes exerted different impacts on satisfaction and dissatisfaction, which confirmed the asymmetric and non-linear relationship. Fourteen attributes were categorized into the core service, 1 attribute into the quality service, 7 attributes into the basic service, and 8 attributes into the neutral service. Strategic customer service management was recommended for the 'A' family restaurant as an example, on the basis of the asymmetric and non-linear relationship and the characteristics of the four service factors.