• Title/Summary/Keyword: Classification Attributes


Classifier Selection using Feature Space Attributes in Local Region (국부적 영역에서의 특징 공간 속성을 이용한 다중 인식기 선택)

  • Shin Dong-Kuk;Song Hye-Jeong;Kim Baeksop
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1684-1690 / 2004
  • This paper presents a method for classifier selection that uses the distribution of the training samples in a small region surrounding a sample. The conventional DCS-LA (Dynamic Classifier Selection - Local Accuracy) selects a classifier dynamically by comparing the local accuracy of each classifier at test time, which inevitably requires a long classification time. In the proposed approach, by contrast, the best classifier in each local region is stored in the FSA (Feature Space Attribute) table during training, and testing is done simply by referring to the table. This approach enables fast classification because the local accuracy of each classifier does not have to be computed at test time. Two feature space attributes are used: the entropy and the density of the k training samples around each sample. Each sample in the feature space is mapped to a point in the attribute space spanned by these two attributes. The attribute space is divided into regular rectangular cells, and the local accuracy of each classifier is recorded for each cell. The cells with their associated local accuracies make up the FSA table. When a test sample is presented, the cell to which it belongs is first determined by calculating the two attributes, and then the most accurate classifier for that cell is chosen from the FSA table. To show the effectiveness of the proposed algorithm, it is compared with the conventional DCS-LA using the Elena database. The experiments show that the accuracy of the proposed algorithm is almost the same as that of DCS-LA, while classification is about four times faster.
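
The table lookup described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the density definition (inverse of the mean distance to the k neighbours), the fixed cell widths in `steps`, the fallback classifier for empty cells, and all function names are assumptions made for illustration. Classifiers are modelled as plain functions from a sample to a label.

```python
import math
from collections import Counter, defaultdict

def attributes(train_x, train_y, x, k, exclude=None):
    """Return (entropy, density) of the k training samples nearest to x.

    Density is ASSUMED here to be 1 / (mean neighbour distance); the
    abstract does not give the exact definition.
    """
    neighbours = sorted(
        (math.dist(p, x), i) for i, p in enumerate(train_x) if i != exclude
    )[:k]
    n = len(neighbours)
    counts = Counter(train_y[i] for _, i in neighbours)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    mean_dist = sum(d for d, _ in neighbours) / n
    return entropy, 1.0 / (mean_dist + 1e-12)

def cell_of(attrs, steps):
    """Map a point in attribute space to a rectangular cell index."""
    return tuple(int(a // s) for a, s in zip(attrs, steps))

def build_fsa_table(train_x, train_y, classifiers, k, steps):
    """Training time: store, per cell, the index of the most accurate classifier."""
    hits = defaultdict(lambda: [0] * len(classifiers))
    for i, (x, y) in enumerate(zip(train_x, train_y)):
        cell = cell_of(attributes(train_x, train_y, x, k, exclude=i), steps)
        for j, clf in enumerate(classifiers):
            hits[cell][j] += clf(x) == y  # count correct predictions
    # Every classifier sees the same samples in a cell, so the one with
    # the most hits is also the one with the highest local accuracy.
    return {cell: max(range(len(h)), key=h.__getitem__) for cell, h in hits.items()}

def classify(x, table, train_x, train_y, classifiers, k, steps, fallback=0):
    """Test time: compute the two attributes, look up the cell, apply one classifier."""
    cell = cell_of(attributes(train_x, train_y, x, k), steps)
    return classifiers[table.get(cell, fallback)](x)

# Toy data (made up for this sketch): two 1-D clusters and two simple classifiers.
train_x = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
train_y = [0, 0, 0, 1, 1, 1]
classifiers = [lambda x: 0, lambda x: int(x[0] > 2.5)]
steps = (0.5, 100.0)  # cell widths for entropy and density (assumed values)
table = build_fsa_table(train_x, train_y, classifiers, k=2, steps=steps)
prediction = classify((5.05,), table, train_x, train_y, classifiers, k=2, steps=steps)
```

In this sketch the test-time cost is one neighbour search plus one classifier call, whereas a DCS-LA style selection would also have to evaluate every classifier on the neighbours of each test sample.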

Privacy Disclosure and Preservation in Learning with Multi-Relational Databases

  • Guo, Hongyu;Viktor, Herna L.;Paquet, Eric
    • Journal of Computing Science and Engineering / v.5 no.3 / pp.183-196 / 2011
  • There has recently been a surge of interest in relational database mining that aims to discover useful patterns across multiple interlinked database relations. It is crucial for a learning algorithm to explore the multiple inter-connected relations so that important attributes are not excluded when mining such relational repositories. However, from a data privacy perspective, it becomes difficult to identify all possible relationships between attributes from the different relations, considering a complex database schema. That is, seemingly harmless attributes may be linked to confidential information, leading to data leaks when building a model. Thus, we are at risk of disclosing unwanted knowledge when publishing the results of a data mining exercise. For instance, consider a financial database classification task to determine whether a loan is considered high risk. Suppose that we are aware that the database contains another confidential attribute, such as income level, that should not be divulged. One may thus choose to eliminate, or distort, the income level from the database to prevent potential privacy leakage. However, even after distortion, a learning model against the modified database may accurately determine the income level values. It follows that the database is still unsafe and may be compromised. This paper demonstrates this potential for privacy leakage in multi-relational classification and illustrates how such potential leaks may be detected. We propose a method to generate a ranked list of subschemas that maintains the predictive performance on the class attribute, while limiting the disclosure risk, and predictive accuracy, of confidential attributes. We illustrate and demonstrate the effectiveness of our method against a financial database and an insurance database.

A Comparative Study of Medical Data Classification Methods Based on Decision Tree and System Reconstruction Analysis

  • Tang, Tzung-I;Zheng, Gang;Huang, Yalou;Shu, Guangfu;Wang, Pengtao
    • Industrial Engineering and Management Systems / v.4 no.1 / pp.102-108 / 2005
  • This paper studies medical data classification methods, comparing decision trees and system reconstruction analysis as applied to heart disease data mining. The data were collected from patients with coronary heart disease and comprise 1,723 records of 71 attributes each. We use the system-reconstruction method to weight the data. We then apply decision tree algorithms: induction of decision trees (ID3), C4.5, classification and regression tree (CART), Chi-square automatic interaction detector (CHAID), and exhaustive CHAID. We use the results to compare the correct classification rate, number of leaves, and tree depth of the different decision-tree algorithms. The experiments show that weighting the data can improve the correct classification rate for coronary heart disease data but has little effect on tree depth and number of leaves.
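
As background for the ID3 family named in this abstract, the following minimal sketch shows the information-gain criterion that ID3 uses to choose a split attribute. The records, attribute names, and labels are made up for illustration and are not taken from the paper's data set; the weighting step of the system-reconstruction method is not modelled.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the records on one attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    """ID3 picks the attribute with the largest information gain at each node."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

# Hypothetical records (illustrative only).
rows = [
    {"chest_pain": "yes", "smoker": "no"},
    {"chest_pain": "yes", "smoker": "yes"},
    {"chest_pain": "no", "smoker": "no"},
    {"chest_pain": "no", "smoker": "yes"},
]
labels = ["chd", "chd", "healthy", "healthy"]
root = best_attribute(rows, labels, ["chest_pain", "smoker"])
```

C4.5, CART, and CHAID replace this criterion with gain ratio, Gini impurity, and chi-square tests respectively, which is one reason the resulting trees can differ in depth and number of leaves.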

Classification System of Fashion Emotion for the Standardization of Data (데이터 표준화를 위한 패션 감성 분류 체계)

  • Park, Nanghee;Choi, Yoonmi
    • Journal of the Korean Society of Clothing and Textiles / v.45 no.6 / pp.949-964 / 2021
  • Accumulation of high-quality data is crucial for AI learning. The goal of using AI in fashion services is to propose a creative, personalized solution that comes close to the know-how of a human operator. Such customized solutions require an understanding of fashion products and emotions. It is therefore necessary to accumulate data on the attributes of fashion products and on fashion emotion. The first step in accumulating fashion data is to standardize the attributes within a coherent system. The purpose of this study is to propose a fashion emotion classification system. To this end, images of fashion products were collected, and metadata were obtained by allowing consumers to describe their emotions about the fashion images freely. An emotion classification system with a hierarchical structure was then constructed by performing frequency and CONCOR analyses on the metadata. A final classification system was proposed by supplementing the attribute values with reference to findings from previous studies and SNS data.

Advancements in Unmanned Aerial Vehicle Classification, Tracking, and Detection Algorithms

  • Ahmed Abdulhakim Al-Absi
    • International journal of advanced smart convergence / v.12 no.3 / pp.32-39 / 2023
  • This paper provides a comprehensive overview of UAV classification, tracking, and detection, offering researchers a clear understanding of these fundamental concepts. It elucidates how classification categorizes UAVs based on attributes, how tracking monitors real-time positions, and how detection identifies UAV presence. The interconnectedness of these aspects is highlighted, with detection enhancing tracking and classification aiding in anomaly identification. Moreover, the paper emphasizes the relevance of simulations in the context of drones and UAVs, underscoring their pivotal role in training, testing, and research. By succinctly presenting these core concepts and their practical implications, the paper equips researchers with a solid foundation to comprehend and explore the complexities of UAV operations and the role of simulations in advancing this dynamic field.

Interpolation on data with multiple attributes by a neural network

  • Azumi, Hiroshi;Hiraoka, Kazuyuki;Mishima, Taketoshi
    • Proceedings of the IEEK Conference / 2002.07b / pp.814-817 / 2002
  • High-dimensional data with two or more attributes are considered. A typical example of such data is face images of various individuals and expressions. In these cases, collecting a complete data set is often difficult because the number of combinations can be large. In the present study, we propose a method to interpolate data for missing combinations from other data. If this becomes possible, robust recognition of multiple attributes can be expected. The key to this problem is the appropriate extraction of the similarity shared by face images of the same individual or the same expression. A bilinear model [1] has been proposed as a solution to this problem. However, experiments applying the bilinear model to the classification of face images resulted in low performance [2]. To overcome the limits of the bilinear model, this research adopts a nonlinear model on a neural network, and the usefulness of this model is confirmed experimentally.


Hybrid Feature Selection Method Based on Genetic Algorithm for the Diagnosis of Coronary Heart Disease

  • Wiharto, Wiharto;Suryani, Esti;Setyawan, Sigit;Putra, Bintang PE
    • Journal of information and communication convergence engineering / v.20 no.1 / pp.31-40 / 2022
  • Coronary heart disease (CHD) is a comorbidity of COVID-19; therefore, routine early diagnosis is crucial. A large number of examination attributes in the context of diagnosing CHD is a distinct obstacle during the pandemic when the number of health service users is significant. The development of a precise machine learning model for diagnosis with a minimum number of examination attributes can allow examinations and healthcare actions to be undertaken quickly. This study proposes a CHD diagnosis model based on feature selection, data balancing, and ensemble-based classification methods. In the feature selection stage, a hybrid SVM-GA combined with fast correlation-based filter (FCBF) is used. The proposed system achieved an accuracy of 94.60% and area under the curve (AUC) of 97.5% when tested on the z-Alizadeh Sani dataset and used only 8 of 54 inspection attributes. In terms of performance, the proposed model can be placed in the very good category.

A Study of Landscape Construction Work Classification for System Instruction of New Estimation System based on Historical Construction data. - With regard to Housing Landscape Construction - (실적공사비 적산방식 도입을 위한 조경공사 공종분류체계에 관한 연구 -주택단지 조경공사를 중심으로-)

  • 박원규;김두하;안동만
    • Journal of the Korean Institute of Landscape Architecture / v.25 no.1 / pp.82-99 / 1997
  • The purpose of this study is to establish a work classification system for landscape construction in order to provide the basis for a new estimation system for public landscape construction. The new estimation system is based on historical construction data. Applying this system requires a standard work classification system, because extensive cost data must be accumulated under a unified construction work classification. In the study of the new estimation system carried out by KICT (Korea Institute of Construction Technology), landscaping works are classified under the earthwork of civil engineering. This classification seems unreasonable, because landscape architecture has its own specialties and professional domain. In this study, information classification systems in the construction industry and various landscaping works of housing developments are analysed. As a result, a standard work classification system for housing landscape construction is proposed in section VI-3. This standard work classification structure consists of three levels of division (i.e., large, middle, and small work divisions). Housing landscape construction works are divided into four large works and twenty-six middle works. Depending on work attributes, the middle and small work divisions can be subdivided further.


A New Decision Tree Algorithm Based on Rough Set and Entity Relationship (러프셋 이론과 개체 관계 비교를 통한 의사결정나무 구성)

  • Han, Sang-Wook;Kim, Jae-Yearn
    • Journal of Korean Institute of Industrial Engineers / v.33 no.2 / pp.183-190 / 2007
  • We present a new decision tree classification algorithm using rough set theory that can induce classification rules, the construction of which is based on core attributes and the relationships between objects. Although decision trees have been widely used in machine learning and artificial intelligence, little research has focused on improving classification quality. We propose a new decision tree construction algorithm that can be simplified and provides improved classification quality. We also compare the new algorithm with the ID3 algorithm in terms of the number of rules.

An Application of the Rough Set Approach to credit Rating

  • Kim, Jae-Kyeong;Cho, Sung-Sik
    • Proceedings of the Korea Inteligent Information System Society Conference / 1999.10a / pp.347-354 / 1999
  • The credit rating represents an assessment of the relative level of risk associated with the timely payments required by a debt obligation. In this paper, we present a new approach to the credit rating of customers based on rough set theory. The rough set concept has proved to be an effective tool for analysing customer information systems that represent knowledge gained by experience. The customer information system describes a set of customers by a set of multi-valued attributes, called condition attributes. The customers are classified into risk groups according to an expert's opinion, called the decision attribute. A natural problem of knowledge analysis is then to discover relationships, in terms of decision rules, between the description of customers by condition attributes and particular decisions. The rough set approach enables one to discover minimal subsets of condition attributes that ensure an acceptable quality of classification of the customers analysed, and to derive decision rules from the customer information system that can be used to support decisions about rating new customers. The rough set approach analyses only facts hidden in the data: it does not need any additional information about the data and does not correct inconsistencies manifested in the data; instead, the rules produced are categorized as certain or possible. A real problem of credit rating evaluation by a department store is studied using the rough set approach.
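
The distinction between certain and possible rules in this abstract rests on the lower and upper approximations of rough set theory, which the following minimal sketch illustrates. The customer table, attribute names, and function names are hypothetical and are not taken from the paper; only the standard definitions of the approximations and of the quality of classification are used.

```python
from collections import defaultdict

def indiscernibility_classes(objects, condition_attrs):
    """Group objects that have identical values on all condition attributes."""
    classes = defaultdict(set)
    for name, row in objects.items():
        classes[tuple(row[a] for a in condition_attrs)].add(name)
    return list(classes.values())

def approximations(objects, condition_attrs, target):
    """Lower approximation: blocks entirely inside the target set (certain).
    Upper approximation: blocks that overlap the target set (possible)."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(objects, condition_attrs):
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper

def quality_of_classification(objects, condition_attrs, decision_attr):
    """Fraction of objects that the condition attributes classify with certainty."""
    certain = 0
    for d in {row[decision_attr] for row in objects.values()}:
        target = {n for n, row in objects.items() if row[decision_attr] == d}
        certain += len(approximations(objects, condition_attrs, target)[0])
    return certain / len(objects)

# Hypothetical customer information system; c3 and c4 are inconsistent:
# same condition values, different expert decisions.
customers = {
    "c1": {"income": "high", "late_payments": "none", "risk": "low"},
    "c2": {"income": "high", "late_payments": "none", "risk": "low"},
    "c3": {"income": "low", "late_payments": "some", "risk": "high"},
    "c4": {"income": "low", "late_payments": "some", "risk": "low"},
}
conditions = ["income", "late_payments"]
high_risk = {n for n, row in customers.items() if row["risk"] == "high"}
lower, upper = approximations(customers, conditions, high_risk)
quality = quality_of_classification(customers, conditions, "risk")
```

Here no customer is certainly high risk (the lower approximation is empty), while c3 and c4 are both possibly high risk, so the inconsistency is kept and reported rather than corrected.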
