• Title/Summary/Keyword: Data Classification Systems

A Model-based Collaborative Filtering Through Regularized Discriminant Analysis Using Market Basket Data

  • Lee, Jong-Seok;Jun, Chi-Hyuck;Lee, Jae-Wook;Kim, Soo-Young
    • Management Science and Financial Engineering / v.12 no.2 / pp.71-85 / 2006
  • Collaborative filtering, among other recommender systems, has been known as the most successful recommendation technique. However, it requires the user-item rating data, which may not be easily available. As an alternative, some collaborative filtering algorithms have been developed recently by utilizing the market basket data in the form of the binary user-item matrix. Viewing the recommendation scheme as a two-class classification problem, we proposed a new collaborative filtering scheme using a regularized discriminant analysis applied to the binary user-item data. The proposed discriminant model was built in terms of the major principal components and was used for predicting the probability of purchasing a particular item by an active user. The proposed scheme was illustrated with two modified real data sets and its performance was compared with the existing user-based approach in terms of the recommendation precision.
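
The abstract above does not state which form of regularization is used. As a hedged illustration only, one common textbook formulation of regularized discriminant analysis shrinks each class covariance toward the pooled covariance, and the pooled covariance toward a scaled identity. Here $z$ is assumed to be the vector of principal component scores for an active user, $k$ indexes the two classes (purchase, no purchase), and $\alpha$, $\gamma$, $\pi_k$, $\hat{\mu}_k$ are symbols introduced for this sketch, not taken from the paper.

```latex
\begin{align}
\hat{\Sigma}(\gamma) &= \gamma\,\hat{\Sigma} + (1-\gamma)\,\hat{\sigma}^{2} I,
  & 0 \le \gamma \le 1, \\
\hat{\Sigma}_k(\alpha) &= \alpha\,\hat{\Sigma}_k + (1-\alpha)\,\hat{\Sigma},
  & 0 \le \alpha \le 1, \\
\delta_k(z) &= -\tfrac{1}{2}\log\bigl|\hat{\Sigma}_k(\alpha)\bigr|
  - \tfrac{1}{2}\,(z-\hat{\mu}_k)^{\top}\hat{\Sigma}_k(\alpha)^{-1}(z-\hat{\mu}_k)
  + \log \pi_k, \\
P(k \mid z) &= \frac{\exp\,\delta_k(z)}{\sum_{j}\exp\,\delta_j(z)}.
\end{align}
```

Under these assumptions, the estimated probability of purchase is $P(k=\text{purchase} \mid z)$, and items can be ranked by it for recommendation.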

The Study about the Comparison of Oriental-Western Medicine on the Classification and Diagnosis of Headache (두통의 분류와 진단의 동서의학적 고찰)

  • Jung, Chan-Yung;Kim, Eun-Jung;Jang, Min-Gee;Yoon, Eun-Hye;Nam, Dong-Woo;Kang, Jung-Won;Lee, Seung-Deok;Lee, Jae-Dong;Kim, Kap-Sung
    • Journal of Acupuncture Research / v.26 no.6 / pp.225-239 / 2009
  • Objectives: To establish a well-organized and systematic oriental medicine classification of headache, the diagnosis and treatment systems of western and oriental medicine for headache were reviewed. Methods: The history and development of the western medicine classification of headache were studied, and a literature review of the oriental medicine classification of headache was conducted. The characteristics of each classification system were assessed. Results: In western medicine, many international societies concerned with headache have been established. Through these societies, a classification of headache that can be used by both researchers and practitioners has been proposed, and its use in studies is strongly recommended in order to increase utilization. As data have accumulated, the classification system has been updated in new versions. In oriental medicine, by contrast, various classification systems of headache are presented across the literature, and efforts to unify and systematize the oriental medicine classification of headache have been lacking. Conclusions: A standardized oriental medicine headache classification system, based on the various existing classifications and detailed descriptions, needs to be established and put to use.

Pillar and Vehicle Classification using Ultrasonic Sensors and Statistical Regression Method (통계적 회귀 기법을 활용한 초음파 센서 기반의 기둥 및 차량 분류 알고리즘)

  • Lee, Chung-Su;Park, Eun-Soo;Lee, Jong-Hwan;Kim, Jong-Hee;Kim, Hakil
    • Journal of Institute of Control, Robotics and Systems / v.20 no.4 / pp.428-436 / 2014
  • This paper proposes a statistical regression method for classifying pillars and vehicles in parking areas using a single ultrasonic sensor. The ultrasonic sensor provides three types of information: the time of flight (TOF), the peak, and the width of a pulse, from which 67 different features are extracted through segmentation and data preprocessing. Classification with multiple SVM and with multinomial logistic regression was applied to the set of extracted features, achieving accuracies of 85% and 89.67%, respectively, on a set of real-world data. The experimental results indicate that the proposed feature extraction and classification scheme is applicable to object classification using an ultrasonic sensor.
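
The multinomial logistic regression step mentioned above can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the two-dimensional feature vectors, the class coding (0 = pillar, 1 = vehicle), the learning rate, and the epoch count are all assumptions made for the example, whereas the paper uses 67 features extracted from real sensor data.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, x):
    """Linear score per class; the last weight in each row is the bias."""
    xb = list(x) + [1.0]
    return [sum(w * v for w, v in zip(row, xb)) for row in W]

def train_softmax(X, y, n_classes, lr=0.1, epochs=500):
    """Fit multinomial logistic regression by stochastic gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            xb = list(xi) + [1.0]
            probs = softmax(class_scores(W, xi))
            for k in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. the class-k score
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j in range(n_features + 1):
                    W[k][j] -= lr * err * xb[j]
    return W

def predict(W, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, x)
    return max(range(len(W)), key=lambda k: scores[k])

# Hypothetical, already-normalized features (e.g. pulse peak and width).
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = [0, 0, 1, 1]  # assumed coding: 0 = pillar, 1 = vehicle
W = train_softmax(X, y, n_classes=2)
```

Calling `predict(W, [0.15, 0.1])` then classifies a new feature vector. The same code handles more than two classes by raising `n_classes`.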

A Study on the Analysis and Classification of Types and Causes of Railway Accidents (철도사고 위험분류 및 원인분석에 관한 연구)

  • Park Chan-Woo;Park Joo-Nam;Wang Jong-Bae;Cho Yun-ok
    • Proceedings of the KSR Conference / 2005.11a / pp.599-604 / 2005
  • As a form of public transportation capable of high-volume transport, the railway is safe and punctual, but accidents such as collisions, derailments, and fires can lead to disaster. Advanced countries therefore carry out a System Safety Plan consisting of interlinked program activities that maintain or improve the safety level by identifying hazards, evaluating them, taking and implementing countermeasures, and correcting problems. In particular, they systematically manage the hazards that cause railway accidents and the factors that may threaten safety, using national classifications of risks and causes together with analysis of related data, such as accident/incident databases and safety regulations and standards. With the enforcement of railway safety regulations, the domestic railway is currently trying to improve its railway safety management system, and research on an accident/incident classification system is one way to improve it. In this research, we reviewed the hazardous factors of railway systems and the classification of causes as a starting point for system safety management, and we studied the development of a railway accident classification based on these findings. The results can be used to identify hazards and to support systematic safety management activities at the railway accident reporting and investigation stage.

Multi-Label Classification for Corporate Review Text: A Local Grammar Approach (머신러닝 기반의 기업 리뷰 다중 분류: 부분 문법 적용을 중심으로)

  • HyeYeon Baek;Young Kyun Chang
    • Information Systems Review / v.25 no.3 / pp.27-41 / 2023
  • Unlike the previous works focusing on the state-of-the-art methodologies to improve the performance of machine learning models, this study improves the 'quality' of training data used in machine learning. We propose a method to enhance the quality of training data through the processing of 'local grammar,' frequently used in corpus analysis. We collected a vast amount of unstructured corporate review text data posted by employees working in the top 100 companies in Korea. After improving the data quality using the local grammar process, we confirmed that the classification model with local grammar outperformed the model without it in terms of classification performance. We defined five factors of work engagement as classification categories, and analyzed how the pattern of reviews changed before and after the COVID-19 pandemic. Through this study, we provide evidence that shows the value of the local grammar-based automatic identification and classification of employee experiences, and offer some clues for significant organizational cultural phenomena.

SEMISUPERVISED CLASSIFICATION FOR FAULT DIAGNOSIS IN NUCLEAR POWER PLANTS

  • MA, JIANPING;JIANG, JIN
    • Nuclear Engineering and Technology / v.47 no.2 / pp.176-186 / 2015
  • Pattern classification has become an important tool for fault diagnosis in nuclear power plants (NPPs). However, it is often difficult to obtain training data under fault conditions to train a supervised classification model. By contrast, normal plant operating data can be made readily available through the increased deployment of supervisory control and data acquisition systems. Such data can also be used to train classification models to improve the performance of the fault diagnosis scheme. In this paper, a fault diagnosis scheme based on semisupervised classification (SSC) is developed. In this scheme, new measurements collected from the plant are integrated with data observed under fault conditions to train the SSC models. The trained models are subsequently applied to new measurements for fault diagnosis. In comparison with supervised classifiers, the proposed scheme requires significantly fewer data collected under fault conditions to train the classifier. The developed scheme has been validated using different fault scenarios on a desktop NPP simulator as well as on a physical NPP simulator using a graph-based SSC algorithm. All the considered faults have been successfully diagnosed. The results demonstrate that SSC is a promising tool for fault diagnosis in NPPs.
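
The abstract does not name the graph-based SSC algorithm that was used. As a hedged illustration of the general idea, the sketch below implements a simple label propagation scheme: a few labeled points keep their labels, and unlabeled points repeatedly take the similarity-weighted average of their neighbors' label distributions. The one-dimensional measurements, the label names, and the kernel width `sigma` are assumptions made for the example.

```python
import math

def label_propagation(points, labels, sigma=0.5, iterations=100):
    """Propagate labels over a Gaussian-weighted graph.

    labels[i] is a class name for labeled points and None for unlabeled ones.
    """
    n = len(points)
    classes = sorted({l for l in labels if l is not None})
    # Edge weights: Gaussian kernel on Euclidean distance, no self-loops.
    W = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Labeled points start one-hot; unlabeled points start uniform.
    F = [[1.0 if labels[i] == c else 0.0 for c in classes]
         if labels[i] is not None
         else [1.0 / len(classes)] * len(classes)
         for i in range(n)]
    for _ in range(iterations):
        new_F = []
        for i in range(n):
            if labels[i] is not None:
                new_F.append(F[i])  # clamp the known labels
                continue
            total = sum(W[i])
            new_F.append([sum(W[i][j] * F[j][k] for j in range(n)) / total
                          for k in range(len(classes))])
        F = new_F
    return [classes[max(range(len(classes)), key=lambda k: F[i][k])]
            for i in range(n)]

# Hypothetical measurements: two labeled points, four unlabeled ones.
points = [(0.0,), (0.2,), (0.4,), (3.0,), (3.2,), (3.4,)]
labels = ["normal", None, None, "fault", None, None]
predicted = label_propagation(points, labels)
```

With these toy inputs, the unlabeled points near 0 inherit the "normal" label and those near 3 inherit "fault", which shows how a small amount of labeled fault data can be spread across many unlabeled measurements.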

On the Fuzzy Membership Function of Fuzzy Support Vector Machines for Pattern Classification of Time Series Data (퍼지서포트벡터기계의 시계열자료 패턴분류를 위한 퍼지소속 함수에 관한 연구)

  • Lee, Soo-Yong
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.6 / pp.799-803 / 2007
  • In this paper, we propose a new fuzzy membership function for fuzzy support vector machines (FSVM). We apply a fuzzy membership to each input point of the SVM and reformulate the SVM as a fuzzy SVM so that different input points can make different contributions to the learning of the decision surface. The proposed method enhances the SVM by reducing the effect of outliers and noise in the data points. This paper compares the classification and estimation performance of the SVM, FSVM(1), and FSVM(2) models, which are attracting attention in time series prediction.
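
The membership function proposed in the paper is not given in the abstract. To illustrate the general mechanism only, the sketch below uses a membership that is common in the FSVM literature: each point's membership decreases with its distance from its own class center, so outliers contribute less to training. The function name, the toy data, and the offset `delta` are assumptions made for the example.

```python
import math

def class_center_memberships(points, labels, delta=0.1):
    """Membership s_i = 1 - d_i / (r_c + delta), where d_i is the distance
    from point i to its class center and r_c is the class radius."""
    centers = {}
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        centers[c] = [sum(col) / len(members) for col in zip(*members)]
    dists = [math.dist(p, centers[l]) for p, l in zip(points, labels)]
    radius = {c: max(d for d, l in zip(dists, labels) if l == c)
              for c in centers}
    # delta > 0 keeps every membership strictly positive.
    return [1.0 - d / (radius[l] + delta) for d, l in zip(dists, labels)]

# Hypothetical two-class data; (5.0, 5.0) is an outlier in class +1.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [1, 1, 1, 1, -1, -1, -1]
memberships = class_center_memberships(points, labels)
```

In a fuzzy SVM, each membership would scale the penalty on that point's slack variable, so the outlier at (5.0, 5.0), which receives the smallest membership in its class, has the least influence on the decision surface.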

Improved Feature Extraction of Hand Movement EEG Signals based on Independent Component Analysis and Spatial Filter

  • Nguyen, Thanh Ha;Park, Seung-Min;Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.4 / pp.515-520 / 2012
  • In a brain-computer interface (BCI) system, the most important part is the classification of human thoughts so that they can be translated into commands. The more accurate the classification, the more effective the BCI system. To improve the quality of the BCI system, we propose reducing noise and artifacts in the recorded data before analysis. We used auditory stimuli instead of visual ones to eliminate eye movement, unwanted visual activation, and gaze control. We applied the independent component analysis (ICA) algorithm to purify the sources that make up the raw signals. One of the best-known spatial filters in the BCI context is common spatial patterns (CSP), which maximizes the variance of one class while minimizing that of the other by using the class covariance matrices. ICA and CSP together act as filters, a coarse filter followed by a refinement, and improve the classification results of linear discriminant analysis (LDA).
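
For reference, the standard two-class CSP criterion can be written as below. The notation is introduced here and is not taken from the paper: $C_1$ and $C_2$ are the class-wise spatial covariance matrices of the EEG trials, $w$ is a spatial filter, and $X$ is a single multichannel trial.

```latex
\begin{align}
w^{*} &= \arg\max_{w}\; \frac{w^{\top} C_1\, w}{w^{\top} C_2\, w}, \\
C_1\, w &= \lambda\, C_2\, w, \\
f &= \log \operatorname{var}\!\bigl(w^{\top} X\bigr).
\end{align}
```

The maximizers of the ratio are generalized eigenvectors of $(C_1, C_2)$. Filters with the largest and smallest eigenvalues $\lambda$ are typically kept, and the log-variance features $f$ of the filtered trials are what a classifier such as LDA would receive.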

Radar and Vision Sensor Fusion for Primary Vehicle Detection (레이더와 비전센서 융합을 통한 전방 차량 인식 알고리즘 개발)

  • Yang, Seung-Han;Song, Bong-Sob;Um, Jae-Young
    • Journal of Institute of Control, Robotics and Systems / v.16 no.7 / pp.639-645 / 2010
  • This paper presents a sensor fusion algorithm that recognizes a primary vehicle by fusing radar and monocular vision data. Most commercial radars may lose track of the primary vehicle, i.e., the closest preceding vehicle in the same lane, when it stops or travels alongside other preceding vehicles in the adjacent lane at similar velocity and range. To compensate for this performance degradation of the radar, vehicle detection information from the vision sensor and the path predicted by the ego vehicle sensors are combined for target classification. The target classification then works with probabilistic association filters to track the primary vehicle. Finally, the performance of the proposed sensor fusion algorithm is validated using field test data collected on a highway.

A Fuzzy-Rough Classification Method to Minimize the Coupling Problem of Rules (규칙의 커플링문제를 최소화하기 위한 퍼지-러프 분류방법)

  • Son, Chang-S.;Chung, Hwan-M.;Seo, Suk-T.;Kwon, Soon-H.
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.4 / pp.460-465 / 2007
  • In this paper, we propose a novel pattern classification method based on the statistical properties of the given data and fuzzy-rough sets to minimize the coupling problem of the rules. In the proposed method, the statistical properties are used as a selection criterion for deciding the number of partitions of the antecedent fuzzy sets and for minimizing the coupling problem of the generated rules. Moreover, rough sets are used as a tool to remove unnecessary attributes from the rules generated from the numerical data. To verify the validity of the proposed method, we compared its classification results (i.e., classification precision) with those of conventional pattern classification methods on Fisher's Iris data. From the experimental results, we conclude that the proposed method performs relatively better than classification methods based on conventional approaches.