• Title/Summary/Keyword: variable feature

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves upon the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem by using a real dataset from Korean companies. The research data included 1800 externally non-audited firms, of which 900 had filed for bankruptcy and 900 had not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performances of the proposed model and other models. To evaluate the effectiveness of the proposed model, its classification accuracy was compared with that of other models. The Q-statistic values and average classification accuracies of base classifiers were investigated.
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
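
The random subspace KNN ensemble described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the candidate k values are all assumptions, and the genetic algorithm search is replaced here by a plain random draw of each member's k and feature subset.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Classify x with k-nearest neighbours using only the given feature indices."""
    def dist(a, b):
        return sum((a[j] - b[j]) ** 2 for j in features)
    # Rank training samples by squared Euclidean distance in the subspace.
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def build_ensemble(n_features, n_members, subset_size, k_choices, rng):
    """Draw one (k, feature subset) pair per member at random.

    Assumption: a genetic algorithm would search over these pairs instead.
    """
    return [
        (rng.choice(k_choices), sorted(rng.sample(range(n_features), subset_size)))
        for _ in range(n_members)
    ]


def ensemble_predict(members, train_X, train_y, x):
    """Aggregate the member predictions by majority vote."""
    votes = Counter(knn_predict(train_X, train_y, x, k, feats) for k, feats in members)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): class 0 clusters near 0, class 1 near 10.
train_X = [[0, 1, 0, 1], [1, 0, 1, 0], [1, 1, 0, 0],
           [9, 10, 9, 10], [10, 9, 10, 9], [10, 10, 9, 9]]
train_y = [0, 0, 0, 1, 1, 1]

members = build_ensemble(n_features=4, n_members=5, subset_size=2,
                         k_choices=[1, 3], rng=random.Random(0))
pred_low = ensemble_predict(members, train_X, train_y, [0.5, 0.5, 0.5, 0.5])
pred_high = ensemble_predict(members, train_X, train_y, [9.5, 9.5, 9.5, 9.5])
```

Giving each member its own k and its own feature subset mirrors the two quantities the study optimizes jointly. A fitness function for a search procedure could score a candidate list of members by its accuracy on the held-out portion of the training data.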

Improving Efficiency of Food Hygiene Surveillance System by Using Machine Learning-Based Approaches (기계학습을 이용한 식품위생점검 체계의 효율성 개선 연구)

  • Cho, Sanggoo;Cho, Seung Yong
    • The Journal of Bigdata / v.5 no.2 / pp.53-67 / 2020
  • This study employs a supervised learning prediction model to detect nonconformity in advance at processed food manufacturing and processing businesses. The study followed the standard machine learning procedure: definition of the objective function, data preprocessing and feature engineering, and model selection and evaluation. The dependent variable was set as the number of detections in supervised inspections over the five years from 2014 to 2018, and the objective function was to maximize the probability of detecting nonconforming companies. The data were preprocessed to reflect not only basic attributes such as revenue, operating duration, and number of employees, but also inspection track records and external climate data. After feature extraction, company risk, item risk, environmental risk, and past violation history were derived as the feature variables that affect the determination of nonconformity, and the machine learning algorithms were applied to the data. The F1-score of the decision tree, one of the ensemble models, was much higher than those of the other models. Based on the results of this study, official food control for food safety management is expected to be enhanced and geared toward data- and evidence-based management as well as a scientific administrative system.
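
The F1-score used to compare the models is the harmonic mean of precision and recall. A small sketch of the calculation follows; the confusion-matrix counts are purely illustrative and are not taken from the study.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # no correct detections at all
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 nonconforming firms caught, 2 false alarms, 2 missed.
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

Because it ignores true negatives, the F1-score suits a setting like this one, where the aim is to detect the nonconforming companies rather than to classify every company correctly.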

Comparison of Prediction Accuracy Between Classification and Convolution Algorithm in Fault Diagnosis of Rotatory Machines at Varying Speed (회전수가 변하는 기기의 고장진단에 있어서 특성 기반 분류와 합성곱 기반 알고리즘의 예측 정확도 비교)

  • Moon, Ki-Yeong;Kim, Hyung-Jin;Hwang, Se-Yun;Lee, Jang Hyun
    • Journal of Navigation and Port Research / v.46 no.3 / pp.280-288 / 2022
  • This study examined the diagnosis of abnormalities and faults in equipment whose rotational speed changes even during regular operation. The purpose of this study was to suggest a procedure for properly applying machine learning to time series data that become non-stationary as the rotational speed changes. Anomaly and fault diagnosis was performed using machine learning: k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Random Forest. To compare the diagnostic accuracy, an autoencoder was used for anomaly detection and a convolution-based Conv1D was additionally used for fault diagnosis. Feature vectors comprising statistical and frequency attributes were extracted, and normalization and dimensionality reduction were applied to the extracted feature vectors. Changes in the diagnostic accuracy of machine learning according to feature selection, normalization, and dimensionality reduction are explained. The hyperparameter optimization process and the layered structure are also described for each algorithm. Finally, the results show that machine learning can accurately diagnose the failure of a variable-rotation machine under appropriate feature treatment, although convolution algorithms have been widely applied to the considered problem.

Design of Nonlinear Controller for Variable Speed Wind Turbines based on Kalman Filter and Artificial Neural Network (칼만필터 및 인공신경망에 기반한 가변속 풍력발전 시스템을 위한 비선형 제어기 설계)

  • Moon, Dae-Sun;Kim, Sung-Ho
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.243-250 / 2010
  • As wind has become one of the fastest growing renewable energy sources, the key issue for wind energy conversion systems is how to operate wind turbines efficiently over a wide range of wind speeds. Compared to fixed speed turbines, variable speed wind turbines feature higher energy yields, lower component stress and fewer grid connection power peaks. Generally, measurement of wind speed is required for the control of a variable speed wind turbine system. However, wind speed measured by anemometers is not accurate for various reasons. In this work, a new control algorithm for variable speed wind turbine systems is proposed; it is based on a Kalman filter, which is used to estimate the wind speed, and an artificial neural network, which generates the optimum rotor speed. To verify the feasibility of the proposed scheme, various simulation studies are carried out using Simulink in Matlab.
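
To illustrate how a Kalman filter can smooth a noisy quantity such as wind speed, here is a scalar sketch. The abstract does not give the state model, so the random-walk model, the noise variances, and all names below are assumptions made only for illustration.

```python
def kalman_1d(measurements, q, r, x0, p0):
    """Filter a scalar signal modelled as a random walk.

    q: process noise variance, r: measurement noise variance,
    x0, p0: initial estimate and its variance (all assumed values).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the random-walk model keeps the state; uncertainty grows by q.
        p = p + q
        # Update: blend the prediction with the measurement via the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical readings in m/s, starting from a deliberately poor initial guess.
readings = [8.0, 8.0, 8.0, 8.0]
estimates = kalman_1d(readings, q=0.01, r=1.0, x0=0.0, p0=1.0)
```

With a large measurement variance `r` relative to `q`, the filter trusts each reading less and converges more slowly, which is the trade-off to tune when the sensor is known to be unreliable.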

Night Time Leading Vehicle Detection Using Statistical Feature Based SVM (통계적 특징 기반 SVM을 이용한 야간 전방 차량 검출 기법)

  • Joung, Jung-Eun;Kim, Hyun-Koo;Park, Ju-Hyun;Jung, Ho-Youl
    • IEMEK Journal of Embedded Systems and Applications / v.7 no.4 / pp.163-172 / 2012
  • A driver assistance system is critical for improving the convenience and stability of vehicle driving. Several systems have already been commercialized, such as adaptive cruise control and forward collision warning systems. Efficient vehicle detection is very important for improving such driver assistance systems. Most existing vehicle detection systems are based on radar, which measures the distance between a host vehicle and leading (or oncoming) vehicles under various weather conditions. However, radar has a high deployment cost and suffers complexity overload when there are many vehicles. Camera-based vehicle detection is a good alternative because of its low cost and simple implementation. In general, night time vehicle detection is more complicated than day time vehicle detection, because it is much more difficult to distinguish vehicle features such as outline and color in a dim environment. This paper proposes a method to detect vehicles at night time by analyzing the color space of captured images while reducing reflections and other light sources. Four color spaces, namely RGB, YCbCr, normalized RGB and Ruta-RGB, are compared with each other and evaluated. A suboptimal threshold value is determined by the Otsu algorithm and applied to extract candidate taillights of leading vehicles. Statistical features such as mean, variance, skewness, kurtosis, and entropy are extracted from the candidate regions and used as the feature vector for an SVM (Support Vector Machine) classifier. According to our simulation results, the proposed statistical feature based SVM provides relatively high performance in detecting leading vehicles at various distances in variable nighttime environments.
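
The five statistical features named in this abstract can be computed from a candidate region's pixel intensities along the following lines. This is a sketch under assumptions: the abstract does not say whether population or sample moments are used or how entropy is defined, so population moments and the Shannon entropy of the intensity histogram are chosen here, and the sample region is made up.

```python
import math
from collections import Counter


def statistical_features(pixels):
    """Return [mean, variance, skewness, kurtosis, entropy] of intensity values."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    std = math.sqrt(var)
    if std == 0:
        skew, kurt = 0.0, 0.0  # flat region: higher moments defined as zero here
    else:
        skew = sum(((p - mean) / std) ** 3 for p in pixels) / n
        kurt = sum(((p - mean) / std) ** 4 for p in pixels) / n
    # Shannon entropy (in bits) of the empirical intensity histogram.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return [mean, var, skew, kurt, entropy]


# Hypothetical candidate region: two dark and two bright pixels.
region = [10, 10, 200, 200]
features = statistical_features(region)
```

The resulting five-element list is the kind of fixed-length feature vector that can be passed to a classifier such as an SVM, regardless of the size of the candidate region.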

An Analysis on Effect of Use Intention of Unmanned Store Customer -focused on franchisee (무인점포 고객의 이용의도에 미치는 영향 분석 -프랜차이즈 가맹점 중심으로)

  • Kang, Seong-Cheol;Han, Kyeong-Seok;Jeon, Woo-Jae
    • Journal of Digital Contents Society / v.19 no.7 / pp.1313-1322 / 2018
  • This study empirically investigated the use intention of self-service shops operated by franchisees. The independent variables were broadly divided into technology-based self-service and shop features. Specifically, convenience of technology-based self-service, speed, functionality of shop features, suitability, and cost of the self-service shop were selected. Expectation confirmation and satisfaction, drawn from expectation confirmation theory, were selected as mediating variables, and use intention was selected as the dependent variable. A survey of self-service shop customers was conducted, and 181 responses were used. According to the analysis, convenience, speed, functionality, and suitability had a positive effect on expectation confirmation, while only convenience and suitability had a positive effect on satisfaction. Cost did not have an effect on use intention, whereas expectation confirmation and satisfaction had a positive effect on use intention.

Role of Features in Plasma Information Based Virtual Metrology (PI-VM) for SiO2 Etching Depth (플라즈마 정보인자를 활용한 SiO2 식각 깊이 가상 계측 모델의 특성 인자 역할 분석)

  • Jang, Yun Chang;Park, Seol Hye;Jeong, Sang Min;Ryu, Sang Won;Kim, Gon Ho
    • Journal of the Semiconductor & Display Technology / v.18 no.4 / pp.30-34 / 2019
  • We analyzed how the features in plasma information based virtual metrology (PI-VM), previously developed by Jang for SiO2 etching depth with a variation of 5%, contribute to the prediction accuracy. As single features, the explanatory power for the process results is in the order of plasma information about the electron energy distribution function (PIEEDF), equipment features, and optical emission spectroscopy (OES) features. In the stepwise variable selection (SVS) procedure, OES features are selected after PIEEDF. The informative vector for the developed PI-VM also shows a relatively high correlation between OES features and etching depth. This is because the reaction rate of each chemical species that governs the etching depth can be sensitively monitored when OES features are used with PIEEDF. Securing PIEEDF is important for the development of virtual metrology (VM) for predicting process results. The role of PIEEDF as an independent feature and its ability to monitor variation of the plasma thermal state can make other features in the SVS procedure more sensitive to the process results. It is expected that fault detection and classification (FDC) can be effectively developed by using the PI-VM.

A Development Method of Web System Combining Service Oriented Architecture with Multi-Software Product Line (서비스지향 아키텍처와 멀티소프트웨어 프로덕트라인을 결합한 웹 시스템 개발 방법)

  • Jung, IlKwon
    • The Journal of Society for e-Business Studies / v.24 no.3 / pp.53-71 / 2019
  • As software systems become larger and more complex, they require a way to reuse software components or modules to provide new functionality. This paper designed a development method for web systems that combines SOA (Service Oriented Architecture) with MPSL (Multi-Software Product Line). By combining SOA and MPSL, the method lets service providers offer, and service users reuse, variable services. From the service provider's viewpoint, the suggested method identifies and implements reusable variable services as features by syntax-based, function-based, and behavior-based methods applying feature identification guidelines, and manages them as reuse assets. From the user's point of view, a web system can be developed by composing services with a workflow model as a way of structuring and reconfiguring services. Measuring the reuse of the web system constructed in this paper by function points and applying the method to a similar project verified a cost reduction effect as reuse increased.

A Study on the Design Plan of Naval Combat System Software to Reduce Cost of Hardware Discontinuation Replacement

  • Jeong-Woo, Son
    • Journal of the Korea Society of Computer and Information / v.28 no.1 / pp.71-78 / 2023
  • In this paper, we analyze the structure of the TV video software, one of the components of the warship combat management system software, and propose a standard architecture that minimizes software modification caused by replacing discontinued warship hardware. The class structure was newly designed to minimize the number of classes modified when warship hardware is replaced, by using FORM (Feature-Oriented Reuse Method) to separate the TV video software into common and variable elements: the common part communicates with the warship combat management system and displays the TV screen, and the variable part handles communication between the operator and the TV camera. In addition, the Strategy design pattern is applied so that classes that directly use hardware-dependent APIs can be added and modified efficiently when discontinued hardware is replaced, and so that the software works with both the discontinued hardware and its replacement. Finally, the reliability testing time and functional testing time of the existing TV video software and the proposed software were measured and compared, confirming that the cost of replacing discontinued hardware was reduced.
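
The way the Strategy pattern isolates hardware-dependent code can be sketched as below. Every class name, method name, and command string here is hypothetical and chosen only for illustration; the abstract does not describe the actual class design.

```python
from abc import ABC, abstractmethod


class CameraDriver(ABC):
    """Strategy interface: hides the hardware-dependent camera API."""

    @abstractmethod
    def zoom(self, level: int) -> str:
        """Return the device command for the requested zoom level."""


class LegacyCameraDriver(CameraDriver):
    """Driver for the discontinued camera (hypothetical command format)."""

    def zoom(self, level: int) -> str:
        return f"LEGACY_ZOOM {level}"


class ReplacementCameraDriver(CameraDriver):
    """Driver for the replacement camera (hypothetical command format)."""

    def zoom(self, level: int) -> str:
        return f"NEW:zoom={level}"


class TvVideoController:
    """Variable part: relays operator commands to whichever driver is installed."""

    def __init__(self, driver: CameraDriver):
        self.driver = driver

    def operator_zoom(self, level: int) -> str:
        return self.driver.zoom(level)


# The controller code is identical for both hardware generations.
old_cmd = TvVideoController(LegacyCameraDriver()).operator_zoom(3)
new_cmd = TvVideoController(ReplacementCameraDriver()).operator_zoom(3)
```

Supporting a new camera then means adding one driver class, while the controller and the common part stay untouched, which is the kind of modification saving the paper aims for.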

A Variable Parameter Model based on SSMS for an On-line Speech and Character Combined Recognition System (음성 문자 공용인식기를 위한 SSMS 기반 가변 파라미터 모델)

  • 석수영;정호열;정현열
    • The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.528-538 / 2003
  • An SCCRS (Speech and Character Combined Recognition System) is developed for mobile devices such as PDAs (Personal Digital Assistants). In the SCCRS, feature extraction is carried out separately for speech and for handwritten characters, but recognition is performed in a common engine. The recognition engine is essentially a CHMM (Continuous Hidden Markov Model) with a variable parameter topology, in order to minimize the number of model parameters and to reduce recognition time. To generate a context-independent variable parameter model, we propose SSMS (Successive State and Mixture Splitting), which gives appropriate numbers of mixtures and states through splitting in the mixture domain and in the time domain. The recognition results show that the proposed SSMS method can reduce the total number of GOPDDs (Gaussian Output Probability Density Distributions) by up to 40.0% compared to the conventional method with a fixed parameter model, at the same recognition performance in a speech recognition system.