• Title/Summary/Keyword: Global feature selection


A Study on Storing Node Addition and Instance Leveling Using DIS Message in RPL (RPL에서 DIS 메시지를 이용한 Storing 노드 추가 및 Instance 평준화 기법 연구)

  • Bae, Sung-Hyun; Yun, Jeong-Oh
    • Journal of IKEEE / v.22 no.3 / pp.590-598 / 2018
  • Recently, interest in IoT (Internet of Things) technology, which provides Internet services to objects, has been increasing. IoT offers a variety of services in home networks, healthcare, and disaster alerts. IoT networks with the LLN (Low Power and Lossy Network) characteristic frequently lose sensor nodes. RPL, the standard routing protocol for IoT, performs a global repair when data loss occurs at a sensor node. However, frequent loss of lower sensor nodes degrades network performance because the full path must be reset repeatedly. In this paper, we propose a method for selecting an additional storing-mode sensor node, addressing the network degradation caused by path resets that persist even after a storing-mode sensor node has been selected, and we propose a method for equalizing the total number of path resets across instances.
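A hypothetical Python sketch of the two ideas in this abstract, i.e. picking an additional storing-mode node and leveling path resets across instances (the node IDs, loss statistics, and selection heuristics are invented for illustration; the paper's actual DIS-message mechanism is not reproduced):

```python
# Hypothetical sketch, not the authors' implementation: choose an extra
# storing-mode node and level path-reset counts across RPL instances.
from dataclasses import dataclass, field

@dataclass
class Instance:
    instance_id: int
    path_resets: int = 0               # cumulative full-path resets observed
    storing_nodes: set = field(default_factory=set)

def pick_additional_storing_node(candidates, loss_counts):
    """Pick the candidate whose subtree loses nodes most often, since a
    local (storing-mode) repair there avoids the most global repairs."""
    return max(candidates, key=lambda n: loss_counts.get(n, 0))

def level_instances(instances):
    """Attach new traffic to the instance with the fewest path resets so
    the total reset count is equalized across instances."""
    return min(instances, key=lambda inst: inst.path_resets)

# Usage with made-up node IDs and loss statistics:
insts = [Instance(1, path_resets=12), Instance(2, path_resets=4)]
print(pick_additional_storing_node({5, 9, 11}, {5: 3, 9: 7, 11: 2}))  # -> 9
print(level_instances(insts).instance_id)                             # -> 2
```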

Deriving Local Association Rules by User Segmentation (사용자 구분에 의한 지역적 연관규칙의 유도)

  • Park, Se-Il; Lee, Soo-Wun
    • Journal of KIISE: Software and Applications / v.29 no.1_2 / pp.53-64 / 2002
  • Association rule discovery is a method that detects associative relationships between items or attributes in transactions. It is one of the most widely studied problems in data mining because it offers useful insight into the types of dependencies that exist in a data set. However, most studies on association rule discovery share the drawback that they cannot discover association rules within user groups that have common characteristics. To solve this problem, we segment the set of users into subgroups using feature selection and user segmentation, so that local association rules can be discovered within each subgroup. To evaluate whether the local association rules are more appropriate than the global ones in each subgroup, the derived local rules are compared with the global association rules in terms of several evaluation measures.
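A minimal sketch of the flow described above, assuming a brute-force one-antecedent rule miner and segmentation on a single user feature (the toy data, thresholds, and segmentation key are illustrative, not the paper's):

```python
# Sketch: mine global association rules, then local rules per user subgroup.
from itertools import permutations
import pandas as pd

def rules(transactions, min_support=0.3, min_conf=0.6):
    """Brute-force 1 -> 1 rules: support = P(A,B), confidence = P(B|A)."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    out = []
    for a, b in permutations(items, 2):
        both = sum(1 for t in transactions if a in t and b in t)
        has_a = sum(1 for t in transactions if a in t)
        if has_a and both / n >= min_support and both / has_a >= min_conf:
            out.append((a, b, both / n, both / has_a))
    return out

# Toy data: "age" is the user feature used for segmentation.
df = pd.DataFrame({
    "age": ["young", "young", "old", "old"],
    "basket": [{"cola", "chips"}, {"cola", "chips"}, {"tea", "book"}, {"tea"}],
})
print("global:", rules(df["basket"].tolist()))
for seg, grp in df.groupby("age"):                 # user segmentation
    print(seg, rules(grp["basket"].tolist()))      # local association rules
```

A rule such as book -> tea holds only inside one subgroup, which is exactly the kind of local rule a purely global analysis misses.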

Automatic scoring of mathematics descriptive assessment using random forest algorithm (랜덤 포레스트 알고리즘을 활용한 수학 서술형 자동 채점)

  • Inyong Choi; Hwa Kyung Kim; In Woo Chung; Min Ho Song
    • The Mathematical Education / v.63 no.2 / pp.165-186 / 2024
  • Despite growing attention to artificial intelligence-based automated scoring as a way to support the introduction of descriptive items in school environments and large-scale assessments, foundational research in mathematics noticeably lags behind other subjects. This study developed an automated scoring model for two descriptive items in first-year middle school mathematics using the Random Forest algorithm, evaluated its performance, and explored ways to enhance that performance. The accuracy of the final models for the two items was 0.95 to 1.00 and 0.73 to 0.89, respectively, which is relatively high compared to automated scoring models in other subjects. We found that strategic selection of the number of evaluation categories, taking the amount of data into account, is crucial for the effective development and performance of automated scoring models. In addition, text preprocessing by mathematics education experts proved effective in improving both the performance and the interpretability of the model. Selecting a vectorization method that matches the characteristics of the items and data was identified as another way to enhance model performance. Furthermore, we confirmed that oversampling is a useful way to supplement performance when practical limitations hinder balanced data collection. To enhance educational utility, further research is needed on how to use the feature importances derived from the Random Forest-based scoring model to generate information useful for teaching and learning, such as feedback. This study is significant as foundational research in automated scoring of mathematics descriptive assessment, and various follow-up studies through close collaboration between AI experts and mathematics education experts are needed.
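A minimal sketch of this kind of scoring pipeline, assuming TF-IDF vectorization and scikit-learn's Random Forest (the toy responses, rubric, and hyperparameters are placeholders, not the study's items or settings):

```python
# Sketch: vectorize short answers, fit a Random Forest scorer, and inspect
# feature importances as raw material for feedback.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

responses = ["x + 2 = 5 so x = 3", "x = 3 because 3 + 2 = 5", "x = 7"]
scores = [1, 1, 0]            # toy rubric: 1 = correct reasoning, 0 = not

vec = TfidfVectorizer()       # the vectorization choice affects performance
X = vec.fit_transform(responses)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, scores)

# Tokens with high importance hint at what drives the predicted score.
pairs = zip(vec.get_feature_names_out(), rf.feature_importances_)
for token, imp in sorted(pairs, key=lambda p: -p[1])[:5]:
    print(f"{token}: {imp:.3f}")
```

For imbalanced score categories, an oversampler such as imblearn's RandomOverSampler could be applied to the training matrix before fitting, corresponding to the oversampling remedy the abstract mentions.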

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae; Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults affect not only the stakeholders of the bankrupt company, including managers, employees, creditors, and investors, but also have a ripple effect on the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing various corporate default models. As a result, even large corporations known as 'chaebol enterprises' went bankrupt. Even after that, the analysis of past corporate defaults focused on specific variables, and when the government restructured immediately after the global financial crisis, it focused only on certain main variables such as the 'debt ratio'. A multifaceted study of corporate default prediction models is essential to serve diverse interests and to avoid a sudden total collapse such as the 'Lehman Brothers case' of the global financial crisis. The key variables used in corporate default prediction vary over time: Deakin's (1972) study shows that the major factors affecting corporate failure changed from those identified by Beaver (1967, 1968) and Altman (1968), and Grice (2001) likewise found shifts in the importance of the predictive variables in Zmijewski's (1984) and Ohlson's (1980) models. However, past studies used static models, and most do not consider changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series analysis algorithm that reflects dynamic change. Based on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively. To construct a bankruptcy model that is consistent over time, we first train the deep learning time series models on the data before the financial crisis (2000~2006). Parameter tuning of the existing models and the deep learning time series algorithms is conducted on validation data that include the financial crisis period (2007~2008). As a result, we construct models that show patterns similar to the results on the training data and exhibit excellent predictive power. Each bankruptcy prediction model is then retrained on the combined training and validation data (2000~2008), applying the optimal parameters found during validation. Finally, the corporate default prediction models trained over the nine years are evaluated and compared using the test data (2009), which demonstrates the usefulness of the corporate default prediction model based on the deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis and the logit model), we show that the deep learning time series model is useful for robust corporate default prediction across all three bundles of variables. The definition of bankruptcy used is the same as that of Lee (2015). The independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable groups.
The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and the deep learning time series algorithms are compared. Corporate data pose the problems of nonlinear variables, multi-collinearity among variables, and lack of data. The logit model handles nonlinearity, the Lasso regression model alleviates the multi-collinearity problem, and the deep learning time series algorithm, using a variable data generation method, compensates for the lack of data. Big data technology is moving from simple human analysis to automated AI analysis and, ultimately, toward intertwined AI applications. Although the study of corporate default prediction models using time series algorithms is still at an early stage, the deep learning algorithm builds corporate default prediction models much faster than regression analysis and offers better predictive power. Amid the Fourth Industrial Revolution, the Korean government and governments overseas are working to integrate such systems into everyday life, yet deep learning time series research for the financial industry remains insufficient. This is an initial study on deep learning time series analysis of corporate defaults, and we hope it will serve as comparative reference material for non-specialists beginning studies that combine financial data with deep learning time series algorithms.
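A hedged sketch of the modeling flow described above, assuming scikit-learn's Lasso for variable selection and a small Keras LSTM (the synthetic data, shapes, and hyperparameters are illustrative assumptions, not the paper's settings):

```python
# Sketch: Lasso picks a sparse bundle of financial ratios, then an LSTM
# reads each firm's ratios as a yearly sequence to predict default.
import numpy as np
from sklearn.linear_model import Lasso
import tensorflow as tf

rng = np.random.default_rng(0)
n_firms, n_years, n_ratios = 200, 7, 20
X = rng.normal(size=(n_firms, n_years, n_ratios))  # yearly financial ratios
y = rng.integers(0, 2, size=n_firms).astype("float32")  # 1 = default (toy)

# Step 1: Lasso on the final-year snapshot selects a variable group.
lasso = Lasso(alpha=0.01).fit(X[:, -1, :], y)
keep = np.flatnonzero(lasso.coef_)                 # indices of kept ratios
if keep.size == 0:                                 # fall back if all zeroed
    keep = np.arange(n_ratios)

# Step 2: LSTM over the yearly sequences of the selected ratios.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_years, keep.size)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
model.fit(X[:, :, keep], y, epochs=5, batch_size=32, verbose=0)
```

Training on 2000~2006, tuning on 2007~2008, and testing on 2009 would then follow the split the abstract describes.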

Enhancement of Inter-Image Statistical Correlation for Accurate Multi-Sensor Image Registration (정밀한 다중센서 영상정합을 위한 통계적 상관성의 증대기법)

  • Kim, Kyoung-Soo; Lee, Jin-Hak; Ra, Jong-Beom
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.4 s.304 / pp.1-12 / 2005
  • Image registration is a process to establish the spatial correspondence between images of the same scene that are acquired at different viewpoints, at different times, or by different sensors. This paper presents a new algorithm for robust registration of images acquired by multiple sensors having different modalities, here EO (electro-optic) and IR (infrared) sensors. Two approaches, feature-based and intensity-based, are generally possible for image registration. In the former, selection of accurate common features is crucial for high performance, but features in the EO image are often not the same as those in the IR image; hence, this approach is inadequate for registering EO/IR images. In the latter, normalized mutual information (NMI) has been widely used as a similarity measure due to its high accuracy and robustness, and NMI-based image registration methods assume that the statistical correlation between two images is global. Unfortunately, we find that EO and IR images often do not satisfy this assumption, so registration accuracy is not high enough for some applications. In this paper, we propose a two-stage NMI-based registration method based on the analysis of the statistical correlation between EO/IR images. In the first stage, for robust registration, we propose two preprocessing schemes: extraction of statistically correlated regions (ESCR) and enhancement of statistical correlation by filtering (ESCF). For each image, ESCR automatically extracts the regions that are highly correlated with the corresponding regions in the other image, and ESCF adaptively filters each image to enhance the statistical correlation between them. In the second stage, the two output images are registered using an NMI-based algorithm. The proposed method provides promising results for various EO/IR sensor image pairs in terms of accuracy, robustness, and speed.
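For reference, the similarity measure the method builds on, normalized mutual information, can be computed from a joint histogram. A minimal NumPy sketch (the ESCR/ESCF preprocessing stages of the paper are not reproduced here):

```python
# Sketch: NMI(A, B) = (H(A) + H(B)) / H(A, B), from a joint histogram.
import numpy as np

def nmi(img_a, img_b, bins=32):
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()                      # joint distribution
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)      # marginals
    h = lambda p: -np.sum(p[p > 0] * np.log(p[p > 0]))  # Shannon entropy
    return (h(px) + h(py)) / h(pxy)

a = np.random.rand(64, 64)
print(nmi(a, a))                        # identical images -> 2.0, the maximum
print(nmi(a, np.random.rand(64, 64)))   # unrelated images -> closer to 1.0
```

Registration then searches over candidate transforms for the one that maximizes this measure; the paper's two preprocessing stages raise the measure's reliability for EO/IR pairs before that search.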