Search | Korea Science

A Study on Efficient AI Model Drift Detection Methods for MLOps (MLOps를 위한 효율적인 AI 모델 드리프트 탐지방안 연구)

Ye-eun Lee;Tae-jin Lee
Journal of Internet Computing and Services, v.24 no.5, pp.17-27, 2023
As AI (Artificial Intelligence) technology matures and becomes more practical, it is widely used across real-world applications. An AI model is trained on the statistical properties of its training data and then deployed to a system, but unexpected changes in rapidly changing data degrade the model's performance. In the security field in particular, detecting drift signals in deployed models is important for responding to new and unknown attacks that are constantly being created, so the need for lifecycle management of the entire model is growing. Drift can generally be detected through changes in the model's accuracy and error rate (loss), but this approach is limited in practice because it requires actual labels for the model's predictions, and the point at which drift actually occurs remains uncertain. The model's error rate is strongly influenced by external environmental factors, model selection and parameter settings, and new input data, so there are limits to precisely determining when drift in the data actually occurs from that value alone. This paper therefore proposes a method that detects when drift actually occurs through an anomaly analysis technique based on XAI (eXplainable Artificial Intelligence). In a test on a classification model that detects DGA (Domain Generation Algorithm) domains, anomaly scores were extracted from the SHAP (SHapley Additive exPlanations) values of post-deployment data, and the results confirmed that efficient detection of the drift point was possible.
https://doi.org/10.7472/jksii.2023.24.5.17
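
The label-free idea in this abstract, scoring post-deployment samples by how unusual their explanation vectors look, can be illustrated with a minimal sketch. It is not the authors' method: it assumes per-sample attribution vectors (for example SHAP values) have already been computed elsewhere, and the function names, the root-mean-square z-score, the threshold of 3.0, and the window of 5 are all illustrative assumptions.

```python
import math
import statistics


def fit_reference(attributions):
    """Per-feature mean and population std of reference attribution vectors."""
    columns = list(zip(*attributions))
    means = [statistics.fmean(c) for c in columns]
    # Guard against zero spread so the z-score below never divides by zero.
    stds = [statistics.pstdev(c) or 1e-9 for c in columns]
    return means, stds


def anomaly_score(vector, means, stds):
    """Root-mean-square z-score of one attribution vector (assumed scoring rule)."""
    z2 = [((v - m) / s) ** 2 for v, m, s in zip(vector, means, stds)]
    return math.sqrt(sum(z2) / len(z2))


def first_drift_index(stream, means, stds, threshold=3.0, window=5):
    """Index of the first sample whose moving-average score exceeds threshold, else None."""
    scores = [anomaly_score(v, means, stds) for v in stream]
    for i in range(window - 1, len(scores)):
        if statistics.fmean(scores[i - window + 1 : i + 1]) > threshold:
            return i
    return None
```

Calling `fit_reference` on attribution vectors from the training period and `first_drift_index` on the deployed stream returns a candidate drift point without needing any labels, which is the property the abstract emphasizes.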

Prediction of field failure rate using data mining in the Automotive semiconductor (데이터 마이닝 기법을 이용한 차량용 반도체의 불량률 예측 연구)

Yun, Gyungsik;Jung, Hee-Won;Park, Seungbum
Journal of Technology Innovation, v.26 no.3, pp.37-68, 2018
Since the 20th century, automobiles, the most common means of transportation, have been evolving as the use of electronic control devices and automotive semiconductors has increased dramatically. Automotive semiconductors are a key component of automotive electronic control devices and provide consumers with stability, fuel efficiency, and stable operation. Applications of automotive semiconductors include engine control, electric motor management, transmission control units, hybrid vehicle control, start/stop systems, electronic motor control, automotive radar and LIDAR, smart head lamps, head-up displays, and lane keeping systems. Semiconductors are thus being applied to almost all of the electronic control devices that make up an automobile, and they achieve more than a simple combination of mechanical devices could. Because automotive semiconductors handle high data rates, microprocessor units are being used instead of microcontroller units. For example, semiconductors based on ARM processors are used in telematics, audio/video multimedia, and navigation. Automotive semiconductors require high reliability, durability, and long-term supply, considering that an automobile is used for more than 10 years. The reliability of automotive semiconductors is directly linked to the safety of automobiles. The semiconductor industry uses JEDEC and AEC standards to evaluate the reliability of automotive semiconductors. In addition, the life expectancy of a product is estimated in the early stages of development and of mass production using the reliability test methods and results that are standard in the automobile industry. However, there are limits to predicting the failure rate caused by parameters such as customers' varied conditions of use and usage time. To overcome these limitations, much research has been done in academia and industry. 
Among this work, research using data mining techniques has been carried out in many semiconductor fields, but its application to automotive semiconductors has not yet been studied. This study therefore uses data mining techniques to investigate the relationships among data generated during the semiconductor assembly and package test processes, and applies data mining techniques suited to predicting the potential failure rate from customer defect data.
https://doi.org/10.14386/SIME.2018.26.3.37

Prediction of Expected Residual Useful Life of Rubble-Mound Breakwaters Using Stochastic Gamma Process (추계학적 감마 확률과정을 이용한 경사제의 기대 잔류유효수명 예측)

Lee, Cheol-Eung
Journal of Korean Society of Coastal and Ocean Engineers, v.31 no.3, pp.158-169, 2019
A probabilistic model that can predict the residual useful lifetime of a structure is formulated using the gamma process, which is a stochastic process. The formulated stochastic model accounts for both the sampling uncertainty associated with the damage measured so far and the temporal uncertainty of cumulative damage over time. A method for estimating the parameters of the stochastic model is also proposed, using the least squares method and the method of moments, so that the age of a structure, the operational environment, and the evolution of damage with time can be considered. Some features of the residual useful lifetime are first investigated through a sensitivity analysis on the parameters in a simple setting with a single damage measurement at the current age. The stochastic model is then applied directly to a rubble-mound breakwater. The parameters of the gamma process are estimated from several sets of experimental data on the damage processes of the armor rocks of rubble-mound breakwaters. The expected damage levels over time, numerically simulated with the estimated parameters, are in very good agreement with those from flume testing. Numerical calculations show that, over a long time horizon, the probabilities of exceeding the failure limit converge to the constraint that the model must satisfy. Meanwhile, the expected residual useful lifetimes evaluated from the failure probabilities differ depending on the behavior of the damage history. In particular, as the coefficient of variation of cumulative damage becomes large, the expected residual useful lifetimes differ significantly from those of the deterministic regression model. This is mainly due to the sampling and temporal uncertainties associated with damage, which cause the first time to failure to be widely distributed. 
Therefore, the stochastic model presented in this paper for predicting the residual useful lifetime of a structure can properly carry out a probabilistic assessment of the current damage state of the structure and also take into account the temporal uncertainty of future cumulative damage.
https://doi.org/10.9765/KSCOE.2019.31.3.158
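
To make the notion of an expected residual useful lifetime under a gamma process concrete, here is a rough Monte Carlo sketch. It assumes a stationary gamma process (independent increments over a step `dt` drawn from a gamma distribution with shape `shape_rate * dt` and a fixed `scale`); the abstract does not state the paper's parameterization, so the function names, parameter names, and example values are hypothetical.

```python
import random


def simulate_residual_life(current_damage, failure_limit, shape_rate, scale,
                           dt=0.1, horizon=200.0, rng=random):
    """One sampled time until cumulative damage first reaches failure_limit."""
    damage, t = current_damage, 0.0
    while t < horizon:
        # Gamma-process increment over dt: shape grows with dt, scale is fixed.
        damage += rng.gammavariate(shape_rate * dt, scale)
        t += dt
        if damage >= failure_limit:
            return t
    return horizon  # path censored at the simulation horizon


def expected_residual_life(n_paths=2000, seed=0, **kwargs):
    """Average first-passage time over n_paths simulated damage histories."""
    rng = random.Random(seed)
    samples = [simulate_residual_life(rng=rng, **kwargs) for _ in range(n_paths)]
    return sum(samples) / len(samples)
```

A deterministic trend with the same mean damage rate would give `(failure_limit - current_damage) / (shape_rate * scale)`; comparing that value with the simulated mean, and looking at the spread of the individual samples, shows why the first time to failure becomes widely distributed as the variability of cumulative damage grows.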

The Simulation of Pore Size Distribution from Unsaturated Hydraulic Conductivity Data Using the Hydraulic Functions (토양 수리학적 함수를 이용한 불포화 수리전도도로부터 공극크기분포의 모사)

Yoon, Young-Man;Kim, Jeong-Gyu;Shin, Kook-Sik
Korean Journal of Soil Science and Fertilizer, v.43 no.4, pp.407-414, 2010
Until now, the pore size distribution (PSD) of a soil profile has been calculated from soil moisture characteristic data, obtained by the water release method or mercury porosimetry, using the capillary rise equation. However, these methods are often difficult to use and time consuming. In this work, a theoretical framework for an easy and fast technique is therefore suggested to estimate the PSD from unsaturated hydraulic conductivity data in an undisturbed field soil profile. Unsaturated hydraulic conductivity data were collected and simulated by varying the soil parameters within given boundary conditions (Brooks and Corey soil parameters, $\alpha_{BC} = 1$ to $5\ L^{-1}$, $b = 1$ to $10$; van Genuchten soil parameters, $\alpha_{VG} = 0.001$ to $1.0\ L^{-1}$, $m = 0.1$ to $0.9$). $K_s$ (1.0 cm h$^{-1}$) was used as the fixed input parameter for the simulation of each model. The PSDs were estimated from the collected $K(h)$ data by model simulation. In the simulation with the Brooks-Corey parameters, the saturated hydraulic conductivity, $K_s$, acted as a scaling factor for the unsaturated hydraulic conductivity, $K(h)$. Changes in parameter $b$ closely explained the shape of the PSD curve of the soil, and $\alpha_{BC}$ affected the sensitivity of the PSD curve. In the van Genuchten model, $K_s$ and $\alpha_{VG}$ acted as scaling factors for the vertical axis and the horizontal axis, respectively. Parameter $m$ systematically described the shape of the PSD curve and $K(h)$. This study suggests that the new theoretical technique can be applied to the in situ prediction of PSD in undisturbed field soil.
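
The roles of $K_s$, $\alpha_{VG}$, and $m$ described above can be explored with a short sketch. The abstract does not give the equations, so this uses the widely used van Genuchten-Mualem form with the restriction $m = 1 - 1/n$, and the capillary rise equation for water near 20 °C with zero contact angle; both choices, and all function names, are assumptions rather than the paper's stated formulation.

```python
import math


def vg_effective_saturation(h, alpha, m):
    """van Genuchten effective saturation at suction head h (cm, h >= 0)."""
    n = 1.0 / (1.0 - m)  # assumed Mualem restriction m = 1 - 1/n
    return (1.0 + (alpha * h) ** n) ** (-m)


def vg_conductivity(h, alpha, m, ks=1.0):
    """van Genuchten-Mualem unsaturated conductivity K(h), same units as ks."""
    se = vg_effective_saturation(h, alpha, m)
    return ks * math.sqrt(se) * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


def equivalent_pore_radius_cm(h):
    """Capillary rise equation r = 2*sigma / (rho*g*h), zero contact angle assumed."""
    sigma = 0.0728         # surface tension of water near 20 C, N/m
    rho, g = 998.0, 9.81   # water density (kg/m^3) and gravity (m/s^2)
    h_m = h / 100.0        # suction head converted from cm to m
    return 2.0 * sigma / (rho * g * h_m) * 100.0  # radius converted from m to cm
```

Because `ks` multiplies the whole expression, doubling it doubles `vg_conductivity` at every head, which matches the scaling-factor role the abstract attributes to $K_s$; mapping each head `h` to `equivalent_pore_radius_cm(h)` is one way to move from a $K(h)$ curve to a pore-size axis.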

Prediction of patent lifespan and analysis of influencing factors using machine learning (기계학습을 활용한 특허수명 예측 및 영향요인 분석)

Kim, Yongwoo;Kim, Min Gu;Kim, Young-Min
Journal of Intelligence and Information Systems, v.28 no.2, pp.147-170, 2022
Although the number of patents, one of the core outputs of technological innovation, continues to increase, the number of low-value patents has also increased greatly. Efficient evaluation of patents has therefore become important. Estimation of patent lifespan, which represents the private value of a patent, has been studied for a long time, but in most cases it has relied on linear models. Even when machine learning methods were used, the interpretation or explanation of the relationship between explanatory variables and patent lifespan was insufficient. In this study, patent lifespan (number of renewals) is predicted based on the idea that patent lifespan represents the value of the patent. For the research, 4,033,414 patents applied for between 1996 and 2017 and finally granted were collected from the USPTO (US Patent and Trademark Office). To predict patent lifespan, we use variables that reflect the characteristics of the patent, the patent owner, and the inventor. We build four different models (Ridge Regression, Random Forest, Feed Forward Neural Network, Gradient Boosting Models) and perform hyperparameter tuning through 5-fold cross validation. The performance of the resulting models is then evaluated, and the relative importance of the predictors is presented. In addition, based on the Gradient Boosting Model, which shows excellent performance, an Accumulated Local Effects plot is presented to visualize the relationship between predictors and patent lifespan. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the reasons behind the evaluation of individual patents, and discuss its applicability to patent evaluation systems. 
This study is academically meaningful in that it contributes cumulatively to existing research on patent lifespan estimation and supplements the limitations of linear models. It is also practically meaningful in that it suggests a method for deriving the evaluation basis for the value of individual patents and examines its applicability to patent evaluation systems.
https://doi.org/10.13088/jiis.2022.28.2.147
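
Kernel SHAP approximates Shapley values, which attribute one prediction to individual input features. For a model with only a few features they can be computed exactly by enumerating feature subsets, as in the sketch below. This is only an illustration of the quantity being estimated, not the study's pipeline: the toy model, the baseline of zeros, and the function name `shapley_values` are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in the subset take their real value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

For a hypothetical model `lambda v: 2 * v[0] + v[1] * v[2]`, the attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split evenly between the two features involved. Exact enumeration grows exponentially with the number of features, which is why sampling-based approximations such as Kernel SHAP are used on real models.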

Effect of the Suicide Prevention Program to the Impulsive Psychology of the Elementary School Student (자살예방 프로그램이 초등학교 충동심리에 미치는 영향)

Kang, Soo Jin;Kang, Ho Jung;Cho, Won Cheol;Lee, Tae Shik
Journal of Korean Society of Disaster and Security, v.6 no.1, pp.65-72, 2013
In this study, an early suicide prevention program was applied to elementary school students, its effects before and after the program were compared, changes in psychological state such as emotional status and suicidal temptation were verified, and its potential as a suicide prevention program was presented. Adolescence is a very unstable period of growth, cognitively immature and emotionally impulsive. It is an emotionally unstable and unpredictable period in which suicide may be chosen as an extreme way to escape reality, or as an impulsive way of solving problems in the face of minor conflicts or disputes. Many stresses on students, such as the recent shift to nuclear families, parents' expectations of their children, educational problems, socio-environmental factors, and individual psychological factors, have recently been leading students to the extreme act of suicide. In this study, the range of stress experienced in elementary school and the thoughts about and degree of temptation toward suicide were identified through the suicide prevention program. Through prevention activities such as meditation training and breathing training, and through experience of anger control, emotional expression, and self-overcoming, students were guided to establish a positive self-identity and to understand self-control, self-esteem, and the preciousness of life, and on that basis the effect on suicide prevention was analyzed. The study targeted 51 students in two 6th-grade classes at an elementary school in Goyang-si, and the program ran for 30 minutes every morning, focused on experience and activities based on the principles and methods of brain science. 
Data were collected 20 times before the start of morning classes using the Suicide Probability Scale (hereafter SPS-A), a scale designed to predict suicide probability and suicide risk effectively, surveyed in 7 areas: Positive outlook, Within the family closeness, Impulsivity, Interpersonal hostility, Hopelessness, Hopelessness syndrome, and suicide accident. The analysis and validation used Wilcoxon's signed rank test in the SPSS program. Although the program ran for only a short period, the comparison of averages showed positive effects in all 7 areas. The t-test results, however, gave a different outcome: changes were indicated in 3 items (No.7, No.14, No.19) out of the 31 SPS-A items, and no change in the remaining items. The results also indicated more change among the students in class A than in class B. For the class A students, psychological changes were verified in the areas of Hopelessness syndrome and suicide accident among the 7 areas after the program. This study verified that different results can be obtained depending on the students' tendencies and on the program professionals (the teacher in charge and the lecturer running the program). With consistent systematization and activation, the suicide prevention program presented in this article can help with learning and suicide prevention through emotion and impulse control based on emotional stress relief, recovery of a positive self-identity, and stabilization of brain waves. If the short-term program is not discontinued but continued from childhood through adolescence, it can help create a surrounding environment for healthy mental and physical growth, and can thus be an effective program for preventing suicide as a social problem.
https://doi.org/10.21729/ksds.2013.6.1.065

Research about feature selection that use heuristic function (휴리스틱 함수를 이용한 feature selection에 관한 연구)

Hong, Seok-Mi;Jung, Kyung-Sook;Chung, Tae-Choong
The KIPS Transactions:PartB, v.10B no.3, pp.281-286, 2003
A large number of features are collected for problem solving in real life, but it is difficult to use all of the collected features, and it is not easy to collect correct data for every feature. If all of the collected data are used for learning, a complicated learning model is created and good performance cannot be obtained. Interrelationships or hierarchical relations also exist among the features. The number of features can be reduced by analyzing the relations among them using heuristic knowledge or statistical methods. A heuristic technique refers to learning through repeated trial and error and experience. Experts can approach the relevant problem domain through a process of collecting opinions based on experience. These properties can be used to reduce the number of features used in learning. Experts generate a new, highly abstract feature from raw data. This paper describes a machine learning model that reduces the number of features used in learning by means of a heuristic function and uses the abstracted features as the input values of a neural network. We applied this model to win/lose prediction in professional baseball games. The results show that the model combining the two techniques not only reduces the complexity of the neural network model but also significantly improves classification accuracy compared with using the neural network and the heuristic model separately.
https://doi.org/10.3745/KIPSTB.2003.10B.3.281

A Study on Web-based Technology Valuation System (웹기반 지능형 기술가치평가 시스템에 관한 연구)

Sung, Tae-Eung;Jun, Seung-Pyo;Kim, Sang-Gook;Park, Hyun-Woo
Journal of Intelligence and Information Systems, v.23 no.1, pp.23-46, 2017
Although companies and projects have been valued since the early 2000s, mainly in developed countries in North America and Europe, systems and methodologies for estimating the economic value of individual technologies or patents have been steadily gaining ground. There are several online systems that qualitatively evaluate the grade of a technology or the rating of a patent, such as 'KTRS' of KIBO and 'SMART 3.1' of the Korea Invention Promotion Association. However, a web-based technology valuation system called the 'STAR-Value system', which calculates quantitative values of the subject technology for purposes such as business feasibility analysis, investment attraction, and tax/litigation, has been officially opened and has recently been spreading. In this study, we introduce the types of methodology and evaluation models, the reference information supporting them, and how the associated databases are used, focusing on the modules and frameworks embedded in the STAR-Value system. In particular, there are six valuation methods, including the discounted cash flow (DCF) method, a representative income-approach method that values anticipated future economic income at the present, and the relief-from-royalty method, which calculates the present value of royalties, treating the contribution of the subject technology to the business value created as the royalty rate. We look at how the models and related support information (technology life, corporate (business) financial information, discount rate, industrial technology factors, etc.) can be used and linked in an intelligent manner. 
Based on classification information for the technology to be evaluated, such as the International Patent Classification (IPC) or the Korea Standard Industry Classification (KSIC), the STAR-Value system automatically returns metadata such as technology cycle time (TCT), sales growth rate and profitability data for similar companies or industry sectors, weighted average cost of capital (WACC), and indices of industrial technology factors, and applies adjustment factors to them, so that the calculated technology value has high reliability and objectivity. Furthermore, if the information on the potential market size of the target technology and the market share of the commercializing entity is drawn from data-driven sources, or if the estimated value range of similar technologies by industry sector is provided from completed evaluation cases accumulated in the database, the STAR-Value system is expected to present a highly accurate value range in real time by intelligently linking its support modules. Beyond the valuation models and primary variables explained in this paper, the STAR-Value system is intended to be used more systematically and in a data-driven way by supporting an optimal model selection guideline module, an intelligent technology value range reasoning module, a market share prediction module based on similar company selection, and so on. In addition, research on the development and intelligence of the web-based STAR-Value system is significant in that it widely disseminates a web-based system that can be used to validate the theoretical feasibility of technology valuation and apply it in practice, and the system is expected to be used in various fields of technology commercialization.
https://doi.org/10.13088/jiis.2017.23.1.023
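
The two income-approach methods named in this abstract reduce to discounting a stream of future amounts. The sketch below shows that arithmetic in its textbook form; it is not the STAR-Value system's implementation. The function names, the single `technology_contribution` multiplier, the optional `tax_rate`, and the end-of-year timing of cash flows are all assumptions made for illustration.

```python
def present_value(cash_flows, discount_rate):
    """Present value of amounts received at the end of years 1..n."""
    return sum(cf / (1.0 + discount_rate) ** (t + 1)
               for t, cf in enumerate(cash_flows))


def dcf_technology_value(cash_flows, discount_rate, technology_contribution):
    """DCF sketch: discounted business cash flows scaled by an assumed technology share."""
    return present_value(cash_flows, discount_rate) * technology_contribution


def relief_from_royalty_value(revenues, royalty_rate, discount_rate, tax_rate=0.0):
    """Relief-from-royalty sketch: present value of after-tax royalties on revenue."""
    royalties = [rev * royalty_rate * (1.0 - tax_rate) for rev in revenues]
    return present_value(royalties, discount_rate)
```

In practice the discount rate would come from a figure such as the WACC and the forecast horizon from the technology life mentioned in the abstract; here they are simply passed in as numbers.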

Extension Method of Association Rules Using Social Network Analysis (사회연결망 분석을 활용한 연관규칙 확장기법)

Lee, Dongwon
Journal of Intelligence and Information Systems, v.23 no.4, pp.111-126, 2017
Recommender systems based on association rule mining significantly contribute to seller's sales by reducing consumers' time to search for products that they want. Recommendations based on the frequency of transactions such as orders can effectively screen out the products that are statistically marketable among multiple products. A product with a high possibility of sales, however, can be omitted from the recommendation if it records insufficient number of transactions at the beginning of the sale. Products missing from the associated recommendations may lose the chance of exposure to consumers, which leads to a decline in the number of transactions. In turn, diminished transactions may create a vicious circle of lost opportunity to be recommended. Thus, initial sales are likely to remain stagnant for a certain period of time. Products that are susceptible to fashion or seasonality, such as clothing, may be greatly affected. This study was aimed at expanding association rules to include into the list of recommendations those products whose initial trading frequency of transactions is low despite the possibility of high sales. The particular purpose is to predict the strength of the direct connection of two unconnected items through the properties of the paths located between them. An association between two items revealed in transactions can be interpreted as the interaction between them, which can be expressed as a link in a social network whose nodes are items. The first step calculates the centralities of the nodes in the middle of the paths that indirectly connect the two nodes without direct connection. The next step identifies the number of the paths and the shortest among them. These extracts are used as independent variables in the regression analysis to predict future connection strength between the nodes. 
The strength of the connection between the two nodes of the model, which is defined by the number of nodes between the two nodes, is measured after a certain period of time. The regression analysis results confirm that the number of paths between the two products, the distance of the shortest path, and the number of neighboring items connected to the products are significantly related to their potential strength. This study used actual order transaction data collected for three months from February to April in 2016 from an online commerce company. To reduce the complexity of analytics as the scale of the network grows, the analysis was performed only on miscellaneous goods. Two consecutively purchased items were chosen from each customer's transactions to obtain a pair of antecedent and consequent, which secures a link needed for constituting a social network. The direction of the link was determined in the order in which the goods were purchased. Except for the last ten days of the data collection period, the social network of associated items was built for the extraction of independent variables. The model predicts the number of links to be connected in the next ten days from the explanatory variables. Of the 5,711 previously unconnected links, 611 were newly connected for the last ten days. Through experiments, the proposed model demonstrated excellent predictions. Of the 571 links that the proposed model predicts, 269 were confirmed to have been connected. This is 4.4 times more than the average of 61, which can be found without any prediction model. This study is expected to be useful regarding industries whose new products launch quickly with short life cycles, since their exposure time is critical. Also, it can be used to detect diseases that are rarely found in the early stages of medical treatment because of the low incidence of outbreaks. 
Since the complexity of the social networking analysis is sensitive to the number of nodes and links that make up the network, this study was conducted in a particular category of miscellaneous goods. Future research should consider that this condition may limit the opportunity to detect unexpected associations between products belonging to different categories of classification.
https://doi.org/10.13088/jiis.2017.23.4.111
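
The path-based features described in this abstract (a directed link from each purchased item to the next one, the shortest path between two unconnected items, and the number of paths between them) can be sketched as follows. The item names, the function names, and the `max_len` cap on path length are hypothetical; the study's centrality measures and regression step are not reproduced here.

```python
from collections import defaultdict, deque


def build_graph(transactions):
    """Directed links from each item to the next item bought by the same customer."""
    graph = defaultdict(set)
    for items in transactions:
        for a, b in zip(items, items[1:]):
            if a != b:
                graph[a].add(b)
    return graph


def shortest_path_length(graph, src, dst):
    """Breadth-first search; number of links on the shortest path, or None."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def count_simple_paths(graph, src, dst, max_len=3):
    """Number of simple paths from src to dst that use at most max_len links."""
    def walk(node, visited, depth):
        if node == dst:
            return 1
        if depth == max_len:
            return 0
        return sum(walk(n, visited | {n}, depth + 1)
                   for n in graph.get(node, ()) if n not in visited)
    return walk(src, {src}, 0)
```

With toy transactions such as `[["bag", "belt", "hat"], ["bag", "scarf", "hat"], ["belt", "scarf"]]`, the pair (bag, hat) has no direct link but is joined by several indirect paths, which is the kind of pair the model scores. The reported 4.4-fold improvement is consistent with the abstract's own figures: picking 571 of the 5,711 unconnected pairs at random would be expected to hit about 571 × 611 / 5,711 ≈ 61 new links, against the 269 the model found.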
