• Title/Summary/Keyword: Generation Prediction


Knowledge based Genetic Algorithm for the Prediction of Peptides binding to HLA alleles common in Koreans (지식기반 유전자알고리즘을 이용한 한국인 빈발 HLA 대립유전자에 대한 결합 펩타이드 예측)

  • Cho, Yeon-Jin;Oh, Heung-Bum;Kim, Hyeon-Cheol
    • Journal of Internet Computing and Services
    • /
    • v.13 no.4
    • /
    • pp.45-52
    • /
    • 2012
  • T cells induce immune responses and eliminate infecting microorganisms when peptides derived from microbial proteins are bound to HLA molecules on the host cell surface. It is known that the more stably a peptide binds to HLA, the stronger the T cell response and the more effectively the source of infection is removed. Accordingly, if peptides that bind stably to a certain HLA (HLA binders) are found, those peptides can be used to develop peptide vaccines against infectious diseases or even cancer. However, HLA is highly polymorphic, so a large number of alleles occur at appreciable frequencies even within a single population. It is therefore very inefficient to find peptides that bind stably to many HLAs by testing random candidate peptides against all of the alleles frequent in the population. To address this problem, computational methods have recently been developed to predict peptides that bind stably to a given HLA. These methods can markedly reduce the number of candidate peptides that must be examined in biological experiments. This paper introduces a machine learning method for predicting peptides that bind to an HLA and proposes a new prediction model, a so-called 'knowledge-based genetic algorithm', that has not previously been tried for HLA-binding peptide prediction. Although based on a genetic algorithm (GA), it outperformed a plain GA by incorporating expert knowledge into the algorithm. Furthermore, it could extract rules that predict the binding peptides of the HLA alleles common in Koreans.

Domain Knowledge Incorporated Counterfactual Example-Based Explanation for Bankruptcy Prediction Model (부도예측모형에서 도메인 지식을 통합한 반사실적 예시 기반 설명력 증진 방법)

  • Cho, Soo Hyun;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.307-332
    • /
    • 2022
  • Bankruptcy prediction is one of the most intensively researched areas in business applications; it is a representative classification problem related to lending, investment decision making, and the profitability of financial institutions. Many studies have demonstrated outstanding performance for bankruptcy prediction models using artificial intelligence techniques. However, since most machine learning algorithms are "black boxes," providing users with explanations of AI models has been identified as a prominent research topic. Although there are many different approaches to explanation, this study focuses on explaining a bankruptcy prediction model using counterfactual examples. A counterfactual-based explanation provides an alternative case that shows users how to obtain the desired output from the model. This study introduces a counterfactual generation technique based on a genetic algorithm (GA) that leverages both domain knowledge (i.e., causal feasibility) and feature importance from a black-box model, along with other critical counterfactual variables, including proximity, distribution, and sparsity. The proposed method was evaluated quantitatively and qualitatively to measure its quality and validity.

Characteristics of Signal-to-Noise Paradox and Limits of Potential Predictive Skill in the KMA's Climate Prediction System (GloSea) through Ensemble Expansion (기상청 기후예측시스템(GloSea)의 앙상블 확대를 통해 살펴본 신호대잡음의 역설적 특징(Signal-to-Noise Paradox)과 예측 스킬의 한계)

  • Yu-Kyung Hyun;Yeon-Hee Park;Johan Lee;Hee-Sook Ji;Kyung-On Boo
    • Atmosphere
    • /
    • v.34 no.1
    • /
    • pp.55-67
    • /
    • 2024
  • This paper aims to provide a detailed introduction to the concept of the Ratio of Predictable Component (RPC) and the Signal-to-Noise Paradox. We then derive insights by exploring the paradoxical features through a seasonal and regional analysis with ensemble expansion in KMA's climate prediction system (GloSea). We also explain the ensemble generation method, with a specific focus on stochastic physics. This study identifies the predictability limits of our forecasting system and points to ways to enhance it. On a global scale, RPC reaches a value of 1 when the ensemble is expanded to a maximum of 56 members, underlining the significance of ensemble expansion in the climate prediction system. The feature of RPC paradoxically exceeding 1 becomes particularly evident in the winter North Atlantic and the summer North Pacific. Over the Siberian continent, predictability is notably low and remains so even as the ensemble size increases. This region, characterized by a low RPC, is considered challenging for making reliable predictions, highlighting the need for further improvement in the model and in the initialization of land processes. In contrast, the tropical ocean demonstrates robust predictability while maintaining an RPC of 1. Through this study, we have drawn attention to the limitations of potential predictability within the climate prediction system, emphasizing the necessity of leveraging predictable signals with high RPC values. We also underscore the importance of continuous efforts to improve models and initializations to overcome these limitations.
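
As an illustration of the RPC concept named in the abstract above, the following is a minimal sketch and not the authors' code. It assumes the commonly used estimate in which RPC is the correlation between the ensemble mean and the observations, divided by the square root of the ratio of ensemble-mean (signal) variance to the average single-member (total) variance. The abstract does not give the exact estimator, and the function names `pearson` and `rpc` are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rpc(ensemble, obs):
    """Estimate the Ratio of Predictable Components (assumed estimator).

    ensemble: list of members, each a list with one value per year
    obs:      list of observed values, one per year
    """
    n_years = len(obs)
    # Ensemble mean for each year (the predictable "signal").
    ens_mean = [mean(member[t] for member in ensemble) for t in range(n_years)]
    r = pearson(ens_mean, obs)
    signal_var = pvariance(ens_mean)                      # variance of the ensemble mean
    total_var = mean(pvariance(m) for m in ensemble)      # average variance of single members
    return r / sqrt(signal_var / total_var)

# Toy data: two members that scatter around the same signal.
signal = [1.0, 2.0, 3.0, 4.0]
members = [[2.0, 1.0, 4.0, 3.0], [0.0, 3.0, 2.0, 5.0]]
print(rpc(members, signal))
```

Under this estimate, a value above 1 means the observations are more predictable than the model's own members suggest, which is the paradox the abstract describes.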

Domain Knowledge Incorporated Local Rule-based Explanation for ML-based Bankruptcy Prediction Model (머신러닝 기반 부도예측모형에서 로컬영역의 도메인 지식 통합 규칙 기반 설명 방법)

  • Soo Hyun Cho;Kyung-shik Shin
    • Information Systems Review
    • /
    • v.24 no.1
    • /
    • pp.105-123
    • /
    • 2022
  • Thanks to the remarkable success of artificial intelligence (AI) techniques, new possibilities for their application to real-world problems have opened up. One prominent application is the bankruptcy prediction model, as it is often used as a basic knowledge base for credit scoring models in the financial industry. As a result, there has been extensive research on how to improve the prediction accuracy of the model. However, despite their impressive performance, machine learning (ML)-based models are difficult to deploy because of their intrinsic opacity, especially when the field requires or values an explanation of the result obtained by the model. The financial domain is one of the areas where explanation matters to stakeholders such as domain experts and customers. In this paper, we propose a novel approach that incorporates financial domain knowledge into local rule generation to provide instance-level explanations for the bankruptcy prediction model. The results show that the proposed method successfully selects and classifies the extracted rules based on their feasibility and the information they convey to users.

Development of penetration rate model and optimum operational conditions of shield TBM for electricity transmission tunnels (터널식 전력구를 위한 순굴진율 모델 개발 및 이를 활용한 쉴드TBM 최적운전 조건 제안)

  • Kim, Jeong-Ju;Ryu, Hui-Hwan;Kim, Gyeong-Yeol;Hong, Seong-Yeon;Jeong, Ju-Hwan;Bae, Du-San
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.22 no.6
    • /
    • pp.623-641
    • /
    • 2020
  • About 5 km of tunnels were constructed by the mechanized tunnelling method using a closed-type shield TBM. To avoid construction delays and ensure timely electricity transmission, it is necessary to improve the accuracy with which the machine's excavation performance is predicted for each rock mass type. This is important for verifying the project duration and determining optimum machine operation. Full-scale tunnelling tests were therefore performed to develop an advance rate model suitable for a 3.6 m diameter shield TBM. About 100 test cases were established and performed using various operational parameters, such as thrust force and cutterhead rotational speed, at representative uniaxial compressive strengths (UCS). Accordingly, relationships between normal force and penetration depth and between UCS and torque were suggested, taking into account UCS and thrust force conditions for weathered, soft, and hard rocks. A capacity analysis of the cutterhead was performed, and optimum operational conditions were suggested based on the developed model. Based on this study, it is expected that the construction duration can be reduced and that users can benefit from earlier service.

Evaluation on Resource Recovery Potential by Landfill Gas Production (매립가스 발생량에 따른 자원화 가능성 평가)

  • Lee, Hae-Seung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.10
    • /
    • pp.4679-4688
    • /
    • 2011
  • This study examined the municipal waste generation amounts and characteristics for B city in Gangwon province, predicted the methane gas generation rate from the landfill, and analyzed the possibility of energy recovery as RDF (Refuse Derived Fuel) using combustible waste. The results showed that the average bulk density of municipal waste for B city was 144.0 kg/$m^3$, and the average shares of combustible waste were 36.0% paper, 21.6% vinyl, and 19.7% food waste. In the heating value experiment, the high and low heating values (moisture) were measured at 3,471 kcal/kg and 2,941 kcal/kg, respectively. After the burying of food waste in landfills was prohibited, the heating value of municipal waste increased dramatically because the share of paper, vinyl, and plastic waste increased. The predicted methane gas generation rate from the landfill rises to 2,505.7 ton CH4/year in 2021 and decreases gradually thereafter. When the RDF facility is installed, the rate peaks at 1,956.9 ton CH4/year in 2013 and then decreases. The generation rate of LFG from the waste landfill of B city was estimated at 9.92 $m^3$/min, similar to the 10.11 $m^3$/min reported for another city.

Numerical Experiment of Driftwood Generation and Deposition Patterns by Tsunami (쓰나미에 의한 유목의 생성과 퇴적패턴의 수치모의실험)

  • Kang, Tae Un;Jang, Chang-Lae;Lee, Nam Joo;Lee, Won Ho
    • Ecology and Resilient Infrastructure
    • /
    • v.8 no.4
    • /
    • pp.165-178
    • /
    • 2021
  • We studied driftwood behaviors including generation and deposition in a tsunami using a numerical simulation. We used an integrated two-dimensional numerical model, which included a driftwood dynamics model. The study area was Sendai, Japan. Observation data collected by Inagaki et al. (2012) were used to verify the simulation results by comparing them with driftwood deposition patterns. A simplified model was developed to consider the threshold of driftwood generation by the drag force of water flows. To consider the volume of driftwood generated, we estimated the total wood number in the study area using Google Earth. Therefore, we simulated more than 13,000 pieces of driftwood that were generated and transported inland from approximately 300,000 trees that were growing in the forest. The final distribution of the driftwood was similar to the observation data. The reproducibility of the generation and deposition patterns of driftwood showed good agreement in terms of longitudinal deposition pattern. In the future, a sensitivity analysis on driftwood parameters, such as the size of the wood, boundary conditions, and grid size, will be implemented to predict the travel patterns of driftwood. Such modeling will be a useful methodology for disaster prediction based on water flow and driftwood.

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.121-139
    • /
    • 2014
  • Predicting corporate failure has long been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is of great importance to financial institutions. Many researchers have addressed bankruptcy prediction over the past three decades. The current research uses ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers in order to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the different bootstrap samples. Instance selection selects critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining. However, few studies have dealt with the integration of instance selection and bagging. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation. The solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select an optimal instance subset that is used as the input data of the bagging model. In this study, the chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations was set to 150. We set the crossover rate and mutation rate to 0.7 and 0.1, respectively. We used the prediction accuracy of the model as the fitness function of the GA. The SVM model is trained on the training data set using the selected instance subset. The prediction accuracy of the SVM model on the test data set is used as the fitness value in order to avoid overfitting. In the second phase, we used the optimal instance subset selected in the first phase as the input data of the bagging model. We used the SVM model as the base classifier for the bagging ensemble. The majority voting scheme was used as the combining method in this study. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through a literature review and basic statistical methods, and we selected 8 financial ratios as the final input variables. We separated the whole data set into three subsets: training, test, and validation. We compared the proposed model with several comparative models, including the simple individual SVM model, the simple bagging model, and the instance selection based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
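
The two phases described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation. The binary chromosome, population size (100), number of generations (150), crossover rate (0.7), and mutation rate (0.1) come from the abstract. Binary tournament selection, one-point crossover, per-bit mutation, and elitism are assumptions, and the names `ga_select`, `bootstrap`, and `majority_vote` are hypothetical. The fitness function is passed in by the caller; in the paper it is the accuracy of an SVM trained on the selected instances, replaced here by a toy function so the sketch runs without external libraries.

```python
import random
from collections import Counter

def ga_select(n_instances, fitness, pop_size=100, generations=150,
              crossover_rate=0.7, mutation_rate=0.1, seed=0):
    """Phase 1: evolve a binary mask (1 = keep the instance). Needs n_instances >= 2."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_instances)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        next_pop = [elite[:]]  # elitism (assumption): keep the best mask unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection (assumption).
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_instances)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied per bit (assumption).
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

def bootstrap(indices, rng):
    """Phase 2: draw a bootstrap sample (with replacement) of the selected instances."""
    return [rng.choice(indices) for _ in indices]

def majority_vote(predictions):
    """Combine the base classifiers' labels for one sample by majority voting."""
    return Counter(predictions).most_common(1)[0][0]

# Toy fitness standing in for SVM accuracy: agreement with a hidden "ideal" mask.
target = [1, 0] * 10
toy_fitness = lambda mask: sum(m == t for m, t in zip(mask, target))
best, history = ga_select(20, toy_fitness, pop_size=30, generations=40)
selected = [i for i, keep in enumerate(best) if keep]
sample = bootstrap(selected, random.Random(1))
```

Because the elite mask is copied unchanged into each new generation, the recorded best fitness never decreases from one generation to the next.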

A study on the Effects of the Bearing Parameters on the Main Spindle Design of Machine Tool (공작기계 주축설계에 영향을 미치는 베어링 파라미터에 관한 연구)

  • Yeo, Eun Gu;Kim, Yeop Rae;Han, Gang Geun;Park, Myeon Ung;Yu, Heon Il;Lee, Yong Sin
    • Journal of the Korean Society of Manufacturing Technology Engineers
    • /
    • v.7 no.1
    • /
    • pp.119-119
    • /
    • 1998
  • The purpose of this study is to investigate the effects of the operating factors of a typical main spindle system on the efficiency of a machine tool. In this study, both static and dynamic analyses of a typical main spindle system of a machine tool are performed using the finite element method. The finite element results are then used to predict the bearing stiffness, the amount of heat generation, and the bearing life in the spindle system. The effects of ball-bearing material type, bearing lubricant type, and main spindle bearing preload are examined.

A Study on the Prediction Technical for Critical Slip surface Using Genetic Algorithm (유전자 알고리즘을 이용한 사면의 임계파괴면 예측기법에 관한 연구)

  • 김홍택;강인규;황정순;장원호
    • Proceedings of the Korean Geotechnical Society Conference
    • /
    • 1999.03a
    • /
    • pp.331-338
    • /
    • 1999
  • In the present study, a search technique for the critical slip surface in two-dimensional slope stability analysis is proposed. Failure surface generation and analysis have usually been limited to simple geometric shapes. However, more random surfaces need to be examined for some particular ground conditions. For this purpose, a random search technique is developed using a genetic algorithm. The generalized limit equilibrium method is employed as the method of stability analysis. The factor of safety obtained with this technique is compared with the result of the simplified Bishop's method. In addition, the convergence trend of the fitness value is analyzed.
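
The abstract above names the simplified Bishop's method as its comparison baseline. The sketch below shows how that baseline's factor of safety is typically computed for one given circular slip surface, using the standard fixed-point iteration. It does not reproduce the paper's generalized limit equilibrium analysis or its genetic search. The function name `bishop_simplified` and the slice tuple layout are hypothetical.

```python
from math import sin, cos, tan, radians

def bishop_simplified(slices, c, phi_deg, tol=1e-8, max_iter=100):
    """Factor of safety by the simplified Bishop's method.

    slices:  list of (W, alpha_deg, b, u) per vertical slice, where
             W = slice weight, alpha_deg = base inclination in degrees,
             b = slice width, u = pore pressure at the base
    c:       effective cohesion
    phi_deg: effective friction angle in degrees
    """
    tan_phi = tan(radians(phi_deg))
    # Driving term: sum of W * sin(alpha) over all slices.
    driving = sum(W * sin(radians(a)) for W, a, b, u in slices)
    fs = 1.0  # initial guess; FS appears on both sides, so iterate
    for _ in range(max_iter):
        resisting = 0.0
        for W, a, b, u in slices:
            alpha = radians(a)
            m_alpha = cos(alpha) + sin(alpha) * tan_phi / fs
            resisting += (c * b + (W - u * b) * tan_phi) / m_alpha
        new_fs = resisting / driving
        if abs(new_fs - fs) < tol:
            return new_fs
        fs = new_fs
    return fs

# Dry, cohesionless check case: every slice has the same base inclination.
fs = bishop_simplified([(100.0, 30.0, 1.0, 0.0)] * 3, c=0.0, phi_deg=35.0)
```

In a genetic search of the kind the abstract describes, a function like this could serve as the fitness evaluation, with each candidate slip surface scored by its factor of safety and the lowest value sought.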
