• Title/Summary/Keyword: Software Evolution


Receptor binding motif surrounding sites in the Spike 1 protein of infectious bronchitis virus have high susceptibility to mutation related to selective pressure

  • Seung-Min Hong;Seung-Ji Kim;Se-Hee An;Jiye Kim;Eun-Jin Ha;Howon Kim;Hyuk-Joon Kwon;Kang-Seuk Choi
    • Journal of Veterinary Science / v.24 no.4 / pp.51.1-51.17 / 2023
  • Background: To date, various genotypes of infectious bronchitis virus (IBV) have co-circulated; in Korea, the GI-15 and GI-19 lineages were prevalent. The spike protein, particularly the S1 subunit, is responsible for receptor binding, contains hypervariable regions, and is also responsible for the emergence of novel variants. Objective: This study aims to investigate the putative major amino acid substitutions for the variants in GI-19. Methods: The S1 sequence data of IBV isolated from 1986 to 2021 in Korea (n = 188) were analyzed. Sequence alignments were carried out using MAFFT (Multiple Alignment using Fast Fourier Transform) in Geneious Prime. The phylogenetic tree was generated using MEGA-11 (ver. 11.0.10), and Bayesian analysis was performed with BEAST v1.10.4. Selective pressure was analyzed via the online server Datamonkey. Putative critical amino acids were highlighted and visualized using PyMOL software (version 2.3). Results: Most isolates (93.5%) belonged to the GI-19 lineage in Korea, and the GI-19 lineage was further divided into seven subgroups: KM91-like (Clade A and B), K40/09-like, and QX-like (I-IV). Positive selection was identified at nine and six residues in S1 for KM91-like and QX-like IBVs, respectively. In addition, several positive selection sites of S1-NTD were found to have mutations at common locations even when new clades were generated. They were all located on the lateral surface of the quaternary structure of the S1 subunits, in close proximity to the receptor-binding motif (RBM), the putative RBM, and neutralizing antigenic sites in S1. Conclusions: Our results suggest that RBM surrounding sites in the S1 subunit of IBV are highly susceptible to mutation by selective pressure during evolution.

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment is a complicated process in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive because domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the phenomenon of biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. 
In particular, the mutation operator helps keep GA from becoming trapped in local optima, so a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, together with their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled with four classes: 1 (A1), 2 (A2), 3 (A3), and 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross-validation to our dataset. To examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that supports GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. 
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (the index related to cash flows from operating activities), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
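
The joint search over a feature subset and kernel parameters described in this abstract can be illustrated with a small genetic algorithm. The following is only a minimal sketch, not the authors' implementation (which used LIBSVM and Evolver 5.5): the chromosome layout, the operator settings, and every function name are assumptions made for illustration, and `toy_fitness` is a hypothetical stand-in for the cross-validated MSVM accuracy that a real run would compute.

```python
import random

N_FEATURES = 14  # matches the 14 candidate financial ratios in the abstract


def random_chromosome(rng):
    # Genes: one 0/1 flag per feature, then log10(C) and log10(gamma).
    bits = [rng.randint(0, 1) for _ in range(N_FEATURES)]
    return bits + [rng.uniform(-2, 4), rng.uniform(-4, 1)]


def toy_fitness(chrom):
    # Hypothetical placeholder for cross-validated MSVM accuracy:
    # rewards an arbitrary "useful" feature set and kernel setting.
    useful = {0, 3, 6, 8, 13}
    bits, log_c, log_gamma = chrom[:N_FEATURES], chrom[-2], chrom[-1]
    hits = sum(1 for i, b in enumerate(bits) if b == (1 if i in useful else 0))
    return hits / N_FEATURES - 0.02 * abs(log_c - 1.0) - 0.02 * abs(log_gamma + 2.0)


def tournament(pop, scores, rng, k=3):
    # Selection: the fittest of k randomly drawn individuals.
    best = max(rng.sample(range(len(pop)), k), key=lambda i: scores[i])
    return pop[best]


def crossover(a, b, rng):
    # One-point crossover over the whole chromosome.
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]


def mutate(chrom, rng, rate=0.05):
    child = list(chrom)
    for i in range(N_FEATURES):
        if rng.random() < rate:
            child[i] = 1 - child[i]          # flip a feature flag
    for i in (-2, -1):
        if rng.random() < rate:
            child[i] += rng.gauss(0.0, 0.5)  # perturb a kernel parameter
    return child


def run_ga(fitness, generations=40, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        elite = pop[max(range(pop_size), key=lambda i: scores[i])]
        history.append(max(scores))
        children = [elite]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = children
    return elite, history


best, history = run_ga(toy_fitness)
selected = [i for i, b in enumerate(best[:N_FEATURES]) if b == 1]
print("selected features:", selected)
print("log10(C), log10(gamma):", best[-2], best[-1])
```

Encoding the feature flags and kernel parameters in one chromosome is what lets a single search optimize both at once. To use the sketch on real data, `toy_fitness` would be replaced with a function that trains a multiclass SVM on the flagged features and returns its validation accuracy.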

Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm (유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습)

  • Kim, Sang Hun;Chung, Byung Hee;Lee, Gun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.351-360 / 2018
  • The LWR (Locally Weighted Regression) model, traditionally a lazy learning model, produces a prediction for a given input, the query point; it is a regression equation over a short interval, obtained by learning that gives higher weights to data closer to the query point. We study an incremental ensemble learning approach for LWR, a form of lazy learning and memory-based learning. The proposed incremental ensemble learning method sequentially generates and integrates LWR models over time using a genetic algorithm to obtain a solution at a specific query point. A weakness of existing LWR models is that multiple LWR models can be generated depending on the indicator function and the data sample selection, and the quality of the predictions can vary depending on the model. However, no research has been conducted on the problem of selecting or combining multiple LWR models. In this study, after generating the initial LWR model according to the indicator function and the sample data set, we iterate an evolutionary learning process to obtain a proper indicator function and assess the LWR models on the other sample data sets to overcome data set bias. We adopt an eager learning method to generate and store LWR models gradually as data is generated for all sections. To obtain a prediction at a specific point in time, an LWR model is generated from newly generated data within a predetermined interval and then combined with the existing LWR models in a section using a genetic algorithm. The proposed method shows better results than selecting multiple LWR models using the simple average method. 
The results of this study are compared with predictions from multiple regression analysis on real data such as hourly traffic volume in a specific area and hourly sales at a highway rest area.
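
The base LWR prediction that this abstract builds on can be sketched for one input variable as follows. This is a minimal illustration under stated assumptions, not the paper's method: it uses a Gaussian weighting with a hypothetical bandwidth parameter `tau` in place of the indicator function the abstract mentions, and it does not cover the incremental ensemble or the genetic-algorithm combination step.

```python
import math


def lwr_predict(xs, ys, query, tau=1.0):
    """Predict y at `query` with a locally weighted straight-line fit."""
    # Gaussian weights: samples near the query point count more.
    w = [math.exp(-((x - query) ** 2) / (2 * tau ** 2)) for x in xs]
    sw = sum(w)
    # Weighted means of x and y.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope (0 if all the weight sits on one x value).
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (query - mx)


# Hourly-style toy data (made up for illustration): y = x^2 sampled at integers.
xs = [float(x) for x in range(7)]
ys = [x * x for x in xs]
print(lwr_predict(xs, ys, query=3.0, tau=0.5))
```

Because a new weighted fit is computed for every query point, nothing is learned ahead of time, which is why LWR is described as lazy learning. A smaller `tau` makes the fit more local, and a larger one moves it toward ordinary linear regression.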

S-MADP : Service based Development Process for Mobile Applications of Medium-Large Scale Project (S-MADP : 중대형 프로젝트의 모바일 애플리케이션을 위한 서비스 기반 개발 프로세스)

  • Kang, Tae Deok;Kim, Kyung Baek;Cheng, Ki Ju
    • KIPS Transactions on Software and Data Engineering / v.2 no.8 / pp.555-564 / 2013
  • Innovative evolution in mobile devices, along with the recent spread of tablet PCs and smartphones, is bringing change not only to individual life but also to enterprise applications. In particular, medium-large mobile applications for large enterprises, which generally take more than 3 months to develop, increase significantly in importance and complexity. Agile methodology is generally used as the development process for medium-large scale mobile applications, but issues arise such as high dependency on skilled developers and a lack of detailed development directives. In this paper, S-MADP (Smart Mobile Application Development Process) is proposed to mitigate these issues. S-MADP is a service-oriented development process that extends an object-oriented development process for medium-large scale mobile applications. S-MADP provides detailed development directives for each activity throughout the process, for defining services as server-based or client-based, and for reusing services. Also, to support various user interfaces, S-MADP provides detailed UI development directives. To evaluate the performance of S-MADP, three mobile application development projects were conducted and the results were analyzed. The projects are 'TBS (TB Mobile Service) 3.0' at TB company, a mobile app store at TS company, and mobile groupware at TG group. The projects show that, compared with existing Agile methodology, S-MADP provides more detailed design information on 'Minimizing the use of resources', 'Service-based designing' and 'User interface optimized for mobile devices', which need to be considered carefully in the mobile application development environment. It therefore improves the usability, maintainability, and efficiency of the developed mobile applications. 
Field tests show that S-MADP outperforms an Agile methodology by about 25% in terms of the man-months required to develop a medium-large mobile application.