• Title/Summary/Keyword: Gene Prediction

Search results: 292 items (processing time: 0.029 seconds)

ORF Miner: a Web-based ORF Search Tool

  • Park, Sin-Gi;Kim, Ki-Bong
    • Genomics & Informatics
    • /
    • Vol. 7, No. 4
    • /
    • pp.217-219
    • /
    • 2009
  • The primary clue for locating protein-coding regions is the open reading frame (ORF), and determining ORFs is the first step toward gene prediction, especially for prokaryotes. To this end, we have developed a web-based ORF search tool called ORF Miner. ORF Miner is a graphical analysis utility that determines all possible open reading frames of a selectable minimum size in an input sequence. The tool identifies all open reading frames using alternative genetic codes as well as the standard one and reports a list of ORFs with the corresponding deduced amino acid sequences. ORF Miner can be employed for sequence annotation and can provide a crucial clue for determining the actual protein-coding regions.
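
The kind of search described in this abstract can be illustrated with a minimal sketch. This is not ORF Miner's implementation: the function names, the `min_codons` parameter, the ATG-only start rule, and the coordinate convention (0-based positions on the scanned strand) are assumptions made for illustration. The codon table is the standard genetic code; an alternative code is passed in as a modified table.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=30, table=CODON_TABLE, starts=("ATG",)):
    """Scan all six reading frames and return (strand, frame, start, end, protein).

    An ORF here runs from the first start codon in a frame to the next
    in-frame stop codon; start and end are 0-based offsets on the scanned
    strand, with end pointing just past the stop codon (assumed convention).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in starts:
                    start = i  # open a candidate ORF
                elif start is not None and table.get(codon, "X") == "*":
                    n_codons = (i - start) // 3  # length without the stop codon
                    if n_codons >= min_codons:
                        protein = "".join(
                            table.get(s[j:j + 3], "X") for j in range(start, i, 3)
                        )
                        orfs.append((strand, frame, start, i + 3, protein))
                    start = None  # close the ORF and keep scanning
    return orfs


# Alternative genetic code in which TGA is read as tryptophan instead of stop.
ALT_TABLE = dict(CODON_TABLE, TGA="W")
standard_hits = find_orfs("ATGTGAAAATAA", min_codons=1)
alternative_hits = find_orfs("ATGTGAAAATAA", min_codons=1, table=ALT_TABLE)
```

With the standard code the example sequence stops at TGA after a single codon, while the alternative table reads through it to TAA, which shows why the choice of genetic code changes the reported ORFs.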

A prediction model for strength and strain of CFRP-confined concrete cylinders using gene expression programming

  • Sema, Alacali
    • Computers and Concrete
    • /
    • Vol. 30, No. 6
    • /
    • pp.377-391
    • /
    • 2022
  • The use of carbon fiber-reinforced polymers (CFRP) has increased widely because they enhance the ultimate strength and ductility of reinforced concrete (RC) structures. This study presents a prediction model for the axial compressive strength and strain of normal-strength concrete cylinders confined with CFRP. Soft computing approaches have been used extensively for modeling in many areas of civil engineering. Therefore, gene expression programming (GEP) models were used in this study to predict the axial compressive strength and strain of CFRP-confined concrete specimens. For this purpose, the parameters of 283 CFRP-confined concrete specimens collected from 38 experimental studies in the literature were taken as input variables for the GEP-based models. The results of the GEP models were then compared statistically with those of models proposed by various researchers. The R² values for the strength and strain of CFRP-confined concrete were 0.897 and 0.713, respectively. The comparison reveals that the proposed GEP-based models for CFRP-confined concrete perform best among the existing models.
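
For readers unfamiliar with the coefficient of determination reported above (R²), the following is a minimal sketch of one common definition, 1 − SS_res/SS_tot. The abstract does not state which definition the authors used (some studies report the squared Pearson correlation instead), and the function name and sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and model-predicted values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
modelled = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, modelled)
```

Under this definition a value of 1 means the predictions match the measurements exactly, and a value of 0 means the model does no better than always predicting the mean of the measurements.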

Prediction of earthquake-induced crest settlement of embankment dams using gene expression programming

  • Evren, Seyrek;Sadettin, Topcu
    • Geomechanics and Engineering
    • /
    • Vol. 31, No. 6
    • /
    • pp.637-651
    • /
    • 2022
  • The seismic design of embankment dams requires more comprehensive studies to understand the behaviour of dams. The deformations that primarily control this behaviour occur during or after earthquake loading. Dam failures and incidents show that the impacts of deformations should be reviewed for existing and new embankment dams. Overtopping erosion failure can occur if crest deformations exceed the freeboard at the time of the deformations. Therefore, crest settlement is one of the most critical deformations. This study developed empirical formulas using Gene Expression Programming (GEP) based on 88 cases. In the analyses, dam height (Hd), alluvium thickness (Ha), and the magnitude-acceleration-factor (MAF) values, developed within this study from earthquake magnitude (Mw) and peak ground acceleration (PGA), were chosen as variables. The results show that the GEP models developed in the paper are remarkably robust and accessible tools for predicting earthquake-induced crest settlement of embankment dams and that they outperform the existing formulation. In addition, dam engineering professionals can use them in practice because the variables of the prediction equations are easily accessible after an earthquake.

Correlation between MR Image-Based Radiomics Features and Risk Scores Associated with Gene Expression Profiles in Breast Cancer

  • 김가람;구유진;김준호;김은경
    • Journal of the Korean Society of Radiology
    • /
    • Vol. 81, No. 3
    • /
    • pp.632-643
    • /
    • 2020
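  • Purpose: To analyze the relationship between MR image-based radiomics features and the genomic features of breast cancer, including molecular subtype and gene expression profile-based risk scores. Materials and Methods: We used publicly available data from The Cancer Genome Atlas and The Cancer Imaging Archive. Radiomics features were extracted from MR images of 122 breast cancers. PAM50 subtypes were classified and risk scores were assigned according to gene expression profiles. The relationship between radiomics features and biomolecular characteristics was analyzed. Penalized generalized regression analysis was used to build prediction models. Results: The PAM50 subtype was significantly associated with maximum 2D diameter (p = 0.0189), degree of correlation (p = 0.0386), and inverse difference moment normalized (p = 0.0337). Among the risk score systems, GGI and GENE70 shared eight radiomics features that were statistically significant (p = 0.0008 to 0.0492). Maximum 2D diameter was the most significantly associated feature in both risk score systems (p = 0.0139 and p = 0.0008), but the overall degree of association of the prediction models was weak; the highest correlation coefficient was 0.2171, for GENE70. Conclusion: Among the radiomics features, maximum 2D diameter, degree of correlation, and inverse difference moment normalized showed significant, although weak, associations with the PAM50 subtype and with gene expression profile-based risk scores such as GENE70.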

Classification of Genes Based on Age-Related Differential Expression in Breast Cancer

  • Lee, Gunhee;Lee, Minho
    • Genomics & Informatics
    • /
    • Vol. 15, No. 4
    • /
    • pp.156-161
    • /
    • 2017
  • Transcriptome analysis has been widely used to make biomarker panels to diagnose cancers. In breast cancer, the age of the patient has been known to be associated with clinical features. As clinical transcriptome data have accumulated significantly, we classified all human genes based on age-specific differential expression between normal and breast cancer cells using public data. We retrieved the values for gene expression levels in breast cancer and matched normal cells from The Cancer Genome Atlas. We divided genes into two classes by paired t test without considering age in the first classification. We carried out a secondary classification of genes for each class into eight groups, based on the patterns of the p-values, which were calculated for each of the three age groups we defined. Through this two-step classification, gene expression was eventually grouped into 16 classes. We showed that this classification method could be applied to establish a more accurate prediction model to diagnose breast cancer by comparing the performance of prediction models with different combinations of genes. We expect that our scheme of classification could be used for other types of cancer data.

Structure Prediction of the Peptide Synthesized with the Nonribosomal Peptide Synthetase Gene from Bradyrhizobium japonicum

  • JUNG BO-RA;LEE YUKYUNG;LIM YOONGHO;AHN JOONG-HOON
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 15, No. 3
    • /
    • pp.656-659
    • /
    • 2005
  • Small peptides synthesized by nonribosomal peptide synthetase (NRPS) genes are found in bacteria and fungi. While some microbial taxa have few, others make a large number and variety. However, biochemical characterization of the products synthesized by NRPSs demands a great deal of effort. Since the completion of the genome projects of numerous microorganisms, the number of available NRPS genes has been growing. Prediction of the peptides encoded by NRPS genes could save time and effort. We chose the NRPS gene from Bradyrhizobium japonicum as a model to predict the peptide structure encoded by NRPS genes. Using computational analyses, the domain structure of this gene was defined, and the structure of the peptide synthesized by this NRPS was deduced. It was found to encode a tripeptide consisting of proline-serine-phenylalanine. This method would be helpful for predicting the structures of small peptides from various NRPS genes in genome sequences.

Multi-gene genetic programming for the prediction of the compressive strength of concrete mixtures

  • Ghahremani, Behzad;Rizzo, Piervincenzo
    • Computers and Concrete
    • /
    • Vol. 30, No. 3
    • /
    • pp.225-236
    • /
    • 2022
  • In this article, Multi-Gene Genetic Programming (MGGP) is proposed for the estimation of the compressive strength of concrete. MGGP is known to be a powerful algorithm able to find a relationship between certain input space features and a desired output vector. In contrast to most conventional machine learning algorithms, which are often used as "black boxes" that do not provide a mathematical formulation of the input-output relationship, MGGP is able to identify a closed-form formula for that relationship. In the study presented in this article, MGGP was used to predict the compressive strength of plain concrete, concrete with fly ash, and concrete with furnace slag. A formula was extracted for each mixture, and the performance and accuracy of the predictions were compared with the results of Artificial Neural Network (ANN) and Extreme Learning Machine (ELM) algorithms, which are conventional and well-established machine learning techniques. The results of the study showed that MGGP can achieve a desirable performance, as the coefficients of determination for plain concrete, concrete with ash, and concrete with slag in the testing phase were 0.928, 0.906, and 0.890, respectively. In addition, it was found that MGGP outperforms ELM in all cases and that its accuracy is slightly lower than that of ANN. However, MGGP models are practical and easy to use since they extract closed-form formulas that may be implemented and used for the prediction of compressive strength.

Statistical Analysis for Feature Subset Selection Procedures.

  • Kim, In-Young;Lee, Sun-Ho;Kim, Sang-Cheol;Rha, Sun-Young;Chung, Hyun-Cheol;Kim, Byung-Soo
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Proceedings of the 2nd Annual Conference of the Korean Society for Bioinformatics and Systems Biology, 2003
    • /
    • pp.101-106
    • /
    • 2003
  • In this paper, we propose using Hotelling's T² statistic to detect a set of differentially expressed (DE) genes in colorectal cancer, based on gene expression levels in tumor tissues compared with those in normal tissues, and we evaluate its predictivity, which lets us rank genes for the development of biomarkers for population screening of colorectal cancer. We compared the prediction rates based on the DE genes selected by Hotelling's T² statistic and by the univariate t statistic using various prediction methods: regularized discriminant analysis and a support vector machine. The results show that the prediction rate based on T² is better than that based on the univariate t. This implies that it may not be sufficient to look at each gene in a separate universe and that evaluating combinations of genes reveals interesting information that would not be discovered otherwise.

Partial AUC maximization for essential gene prediction using genetic algorithms

  • Hwang, Kyu-Baek;Ha, Beom-Yong;Ju, Sanghun;Kim, Sangsoo
    • BMB Reports
    • /
    • Vol. 46, No. 1
    • /
    • pp.41-46
    • /
    • 2013
  • Identifying genes indispensable for an organism's life and their characteristics is one of the central questions in current biological research, and hence it would be helpful to develop computational approaches towards the prediction of essential genes. The performance of a predictor is usually measured by the area under the receiver operating characteristic curve (AUC). We propose a novel method by implementing genetic algorithms to maximize the partial AUC that is restricted to a specific interval of lower false positive rate (FPR), the region relevant to follow-up experimental validation. Our predictor uses various features based on sequence information, protein-protein interaction network topology, and gene expression profiles. A feature selection wrapper was developed to alleviate the over-fitting problem and to weigh each feature's relevance to prediction. We evaluated our method using the proteome of budding yeast. Our implementation of genetic algorithms maximizing the partial AUC below 0.05 or 0.10 of FPR outperformed other popular classification methods.
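
The quantity being maximized in this abstract, the partial AUC over a low-FPR interval, can be sketched as follows. This only computes the objective for a given set of scores; it is not the authors' genetic algorithm or feature-selection wrapper. The function names, the tie handling, and the linear interpolation at the FPR cutoff are assumptions made for illustration, and the labels and scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low.

    labels are 1 (positive, e.g. essential gene) or 0 (negative);
    examples with tied scores are added to the curve together.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def partial_auc(labels, scores, max_fpr=0.05):
    """Area under the ROC curve restricted to FPR in [0, max_fpr] (trapezoid rule)."""
    pts = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at the FPR limit by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels and predictor scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
full_auc = partial_auc(labels, scores, max_fpr=1.0)
pauc_010 = partial_auc(labels, scores, max_fpr=0.10)
```

Because the area is not normalized here, a perfect ranking yields a partial AUC equal to `max_fpr`, which gives a natural upper bound when comparing predictors at the same cutoff.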

Prediction of creep in concrete using genetic programming hybridized with ANN

  • Hodhod, Osama A.;Said, Tamer E.;Ataya, Abdulaziz M.
    • Computers and Concrete
    • /
    • Vol. 21, No. 5
    • /
    • pp.513-523
    • /
    • 2018
  • Time-dependent strain due to creep is a significant factor in structural design. Multi-gene genetic programming (MGGP) and an artificial neural network (ANN) are used to develop two models for the prediction of creep compliance in concrete. The first model was developed with the MGGP technique and the second with a hybridized MGGP-ANN. In the MGGP-ANN, the ANN works in parallel with MGGP to predict the errors of the MGGP model. A total of 187 experimental data sets containing 4242 data points were filtered from the NU-ITI database and used to develop the MGGP and MGGP-ANN models. These models contain six input variables: average compressive strength at 28 days, relative humidity, volume-to-surface ratio, cement type, age at the start of loading, and age at the creep measurement. A practical equation based on MGGP was developed. A parametric study was carried out with a group of hypothetical data generated within the range of the data used, in order to check the generalization ability of the MGGP and MGGP-ANN models. To confirm the validity of the MGGP and MGGP-ANN models, the results of two creep prediction code models (ACI209 and CEB) and two empirical models (B3 and GL 2000) are compared against the NU-ITI database.