• Title/Summary/Keyword: Gene Prediction

Eukaryotic Gene Structure Prediction Using Duration HMM (Duration HMM을 이용한 진핵생물 유전자 구조 예측)

  • Tae, Hong-Seok;Park, Kie-Jung
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.200-209 / 2003
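  • Gene structure prediction, which identifies gene regions in a given nucleotide sequence, is one of the important steps of a genome project and strongly influences the project as a whole. Because eukaryotic genomes are structurally more complex than prokaryotic genomes, a wider variety of models has been proposed for eukaryotic gene structure prediction than for prokaryotes. Our group developed the EGSP (Eukaryotic Gene Structure Prediction) program, which is based on a duration hidden Markov model. Among the eukaryotic gene structure prediction algorithms developed so far, GenScan is reported to be the most sophisticated; to analyze the results of EGSP, we compared its predictions with those of GenScan, GeneID, and Morgan under several criteria. Although EGSP has a sophisticated prediction model, the parameters of its component modules are not yet sufficiently refined, so better results are expected from improving the model and tuning each module.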

PromoterWizard: An Integrated Promoter Prediction Program Using Hybrid Methods

  • Park, Kie-Jung;Kim, Ki-Bong
    • Genomics & Informatics / v.9 no.4 / pp.194-196 / 2011
  • Promoter prediction is an important problem closely related to core bioinformatics tasks such as constructing gene regulatory networks and annotating gene function. In this context, we developed PromoterWizard, an integrated promoter prediction program using hybrid methods, which can detect the core promoter region and the transcription start site (TSS) in vertebrate genomic DNA sequences, an issue of obvious importance for genome annotation efforts. PromoterWizard consists of three main modules and two auxiliary modules. The three main modules are the CDRM (Composite Dependency Reflecting Model) module, the SVM (Support Vector Machine) module, and the ICM (Interpolated Context Model) module. The two auxiliary modules, CpG Island Detector and GCPlot, may help improve the predictive accuracy of the three main modules and help human curators decide on the final annotation.
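
The abstract does not say how the CpG Island Detector or GCPlot modules are implemented. As a rough illustration only, the sketch below computes two quantities commonly used for this kind of analysis: GC content and the observed/expected CpG ratio in a sliding window. The function names, the 200 bp window, and the thresholds (GC content above 0.5, ratio above 0.6, the widely cited Gardiner-Garden and Frommer criteria) are assumptions, not details taken from PromoterWizard.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def scan_windows(seq: str, window: int = 200, step: int = 1,
                 min_gc: float = 0.5, min_ratio: float = 0.6) -> list:
    """Return start positions of windows that pass both (assumed) thresholds."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        if gc_content(w) > min_gc and cpg_obs_exp(w) > min_ratio:
            hits.append(start)
    return hits
```

Plotting `gc_content` per window along the sequence gives a simple GC plot; adjacent hits from `scan_windows` would normally be merged into candidate islands before a curator inspects them.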

MicroRNA-Gene Association Prediction Method using Deep Learning Models

  • Seung-Won Yoon;In-Woo Hwang;Kyu-Chul Lee
    • Journal of information and communication convergence engineering / v.21 no.4 / pp.294-299 / 2023
  • Micro ribonucleic acids (miRNAs) can regulate the protein expression levels of genes in the human body and have recently been reported to be closely related to the cause of disease. Determining the genes related to miRNAs will aid in understanding the mechanisms underlying complex miRNAs. However, identifying miRNA-related genes through wet experiments (in vivo, traditional methods) is time-consuming and costly. To overcome these problems, recent studies have investigated the prediction of miRNA relevance using deep learning models. This study presents a method for predicting the relationships between miRNAs and genes. First, we reconstruct a negative dataset using the proposed method. We then extract features using an autoencoder and concatenate the resulting feature vector with the original data. The concatenated data are then used to train a long short-term memory model. Our model achieved an area under the curve of 0.9609, outperforming previously reported models trained on the same dataset.

Association of MC4R Gene Polymorphisms with Growth and Body Composition Traits in Chicken

  • Li, Chun-Yu;Li, Hui
    • Asian-Australasian Journal of Animal Sciences / v.19 no.6 / pp.763-768 / 2006
  • Genetic and pharmacological studies in mice have demonstrated a complementary role for the melanocortin 4 receptor (MC4R) in the control of food intake, energy balance and body weight. This study was designed to investigate the association of an MC4R gene polymorphism with chicken growth and body composition traits in broiler lines divergently selected for abdominal fat. A SNP (G54C) was found in the CDS region of the chicken MC4R gene. Least squares analysis and analysis of variance revealed a significant association between the G54C SNP and BW, CW and SL at 7 wk of age, with significant differences among genotypes (p < 0.05). Protein secondary and tertiary structure prediction showed that, in the mutant protein, a helix appeared at the 13th amino acid and two strands appeared at the 14th and 15th amino acids. This may change the activity or function of the MC4R gene in poultry.

Identification of Combined Biomarker for Predicting Alzheimer's Disease Using Machine Learning

  • Ki-Yeol Kim
    • Korean Journal of Biological Psychiatry / v.30 no.1 / pp.24-30 / 2023
  • Objectives: Alzheimer's disease (AD) is the most common form of dementia in older adults, damaging the brain and resulting in impaired memory, thinking, and behavior. The identification of differentially expressed genes and related pathways among affected brain regions can provide more information on the mechanisms of AD. The aim of our study was to identify differentially expressed genes associated with AD and combined biomarkers among them to improve AD risk prediction accuracy. Methods: Machine learning methods were used to compare the performance of the identified combined biomarkers. In this study, three publicly available gene expression datasets from the hippocampal brain region were used. Results: We detected 31 significant common genes from two different microarray datasets using the limma package. Some of them belonged to 11 biological pathways. Combined biomarkers were identified in two microarray datasets and were evaluated in a different dataset. The performance of the predictive models using the combined biomarkers was superior to that of models using a single gene. When two genes were combined, the most predictive gene set in the evaluation dataset was ATR and PRKCB when linear discriminant analysis was applied. Conclusions: Combined biomarkers showed good performance in predicting the risk of AD. The constructed predictive nomogram using combined biomarkers could easily be used by clinicians to identify high-risk individuals so that more efficient trials could be designed to reduce the incidence of AD.

Clinical significance of APOB inactivation in hepatocellular carcinoma

  • Lee, Gena;Jeong, Yun Seong;Kim, Do Won;Kwak, Min Jun;Koh, Jiwon;Joo, Eun Wook;Lee, Ju-Seog;Kah, Susie;Sim, Yeong-Eun;Yim, Sun Young
    • Experimental and Molecular Medicine / v.50 no.11 / pp.7.1-7.12 / 2018
  • Recent findings from The Cancer Genome Atlas project have provided a comprehensive map of genomic alterations that occur in hepatocellular carcinoma (HCC), including unexpected mutations in apolipoprotein B (APOB). We aimed to determine the clinical significance of this non-oncogenetic mutation in HCC. An Apob gene signature was derived from genes that differed between control mice and mice treated with siRNA specific for Apob (1.5-fold difference; P < 0.005). Human gene expression data were collected from four independent HCC cohorts (n = 941). A prediction model was constructed using Bayesian compound covariate prediction, and the robustness of the APOB gene signature was validated in HCC cohorts. The correlation of the APOB signature with previously validated gene signatures was assessed, and network analysis was conducted using ingenuity pathway analysis. APOB inactivation was associated with poor prognosis when the APOB gene signature was applied in all human HCC cohorts. Poor prognosis with APOB inactivation was consistently observed through cross-validation with previously reported gene signatures (NCIP A, HS, high-recurrence SNUR, and high RS subtypes). Knowledge-based gene network analysis using genes that differed between low-APOB and high-APOB groups in all four cohorts revealed that low-APOB activity was associated with upregulation of oncogenic and metastatic regulators, such as HGF, MTIF, ERBB2, FOXM1, and CD44, and inhibition of tumor suppressors, such as TP53 and PTEN. In conclusion, APOB inactivation is associated with poor outcome in patients with HCC, and APOB may play a role in regulating multiple genes involved in HCC development.

Promoter Prediction using Genetic Algorithm (유전자 알고리즘을 이용한 Promoter 예측)

  • 오민경;김창훈;김기봉;공은배;김승목
    • Proceedings of the Korean Information Science Society Conference / 1999.10b / pp.12-14 / 1999
  • A promoter is a specific region of DNA located upstream of the transcription start site to which RNA polymerase binds with high affinity, and DNA transcription starts from it. Promoters of genes grouped by function or tissue specificity are known to regulate specific transcription through characteristic combinations of patterns, so a promoter can reveal some information about its gene. We collected human housekeeping gene promoters from EPD (Eukaryotic Promoter Database) and the EMBL nucleic acid sequence database, and used a genetic algorithm, a well-known optimization algorithm, to search for all patterns that occur significantly among them.

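
The abstract above does not describe how patterns were encoded or scored, so the following is only a toy sketch of how a genetic algorithm can search for a shared sequence pattern. It assumes a fixed-length motif over A, C, G, and T, a fitness equal to the number of sequences that contain the motif exactly, single-point crossover, per-base mutation, and elitism. All names and parameter values are hypothetical.

```python
import random

ALPHABET = "ACGT"


def fitness(motif, sequences):
    """Number of sequences that contain the motif as an exact substring."""
    return sum(motif in s for s in sequences)


def mutate(motif, rate, rng):
    """Replace each base with a random base with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch
                   for ch in motif)


def crossover(a, b, rng):
    """Single-point crossover of two equal-length motifs (length >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve_motif(sequences, k=6, pop_size=50, generations=40,
                 rate=0.1, seed=0):
    """Return (best motif, best fitness recorded at each generation)."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(ALPHABET) for _ in range(k))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, sequences), reverse=True)
        history.append(fitness(pop[0], sequences))
        parents = pop[: pop_size // 2]   # keep the fitter half as parents
        children = [pop[0]]              # elitism: carry the best motif over
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pop = children
    best = max(pop, key=lambda m: fitness(m, sequences))
    return best, history
```

A realistic promoter search would more likely score approximate matches or position weight matrices against a background set; exact substring counting is used here only to keep the sketch short.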

A Study on Sasang Constitutional Gene Selection Using DNA Chips by Multivariate Analysis (유전자 칩 및 다변량 분석방법을 이용한 사상체질 유전자 선별에 관한 연구)

  • Kim, Pan-Joon;Seo, Eun-Hee;Lee, Jung-Hwan;Ha, Jin-Ho;Choi, Hong-Sik;Jung, Tae-Young;Goo, Deok-Mo
    • Journal of Sasang Constitutional Medicine / v.18 no.3 / pp.131-144 / 2006
  • 1. Objectives: This study uses a DNA chip containing 16,383 gene codes, together with several statistical prediction methods, to derive objective indices for constitution diagnosis. 2. Methods: Blood was drawn from subjects whose constitution had been confirmed, and their gene information was analyzed using a 1.7k DNA chip to find gene correlations through multivariate statistical methods. 3. Results and Conclusions: Distinctive genes such as AK001919, U09384, NM_001805, X99962, NM_004796, AK026738, AL050148, BC002538, AK027074, AK026219, AF087962, AL390142, NM_015372, AL157466, NM_002446, AK024523, NM_014706, NM_014746 and AL137544 were related to Taeumin; AL157448, NM_005957, NM_005656, NM_017548, AK027246, NM_003025, NM_012302 and NM_005905 were represented in Soeumin; and AK026503, AF147325, NM_002076, AF147307, AK001375, NM_003740, NM_005114, AB007890, NM_005505, NM_015900, NM_014936, Z70694, AB023154, U52076, NM_004360, NM_005835, NM_017528, AF087987, NM_014897, AK021720, NM_006420, AJ277915, AK002118 and AK021918 were associated with Soyangin. This study shows that a prediction system could be developed by sorting the genes for each constitution and examining the distinctive expression pattern of each constitution.

Applying a modified AUC to gene ranking

  • Yu, Wenbao;Chang, Yuan-Chin Ivan;Park, Eunsik
    • Communications for Statistical Applications and Methods / v.25 no.3 / pp.307-319 / 2018
  • High-throughput technologies enable the simultaneous evaluation of thousands of genes that could discriminate different subclasses of complex diseases. Ranking genes according to differential expression is an important screening step for follow-up analysis. Many statistical measures have been proposed for this purpose. A good ranked list should provide a stable rank (at least for the top-ranked genes), and the top-ranked genes should have high power in differentiating disease status. However, the literature places little emphasis on ranking genes by these two criteria simultaneously. To meet both criteria at once, we propose applying a previously reported metric, the modified area under the receiver operating characteristic curve, to gene ranking. The proposed ranking method is found to be promising in producing a stable ranking list and good prediction performance for the top-ranked genes. The findings are illustrated through studies on both synthesized data and real microarray gene expression data. The proposed method is recommended for ranking genes or other biomarkers in high-dimensional omics studies.
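
The abstract does not define the modified AUC, so the sketch below uses the ordinary AUC instead, computed as the probability that a randomly chosen sample from one class has higher expression than a randomly chosen sample from the other, with ties counted as one half. It is meant only to show what AUC-based gene ranking looks like in general. The data layout (a dict of per-gene expression lists plus 0/1 labels) and the function names are assumptions.

```python
def auc(pos, neg):
    """Fraction of (pos, neg) pairs in which pos > neg; ties count 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_genes(expr, labels):
    """Rank genes by how well their expression separates the two classes.

    expr:   dict mapping gene name -> list of expression values per sample
    labels: list of 0/1 class labels, in the same sample order
    """
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        a = auc(pos, neg)
        # Up- and down-regulated genes are treated as equally informative.
        scores[gene] = max(a, 1.0 - a)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Rank stability, the other criterion the abstract names, could be checked by repeating `rank_genes` on resampled subsets of the samples and comparing the resulting top-ranked lists.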

Deep learning for stage prediction in neuroblastoma using gene expression data

  • Park, Aron;Nam, Seungyoon
    • Genomics & Informatics / v.17 no.3 / pp.30.1-30.4 / 2019
  • Neuroblastoma is a major cause of cancer death in early childhood, and its timely and correct diagnosis is critical. Gene expression datasets have recently been considered as a powerful tool for cancer diagnosis and subtype classification. However, no attempts have yet been made to apply deep learning using gene expression to neuroblastoma classification, although deep learning has been applied to cancer diagnosis using image data. Taking the International Neuroblastoma Staging System stages as multiple classes, we designed a deep neural network using the gene expression patterns and stages of neuroblastoma patients. Despite a small patient population (n = 280), stage 1 and 4 patients were well distinguished. If it is possible to replicate this approach in a larger population, deep learning could play an important role in neuroblastoma staging.