• Title/Summary/Keyword: Significant gene-set


Association of a miR-502-Binding Site Single Nucleotide Polymorphism in the 3'-Untranslated Region of SET8 and the TP53 Codon 72 Polymorphism with Cervical Cancer in the Chinese Population

  • Yang, Shao-Di;Cai, Yan-Lin;Jiang, Pei;Li, Wen;Tang, Jian-Xin
    • Asian Pacific Journal of Cancer Prevention / v.15 no.16 / pp.6505-6510 / 2014
  • Objective: This study was conducted to identify whether polymorphic variants of SET domain-containing protein 8 (SET8) and tumor protein p53 (TP53) codon 72, either independently or jointly, might be associated with increased risk for cervical cancer. Methods: We genotyped SET8 and TP53 codon 72 polymorphisms in peripheral blood DNA from 114 cervical cancer patients and 200 controls using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis and direct DNA sequencing. Results: The SET8 CC (odds ratio (OR)=2.717, 95% CI=1.436-5.141) and TP53 GG (OR=2.168, 95% CI=1.149-4.089) genotypes were associated with an increased risk of cervical cancer compared with the SET8 TT and TP53 CC genotypes, respectively. In addition, interaction between the SET8 and TP53 polymorphisms increased the risk of cervical cancer in a synergistic manner, with an OR of 9.913 (95% CI=2.028-48.459) for subjects carrying both the SET8 CC and TP53 GG genotypes. Conclusion: These data suggest that the miR-502-binding site SNP in the 3'-UTR of SET8 and the TP53 codon 72 polymorphism are significantly associated with cervical cancer in the Chinese population, and that there is a gene-gene interaction.
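The odds ratios and 95% confidence intervals quoted in this abstract are the kind of quantity that can be computed from a 2x2 genotype-by-status table. The sketch below is only illustrative: the abstract does not report the underlying counts or say which interval method was used, so the Wald (log-scale) interval, the function name `odds_ratio_ci`, and the counts passed to it are all assumptions, not values from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with the risk genotype      b: cases with the reference genotype
    c: controls with the risk genotype   d: controls with the reference genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts chosen for illustration only (NOT from the study).
or_value, ci_low, ci_high = odds_ratio_ci(30, 20, 40, 60)
print(f"OR={or_value:.3f}, 95% CI={ci_low:.3f}-{ci_high:.3f}")
```

An interval that lies entirely above 1, as the intervals reported in the abstract do, is what supports reading the genotype as associated with increased risk.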

Correlation Analysis between Regulatory Sequence Motifs and Expression Profiles by Kernel CCA

  • Rhee, Je-Keun;Joung, Je-Gun;Chang, Jeong-Ho;Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.63-68 / 2005
  • Transcription factors regulate gene expression by binding to gene upstream regions. Each transcription factor has a specific binding site in the promoter region. The analysis of gene upstream sequences is therefore necessary for understanding the regulatory mechanisms of genes, under the plausible assumption that DNA sequence motif profiles are closely related to the expression behaviors of the corresponding genes. Here, we present an effective approach to analyzing the relation between gene expression profiles and gene upstream sequences on the basis of kernel canonical correlation analysis (kernel CCA). Kernel CCA is a useful method for finding underlying relationships between two different data sets. In an application to a yeast cell cycle data set, it is shown that gene upstream sequence profiles are closely related to gene expression patterns in terms of canonical correlation scores. By further analysis of the contributing values, or weights, of sequence motifs in the construction of a pair of sequence motif profiles and expression profiles, we show that the proposed method can identify significant DNA sequence motifs involved in specific gene expression patterns in the yeast cell cycle, including both well-known and putative motifs.
GEDA: New Knowledge Base of Gene Expression in Drug Addiction

  • Suh, Young-Ju;Yang, Moon-Hee;Yoon, Suk-Joon;Park, Jong-Hoon
    • BMB Reports / v.39 no.4 / pp.441-447 / 2006
  • Abuse of drugs can elicit compulsive drug-seeking behaviors upon repeated administration, and ultimately leads to the phenomenon of addiction. We developed a procedure for standardizing microarray gene expression data from rat brain in drug addiction and stored the data in a single integrated database system, focusing on more effective data processing and interpretation. Another characteristic of the present database is its systematic flexibility for statistical analysis and for linking with other databases. We adopt an intelligent SQL querying system as the foundation of our database in order to set up an interactive module that can automatically read raw gene expression data in the standardized format. We maximize the usability of this database, helping users study significant gene expression and identify the biological function of genes through integrated, up-to-date gene information such as GO annotation and metabolic pathways. To collect the latest information on a selected gene from the database, we also set up a local BLAST search engine and a non-redundant sequence database updated from the NCBI server on a daily basis. We find that the present database is a useful query interface and data-mining tool, specifically for finding genes related to drug addiction. We apply this system to the identification and characterization of the behavior of methamphetamine-induced genes in rat brain.

Quantitative Analysis of Human- and Cow-Specific 16S rRNA Gene Markers for Assessment of Fecal Pollution in River Waters by Real-Time PCR

  • Jeong, Ju-Yong;Park, Hee-Deung;Lee, Kyong-Hee;Hwang, Jae-Hong;Ka, Jong-Ok
    • Journal of Microbiology and Biotechnology / v.20 no.2 / pp.245-253 / 2010
  • The base sequences representing human- and cow-specific 16S rRNA gene markers identified in a T-RFLP analysis were recovered from clone libraries. The human- and cow-specific primers were designed from these sequences and their specificities were analyzed with fecal DNA from human, cow, and pig. The AllBac primer set showed positive results for all human, cow, and pig samples, whereas the human-specific primer set showed a positive result only for the human sample and not for the cow or pig samples. Likewise, the cow-specific primer set showed a positive result only for the cow sample and not for the human or pig samples. A real-time PCR assay with these primers was developed for the identification and quantification of fecal pollution in river water. The human- and cow-specific markers were detected on the order of 9 $\log_{10}$ copies per gram of wet feces, which was two orders of magnitude lower than that of total Bacteroidales. For the river water samples, the human-specific marker was detected at $1.7-6.2\;\log_{10}$ copies/100 ml water, which was 2.4-4.9 orders of magnitude lower than that of total Bacteroidales. There was no significant correlation between total Bacteroidales and conventional fecal indicators, but there was a high correlation between Bacteroidales and the human-specific marker. This assay could reliably identify and quantify fecal pollution sources, enabling effective measures in the watersheds and facilitating water quality management.

Endo-sulfatase Sulf-1 Protein Expression is Down-regulated in Gastric Cancer

  • Gopal, Gopisetty;Shirley, Sundersingh;Raja, Uthandaraman Mahalinga;Rajkumar, Thangarajan
    • Asian Pacific Journal of Cancer Prevention / v.13 no.2 / pp.641-646 / 2012
  • In our recent report on gene expression in gastric cancer, we identified the endo-sulfatase Sulf-1 gene as up-regulated in gastric tumors relative to apparently normal (AN) and paired normal (PN) gastric tissue samples. In the present report, we investigate the protein expression levels of the Sulf-1 gene in gastric tumors and in AN and PN samples using tissue microarrays (TMA) and immunohistochemistry. Expression data were collected from two sets of TMAs containing replicate sections of tissue samples. Scoring data from TMA set-1 revealed a significant difference in Sulf-1 immunoreactivity between tumors and "normals" (PN and AN) (p-value = 0.001928). Sulf-1 expression in tumors was also significantly different from that in either PN (p-value = 0.019) or AN (p-value = 0.006) samples. Similar results were obtained from analysis of scoring data from the second set of arrays. Comparison of mRNA expression and protein expression in gastric tumor tissues revealed that 6/20 (30%) tumor samples showed up-regulated protein expression concordant with over-expression of mRNA. However, discordance, with mRNA over-expressed but protein expression down-regulated, was observed in the majority, 14/20 (70%), of tumor samples. Our study indicates down-regulation of Sulf-1 protein expression in gastric tumors relative to PN and AN samples, which is discordant with the mRNA over-expression seen in tumors.

Finding significant genes using factor analysis (요인 분석을 이용한 유의한 유전자 추출)

  • Lee, Jeong-Wha;Lee, Hye-Seon;Park, Hae-Sang;Jun, Chi-Hyuck
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.11a / pp.427-430 / 2006
  • Clustering of gene expression data without filtering out noise genes may be distorted or lead to inappropriate inferences. Identifying significant genes and deleting noise before the main analysis is necessary for meaningful discovery from gene expression patterns. We propose a new method of finding significant genes using factor analysis performed on the transposed data matrix. We construct a significance score that is the sum of factor loadings over the number of factors declared significant, and set a threshold through replication. Our proposed method works well for finding significant genes in simulated time-course data, even as the variance level gets larger.
Identifying differentially expressed genes using the Polya urn scheme

  • Saraiva, Erlandson Ferreira;Suzuki, Adriano Kamimura;Milan, Luis Aparecido
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.627-640 / 2017
  • A common interest in gene expression data analysis is to identify genes that present significant changes in expression levels among biological experimental conditions. In this paper, we develop a Bayesian approach to make a gene-by-gene comparison in the case with a control and more than one treatment experimental condition. The proposed approach is within a Bayesian framework with a Dirichlet process prior. The comparison procedure is based on a model selection procedure developed using the discreteness of the Dirichlet process and its representation via the Polya urn scheme. The posterior probabilities for the models considered are calculated using a Gibbs sampling algorithm. A numerical simulation study is conducted to understand and compare the performance of the proposed method in relation to the usual methods based on analysis of variance (ANOVA) followed by a Tukey test. The comparison among methods is made in terms of true positive rate and false discovery rate. We find that the proposed method outperforms the other methods based on ANOVA followed by a Tukey test. We also apply the methodologies to a publicly available data set on Plasmodium falciparum protein.
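The Polya urn representation mentioned in this abstract can be illustrated with a short simulation. This is a minimal sketch of the generic urn scheme for a Dirichlet process only, not the authors' model selection procedure or Gibbs sampler; the function name `polya_urn_draws`, the concentration parameter `alpha`, and the standard normal base distribution are assumptions chosen for illustration.

```python
import random

def polya_urn_draws(n, alpha, base_draw, rng):
    """Draw n values sequentially from a Dirichlet process via the Polya urn.

    The (i+1)-th draw is a fresh value from the base distribution with
    probability alpha / (alpha + i); otherwise it copies one of the i
    previous draws chosen uniformly at random.
    """
    draws = []
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            draws.append(base_draw(rng))     # new value from the base measure
        else:
            draws.append(rng.choice(draws))  # tie with an earlier draw
    return draws

rng = random.Random(0)
# Hypothetical setting: 10 draws, alpha = 1, standard normal base distribution.
draws = polya_urn_draws(10, 1.0, lambda r: r.gauss(0.0, 1.0), rng)
n_distinct = len(set(draws))
print(f"{n_distinct} distinct values among {len(draws)} draws")
```

Because earlier values are reused with positive probability, the draws contain exact ties. This discreteness is the property the abstract refers to: ties between the parameters of different experimental conditions correspond to models in which those conditions share the same expression level.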

Identification of Combined Biomarker for Predicting Alzheimer's Disease Using Machine Learning

  • Ki-Yeol Kim
    • Korean Journal of Biological Psychiatry / v.30 no.1 / pp.24-30 / 2023
  • Objectives: Alzheimer's disease (AD) is the most common form of dementia in older adults, damaging the brain and resulting in impaired memory, thinking, and behavior. The identification of differentially expressed genes and related pathways among affected brain regions can provide more information on the mechanisms of AD. The aim of our study was to identify differentially expressed genes associated with AD and combined biomarkers among them to improve AD risk prediction accuracy. Methods: Machine learning methods were used to compare the performance of the identified combined biomarkers. In this study, three publicly available gene expression datasets from the hippocampal brain region were used. Results: We detected 31 significant common genes from two different microarray datasets using the limma package. Some of them belonged to 11 biological pathways. Combined biomarkers were identified in two microarray datasets and were evaluated in a different dataset. The performance of the predictive models using the combined biomarkers was superior to that of models using a single gene. When two genes were combined, the most predictive gene set in the evaluation dataset was ATR and PRKCB when linear discriminant analysis was applied. Conclusions: Combined biomarkers showed good performance in predicting the risk of AD. The constructed predictive nomogram using combined biomarkers could easily be used by clinicians to identify high-risk individuals so that more efficient trials could be designed to reduce the incidence of AD.

Prediction of Lung Cancer Based on Serum Biomarkers by Gene Expression Programming Methods

  • Yu, Zhuang;Chen, Xiao-Zheng;Cui, Lian-Hua;Si, Hong-Zong;Lu, Hai-Jiao;Liu, Shi-Hai
    • Asian Pacific Journal of Cancer Prevention / v.15 no.21 / pp.9367-9373 / 2014
  • In the diagnosis of lung cancer, rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important. Serum markers, including lactate dehydrogenase (LDH), C-reactive protein (CRP), carcino-embryonic antigen (CEA), neurone specific enolase (NSE) and Cyfra21-1, are reported to reflect lung cancer characteristics. In this study, classification of lung tumors was made based on biomarkers (measured in 120 NSCLC and 60 SCLC patients) by setting up optimal biomarker joint models with a powerful computerized tool, gene expression programming (GEP). GEP is a learning algorithm that combines the advantages of genetic programming (GP) and genetic algorithms (GA). It specifically focuses on relationships between variables in sets of data and then builds models to explain these relationships, and has been successfully used in formula finding and function mining. As a basis for defining a GEP environment for SCLC and NSCLC prediction, three explicit predictive models were constructed. CEA and NSE are frequently used lung cancer markers in clinical trials, and CRP, LDH and Cyfra21-1 also have significant meaning in lung cancer; on the basis of CEA and NSE, we set up three GEP models: GEP1 (CEA, NSE, Cyfra21-1), GEP2 (CEA, NSE, LDH) and GEP3 (CEA, NSE, CRP). The best GEP classification result was obtained when CEA, NSE and Cyfra21-1 were combined: 128 of 135 subjects in the training set and 40 of 45 subjects in the test set were classified correctly, giving accuracy rates of 94.8% in the training set and 88.9% in the test set. With GEP2, the accuracy decreased significantly, by 1.5% and 6.6% in the training set and test set, respectively; with GEP3, the corresponding figures were 0.82% and 4.45%. Serum Cyfra21-1 is a useful and sensitive serum biomarker for discriminating between NSCLC and SCLC. GEP modeling is a promising and excellent tool in the diagnosis of lung cancer.
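The accuracy rates reported for the best model follow directly from the classification counts given in the abstract. A minimal check, where the helper name `accuracy_pct` is introduced here purely for illustration:

```python
def accuracy_pct(correct, total):
    """Classification accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)

# Counts taken from the abstract (GEP1: CEA, NSE, Cyfra21-1).
train_acc = accuracy_pct(128, 135)  # training set
test_acc = accuracy_pct(40, 45)     # test set
total_subjects = 135 + 45           # should match 120 NSCLC + 60 SCLC patients
print(train_acc, test_acc, total_subjects)
```

The training and test sets together account for all 180 patients (120 NSCLC and 60 SCLC) mentioned earlier in the abstract.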

14-bp Insertion/Deletion Polymorphism of the HLA-G gene in Breast Cancer among Women from North Western Iran

  • Haghi, Mehdi;Feizi, Mohammad Ali Hosseinpour;Sadeghizadeh, Majid;Lotfi, Abbas Sahebghadam
    • Asian Pacific Journal of Cancer Prevention / v.16 no.14 / pp.6155-6158 / 2015
  • Background: The human leukocyte antigen-G (HLA-G) gene is highly expressed in cancer pathologies and is one strategy used by tumor cells to escape immune surveillance. A 14-bp insertion/deletion (InDel) polymorphism of the HLA-G gene has been suggested to be associated with HLA-G mRNA stability and the expression of HLA-G. The aim of the present study was to assess any genetic association between this polymorphism and breast cancer among Iranian-Azeri women. Materials and Methods: In this study, 227 women affected with breast cancer and 255 age-, sex-, and ethnically matched healthy individuals as the control group participated. Genotyping was performed using polymerase chain reaction and electrophoresis assays. The data were compiled according to genotype and allele frequencies and compared using the Chi-square test. Statistical significance was set at P<0.05. Results: In this case-control study, no significant difference was found between the case and control groups at the allelic and genotype levels, although there was a slightly higher allele frequency of the HLA-G 14-bp deletion in the breast cancer group. However, when the stage I subgroup was compared with the stage II plus stage III subgroup of breast cancer patients, a significant difference was seen in the 14-bp deletion allele frequency. The stage II-III subgroup patients had a higher frequency of the deletion allele (57.4% vs 45.8%) than stage I cases (${\chi}^2=4.16$, p-value=0.041). Conclusions: Our data support a possible action of the HLA-G 14-bp InDel polymorphism as a potential genetic risk factor for progression of breast cancer. This finding highlights the necessity of future studies of this gene to establish the exact role of HLA-G in the progression steps of breast cancer.
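The reported p-value can be checked against the reported Chi-square statistic. The sketch below assumes one degree of freedom, as would apply to a 2x2 table of allele (deletion vs insertion) by stage group; the abstract does not state the degrees of freedom, so this is an assumption, and the helper name `chi2_sf_df1` is hypothetical.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    With 1 df the statistic is the square of a standard normal variable,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

# Statistic reported in the abstract for stage II-III vs stage I.
p_value = chi2_sf_df1(4.16)
print(round(p_value, 3))
```

Under that assumption the result agrees with the p-value of 0.041 given in the abstract and falls below the stated significance threshold of 0.05.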