• Title/Abstract/Keywords: cancer genome database

Search results: 46 items (processing time 0.024 s)

Application of Data Mining for Biomedical Data Processing

  • 손호선;김경옥;차은종;김경아
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 65, No. 7 / pp. 1236-1241 / 2016
  • Cancer has been the most frequent disease in Korea, and its pathogenesis and progression are known to occur through various causes and stages. Recently, research on chromosomal and genetic disorders, and on prognostic factors that predict their occurrence, recurrence, and progression, has been actively pursued. In this paper, we analyzed DNA methylation data downloaded from TCGA (The Cancer Genome Atlas), an open database, to study bladder cancer, which is the most frequent of the urinary system cancers. Using level 3 methylation data, the most heavily preprocessed level, 59 candidate CpG islands were extracted from 480,000 CpG islands, and the extracted CpG islands were then analyzed with data mining techniques. As a result, the CpG island cg12840719 was found to be significant, and Cox regression showed that this CpG island had a high relative risk compared with the other CpG islands. Classification analysis showed that the CpG islands highly correlated with bladder cancer are cg03146993, cg07323648, cg12840719, and cg14676825, with a classification accuracy of about 76%. We also found that the positive predictive value, the probability of predicting cancer in cases of cancer, was 72.4%. After verification of the candidate CpG islands from these results, this method could be used for diagnosing and treating cancer.
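
The accuracy and positive predictive value quoted in the abstract above are standard confusion-matrix metrics. The following is a minimal sketch of how they are computed, assuming hypothetical counts that are not the paper's data; the function name `confusion_metrics` is also an assumption made for illustration. Conventionally, positive predictive value is TP / (TP + FP), whereas TP / (TP + FN) is sensitivity, so both are shown.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return accuracy, positive predictive value, and sensitivity.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts must give non-zero denominators")
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "ppv": tp / (tp + fp),           # share of predicted cancers that are cancers
        "sensitivity": tp / (tp + fn),   # share of actual cancers that are detected
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

Reporting sensitivity next to the positive predictive value makes it clear which of the two conditional probabilities a figure refers to.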

A genomic and bioinformatic-based approach to identify genetic variants for liver cancer across multiple continents

  • Muhammad Ma'ruf;Lalu Muhammad Irham;Wirawan Adikusuma;Made Ary Sarasmita;Sabiah Khairi;Barkah Djaka Purwanto;Rockie Chong;Maulida Mazaya;Lalu Muhammad Harmain Siswanto
    • Genomics & Informatics / Vol. 21, No. 4 / pp. 48.1-48.8 / 2023
  • Liver cancer is the fourth leading cause of death worldwide. Well-known risk factors include hepatitis B virus and hepatitis C virus, along with exposure to aflatoxins, excessive alcohol consumption, obesity, and type 2 diabetes. Genomic variants play a crucial role in mediating the associations between these risk factors and liver cancer. However, the specific variants involved in this process remain under-explored. This study utilized a bioinformatics approach to identify genetic variants associated with liver cancer from various continents. Single-nucleotide polymorphisms associated with liver cancer were retrieved from the genome-wide association studies catalog. Prioritization was then performed using functional annotation with HaploReg v4.1 and the Ensembl database. The prevalence and allele frequencies of each variant were evaluated using Pearson correlation coefficients. Two variants, rs2294915 and rs2896019, encoded by the PNPLA3 gene, were found to be highly expressed in the liver tissue, as well as in the skin, cell-cultured fibroblasts, and adipose-subcutaneous tissue, all of which contribute to the risk of liver cancer. We further found that these two SNPs (rs2294915 and rs2896019) were positively correlated with the prevalence rate. Positive associations with the prevalence rate were more frequent in East Asian and African populations. We highlight the utility of this population-specific PNPLA3 genetic variant for genetic association studies and for the early prognosis and treatment of liver cancer. This study highlights the potential of integrating genomic databases with bioinformatic analysis to identify genetic variations involved in the pathogenesis of liver cancer. The genetic variants investigated in this study are likely to predispose to liver cancer and could affect its progression and aggressiveness. We recommend future research prioritizing the validation of these variations in clinical settings.
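
The abstract above evaluates variant allele frequencies against prevalence using Pearson correlation coefficients. As a minimal sketch of that calculation, assuming made-up per-population values (the lists `allele_freq` and `prevalence` and the helper `pearson_r` are hypothetical and not from the study):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical allele frequencies and prevalence rates, one pair per population.
allele_freq = [0.10, 0.20, 0.30, 0.40, 0.50]
prevalence = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(allele_freq, prevalence)
print(f"Pearson r = {r:.3f}")
```

A value of r close to +1 would correspond to the positive association between allele frequency and prevalence that the abstract describes.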

hpvPDB: An Online Proteome Reserve for Human Papillomavirus

  • Kumar, Satish;Jena, Lingaraja;Daf, Sangeeta;Mohod, Kanchan;Goyal, Peyush;Varma, Ashok K.
    • Genomics & Informatics / Vol. 11, No. 4 / pp. 289-291 / 2013
  • Human papillomavirus (HPV) infection is the leading cause of cancer mortality among women worldwide. A molecular understanding of HPV proteins is important for understanding how the virus invades the host and for designing novel protein vaccines and antiviral agents. Genomic, proteomic, structural, and disease-related information on HPV is available on the web, but it is sparsely annotated and not well suited to data analysis, host-pathogen interaction studies, strain-disease association, drug design, or sequence analysis. We therefore designed an online reserve with comprehensive information on HPV for such end users. The Human Papillomavirus Proteome Database (hpvPDB) houses proteomic and genomic information on 150 HPV strains sequenced to date. Its important features include easy expansion and retrieval of strain-specific data, provisions for sequence analysis and exploration of predicted structures, and easy access for curation and annotation through a range of search options on one platform. The rich information in this reserve could help researchers involved in structural virology, cancer research, drug discovery, and vaccine design.

Odorant receptors in cancer

  • Chung, Chan;Cho, Hee Jin;Lee, ChaeEun;Koo, JaeHyung
    • BMB Reports / Vol. 55, No. 2 / pp. 72-80 / 2022
  • Odorant receptors (ORs), the largest subfamily of G protein-coupled receptors, detect odorants in the nose. In addition, ORs were recently shown to be expressed in many nonolfactory tissues and cells, indicating that these receptors have physiological and pathophysiological roles beyond olfaction. Many ORs are expressed by tumor cells and tissues, suggesting that they may be associated with cancer progression or may be cancer biomarkers. This review describes OR expression in various types of cancer and the association of these receptors with various types of signaling mechanisms. In addition, the clinical relevance and significance of the levels of OR expression were evaluated. Namely, levels of OR expression in cancer were analyzed based on RNA-sequencing data reported in the Cancer Genome Atlas; OR expression patterns were visualized using t-distributed stochastic neighbor embedding (t-SNE); and the associations between patient survival and levels of OR expression were analyzed. These analyses of the relationships between patient survival and expression patterns obtained from an open mRNA database in cancer patients indicate that ORs may be cancer biomarkers and therapeutic targets.

Classification of Human Papillomavirus (HPV) Risk Type via Text Mining

  • Park, Seong-Bae;Hwang, Sohyun;Zhang, Byoung-Tak
    • Genomics & Informatics / Vol. 1, No. 2 / pp. 80-86 / 2003
  • Human papillomavirus (HPV) infection is known as the main factor in cervical cancer, which is a leading cause of cancer deaths in women worldwide. Because there are more than 100 types of HPV, it is critical to discriminate the HPVs related to cervical cancer from those that are not. In this paper, the risk type of HPVs is classified using their textual explanations. The important issue in this problem is to distinguish false negatives from false positives. That is, we must find as many high-risk HPVs as possible, even though we may miss some low-risk HPVs. For this purpose, AdaCost, a cost-sensitive learner, is adopted to account for different costs among training examples. The experimental results on the HPV sequence database show that considering costs gives higher performance. The improvement in F-score is higher than that in accuracy, which implies that the number of high-risk HPVs found is increased.
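
The abstract above reasons that a larger gain in F-score than in accuracy indicates more high-risk HPVs being found. The sketch below illustrates that reasoning with two hypothetical confusion matrices (the counts and the names `baseline` and `cost_sensitive` are assumptions for illustration, not results from the paper, and no AdaCost training is performed here).

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts; "positive" means a high-risk HPV type.
baseline = {"tp": 10, "fp": 2, "fn": 10, "tn": 78}
cost_sensitive = {"tp": 16, "fp": 8, "fn": 4, "tn": 72}

for name, cm in (("baseline", baseline), ("cost-sensitive", cost_sensitive)):
    acc = accuracy(**cm)
    f1 = f1_score(cm["tp"], cm["fp"], cm["fn"])
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

In this made-up case the accuracy is unchanged while F1 rises, because the second classifier trades a few extra false positives for fewer missed high-risk types.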

IdMapper: A Java Application for ID Mapping across Multiple Cross-referencing Providers

  • Lee, Hoo-Keun;Kim, Hyeon-Jin;Yu, Ung-Sik
    • Genomics & Informatics / Vol. 7, No. 4 / pp. 208-211 / 2009
  • We developed an identifier mapping application for bioinformatics research in Java programming language. It is easy to use and provides many usability functionalities that are expected as essentials for a professional application. It supports three widely used mapping services and can convert many ids from one source database into many target databases at once. Id mapping across service providers is possible by remapping the resultant ids. Because it adheres to the NetBeans platform architecture, it can be incorporated into other NetBeans platform applications as an id mapping provider without adaption or modification.

Genomic DNA Chip: Genome-wide profiling in Cancer

  • 이종호
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2nd Bioinformatics Workshop, 2001 (DNA Chip Bioinformatics) / pp. 61-86 / 2001
  • All cancers are caused by abnormalities in DNA sequence. Throughout life, the DNA in human cells is exposed to mutagens and suffers mistakes in replication, resulting in progressive, subtle changes in the DNA sequence in each cell. Since conventional and molecular cytogenetic methods were first applied to the analysis of chromosomal aberrations in cancers, more than 1,800 recurring chromosomal breakpoints have been identified. These breakpoints and regions of nonrandom copy number changes typically point to the location of genes involved in cancer initiation and progression. With the introduction of molecular cytogenetic methodologies based on fluorescence in situ hybridization (FISH), namely comparative genomic hybridization (CGH) and multicolor FISH (m-FISH), chromosomal aberrations in carcinomas have become susceptible to analysis. Conventional CGH has been widely applied to detect genomic imbalances in tumor cells, using normal metaphase chromosomes as targets for mapping copy number changes. However, this limits the mapping of such imbalances to the resolution of metaphase chromosomes (usually 10 to 20 Mb). Efforts to increase this resolution have led to the "new" concept of the genomic DNA chip (1 to 2 Mb), in which the chromosomal target is replaced with cloned DNA immobilized on supports such as glass slides. The resulting resolution then depends on the size of the immobilized DNA fragments. We have completed the first draft of the Korean Genome Project. The project proceeded by end-sequencing inserts from a library of 96,768 bacterial artificial chromosomes (BACs) containing genomic DNA fragments from individuals of Korean ethnicity. The sequenced BAC ends were then compared to the Human Genome Project's publicly available sequence database and aligned according to known cancer gene sequences. These BAC clones were biotinylated by nick translation, hybridized to cytogenetic preparations of metaphase cells, and detected with fluorescein-conjugated avidin. Only the locations of unique or low-copy portions of each clone are identified, because high-copy interspersed repetitive sequences in the probe were suppressed by the addition of unlabelled Cot-1 DNA. Banding patterns were produced using DAPI. By this means, every BAC fragment has been matched to its appropriate chromosomal location. We have placed 86 cytogenetically defined landmarks (156 BAC clones) to help with the characterization of known cancer genes. Microarray techniques can be applied to CGH by replacing metaphase chromosomes with arrayed BACs confirmed to contain oncogenes and tumor suppressor genes, and an array of BAC clones from the collection is used to perform a genome-wide scan for segmental aneuploidy by array-CGH. The genomic DNA chip (arrayed BACs) will therefore undoubtedly provide accurate diagnosis of deletions, duplications, insertions, and rearrangements of genomic material related to various human phenotypes, including neoplasias. Tumor markers based on the genetic abnormalities of cancer could then be identified and contribute to screening for the stage of cancers and/or hereditary diseases.


StrokeBase: A Database of Cerebrovascular Disease-related Candidate Genes

  • Kim, Young-Uk;Kim, Il-Hyun;Bang, Ok-Sun;Kim, Young-Joo
    • Genomics & Informatics / Vol. 6, No. 3 / pp. 153-156 / 2008
  • Complex diseases such as stroke and cancer have two or more genetic loci and are affected by environmental factors that contribute to the diseases. Due to the complex characteristics of these diseases, identifying candidate genes requires a system-level analysis of the following: gene ontology, pathway, and interactions. A database and user interface, termed StrokeBase, was developed; StrokeBase provides queries that search for pathways, candidate genes, candidate SNPs, and gene networks. The database was developed by using in silico data mining of HGNC, ENSEMBL, STRING, RefSeq, UCSC, GO, HPRD, KEGG, GAD, and OMIM. Forty candidate genes that are associated with cerebrovascular disease were selected by human experts and public databases. Networked cerebrovascular disease gene maps were also developed; these maps describe gene-gene interactions and biological pathways. We identified 1127 genes, related indirectly to cerebrovascular disease but directly to the etiology of cerebrovascular disease. We found that a protein-protein interaction (PPI) network that was associated with cerebrovascular disease follows the power-law degree distribution that is evident in other biological networks. In addition to in silico data mining, 250K Affymetrix SNP chips were used in the 320 control/disease association study to generate associated markers pertinent to cerebrovascular disease as a genome-wide search. The associated genes and the genes that were retrieved from the in silico data mining system were compared and analyzed. We developed a well-curated cerebrovascular disease-associated gene network and provided bioinformatic resources to cerebrovascular disease researchers. This cerebrovascular disease network can be used as a framework for systematic genomic research, applicable to other complex diseases. Therefore, the ongoing database efficiently supports medical and genetic research in order to overcome cerebrovascular disease.

Profile of Gene Expression Changes Treated with Compound K Induced Cell Cycle Arrest and Cell Death of Prostate Cancer PC-3 Cell Line

  • 김광연;박광일;안순철
    • Herbal Formula Science / Vol. 29, No. 4 / pp. 267-275 / 2021
  • Objectives: Previously, we reported that compound K, isolated from ginseng fermented by Aspergillus oryzae, has wide biochemical and pharmacological effects, including anti-cancer activity in prostate cancer PC-3 cells. Despite these findings, its signaling pathway and gene expression pattern are not clearly understood. Methods: To examine gene expression in PC-3 cells treated with compound K, a cDNA microarray chip composed of 44K human cDNA probes was used. MTT assay, western blot analysis, propidium iodide staining, and annexin V/propidium iodide staining were performed. Results: We confirmed differences in gene expression profiles. We then analyzed genes related to cell cycle arrest, cell death, and cell proliferation using the DAVID database. Conclusions: Our findings should be useful for understanding genome-wide expression patterns of compound K-mediated cell cycle arrest leading to the induction of cell death, and helpful for finding future therapeutic targets for prostate cancer cells.

Integrated bioinformatics analysis of validated and circulating miRNAs in ovarian cancer

  • Dogan, Berkcan;Gumusoglu, Ece;Ulgen, Ege;Sezerman, Osman Ugur;Gunel, Tuba
    • Genomics & Informatics / Vol. 20, No. 2 / pp. 20.1-20.13 / 2022
  • Recent studies have focused on the early detection of ovarian cancer (OC) using tumor materials by liquid biopsy. The mechanisms by which microRNAs (miRNAs) affect OC and signaling pathways are still unknown. This study aims to reliably perform functional analysis of previously validated circulating miRNAs' target genes by using pathfindR. In addition, overall survival and pathological stage analyses were performed with the miRNAs' target genes that are common to The Cancer Genome Atlas and GTEx datasets. Our previous studies have validated three downregulated miRNAs (hsa-miR-885-5p, hsa-miR-1909-5p, and hsa-let7d-3p) having a diagnostic value in OC patients' sera, with high-throughput techniques. The predicted target genes of these miRNAs were retrieved from the miRDB database (v6.0). Active-subnetwork-oriented Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was conducted by pathfindR using the target genes. Enrichment of KEGG pathways assessed by pathfindR indicated that 24 pathways were related to the target genes. Ubiquitin-mediated proteolysis, spliceosome, and Notch signaling pathway were the top three pathways with the lowest p-values (p < 0.001). Ninety-three common genes were found to be differentially expressed (p < 0.05) in the datasets. No genes were found to be significant in the overall survival analysis, but 24 genes were found to be significant in the pathological stage analysis (p < 0.05). The findings of our study provide in-silico evidence that validated circulating miRNAs' target genes and enriched pathways are related to OC and have potential roles in theranostics applications. Further experimental investigations are required to validate our results, which will ultimately provide a new perspective for translational applications in OC management.