• Title/Summary/Keyword: microarray analysis

Basic Concept of Gene Microarray

  • Hwang, Seung Yong
    • Korean Journal of Biological Psychiatry / v.8 no.2 / pp.203-207 / 2001
  • The genome sequencing project has generated, and will continue to generate, enormous amounts of sequence data, including 5 eukaryotic and about 60 prokaryotic genomes. Given these ever-increasing amounts of sequence information, new strategies are necessary to efficiently pursue the next phase of the genome project: the elucidation of gene expression patterns and gene product function on a whole-genome scale. In order to assign functional information to the genome sequence, DNA chip (or gene microarray) technology was developed to efficiently identify the differential expression patterns of independent biological samples. The DNA chip provides a new tool for genome expression analysis that may revolutionize many aspects of biotechnology, including new drug discovery and disease diagnostics.

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1151-1160 / 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction and has been shown in several studies to be useful for classifying microarray data. In this study, we suggest a modified version of partial least squares regression to analyze gene expression data with survival information. The method is designed as a new gene selection method using PLSR with an iterative procedure for imputing censored survival times. The mean squared error of prediction criterion is used to determine the dimension of the model. To visualize the data, plots of variables superimposed with samples are used. The method is applied to two microarray data sets, both containing survival times. The results show that the proposed method works well for interpreting gene expression microarray data.
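
The role of the mean squared error of prediction in choosing the model dimension can be illustrated with a minimal sketch. It assumes a single uncensored response and plain PLS1 with leave-one-out cross-validation; the censored-time imputation and gene selection steps of the abstract are not reproduced. All function names and the noise-free toy data (6 samples, 8 genes, two latent factors) are illustrative assumptions.

```python
import math

def center(rows, y):
    """Return column means, response mean, and centered copies of X and y."""
    n, p = len(rows), len(rows[0])
    x_mean = [sum(r[j] for r in rows) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[r[j] - x_mean[j] for j in range(p)] for r in rows]
    yc = [v - y_mean for v in y]
    return x_mean, y_mean, xc, yc

def fit_pls1(rows, y, n_comp):
    """Fit single-response PLS by extracting components and deflating X and y."""
    x_mean, y_mean, x, r = center(rows, y)
    n, p = len(x), len(x[0])
    comps = []
    for _ in range(n_comp):
        # Weight vector: direction in gene space with maximal covariance with y.
        w = [sum(x[i][j] * r[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain, stop early
            break
        w = [v / norm for v in w]
        t = [sum(x[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(x[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(r[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains from X and y.
        x = [[x[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        r = [r[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def predict(model, row):
    """Predict the response for one new sample, replaying the deflation steps."""
    x_mean, y_mean, comps = model
    x = [v - m for v, m in zip(row, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(x, w))
        y_hat += q * t
        x = [a - t * b for a, b in zip(x, load)]
    return y_hat

def loo_msep(rows, y, n_comp):
    """Leave-one-out mean squared error of prediction for a given dimension."""
    errs = []
    for i in range(len(rows)):
        model = fit_pls1(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:], n_comp)
        errs.append((y[i] - predict(model, rows[i])) ** 2)
    return sum(errs) / len(errs)

# Toy data: expression driven by two latent factors (more genes than samples).
z1 = [1, 2, 3, 4, 5, 6]
z2 = [1, -1, 2, -2, 0, 3]
a = [1.0, 0.5, 2.0, -1.0, 0.3, 1.5, -0.7, 0.9]
b = [0.2, 1.0, -1.0, 0.5, 2.0, -0.4, 1.1, -0.6]
X = [[u * aj + v * bj for aj, bj in zip(a, b)] for u, v in zip(z1, z2)]
y = [3 * u - 2 * v for u, v in zip(z1, z2)]

msep = {k: loo_msep(X, y, k) for k in (1, 2, 3)}
best_k = min(msep, key=msep.get)
print(msep, best_k)
```

Because the toy data have exactly two latent factors and no noise, the cross-validated error drops to numerical zero at two components; on real data the curve is noisier and the minimum is less clear-cut.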

Microarray Data Analysis of Perturbed Pathways in Breast Cancer Tissues

  • Kim, Chang-Sik;Choi, Ji-Won;Yoon, Suk-Joon
    • Genomics & Informatics / v.6 no.4 / pp.210-222 / 2008
  • Due to the polygenic nature of cancer, it is believed that breast cancer is caused by the perturbation of multiple genes and their complex interactions, which contribute to the wide range of disease phenotypes. A systems biology approach for the identification of subnetworks of interconnected genes as functional modules is required to understand the complex nature of diseases such as breast cancer. In this study, we apply a 3-step strategy for the interpretation of microarray data, focusing on identifying significantly perturbed metabolic pathways rather than analyzing large numbers of overexpressed and underexpressed individual genes. The selected pathways are considered to be dysregulated functional modules that putatively contribute to the progression of disease. The subnetwork of protein-protein interactions for these dysregulated pathways is constructed for further detailed analysis. We evaluated the method by analyzing microarray datasets of breast cancer tissues, i.e., normal and invasive breast cancer tissues. Using this strategy of microarray analysis, we selected several significantly perturbed pathways that are implicated in the regulation of breast cancer progression, including the extracellular matrix-receptor interaction pathway and the focal adhesion pathway. Moreover, these selected pathways include several known breast cancer-related genes. It is concluded from this study that the present strategy is capable of selecting interesting perturbed pathways that putatively play a role in the progression of breast cancer and provides improved interpretability of networks of protein-protein interactions.
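
The abstract does not spell out how pathway perturbation is scored. As a rough illustration of pathway-level (rather than gene-level) analysis, the sketch below uses a hypergeometric over-representation test, a common substitute technique and not necessarily the authors' 3-step strategy. Gene identifiers, pathway names, and function names are hypothetical.

```python
from math import comb

def enrichment_p(hits, universe, pathway_size, selected):
    """P(X >= hits) when `selected` genes are drawn from `universe` genes,
    of which `pathway_size` belong to the pathway (hypergeometric tail)."""
    upper = min(pathway_size, selected)
    tail = sum(
        comb(pathway_size, i) * comb(universe - pathway_size, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(universe, selected)

def rank_pathways(pathways, de_genes, universe):
    """Rank pathways by over-representation of differentially expressed genes."""
    universe = set(universe)
    de = set(de_genes) & universe
    results = []
    for name, members in pathways.items():
        members = set(members) & universe
        hits = len(members & de)
        p = enrichment_p(hits, len(universe), len(members), len(de))
        results.append((p, name, hits))
    return sorted(results)

# Hypothetical toy input: 20 measured genes, two pathways, five DE genes.
universe = [f"g{i}" for i in range(20)]
pathways = {
    "pathway_A": ["g0", "g1", "g2", "g3", "g4"],
    "pathway_B": ["g5", "g6", "g7", "g8", "g9"],
}
de_genes = ["g0", "g1", "g2", "g3", "g10"]

ranked = rank_pathways(pathways, de_genes, universe)
for p, name, hits in ranked:
    print(f"{name}: hits={hits}, p={p:.4g}")
```

In practice the p-values would also need a multiple-testing correction across pathways, which is omitted here.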

Improving data reliability on oligonucleotide microarray

  • Yoon, Yeo-In;Lee, Young-Hak;Park, Jin-Hyun
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.107-116 / 2004
  • The advent of microarray technologies makes it possible to monitor the expression of tens of thousands of genes simultaneously. Such microarray data can be degraded by experimental errors and image artifacts, which generate non-negligible outliers, estimated at 15% of typical microarray data. Thus, it is important to detect and correct these faulty probes prior to high-level data analysis such as classification or clustering. In this paper, we propose a systematic procedure, based on multivariate statistical approaches, for detecting and correcting faulty probes in GeneChip arrays. Principal component analysis (PCA), one of the most widely used multivariate statistical approaches, is applied to construct a statistical correlation model with 20 pairs of probes for each gene. Faulty probes are then identified by inspecting the squared prediction error (SPE) of each probe from the PCA model, and the outlying probes are reconstructed by an iterative optimization approach that minimizes the SPE. We used public data from a GeneChip project on human fibroblast cells. In this application study, the proposed approach showed good performance in correcting probes without removing the faulty ones, which is desirable from the viewpoint of making maximum use of the data.
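
The SPE-based detection step can be sketched as follows. This is a simplified illustration, assuming a one-component PCA model fitted by power iteration and a toy probe set of 6 probes across 5 arrays (the paper uses 20 probe pairs per gene); the iterative reconstruction of outlying probes is not shown. Function names and data are assumptions.

```python
import math

def center_columns(rows):
    """Subtract each probe's mean across arrays."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def first_loading(x, iters=200):
    """Leading PCA loading vector of centered data, via power iteration."""
    n, p = len(x), len(x[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        t = [sum(x[i][j] * v[j] for j in range(p)) for i in range(n)]
        u = [sum(x[i][j] * t[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in u))
        v = [c / norm for c in u]
    return v

def probe_spe(rows):
    """Squared prediction error of each probe under a one-component PCA model."""
    x = center_columns(rows)
    v = first_loading(x)
    n, p = len(x), len(x[0])
    spe = [0.0] * p
    for i in range(n):
        t = sum(x[i][j] * v[j] for j in range(p))  # score of array i
        for j in range(p):
            spe[j] += (x[i][j] - t * v[j]) ** 2    # residual after reconstruction
    return spe

# Toy probe set: rows are arrays, columns are probes of one gene.
signal = [1, 2, 3, 4, 5]                    # expression level per array
affinity = [1.0, 1.2, 0.8, 1.0, 1.1, 0.9]   # probe-specific response
arrays = [[s * g for g in affinity] for s in signal]
arrays[2][3] += 4.0                         # simulated artifact on probe 3

spe = probe_spe(arrays)
suspect = max(range(len(spe)), key=spe.__getitem__)
print([round(v, 3) for v in spe], "suspect probe:", suspect)
```

A real procedure would compare each SPE against a control limit rather than simply taking the maximum.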

Performance of the Agilent Microarray Platform for One-color Analysis of Gene Expression

  • Song Sunny;Lucas Anne;D'Andrade Petula;Visitacion Marc;Tangvoranuntakul Pam;FulmerSmentek Stephanie
    • Proceedings of the Korean Society for Bioinformatics Conference / 2006.02a / pp.78-78 / 2006
  • Gene expression analysis can be performed by one-color (intensity-based) or two-color (ratio-based) microarray platforms depending on the specific applications and needs of the researcher. The traditional two-color approach is well founded from a historical and scientific standpoint, and the one-color approach, when paired with high quality microarrays and a robust workflow, offers additional flexibility in experimental design. Two of the major requirements of any microarray platform are system reproducibility, which provides the means for high confidence experiments and accurate comparison across multiple samples; and high sensitivity, for the detection of significant gene expression changes, including small fold changes across multiple gene sets. Each of these requirements is fulfilled by the Agilent One-color Gene Expression Platform as illustrated by the data included in this study. As a result, researchers have the ability to take advantage of the enhanced performance and sensitivity of Agilent's 60-mer oligonucleotide microarrays, and experience the first commercial microarray platform compatible with both one- and two-color detection.

An Iterative Normalization Algorithm for cDNA Microarray Medical Data Analysis

  • Kim, Yoonhee;Park, Woong-Yang;Kim, Ho
    • Genomics & Informatics / v.2 no.2 / pp.92-98 / 2004
  • A cDNA microarray experiment is one of the most useful high-throughput experiments in medical informatics for monitoring gene expression levels. Statistical analysis of cDNA microarray medical data requires a normalization procedure to reduce the systematic errors that cannot be controlled by the experimental conditions. Despite the variety of existing normalization methods, this paper suggests a more general and synthetic normalization algorithm that uses a control gene set based on previous studies of normalization. An iterative normalization method was used to select and include a new control gene set from among all genes at every step of the normalization calculation, starting with the housekeeping genes. The objective of this iterative normalization was to maintain the pattern of the original data and to keep the gene expression levels stable. Spatial plots, M&A (ratio and average values of the intensity) plots, and box plots showed graphically that the mean across all genes converged to zero after applying our iterative normalization. The practicability of the algorithm was demonstrated by applying our method to data from a human photoaging study.
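
The idea of starting from housekeeping genes and re-selecting the control set at each step can be sketched as below. This is a deliberately simplified illustration, not the published algorithm: it assumes a global shift of the log-ratio M, and a hypothetical rule that genes with |M| below a tolerance `tol` join the next control set. All names and intensities are made up.

```python
import math

def ma_values(red, green):
    """M = log2 ratio and A = average log2 intensity for two-channel data."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a

def iterative_normalize(m, housekeeping, tol=0.5, max_iter=20):
    """Shift M by the control-set mean, then re-select the control set."""
    control = set(housekeeping)
    m = list(m)
    for _ in range(max_iter):
        shift = sum(m[i] for i in control) / len(control)
        m = [v - shift for v in m]
        # Assumed selection rule: genes that look unchanged become controls.
        new_control = {i for i, v in enumerate(m) if abs(v) < tol}
        if not new_control or new_control == control:
            break
        control = new_control
    return m, control

# Toy slide: a two-fold dye bias on every gene, two truly changed genes.
red = [200, 400, 800, 1600, 3200, 100]
green = [100, 200, 400, 800, 400, 200]
m_raw, a_raw = ma_values(red, green)
m_norm, control = iterative_normalize(m_raw, housekeeping=[0, 1])
print(m_raw, m_norm, sorted(control))
```

On this toy slide the control set grows from the two housekeeping genes to the four unchanged genes, and the two changed genes keep their log-ratios relative to that baseline.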

Microarray Analysis of Extracranial Arteriovenous Malformation Endothelial Cells

  • Lee, Joon Seok;Oh, Eun Jung;Kim, Hyun Mi;Kwak, Suin;Lee, Seok-Jong;Lee, Jongmin;Huh, Seung;Kim, Ji Yoon;Chung, Ho Yun
    • Journal of Interdisciplinary Genomics / v.4 no.2 / pp.31-34 / 2022
  • Background: Arteriovenous malformations (AVMs) are rare diseases comprising abnormally dilated arteries and veins with an absence of a capillary network. Since these diseases are intractable after diagnosis, various treatment strategies have been examined, with continuous efforts to identify target genes. Here, we report relevant new target genes selected via gene microarray. Methods: Endothelial cells were isolated from samples collected from three patients with AVM and three healthy individuals, followed by microarray analysis. Additionally, quantitative PCR was performed to select genes highly relevant to AVM. Results: In the vascular endothelial cells derived from the tissues of patients with AVM, the expression of ANGPT1, ANGPT2, DLL4, IL6, NRG1, TGFBR1, and VEGFA was typically higher than in those derived from normal tissues. Conclusion: Seven candidate genes were selected to analyze the pathophysiological mechanism of AVM. These results may aid in future directions of diagnosis and treatment.

Biological Pathway Extension Using Microarray Gene Expression Data

  • Chung, Tae-Su;Kim, Ji-Hun;Kim, Kee-Won;Kim, Ju-Han
    • Genomics & Informatics / v.6 no.4 / pp.202-209 / 2008
  • Biological pathways are collections of knowledge about particular biological processes. Although knowledge about a pathway is quite significant for further analysis, it covers only a tiny portion of the genes that exist. In this paper, we suggest a model to extend each individual pathway using microarray expression data, based on existing knowledge about the pathway. We use the Rosetta compendium dataset to extend pathways of Saccharomyces cerevisiae obtained from the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. Before applying our model, we verify the underlying assumption that microarray data reflect the interaction knowledge in a pathway, and we evaluate our scoring system by introducing a performance function. In the last step, we validate the proposed candidates with the help of another type of biological information. In summary, we introduce a pathway-extending model that uses the pathway's intrinsic structure and microarray expression data. The model provides suitable candidate genes for extending each single biological pathway.
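
The abstract does not define its scoring system. One plausible, simplified way to score candidate genes against a known pathway is co-expression: the sketch below ranks candidates by their mean absolute Pearson correlation with the pathway's member genes. This scoring rule, the gene names, and the expression profiles are all assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def score_candidates(expr, pathway, candidates):
    """Rank candidates by mean |correlation| with known pathway members."""
    scores = []
    for gene in candidates:
        corr = [abs(pearson(expr[gene], expr[m])) for m in pathway]
        scores.append((sum(corr) / len(corr), gene))
    return sorted(scores, reverse=True)

# Hypothetical expression profiles across five conditions.
expr = {
    "member_a": [1, 2, 3, 4, 5],
    "member_b": [2, 4, 6, 8, 10],
    "cand_x": [10, 20, 30, 40, 50],  # tracks the pathway members closely
    "cand_y": [3, 1, 4, 1, 5],       # only weakly related
}
ranked = score_candidates(expr, ["member_a", "member_b"], ["cand_x", "cand_y"])
for score, gene in ranked:
    print(f"{gene}: {score:.3f}")
```

A threshold on the score, or a comparison against scores of randomly chosen genes, would then decide which candidates are proposed as pathway extensions.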

Standard-based Integration of Heterogeneous Large-scale DNA Microarray Data for Improving Reusability

  • Jung, Yong;Seo, Hwa-Jeong;Park, Yu-Rang;Kim, Ji-Hun;Bien, Sang Jay;Kim, Ju-Han
    • Genomics & Informatics / v.9 no.1 / pp.19-27 / 2011
  • Gene Expression Omnibus (GEO) holds the largest collection of gene-expression microarray data, which has grown exponentially. Microarray data in GEO have been generated in many different formats and often lack standardized annotation and documentation. It is hard to know whether preprocessing has been applied to a dataset and, if so, in what way. Standard-based integration of heterogeneous data formats and metadata is necessary for comprehensive data query, analysis, and mining. We attempted to integrate the heterogeneous microarray data in GEO based on the Minimum Information About a Microarray Experiment (MIAME) standard. We unified the data fields of the GEO Data table and mapped the attributes of GEO metadata onto MIAME elements. We also distinguished non-preprocessed raw datasets from processed and other datasets by using a two-step classification method. Most of the procedures were developed as semi-automated algorithms with some degree of text mining. We localized 2,967 Platforms, 4,867 Series, and 103,590 Samples covering 279 organisms, integrated them into a standard-based relational schema, and developed a comprehensive query interface for extracting them. Our tool, GEOQuest, is available at http://www.snubi.org/software/GEOQuest/.
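
The attribute-to-element mapping can be pictured with a small lookup table. The sketch below is purely illustrative: the field names are written in the style of GEO SOFT attributes and should be checked against the GEO documentation, and the assignment to MIAME categories is an assumption, not the mapping used by GEOQuest. The record values are placeholders.

```python
# Assumed mapping from GEO-style metadata fields to MIAME element categories.
FIELD_TO_MIAME = {
    "Sample_characteristics_ch1": "sample annotation",
    "Sample_organism_ch1": "sample annotation",
    "Series_overall_design": "experimental design",
    "Sample_platform_id": "array annotation",
    "Sample_extract_protocol_ch1": "protocols",
    "Sample_hyb_protocol": "protocols",
    "Sample_data_processing": "protocols",
}

def to_miame(record):
    """Group a flat metadata record by MIAME element; keep unmapped fields apart."""
    grouped, unmapped = {}, {}
    for field, value in record.items():
        element = FIELD_TO_MIAME.get(field)
        if element is None:
            unmapped[field] = value
        else:
            grouped.setdefault(element, {})[field] = value
    return grouped, unmapped

# Placeholder record; values are invented for the example.
record = {
    "Sample_organism_ch1": "Homo sapiens",
    "Sample_platform_id": "GPL_PLACEHOLDER",
    "Sample_data_processing": "free-text description of preprocessing",
    "Sample_contact_name": "not covered by the mapping",
}
grouped, unmapped = to_miame(record)
print(grouped)
print(unmapped)
```

Keeping unmapped fields in a separate dictionary makes gaps in the mapping visible, which matters when the source metadata are as heterogeneous as the abstract describes.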