• Title/Abstract/Keywords: Microarray Data

Comparison of Normalizations for cDNA Microarray Data

  • Kim, Yun-Hui;Kim, Ho;Park, Ung-Yang;Seo, Jin-Yeong;Jeong, Jin-Ho
    • Proceedings of the Korean Statistical Society Conference / 2002.05a / pp.175-181 / 2002
  • cDNA microarray experiments allow us to measure the expression levels of thousands of genes simultaneously and make it easy to compare gene expression between populations. However, the results must be interpreted with caution because of unexpected sources of variation, such as systematic errors from the microarrayer and differences in cDNA dye intensity. In addition, the scanner reports both the mean and the median of the signal and background pixels, so the analyst must choose which raw data to use. In this paper, using arm skin cells from old and young males, we compare the results obtained from the mean and from the median of the raw data, and we compare normalization methods for reducing the systematic errors. The median is preferable to the mean because the distribution of the test statistic (t-statistic) computed from the median is closer to a normal distribution than that computed from the mean. Scaled print-tip normalization performs better than global or lowess normalization, judged by the distribution of the test statistic.
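As a rough illustration of what normalizing two-channel log ratios involves, the sketch below centers M = log2(R/G) on the array-wide median (a simple form of global normalization) and then on the median of each print-tip group. It is a simplified stand-in, not the lowess or scaled print-tip procedures compared in the paper; the function names and the toy intensities are assumptions made for this example.

```python
import math
from statistics import median

def log_ratios(red, green):
    """Return M = log2(R/G) for each spot."""
    return [math.log2(r / g) for r, g in zip(red, green)]

def global_median_normalize(m_values):
    """Center all log ratios so that the array-wide median is zero."""
    center = median(m_values)
    return [m - center for m in m_values]

def print_tip_median_normalize(m_values, tip_ids):
    """Center log ratios separately within each print-tip group.

    Simplification: a real print-tip method fits an intensity-dependent
    curve per group; here only the group median is removed.
    """
    groups = {}
    for m, tip in zip(m_values, tip_ids):
        groups.setdefault(tip, []).append(m)
    centers = {tip: median(vals) for tip, vals in groups.items()}
    return [m - centers[tip] for m, tip in zip(m_values, tip_ids)]

# Hypothetical toy data: six spots printed by two print tips.
red = [200.0, 400.0, 800.0, 100.0, 100.0, 400.0]
green = [100.0, 100.0, 100.0, 100.0, 50.0, 100.0]
tips = [0, 0, 0, 1, 1, 1]

m = log_ratios(red, green)
global_m = global_median_normalize(m)
tip_m = print_tip_median_normalize(m, tips)
print(global_m)
print(tip_m)
```

Centering within print-tip groups removes an offset that differs between tips, which a single global shift cannot do.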

Deducing Isoform Abundance from Exon Junction Microarray

  • Kim Po-Ra;Oh S.-June;Lee Sang-Hyuk
    • Genomics & Informatics / v.4 no.1 / pp.33-39 / 2006
  • Alternative splicing (AS) is an important mechanism for producing transcriptome diversity, and microarray techniques are increasingly used to monitor splice variants. Three types of microarrays interrogate AS events: junction, exon, and tiling arrays. Junction probes have the advantage of monitoring the splice site directly. Johnson et al. performed a genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays (Science 302:2141-2144, 2003), which monitored splicing at every known exon-exon junction for more than 10,000 multi-exon human genes in 52 tissues and cell lines. Here, we describe an algorithm to deduce the relative concentration of isoforms from the junction array data. Non-negative Matrix Factorization (NMF) is applied to obtain the transcript structure inferred from the expression data. We then choose the transcript models consistent with the ECgene model of alternative splicing, which is based on mRNA and EST alignment. The probe-transcript matrix is constructed using the NMF-consistent ECgene transcripts, and the isoform abundance is deduced from a non-negative least squares (NNLS) fit to the experimental data. Our method can be easily extended to other types of microarrays with exon or junction probes.
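The final NNLS step can be illustrated with a small sketch: given a probe-transcript matrix and observed probe intensities, find non-negative isoform abundances that best reproduce the intensities. The solver below is a plain projected gradient descent written for this example, not the authors' implementation, and the NMF and ECgene filtering steps are omitted. The matrix and intensities are hypothetical toy values.

```python
def nnls_projected_gradient(A, b, iterations=2000):
    """Minimize ||A x - b||^2 subject to x >= 0 by projected gradient descent.

    A must contain at least one non-zero entry.
    """
    n_rows, n_cols = len(A), len(A[0])
    # Safe step size: trace(A^T A) bounds the largest eigenvalue of A^T A.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_cols
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        gradient = [sum(A[i][j] * residual[i] for i in range(n_rows))
                    for j in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(n_cols)]
    return x

# Hypothetical gene with two isoforms and three junction probes:
# probe 0 is present in both isoforms, probe 1 only in the first,
# probe 2 only in the second.
probe_transcript = [[1.0, 1.0],
                    [1.0, 0.0],
                    [0.0, 1.0]]
intensities = [3.0, 2.0, 1.0]

abundance = nnls_projected_gradient(probe_transcript, intensities)
total = sum(abundance)
relative = [a / total for a in abundance]
print(abundance, relative)
```

For real data a tested solver such as an active-set NNLS routine would be the more usual choice; the loop above is only meant to make the constraint visible.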

Identification of Differentially Expressed Genes Using Tests Based on Multiple Imputations

  • Kim, Sang Cheol;Yu, Donghyeon
    • Quantitative Bio-Science / v.36 no.1 / pp.23-31 / 2017
  • Datasets from DNA microarray experiments, which take the form of large matrices of gene expression levels, often have missing values. However, existing statistical methods, including principal component analysis (PCA) and Hotelling's t-test, are not directly applicable to datasets with missing values because they generally assume that the observed dataset is complete. Many methods have been proposed in the literature to impute the missing values in the observed data. Troyanskaya et al. [1] study k-nearest neighbor (kNN) imputation, Kim et al. [2] propose the local least squares (LLS) method, and Rubin [3] proposes multiple imputation (MI) for missing values. To identify differentially expressed genes, we propose a new testing procedure for observed data with missing values. The proposed procedure uses Stouffer's z-scores and combines the test results of the individual imputed samples, which are dependent on each other. We show numerically, by comparing ROC curves, that the proposed test procedure based on MI performs better than existing test procedures based on single imputation (SI). We apply the proposed method to the analysis of a public microarray dataset.
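A minimal sketch of combining per-imputation test results with Stouffer's z-score follows. The classical formula assumes independent tests; because the imputed samples are dependent, the sketch also accepts an assumed common pairwise correlation `rho` and inflates the variance of the summed z-scores accordingly. This equicorrelation adjustment is an assumption made for illustration and is not claimed to be the procedure proposed in the paper.

```python
import math
from statistics import NormalDist

def stouffer_z(p_values, rho=0.0):
    """Combine one-sided p-values (each strictly between 0 and 1).

    rho is an assumed common correlation between the z-scores;
    rho = 0 gives the classical Stouffer statistic.
    """
    std_normal = NormalDist()
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    k = len(z_scores)
    # Variance of a sum of k standard normals with common correlation rho.
    variance = k + k * (k - 1) * rho
    combined_z = sum(z_scores) / math.sqrt(variance)
    combined_p = 1.0 - std_normal.cdf(combined_z)
    return combined_z, combined_p

# Hypothetical p-values for one gene from three imputed datasets.
p_from_imputations = [0.01, 0.02, 0.03]
z_indep, p_indep = stouffer_z(p_from_imputations)
z_dep, p_dep = stouffer_z(p_from_imputations, rho=0.5)
print(z_indep, p_indep)
print(z_dep, p_dep)
```

Treating dependent results as independent overstates the evidence, which is why the combined statistic shrinks when `rho` is positive.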

Surface Modification of Glass Chip for Peptide Microarray

  • Cho, Hyung-Min;Lim, Chang-Hwan;Neff, Silke;Jungbauer, Alois;Lee, Eun-Kyu
    • KSBB Journal / v.22 no.4 / pp.260-264 / 2007
  • Peptides are frequently studied as candidates for new drug development. Recently, synthesized peptide libraries have been screened for specific functions in a microarray biochip format. In this study, in order to replace the conventional cellulose membrane with glass as the microarray chip substrate for peptide library screening, we modified the glass surface from amines to thiols and covalently immobilized the peptides. Using a trypsin-FITC (fluorescein isothiocyanate) conjugate that binds specifically to a trypsin binding domain consisting of a 7-amino acid peptide, we checked the degree of surface modification. Because of the relatively lower hydrophilicity and reduced surface roughness, the conjugation reaction on the glass required a longer reaction time and a higher temperature. The reaction took approximately 12 hr to complete. From the fluorescence signal intensity, we could differentiate between the target and the control peptides. This difference was confirmed by a separate experiment using QCM. Furthermore, spots with a smaller volume and higher concentration showed a higher fluorescence intensity. These data provide the basic conditions for the development of microarray peptide biochips.

Radioprotective Effects of Propolis on the Mouse Testis Exposed to X-ray

  • Ji, Tae-Jung;Kim, Jong-Sik;Jeong, Hyung-Jin;Seo, Eul-Won
    • Journal of Life Science / v.17 no.5 s.85 / pp.664-670 / 2007
  • Propolis is a natural product produced by honeybees and is known to have many biologically useful properties, such as anti-microbial, anti-oxidative and anti-tumorigenic activity. However, its radio-protective property has not been well studied. To investigate the radio-protective effect of propolis on mouse testis, mice were supplemented with propolis after 5 Gy irradiation. Histological changes in the testis were detected by TEM. The results indicate that propolis may protect against the tissue deformation induced by 5 Gy of ionizing radiation. Furthermore, to elucidate the potential molecular mechanisms involved in the radio-protective property of propolis, we performed microarray experiments using an oligo DNA microarray. We found 65 up-regulated genes and 224 down-regulated genes whose expression levels changed more than 2-fold with propolis treatment in mice irradiated at 5 Gy. We confirmed the microarray data by reverse transcription-PCR using gene-specific primers. The RT-PCR results are highly correlated with those of the microarray. These results may help in understanding the molecular mechanisms of the radioprotective effects of propolis in a mouse model.

A Graph Model and Analysis Algorithm for cDNA Microarray Image

  • Jung, Ho-Youl;Hwang, Mi-Nyeong;Yu, Young-Jung;Cho, Hwan-Gue
    • Journal of KIISE:Computer Systems and Theory / v.29 no.7 / pp.411-421 / 2002
  • In this paper we propose a new image analysis algorithm for microarray processing and a method to locate the position of each grid cell using the topology of the grid spots. A microarray is a device that enables parallel experiments on 10,000 to 100,000 test genes in order to measure gene expression. Because of the huge amount of data obtained from one experiment, automated image analysis is needed. The final output of a microarray experiment is a set of 16-bit gray-level image files that consist of grid-structured spots. We propose an algorithm that locates the addresses of spots (spot indices) using a graph structure built from the image data, and a method that determines the precise location and shape of each spot by measuring the inclination of the grid structure. Several experiments on real data sets are presented.

Effect of missing values in detecting differentially expressed genes in a cDNA microarray experiment

  • Kim, Byung-Soo;Rha, Sun-Young
    • Bioinformatics and Biosystems / v.1 no.1 / pp.67-72 / 2006
  • The aim of this paper is to discuss the effect of missing values on detecting differentially expressed genes in a cDNA microarray experiment in the context of a one-sample problem. We conducted a cDNA microarray experiment to detect differentially expressed genes for the metastasis of colorectal cancer, based on twenty patients who underwent liver resection due to liver metastasis from colorectal cancer. Total RNAs from metastatic liver tumor and adjacent normal liver tissue from a single patient were labeled with Cy5 and Cy3, respectively, and competitively hybridized to a cDNA microarray with 7775 human genes. We used $M=\log_2(R/G)$ for the signal evaluation, where R and G denote the fluorescent intensities of the Cy5 and Cy3 dyes, respectively. The statistical problem comprises a one-sample test of E(M)=0 for each gene and involves multiple tests. The twenty cDNA microarray data sets would form a matrix of dimension 7775 by 20 if there were no missing values. However, missing values occur for various reasons. For each gene, the no missing proportion (NMP) was defined as the proportion of non-missing values out of twenty. In detecting differentially expressed (DE) genes, we used the genes whose NMP is greater than or equal to 0.4 and then increased the NMP threshold sequentially by 0.1 to investigate its effect on the detection of DE genes. For each fixed NMP, we imputed the missing values with the K-nearest neighbor method (K=10) and applied the nonparametric t-test of Dudoit et al. (2002), SAM by Tusher et al. (2001) and the empirical Bayes procedure by Lönnstedt and Speed (2002) to assess the effect of missing values on the final outcome. These three procedures yielded substantially concordant results in detecting DE genes. Of the three, we used SAM to explore the acceptable NMP level. The optimum NMP for this data set turned out to be 80%. It is more desirable to find the optimum level of NMP for each data set by applying the method described in this note, when the plot of (NMP, number of overlapping genes) shows a turning point.
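The NMP filter and the one-sample test of E(M)=0 can be sketched as below. This is a simplified illustration: it computes an ordinary t statistic from the observed values only and omits the K-nearest neighbor imputation, SAM, and empirical Bayes steps described in the abstract. The function names and the toy log-ratio matrix are assumptions made for this example.

```python
import math
from statistics import mean, stdev

def non_missing_proportion(values):
    """Fraction of arrays in which the gene has an observed value."""
    return sum(v is not None for v in values) / len(values)

def one_sample_t(values):
    """t statistic for H0: E(M) = 0, from the observed values only.

    Needs at least two observed values with non-zero spread.
    """
    observed = [v for v in values if v is not None]
    n = len(observed)
    return mean(observed) / (stdev(observed) / math.sqrt(n))

def filter_and_test(m_matrix, nmp_threshold):
    """Keep genes whose NMP meets the threshold; return their t statistics."""
    results = {}
    for gene, values in m_matrix.items():
        if non_missing_proportion(values) >= nmp_threshold:
            results[gene] = one_sample_t(values)
    return results

# Hypothetical log ratios M for three genes on five arrays (None = missing).
m_matrix = {
    "geneA": [1.0, 2.0, 3.0, None, None],    # NMP = 0.6
    "geneB": [0.5, None, None, None, None],  # NMP = 0.2
    "geneC": [1.0, -1.0, 1.0, -1.0, None],   # NMP = 0.8
}

t_stats = filter_and_test(m_matrix, nmp_threshold=0.4)
print(t_stats)
```

Raising `nmp_threshold` step by step and recording which genes are still called DE mirrors the sequential procedure used to look for a turning point.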

The Algorithm Design and Implement of Microarray Data Classification using the Bayesian Method

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.12 / pp.2283-2288 / 2006
  • Recent advances in bioinformatics technology make micro-level experiments possible, so that the expression pattern of an entire genome can be observed on a single chip and the interactions of thousands of genes can be analyzed at the same time. DNA microarray technology thus opens new directions for understanding complex organisms, and effective ways of analyzing the enormous amount of gene information obtained through this technology are required. In this paper, we used sample data from the bioinformatics core group at Harvard University. We designed and implemented a system that first applies a normalization process to reduce or remove the noise introduced by various factors in a microarray experiment, then uses the ASA feature extraction method and a Bayesian algorithm to divide the data into two classes, and finally evaluates the accuracy. The accuracy after Lowess normalization was 98.23%.

CLUSTERING DNA MICROARRAY DATA BY STOCHASTIC ALGORITHM

  • Shon, Ho-Sun;Kim, Sun-Shin;Wang, Ling;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2007.10a / pp.438-441 / 2007
  • Recent advances in molecular biology and engineering have made it possible, with DNA microarrays, to observe thousands of genes and their variation in tissue samples from living organisms. DNA microarrays make it possible to form groups of genes with similar expression patterns and to follow the progress and variation of genes. This paper performs cluster analysis, which aims to discover biological subgroups or classes from gene expression information. The purpose is to predict new, previously unknown classes; public leukaemia data are used for the experiment, and the MCL (Markov CLustering) algorithm is applied as the analysis method. The MCL algorithm is based on probability and graph flow theory. MCL simulates random walks on a graph using Markov matrices to determine the transition probabilities among the nodes of the graph. In detail, the method first computes Euclidean distances and applies the MCL algorithm to them, then tunes the inflation and diagonal factors, which are the tuning parameters, and finally obtains a threshold from the average of each column to distinguish one class from another. Using this threshold, namely the average of each column, improved the accuracy of our method. Our experimental results show about 70% accuracy on average with respect to the previously known classes. For comparative evaluation, the proposed method was also compared with the SOM (Self-Organizing Map) clustering algorithm, a neural network method, and with hierarchical clustering. The method gives better results than hierarchical clustering. Further study should examine whether similar results are obtained when the inflation parameter found in our experiment is applied to other gene expression data. We are also working on a systematic method to improve the accuracy by adjusting the factors mentioned above.
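The core MCL loop alternates expansion (squaring the column-stochastic matrix, which spreads random-walk flow) with inflation (raising entries to a power and renormalizing, which strengthens strong flows). The sketch below is plain MCL on a small unweighted graph. It does not reproduce the Euclidean-distance input, the diagonal factor tuning, or the column-average threshold described in the abstract, and the function names and toy graph are assumptions made for this example.

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]

def expand(matrix):
    """Expansion: square the matrix (two steps of the random walk)."""
    n = len(matrix)
    return [[sum(matrix[i][k] * matrix[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def inflate(matrix, inflation):
    """Inflation: raise entries to a power, then renormalize the columns."""
    powered = [[value ** inflation for value in row] for row in matrix]
    return normalize_columns(powered)

def mcl(adjacency, inflation=2.0, iterations=50, eps=1e-6):
    """Plain MCL on a symmetric 0/1 adjacency matrix; returns the clusters."""
    n = len(adjacency)
    # Self-loops are added to every node, as is customary for MCL.
    with_loops = [[1.0 if i == j else float(adjacency[i][j])
                   for j in range(n)] for i in range(n)]
    flow = normalize_columns(with_loops)
    for _ in range(iterations):
        flow = inflate(expand(flow), inflation)
    # Each non-zero row belongs to an attractor; its positive columns
    # are the nodes that flow to it.
    clusters = []
    for row in flow:
        members = frozenset(j for j, value in enumerate(row) if value > eps)
        if members and members not in clusters:
            clusters.append(members)
    return [sorted(c) for c in clusters]

# Hypothetical graph: two separate triangles (nodes 0-2 and nodes 3-5).
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
found = mcl(two_triangles)
print(found)
```

A larger `inflation` value tends to produce more, smaller clusters, which is why the abstract treats it as a tuning parameter.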

Comparison of clustering with yeast microarray gene expression data

  • Lee, Kyung-A;Kim, Jae-Hee
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.741-753 / 2011
  • We perform cluster analyses of yeast cell cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM and the hierarchical Ward method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index and silhouette values are computed and compared.
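Of the validity measures named above, the Dunn index is the simplest to write down: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so larger values indicate compact, well-separated clusters. The sketch below uses Euclidean distance and hypothetical toy points; it is an illustration, not the code used in the paper.

```python
import math
from itertools import combinations

def dunn_index(clusters):
    """Smallest between-cluster distance over the largest cluster diameter.

    clusters is a list of lists of coordinate tuples; at least two clusters
    are needed, and at least one must hold two distinct points.
    """
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a for q in b
    )
    max_diameter = max(
        math.dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    )
    return min_separation / max_diameter

# Hypothetical expression profiles reduced to two dimensions.
tight_far_apart = [[(0.0, 0.0), (1.0, 0.0)], [(10.0, 0.0), (11.0, 0.0)]]
loose_close = [[(0.0, 0.0), (4.0, 0.0)], [(5.0, 0.0), (9.0, 0.0)]]

print(dunn_index(tight_far_apart))
print(dunn_index(loose_close))
```

Comparing the two toy partitions shows the intended behavior: the tight, distant clusters score higher than the loose, adjacent ones.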