http://dx.doi.org/10.5808/GI.2014.12.4.145

A Review of Three Different Studies on Hidden Markov Models for Epigenetic Problems: A Computational Perspective  

Lee, Kyung-Eun (Ewha Information and Telecommunication Institute, Ewha Womans University)
Park, Hyun-Seok (Ewha Information and Telecommunication Institute, Ewha Womans University)
Abstract
Recent technical advances, such as chromatin immunoprecipitation combined with DNA microarrays (ChIP-chip) and chromatin immunoprecipitation-sequencing (ChIP-seq), have generated large quantities of high-throughput data. Because epigenomic datasets are arranged along chromosomes, their analysis must account for spatial or temporal characteristics. In that sense, simple clustering or classification methodologies are inadequate for the analysis of multi-track ChIP-chip or ChIP-seq data. Approaches based on hidden Markov models (HMMs) can integrate dependencies between directly adjacent measurements in the genome. Here, we review, from a computational perspective, three HMM-based studies that have contributed to epigenetic research. We also give a brief tutorial on HMM modelling, targeted at bioinformaticians who are new to the field.
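To illustrate how an HMM integrates dependencies between adjacent measurements, the following is a minimal sketch of Viterbi decoding over a discretized signal track. It is not taken from any of the reviewed studies: the two state names, the `low`/`high` signal alphabet, and all probabilities are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical two-state model; every name and number here is an assumption.
STATES = ("background", "enriched")
START = {"background": 0.9, "enriched": 0.1}
TRANS = {
    "background": {"background": 0.9, "enriched": 0.1},
    "enriched": {"background": 0.2, "enriched": 0.8},
}
# Emission probabilities for a discretized signal in each genomic bin.
EMIT = {
    "background": {"low": 0.8, "high": 0.2},
    "enriched": {"low": 0.3, "high": 0.7},
}


def viterbi(observations):
    """Return the most probable state path for a sequence of bin signals."""
    # Log-probabilities avoid numerical underflow on long sequences.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor of state s, given the transition into s.
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def independent_calls(observations):
    """Label each bin by its emission likelihood alone, ignoring neighbours."""
    return [max(STATES, key=lambda s: EMIT[s][o]) for o in observations]
```

With these illustrative parameters, `viterbi(["high", "high", "low", "high", "high"])` labels all five bins as `enriched`, whereas `independent_calls` on the same input marks the middle bin as `background`. The transition probabilities are what allow the HMM to treat an isolated low measurement inside an enriched region as noise, which a per-bin classifier cannot do.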
Keywords
chromatin states; epigenomics; hidden Markov models; noncoding DNA